A gRPC-based services layer runs on a container cluster and handles all inbound requests from clients, queuing runs for processing by worker pools. This cluster scales up and down automatically based on CPU load and memory utilization.
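As a rough illustration of the submit-and-enqueue pattern described above, here is a minimal Go sketch. The `RunRequest` and `RunService` types, their fields, and the channel-backed queue are hypothetical stand-ins for the actual gRPC service and durable queue, not the real API.

```go
package main

import (
	"fmt"
	"log"
)

// RunRequest is a hypothetical inbound message; the field names are
// illustrative, not the actual gRPC schema.
type RunRequest struct {
	JobType    string // e.g. "fleet" or "binary"
	WorkerPool string // pool that should process the run
	Input      []byte // job-specific input payload
}

// RunService stands in for the gRPC services layer: it accepts
// requests and enqueues them for a worker pool to pick up.
type RunService struct {
	queue chan RunRequest // in production this would be a durable queue
}

// SubmitRun validates the request, enqueues it, and returns a run ID.
func (s *RunService) SubmitRun(req RunRequest) (string, error) {
	if req.WorkerPool == "" {
		return "", fmt.Errorf("worker pool is required")
	}
	s.queue <- req
	// Placeholder ID; a real service would generate a unique run ID.
	return fmt.Sprintf("run-%d", len(s.queue)), nil
}

func main() {
	svc := &RunService{queue: make(chan RunRequest, 100)}
	id, err := svc.SubmitRun(RunRequest{JobType: "fleet", WorkerPool: "default"})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("queued", id)
}
```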
Worker pools are container clusters that consume work from a queue and scale in and out based on the number of messages in the queue. Ideally, decision optimizations are processed on the same compute type so that results are consistent. To ensure that containers have the same resources, the cluster runs on top of an autoscaling group. Worker pools can be based on either ARM (lower cost, more compute) or AMD. The customer can configure multiple worker pools to ensure fairness among applications. The worker pool to use is specified when submitting a run to the run engine.
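The scaling rule can be pictured as a simple queue-depth calculation. The sketch below, with a hypothetical `targetPerWorker` tuning parameter, shows one plausible policy rather than the engine's exact one:

```go
package main

import "fmt"

// desiredWorkers is a sketch of queue-depth autoscaling: size the pool
// so each worker has at most targetPerWorker queued runs, clamped to
// the pool's configured bounds.
func desiredWorkers(queued, targetPerWorker, min, max int) int {
	// Integer ceiling of queued / targetPerWorker.
	n := (queued + targetPerWorker - 1) / targetPerWorker
	if n < min {
		return min
	}
	if n > max {
		return max
	}
	return n
}

func main() {
	// 23 queued runs, 4 runs per worker, pool bounded to [1, 10] containers.
	fmt.Println(desiredWorkers(23, 4, 1, 10)) // prints 6
}
```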
You choose the instance sizes and types for the pools, as well as the minimum and maximum number of containers to run. Pools let you dedicate resources to applications with different compute requirements and scale each one independently, without one application's load interfering with another's.
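A pool's configuration might look like the following sketch. The `WorkerPoolConfig` struct and its field names are hypothetical, not the run engine's actual schema, and the instance type shown is just an example of an ARM instance:

```go
package main

import "fmt"

// WorkerPoolConfig is a hypothetical shape for configuring a pool.
type WorkerPoolConfig struct {
	Name          string // referenced when submitting a run
	Architecture  string // "arm64" or "amd64"
	InstanceType  string // e.g. "m6g.xlarge" for an ARM pool
	MinContainers int    // floor kept warm even when the queue is empty
	MaxContainers int    // ceiling the autoscaler may not exceed
}

func main() {
	pool := WorkerPoolConfig{
		Name:          "routing-prod",
		Architecture:  "arm64",
		InstanceType:  "m6g.xlarge",
		MinContainers: 1,
		MaxContainers: 10,
	}
	fmt.Printf("%+v\n", pool)
}
```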
Core Concepts
Jobs
The runner executes jobs. A job defines a specific type of optimization to execute, and a run is a single execution of a job. There are several job types available: fleet, binary, Onfleet, and SDK (coming soon). Any supported job type can be assigned to any worker pool. The job types are described below, followed by a sketch of a run submission.
Fleet – A fleet run executes a Nextmv-provided application for vehicle routing problems. Runs submitted with this job type must conform to the Nextmv-specified fleet input model.
Binary – A binary job executes a binary that solves a decision optimization problem defined using the Nextmv engine. We intend to deprecate this run type in the future in favor of an SDK plugin.
SDK – The SDK job executes a binary that uses the Nextmv plugin. This is not yet implemented; until this job type is added, we can help with a workaround using binary plugins.
Onfleet – An Onfleet job integrates the Onfleet platform with fleet. The input must conform to the specified Onfleet input model. Nextmv calls Onfleet on your behalf to fetch input data, runs the optimization, and then returns the results.
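To make the job types concrete, here is a sketch of what a run submission might carry. The `RunSubmission` struct and its field names are hypothetical, not the run engine's actual schema; which fields matter depends on the job type (for example, binary runs reference an artifact):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RunSubmission is a hypothetical request shape for submitting a run.
type RunSubmission struct {
	JobType    string          `json:"job_type"`              // "fleet", "binary", or "onfleet"
	WorkerPool string          `json:"worker_pool"`           // pool that processes the run
	ArtifactID string          `json:"artifact_id,omitempty"` // binary runs only
	SecretKey  string          `json:"secret_key,omitempty"`  // e.g. an Onfleet API key reference
	Input      json.RawMessage `json:"input"`                 // job-specific input model
}

func main() {
	sub := RunSubmission{
		JobType:    "binary",
		WorkerPool: "routing-prod",
		ArtifactID: "artifact-123",
		Input:      json.RawMessage(`{"vehicles": [], "stops": []}`),
	}
	b, _ := json.MarshalIndent(sub, "", "  ")
	fmt.Println(string(b))
}
```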
Artifacts
Artifacts are customer-produced objects used in the processing of runs. Currently, supported artifacts are either a binary executable named main, or a tarball containing a set of assets that includes an executable named main. Artifacts define the executable used for binary runs, and binary run requests include an artifact identifier. A sketch of packaging a tarball artifact follows.
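The following Go sketch packages a compiled executable into a gzipped tarball under the required name main. The output filename and layout beyond that requirement are assumptions for illustration:

```go
package main

import (
	"archive/tar"
	"compress/gzip"
	"io"
	"log"
	"os"
)

// buildArtifact packages a compiled executable into a gzipped tarball
// suitable for upload as a tarball artifact. The entry inside the
// tarball must be named "main" and keep its executable bit.
func buildArtifact(binaryPath, outPath string) error {
	bin, err := os.Open(binaryPath)
	if err != nil {
		return err
	}
	defer bin.Close()

	info, err := bin.Stat()
	if err != nil {
		return err
	}

	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer out.Close()

	gz := gzip.NewWriter(out)
	defer gz.Close()
	tw := tar.NewWriter(gz)
	defer tw.Close()

	hdr := &tar.Header{
		Name:     "main", // required executable name
		Mode:     0o755,  // preserve the executable bit
		Size:     info.Size(),
		Typeflag: tar.TypeReg,
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	_, err = io.Copy(tw, bin)
	return err
}

func main() {
	if err := buildArtifact("./main", "artifact.tar.gz"); err != nil {
		log.Fatal(err)
	}
}
```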
Worker Pools
Worker pools are dedicated compute pools for processing runs. You specify which pool to use when submitting a run request. You can control the minimum and maximum compute resources in a worker pool, and the pool scales automatically based on the number of runs waiting to be processed. Worker pools can use either ARM-based or AMD-based containers and instances, and the binaries you run must match the architecture of the worker pool's containers and instances; see the cross-compilation sketch below.
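This sketch assumes your decision binary is written in Go, in which case matching a pool's architecture is a cross-compilation step: setting GOOS and GOARCH when building. The build helper below is illustrative; you could equally run the command directly in a shell:

```go
package main

import (
	"log"
	"os"
	"os/exec"
)

// Builds the binary in the current directory for a linux/arm64 worker
// pool by setting Go's cross-compilation environment variables.
func main() {
	cmd := exec.Command("go", "build", "-o", "main", ".")
	cmd.Env = append(os.Environ(), "GOOS=linux", "GOARCH=arm64")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatalf("cross-compile failed: %v", err)
	}
	log.Println("built ./main for linux/arm64")
}
```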
Secrets
A secret is a customer-provided sensitive value that the runner needs to protect, for example, an Onfleet API key. Secrets are encrypted and stored by the runner. For jobs that require secrets, the input specifies a secret key identifier that the job uses to access a previously defined secret.
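The runner's actual encryption scheme is not documented here; the following is only a minimal, illustrative sketch of encrypting a secret at rest with AES-GCM before storing it under its secret key identifier:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// encryptSecret encrypts a customer secret with AES-GCM and prepends
// the nonce so it can be recovered at decryption time. Purely a sketch
// of the encrypt-then-store idea, not the runner's implementation.
func encryptSecret(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // key must be 16, 24, or 32 bytes
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

func main() {
	key := make([]byte, 32) // in practice, sourced from a KMS, not all zeros
	ct, err := encryptSecret(key, []byte("onfleet-api-key-value"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("stored under secret key identifier %q: %d bytes\n",
		"onfleet-api-key", len(ct))
}
```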