To run experiments with Nextmv, you need a Nextmv account with a free trial or paid plan. For questions, please contact Nextmv support.
Experiments are used to evaluate changes to your models by running one or more executable binaries (for example, different versions of a model) and comparing their results. Experimentation is a key part of developing a good model, and Nextmv’s goal is to make it easier to run experiments so you can focus on improving your model.
Nextmv Platform provides a suite of products to create and manage different types of experiments. There are currently three types of experiments: batch, acceptance, and shadow. A related feature, input sets, lets you manage the inputs used for experiments. Experiments are always created and managed in the context of an application, so each application has its own set of experiments. See the Apps core concepts page for more information about applications.
Experiments and input sets can be created and managed with Nextmv CLI, Nextmv Console, or the HTTP API endpoints. Created experiments are saved and can be accessed at any time. After experiments have been started, the results are aggregated and can be retrieved with the same tools. When viewing the result of an experiment, Console provides a visual interpretation of the results, while the API and Nextmv CLI provide the raw JSON.
The different types of experiments and input sets are summarized below.
Types of experiments
Batch experiment
Batch experiments are used to analyze the output from one or more decision models. They are generally used as an exploratory test to understand the impact on business metrics (or KPIs) when updating a model with a new feature, such as an additional constraint. They can also be used to validate that a model is ready for further testing and is likely to make the intended business impact.
See the batch experiment reference guide for more information on batch experiments.
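The kind of comparison a batch experiment supports can be sketched in plain Python. This is only an illustration, not the Nextmv API: the run records, the `total_cost` metric, and the `summarize_by_version` helper are hypothetical names made up for this example.

```python
from statistics import mean

# Hypothetical results: one record per (model version, input) run.
runs = [
    {"version": "v1", "input_id": "input-1", "total_cost": 120.0},
    {"version": "v1", "input_id": "input-2", "total_cost": 100.0},
    {"version": "v2", "input_id": "input-1", "total_cost": 110.0},
    {"version": "v2", "input_id": "input-2", "total_cost": 90.0},
]


def summarize_by_version(runs, metric):
    """Group runs by model version and aggregate one metric per version."""
    values = {}
    for run in runs:
        values.setdefault(run["version"], []).append(run[metric])
    return {
        version: {
            "runs": len(vals),
            "mean": mean(vals),
            "min": min(vals),
            "max": max(vals),
        }
        for version, vals in values.items()
    }


summary = summarize_by_version(runs, "total_cost")
for version, stats in summary.items():
    print(version, stats)
```

Each version is summarized over the same inputs, which is what makes the per-version statistics comparable.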
Acceptance testing
Acceptance testing builds on the core concept of a batch experiment, with a focus on evaluating the differences between exactly two models and assigning a pass / fail label based on predefined thresholds. Acceptance tests are used to verify whether business or operational requirements (e.g., KPIs and OKRs) are being met. An acceptance test runs an existing production model and a new, updated model against a set of test data. You then look at the results and determine whether the new model is acceptable based on criteria identified beforehand.
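The pass / fail logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not how Nextmv implements acceptance tests: the metric names, the criteria format (a metric plus a comparison operator applied as "candidate OP baseline"), and the `run_acceptance_test` function are all hypothetical.

```python
import operator

# Comparison operators a criterion may use (an assumed, illustrative set).
COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def run_acceptance_test(baseline, candidate, criteria):
    """Check each criterion as `candidate[metric] OP baseline[metric]`.

    The test passes only if every criterion passes.
    """
    checks = []
    for metric, op in criteria:
        passed = COMPARATORS[op](candidate[metric], baseline[metric])
        checks.append({
            "metric": metric,
            "expected": f"candidate {op} baseline",
            "passed": passed,
        })
    return {"passed": all(c["passed"] for c in checks), "checks": checks}


# Hypothetical aggregated metrics for the two models being compared.
baseline = {"total_cost": 110.0, "unassigned_stops": 2}
candidate = {"total_cost": 100.0, "unassigned_stops": 3}

# Criteria identified before running the test.
criteria = [("total_cost", "<="), ("unassigned_stops", "<=")]

result = run_acceptance_test(baseline, candidate, criteria)
print("PASS" if result["passed"] else "FAIL")
for check in result["checks"]:
    print(check)
```

In this example the candidate lowers cost but leaves more stops unassigned, so the overall label is a fail even though one criterion passes.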
Shadow testing
A shadow test is an experiment that runs in the background and compares the results of a baseline instance against a candidate instance. When the shadow test has started, any run made on the baseline instance will trigger a run on the candidate instance using the same input and options. The results of the shadow test are often used to determine if a new version of a model is ready to be promoted to production.
Shadow tests can be created using Nextmv CLI, Nextmv Console, or the HTTP API. See the shadow test reference guide for more information on shadow tests.
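The mirroring behavior of a shadow test can be illustrated with a small, self-contained sketch. It is a conceptual model only: the `ShadowTest` class and the two model functions are hypothetical, and the candidate run here happens synchronously for simplicity, which is an assumption rather than a description of the platform.

```python
class ShadowTest:
    """Mirror every baseline run onto a candidate while the test is started."""

    def __init__(self, baseline, candidate):
        self.baseline = baseline
        self.candidate = candidate
        self.started = False
        self.comparisons = []

    def start(self):
        self.started = True

    def stop(self):
        self.started = False

    def run(self, input_data, options=None):
        options = options or {}
        baseline_result = self.baseline(input_data, options)
        if self.started:
            # The candidate receives the same input and options.
            candidate_result = self.candidate(input_data, options)
            self.comparisons.append({
                "input": input_data,
                "baseline": baseline_result,
                "candidate": candidate_result,
            })
        # Callers only ever see the baseline result.
        return baseline_result


# Hypothetical stand-ins for two versions of a decision model.
def baseline_model(input_data, options):
    return {"value": sum(input_data["costs"])}


def candidate_model(input_data, options):
    return {"value": sum(input_data["costs"]) - options.get("savings", 0)}


shadow = ShadowTest(baseline_model, candidate_model)
shadow.run({"costs": [10, 20, 30]}, {"savings": 5})  # not started: no shadow run
shadow.start()
output = shadow.run({"costs": [10, 20, 30]}, {"savings": 5})
print(output, shadow.comparisons)
```

The caller's result always comes from the baseline; the candidate's output is only recorded for later comparison.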
Input sets
Input sets are defined sets of input files to use for an experiment. You can create input sets with Nextmv CLI, in Nextmv Console, or with the HTTP API endpoints.
At the moment, inputs for input sets can only be retrieved from prior runs. To “upload” an input, you must first make a run that uses it. When you create the input set, you can reference the run ID, and the input used for that run becomes an input file in the set. Alternatively, you can specify a date range and an instance ID to gather inputs for an input set. Note that an input set can contain at most 20 inputs.
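The two ways of assembling an input set described above can be sketched like this. The run records and both helper functions are hypothetical and do not correspond to Nextmv CLI commands or API endpoints; only the limit of 20 inputs comes from the text. Raising an error when the limit is exceeded is an assumption made for this sketch.

```python
from datetime import datetime

MAX_INPUTS = 20  # Limit stated in the documentation above.

# Hypothetical store of prior runs, keyed by run ID.
prior_runs = {
    "run-1": {"instance_id": "prod", "created_at": datetime(2024, 1, 1), "input": {"stops": 5}},
    "run-2": {"instance_id": "prod", "created_at": datetime(2024, 1, 5), "input": {"stops": 8}},
    "run-3": {"instance_id": "staging", "created_at": datetime(2024, 1, 6), "input": {"stops": 3}},
}


def input_set_from_run_ids(runs, run_ids):
    """Collect the inputs used by the given runs."""
    if len(run_ids) > MAX_INPUTS:
        raise ValueError(f"an input set can hold at most {MAX_INPUTS} inputs")
    return [runs[run_id]["input"] for run_id in run_ids]


def input_set_from_date_range(runs, instance_id, start, end):
    """Collect inputs from runs made on one instance within a date range."""
    selected = [
        run for run in runs.values()
        if run["instance_id"] == instance_id and start <= run["created_at"] <= end
    ]
    if len(selected) > MAX_INPUTS:
        raise ValueError(f"an input set can hold at most {MAX_INPUTS} inputs")
    return [run["input"] for run in selected]


by_ids = input_set_from_run_ids(prior_runs, ["run-1", "run-3"])
by_range = input_set_from_date_range(
    prior_runs, "prod", datetime(2024, 1, 1), datetime(2024, 1, 31)
)
print(by_ids)
print(by_range)
```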
Review the results
After running an experiment from Nextmv CLI, navigate to Nextmv Console to view the results of your experiment comparing the models. When running large experiments, you may need to check back later to view the results.
Within Nextmv Console, you'll find your experiment under the Experiments section.