To try out batch experiments with Nextmv, you must have a Nextmv account and be signed up for a free trial or on an active plan. For questions, please contact Nextmv support or start a free trial.
This page assumes you have read the Experiments Core Concept page.
Batch experiments are used to analyze the output from one or more decision models on a fixed set of inputs. They are generally used as an exploratory test to understand the impact on business metrics (or KPIs) when updating a model with a new feature, such as an additional constraint. They can also be used to validate that a model is ready for further testing and likely to make an intended business impact.
Creating a batch experiment
Batch experiments can be created from Nextmv CLI or Nextmv Console. You can create a batch experiment without specifying an input set or instances, but it is best to be explicit when defining which models to test and which input set to use. The required entities to start a batch experiment are described in the table below.
Field | Description |
---|---|
Name & description | These are for your reference only and have no bearing on the experiment. The name is required; the description is optional. |
ID | The ID may only include lowercase letters, numbers, periods, or hyphens and must be between 3 and 30 characters (and cannot start or end with a hyphen or period). The ID must also be unique within the app the experiment belongs to. |
Input set | An input set must be specified when running a batch experiment. An input set is a collection of inputs; when you run the batch experiment, each input file in the input set is run on the executable binaries specified by the experiment's instances, and the results are compared. |
Instances | At least one instance must be specified when running a batch experiment. Pay attention to the version that is set for the instances because it’s the version’s executable binary that will be used to run the inputs in the input set. |
Using Nextmv CLI
You can create a batch experiment using the Nextmv CLI:
After executing the command, you will be shown a summary of how many runs will be performed for the experiment and asked whether you want to continue. Entering y returns the experiment ID and starts the batch experiment. The results of the batch experiment can be viewed with Nextmv CLI or in Nextmv Console.
For a full reference on using Nextmv CLI commands for managing batch experiments, visit the CLI reference page.
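If you prefer to script this step against the Nextmv Cloud API rather than the CLI, a request can look roughly like the sketch below. The base URL, request path, payload field names, and environment variable are assumptions inferred from the entities in the table above and the runs endpoint shown later on this page; check the Experiments API reference for the exact contract.

```python
# A minimal sketch of starting a batch experiment over HTTP instead of the CLI.
# NOTE: the base URL, request path, payload field names, and env var below are
# assumptions; consult the Experiments API reference for the exact contract.
import os

import requests

BASE_URL = "https://api.cloud.nextmv.io/v1"  # assumed base URL
APP_ID = "my-routing-app"  # hypothetical application ID

payload = {
    "id": "new-constraint-test",  # 3-30 chars: lowercase letters, numbers, periods, hyphens
    "name": "New constraint test",
    "description": "Compare the baseline instance against a candidate with the new constraint",
    "input_set_id": "october-inputs",  # hypothetical input set ID
    "instance_ids": ["baseline", "candidate"],  # instances whose versions will run the inputs
}

response = requests.post(
    f"{BASE_URL}/applications/{APP_ID}/experiments/batch",  # assumed path
    headers={"Authorization": f"Bearer {os.environ['NEXTMV_API_KEY']}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json())
```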
Using Nextmv Console
Navigate to the Experiments section in your app (it will land on Batch as the default view) and then click on the New Experiment button. Fill in the fields and then click Create & run experiment.
You will be returned to the list of experiments, where you can click on the newly created experiment to view the results. Note that when running large experiments, you may need to check back later to view the results.
Viewing a batch experiment
The results of a batch experiment can be retrieved via Nextmv CLI or viewed in Nextmv Console. Nextmv CLI returns the raw data in JSON format and is useful if you would like to perform your own operations on the experiment results.
Nextmv Console displays the results grouped by indicator keys; for each one it includes a summary table, the percentile values, and a box plot chart displaying summary values from the versions compared. To view the batch experiment result in Console, navigate to your app and then click on the Experiments section. You can type the name of your batch experiment in the filter box at the top of the experiment list or scroll to find it. Click on the experiment name and the experiment details view will load.
Structure of result
The result of a batch experiment is returned as JSON. If you’re viewing the result in Console, the JSON is parsed and displayed in a web view; Nextmv CLI returns the complete JSON response. The response contains the batch experiment metadata and the results in the form of grouped distributional summaries.
Top-level properties
The table below summarizes the top-level properties in the response.
Field | Description |
---|---|
id | The batch experiment ID that was specified. |
name | The batch experiment name that was specified. |
description | The batch experiment description that was specified. If no description was specified this field will not exist. |
status | The status of the experiment. The status can be started, completed, or failed. |
created_at | The date the experiment was created and started. |
input_set_id | The input set ID specified for the batch experiment. |
instance_ids | An array of the Instance IDs that were specified for the batch experiment. |
grouped_distributional_summaries | An array that contains the results of the batch experiment. It is a collection of calculated results from the individual runs, grouped by certain keys. |
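As a quick orientation, the sketch below reads a batch experiment result that has been saved to disk (for example, JSON output from Nextmv CLI redirected to a file) and prints the top-level metadata described above. The file name is hypothetical; the field names follow the table.

```python
# Read a saved batch experiment result and print its top-level metadata.
# The file name is hypothetical; field names follow the table above.
import json

with open("batch-experiment.json") as f:
    experiment = json.load(f)

print(experiment["id"], experiment["status"], experiment["created_at"])
print("input set:", experiment["input_set_id"])
print("instances:", ", ".join(experiment["instance_ids"]))
# "description" is omitted when none was specified, so use .get().
print("description:", experiment.get("description", "<none>"))
print("summaries:", len(experiment["grouped_distributional_summaries"]))
```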
Grouped distributional summaries
There are three types of summaries included in the grouped_distributional_summaries
array (there could be more in the future):
- Version (instance)
- Version (instance) + input
- Input
Each type is included for every experiment. Note, however, that if you are viewing the experiment in Console, only the version summaries are displayed. In the future, Console will display all types of summaries.
No matter the type, each grouped distributional summary includes the following:
Field | Description |
---|---|
group_keys | Describes the type of grouped distributional summary, which can be one of three options: instanceID + versionID (version), instanceID + versionID + inputID (version + input), or inputID (input). |
group_values | The values that correspond to the group_keys. For example, if the group keys are instanceID and versionID, the group_values will be the ID of the instance and the ID of the version. |
indicator_keys | The statistics that are being evaluated by the batch experiment. Custom statistics can be specified by adding them to the model's output JSON. |
indicator_distributions | An object that contains all of the values from the analysis for each indicator. If there are six indicator keys, for example, the indicator_distributions will contain six object properties; each property key corresponds to a value in the indicator_keys array and each property value is an object with matching data (see the Indicator distributions section below). |
number_of_runs_total | The number of runs that were analyzed for this particular summary. For example, if you ran an experiment with two instances and an input set with three input files, the version summary (instanceID + versionID) will have a run total of three because all three input files are run on that particular version. The version & input summary (instanceID + versionID + inputID) will have one run because it ran that one input file on that particular version. And the input summary (inputID) will have two runs because it ran that input file on the two instances. |
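For instance, to compare versions on each indicator you can filter the array down to the version-level summaries. The sketch below is one way to do that, assuming a result file saved from the CLI; field names follow the tables above.

```python
# Print the mean of every indicator for each version-level summary
# (group keys instanceID + versionID). The file name is hypothetical.
import json

with open("batch-experiment.json") as f:
    experiment = json.load(f)

for summary in experiment["grouped_distributional_summaries"]:
    if set(summary["group_keys"]) != {"instanceID", "versionID"}:
        continue  # skip the version + input and input-only summaries
    label = " / ".join(summary["group_values"])
    print(f"{label} ({summary['number_of_runs_total']} runs)")
    for key in summary["indicator_keys"]:
        dist = summary["indicator_distributions"][key]
        print(f"  {key}: mean={dist['mean']}")
```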
Indicator distributions
Each object property value in the indicator_distributions
contains the values in the table below. Note that for some runs, certain values may be missing (a custom statistic, for example). If you’re viewing the results in Console and a grouped distributional summary is missing values, a warning message will appear. If you are analyzing the results from the returned JSON, you must handle this case in your own systems.
When the runs are being evaluated, the final value is taken from the last solution found before the run has been terminated. The run duration can be set as an option on the experiment or will be set by the executable binary used for the run.
All values in the indicator distributions are either numbers or strings. If they are strings, they are one of three values: nan, +inf, or -inf.
Field | Description |
---|---|
min | The minimum of the values returned from the runs for the statistic being evaluated. For example, if you are viewing the result.custom.used_vehicles indicator distribution and there were three runs, with one input file returning 50, another 40, and the other 60, the min value would be 40. |
max | The maximum of the values returned from the runs for the statistic being evaluated. |
count | The number of successful runs that have the specific indicator in their statistics output. |
mean | The average of the values returned from the runs for the statistic being evaluated. |
std | The standard deviation of the values returned from the runs for the statistic being evaluated. Uses a denominator of n−1 (see Corrected sample standard deviation). |
shifted_geometric_mean | The shifted geometric mean of the values returned from the runs for the statistic being evaluated. (The shift parameter is equal to 10.) |
percentiles | An object that contains the percentiles of the values returned from the runs for the statistic being evaluated. There are nine values that give the following percentiles: 1%, 5%, 10%, 25%, 50%, 75%, 90%, 95%, and 99%. |
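Because string values like nan, +inf, and -inf can appear alongside numbers, it helps to normalize them before doing any arithmetic on a distribution. The helper below is a minimal sketch; the distribution object is hypothetical and only shaped like the table above (in particular, the percentile key format shown is illustrative).

```python
# Normalize indicator distribution values, which are numbers or one of
# the strings "nan", "+inf", and "-inf", before doing arithmetic on them.
import math

_SPECIAL = {"nan": math.nan, "+inf": math.inf, "-inf": -math.inf}


def to_float(value):
    """Convert an indicator distribution entry to a float."""
    return _SPECIAL[value] if isinstance(value, str) else float(value)


# A hypothetical distribution shaped like the table above; the percentile
# key format here is illustrative only.
dist = {
    "min": 40,
    "max": 60,
    "count": 3,
    "mean": 50.0,
    "std": 10.0,
    "percentiles": {"50": 50, "90": 60},
}

print("range:", to_float(dist["min"]), "to", to_float(dist["max"]))
for p, value in dist["percentiles"].items():
    print(f"p{p}: {to_float(value)}")
```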
Experiment runs
Runs made for the experiment can be retrieved with the /applications/{application_id}/experiments/batch/{batch_id}/runs
endpoint. This will return the runs of the experiment in the following format:
Where runs
is an array that contains the runs made for the experiment. Each run object includes the run metadata plus any summary statistics that were specified in the executable binaries used for the experiment (see table below).
Experiment runs must be retrieved independently of the batch experiment details. In Console, however, this run history table can be viewed at the bottom of the experiment details. Each run history item can be clicked to view the details of the run. If the app is a routing app (using either the routing or nextroute template), the run details will also include a visualization of the results.
The run history data can also be downloaded as a CSV file in Console. Click the Download CSV link in the upper right area of the experiment run history table to download the data.
For more information on retrieving experiment runs, visit the Experiments API reference.
Note that experiment runs are not shown in your app’s run history.
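As a sketch of what retrieving experiment runs can look like in code, the snippet below calls the endpoint above with an API key. The path comes from this page; the base URL, auth header, and IDs are assumptions, so verify them against the Experiments API reference.

```python
# Retrieve the runs for a batch experiment via the endpoint shown above.
# The base URL, auth header, and the IDs below are assumptions/hypothetical.
import os

import requests

BASE_URL = "https://api.cloud.nextmv.io/v1"  # assumed base URL
APP_ID = "my-routing-app"
BATCH_ID = "new-constraint-test"

response = requests.get(
    f"{BASE_URL}/applications/{APP_ID}/experiments/batch/{BATCH_ID}/runs",
    headers={"Authorization": f"Bearer {os.environ['NEXTMV_API_KEY']}"},
    timeout=30,
)
response.raise_for_status()
runs = response.json()["runs"]
print(f"{len(runs)} runs in experiment {BATCH_ID}")
```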
Experiment run summary data
Run history objects can include the items specified in the table below.
Field | Description |
---|---|
id | The id for the run. This is generated automatically and cannot be changed. |
created_at | The date and time the run was created. |
application_id | The ID of the application in which the run was made. |
application_instance_id | The ID of the instance used for the run. |
application_version_id | The ID of the version used for the run. |
experiment_id | The ID of the experiment for which the run was made. |
input_id | The ID of the input used for the run. |
status | The status of the run. This can be running, succeeded, or failed. If the status is failed, no error will be shown in the run history data, but the details of the run will contain relevant error messages. The run details for an experiment run can be viewed in Console by clicking on the run ID in the experiment run history table or retrieved using the /applications/{application_id}/runs/{run_id} endpoint. |
| The statistics evaluated for the experiment and their values. In Console, the indicator values are shown in the experiment run history table only if they are present; no errors are displayed. |
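Since error messages only surface in the individual run details, one practical pattern is to scan the run history for failed runs and fetch each one's details from the runs endpoint mentioned above. The sketch below assumes the same base URL, auth header, and hypothetical IDs as the previous example; all are assumptions.

```python
# Fetch the details of every failed run in an experiment's run history,
# since errors only appear in the run details (see the status row above).
# Base URL, auth header, and application ID are assumptions/hypothetical.
import os

import requests

BASE_URL = "https://api.cloud.nextmv.io/v1"  # assumed base URL
APP_ID = "my-routing-app"
HEADERS = {"Authorization": f"Bearer {os.environ['NEXTMV_API_KEY']}"}


def failed_run_details(runs):
    """Yield the full details for each failed run in the experiment."""
    for run in runs:
        if run["status"] != "failed":
            continue
        detail = requests.get(
            f"{BASE_URL}/applications/{APP_ID}/runs/{run['id']}",
            headers=HEADERS,
            timeout=30,
        )
        detail.raise_for_status()
        yield detail.json()
```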