Experiments

Batch Experiments

Reference for creating and analyzing the results of batch experiments.

To try out batch experiments with Nextmv, you must have a Nextmv account and be signed up for a free trial or on an active plan. For questions, please contact Nextmv support or start a free trial.

This page assumes you have read the Experiments Core Concept page.

Batch experiments are used to analyze the output from one or more decision models on a fixed set of inputs. They are generally used as an exploratory test to understand the impact on business metrics (or KPIs) when updating a model with a new feature, such as an additional constraint. They can also be used to validate that a model is ready for further testing and likely to make an intended business impact.

Creating a batch experiment

Batch experiments can be created from Nextmv CLI or Nextmv Console. You can create a batch experiment without specifying an input set or instances, but it is best to be explicit when defining which models to test and which input set to use. The required entities to start a batch experiment are described in the table below.

Name & description: These are for your reference only and have no bearing on the experiment. The name is required; the description is optional.
ID: The ID may only include lowercase letters, numbers, periods, or hyphens; must be between 3 and 30 characters; and cannot start or end with a hyphen or period. The ID must also be unique within the app that contains the experiment.
Input set: An input set must be specified when running a batch experiment. An input set is a collection of inputs; when you run the batch experiment, each input file in the input set is run on the executable binaries specified by the instances in the experiment and the results are compared.
Instances: At least one instance must be specified when running a batch experiment. Pay attention to the version that is set for each instance, because it is the version's executable binary that will be used to run the inputs in the input set.

Using Nextmv CLI

You can create a batch experiment using the Nextmv CLI:

nextmv experiment batch start \
  --app-id "pizza-delivery" \
  --experiment-id "pizza-delivery-region-compare" \
  --name "Pizza Delivery Region Compare" \
  --description "Compares NYC and Philly region models" \
  --input-set-id "sample-day-runs" \
  --instance-ids "nyc-pizza-region,philly-pizza-region"

After executing the command, you will be shown a summary of how many runs will be performed for the experiment and asked whether you want to continue. Entering y returns the experiment ID and starts the batch experiment. The results of the batch experiment can be viewed with Nextmv CLI or in Nextmv Console.

For a full reference on using Nextmv CLI commands for managing batch experiments visit the CLI reference page.

Using Nextmv Console

Navigate to the Experiments section in your app (it will land on Batch as the default view) and then click on the New Experiment button. Fill in the fields and then click Create & run experiment.

You will be returned to the list of experiments, where you can click on the newly created experiment to view the results. Note that for large experiments, you may need to check back later to view the results.

Viewing a batch experiment

The results of a batch experiment can be retrieved via Nextmv CLI or viewed in Nextmv Console. Nextmv CLI returns the raw data in JSON format and is useful if you would like to perform your own operations on the experiment results.

Nextmv Console displays the results grouped by indicator key; for each one it includes a summary table, the percentile values, and a box plot chart displaying summary values from the compared versions. To view the batch experiment result in Console, navigate to your app and then click on its Experiments section. You can type the name of your batch experiment in the filter box at the top of the experiment list or scroll to find it. Click on the experiment name and the experiment details view will load.

Structure of result

The result of a batch experiment is returned as JSON. If you’re viewing the result in Console, the JSON is parsed and displayed in a web view; Nextmv CLI returns the complete JSON response. The response contains the batch experiment metadata and the results in the form of grouped distributional summaries.

Top-level properties

The table below summarizes the top-level properties in the return.

id: The batch experiment ID that was specified.
name: The batch experiment name that was specified.
description: The batch experiment description that was specified. If no description was specified, this field will not exist.
status: The status of the experiment. The status can be: started, completed, or failed.
created_at: The date the experiment was created and started.
input_set_id: The input set ID specified for the batch experiment.
instance_ids: An array of the instance IDs that were specified for the batch experiment.
grouped_distributional_summaries: An array that contains the results of the batch experiment. It is a collection of results calculated from the individual runs, grouped in certain ways.
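
For orientation, here is a trimmed, illustrative sketch of the shape of a batch experiment response (not an exact response). The IDs echo the CLI example above and the summaries array is elided:

{
  "id": "pizza-delivery-region-compare",
  "name": "Pizza Delivery Region Compare",
  "description": "Compares NYC and Philly region models",
  "status": "completed",
  "created_at": "2023-08-21T21:09:41Z",
  "input_set_id": "sample-day-runs",
  "instance_ids": ["nyc-pizza-region", "philly-pizza-region"],
  "grouped_distributional_summaries": [
    ...
  ]
}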

Grouped distributional summaries

There are three types of summaries included in the grouped_distributional_summaries array (there could be more in the future):

  1. Version (instance)
  2. Version (instance) + input
  3. Input

Each type is included for every experiment. However, note that if you are viewing the experiment in Console, ONLY the version summaries are displayed. In the future, Console will display all types of summaries.

No matter the type, each grouped distributional summary includes the following:

group_keys: Describes the type of grouped distributional summary, which is one of three options:
  • instanceID, versionID (the version summary)
  • inputID, instanceID, versionID (the version & input summary)
  • inputID (the input summary)

group_values: The values that correspond to the group_keys. For example, if the group keys are instanceID and versionID, the group_values are the ID of the instance and the ID of the version.

indicator_keys: The statistics that are being evaluated by the batch experiment. If you're using the nextroute template, six statistics are automatically added for evaluation:

  • result.value
  • result.elapsed
  • metadata.duration
  • result.custom.routing.stops.unassigned
  • result.custom.routing.stops.assigned
  • result.custom.used_vehicles

Custom statistics can be specified by adding them to the output JSON's top-level statistics block. See the Add custom statistics section of the vehicle routing input and output schema reference.

indicator_distributions: An object that contains the values from the analysis for each indicator. If there are six indicator keys, for example, indicator_distributions will contain six properties; each property key corresponds to a value in the indicator_keys array, and each property value is an object with the matching data (see the Indicator distributions section below).

number_of_runs_total: The number of runs that were analyzed for this particular summary. For example, if you ran an experiment with two instances and an input set containing three inputs, the version summary (instanceID + versionID) has a run total of three, because all three input files are run on that version. The version & input summary (instanceID + versionID + inputID) has one run, because that one input file is run on that version. And the input summary (inputID) has two runs, because that input file is run on the two instances.
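
As a rough illustration only (the exact field layout may differ, and all values here are invented), a single version summary entry could look something like this:

{
  "group_keys": ["instanceID", "versionID"],
  "group_values": ["nyc-pizza-region", "nyc-v1.2.13"],
  "indicator_keys": ["result.value", "result.elapsed"],
  "indicator_distributions": {
    "result.value": { "min": 2175064, "max": 2191230, "count": 3, "mean": 2183147, "std": 8083, "shifted_geometric_mean": 2183146.9, "percentiles": { ... } },
    "result.elapsed": { ... }
  },
  "number_of_runs_total": 3
}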

Indicator distributions

Each object property value in the indicator_distributions contains the values in the table below. Note that for some runs, certain values may be missing (a custom statistic for example). If you’re viewing the results in Console and a grouped distributional summary is missing values, a warning message will appear. If you are analyzing the results from the returned JSON, you must handle this check in your own systems.

When the runs are being evaluated, the final value is taken from the last solution found before the run was terminated. The run duration can be set as an option on the experiment or will be set by the executable binary used for the run.

All values in the indicator distributions are either numbers or strings. If they are strings, they are one of three string values: nan, +inf, or -inf.

min: The minimum of the values returned from the runs for the statistic being evaluated. For example, if you are viewing the result.custom.used_vehicles indicator distribution and there were three runs, with one input file returning 50, another 40, and the other 60, the min value would be 40.
max: The maximum of the values returned from the runs for the statistic being evaluated.
count: The number of successful runs that have the specific indicator in their statistics output.
mean: The average of the values returned from the runs for the statistic being evaluated.
std: The standard deviation of the values returned from the runs for the statistic being evaluated. Uses a denominator of n−1 (see Corrected sample standard deviation).
shifted_geometric_mean: The shifted geometric mean of the values returned from the runs for the statistic being evaluated. (The shift parameter is equal to 10.)
percentiles: An object that contains the percentiles of the values returned from the runs for the statistic being evaluated. There are nine values, giving the following percentiles: 1%, 5%, 10%, 25%, 50%, 75%, 90%, 95%, and 99%.
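
If you want to sanity-check these summary values against your own run data, a small Python sketch like the following can reproduce them. The percentile interpolation and the shifted geometric mean formula used here (exp of the mean of log(x + shift), minus the shift) are assumptions about the exact conventions; the documented facts are only the n−1 standard deviation and the shift value of 10.

import math
import statistics

# Example run values for one indicator (hypothetical data).
values = [40.0, 50.0, 60.0]

def shifted_geometric_mean(xs, shift=10.0):
    # exp(mean(log(x + shift))) - shift; assumed convention, shift = 10 per the docs.
    return math.exp(sum(math.log(x + shift) for x in xs) / len(xs)) - shift

def percentile(xs, q):
    # Simple linear-interpolation percentile (one of several common conventions).
    xs = sorted(xs)
    pos = (len(xs) - 1) * q
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

summary = {
    "min": min(values),
    "max": max(values),
    "count": len(values),
    "mean": statistics.mean(values),
    "std": statistics.stdev(values),  # sample standard deviation, denominator n - 1
    "shifted_geometric_mean": shifted_geometric_mean(values),
    "percentiles": {f"{int(q * 100)}%": percentile(values, q)
                    for q in (0.01, 0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99)},
}
print(summary)

Remember from above that some indicator values may come back as the strings nan, +inf, or -inf; filter or convert those before feeding the values into a calculation like this.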

Experiment runs

Runs made for the experiment can be retrieved with the /applications/{application_id}/experiments/batch/{batch_id}/runs endpoint. This will return the runs of the experiment in the following format:

{
  "runs": [
    {
        "id": "staging-dehnyogSg",
        "input_id": "staging-wZ8xcIQ4R",
        "created_at": "2023-08-21T21:09:41.48243386Z",
        "application_id": "pizza-delivery",
        "experiment_id": "pizza-delivery-region-compare",
        "application_instance_id": "nyc-pizza-region",
        "application_version_id": "nyc-v1.2.13",
        "status": "succeeded",
        "statistics": {
          "status": "succeeded",
          "indicators": [
            {
              "name": "result.value",
              "value": 2175064,
            },
            ...
          ]
        }
    },
    ...
  ]
}

Where runs is an array that contains the runs made for the experiment. Each run object includes the run metadata plus any summary statistics that were specified in the executable binaries used for the experiment (see table below).

Experiment runs must be retrieved independent of the batch experiment details. However, in Console this run history table can be viewed at the bottom of the experiment details. Each run history item can be clicked to view the details of the run. If the app is a routing app (using either the routing or nextroute template) the run details will also include a visualization of the results.

The run history data can also be downloaded as a CSV file in Console. Click the Download CSV link in the upper right area of the experiment run history table to download the data as a CSV file.

For more information on retrieving experiment runs, visit the Experiments API reference.
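
As a sketch of how the runs endpoint above could be called outside of Console or the CLI, here is a minimal Python example. The base URL, API version prefix, and Bearer-token header are assumptions; confirm them against the Experiments API reference.

import os
import requests

app_id = "pizza-delivery"                   # hypothetical application ID
batch_id = "pizza-delivery-region-compare"  # hypothetical experiment ID

# Assumed base URL and auth scheme; verify against the API reference.
url = f"https://api.cloud.nextmv.io/v1/applications/{app_id}/experiments/batch/{batch_id}/runs"
headers = {"Authorization": f"Bearer {os.environ['NEXTMV_API_KEY']}"}

response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()
runs = response.json()["runs"]
print(f"{len(runs)} experiment runs returned")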

Note that experiment runs are not shown in your app’s run history.

Experiment run summary data

Run history objects can include the items specified in the table below.

id: The ID for the run. This is generated automatically and cannot be changed.
created_at: The date and time the run was created.
application_id: The ID of the application in which the run was made.
application_instance_id: The ID of the instance used for the run.
application_version_id: The ID of the version used for the run.
experiment_id: The ID of the experiment for which the run was made.
input_id: The ID of the input used for the run.
status: The status of the run. This can be either running, succeeded, or failed. If the status is failed, no error will be shown in the run history data, but the details of the run will contain relevant error messages. The run details for an experiment run can be viewed in Console by clicking on the run ID in the experiment run history table or retrieved using the /applications/{application_id}/runs/{run_id} endpoint.

statistics

The statistics evaluated for the experiment and their values. The statistics field contains the following fields:

  • status
  • error
  • indicators

The status of the statistics can be none, pending, succeeded, or failed. If it's failed, then indicators is not present and error contains the error message. If the status is succeeded, then indicators contains the values and error is not present. If the status is none or pending, then both error and indicators are not present.

The indicators field contains an array of the statistic indicator evaluated for the batch experiment in the format of { name: <indicator_name>, value: <indicator_value> }.
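
Given these rules, statistics can be read defensively with a small helper like the one below (a sketch; it assumes runs were fetched as in the earlier example):

def indicator_values(run):
    """Return a {name: value} dict for one run, or None if indicators are unavailable."""
    stats = run.get("statistics", {})
    status = stats.get("status")
    if status == "failed":
        # indicators is absent; error holds the message.
        print(f"run {run['id']}: statistics failed: {stats.get('error')}")
        return None
    if status in (None, "none", "pending"):
        # Neither error nor indicators is present yet.
        return None
    # status == "succeeded": indicators is a list of {"name": ..., "value": ...} objects.
    return {ind["name"]: ind["value"] for ind in stats["indicators"]}

# Example usage with the runs list from the earlier snippet:
# values_per_run = [indicator_values(run) for run in runs]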

In Console, the indicator values are shown in the experiment run history table only if they are present. No errors are displayed.
