Launching an experiment
An optimization experiment is defined by the combination of:
- A component (see section registering a component) and its parametric space (the values the optimizer is allowed to choose from)
- An application: a file that can be launched through Slurm
- An optimization heuristic: an optimization algorithm, a number of exploration and exploitation steps, an optional pruning strategy and an optional noise reduction strategy. More details on black-box optimization are available here.
Launching an experiment through the Web interface
The preferred way of launching an optimization experiment is through the Web interface, by filling out the menu available at the path /create.
Experiment arguments
- The component and its parametric space: the component to optimize, chosen among the registered ones, together with its parametric space, defined for each parameter through a min, a max and a step value.
- The application: the application to optimize is written as a shell file, directly through the Web interface.
- Number of iterations: the number of exploration iterations, specified as an integer value.
Setting up the optimizer
- Pruning strategies: this option stops jobs that run longer than a given threshold, which is either the execution time of the default parametrization or the current median execution time.
- Noise reduction strategies: this option resamples each parametrization a certain number of times to smooth out possible noise. The resampling can be static (i.e. a fixed number of resamples) or dynamic (i.e. the parametrization is repeated until the 95% confidence interval is below a fixed threshold). The fitness aggregation function is the transformation applied to the target metric values measured for the same parametrization (it can be either the mean or the median).
- Initialization steps: this field specifies the number of parametrizations sampled using a Latin Hypercube Design before starting the optimization heuristic.
- Optimization heuristic: three heuristics (surrogate models, genetic algorithms and simulated annealing) can be chosen, along with their hyperparameters. For a detailed overview of the available heuristics and their hyperparametrization, refer to this section.
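The dynamic resampling rule can be illustrated with a minimal sketch. It is not SHAMan's implementation: it assumes that the criterion is the half-width of a normal-approximation 95% confidence interval of the mean (1.96 times the standard error), and the function names, the `max_resamples` cap and the threshold value are hypothetical.

```python
import statistics
from math import sqrt


def ci_half_width(samples, z=1.96):
    """Half-width of a normal-approximation 95% confidence interval of the mean."""
    if len(samples) < 2:
        return float("inf")  # the spread cannot be estimated from a single run
    return z * statistics.stdev(samples) / sqrt(len(samples))


def resample_until_stable(run, threshold, max_resamples=20,
                          aggregate=statistics.median):
    """Repeat run() until the interval is narrower than threshold or the cap is hit.

    Returns the aggregated fitness and the number of runs that were needed.
    """
    samples = []
    while len(samples) < max_resamples:
        samples.append(run())
        if ci_half_width(samples) < threshold:
            break
    return aggregate(samples), len(samples)


# Deterministic stand-in for noisy execution times, in seconds (made-up values).
fake_times = iter([10.0, 10.4, 9.8, 10.1, 10.0, 9.9])
fitness, n_runs = resample_until_stable(lambda: next(fake_times), threshold=0.3)
print(fitness, n_runs)
```

Static resampling corresponds to calling `run()` a fixed number of times and aggregating the results with the same function, without looking at the interval.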
Launching an experiment through the command line interface
The shaman-optimize command
Tip
Launching an experiment through the command line interface gives access to more options to tune the optimizer, but it is also riskier because the configuration is not checked before the experiment runs.
Warning
Before running the shaman-optimize command, you must source the environment file written during the installation step.
An optimization experiment can also be launched from the command line with the shaman-optimize command:
shaman-optimize --help
Usage: shaman-optimize [OPTIONS]

  Run an optimization experiment.

Options:
  --component-name TEXT      The name of the component to tune  [required]
  --nbr-iteration INTEGER    The maximal number of iterations to run the
                             experiment for.  [required]
  --sbatch-file TEXT         The path to the sbatch file  [required]
  --experiment-name TEXT     The name to give the experiment  [required]
  --configuration-file TEXT  The path to the configuration file  [required]
  --sbatch-dir TEXT          The directory to store the sbatch
  --slurm-dir TEXT           The directory to write the slurm outputs
  --result-file TEXT         The path to the result file.
  --help                     Show this message and exit.
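The command takes the following arguments:
- The component to optimize (--component-name): the name of the component to tune, among the registered ones.
- The application to optimize (--sbatch-file): a file that can be submitted with the sbatch command.
- The number of iterations (--nbr-iteration): the maximal number of iterations allocated to the experiment.
- Experiment name (--experiment-name): the name used to identify the experiment in the Web interface.
- Configuration file (--configuration-file): the configuration file of the experiment, see next section.
- Directory to store the sbatch (--sbatch-dir): an optional argument indicating where the sbatch file generated by SHAMan must be stored.
- Directory to store the slurm outputs (--slurm-dir): an optional argument indicating where to store the Slurm outputs.
- Result file (--result-file): an optional argument giving the path to the result file.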
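When experiments are launched from a script, it can help to assemble the command programmatically. The sketch below only uses the options listed in the help output above. The helper name `build_shaman_command` and all file, directory and experiment names are hypothetical placeholders.

```python
import shlex


def build_shaman_command(component, iterations, sbatch_file, name, config_file,
                         sbatch_dir=None, slurm_dir=None, result_file=None):
    """Assemble a shaman-optimize argument list from the documented options."""
    command = [
        "shaman-optimize",
        "--component-name", component,
        "--nbr-iteration", str(iterations),
        "--sbatch-file", sbatch_file,
        "--experiment-name", name,
        "--configuration-file", config_file,
    ]
    # Optional flags are appended only when a value was provided.
    optional = {
        "--sbatch-dir": sbatch_dir,
        "--slurm-dir": slurm_dir,
        "--result-file": result_file,
    }
    for flag, value in optional.items():
        if value is not None:
            command += [flag, value]
    return command


# Placeholder values: replace them with your own component, files and names.
command = build_shaman_command(
    "component_1", 20, "my_app.sbatch", "my_experiment", "config.yaml",
    slurm_dir="slurm_outputs",
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` from a shell in which the environment file written during installation has already been sourced.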
Writing a configuration file
SHAMan's configuration file is written in YAML and contains five sections:
- experiment: contains information about the experiment. Possible options:
    - default_first: set to True if the first parametrization tested by the optimizer must be the default one.
- bbo: parametrizes the optimizer. The possible options are the arguments taken by the optimizer. All the arguments described in the file are passed as kwargs to the BBOptimizer class (see section stand-alone optimization of the documentation).
- noise_reduction: parametrizes the noise reduction features of the optimizer. The possible arguments are the noise reduction strategies available in bbo, as well as their specific kwargs. For more details on how to use noise reduction with bbo, see section.
- pruning: parametrizes the pruning strategy of the optimizer (see section for more details). Possible options:
    - max_step_duration: the maximum elapsed time before a parametrization is stopped. Can either be:
        - A numpy estimator (for example, numpy.median, numpy.mean, etc.). Runs whose elapsed time exceeds the value of the estimator computed on the already tested execution times are interrupted.
        - A float value (for example, 5), which corresponds to an elapsed time in seconds.
        - The default string, which stops runs that take longer than the default parametrization. The option default_first in the experiment section must be set to True.
- components: the selected component and, for each of its parameters, either a parametric grid defined using the min, max and step format, or a list of possible values. In the case of a parametric grid, if step_type is set to multiplicative, the step is applied multiplicatively (each value is the previous one multiplied by step) instead of additively.
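The three forms accepted by max_step_duration can be summarized with a short sketch. It is an illustration under assumptions, not SHAMan's code: the function names are hypothetical, and `statistics.median` stands in for a numpy estimator so that the example runs without third-party packages.

```python
import statistics


def pruning_threshold(max_step_duration, elapsed_times, default_time=None):
    """Return the elapsed time (in seconds) above which a run would be stopped."""
    if callable(max_step_duration):
        # Estimator case: computed on the execution times already observed.
        return max_step_duration(elapsed_times) if elapsed_times else None
    if max_step_duration == "default":
        # Only meaningful if the default parametrization was run first.
        if default_time is None:
            raise ValueError("'default' pruning requires default_first: True")
        return default_time
    return float(max_step_duration)  # fixed duration in seconds


def should_prune(elapsed, threshold):
    """A run is interrupted once its elapsed time exceeds the threshold."""
    return threshold is not None and elapsed > threshold


history = [12.0, 9.0, 15.0]  # made-up execution times of already tested runs
threshold_median = pruning_threshold(statistics.median, history)
threshold_fixed = pruning_threshold(5, history)
threshold_default = pruning_threshold("default", history, default_time=12.0)
print(threshold_median, threshold_fixed, threshold_default)
```

The `ValueError` branch mirrors the constraint stated above: the default string cannot be used unless the default parametrization has been evaluated first.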
An example of a configuration file is written below:
experiment:
  default_first: True

bbo:
  heuristic: surrogate_model
  regression_model: sklearn.gaussian_process.GaussianProcessRegressor
  next_parameter_strategy: bbo.heuristics.surrogate_models.next_parameter_strategies.expected_improvement
  initial_sample_size: 10
  max_retry: 5
  reevaluate: false
  stop_criterion: improvement_criterion
  stop_window: 5
  improvement_estimator: numpy.min
  improvement_threshold: 0.01

components:
  component_1:
    param_1:
      min: 1
      max: 20
      step: 1
    param_2:
      min: 2
      max: 16
      step: 2
      step_type: multiplicative
    param_3:
      - value_1
      - value_3
This example configuration file launches an experiment which:
- Runs the default parametrization first
- Uses Bayesian Optimization with expected improvement as acquisition function
- Tunes component_1, where param_1 can take any value from 1 to 20 in steps of 1, param_2 can take the successive powers of 2 from 2 to 16, and param_3 can be either 'value_1' or 'value_3'.
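The parametric space described by the components section of the example can be enumerated with a few lines of Python. This is a minimal sketch that assumes the max bound is inclusive and that a multiplicative step multiplies the previous value. The function name `expand_parameter` is hypothetical and the dictionary mirrors the example configuration.

```python
def expand_parameter(spec):
    """Return the list of candidate values for one parameter specification."""
    if isinstance(spec, list):
        return list(spec)  # explicit list of possible values
    step = spec["step"]
    multiplicative = spec.get("step_type") == "multiplicative"
    # Guard against specifications that would never reach the max bound.
    if multiplicative and (step <= 1 or spec["min"] <= 0):
        raise ValueError("multiplicative grids need step > 1 and min > 0")
    if not multiplicative and step <= 0:
        raise ValueError("additive grids need step > 0")
    values, current = [], spec["min"]
    while current <= spec["max"]:  # assumption: max is inclusive
        values.append(current)
        current = current * step if multiplicative else current + step
    return values


component_1 = {
    "param_1": {"min": 1, "max": 20, "step": 1},
    "param_2": {"min": 2, "max": 16, "step": 2, "step_type": "multiplicative"},
    "param_3": ["value_1", "value_3"],
}
grid = {name: expand_parameter(spec) for name, spec in component_1.items()}

# Size of the full parametric space: product of the number of values per parameter.
space_size = 1
for values in grid.values():
    space_size *= len(values)
print(grid["param_2"], space_size)
```

Under these assumptions the example defines 20 x 4 x 2 = 160 possible parametrizations, which gives an idea of how the number of iterations compares with the size of the space.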
Visualizing an experiment
Once the experiment has been launched through either of the two methods described above, it can be visualized in the Web interface, where different statistics are displayed. If the experiment is still running, its progress is displayed in real time.