SHAC Engine¶
The core algorithm from the paper Parallel Architecture and Hyperparameter Search via Successive Halving and Classification.
This engine provides an interface similar to Scikit-Learn, with two methods, `fit` and `predict`, which perform training and parameter sampling respectively.
The data for the classifiers is generated via rejection sampling, using Joblib for parallel batch generation and evaluation. It also supports halted training, so that training can resume from the previous stage without any issue.
Two important user inputs to this class are:

- Evaluation function
- Hyper parameter list
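The sketch below ties these two inputs together. It is a minimal, illustrative example only: the module alias `hp`, the parameter class `UniformContinuousHyperParameter`, the parameter `x`, and the toy objective are assumptions, not prescriptions.

```python
import numpy as np
import pyshac.config.hyperparameters as hp
from pyshac.core.engine import SHAC

# Hypothetical evaluation function: receives the worker id and an
# OrderedDict of sampled values, and returns a float to be maximized.
def eval_fn(worker_id, params):
    x = params['x']
    return float(-np.square(x - 2.0))  # optimum at x == 2

# Illustrative search space with a single continuous parameter.
params = [hp.UniformContinuousHyperParameter('x', -5.0, 5.0)]

shac = SHAC(params, total_budget=100, num_batches=10, objective='max')
shac.fit(eval_fn)

# Sample parameters that pass the trained cascade of classifiers.
samples = shac.predict(num_samples=5)
```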
Note¶
The engine will generate samples at each epoch of training and supply them to the evaluation function, which will also be executed in parallel. Therefore, all stages of sample generation, training, and evaluation must be thread safe.
While the training algorithm can manage thread safety of the generator and training modules, the evaluation module is for the user to write.
Therefore, we provide wrappers such as `TensorflowSHAC`, `KerasSHAC` and `TorchSHAC` to allow safer evaluation. However, this comes at a significant execution cost, as a new TensorFlow Graph (and likewise, a new Keras Session) must be built for each evaluator thread.
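For the generic engine, the simplest way to keep an evaluation function thread safe is to keep all of its state local to the call. A minimal sketch (the toy model below is illustrative only):

```python
import numpy as np

def eval_fn(worker_id, params):
    # All state is local to this call: no globals, no shared sessions,
    # no shared files, so concurrently running workers cannot interfere.
    rng = np.random.RandomState(worker_id)
    weights = rng.uniform(-1.0, 1.0, size=8)  # hypothetical local model state
    loss = float(np.sum(np.square(weights)) * params['x'])
    return loss
```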
Serial Evaluation
If parallel evaluation is not preferred, please refer to the Serial Evaluation page.
Class Information¶
SHAC¶
pyshac.core.engine.SHAC(hyperparameter_list, total_budget, num_batches, objective='max', max_classifiers=18, save_dir='shac')
The default and generic implementation of the SHAC algorithm. It is a wrapper over the abstract class, and performs no additional maintenance over the evaluation function.
It is the fastest engine, but assumes that the evaluation function is thread safe and that the system has sufficient memory to run several copies of the evaluation function at the same time.
This engine is suitable for NumPy / PyTorch based work. Both NumPy and PyTorch allocate memory dynamically, so as long as the system has enough memory to run multiple copies of the evaluation model, there is no additional memory management to be done.
Still, for PyTorch, the number of evaluation processes should be set carefully, so as not to exhaust all CPU / GPU memory during execution.
Arguments:
- hyperparameter_list (hp.HyperParameterList | None): A list of parameters (or a HyperParameterList) that defines the search space. Can be None, in which case it is loaded from disk instead.
- total_budget (int): `N`. Defines the total number of models to evaluate.
- num_batches (int): `M`. Defines the number of batches the work is distributed across. Must be set such that `total budget` is divisible by `batch size`.
- objective (str): Can be `max` or `min`. Whether to maximize or minimize the evaluation measure.
- max_classifiers (int): Maximum number of classifiers that are trained. The default (18) follows the paper.
- save_dir (str): The base directory where the data of the engine will be stored.
References:
- [Parallel Architecture and Hyperparameter Search via Successive Halving and Classification](https://arxiv.org/abs/1805.10255)
Raises:
- ValueError: If `total budget` is not divisible by `batch size`.
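A constructor sketch illustrating the divisibility requirement (`params` stands in for a previously built hyper parameter list):

```python
from pyshac.core.engine import SHAC

# total_budget (N) must be divisible by num_batches (M): here
# N = 100 and M = 10, so each batch evaluates N // M = 10 models.
shac = SHAC(params, total_budget=100, num_batches=10,
            objective='max', max_classifiers=18, save_dir='shac')

# total_budget=100 with num_batches=7 would raise a ValueError.
```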
SHAC methods¶
fit¶
fit(eval_fn, skip_cv_checks=False, early_stop=False, relax_checks=False, callbacks=None)
Generates batches of samples, trains `total_classifiers` XGBoost models, and evaluates each batch in parallel with the supplied function.
Allows manually changing the number of processes that are used to generate samples or to evaluate them. While the defaults generally work well, further performance gains can be had by trying different values according to the limits of the system.
```python
>>> eval = lambda id, params: np.exp(params['x'])
>>> shac = SHAC(params, total_budget=100, num_batches=10)
>>> shac.set_num_parallel_generators(20)  # change the number of generator processes
>>> shac.set_num_parallel_evaluators(1)  # change the number of evaluator processes
>>> shac.generator_backend = 'multiprocessing'  # change the backend for the generator (default is `multiprocessing`)
>>> shac.concurrent_evaluators()  # change the backend of the evaluator to use `threading`
```
The engine adapts its behaviour to the current epoch, since later epochs require far more samples to generate a single batch. When the epoch number increases beyond 10, it doubles the number of generator processes.
This adaptivity can be removed by setting the parameter `limit_memory` to True.
```python
>>> shac.limit_memory = True
```
Arguments:
- eval_fn ((int, OrderedDict) -> float): The evaluation function is passed the integer id (of the worker) and the sampled hyper parameters in an OrderedDict, and is expected to return a Python floating point number representing the evaluated value.
- skip_cv_checks (bool): If set, will not perform a 5-fold cross validation check on the models before adding them to the classifier list. Useful when the batch size is small.
- early_stop (bool): Stop running if the engine fails to find a classifier that beats the last stage of evaluations.
- relax_checks (bool): If set, will allow samples which do not pass all of the checks from all classifiers. Can be useful when a large number of classifiers are present and the remaining search space is not large enough to allow samples to pass through all checks.
- callbacks (list | None): Optional list of callbacks that are executed when the engine is being trained. A `History` callback is automatically added for all calls to `fit`.
Returns:
A `History` object which tracks all the important information during training, and can be accessed as a dictionary via `history.history`.
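A sketch of capturing and inspecting the returned history; the exact record keys depend on the engine and the callbacks, so none are assumed here:

```python
history = shac.fit(eval_fn)

# `history.history` behaves like a dictionary of training records.
for key, values in history.history.items():
    print(key, values)
```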
fit_dataset¶
fit_dataset(dataset_path, skip_cv_checks=False, early_stop=False, presort=True, callbacks=None)
Uses the provided dataset file to train the engine, instead of running the successive halving and classification algorithm directly. The data provided in the path must strictly follow the format of the dataset maintained by the engine.
Standard format of datasets:
Each dataset csv file must contain an integer id column named "id" as its first column, followed by several columns describing the values taken by the hyper parameters; the final column must hold the objective criterion and must be named "scores".
The csv file must contain a header, following the above format.
Example:
```
id,hp1,hp2,scores
0,1,h1,1.0
1,1,h2,0.2
2,0,h1,0.0
3,0,h3,0.5
...
```
Arguments:
- dataset_path (str): The full or relative path to a csv file containing the values of the dataset.
- skip_cv_checks (bool): If set, will not perform a 5-fold cross validation check on the models before adding them to the classifier list. Useful when the batch size is small.
- early_stop (bool): Stop running if the engine fails to find a classifier that beats the last stage of evaluations.
- presort (bool): Boolean flag to determine whether to sort the values of the dataset prior to loading. Ascending or descending sort is selected based on whether the engine is maximizing or minimizing the objective. It is preferable to always set this, as it helps train better classifiers.
- callbacks (list | None): Optional list of callbacks that are executed when the engine is being trained. A `History` callback is automatically added for all calls to `fit_dataset`.
Raises:
- ValueError: If the number of hyper parameters in the file is not the same as the number of hyper parameters that are available to the engine, or if the number of samples in the provided dataset is less than the number of samples required by the engine.
- FileNotFoundError: If the dataset is not available at the provided filepath.
Returns:
A `History` object which tracks all the important information during training, and can be accessed as a dictionary via `history.history`.
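A sketch of training from a recorded dataset. The path `dataset.csv` and the column values are illustrative; the engine must define matching hyper parameters, and the file must hold at least the number of samples the engine requires:

```python
import csv

# Write a small dataset in the standard format: "id" first, the hyper
# parameter columns in the middle, and "scores" as the final column.
with open('dataset.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['id', 'hp1', 'hp2', 'scores'])
    writer.writerows([(0, 1, 'h1', 1.0), (1, 1, 'h2', 0.2), (2, 0, 'h1', 0.0)])

# Train the engine directly from the recorded evaluations.
history = shac.fit_dataset('dataset.csv', presort=True)
```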
predict¶
predict(num_samples=None, num_batches=None, num_workers_per_batch=None, relax_checks=False, max_classfiers=None)
Using trained classifiers, sample the search space and predict which samples can successfully pass through the cascade of classifiers.
When using the full cascade of 18 classifiers, it can take a vast amount of time to generate even a single sample.
Sample mode vs Batch mode
Parameters can be generated in sample mode, batch mode, or any combination of the two.
`num_samples` is on a per-sample basis (1 sample generated per count). Can be None or an int >= 0.
`num_batches` is on a per-batch basis (M samples generated per count). Can be None or an int >= 0.
The two are combined to produce a total number of samples which are provided in a list.
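For example, with `num_batches=10` at construction (so M = 10), combining both modes yields 5 + 2 * 10 = 25 samples:

```python
# 5 individual samples plus 2 full batches of M = 10 samples each.
samples = shac.predict(num_samples=5, num_batches=2)  # 25 samples in total
```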
Arguments:
- num_samples (None | int): Number of samples to be generated.
- num_batches (None | int): Number of batches of samples to be generated.
- num_workers_per_batch (int): Determines how many parallel threads / processes are created to generate the batches. For small batches, it is best to use 1. If left as `None`, defaults to `num_parallel_generators`.
- relax_checks (bool): If set, will allow samples which do not pass all of the checks from all classifiers. Can be useful when a large number of classifiers are present and the remaining search space is not large enough to allow samples to pass through all checks.
- max_classfiers (int | None): Number of classifiers to use for sampling. If set to None, will use all classifiers.
Raises:
- ValueError: If `max_classfiers` is larger than the number of available classifiers.
- RuntimeError: If classifiers are not available, either due to not being trained or not being loaded into the engine.
Returns:
Batches of samples in the form of an OrderedDict.
save_data¶
save_data()
Serializes the class objects by saving the dataset and the trained models.
restore_data¶
restore_data()
Recovers the serialized class objects by loading the dataset and the trained models from the default save directories.
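A sketch of resuming halted training from a previous run's save directory; `eval_fn` is the same evaluation function used before:

```python
from pyshac.core.engine import SHAC

# The hyper parameter list may be None, since it is restored from disk.
shac = SHAC(None, total_budget=100, num_batches=10, save_dir='shac')

# Load the dataset and trained classifiers from the previous run,
# then continue training from the last completed stage.
shac.restore_data()
shac.fit(eval_fn)
```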
set_num_parallel_generators¶
set_num_parallel_generators(n)
Checks and sets the number of parallel generators. If None, checks whether the number of workers exceeds the number of virtual cores; if it does, it warns and caps the maximum parallel generators at the number of cores.
Arguments:
- n (int | None): The number of parallel generators required.
set_num_parallel_evaluators¶
set_num_parallel_evaluators(n)
Checks and sets the number of parallel evaluators. If None, checks whether the number of workers exceeds the number of virtual cores; if it does, it warns and caps the maximum parallel evaluators at the number of cores.
Arguments:
- n (int | None): The number of parallel evaluators required.
parallel_evaluators¶
parallel_evaluators()
Sets the evaluators to use the `loky` backend.
The user must take responsibility for thread safety and memory management.
concurrent_evaluators¶
concurrent_evaluators()
Sets the evaluators to use the threading backend, and therefore be locked by Python's GIL.
While technically still "parallel", it is in fact concurrent execution rather than parallel execution of the evaluators.
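Switching between the two evaluator backends (a sketch):

```python
# Process based (loky) evaluators: true parallelism, but the user is
# responsible for thread safety and memory headroom.
shac.parallel_evaluators()

# Thread based evaluators: concurrency bound by the GIL, useful when
# the evaluation function cannot safely run in separate processes.
shac.concurrent_evaluators()
```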
set_seed¶
set_seed(seed)
Sets the seed of the parameters and the engine.
Arguments:
- seed (int | None): Seed value of the random state.
as_deterministic¶
as_deterministic(seed=None)
Context manager that sets the seed of the parameters and the engine only inside the context block.
Arguments:
- seed (int | None): Seed value of the random state.
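A sketch of reproducible sampling inside the context block (the seed value is illustrative):

```python
# Inside the block, sampling is reproducible for the given seed;
# the engine's previous random state applies again outside it.
with shac.as_deterministic(0):
    samples_a = shac.predict(num_samples=5)

with shac.as_deterministic(0):
    samples_b = shac.predict(num_samples=5)  # same samples as samples_a
```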