SHAC Managed Engine for PyTorch¶
Provides a managed engine for PyTorch workflows when using SHAC.
If parallel evaluation is not preferred, please refer to the Serial Evaluation page.
Class Information¶
TorchSHAC¶
pyshac.core.managed.torch_engine.TorchSHAC(hyperparameter_list, total_budget, num_batches, max_gpu_evaluators, objective='max', max_classifiers=18, max_cpu_evaluators=1, save_dir='shac')
SHAC Engine specifically built for PyTorch when using CPUs/GPUs.
This engine is used primarily for its management of how many evaluation processes are used, and to provide a unified interface similar to TensorflowSHAC in determining the number of GPUs and CPU cores used.
Since PyTorch allocates memory to the graph dynamically, graph maintenance is unnecessary. As long as the system has enough memory to run multiple copies of the evaluation model, no additional work is required from the user inside the evaluation function.
Arguments:
- hyperparameter_list (hp.HyperParameterList | None): A list of parameters (or a HyperParameterList) that are passed to define the search space. Can be None, so that it is loaded from disk instead.
- total_budget (int): `N`. Defines the total number of models to evaluate.
- num_batches (int): `M`. Defines the number of batches the work is distributed to. Must be set such that `total budget` is divisible by `batch size`.
- max_gpu_evaluators (int): Number of GPUs. Can be 0 or more. Decides the number of GPUs used to evaluate models in parallel.
- objective (str): Can be `max` or `min`. Whether to maximize the evaluation measure or minimize it.
- max_classifiers (int): Maximum number of classifiers that are trained. The default (18) follows the paper.
- max_cpu_evaluators (int): Positive integer > 0, or -1. Sets the number of parallel evaluation function calls executed simultaneously. Set this to 1 unless you have enough memory for 2 or more models to be trained simultaneously. If set to -1, uses all CPU cores to evaluate N models simultaneously. Will cause OOM if the models are large.
- save_dir (str): The base directory where the data of the engine will be stored.
References:
- Parallel Architecture and Hyperparameter Search via Successive Halving and Classification
Raises:
- ValueError: If `total budget` is not divisible by `batch size`.
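For illustration, a minimal construction might look as follows (the hyper parameter class and its import path are assumptions based on pyshac's parameter API, not shown on this page):
>>> from pyshac.config.hyperparameters import UniformContinuousHyperParameter  # assumed import path
>>> from pyshac.core.managed.torch_engine import TorchSHAC
>>> params = [UniformContinuousHyperParameter('x', -5.0, 5.0)]  # 1-parameter search space
>>> # total_budget (100) must be divisible by the batch size (100 / 10 = 10)
>>> shac = TorchSHAC(params, total_budget=100, num_batches=10, max_gpu_evaluators=0, objective='min')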
TorchSHAC methods¶
fit¶
fit(eval_fn, skip_cv_checks=False, early_stop=False, relax_checks=False, callbacks=None)
Generates batches of samples, trains `total_classifiers` XGBoost models, and evaluates each batch with the supplied function in parallel.
Allows manually changing the number of processes that are used to generate samples or to evaluate them. While the defaults generally work well, further performance gains can be had by trying different values according to the limits of the system.
>>> eval = lambda id, params: np.exp(params['x'])
>>> shac = TorchSHAC(params, total_budget=100, num_batches=10)
>>> shac.set_num_parallel_generators(20)  # change the number of generator processes
>>> shac.set_num_parallel_evaluators(1)  # change the number of evaluator processes
>>> shac.generator_backend = 'multiprocessing'  # change the backend for the generator (default is `multiprocessing`)
>>> shac.concurrent_evaluators()  # change the backend of the evaluator to use `threading`
Has adaptive behaviour based on the current epoch, since later epochs require far more samples to generate a single batch. When the epoch number increases beyond 10, the number of generator processes is doubled. This adaptivity can be removed by setting the parameter `limit_memory` to True.
>>> shac.limit_memory = True
Arguments:
- eval_fn ((int, OrderedDict) -> float): The evaluation function is passed only the integer id (of the worker) and the sampled hyper parameters in an OrderedDict. It is expected to return a python floating point number representing the evaluated value.
- skip_cv_checks (bool): If set, will not perform the 5-fold cross validation check on the models before adding them to the classifier list. Useful when the batch size is small.
- early_stop (bool): Stop running if it fails to find a classifier that beats the last stage of evaluations.
- relax_checks (bool): If set, will allow samples which do not pass all of the checks from all classifiers. Can be useful when a large number of models are present and the remaining search space is not big enough to allow a sample to pass through all checks.
- callbacks (list | None): Optional list of callbacks that are executed when the engine is being trained. A `History` callback is automatically added for all calls to `fit`.
Returns:
A `History` object which tracks all the important information during training, and can be accessed using `history.history` as a dictionary.
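Putting it together, a full call might look like this (the evaluation function below is illustrative and assumes the `'x'` parameter from the earlier construction example):
>>> import numpy as np
>>> eval_fn = lambda worker_id, params: float(np.exp(params['x']))  # params is an OrderedDict
>>> history = shac.fit(eval_fn)
>>> print(history.history)  # dictionary of values tracked during training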
fit_dataset¶
fit_dataset(dataset_path, skip_cv_checks=False, early_stop=False, presort=True, callbacks=None)
Uses the provided dataset file to train the engine, instead of using the sequential halving and classification algorithm directly. The data provided in the path must strictly follow the format of the dataset maintained by the engine.
Standard format of datasets:
Each dataset csv file must contain an integer id column named "id" as its 1st column, followed by several columns describing the values taken by the hyper parameters, and the final column must be for the objective criterion and must be named "scores".
The csv file must contain a header, following the above format.
Example:
id,hp1,hp2,scores
0,1,h1,1.0
1,1,h2,0.2
2,0,h1,0.0
3,0,h3,0.5
...
Arguments:
- dataset_path (str): The full or relative path to a csv file containing the values of the dataset.
- skip_cv_checks (bool): If set, will not perform the 5-fold cross validation check on the models before adding them to the classifier list. Useful when the batch size is small.
- early_stop (bool): Stop running if it fails to find a classifier that beats the last stage of evaluations.
- presort (bool): Boolean flag to determine whether to sort the values of the dataset prior to loading. Ascending or descending sort is selected based on whether the engine is maximizing or minimizing the objective. It is preferable to always set this, to train better classifiers.
- callbacks (list | None): Optional list of callbacks that are executed when the engine is being trained. A `History` callback is automatically added for all calls to `fit_dataset`.
Raises:
- ValueError: If the number of hyper parameters in the file is not the same as the number of hyper parameters available to the engine, or if the number of samples in the provided dataset is less than the number of samples required by the engine.
- FileNotFoundError: If the dataset is not available at the provided filepath.
Returns:
A `History` object which tracks all the important information during training, and can be accessed using `history.history` as a dictionary.
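As a sketch (the file name `dataset.csv` is hypothetical), training from a pre-recorded dataset reduces to a single call; note that per the Raises section below, the file must contain at least as many samples as the engine requires:
>>> history = shac.fit_dataset('dataset.csv', presort=True)  # csv must follow the format above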
predict¶
predict(num_samples=None, num_batches=None, num_workers_per_batch=None, relax_checks=False, max_classfiers=None)
Using trained classifiers, sample the search space and predict which samples can successfully pass through the cascade of classifiers.
When using the full cascade of 18 classifiers, it can take a vast amount of time to generate even a single sample.
Sample mode vs Batch mode
Parameters can be generated in either sample mode, batch mode, or any combination of the two.
`num_samples` operates on a per sample basis (1 sample generated per count). Can be None or an int >= 0.
`num_batches` operates on a per batch basis (M samples generated per count). Can be None or an int >= 0.
The two are combined to produce the total number of samples, which are provided in a list.
Arguments:
- num_samples (None | int): Number of samples to be generated.
- num_batches (None | int): Number of batches of samples to be generated.
- num_workers_per_batch (int): Determines how many parallel threads / processes are created to generate the batches. For small batches, it is best to use 1. If left as None, defaults to `num_parallel_generators`.
- relax_checks (bool): If set, will allow samples which do not pass all of the checks from all classifiers. Can be useful when a large number of models are present and the remaining search space is not big enough to allow a sample to pass through all checks.
- max_classfiers (int | None): Number of classifiers to use for sampling. If set to None, will use all classifiers.
Raises:
- ValueError: If `max_classifiers` is larger than the number of available classifiers.
- RuntimeError: If classifiers are not available, either due to not being trained or not being loaded into the engine.
Returns:
Batches of samples in the form of an OrderedDict.
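For example, combining the two modes with the engine built earlier (batch size M = 100 / 10 = 10) yields 5 + 1 * 10 = 15 samples; the iteration below assumes the return value is a list of OrderedDicts, per the Returns note:
>>> samples = shac.predict(num_samples=5, num_batches=1)  # 15 samples in total
>>> for sample in samples:  # each sample is an OrderedDict of hyper parameter values
...     print(sample['x'])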
save_data¶
save_data()
Serialize the class objects by serializing the dataset and the trained models.
restore_data¶
restore_data()
Recover the serialized class objects by loading the dataset and the trained models from the default save directories.
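A save/restore round trip might look as follows; passing `hyperparameter_list=None` lets the restored engine load the search space from disk, as noted in the constructor arguments:
>>> shac.save_data()  # serializes the dataset and trained models under `save_dir`
>>> shac2 = TorchSHAC(None, total_budget=100, num_batches=10, max_gpu_evaluators=0)
>>> shac2.restore_data()  # loads the dataset and models from the default save directories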
set_num_parallel_generators¶
set_num_parallel_generators(n)
Checks and sets the number of parallel generators. If None, checks whether the number of workers exceeds the number of virtual cores; if it does, warns about it and sets the maximum number of parallel generators to the number of cores.
Arguments:
- n (int | None): The number of parallel generators required.
set_num_parallel_evaluators¶
set_num_parallel_evaluators(n)
Checks and sets the number of parallel evaluators. If None, checks whether the number of workers exceeds the number of virtual cores; if it does, warns about it and sets the maximum number of parallel evaluators to the number of cores.
Arguments:
- n (int | None): The number of parallel evaluators required.
parallel_evaluators¶
parallel_evaluators()
Sets the evaluators to use the `loky` backend.
The user must take the responsibility of thread safety and memory management.
concurrent_evaluators¶
concurrent_evaluators()
Sets the evaluators to use the threading backend, and therefore be locked by Python's GIL.
While technically still "parallel", it is in fact concurrent execution rather than parallel execution of the evaluators.
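As a rough guide (a suggestion, not a rule from the library): prefer `parallel_evaluators()` for CPU-bound evaluation functions, and `concurrent_evaluators()` when evaluation mostly waits on I/O or a GPU:
>>> shac.set_num_parallel_evaluators(2)
>>> shac.parallel_evaluators()    # `loky` processes: true parallelism, higher memory cost
>>> shac.concurrent_evaluators()  # threads: GIL-bound, lower memory overhead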
set_seed¶
set_seed(seed)
Sets the seed of the parameters and the engine.
Arguments:
- seed (int | None): Seed value of the random state.
as_deterministic¶
as_deterministic(seed)
Context manager that sets the seed of the parameters and the engine only inside the context block.
Arguments:
- seed (int | None): Seed value of the random state.
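For example (assuming the seed is passed positionally), sampling is reproducible only inside the block:
>>> shac.set_seed(0)  # seeds the engine globally
>>> with shac.as_deterministic(0):  # seeds only within the context block
...     samples = shac.predict(num_samples=3)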