
SHAC Managed Engine for Keras (Tensorflow/CNTK)


Provides a managed engine for Keras with the Tensorflow / CNTK backend when using SHAC.

Performs a few useful tasks (for Tensorflow), such as:

  • Providing a tf.Session object: In addition to the worker id and the parameter dictionary, a tf.Session is provided as the first parameter to the evaluation function. This session wraps the underlying graph, and can be used to freely evaluate all operations inside the evaluation function.

  • Graph scope management: All Tensorflow operations inside the evaluation function will be under the scope of a managed tf.Graph, such that the provided session can be used to evaluate all ops inside the evaluation function.

  • Memory management: Once the evaluation is done, graph destruction and session closing are handled automatically.

If parallel evaluation is not preferred, please refer to the Serial Evaluation page.

Class Information



KerasSHAC

pyshac.core.managed.keras_engine.KerasSHAC(hyperparameter_list, total_budget, num_batches, max_gpu_evaluators, objective='max', max_classifiers=18, max_cpu_evaluators=1, save_dir='shac')

A SHAC engine built specifically for the Keras wrapper over Tensorflow's graph-based workflow. It can also support CNTK, though that path is not well tested.

It wraps the abstract SHAC engine with utilities to improve workflow with Keras, and performs additional maintenance around the evaluation function, such as creating a graph and session for it, assigning them to the backend, and then destroying them and releasing their resources once evaluation is over.

This provides a cleaner interface to the Tensorflow codebase, and eases the building of models for evaluation. As long as the system has enough memory to run multiple copies of the evaluation model, there is no additional work required by the user inside the evaluation function.

Note: When using Eager Execution, it is preferable to use the default SHAC engine with tf.keras, as Tensorflow manages memory automatically in that scenario.
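
For illustration, an evaluation function for this engine might look like the following sketch. The managed tf.Session arrives as the first argument; the model architecture and the hyper parameter name `num_units` are placeholder assumptions for this example, not part of the library:

>>> import numpy as np
>>> import keras

>>> def evaluation_fn(session, worker_id, params):
...     # `session` is the managed tf.Session wrapping the managed graph;
...     # the engine has already assigned it to the Keras backend.
...     model = keras.models.Sequential()
...     model.add(keras.layers.Dense(params['num_units'], activation='relu', input_shape=(10,)))
...     model.add(keras.layers.Dense(1))
...     model.compile(optimizer='adam', loss='mse')
...     x, y = np.random.random((128, 10)), np.random.random((128, 1))
...     loss = model.fit(x, y, epochs=1, verbose=0).history['loss'][-1]
...     return -loss  # return a float; negated here since the default objective is 'max'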

Arguments:

  • hyperparameter_list (hp.HyperParameterList | None): A list of parameters (or a HyperParameterList) that are passed to define the search space. Can be None, so that it is loaded from disk instead.
  • total_budget (int): N. Defines the total number of models to evaluate.
  • num_batches (int): M. Defines the number of batches the work is distributed to. Must be set such that total budget is divisible by batch size.
  • max_gpu_evaluators (int): Number of GPUs. Can be 0 or more. Decides the number of GPUs used to evaluate models in parallel.
  • objective (str): Can be max or min. Whether to maximize or minimize the evaluation measure.
  • max_classifiers (int): Maximum number of classifiers that are trained. The default (18) follows the paper.
  • max_cpu_evaluators (int): Positive integer > 0 or -1. Sets the number of parallel evaluation function calls that are executed simultaneously. Set this to 1 unless you have enough memory for 2 or more models to be trained simultaneously. If set to -1, uses all CPU cores to evaluate N models simultaneously; this will cause OOM errors if the models are large.
  • save_dir (str): The base directory where the data of the engine will be stored.
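
As a sketch, constructing the engine could look like this (assuming the DiscreteHyperParameter and UniformContinuousHyperParameter classes exposed by pyshac; the parameter names and values are illustrative). Note that total_budget must be divisible by num_batches, giving a batch size of 10 here:

>>> import pyshac
>>> from pyshac.core.managed.keras_engine import KerasSHAC

>>> params = [pyshac.DiscreteHyperParameter('num_units', [32, 64, 128]),
...           pyshac.UniformContinuousHyperParameter('x', 0.0, 1.0)]
>>> shac = KerasSHAC(params, total_budget=100, num_batches=10,
...                  max_gpu_evaluators=1, objective='max')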

Raises:

  • ValueError: If the Keras backend is neither Tensorflow nor CNTK.
  • ValueError: If the total budget is not divisible by the batch size.

KerasSHAC methods

fit

fit(eval_fn, skip_cv_checks=False, early_stop=False, relax_checks=False, callbacks=None)

Generates batches of samples, trains up to max_classifiers XGBoost models, and evaluates each batch in parallel with the supplied function.

Allows manually changing the number of processes that are used to generate samples or to evaluate them. While the defaults generally work well, further performance gains can be had by trying different values according to the limits of the system.

>>> eval_fn = lambda session, worker_id, params: np.exp(params['x'])
>>> shac = KerasSHAC(params, total_budget=100, num_batches=10, max_gpu_evaluators=0)

>>> shac.set_num_parallel_generators(20)  # change the number of generator processes
>>> shac.set_num_parallel_evaluators(1)  # change the number of evaluator processes
>>> shac.generator_backend = 'multiprocessing'  # change the backend for the generator (default is `multiprocessing`)
>>> shac.concurrent_evaluators()  # change the backend of the evaluators to use `threading`

Has adaptive behaviour based on the current epoch, since later epochs require far more samples to generate a single batch. When the epoch number increases beyond 10, the engine doubles the number of generator processes.

This adaptivity can be removed by setting the parameter limit_memory to True.

>>> shac.limit_memory = True

Arguments:

  • eval_fn ((tf.Session, int, OrderedDict) -> float): The evaluation function is passed a managed Tensorflow Session, the integer id of the worker, and the sampled hyper parameters in an OrderedDict. The evaluation function is expected to return a Python floating point number representing the evaluated value.
  • skip_cv_checks (bool): If set, will not perform a 5-fold cross validation check on the models before adding them to the classifier list. Useful when the batch size is small.
  • early_stop (bool): Stop running if it fails to find a classifier that beats the last stage of evaluations.
  • relax_checks (bool): If set, will allow samples that do not pass all of the checks from all classifiers. Can be useful when a large number of classifiers is present and the remaining search space is not large enough for samples to pass all checks.
  • callbacks (list | None): Optional list of callbacks that are executed when the engine is being trained. History callback is automatically added for all calls to fit.

Returns:

A History object which tracks all the important information during training, and can be accessed using history.history as a dictionary.
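
A minimal end-to-end sketch, assuming the engine and evaluation function from the examples above:

>>> history = shac.fit(evaluation_fn)
>>> print(history.history)  # dictionary of everything tracked during training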


fit_dataset

fit_dataset(dataset_path, skip_cv_checks=False, early_stop=False, presort=True, callbacks=None)

Uses the provided dataset file to train the engine, instead of using the sequential halving and classification algorithm directly. The data provided in the path must strictly follow the format of the dataset maintained by the engine.

Standard format of datasets:

Each dataset csv file must contain an integer id column named "id" as its first column, followed by several columns describing the values taken by the hyper parameters, and a final column for the objective criterion, which must be named "scores".

The csv file must contain a header, following the above format.

Example:

id,hp1,hp2,scores
0,1,h1,1.0
1,1,h2,0.2
2,0,h1,0.0
3,0,h3,0.5
...

Arguments:

  • dataset_path (str): The full or relative path to a csv file containing the values of the dataset.
  • skip_cv_checks (bool): If set, will not perform a 5-fold cross validation check on the models before adding them to the classifier list. Useful when the batch size is small.
  • early_stop (bool): Stop running if it fails to find a classifier that beats the last stage of evaluations.
  • presort (bool): Boolean flag to determine whether to sort the values of the dataset prior to loading. Ascending or descending sort is selected based on whether the engine is maximizing or minimizing the objective. It is preferable to always set this, as it helps train better classifiers.
  • callbacks (list | None): Optional list of callbacks that are executed when the engine is being trained. History callback is automatically added for all calls to fit_dataset.

Raises:

  • ValueError: If the number of hyper parameters in the file is not the same as the number of hyper parameters available to the engine, or if the number of samples in the provided dataset is less than the number required by the engine.
  • FileNotFoundError: If the dataset is not available at the provided filepath.

Returns:

A History object which tracks all the important information during training, and can be accessed using history.history as a dictionary.
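
A minimal sketch, assuming a CSV file in the format above exists at the hypothetical path 'dataset.csv':

>>> history = shac.fit_dataset('dataset.csv', presort=True)
>>> print(history.history)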


predict

predict(num_samples=None, num_batches=None, num_workers_per_batch=None, relax_checks=False, max_classfiers=None)

Using trained classifiers, sample the search space and predict which samples can successfully pass through the cascade of classifiers.

When using a full cascade of 18 classifiers, it can take a vast amount of time to draw even a single sample.

Sample mode vs Batch mode

Parameters can be generated in sample mode, batch mode, or any combination of the two.

num_samples is on a per sample basis (1 sample generated per count). Can be None or an int >= 0. num_batches is on a per batch basis (M samples generated per count). Can be None or an integer >= 0.

The two are combined to produce a total number of samples which are provided in a list.

Arguments:

  • num_samples (None | int): Number of samples to be generated.
  • num_batches (None | int): Number of batches of samples to be generated.
  • num_workers_per_batch (int): Determines how many parallel threads / processes are created to generate the batches. For small batches, it is best to use 1. If left as None, defaults to num_parallel_generators.
  • relax_checks (bool): If set, will allow samples that do not pass all of the checks from all classifiers. Can be useful when a large number of classifiers is present and the remaining search space is not large enough for samples to pass all checks.
  • max_classfiers (int | None): Number of classifiers to use for sampling. If set to None, will use all classifiers.

Raises:

  • ValueError: If max_classifiers is larger than the number of available classifiers.
  • RuntimeError: If classifiers are not available, either due to not being trained or not being loaded into the engine.

Returns:

Batches of samples, provided as a list of OrderedDicts.
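
For example, combining the two modes (the counts are illustrative; with a batch size M of 10, this yields 5 + 2 * 10 = 25 samples):

>>> samples = shac.predict(num_samples=5, num_batches=2)
>>> for sample in samples:
...     print(sample['num_units'], sample['x'])  # each sample maps parameter names to values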


save_data

save_data()

Serialize the class objects by serializing the dataset and the trained models.


restore_data

restore_data()

Recover the serialized class objects by loading the dataset and the trained models from the default save directories.
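
A sketch of restoring a trained engine; since hyperparameter_list may be None, the parameters can be reloaded from disk along with the classifiers:

>>> shac = KerasSHAC(None, total_budget=100, num_batches=10,
...                  max_gpu_evaluators=0, save_dir='shac')
>>> shac.restore_data()  # loads the dataset and trained classifiers from save_dir
>>> samples = shac.predict(num_batches=1)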


set_num_parallel_generators

set_num_parallel_generators(n)

Checks and sets the number of parallel generators. If None, checks whether the number of workers exceeds the number of virtual cores. If it does, it warns about it and sets the maximum number of parallel generators to the number of cores.

Arguments:

  • n (int | None): The number of parallel generators required.

set_num_parallel_evaluators

set_num_parallel_evaluators(n)

Checks and sets the number of parallel evaluators. If None, checks whether the number of workers exceeds the number of virtual cores. If it does, it warns about it and sets the maximum number of parallel evaluators to the number of cores.

Arguments:

  • n (int | None): The number of parallel evaluators required.

parallel_evaluators

parallel_evaluators()

Sets the evaluators to use the loky backend.

The user must take responsibility for thread safety and memory management.


concurrent_evaluators

concurrent_evaluators()

Sets the evaluators to use the threading backend, and therefore be locked by Python's GIL.

While technically still "parallel", it is in fact concurrent execution rather than parallel execution of the evaluators.
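
A short sketch contrasting the two evaluator backends; the most recent call takes effect:

>>> shac.parallel_evaluators()    # loky backend: true process parallelism, higher memory cost
>>> shac.concurrent_evaluators()  # threading backend: shared memory, limited by the GIL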


set_seed

set_seed(seed)

Sets the seed of the parameters and the engine.

Arguments:

  • seed (int | None): Seed value of the random state.

as_deterministic

as_deterministic(seed=None)

Context manager that sets the seed of the parameters and the engine only inside the context block.

Arguments:

  • seed (int | None): Seed value of the random state.
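
A minimal usage sketch:

>>> with shac.as_deterministic(seed=0):
...     samples = shac.predict(num_samples=5)  # sampling is reproducible inside the block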