GaussianHyperparameterSearch
GaussianHyperparameterSearch(
    param_space: Dict[str, Any],
    pipeline: Pipeline,
    score: str = "rmse",
    n_iter: int = 100,
    seed: int = 5483,
    ratio_iter: float = 0.8,
    optimization_algorithm: str = "nelder_mead",
    optimization_burn_in_algorithm: str = "latin_hypercube",
    optimization_burn_ins: int = 500,
    surrogate_burn_in_algorithm: str = "latin_hypercube",
    gaussian_kernel: str = "matern52",
    gaussian_optimization_burn_in_algorithm: str = "latin_hypercube",
    gaussian_optimization_algorithm: str = "nelder_mead",
    gaussian_optimization_burn_ins: int = 500,
    gaussian_nugget: int = 50,
    early_stopping: bool = True,
)
Bases: _Hyperopt
Bayesian hyperparameter optimization using a Gaussian process.
After a burn-in period, a Gaussian process is used to pick the most promising parameter combination to evaluate next, based on the knowledge gathered throughout previous evaluations. The quality of potential combinations is assessed using the expected information (EI).
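Writing μ(x) and σ(x) for the posterior mean and standard deviation of the Gaussian process at a candidate point x, the standard closed form of this acquisition criterion (known as expected improvement in the Bayesian optimization literature) for a score that is being minimized is:

$$
\mathrm{EI}(x) = \sigma(x)\,\bigl(u\,\Phi(u) + \varphi(u)\bigr),
\qquad u = \frac{y_{\text{best}} - \mu(x)}{\sigma(x)},
$$

where $y_{\text{best}}$ is the best score observed so far and $\Phi$ and $\varphi$ are the CDF and PDF of the standard normal distribution.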
Enterprise edition
This feature is exclusive to the Enterprise edition and is not available in the Community edition. Discover the benefits of the Enterprise edition and compare its features.
For licensing information and technical support, please contact us.
PARAMETER | DESCRIPTION |
---|---|
param_space | Dictionary containing numerical arrays of length two holding the lower and upper bounds of all parameters which will be altered in the pipeline during the hyperparameter optimization. If we have two feature learners and one predictor, the hyperparameter space might look like the one constructed in the example below. If we only want to optimize the predictor, then we can leave out the feature learners. TYPE: Dict[str, Any] |
pipeline | Base pipeline used to derive all models fitted and scored during the hyperparameter optimization. Be careful when constructing it, since only the parameters present in param_space will be varied; all other parameters are taken from this base pipeline. TYPE: Pipeline |
score | The score to optimize. Must be one of the metrics in getml.pipeline.metrics. TYPE: str DEFAULT: "rmse" |
n_iter | Number of iterations in the hyperparameter optimization and thus the number of parameter combinations to draw and evaluate. Range: [1, ∞] TYPE: int DEFAULT: 100 |
seed | Seed used for the random number generator that underlies the sampling procedure, to make the calculation reproducible. Due to the nature of the underlying algorithm, this is only the case if the fit is done without multithreading. TYPE: int DEFAULT: 5483 |
ratio_iter | Ratio of the iterations used for the burn-in. For a ratio_iter of 1.0, all iterations will be spent in the burn-in period, making the search equivalent to a random search. As a rule of thumb, at least 70 percent of the evaluations should be spent in the burn-in phase; the more comprehensive the exploration of the param_space during the burn-in, the better the starting point for the Gaussian process. Range: [0, 1] TYPE: float DEFAULT: 0.8 |
optimization_algorithm | Determines the optimization algorithm used for the local search in the maximization of the expected information (EI). TYPE: str DEFAULT: "nelder_mead" |
optimization_burn_in_algorithm | Specifies the algorithm used to draw initial points in the burn-in period of the optimization of the expected information (EI). TYPE: str DEFAULT: "latin_hypercube" |
optimization_burn_ins | Number of random evaluation points used during the burn-in of the maximization of the expected information (EI). After the surrogate model - the Gaussian process - has been successfully fitted to the previous parameter combinations, the algorithm is able to calculate the EI for a given point. In order to get to the next combination, the EI has to be maximized over the whole parameter space. Much like the Gaussian process itself, this requires a burn-in phase. Range: [3, ∞] TYPE: int DEFAULT: 500 |
surrogate_burn_in_algorithm | Specifies the algorithm used to draw new parameter combinations during the burn-in period. TYPE: str DEFAULT: "latin_hypercube" |
gaussian_kernel | Specifies the 1-dimensional kernel of the Gaussian process which will be used along each dimension of the parameter space. All available choices result in continuous sample paths; their main difference is the degree of smoothness of the results, with 'exp' yielding the least and 'gauss' yielding the smoothest paths. TYPE: str DEFAULT: "matern52" |
gaussian_optimization_burn_in_algorithm | Specifies the algorithm used to draw new parameter combinations during the burn-in period of the optimization of the Gaussian process. TYPE: str DEFAULT: "latin_hypercube" |
gaussian_optimization_algorithm | Determines the optimization algorithm used for the local search in the fitting of the Gaussian process to the previous parameter combinations. TYPE: str DEFAULT: "nelder_mead" |
gaussian_optimization_burn_ins | Number of random evaluation points used during the burn-in of the fitting of the Gaussian process. Range: [3, ∞] TYPE: int DEFAULT: 500 |
gaussian_nugget | The nugget of the Gaussian process, a value added to the diagonal of the kernel matrix to stabilize the fit. TYPE: int DEFAULT: 50 |
early_stopping | Whether you want to apply early stopping to the predictors. TYPE: bool DEFAULT: True |
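The 'latin_hypercube' default used by the burn-in parameters above stratifies every dimension of the space, so the samples cover it more evenly than independent uniform draws. A minimal sketch of the idea using scipy's quasi-Monte Carlo module (illustrative only; the bounds are borrowed from the example below, and this is not the getml implementation):

```python
import numpy as np
from scipy.stats import qmc

# Draw 5 burn-in points in a 2-dimensional hyperparameter space,
# e.g. num_features in [10, 50] and reg_lambda in [0.0, 0.1].
sampler = qmc.LatinHypercube(d=2, seed=5483)
unit_samples = sampler.random(n=5)  # points in the unit square

# Rescale the unit samples to the actual parameter bounds.
samples = qmc.scale(unit_samples, l_bounds=[10.0, 0.0], u_bounds=[50.0, 0.1])

# Latin hypercube property: along each dimension, each of the 5
# equally wide strata contains exactly one sample.
strata = np.floor(unit_samples * 5).astype(int)
```

Unlike purely random sampling, this guarantees that no region of a dimension's range is left completely unexplored during the burn-in.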
Note
A Gaussian hyperparameter search works like this:
1. It begins with a burn-in phase, usually about 70% to 90% of all iterations. During that burn-in phase, the hyperparameter space is sampled more or less at random. You can control this phase using ratio_iter and surrogate_burn_in_algorithm.
2. Once enough information has been collected, it fits a Gaussian process on the hyperparameters with the score we want to maximize or minimize as the predicted variable. Note that the Gaussian process has hyperparameters itself, which are also optimized. You can control this phase using gaussian_kernel, gaussian_optimization_algorithm, gaussian_optimization_burn_in_algorithm and gaussian_optimization_burn_ins.
3. It then uses the Gaussian process to predict the expected information (EI), which is how much additional information it might get from evaluating a particular point in the hyperparameter space. The expected information is to be maximized. The point in the hyperparameter space with the maximum expected information is the next point that is actually evaluated (meaning a new pipeline with these hyperparameters is trained). You can control this phase using optimization_algorithm, optimization_burn_ins and optimization_burn_in_algorithm.
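The three phases above can be sketched in plain numpy/scipy on a toy 1-dimensional problem. This is a highly simplified illustration, not the getml implementation: the kernel length scale, the tiny nugget, and the grid-based EI maximization are arbitrary choices made for brevity.

```python
import numpy as np
from scipy.stats import norm

def matern52(x1, x2, length_scale=0.3):
    # Matern 5/2 kernel, the default gaussian_kernel of the search.
    r = np.abs(x1[:, None] - x2[None, :])
    s = np.sqrt(5.0) * r / length_scale
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)

def gp_posterior(x_train, y_train, x_test, nugget=1e-6):
    # Standard GP regression; the nugget on the diagonal stabilizes the Cholesky.
    K = matern52(x_train, x_train) + nugget * np.eye(len(x_train))
    K_s = matern52(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.clip(1.0 - np.sum(v**2, axis=0), 1e-12, None)
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    # Closed-form EI for minimization: E[max(best - f(x), 0)].
    u = (best - mu) / sigma
    return sigma * (u * norm.cdf(u) + norm.pdf(u))

rng = np.random.default_rng(5483)
objective = lambda x: (x - 0.3) ** 2  # toy "score" to minimize

# Phase 1: burn-in - sample the parameter space (here uniformly at random).
x_train = rng.uniform(0.0, 1.0, size=8)
y_train = objective(x_train)

# Phases 2 and 3: fit the GP surrogate, maximize EI over a candidate grid,
# and evaluate the most promising point next.
for _ in range(10):
    grid = np.linspace(0.0, 1.0, 201)
    mu, sigma = gp_posterior(x_train, y_train, grid)
    ei = expected_improvement(mu, sigma, y_train.min())
    x_next = grid[np.argmax(ei)]
    x_train = np.append(x_train, x_next)
    y_train = np.append(y_train, objective(x_next))

# best_param should end up close to the true minimum at x = 0.3.
best_param = float(x_train[np.argmin(y_train)])
```

In the real search, the "objective" is training and scoring a full pipeline, the space is multi-dimensional, and the EI maximization itself uses a burn-in plus a local search (optimization_burn_ins, optimization_algorithm) instead of a fixed grid.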
In a nutshell, the GaussianHyperparameterSearch behaves like a human data scientist:

- At first, it picks random hyperparameter combinations.
- Once it has gained a better understanding of the hyperparameter space, it starts evaluating hyperparameter combinations that are particularly interesting.
Example
from getml import data
from getml import datasets
from getml import engine
from getml import feature_learning
from getml.feature_learning import aggregations
from getml.feature_learning import loss_functions
from getml import hyperopt
from getml import pipeline
from getml import predictors
# ----------------
engine.set_project("examples")
# ----------------
population_table, peripheral_table = datasets.make_numerical()
# ----------------
# Construct placeholders
population_placeholder = data.Placeholder("POPULATION")
peripheral_placeholder = data.Placeholder("PERIPHERAL")
population_placeholder.join(peripheral_placeholder, "join_key", "time_stamp")
# ----------------
# Base model - any parameters not included
# in param_space will be taken from this.
fe1 = feature_learning.Multirel(
aggregation=[
aggregations.COUNT,
aggregations.SUM
],
loss_function=loss_functions.SquareLoss,
num_features=10,
share_aggregations=1.0,
max_length=1,
num_threads=0
)
# ----------------
# Base model - any parameters not included
# in param_space will be taken from this.
fe2 = feature_learning.Relboost(
loss_function=loss_functions.SquareLoss,
num_features=10
)
# ----------------
# Base model - any parameters not included
# in param_space will be taken from this.
predictor = predictors.LinearRegression()
# ----------------
pipe = pipeline.Pipeline(
population=population_placeholder,
peripheral=[peripheral_placeholder],
feature_learners=[fe1, fe2],
predictors=[predictor]
)
# ----------------
# Build a hyperparameter space.
# We have two feature learners and one
# predictor, so this is how we must
# construct our hyperparameter space.
# If we only wanted to optimize the predictor,
# we could just leave out the feature_learners.
param_space = {
"feature_learners": [
{
"num_features": [10, 50],
},
{
"max_depth": [1, 10],
"min_num_samples": [100, 500],
"num_features": [10, 50],
"reg_lambda": [0.0, 0.1],
"shrinkage": [0.01, 0.4]
}],
"predictors": [
{
"reg_lambda": [0.0, 10.0]
}
]
}
# ----------------
# Wrap a GaussianHyperparameterSearch around the reference model
gaussian_search = hyperopt.GaussianHyperparameterSearch(
pipeline=pipe,
param_space=param_space,
n_iter=30,
score=pipeline.metrics.rsquared
)
gaussian_search.fit(
population_table_training=population_table,
population_table_validation=population_table,
peripheral_tables=[peripheral_table]
)
# ----------------
# We want 5 additional iterations.
gaussian_search.n_iter = 5
# We do not want another burn-in-phase,
# so we set ratio_iter to 0.
gaussian_search.ratio_iter = 0.0
# This widens the hyperparameter space.
gaussian_search.param_space["feature_learners"][1]["num_features"] = [10, 100]
# This narrows the hyperparameter space.
gaussian_search.param_space["predictors"][0]["reg_lambda"] = [0.0, 0.0]
# This continues the hyperparameter search using the previous iterations as
# prior knowledge.
gaussian_search.fit(
population_table_training=population_table,
population_table_validation=population_table,
peripheral_tables=[peripheral_table]
)
# ----------------
all_hyp = hyperopt.list_hyperopts()
best_pipeline = gaussian_search.best_pipeline
Source code in getml/hyperopt/hyperopt.py
best_pipeline property
best_pipeline: Pipeline
The best pipeline that is part of the hyperparameter optimization.
This is always based on the validation data you have passed even if you have chosen to score the pipeline on other data afterwards.
RETURNS | DESCRIPTION |
---|---|
Pipeline | The best pipeline. |
id property
id: str
Name of the hyperparameter optimization. This is used to uniquely identify it on the engine.
RETURNS | DESCRIPTION |
---|---|
str | The name of the hyperparameter optimization. |
name property
name: str
Returns the ID of the hyperparameter optimization. The name property is kept for backward compatibility.
RETURNS | DESCRIPTION |
---|---|
str | The name of the hyperparameter optimization. |
score property
score: str
The score to be optimized.
RETURNS | DESCRIPTION |
---|---|
str | The score to be optimized. |
type property
type: str
The algorithm used for the hyperparameter optimization.
RETURNS | DESCRIPTION |
---|---|
str | The algorithm used for the hyperparameter optimization. |
clean_up
clean_up() -> None
Deletes all pipelines associated with the hyperparameter optimization except for the best pipeline.
Source code in getml/hyperopt/hyperopt.py
fit
fit(
container: Union[Container, StarSchema, TimeSeries],
train: str = "train",
validation: str = "validation",
) -> _Hyperopt
Launches the hyperparameter optimization.
PARAMETER | DESCRIPTION |
---|---|
container | The data container used for the hyperparameter tuning. TYPE: |
train | The name of the subset in 'container' used for training. TYPE: |
validation | The name of the subset in 'container' used for validation. TYPE: |
RETURNS | DESCRIPTION |
---|---|
_Hyperopt | The current instance. |
Source code in getml/hyperopt/hyperopt.py
refresh
refresh() -> _Hyperopt
Reloads the hyperparameter optimization from the Engine.
RETURNS | DESCRIPTION |
---|---|
_Hyperopt | The current instance. |
Source code in getml/hyperopt/hyperopt.py
validate
validate() -> None
Validate the parameters of the hyperparameter optimization.
Source code in getml/hyperopt/hyperopt.py