RandomSearch
RandomSearch(
    param_space: Dict[str, Any],
    pipeline: Pipeline,
    score: str = "rmse",
    n_iter: int = 100,
    seed: int = 5483,
    **kwargs
)
Bases: _Hyperopt
Uniformly distributed sampling of the hyperparameters.
During every iteration, a new set of hyperparameters is chosen at random by uniformly drawing a value between the lower and upper bound of each dimension of param_space independently.
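As an illustrative sketch (not the getML implementation), the draw performed at each iteration can be pictured as sampling every dimension independently from its bounds:

```python
import random

# A toy two-dimensional hyperparameter space: each entry holds the
# lower and upper bound for one dimension.
param_space = {"num_features": [10, 50], "shrinkage": [0.01, 0.4]}

def draw_candidate(space, rng):
    """Draw each dimension independently and uniformly from [lower, upper]."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}

rng = random.Random(5483)  # a fixed seed makes the draws reproducible
candidate = draw_candidate(param_space, rng)
```

The candidate always lies inside the bounds, and a fixed seed yields the same sequence of draws on every run (in a single-threaded setting).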
Enterprise edition
This feature is exclusive to the Enterprise edition and is not available in the Community edition. Discover the benefits of the Enterprise edition and compare its features.
For licensing information and technical support, please contact us.
PARAMETER | DESCRIPTION |
---|---|
param_space | Dictionary containing numerical arrays of length two holding the lower and upper bounds of all parameters which will be altered in pipeline during the hyperparameter optimization. Any parameter not listed here keeps the value set in the base pipeline. For two feature learners and one predictor, the hyperparameter space might look like the param_space constructed in the example below. TYPE: Dict[str, Any] |
pipeline | Base pipeline used to derive all models fitted and scored during the hyperparameter optimization. Be careful in constructing it, since only those parameters present in param_space are varied during the optimization. TYPE: Pipeline |
score | The score to optimize. Must be a metric from pipeline.metrics, such as pipeline.metrics.rsquared in the example below. TYPE: str |
n_iter | Number of iterations in the hyperparameter optimization and thus the number of parameter combinations to draw and evaluate. Range: [1, ∞) TYPE: int |
seed | Seed used for the random number generator that underlies the sampling procedure, to make the calculation reproducible. Due to the nature of the underlying algorithm, this is only the case if the fit is done without multithreading. TYPE: int |
Example
from getml import data
from getml import datasets
from getml import engine
from getml import feature_learning
from getml.feature_learning import aggregations
from getml.feature_learning import loss_functions
from getml import hyperopt
from getml import pipeline
from getml import predictors
# ----------------
engine.set_project("examples")
# ----------------
population_table, peripheral_table = datasets.make_numerical()
# ----------------
# Construct placeholders
population_placeholder = data.Placeholder("POPULATION")
peripheral_placeholder = data.Placeholder("PERIPHERAL")
population_placeholder.join(peripheral_placeholder, "join_key", "time_stamp")
# ----------------
# Base model - any parameters not included
# in param_space will be taken from this.
fe1 = feature_learning.Multirel(
    aggregation=[
        aggregations.COUNT,
        aggregations.SUM
    ],
    loss_function=loss_functions.SquareLoss,
    num_features=10,
    share_aggregations=1.0,
    max_length=1,
    num_threads=0
)
# ----------------
# Base model - any parameters not included
# in param_space will be taken from this.
fe2 = feature_learning.Relboost(
    loss_function=loss_functions.SquareLoss,
    num_features=10
)
# ----------------
# Base model - any parameters not included
# in param_space will be taken from this.
predictor = predictors.LinearRegression()
# ----------------
pipe = pipeline.Pipeline(
    population=population_placeholder,
    peripheral=[peripheral_placeholder],
    feature_learners=[fe1, fe2],
    predictors=[predictor]
)
# ----------------
# Build a hyperparameter space.
# We have two feature learners and one
# predictor, so this is how we must
# construct our hyperparameter space.
# If we only wanted to optimize the predictor,
# we could just leave out the feature_learners.
param_space = {
    "feature_learners": [
        {
            "num_features": [10, 50],
        },
        {
            "max_depth": [1, 10],
            "min_num_samples": [100, 500],
            "num_features": [10, 50],
            "reg_lambda": [0.0, 0.1],
            "shrinkage": [0.01, 0.4]
        }
    ],
    "predictors": [
        {
            "reg_lambda": [0.0, 10.0]
        }
    ]
}
# ----------------
# Wrap a RandomSearch around the reference model
random_search = hyperopt.RandomSearch(
    pipeline=pipe,
    param_space=param_space,
    n_iter=30,
    score=pipeline.metrics.rsquared
)
random_search.fit(
    population_table_training=population_table,
    population_table_validation=population_table,
    peripheral_tables=[peripheral_table]
)
Source code in getml/hyperopt/hyperopt.py
best_pipeline property
best_pipeline: Pipeline
The best pipeline that is part of the hyperparameter optimization.
This is always based on the validation data you have passed even if you have chosen to score the pipeline on other data afterwards.
RETURNS | DESCRIPTION |
---|---|
Pipeline | The best pipeline. |
id property
id: str
Name of the hyperparameter optimization. This is used to uniquely identify it on the engine.
RETURNS | DESCRIPTION |
---|---|
str | The name of the hyperparameter optimization. |
name property
name: str
Returns the ID of the hyperparameter optimization. The name property is kept for backward compatibility.
RETURNS | DESCRIPTION |
---|---|
str | The name of the hyperparameter optimization. |
score property
score: str
The score to be optimized.
RETURNS | DESCRIPTION |
---|---|
str | The score to be optimized. |
type property
type: str
The algorithm used for the hyperparameter optimization.
RETURNS | DESCRIPTION |
---|---|
str | The algorithm used for the hyperparameter optimization. |
clean_up
clean_up() -> None
Deletes all pipelines associated with the hyperparameter optimization except the best pipeline.
Source code in getml/hyperopt/hyperopt.py
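A minimal sketch of clean_up's contract (not the getML internals), using hypothetical pipeline ids: every pipeline produced during the search is deleted, and only the best one survives.

```python
# Hypothetical ids of pipelines fitted during a search; "pipe-b" is the winner.
fitted_pipelines = ["pipe-a", "pipe-b", "pipe-c"]
best_pipeline = "pipe-b"

# clean_up keeps only the best pipeline and discards the rest.
remaining = [p for p in fitted_pipelines if p == best_pipeline]
```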
fit
fit(
    container: Union[Container, StarSchema, TimeSeries],
    train: str = "train",
    validation: str = "validation",
) -> _Hyperopt
Launches the hyperparameter optimization.
PARAMETER | DESCRIPTION |
---|---|
container | The data container used for the hyperparameter tuning. TYPE: |
train | The name of the subset in 'container' used for training. TYPE: |
validation | The name of the subset in 'container' used for validation. TYPE: |
RETURNS | DESCRIPTION |
---|---|
_Hyperopt | The current instance. |
Source code in getml/hyperopt/hyperopt.py
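Because fit returns the current instance, calls can be chained. A minimal sketch of that pattern, using a stand-in class rather than the real _Hyperopt:

```python
class ToyHyperopt:
    """Stand-in illustrating the return-self convention of fit()."""

    def __init__(self):
        self.fitted = False

    def fit(self):
        self.fitted = True
        return self  # returning the instance enables chained calls

    def refresh(self):
        return self

# fit() and refresh() both return the instance, so they chain naturally.
search = ToyHyperopt().fit().refresh()
```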
refresh
refresh() -> _Hyperopt
Reloads the hyperparameter optimization from the Engine.
RETURNS | DESCRIPTION |
---|---|
_Hyperopt | The current instance. |
Source code in getml/hyperopt/hyperopt.py
validate
validate() -> None
Validate the parameters of the hyperparameter optimization.
Source code in getml/hyperopt/hyperopt.py