Design space
Currently, the following types of parameters are supported:
Integral parameters
Continuous parameters
Continuous parameters varying in log space (for example, we may want to search the learning rate within (1e-4, 1e-2))
Boolean parameters
Categorical parameters
These built-in parameter types are managed by the DesignSpace class, so we first import it.
Define design space
[17]:
# Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved.
# This program is free software; you can redistribute it and/or modify it under
# the terms of the MIT license.
# This program is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE. See the MIT License for more details.
import torch
from hebo.design_space.design_space import DesignSpace
Suppose we want to optimize the hyper-parameters of a neural network. The hyper-parameters to be optimized are:
Size of hidden units (hidden_size): an integer in the range [16, 128]
Batch size (batch_size): an integer, also in the range [16, 128]
Learning rate (lr): in the range [1e-4, 1e-2], but we want it to vary in log space
Whether or not to use batch normalization (use_bn): a boolean parameter
Dropout rate (dropout_rate): a continuous parameter, ranging from 0.1 to 0.9
Activation function (activation): a categorical parameter, with candidates relu, tanh and sigmoid
Optimizer (optimizer): also a categorical parameter, with candidates sgd, adam and rmsprop
We can define a list of dictionaries to specify the above hyper-parameters, and then pass the list to the parse method of the DesignSpace class.
[2]:
params = [
    {'name' : 'hidden_size', 'type' : 'int', 'lb' : 16, 'ub' : 128},
    {'name' : 'batch_size', 'type' : 'int', 'lb' : 16, 'ub' : 128},
    {'name' : 'lr', 'type' : 'pow', 'lb' : 1e-4, 'ub' : 1e-2, 'base' : 10},
    {'name' : 'use_bn', 'type' : 'bool'},
    {'name' : 'activation', 'type' : 'cat', 'categories' : ['relu', 'tanh', 'sigmoid']},
    {'name' : 'dropout_rate', 'type' : 'num', 'lb' : 0.1, 'ub' : 0.9},
    {'name' : 'optimizer', 'type' : 'cat', 'categories' : ['sgd', 'adam', 'rmsprop']}
]
space = DesignSpace().parse(params)
That’s it: we have defined the search space, and now we can do some random sampling. Running DesignSpace.sample() returns a pandas dataframe.
[3]:
space.sample(5)
[3]:
| | hidden_size | batch_size | lr | use_bn | activation | dropout_rate | optimizer |
|---|---|---|---|---|---|---|---|
| 0 | 27 | 107 | 0.000930 | True | tanh | 0.730755 | rmsprop |
| 1 | 115 | 101 | 0.000348 | False | tanh | 0.115669 | sgd |
| 2 | 27 | 91 | 0.000214 | False | tanh | 0.214109 | adam |
| 3 | 70 | 56 | 0.000851 | True | sigmoid | 0.763984 | adam |
| 4 | 94 | 43 | 0.000815 | True | tanh | 0.615758 | rmsprop |
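To make "varying in log space" concrete, here is a minimal stand-alone sketch of one way to sample such a parameter: draw the exponent uniformly and raise the base to it. The helper name sample_log_uniform is hypothetical, and this is an assumption about the idea, not a description of how DesignSpace.sample() is implemented.

```python
import math
import random

def sample_log_uniform(lb, ub, base=10.0, rng=random):
    """Hypothetical helper: draw a value whose log (in `base`) is uniform on the bounds."""
    lo = math.log10(lb) / math.log10(base)  # exponent of the lower bound
    hi = math.log10(ub) / math.log10(base)  # exponent of the upper bound
    return base ** rng.uniform(lo, hi)

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_log_uniform(1e-4, 1e-2, rng=rng) for _ in range(10_000)]

# Fraction of draws below 1e-3, the midpoint of the range in log space
below_1e3 = sum(d < 1e-3 for d in draws) / len(draws)
print(min(draws), max(draws), below_1e3)
```

With this scheme roughly half of the draws fall below 1e-3, whereas plain uniform sampling on [1e-4, 1e-2] would put only about 9% of them there.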
Inside DesignSpace: parameter transformation
NOTE: You can skip this section if you don’t need to define new parameter types or develop new BO algorithms.
We can see that DesignSpace.sample() returns a pandas dataframe; that is how design parameters are represented. However, there are some drawbacks to using this dataframe directly to fit the surrogate model in BO:
Categorical parameters are represented by str; they should be transformed to integers
For parameters varying in log space, it is better to perform a log transformation before feeding them to BO algorithms
DesignSpace.transform does both of these things: it transforms categorical variables to integers and applies a log transformation to parameters varying in log space.
DesignSpace.transform takes a pandas dataframe as input and returns a torch.FloatTensor and a torch.LongTensor: numerical and boolean parameters are transformed to the FloatTensor (for boolean parameters, True/False map to 1/0), and categorical parameters are transformed to the LongTensor.
[4]:
samp = space.sample(3)
samp
[4]:
| | hidden_size | batch_size | lr | use_bn | activation | dropout_rate | optimizer |
|---|---|---|---|---|---|---|---|
| 0 | 82 | 72 | 0.004158 | False | tanh | 0.896806 | rmsprop |
| 1 | 61 | 99 | 0.000997 | True | sigmoid | 0.239654 | sgd |
| 2 | 119 | 25 | 0.002203 | False | tanh | 0.186145 | sgd |
[5]:
x,xe = space.transform(samp)
assert isinstance(x, torch.FloatTensor)
assert isinstance(xe, torch.LongTensor)
x.shape, xe.shape
[5]:
(torch.Size([3, 5]), torch.Size([3, 2]))
The five numerical parameters are transformed to x, and the two categorical parameters are transformed to xe. The column order can be read from DesignSpace.numeric_names and DesignSpace.enum_names.
[6]:
x, space.numeric_names
[6]:
(tensor([[ 82.0000, 72.0000, -2.3811, 0.0000, 0.8968],
[ 61.0000, 99.0000, -3.0012, 1.0000, 0.2397],
[119.0000, 25.0000, -2.6570, 0.0000, 0.1861]]),
['hidden_size', 'batch_size', 'lr', 'use_bn', 'dropout_rate'])
From the above cell, we can see that a log transformation is applied to lr, and the values of the boolean parameter use_bn are transformed to 0/1.
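As a quick arithmetic check, the lr column of x can be compared with the base-10 logarithm of the sampled lr values. The sketch below uses only the numbers printed in the cells above. The dataframe shows lr rounded to six decimals, so agreement is expected only up to a small tolerance.

```python
import math

# lr values as displayed in the sampled dataframe (rounded to 6 decimals)
lr_values = [0.004158, 0.000997, 0.002203]
# Third column of the tensor x printed above
transformed = [-2.3811, -3.0012, -2.6570]

log10_lr = [math.log10(v) for v in lr_values]
for raw, ours, shown in zip(lr_values, log10_lr, transformed):
    print(f"{raw:.6f} -> {ours:.4f} (tensor shows {shown:.4f})")
```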
The two categorical parameters are transformed to integers
[7]:
xe, space.enum_names
[7]:
(tensor([[2, 1],
[1, 2],
[2, 2]]),
['activation', 'optimizer'])
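The integer codes in xe are consistent with indexing each value into its alphabetically sorted candidate list (for example relu=0, sigmoid=1, tanh=2). This rule is inferred from the output above, not taken from the library's source, so the sketch below is an assumption that happens to reproduce those codes. The helper name encode_categories is hypothetical.

```python
def encode_categories(values, categories):
    """Hypothetical helper: map each value to its index in the sorted category list."""
    code = {c: i for i, c in enumerate(sorted(categories))}
    return [code[v] for v in values]

# Columns of the three sampled rows shown earlier
activation = encode_categories(['tanh', 'sigmoid', 'tanh'],
                               ['relu', 'tanh', 'sigmoid'])
optimizer = encode_categories(['rmsprop', 'sgd', 'sgd'],
                              ['sgd', 'adam', 'rmsprop'])

# Pair the two columns row by row, in the order given by enum_names
xe_rows = [list(row) for row in zip(activation, optimizer)]
print(xe_rows)  # [[2, 1], [1, 2], [2, 2]]
```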
We can use DesignSpace.inverse_transform
to recover the original dataframe
[8]:
space.inverse_transform(x,xe)
[8]:
| | hidden_size | batch_size | lr | use_bn | activation | dropout_rate | optimizer |
|---|---|---|---|---|---|---|---|
| 0 | 82 | 72 | 0.004158 | False | tanh | 0.896806 | rmsprop |
| 1 | 61 | 99 | 0.000997 | True | sigmoid | 0.239654 | sgd |
| 2 | 119 | 25 | 0.002203 | False | tanh | 0.186145 | sgd |
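The inverse direction can be sketched by hand for a single row: raise 10 to the transformed lr, round the integer and boolean columns, and look the category codes up again. This is only an illustration under the assumption that codes index the alphabetically sorted categories (inferred from the printed xe). decode_row is a hypothetical helper, not part of the library.

```python
ACTIVATIONS = ['relu', 'tanh', 'sigmoid']
OPTIMIZERS = ['sgd', 'adam', 'rmsprop']

def decode_row(x_row, xe_row):
    """Hypothetical helper: undo the transformation for one row of (x, xe)."""
    hidden_size, batch_size, log_lr, use_bn, dropout_rate = x_row
    act_code, opt_code = xe_row
    return {
        'hidden_size': int(round(hidden_size)),
        'batch_size': int(round(batch_size)),
        'lr': 10 ** log_lr,                     # undo the log10 transformation
        'use_bn': bool(round(use_bn)),          # 0/1 back to False/True
        'activation': sorted(ACTIVATIONS)[act_code],
        'dropout_rate': dropout_rate,
        'optimizer': sorted(OPTIMIZERS)[opt_code],
    }

# First row of x and xe as printed above
recovered = decode_row([82.0, 72.0, -2.3811, 0.0, 0.8968], [2, 1])
print(recovered)
```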
Bounds of transformed parameters
In the DesignSpace class, the bounds of the transformed parameters are calculated automatically. We can see the lower and upper bounds using DesignSpace.opt_lb and DesignSpace.opt_ub.
[9]:
space.opt_lb
[9]:
tensor([16.0000, 16.0000, -4.0000, 0.0000, 0.1000, 0.0000, 0.0000],
dtype=torch.float64)
[10]:
space.opt_ub
[10]:
tensor([128.0000, 128.0000, -2.0000, 1.0000, 0.9000, 2.0000, 2.0000],
dtype=torch.float64)
The order of bound vector elements is space.numeric_names + space.enum_names
[13]:
space.numeric_names + space.enum_names
[13]:
['hidden_size',
'batch_size',
'lr',
'use_bn',
'dropout_rate',
'activation',
'optimizer']
We can see that the third element of the bound vector corresponds to lr, and its range is transformed from [\(10^{-4}\), \(10^{-2}\)] to [-4, -2].
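The printed bound vectors can be reproduced by hand from the parameter list. The per-type rules below are inferred from the outputs above (lb/ub unchanged for int and num, logarithm of the bounds for pow, 0 to 1 for bool, 0 to the number of categories minus one for cat). They are an assumption rather than the library's actual code, and transformed_bounds is a hypothetical helper.

```python
import math

params = [
    {'name': 'hidden_size', 'type': 'int', 'lb': 16, 'ub': 128},
    {'name': 'batch_size', 'type': 'int', 'lb': 16, 'ub': 128},
    {'name': 'lr', 'type': 'pow', 'lb': 1e-4, 'ub': 1e-2, 'base': 10},
    {'name': 'use_bn', 'type': 'bool'},
    {'name': 'activation', 'type': 'cat', 'categories': ['relu', 'tanh', 'sigmoid']},
    {'name': 'dropout_rate', 'type': 'num', 'lb': 0.1, 'ub': 0.9},
    {'name': 'optimizer', 'type': 'cat', 'categories': ['sgd', 'adam', 'rmsprop']},
]

def transformed_bounds(p):
    """Hypothetical helper: (lower, upper) of one parameter after transformation."""
    kind = p['type']
    if kind in ('int', 'num'):
        return float(p['lb']), float(p['ub'])
    if kind == 'pow':
        log_base = math.log10(p.get('base', 10))
        return math.log10(p['lb']) / log_base, math.log10(p['ub']) / log_base
    if kind == 'bool':
        return 0.0, 1.0
    if kind == 'cat':
        return 0.0, float(len(p['categories']) - 1)
    raise ValueError(f"unknown parameter type: {kind}")

# Numeric parameters first, then categorical ones, each group in definition order
ordered = ([p for p in params if p['type'] != 'cat']
           + [p for p in params if p['type'] == 'cat'])
names = [p['name'] for p in ordered]
lower = [transformed_bounds(p)[0] for p in ordered]
upper = [transformed_bounds(p)[1] for p in ordered]
print(names)
print(lower)
print(upper)
```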