Design space

Currently, the following types of parameters are supported:

  • Integer parameters

  • Continuous parameters

  • Continuous parameters varying in log space (for example, we may want to search the learning rate within (1e-4, 1e-2))

  • Boolean parameters

  • Categorical parameters

These built-in parameter types are managed by the DesignSpace class; we’ll first import the class.

Define design space

[17]:
# Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved.

# This program is free software; you can redistribute it and/or modify it under
# the terms of the MIT license.

# This program is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE. See the MIT License for more details.

import torch
from hebo.design_space.design_space import DesignSpace

Suppose we want to optimize the hyper-parameters of a neural network. The hyper-parameters to be optimized are:

  • Size of hidden units hidden_size; it should be an integer, with range [16, 128]

  • Batch size batch_size, also an integer with range [16, 128]

  • Learning rate lr, the range is [1e-4, 1e-2], but we want it to vary in log space

  • Whether or not to use batch normalization use_bn, it should be a boolean parameter

  • Dropout rate dropout_rate, a continuous parameter ranging from 0.1 to 0.9

  • Activation function activation, defined as a categorical parameter; possible candidates are relu, tanh and sigmoid

  • Optimizer optimizer, also a categorical parameter; candidates are sgd, adam and rmsprop

We can define a list of dictionaries to specify the above hyper-parameters, and then pass the list to the DesignSpace class.

[2]:
params = [
    {'name' : 'hidden_size', 'type' : 'int', 'lb' : 16, 'ub' : 128},
    {'name' : 'batch_size',  'type' : 'int', 'lb' : 16, 'ub' : 128},
    {'name' : 'lr', 'type' : 'pow', 'lb' : 1e-4, 'ub' : 1e-2, 'base' : 10},
    {'name' : 'use_bn', 'type' : 'bool'},
    {'name' : 'activation', 'type' : 'cat', 'categories' : ['relu', 'tanh','sigmoid']},
    {'name' : 'dropout_rate', 'type' : 'num', 'lb' : 0.1, 'ub' : 0.9},
    {'name' : 'optimizer', 'type' : 'cat', 'categories' : ['sgd', 'adam', 'rmsprop']}
]

space = DesignSpace().parse(params)

That’s it: we have defined the search space, and now we can do some random sampling. Running DesignSpace.sample() returns a pandas dataframe.

[3]:
space.sample(5)
[3]:
hidden_size batch_size lr use_bn activation dropout_rate optimizer
0 27 107 0.000930 True tanh 0.730755 rmsprop
1 115 101 0.000348 False tanh 0.115669 sgd
2 27 91 0.000214 False tanh 0.214109 adam
3 70 56 0.000851 True sigmoid 0.763984 adam
4 94 43 0.000815 True tanh 0.615758 rmsprop
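
Each row of the returned dataframe is one candidate configuration. As a small illustration (plain pandas, nothing HEBO-specific), a sampled row can be converted into a dict of hyper-parameters ready to be passed to a training function:

# Convert one sampled row into a plain dict of hyper-parameters (standard pandas).
suggestions = space.sample(5)
config = suggestions.iloc[0].to_dict()
print(config)  # e.g. {'hidden_size': 27, 'batch_size': 107, 'lr': 0.00093, 'use_bn': True, ...}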

Inside DesignSpace: parameter transformation

NOTE: You can skip this section if you don’t need to define new parameter types or develop new BO algorithms.

We can see that DesignSpace.sample() returns a pandas dataframe; that’s how design parameters are represented. However, there are some drawbacks to directly using this dataframe to fit the surrogate model in BO:

  1. Categorical parameters are represented as str; they should be transformed to integers

  2. For parameters varying in log space, it would be better to perform a log transformation before feeding them to BO algorithms

DesignSpace.transform does the above two things: it transforms categorical variables to integers and applies a log transformation to parameters varying in log space.

DesignSpace.transform takes a pandas dataframe as input and returns a torch.FloatTensor and a torch.LongTensor: numerical and boolean parameters are transformed to the FloatTensor (for boolean parameters, True/False is viewed as 1/0), and categorical parameters are transformed to the LongTensor.

[4]:
samp = space.sample(3)
samp
[4]:
hidden_size batch_size lr use_bn activation dropout_rate optimizer
0 82 72 0.004158 False tanh 0.896806 rmsprop
1 61 99 0.000997 True sigmoid 0.239654 sgd
2 119 25 0.002203 False tanh 0.186145 sgd
[5]:
x,xe = space.transform(samp)

assert isinstance(x, torch.FloatTensor)
assert isinstance(xe, torch.LongTensor)
x.shape, xe.shape
[5]:
(torch.Size([3, 5]), torch.Size([3, 2]))

The five numerical parameters are transformed to x, and the two categorical parameters are transformed to xe; the order of the columns can be seen from DesignSpace.numeric_names and DesignSpace.enum_names.

[6]:
x, space.numeric_names
[6]:
(tensor([[ 82.0000,  72.0000,  -2.3811,   0.0000,   0.8968],
         [ 61.0000,  99.0000,  -3.0012,   1.0000,   0.2397],
         [119.0000,  25.0000,  -2.6570,   0.0000,   0.1861]]),
 ['hidden_size', 'batch_size', 'lr', 'use_bn', 'dropout_rate'])

From the above cell, we can see that a log transformation is applied to lr, and the values of the boolean parameter use_bn are transformed to 0/1.

The two categorical parameters are transformed to integers

[7]:
xe, space.enum_names
[7]:
(tensor([[2, 1],
         [1, 2],
         [2, 2]]),
 ['activation', 'optimizer'])

We can use DesignSpace.inverse_transform to recover the original dataframe

[8]:
space.inverse_transform(x,xe)
[8]:
hidden_size batch_size lr use_bn activation dropout_rate optimizer
0 82 72 0.004158 False tanh 0.896806 rmsprop
1 61 99 0.000997 True sigmoid 0.239654 sgd
2 119 25 0.002203 False tanh 0.186145 sgd
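
As a quick sanity check (a small addition using plain pandas, not one of the notebook’s own cells), we could assert that the round trip reproduces the sampled dataframe up to floating-point precision:

import pandas as pd

# The round trip through transform / inverse_transform should reproduce samp;
# dtype checking is relaxed since integer dtypes may differ between the frames.
pd.testing.assert_frame_equal(space.inverse_transform(x, xe), samp,
                              check_exact=False, check_dtype=False)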

Bound of transformed parameters

In the DesignSpace class, the bounds of the transformed parameters are automatically calculated; we can see the lower and upper bounds using DesignSpace.opt_lb and DesignSpace.opt_ub.

[9]:
space.opt_lb
[9]:
tensor([16.0000, 16.0000, -4.0000,  0.0000,  0.1000,  0.0000,  0.0000],
       dtype=torch.float64)
[10]:
space.opt_ub
[10]:
tensor([128.0000, 128.0000,  -2.0000,   1.0000,   0.9000,   2.0000,   2.0000],
       dtype=torch.float64)

The order of the bound vector elements is space.numeric_names + space.enum_names

[13]:
space.numeric_names + space.enum_names
[13]:
['hidden_size',
 'batch_size',
 'lr',
 'use_bn',
 'dropout_rate',
 'activation',
 'optimizer']

We can see that the third element of the bound vectors corresponds to lr, and its range is transformed from [\(10^{-4}\), \(10^{-2}\)] to [-4, -2].
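
To close the loop, here is a rough sketch (an illustration only, not HEBO’s own optimizer) of how a custom BO algorithm might use these bounds: draw candidates uniformly inside [opt_lb, opt_ub], split them into the numeric and categorical parts, and map them back to real configurations with inverse_transform (assuming, as the round trip above suggests, that inverse_transform rounds integer and boolean parameters and maps categorical codes back to their category names):

import torch

# Rough sketch: propose random candidates in the transformed space and map
# them back to the original parameter space with inverse_transform.
n_suggest = 4
num_dim   = len(space.numeric_names)   # numeric/bool columns come first
enum_dim  = len(space.enum_names)      # categorical columns follow

lb = space.opt_lb.float()
ub = space.opt_ub.float()
cand = lb + (ub - lb) * torch.rand(n_suggest, num_dim + enum_dim)

x  = cand[:, :num_dim]                        # FloatTensor part
xe = cand[:, num_dim:].round().long()         # LongTensor part (category indices)
suggestions = space.inverse_transform(x, xe)  # back to a pandas dataframe

A real BO algorithm would replace the uniform sampling with an acquisition-function optimizer, but the bounds and the split between x and xe are used in the same way.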