Posted March 16, 2023 at 10:58 am
Hyperparameters are configuration values, set before training begins, that control how a machine learning model learns its parameters. The parameters, in turn, are the values the model adjusts in order to correctly map the input features (independent variables) to the labels or targets (dependent variable).
For example, in an artificial neural network (ANN), the weights are parameters. The hyperparameters, such as the learning rate, are the settings that control and optimise the training process.
Confused, right? Not to worry.
We will discuss everything from what the parameters and hyperparameters are to how the hyperparameters can be used to improve your strategy’s performance!
In short, hyperparameters are the settings that steer the training of a machine learning model so that it consistently produces the required results.
In this blog, we will discuss all about parameters, hyperparameters and hyperparameter tuning.
Parameters are learned by the model purely from the training data during the training process. The machine learning algorithm used tries to learn the mapping between the input features and the targets or the desired output(s).
Model training typically starts with the parameters set to random values or to zeros. As training progresses, these initial values are updated by an optimization algorithm; a common example is gradient descent.
At the end of the learning process, the learned parameter values constitute the trained model.
Some examples of parameters in machine learning are the weights of a neural network and the coefficients of a linear regression model.
A hyperparameter is a parameter set before the learning process begins for a machine learning model. Hence, the algorithm uses hyperparameters to learn the parameters.
These hyperparameters can be tuned according to the requirements of the user and thus directly affect how well the model trains.
When creating a machine learning model, there are multiple design options as to how to define the model architecture. Usually, exploring a range of possibilities helps decide the optimal model architecture. For a machine learning model to learn aptly, it is better to ask the machine to perform this exploration and automatically select the optimal model architecture.
Parameters that define the model architecture are referred to as hyperparameters, and this process of searching for the ideal model architecture, i.e. the best hyperparameter values, is referred to as "hyperparameter tuning".
For example, the weights learned while training a linear regression model are parameters, but the learning rate in gradient descent is a hyperparameter.
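To make the distinction concrete, here is a minimal sketch (not from the original article) of a linear regression fitted with gradient descent. The weight `w` and bias `b` are parameters learned from the data; the learning rate and number of epochs are hyperparameters fixed before training starts:

```python
import numpy as np

# Synthetic data: y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 3.0 * X + 2.0 + rng.normal(0, 0.1, size=100)

learning_rate = 0.1   # hyperparameter: set before training
n_epochs = 500        # hyperparameter: set before training

w, b = 0.0, 0.0       # parameters: learned during training
for _ in range(n_epochs):
    error = (w * X + b) - y
    # Gradients of the mean squared error with respect to w and b
    grad_w = 2 * np.mean(error * X)
    grad_b = 2 * np.mean(error)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 2), round(b, 2))  # close to the true values 3.0 and 2.0
```

Change the learning rate (say, to 0.0001) and the same loop converges far more slowly, which is exactly why hyperparameters need tuning.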
The performance of a model on a dataset significantly depends on the proper tuning, i.e., finding the best combination of the model hyperparameters.
Some examples of hyperparameters in machine learning are the learning rate, the batch size, the number of training epochs and the number of hidden layers in a neural network.
Hyperparameter tuning is quite important while training a machine learning model.
Hyperparameter tuning addresses model design questions such as: What learning rate should be used? How many hidden layers and hidden units should a neural network have? How many candidate values should be evaluated?
Broadly, hyperparameters can be divided into two categories: hyperparameters for optimisation and hyperparameters for specific models.
The process of selecting the best hyperparameters to use is known as hyperparameter tuning, and the tuning process is also known as hyperparameter optimisation. Optimisation hyperparameters are those that control the optimisation, i.e. training, process of the model.
Some of the popular optimisation hyperparameters are the learning rate, the batch size and the number of training epochs.
Hyperparameters that are involved in the structure of the model are known as hyperparameters for specific models. A key example is the number of hidden units in a neural network.
The number of hidden units is an important hyperparameter for a neural network. A common rule of thumb is that it should lie between the size of the input layer and the size of the output layer; more specifically, roughly 2/3 of the size of the input layer plus the size of the output layer.
For complex functions, more hidden units may be necessary, but too many can cause the model to overfit.
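The rule of thumb above is easy to express in code. The helper below is illustrative only (the function name is our own), and the result is a starting point for tuning, not a guarantee:

```python
def suggest_hidden_units(n_inputs: int, n_outputs: int) -> int:
    """Heuristic starting point for the hidden-layer size:
    2/3 of the input layer size plus the output layer size."""
    return round(2 * n_inputs / 3) + n_outputs

# e.g. 30 input features and 1 output neuron
print(suggest_hidden_units(30, 1))  # 21
```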
For training the machine learning model aptly, the hyperparameters must be tuned. The usual steps are: select the hyperparameters to tune, define a search space of candidate values, choose a search strategy, evaluate each candidate (typically with cross-validation) and pick the combination with the best score.
Now, let us see the two most commonly used techniques for hyperparameter tuning: grid search and randomised search.
Grid search is the technique of exhaustively searching through every combination of the hyperparameter values specified.
Grid search is the simplest algorithm for hyperparameter tuning. Basically, we divide the domain of the hyperparameters into a discrete grid. Then, we try every combination of values of this grid, calculating some performance metrics using cross-validation.
The point of the grid that maximises the average cross-validation score is the optimal combination of hyperparameter values.
Grid search is an exhaustive algorithm that spans all the combinations, so it can actually find the best point in the domain. Its main drawback is that it is very slow: checking every combination in the space requires time that is sometimes not available.
Every point in the grid needs k-fold cross-validation, which requires k training runs, so tuning the hyperparameters of a model in this way can be quite expensive. However, when we need the best combination of hyperparameter values and can afford the cost, grid search is a very good choice.
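As a concrete sketch, here is a grid search using scikit-learn's GridSearchCV (one possible implementation; the article does not name a library). A 3 x 2 grid over an SVM's `C` and `kernel` means 6 candidates, each evaluated with 5-fold cross-validation, i.e. 30 model fits:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy classification data standing in for real features/targets
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Discrete grid of hyperparameter values: every combination is tried
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
}

search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)  # combination with the highest mean CV score
print(search.best_score_)
```

The cost grows multiplicatively: adding a third hyperparameter with five values would turn 6 candidates into 30, which is why grid search becomes expensive quickly.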
In the case of randomised search, unlike grid search, not all of the specified parameter values are tried out. Instead, a fixed number of parameter settings is sampled from the specified distributions of values.
For each hyperparameter, either a distribution over possible values or a list of discrete values (to be sampled uniformly) can be specified.
The fewer settings sampled, the faster but less accurate the optimisation; the more settings sampled, the more accurate the optimisation, but the closer it comes to a grid search.
Random search is a very useful option when you have several hyperparameters with a fine-grained grid of values.
Using a subset of 5-100 randomly selected points, we are able to get a reasonably good set of hyperparameter values.
It will likely not be the best point, but it can still be a good set of values that gives us a good model.
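For comparison with the grid search sketch above, here is the same SVM tuned with scikit-learn's RandomizedSearchCV (again, one possible implementation, not named in the article). `C` is sampled from a continuous log-uniform distribution and the kernel from a discrete list, with only `n_iter` settings tried in total:

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Toy classification data standing in for real features/targets
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A continuous distribution for C and a discrete list for the kernel;
# only n_iter sampled settings are evaluated, not the full grid.
param_distributions = {
    "C": loguniform(1e-2, 1e2),
    "kernel": ["linear", "rbf"],
}

search = RandomizedSearchCV(
    SVC(), param_distributions, n_iter=20, cv=5, random_state=0
)
search.fit(X, y)

print(search.best_params_)
```

With 20 sampled settings and 5-fold cross-validation this costs 100 fits regardless of how many hyperparameters are searched, which is the practical advantage over an exhaustive grid.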
Hyperparameters and hyperparameter tuning are quite important to ensure that the machine learning model learns the correct mapping from inputs to outputs.
With the right knowledge of hyperparameters, one can train the machine learning model for the required actions.
In this blog, we began with a brief look at parameters and then discussed the basics of hyperparameters with some examples.
Moreover, we discussed how hyperparameter tuning takes place along with the two most popular techniques for hyperparameter tuning.
This course on machine learning and deep learning helps you understand how different machine learning algorithms can be implemented in financial markets by creating your own prediction algorithms using classification and regression techniques. Do feel free to check it out.
Disclaimer: All data and information provided in this article are for informational purposes only. QuantInsti® makes no representations as to accuracy, completeness, currentness, suitability, or validity of any information in this article and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis.
Originally posted on QuantInsti blog.
Information posted on IBKR Campus that is provided by third-parties does NOT constitute a recommendation that you should contract for the services of that third party. Third-party participants who contribute to IBKR Campus are independent of Interactive Brokers and Interactive Brokers does not make any representations or warranties concerning the services offered, their past or future performance, or the accuracy of the information provided by the third party. Past performance is no guarantee of future results.
This material is from QuantInsti and is being posted with its permission. The views expressed in this material are solely those of the author and/or QuantInsti and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.