What is hyperparameter search?

Hyperparameter search is the process of finding the best hyperparameter values by training models with different settings and comparing their performance.

Typically, there are two types of hyperparameter search: 

  • Brute force - This involves choosing a set of candidate hyperparameter values and trying every one of them exhaustively (often called grid search). This method does not scale well, because the number of combinations to try grows rapidly as more hyperparameters or candidate values are added.

  • Sampling/Bayesian search - In contrast to the brute force approach, sampling/Bayesian hyperparameter search methods evaluate only a smaller subset of the possible hyperparameters by making a succession of intelligent “guesses” about where to look for better values. This often involves building a model that predicts a validation metric using the hyperparameters as inputs.
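
The two approaches above can be sketched on a toy problem. This is a minimal illustration, not a definitive implementation: validation_score is a hypothetical stand-in for "train a model and measure it on validation data", and the hyperparameter names learning_rate and depth are assumptions chosen for the example. The second function uses plain random sampling, which is the simplest sampling method; it is not Bayesian, since a Bayesian search would additionally use earlier results to decide where to sample next.

```python
import itertools
import random


def validation_score(learning_rate, depth):
    """Hypothetical stand-in for training a model and scoring it.

    Higher is better; the peak is at learning_rate=0.1, depth=6.
    """
    return -100 * (learning_rate - 0.1) ** 2 - 0.01 * (depth - 6) ** 2


def grid_search(grid):
    """Brute force: evaluate every combination of the candidate values."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(n_trials, seed=0):
    """Sampling: evaluate only n_trials randomly drawn combinations."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            # Log-uniform draw between 0.001 and 1.
            "learning_rate": 10 ** rng.uniform(-3, 0),
            # Integer draw between 2 and 10, inclusive.
            "depth": rng.randint(2, 10),
        }
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# 4 x 4 candidate values means 16 evaluations for the brute force approach.
grid = {"learning_rate": [0.001, 0.01, 0.1, 1.0], "depth": [2, 4, 6, 8]}
grid_best, grid_score = grid_search(grid)

# The sampling approach is capped at a fixed budget of 8 evaluations.
rand_best, rand_score = random_search(n_trials=8)

print("grid search:  ", grid_best, grid_score)
print("random search:", rand_best, rand_score)
```

Adding a third hyperparameter with four candidate values would raise the grid from 16 to 64 evaluations, while the random search budget stays at whatever n_trials is set to.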

What is a hyperparameter?

Machine learning algorithms work by learning a set of parameters that lead to the most accurate prediction of the outcome or that optimize some mathematical measure.

So how are hyperparameters different? In machine learning, a hyperparameter is a setting that controls model selection and the learning process. Its value must be set before training begins.

Simply put, hyperparameters directly affect which parameters are chosen for the model. Even though a hyperparameter is not a parameter of the model per se, it sits higher in the hierarchy and is external to the model, hence the term “hyper”.

The optimal values of normal (i.e. non-hyper) parameters are derived via training, but hyperparameter values are selected based on expert knowledge or hyperparameter search.
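
The distinction can be made concrete with a small sketch. The example below assumes a one-variable linear model fitted by gradient descent; the function name train_linear_model and the toy data are hypothetical and chosen only for illustration. The learning rate and the number of epochs are hyperparameters, fixed before training starts, while the slope w and the intercept b are normal parameters derived by training.

```python
def train_linear_model(xs, ys, learning_rate, n_epochs):
    """Fit y = w * x + b by gradient descent on the mean squared error.

    learning_rate and n_epochs are hyperparameters: they are set by the
    caller before training. w and b are parameters: training derives them.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Same data, same algorithm, different hyperparameter values.
w, b = train_linear_model(xs, ys, learning_rate=0.05, n_epochs=2000)
w_short, b_short = train_linear_model(xs, ys, learning_rate=0.05, n_epochs=5)

print("2000 epochs:", w, b)
print("5 epochs:   ", w_short, b_short)
```

Both calls use the same data and the same algorithm, yet they return different values of w and b, which illustrates how the hyperparameters affect the parameters the model ends up with.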