This article explains the metrics used to evaluate a regression model's performance.
For a regression model trained on the Engine, the “Performance” page shows how the model performs, using three metrics: Prediction Quality, Prediction Error, and Percentage Error. A quick overview of these three metrics is given in the table below. Further detail about each metric can be found in the corresponding sections below the table.
| | PREDICTION QUALITY | PREDICTION ERROR | PERCENTAGE ERROR |
| --- | --- | --- | --- |
| UNIT | % | Same as prediction target | % |
| RANGE | 0% - 100% | 0 to infinity | 0% to infinity |
| TYPE | Higher is better | Lower is better | Lower is better |
| BEST VALUE | 100% | 0 | 0% |
| UNDERLYING METRIC NAME | R2 Score | Mean Absolute Error (MAbE) | Mean Absolute Percentage Error (MAPE) |
Prediction Quality (R2 score)
What is Prediction Quality?
The prediction quality is the R2 score (coefficient of determination) multiplied by 100, which expresses it as a percentage.
Tip: To learn more about prediction quality, and the metrics used to calculate it, see this article.
How does the R2 score measure the model performance?
The R2 score describes how well the trained model performs by comparing it with a worst-case baseline model. The baseline model always outputs the average value of the target column in the training portion. For example, if the target column in the training portion contains the values 20, 40, and 60, the baseline model will always predict 40, the average of those values. Such a worst-case baseline ignores the input values that are fed to it.
The comparison is done by taking the ratio between the "total error of the trained model" and the "total error of the baseline model". If the trained model performs better than the baseline, its total error is smaller and the ratio is therefore smaller than 1. A perfect model has a ratio of 0, since its total error is 0. Finally, we obtain the R2 score by subtracting this ratio from 1:
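In symbols, this corresponds to the standard definition of the coefficient of determination, where the sums run over the evaluated examples, $y_i$ is the actual value of example $i$, $\hat{y}_i$ is the trained model's prediction, and $\bar{y}$ is the baseline model's constant prediction:

$$\text{R2 score} = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_i \right)^2}{\sum_{i} \left( y_i - \bar{y} \right)^2}$$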
Note: "Total error" is the simplified term for "sum of squared residuals".
How to interpret the R2 score?
- 0 < R2 score < 1: The model performs better than the baseline model. The higher the R2 score, the better the model performs.
- R2 score = 1: The model predicts perfectly.
- R2 score = 0: The model performs similarly to the baseline model. In practice, the R2 score can be negative, but the Engine automatically clamps negative R2 scores to 0. A value of 0 should therefore be read as "performing equal to or worse than the baseline model".
Note: Further details about the R2 score can be found here: Coefficient of determination
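As a minimal, illustrative sketch (not the Engine's actual implementation), the prediction quality could be computed from a list of actual and predicted values like this, assuming the standard R2 definition and the clamping described above. For simplicity, the baseline average is taken from the evaluated values themselves rather than from the training portion, and the function name and numbers are only examples:

```python
import numpy as np

def prediction_quality(y_true, y_pred):
    """Illustrative sketch: R2 score clamped to [0, 1], expressed as a percentage."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Total error (sum of squared residuals) of the trained model.
    sse_model = np.sum((y_true - y_pred) ** 2)
    # Total error of the worst-case baseline that always predicts the average target value.
    # (Here the average is taken from y_true; the Engine's baseline uses the training portion.)
    sse_baseline = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - sse_model / sse_baseline
    # Negative R2 scores are clamped to 0, then scaled to a percentage.
    return max(r2, 0.0) * 100.0

# Actual vs. predicted values for a small test portion.
print(prediction_quality([20, 40, 60], [22, 41, 58]))  # 98.875 -> close to 100%
```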
Prediction Error (Mean absolute error)
What is Prediction Error?
Prediction Error is the mean absolute error metric.
How does the mean absolute error measure the model performance?
The trained model gives a prediction for each example in the test portion. The mean absolute error is the average of the absolute differences between the actual and predicted values:
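Written out, with $n$ examples in the test portion, actual values $y_i$, and predictions $\hat{y}_i$:

$$\text{MAbE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$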


The mean absolute error measures the model performance by telling us how large a difference we can expect, on average, between a prediction and the actual value.
The mean absolute error will have the same unit as the target values. For example, if we are predicting house prices in “dollar”, the unit of mean absolute error is “dollar”.
How to interpret the mean absolute error (MAbE)?
- MAbE > 0: A smaller MAbE is better. Unlike the R2 score, which is upper bounded by 1, the MAbE does not have an upper bound.
- MAbE = 0: The model predicts perfectly.
Note: For further details about mean absolute error: Mean absolute error
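A similarly minimal sketch of the mean absolute error (the function name and example numbers are illustrative only):

```python
import numpy as np

def prediction_error(y_true, y_pred):
    """Illustrative sketch: mean absolute error, in the same unit as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Average absolute difference between actual and predicted values.
    return np.mean(np.abs(y_true - y_pred))

# House prices in dollars: the result is also in dollars.
print(prediction_error([200_000, 350_000], [190_000, 360_000]))  # 10000.0
```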
Percentage Error (Mean absolute percentage error)
What is Percentage Error?
For regression models, Percentage Error is simply the mean absolute percentage error.
How does the mean absolute percentage error measure the model performance?
The trained model gives a prediction for each example in the test portion. The absolute percentage error compares the magnitude of the model's error to the actual value. For example, if the actual value is 1000 and the model predicts 950, the error is 1000 - 950 = 50 and the absolute percentage error is 100 * (50 / 1000) = 5%. The mean absolute percentage error is then the average of the absolute percentage errors over all examples:
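Using the same notation as above ($n$ test examples, actual values $y_i$, predictions $\hat{y}_i$), this can be written as:

$$\text{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$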


The mean absolute percentage error tells us how large a difference we would expect between the model's prediction and the actual value, expressed relative to that actual value.
How to interpret mean absolute percentage error (MAPE)?
- MAPE > 0: A smaller MAPE is better. The MAPE is not upper-bounded by 100%; it simply expresses the magnitude of the model's error as a percentage of the actual values. For example, if the model always predicts 3 times the actual value, the MAPE will be 200%.
- MAPE = 0: The model predicts perfectly.
Note: For further details about mean absolute percentage error: Mean absolute percentage error
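And a matching sketch of the mean absolute percentage error, reusing the example from the text (again, the function name and numbers are illustrative only):

```python
import numpy as np

def percentage_error(y_true, y_pred):
    """Illustrative sketch: mean absolute percentage error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Average of the absolute errors relative to the actual values, as a percentage.
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0

# The example from the text: actual value 1000, prediction 950 -> 5%.
print(percentage_error([1000], [950]))  # 5.0
```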