7.10.6. Loss Functions
The loss function measures how good a set of parameters is. Its value is a single real number: the smaller the loss function value, the better the parameters.
The following loss functions are available:
| name | string | class |
|---|---|---|
| Sum of Squared Errors (default, recommended) | 'sse', 'rss' | SSE |
| Sum of Absolute Errors | 'sae' | SAE |
| Mean Absolute Error | 'mad', 'mae' | MAE |
| Root Mean Squared Error | 'rmsd', 'rmse' | RMSE |
Important
The value of the loss function will be weighted by the data_set entry weights, and the residuals will first be normalized by the data_set entry sigma. Using “Mean Absolute Error” as the loss function, for example, will therefore not necessarily produce a physically meaningful mean absolute error. Instead, use the DataSetEvaluator to calculate meaningful MAE and RMSE values.
By default the following equations are used to calculate the loss function value, where \(N\) is the number of data_set entries, \(w\) is the weight, \(\sigma\) is the sigma value, \(y\) is the predicted value and \(\hat{y}\) is the reference value. For array reference values, there are \(M_i\) elements of the array for data_set entry \(i\).
| loss | scalar values | array values |
|---|---|---|
| SSE | \(\sum_{i=1}^N w_i\left(\frac{y_i - \hat{y}_i}{\sigma_i}\right)^2\) | \(\sum_{i=1}^N\sum_{j=1}^{M_i} w_{i,j}\left(\frac{y_{i,j} - \hat{y}_{i,j}}{\sigma_i}\right)^2\) |
| SAE | \(\sum_{i=1}^N w_i\frac{\lvert y_i - \hat{y}_i \rvert}{\sigma_i}\) | \(\sum_{i=1}^N\sum_{j=1}^{M_i} w_{i,j}\frac{\lvert y_{i,j} - \hat{y}_{i,j} \rvert}{\sigma_i}\) |
| MAE | \(\frac{1}{N}\sum_{i=1}^N w_i\frac{\lvert y_i - \hat{y}_i \rvert}{\sigma_i}\) | \(\frac{1}{N}\sum_{i=1}^N\frac{1}{M_i}\sum_{j=1}^{M_i} w_{i,j}\frac{\lvert y_{i,j} - \hat{y}_{i,j} \rvert}{\sigma_i}\) |
| RMSE | \(\sqrt{\frac{1}{N}\sum_{i=1}^N w_i\left(\frac{y_i - \hat{y}_i}{\sigma_i}\right)^2}\) | \(\sqrt{\frac{1}{N}\sum_{i=1}^N\frac{1}{M_i}\sum_{j=1}^{M_i} w_{i,j}\left(\frac{y_{i,j} - \hat{y}_{i,j}}{\sigma_i}\right)^2}\) |
Note that for the MAE and RMSE loss functions with array reference values, averages are first calculated over the individual arrays, and the \(N\) averages are then averaged again. It is also common to use a different definition, in which the \(N\) arrays are first concatenated and a single average is calculated over this larger array. This second way is referred to as “other” in the example below, and is the default for the MAE and RMSE reported by the Data Set Evaluator.
Example:
from scm.params.core.lossfunctions import SSE, SAE, MAE, RMSE
import numpy as np

sse = SSE()
sae = SAE()
mae_default = MAE()
# "Other" MAE: sum all absolute residuals, divide by the total number of elements
mae_other = MAE(lambda x, w=1: np.sum(w*np.abs(x)), lambda resids: sum(len(i) for i in resids))
rmse_default = RMSE()
# "Other" RMSE: sum all squared residuals, divide by the total number of elements
rmse_other = RMSE(lambda x, w=1: np.sum(w*np.square(x)), lambda resids: sum(len(i) for i in resids))
x = [[1, -3], [3, -4, 5]]
print("Residuals: {}".format(x))
print("SSE: {} (1 + 9 + 9 + 16 + 25)".format(sse(x)[0]))
print("SAE: {} (1 + 3 + 3 + 4 + 5)".format(sae(x)[0]))
print("MAE (default): {} [(2+4)/2]".format(mae_default(x)[0]))
print("MAE (other): {} [(1+3+3+4+5)/5]".format(mae_other(x)[0]))
print("RMSE (default): {:.3f} sqrt[(5+16.667)/2]".format(rmse_default(x)[0]))
print("RMSE (other): {:.3f} sqrt[(1+9+9+16+25)/5]".format(rmse_other(x)[0]))
Residuals: [[1, -3], [3, -4, 5]]
SSE: 60.0 (1 + 9 + 9 + 16 + 25)
SAE: 16.0 (1 + 3 + 3 + 4 + 5)
MAE (default): 3.0 [(2+4)/2]
MAE (other): 3.2 [(1+3+3+4+5)/5]
RMSE (default): 3.291 sqrt[(5+16.667)/2]
RMSE (other): 3.464 sqrt[(1+9+9+16+25)/5]
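The two averaging conventions can also be reproduced with plain NumPy. This is a standalone sketch based only on the equations above (unit weights and sigmas); no scm.params import is needed:

import numpy as np

resids = [np.array([1., -3.]), np.array([3., -4., 5.])]

# Default convention: average within each entry first, then average the N per-entry values
mae_default = np.mean([np.mean(np.abs(r)) for r in resids])       # (2 + 4)/2 = 3.0
rmse_default = np.sqrt(np.mean([np.mean(r**2) for r in resids]))  # sqrt[(5 + 16.667)/2] ≈ 3.291

# "Other" convention: concatenate all entries and take a single average
flat = np.concatenate(resids)
mae_other = np.mean(np.abs(flat))       # 16/5 = 3.2
rmse_other = np.sqrt(np.mean(flat**2))  # sqrt(60/5) ≈ 3.464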
7.10.6.1. Specifying the loss function
The loss function can be passed to an Optimization in one of the following ways:
my_optimization = Optimization(*args, loss='sse') # As the string keyword
from scm.params.core.lossfunctions import SSE # Loss functions are not imported automatically
my_optimization = Optimization(*args, loss=SSE()) # Or directly
A loss function can also be passed to DataSet.evaluate() in the same manner.
7.10.6.2. Technical information
Each loss function class (SAE(), MAE(), RMSE(), SSE()) derives from the Loss base class. The __call__ method takes two arguments: residuals and weights.

- residuals is a list of residual vectors between reference and predicted properties. When called from DataSet.evaluate(), each item has been normalized by the sigma value of the corresponding data_set entry: \((\boldsymbol{y} - \boldsymbol{\hat{y}})/\boldsymbol{\sigma}\).
- weights is a list of weight vectors \(\boldsymbol{w}\).

The __call__ method returns a 2-tuple consisting of the loss function value (float) and a list of contributions.
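A minimal sketch of this call interface, reusing the residuals from the example above (the single-argument call is the form used in that example):

import numpy as np
from scm.params.core.lossfunctions import SSE

sse = SSE()
residuals = [np.array([1., -3.]), np.array([3., -4., 5.])]
loss_value, contributions = sse(residuals)  # __call__ returns (value, contributions)
print(loss_value)     # 60.0, as in the example above
print(contributions)  # per-entry contributions to the overall loss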
7.10.6.3. Sum of Squares Error
class SSE(inner_f=<function SSE.<lambda>>, norm_f=None)

Residual Sum of Squares or Sum of Squared Errors loss. This loss function is commonly used for ReaxFF parameter fitting.

(7.1)
\[L_\mathrm{SSE} = \sum_{i=1}^N (y_i - \hat{y}_i)^2\]

Accessible with the strings 'sse', 'rss'.

Default Parameters:

- inner_f : lambda x: np.sum(x**2)
- norm_f : None
7.10.6.4. Sum of Absolute Errors

class SAE(inner_f=<function SAE.<lambda>>, norm_f=None)

Sum of Absolute Errors loss.

(7.2)
\[L_\mathrm{SAE} = \sum_{i=1}^N \lvert y_i - \hat{y}_i \rvert\]

Accessible with the string 'sae'.

Default Parameters:

- inner_f : lambda x: np.sum(np.abs(x))
- norm_f : None

7.10.6.5. Mean Absolute Error

class MAE(inner_f=<function MAE.<lambda>>, norm_f=<built-in function len>)

Mean Absolute Error (MAE, MAD) loss.

(7.3)
\[L_\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^N \lvert y_i - \hat{y}_i \rvert\]

Accessible with the strings 'mad', 'mae'.

Default Parameters:

- inner_f : lambda x: np.mean(np.abs(x))
- norm_f : len

7.10.6.6. Root-Mean-Square Error
class RMSE(inner_f=<function RMSE.<lambda>>, norm_f=<built-in function len>)

Root-Mean-Square Error (RMSE, RMSD) loss.

(7.4)
\[L_\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^N (y_i - \hat{y}_i)^2 }\]

Accessible with the strings 'rmse', 'rmsd'.

Default Parameters:

- inner_f : lambda x: np.mean(x**2)
- norm_f : len
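These defaults compose with the square root applied by RMSE itself. A short NumPy sketch of the composition; the per-entry summation and final root are assumptions about the internals, but they reproduce the example output above:

import numpy as np

resids = [np.array([1., -3.]), np.array([3., -4., 5.])]
fx = sum(np.mean(r**2) for r in resids)  # inner_f applied to each entry, then summed
value = np.sqrt(fx / len(resids))        # divided by norm_f(residuals), then the root
print(value)                             # ≈ 3.291, matching "RMSE (default)" above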
7.10.6.7. Loss Function API
User-specific loss functions can be defined by inheriting from the base class below. Please make sure that your loss returns a tuple of two values: fx and contributions (see below). The latter should contain the percentage contribution of each residual entry to the overall loss function value.
Note that although the residuals are depicted as a single vector throughout the documentation, the data structure that a Loss receives is a List[1d array], where every element in the list stores the (weighted) residuals vector of the respective Data Set entry.
class Loss(inner_f, norm_f)

Base class for the mathematical definition of a loss function.

__init__(inner_f, norm_f)

Initialize the loss instance. Derived classes should call super().__init__(inner_f, norm_f).

Parameters:

- inner_f : callable
  When the loss instance is called, this callable is applied to each element i of the residuals (see __call__()), following the signature inner_f(i) -> float.
- norm_f : callable or None
  When the loss instance is called, this callable is applied to the complete residuals list (see __call__()), following the signature norm_f(residuals) -> float. It can be considered a normalization function for losses that require it (such as all mean losses), such that the returned loss function value is fx/norm_f(residuals). If None, the normalization factor will be 1.
abstract __call__(residuals: List[numpy.ndarray], weights: List[numpy.ndarray]) → float

When DataSet.evaluate() is called, reference and predicted values are extracted for each entry and combined into a list of residuals where every entry represents \((y_i-\hat{y}_i)/\sigma_i\). The loss computes a metric given this residuals vector, where each entry is weighted by weights. This method should return two values: the numerical loss and a 1d array of per-entry contributions to the former.

Parameters:

- residuals : List of 1d arrays
  List of \((y_i-\hat{y}_i)/\sigma_i\) elements.
- weights : List of 1d arrays
  List of \(w_i\) elements. Each item in the list should have as many elements as the corresponding item in residuals.

Returns:

- loss : float
  Total calculated loss.
- contributions : ndarray
  1d array of per-entry contributions to the overall loss. Should have the same length as residuals.
__repr__()

Allow string representations of built-in losses.
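Below is a minimal sketch of a user-defined loss built on this API. The class name is hypothetical, and the per-entry summation, the default weight handling, and the percentage normalization of the contributions are assumptions pieced together from the descriptions above, not a verbatim built-in implementation:

import numpy as np
from scm.params.core.lossfunctions import Loss

class SumOfFourthPowers(Loss):
    """Hypothetical example loss: sum of fourth powers of the residuals."""

    def __init__(self):
        super().__init__(inner_f=lambda x: np.sum(x**4), norm_f=None)

    def __call__(self, residuals, weights=None):
        # Treat missing weights as 1 (an assumption; DataSet.evaluate() always passes weights)
        if weights is None:
            weights = [np.ones(len(r)) for r in residuals]
        # Weights multiply the transformed residuals, following the w_i*(...)^2 convention above
        per_entry = np.array([np.sum(np.asarray(w) * np.asarray(r, dtype=float)**4)
                              for r, w in zip(residuals, weights)])
        fx = float(np.sum(per_entry))           # norm_f=None -> normalization factor 1
        contributions = 100.0 * per_entry / fx  # percentage per-entry contributions
        return fx, contributions

loss = SumOfFourthPowers()
fx, contribs = loss([np.array([1., -3.]), np.array([3., -4., 5.])])
print(fx)        # 1 + 81 + 81 + 256 + 625 = 1044.0
print(contribs)  # [~7.9, ~92.1], percentages per data_set entry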