3.11. Loss Functions

Loss functions are metrics that evaluate the residuals vector between reference and predicted properties, \((\boldsymbol{w}/\boldsymbol{\sigma})(\boldsymbol{y} - \boldsymbol{\hat{y}})\), which is generated every time DataSet.evaluate() is called. Note that although DataSet.evaluate() returns the unweighted residuals, the loss function always receives a residuals vector weighted by \(w/\sigma\).
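The weighting can be sketched with plain NumPy. All values below are hypothetical and only illustrate how \((\boldsymbol{w}/\boldsymbol{\sigma})(\boldsymbol{y} - \boldsymbol{\hat{y}})\) is formed:

```python
import numpy as np

# Hypothetical reference values, predictions, weights and sigmas
y     = np.array([1.0, 2.0, 4.0])   # reference properties
y_hat = np.array([1.5, 1.5, 3.0])   # predicted properties
w     = np.array([1.0, 2.0, 1.0])   # per-entry weights
sigma = np.array([0.5, 1.0, 2.0])   # per-entry sigmas

# The weighted residuals vector that a loss function receives
residuals = (w / sigma) * (y - y_hat)
print(residuals)  # [-1.   1.   0.5]
```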

By default, the following string keywords are recognized as loss functions:
  • lad, lae : Least Absolute Error
  • rmsd, rmse : Root-Mean-Square Deviation
  • mad, mae : Mean Absolute Deviation
  • sse, rss : Sum of Squared Errors (this is the default optimization loss)

Any of these can be passed to an Optimization in one of the following ways:

my_optimization = Optimization(*args, loss='mae')  # as a string keyword

from scm.params.core.lossfunctions import MAE  # loss functions are not imported automatically
my_optimization = Optimization(*args, loss=MAE())  # or as a class instance

After calling my_optimization.optimize(), generated properties will then be compared using the MAE. A loss function can also be passed to DataSet.evaluate() in the same way.

3.11.1. Least Absolute Error

class LAE

Least absolute error (LAE), least absolute deviations (LAD) loss.

(3.5)\[L_\mathrm{LAE} = \sum_{i=1}^N | y_i - \hat{y}_i |\]

Accessible with the strings 'lae', 'lad'.

3.11.2. Mean Absolute Error

class MAE

Mean Absolute Error (MAE, MAD) loss.

(3.6)\[L_\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^N | y_i - \hat{y}_i |\]

Accessible with the strings 'mae', 'mad'.

3.11.3. Root-Mean-Square Error

class RMSE

Root-Mean-Square Error (RMSE, RMSD) loss.

(3.7)\[L_\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^N (y_i - \hat{y}_i)^2 }\]

Accessible with the strings 'rmse', 'rmsd'.

3.11.4. Sum of Squares Error

class SSE

Residual Sum of Squares (RSS) or Sum of Squared Errors (SSE) loss. This loss function is commonly used for ReaxFF parameter fitting.

(3.8)\[L_\mathrm{SSE} = \sum_{i=1}^N (y_i - \hat{y}_i)^2\]

Accessible with the strings 'sse', 'rss'.
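The four built-in losses defined in Eqs. (3.5)-(3.8) can be reproduced directly with NumPy. The residuals vector below is hypothetical; the snippet only illustrates how the metrics relate to one another on the same data:

```python
import numpy as np

# Illustrative (hypothetical) weighted residuals vector
r = np.array([-1.0, 1.0, 0.5, -0.5])
N = r.size

lae  = np.sum(np.abs(r))        # Least Absolute Error, Eq. (3.5)
mae  = np.mean(np.abs(r))       # Mean Absolute Error, Eq. (3.6)
rmse = np.sqrt(np.mean(r**2))   # Root-Mean-Square Error, Eq. (3.7)
sse  = np.sum(r**2)             # Sum of Squared Errors, Eq. (3.8)

# Note the relations: MAE = LAE / N and SSE = N * RMSE**2
print(lae, mae, sse)  # 3.0 0.75 2.5
```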

3.11.5. Loss Function API

User-defined loss functions can be created by inheriting from the base class below. Make sure that your loss defines the attributes fx and contribution. The latter should hold the percentage contribution of each residual to the overall loss function value.

Note that although the residuals are depicted as a single vector throughout the documentation, the data structure that a Loss receives is a List[1d array], where every element in the list stores the (weighted) residuals vector of the respective Data Set entry.

class Loss

Base class for the mathematical definition of a loss function.

__call__(residuals: List[numpy.ndarray]) → Tuple[float, numpy.ndarray]

When DataSet.evaluate() is called, reference and predicted values are extracted for each entry and combined into a weighted list of residuals, where every element represents \((w_i/\sigma_i)(y_i-\hat{y}_i)\). The loss computes a metric from this residuals vector.
This method should return two values: the numerical loss, and a 1d array of per-entry contributions to it.

Parameters:
residuals : List of 1d arrays
List of \((w_i/\sigma_i)(y_i-\hat{y}_i)\) elements.
Returns:
loss: float
Total calculated loss
contributions: ndarray
1d array of per-entry contributions to the overall loss
__repr__()

Allow string representations of built-in losses.
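A user-defined loss following the documented interface might look like the sketch below. It is written as a standalone class so that it runs without ParAMS installed; in real code it would inherit from scm.params.core.lossfunctions.Loss instead. The choice of a Huber-style metric and its delta parameter are illustrative, not part of the ParAMS API:

```python
from typing import List, Tuple
import numpy as np

class HuberLoss:
    """Sketch of a user-defined loss following the documented interface.

    Standalone stand-in: in real code this would inherit from
    scm.params.core.lossfunctions.Loss (omitted here so the snippet
    runs without ParAMS installed).
    """

    def __init__(self, delta: float = 1.0):
        self.delta = delta  # illustrative threshold parameter

    def __call__(self, residuals: List[np.ndarray]) -> Tuple[float, np.ndarray]:
        per_entry = []
        for r in residuals:  # one 1d residuals array per Data Set entry
            a = np.abs(r)
            # Huber: quadratic for small residuals, linear for large ones
            h = np.where(a <= self.delta,
                         0.5 * a**2,
                         self.delta * (a - 0.5 * self.delta))
            per_entry.append(h.sum())
        per_entry = np.array(per_entry)
        self.fx = float(per_entry.sum())                  # total loss value
        self.contribution = 100.0 * per_entry / self.fx   # percent per entry
        return self.fx, self.contribution

# Usage on a hypothetical List[1d array] of weighted residuals
loss = HuberLoss(delta=1.0)
fx, contrib = loss([np.array([-1.0, 0.5]), np.array([2.0])])
print(fx)  # 2.125
```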