3.10. Callbacks

Callbacks allow further interaction with a running Optimization instance. A useful callback could, for example, signal the optimization to stop after a certain time or when overfitting is detected.

Callback instances can be passed when an Optimization instance is created:

callbacks       = [Timeout(60*60), Logger()]
my_optimization = Optimization(*args, callbacks=callbacks)

3.10.1. Logger

class Logger(printfreq=100, names_to_log=None, path=None, writefreq_history=None, writefreq_bestparams=None, writefreq_datafiles=None, plot=False, kw_plot_history={}, kw_plot_contributions={}, kw_plot_residuals={})

Note

This callback is always included in an optimization when running the ParAMS main script.

Combined callback that logs and saves the following data produced during an Optimization to disk for every Data Set provided:

*history.dat: The (fx,x) value pair at every evaluation (Note: Unscaled x, active subset only). Plottable with params plot
*best_params: File or path of the parameter set x with the best f(x) so far
residuals/ : Residuals of reference and predicted values \(y-\hat{y}\). Plottable with params plot
contributions/: Contributions of individual entries in the Data Set to the overall loss function value. Plottable with params plot

Note

Including this callback with at least the default settings is generally recommended: without it, there will be no relevant data written to disk. See the examples section for example output generated by this callback.

Parameters:

printfreq : int >= 0: Print the evaluation number every printfreq evaluations. Set to 0 to disable.
names_to_log : Sequence[str]: Names of the Data Sets (as set by the Optimization) to log.
Defaults to all Data Sets.
path : str: Base name of the path where the data should be written to.
Defaults to the Optimization workdir.
writefreq_history : int >= 0: Write the history.dat every n calls.
Defaults to writing on every improvement of the loss function value. Set to zero to disable logging.
writefreq_datafiles : int >= 0: Write the predictions.dat + combinations.dat every n calls.
Defaults to writing on every improvement of the loss function value. Set to zero to disable logging.
writefreq_bestparams : int >= 0: Write the best_params every n calls.
Defaults to writing on every improvement of the loss function value. Set to zero to disable logging.

3.10.2. Timeout

class Timeout(timeout_seconds, verbose=True): Stop the optimization after timeout_seconds seconds. If verbose, prints a message when invoked.

3.10.3. Target Value

class TargetValue(min_fx, verbose=True): Stop the optimization when the training set loss is less than or equal to min_fx. If verbose, prints a message when invoked.

3.10.4. Maximum Iterations

class MaxIter(max_iter, verbose=True): Stop the optimization after max_iter evaluations. If verbose, prints a message when invoked.

3.10.5. Early Stopping

class EarlyStopping(watch='trainingset', patience=0, verbose=True): Stop the optimization if the loss of the data set named in watch does not improve within patience iterations. If verbose, prints a message when invoked.

3.10.6. Stopfile

class Stopfile(fname='STOP', frequency=10, verbose=True): Every frequency evaluations, check if a file named fname exists and stop the optimization if it does. Note that paths will be relative to the optimization directory.
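
The periodic check described above can be pictured with a small stand-alone sketch. This is not the ParAMS implementation: StopfileSketch and its workdir argument are hypothetical names introduced for illustration, and the class does not derive from Callback so that it runs without ParAMS installed.

```python
import os
import tempfile


class StopfileSketch:
    """Hypothetical stand-in that mimics the behaviour described for Stopfile."""

    def __init__(self, fname="STOP", frequency=10, workdir="."):
        self.path = os.path.join(workdir, fname)  # resolved relative to workdir
        self.frequency = frequency
        self.ncalls = 0

    def __call__(self):
        self.ncalls += 1
        # Only touch the file system every `frequency` calls.
        if self.ncalls % self.frequency != 0:
            return None
        # A non-None return value is treated as a stop signal.
        return True if os.path.exists(self.path) else None


with tempfile.TemporaryDirectory() as workdir:
    check = StopfileSketch(frequency=3, workdir=workdir)
    before = [check() for _ in range(3)]   # no STOP file yet
    open(os.path.join(workdir, "STOP"), "w").close()
    after = [check() for _ in range(3)]    # file is noticed on the 6th call
```

Checking only every few evaluations keeps file system access cheap; the trade-off is that a stop request may take up to frequency evaluations to be noticed.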

3.10.7. Time per Evaluation

class TimePerEval(printfrequency=100, watch=None, workers=1, moving_average=50): Print the average evaluation time of a new parameter set x every printfrequency iterations.
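
As a rough illustration of a moving average over evaluation times, the sketch below keeps only the most recent samples in a fixed-size window. It assumes that moving_average denotes the window length, which the text above does not state explicitly; MovingAverage is a hypothetical helper and not part of ParAMS.

```python
from collections import deque


class MovingAverage:
    """Hypothetical helper: mean of the most recent `window` samples."""

    def __init__(self, window=50):
        # A bounded deque drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def update(self, seconds):
        self.samples.append(seconds)
        return sum(self.samples) / len(self.samples)


avg = MovingAverage(window=3)
# The fourth sample pushes the first one (1.0) out of the window.
history = [avg.update(t) for t in (1.0, 2.0, 3.0, 10.0)]
```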

3.10.8. Load Average

class LoadAvg(fname, frequency=20): Wrapper around psutil.getloadavg(), printing the output to fname. Requires psutil version >= 5.6.2.
Note that when using relative file paths, the location will be relative to the optimization directory.

3.10.9. User-Defined Callbacks

The abstract Callback class allows the user to define custom optimization hooks. We will demonstrate the implementation of EarlyStopping as an example below.

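from __future__ import annotations  # type hints below stay unevaluated, so their names need no import

from typing import Sequence, Type, Union

import numpy as np
from scm.params import Callback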

class EarlyStopping(Callback):
    def __init__(self, patience=0):
        self.patience = patience
        self.count    = 0
        self.fxmin    = float('inf')

    def __call__(self,
                 fx         : float,
                 x          : Sequence[float],
                 name       : str,
                 ncalled    : int,
                 interface  : Type[BaseParameters],
                 dataset    : DataSet,
                 contrib    : dict,
                 results    : Union[AMSResults, AMSWorkerResults]
                 ):
        '''
        Callbacks operate on **ALL** Data Sets that are evaluated at every optimization step,
        meaning there could be more than one Data Set involved: This is for example the case when splitting
        into a training and a validation set.

        You can filter which Data Sets the callback operates on by checking the passed `name` argument --
        those are always unique per Optimization instance.
        '''

        if name == 'validationset': # Only apply to the validation set
            if np.isnan(fx): # nan means no evaluation for this call
                return
            if fx < self.fxmin:
                self.count = 0  # Reset the counter if we improved
                self.fxmin = fx # Adjust the best fx value
            else:
                self.count += 1 # Patience counter

            if self.count > self.patience: # Do we need to stop?
                return True # Any value other than None signals a stop

3.10.10. Callback API

class Callback

Abstract base class for callbacks.

__call__(evalret: scm.params.core.opt_components.EvaluatorReturn) → Any

This method will be called by the optimizer at the end of every step.

Parameters:

evalret : EvaluatorReturn (named tuple)

A named tuple of type scm.params.core.opt_components.EvaluatorReturn. The tuple unpacks to (fx, x, name, ncalled, interface, dataset, residuals, contrib, time). The field names also double as attributes (e.g., fx can be accessed with evalret.fx).

fx : float: Loss function value of x
x : Sequence[float]: The current set of parameters suggested by the optimizer. (real, not scaled)
name : str: Name of the Data Set as set by the Optimization class. Can be ‘trainingset’, ‘validationset’ or ‘datasetXX’ (where XX is an int) by default.
ncalled : int: The number of times this Data Set has been evaluated
interface : BaseParameters subclass: The interface that was used for this evaluation of the data set
dataset : DataSet: A tuple of DataSet and the last evaluation’s (non flattened) residuals vector
residuals : List[1d-array]: A list of 1d numpy arrays holding the residuals to each data set entry such that \(r=y-\hat{y}\). See Data Set for more information.
contrib : List: List of per entry contributions to the loss function value (see also scm.params.core.dataset.DataSet.evaluate()).
time : float: Wall time (in seconds) this evaluation took

Returns:

Any value other than None will be interpreted by the optimizer as a signal to stop the optimization process.
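
The sketch below illustrates the two access styles for evalret and the "anything but None stops" convention. It uses a stand-in named tuple with the field order listed above instead of importing the real class, and stop_below and should_stop are hypothetical helpers; in particular, should_stop is only an assumption about how an optimizer might combine several callbacks.

```python
from collections import namedtuple

# Stand-in with the documented field order; the real class lives in
# scm.params.core.opt_components and is not imported here.
EvaluatorReturn = namedtuple(
    "EvaluatorReturn",
    "fx x name ncalled interface dataset residuals contrib time",
)


def stop_below(threshold):
    """Build a callback that requests a stop once the training loss is low enough."""
    def callback(evalret):
        if evalret.name == "trainingset" and evalret.fx <= threshold:
            return True   # non-None: request a stop
        return None       # None: keep going
    return callback


def should_stop(callbacks, evalret):
    """Hypothetical optimizer-side check: stop if any callback returns non-None."""
    results = [cb(evalret) for cb in callbacks]  # every callback gets called
    return any(r is not None for r in results)


evalret = EvaluatorReturn(0.5, [1.0, 2.0], "trainingset", 7, None, None, [], [], 0.01)
fx, x, name, *rest = evalret   # positional unpacking; evalret.fx works as well
```

Because only None means "continue", a callback that returns False would also stop the run under this convention, so returning None explicitly (or not returning at all) is the safer way to continue.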

reset(): This method should re-initialize the callback and will be called when a new Optimization instance is created containing this callback. It should reset the callback to its initial state, making the same instance available for multiple Optimization instances (e.g., in the case of Timeout, the same instance must be reset to restart the timer).

on_end(): This method will be called once the optimization is complete.
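
To make the role of the three hooks concrete, here is a minimal timer sketch with __call__, reset and on_end. It is not the ParAMS Timeout class: TimeoutSketch, its clock argument and the finished flag are hypothetical, and the clock is injectable only so that the example behaves deterministically.

```python
import time


class TimeoutSketch:
    """Hypothetical timer callback mirroring the __call__/reset/on_end hooks."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock   # injectable so the example is deterministic
        self.reset()

    def reset(self):
        # Restart the timer so the same instance can serve a new optimization.
        self.start = self.clock()
        self.finished = False

    def __call__(self, evalret=None):
        elapsed = self.clock() - self.start
        return True if elapsed >= self.timeout_seconds else None

    def on_end(self):
        self.finished = True


ticks = iter([0.0, 5.0, 12.0, 20.0, 21.0])   # fake clock readings in seconds
cb = TimeoutSketch(10, clock=lambda: next(ticks))
first = cb()    # 5 s elapsed: continue
second = cb()   # 12 s elapsed: stop
cb.on_end()
cb.reset()      # timer restarts at the 20 s reading
third = cb()    # 1 s elapsed since reset: continue
```

Without the reset call, the third check would still measure time from the first start and stop immediately, which is the situation reset() is meant to prevent.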
