4.5. Data Set Evaluator¶
4.5.1. DataSetEvaluator class¶
DataSetEvaluator is a class with two main functions:

DataSetEvaluator.calculate_reference() will evaluate a data_set with any engine settings and set the reference values.

DataSetEvaluator.run() will evaluate the data_set with any engine settings, and provides many functions to compare the predicted values with the reference values. You can only use run() if all data_set entries already have reference values.
After calling run(), you will be able to get

- summary statistics like the mean absolute error (MAE) and root-mean-squared error (RMSE)
- partial contributions to the loss function value
- tables with reference and predicted values in columns next to each other, which can be plotted with params plot
- grouped summary statistics, partial contributions, and reference-vs-prediction tables, grouped by extractor, expression, or any metadata key-value pairs (see the sketch below)
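As a minimal sketch of this post-run access (the attribute and method names follow the GroupedResults description in the API section below; the 'Forces' group name is only an example and depends on your data_set):

# after dse.run(...) has finished:
print(dse.results.mae)              # mean absolute error over all entries
print(dse.results.rmse)             # root-mean-squared error
dse.group_by(('Extractor',))        # regroup by extractor
print(dse.results['Forces'].mae)    # statistics for a single group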
4.5.1.1. Example: DataSetEvaluator.calculate_reference()¶
Note

The examples below use the plams.Settings class to define a computational engine. See the PLAMS documentation for more information about it.
from scm.params import *
from scm.plams import Settings
dse = DataSetEvaluator()
# any engine settings are possible
engine_settings = Settings()
engine_settings.input.ForceField.Type = 'UFF'
# a job collection is needed, can for example be loaded from disk
job_collection = JobCollection('job_collection.yaml')
# the data_set to be evaluated, can for example be loaded from disk
data_set = DataSet('data_set.yaml')
# print the original expression : reference value
print("Original reference values:")
for ds_entry in data_set:
    print("{}: {}".format(ds_entry.expression, ds_entry.reference))
# calculate reference. Set folder=None to not store the finished jobs on disk (can be faster)
# set overwrite=True to overwrite existing reference values
dse.calculate_reference(job_collection, data_set, engine_settings, overwrite=False, folder='saved_results')
# print the new expression : reference value
print("New reference values:")
for ds_entry in data_set:
    print("{}: {}".format(ds_entry.expression, ds_entry.reference))
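If you want to keep the newly calculated reference values, you can write the modified data_set back to disk. A short sketch, assuming the DataSet.store() method (the output file name is arbitrary):

# save the data_set with its new reference values
data_set.store('data_set_with_reference.yaml')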
4.5.1.2. Example: DataSetEvaluator.run()¶
from scm.params import *
from scm.plams import Settings
dse = DataSetEvaluator()
# any engine settings are possible
engine_settings = Settings()
engine_settings.input.ForceField.Type = 'UFF'
# a job collection is needed, can for example be loaded from disk
job_collection = JobCollection('job_collection.yaml')
# the data_set to be evaluated, can for example be loaded from disk
data_set = DataSet('data_set.yaml')
# run. Set folder=None to not store the finished jobs on disk (can be faster)
dse.run(job_collection, data_set, engine_settings, folder='saved_results')
# group the results by Extractor and then by Expression
dse.group_by(('Extractor', 'Expression'))
print(dse.str(stats=True, details=True))
# store the calculated results in a format that can later be
# used to initialize another DataSetEvaluator
dse.store('data_set_predictions.yaml')
dse.pickle_dump('data_set_evaluator.pkl')
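The same information printed by dse.str() can also be accessed programmatically through the results attribute (a sketch based on the GroupedResults access shown in the API section below; 'Forces' is an example group name that depends on your data_set):

print(dse.results.mae, dse.results.rmse)        # overall statistics
print(dse.results['Forces'].mae)                # statistics for one extractor group
print(dse.results['Forces'].detailed_string())  # prediction-vs-reference table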
4.5.1.3. Example: Load a saved DataSetEvaluator¶
The previous example used the store() and pickle_dump() methods to store the calculated results in text (.yaml) and binary (.pkl) formats. They can be loaded as follows:
from scm.params import *
from scm.plams import Settings
dse = DataSetEvaluator('data_set_predictions.yaml')
print(dse)
# to load from binary .pkl one needs to call the .pickle_load() method
# and provide a path to the original data_set
dse2 = DataSetEvaluator()
dse2.pickle_load('data_set_evaluator.pkl', data_set='data_set.yaml')
print(dse2)
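A restored evaluator behaves like a freshly run one, so it can be regrouped and summarized again (a sketch, assuming the same str(stats=..., details=...) signature used in the run() example above):

dse2.group_by(('Extractor',))
print(dse2.str(stats=True, details=True))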
4.5.2. DataSetEvaluator API¶
class DataSetEvaluator(data_set=None, total_loss=None, residuals=None, contributions=None, raw_predictions=None, predictions=None, modified_reference=None, loss=None)¶

Convenience class for evaluating a data_set with any engine.

Run the evaluation with the run() function. Then group the results based on the Extractor, Expression, or metadata key-value pairs with the group_by() method.

Print the results with str(stats=True, details=True):

- stats=True will give the mean absolute error, root mean squared error, and partial contributions to the loss function
- details=True will give a table of prediction vs. reference

The results are stored in the results attribute. It is of type GroupedResults, and can be accessed as follows:

>>> dse = DataSetEvaluator()
>>> dse.run(job_collection, data_set, engine_settings)
>>> dse.group_by(('Group', 'SubGroup')) # for grouping by Group and SubGroup metadata keys
>>> dse.results.mae
>>> dse.results.rmse
>>> dse.results['Forces'].mae
>>> dse.results['Forces']['trajectory_1'].mae
>>> str(dse.results)
>>> dse.results.detailed_string()
>>> dse.results['Forces'].str()
>>> dse.results['Forces'].detailed_string()
>>> dse.results['Forces'].residuals
>>> dse.results['Forces'].predictions
>>> dse.results['Forces'].reference_values

etc.
__init__(data_set=None, total_loss=None, residuals=None, contributions=None, raw_predictions=None, predictions=None, modified_reference=None, loss=None)¶

Typically you should initialize this class without arguments, i.e., as

>>> dse = DataSetEvaluator()

data_set, predictions, residuals, contributions, and total_loss can either be set in this constructor, or will internally be calculated with the run() method.

- data_set : DataSet
  The data_set that was evaluated
- total_loss : float
  Return value from data_set.evaluate(results, return_residuals=True)[0]
- residuals : list
  Return value from data_set.evaluate(results, return_residuals=True)[1]
- contributions : list
  Return value from data_set.evaluate(results, return_residuals=True)[2]
- raw_predictions : list
  Return value from data_set.evaluate(results, return_residuals=True)[3]
- predictions : list
  Return value from data_set.get_predictions(raw_predictions, return_reference=True)[0]
- modified_reference : list
  Return value from data_set.get_predictions(raw_predictions, return_reference=True)[1]
- loss : a LossFunction or str
  The type of loss function that was used to calculate total_loss
calculate_reference(job_collection: scm.params.core.jobcollection.JobCollection, data_set: scm.params.core.dataset.DataSet, engine_settings, overwrite=False, use_pipe=True, folder=None, parallel=None, use_origin=False)¶

Method to calculate and set the reference values for the entries in data_set. This method will change the data_set! It does not modify the DataSetEvaluator instance.

- engine_settings : Settings or EngineCollection
  If a Settings instance, it defines the reference engine used to calculate all the jobs.
  If an EngineCollection, every job in the job_collection must have a ReferenceEngineID (reference_engine) that is present in the EngineCollection; the settings are then taken from that collection. If more than one engine is needed to evaluate the jobs, you must pass in an EngineCollection (see the sketch after this list).
- overwrite : bool
  If False, only calculate reference values for data set entries that have no reference value. If True, calculate all reference values.
- use_origin : bool
  If a job in the job_collection has the "Origin" metadata pointing to an ams.rkf results file on disk, results are loaded from that file instead of rerunning the job.
  If both the "Origin" and "Frame" metadata keys exist, data will be taken from the correct frame in the trajectory.
  If the "Origin", "Frame", and "OriginalEnergyHartree" metadata keys all exist, the energy will be taken from the OriginalEnergyHartree metadata if the ams.rkf in Origin cannot be loaded (for example, if it exists on a different machine).
  If loading data from the "Origin" or "OriginalEnergyHartree" fails, the job will be run.

job_collection, data_set, use_pipe, folder, and parallel have the same meaning as in the run() method.
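For the multi-engine case, a minimal sketch of calling calculate_reference() with an EngineCollection (the file names are only examples; every job must carry a reference_engine ID present in the collection):

>>> job_collection = JobCollection('job_collection.yaml')
>>> data_set = DataSet('data_set.yaml')
>>> engines = EngineCollection('job_collection_engines.yaml')
>>> dse = DataSetEvaluator()
>>> dse.calculate_reference(job_collection, data_set, engines, overwrite=True, folder=None)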
run(job_collection: scm.params.core.jobcollection.JobCollection, data_set: scm.params.core.dataset.DataSet, engine_settings: scm.plams.core.settings.Settings, loss='sse', use_pipe=True, folder=None, parallel=None, group_by=None)¶

Runs the jobs in the job collection using the engine defined by engine_settings, and evaluates the data_set expressions.

- job_collection : JobCollection
  The job collection containing the jobs
- data_set : DataSet
  The data_set containing the expressions to be evaluated
- engine_settings : Settings
  The engine settings to be used. Example:

  >>> engine_settings = Settings()
  >>> engine_settings.input.ForceField.Type = 'UFF'

- loss : str or Loss
  The type of loss function
- use_pipe : bool
  Whether to use the pipe interface if possible. This will speed up the calculation. Cannot be combined with folder.
- folder : str
  If folder is not None, the results will be stored on disk in that folder. If the folder already exists, a new one is created. Setting a folder automatically disables the pipe interface.
- parallel : ParallelLevels
  Defaults to ParallelLevels(parametervectors=1, processes=1, threads=1). This will run N jobs in parallel, where N is the number of cores on the machine.
- group_by : tuple of str
  Group results according to the tuple. The grouping can also be changed after the run with the group_by() method (see the sketch after this list).
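As a sketch of a run() call with the options above set explicitly (the 'mae' loss string is an assumption; 'sse' is the documented default):

>>> parallel = ParallelLevels(parametervectors=1, processes=1, threads=1)
>>> dse.run(job_collection, data_set, engine_settings, loss='mae', folder=None, parallel=parallel, group_by=('Extractor',))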
group_by(group_by)¶

Group the results according to group_by. The run() method needs to be called before calling this method.

- group_by : tuple of str

>>> dse.group_by(('Extractor',)) # group by extractor
>>> dse.group_by(('Extractor', 'Expression')) # group by extractor, then expression. The expression will be filtered
>>> dse.group_by(('Group', 'SubGroup')) # group by the metadata key Group, then by the metadata key SubGroup
__str__()¶

Return str(self).
pickle_load(fname, data_set=None, more_extractors=None)¶

Loads a DataSetEvaluator from a pickled file.
pickle_dump(fname, data_set_fname=None)¶

Stores the DataSetEvaluator to a (compressed) pickled file. The file will automatically be compressed when the file name ends in .gz or .gzip.

NOTE: the data_set is not stored in the same file as the DataSetEvaluator! The data_set is only stored if the data_set_fname argument is given.
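A sketch combining the two behaviors documented above: a .gz file ending to trigger compression, and data_set_fname to store the data_set alongside the evaluator (both file names are arbitrary):

>>> dse.pickle_dump('data_set_evaluator.pkl.gz', data_set_fname='data_set.yaml')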