2.3. Getting Started: Python
Important
This tutorial is only compatible with ParAMS 2023.1 or later.
See also
Documentation for ParAMSJob and ParAMSResults.
Documentation for DataSetEvaluator
2.3.1. Quickstart
To follow along, either
Download plamsjob_quickstart.py, or
Download plamsjob_quickstart.ipynb
(see also: how to install JupyterLab)
2.3.1.1. Run a ParAMSJob for Lennard-Jones
ParAMS uses PLAMS to run jobs through Python. PLAMS offers many functions for handling jobs. To run jobs through PLAMS, you can either
use the $AMSBIN/plams program, or
use the $AMSBIN/amspython program. You must then call init() before running jobs.
Here, we use the second approach.
# first import all plams and params functions and classes
from scm.plams import *
from scm.params import *
import os
# call PLAMS init() to set up a new directory for running jobs
# set path=None to use the current working directory
# the default folder name is 'plams_workdir'
init(path='/tmp', folder='demo_paramsjob')
PLAMS working folder: /tmp/demo_paramsjob
Below we show how to set up and run a ParAMSJob using a params.in file taken from the Getting Started tutorial. The job should take less than 2 minutes to finish.
# load all the settings for the job from a "params.in" file
params_in_file = os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar/params.in')
job = ParAMSJob.from_inputfile(params_in_file)
# set a name for the job
job.name = "LJ_Ar"
# run the job
job.run();
[22.03|13:45:53] JOB LJ_Ar STARTED
[22.03|13:45:54] JOB LJ_Ar RUNNING
[22.03|13:46:17] JOB LJ_Ar FINISHED
[22.03|13:46:17] JOB LJ_Ar SUCCESSFUL
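You can also check programmatically whether the job finished successfully. The sketch below uses the standard PLAMS ok() method, which is available for all PLAMS jobs and is not specific to ParAMS:
# returns True if the job finished without errors
print(job.ok())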
To find out where the job and its results are stored:
print(f"The job was run in: {job.path}")
print(f"Contents of the job directory: {os.listdir(job.path)}")
print(f"The results are stored in: {job.results.path}")
print(f"Contents of the results directory: {os.listdir(job.results.path)}")
The job was run in: /tmp/demo_paramsjob/LJ_Ar
Contents of the job directory: ['LJ_Ar.out', 'LJ_Ar.run', 'LJ_Ar.dill', 'results', 'LJ_Ar.in', 'LJ_Ar.err']
The results are stored in: /tmp/demo_paramsjob/LJ_Ar/results
Contents of the results directory: ['settings_and_initial_data', 'optimization']
2.3.1.2. Access the results
When a job has finished, you will want to access its results. The job may have been run via the GUI or with ParAMSJob as above. Typically, you would write another Python script and load the finished (or running) job:
#job = ParAMSJob.load_external(results_dir)
#in this example it would be
#job = ParAMSJob.load_external('/tmp/demo_paramsjob/LJ_Ar/results')
In this tutorial, there is no need to explicitly load the job again with load_external since the job was run in the same script, so the lines above are commented out.
The results can be accessed with job.results, which is of type ParAMSResults.
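If the job had instead been run elsewhere (for example through the GUI), a minimal sketch of a separate analysis script could look like this (the results path is the one from this example):
from scm.params import ParAMSJob
# load the finished job from its results directory
job = ParAMSJob.load_external('/tmp/demo_paramsjob/LJ_Ar/results')
# print the best loss function value found during the optimization
print(job.results.get_loss(source='best'))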
Below we print a table with the initial and best Lennard-Jones parameters eps and rmin, and the corresponding loss function values.
# compare the results
initial_interface = job.results.get_parameter_interface(source='initial')
initial_loss = job.results.get_loss(source='initial')
best_interface = job.results.get_parameter_interface(source='best')
best_loss = job.results.get_loss(source='best')
print("{:12s} {:>12s} {:>12s} {:>12s}".format("", "eps", "rmin", "loss"))
print("{:12s} {:12.7f} {:12.5f} {:12.5f}".format("Initial",
initial_interface['eps'].value,
initial_interface['rmin'].value,
initial_loss))
print("{:12s} {:12.7f} {:12.5f} {:12.5f}".format("Best",
best_interface['eps'].value,
best_interface['rmin'].value,
best_loss))
eps rmin loss
Initial 0.0003000 4.00000 572.18867
Best 0.0001961 3.65375 0.00251
Let’s also plot the running loss function value vs. evaluation number:
import matplotlib.pyplot as plt
import numpy as np
evaluation, loss = job.results.get_running_loss()
plt.plot(evaluation, np.log10(loss), '-')
plt.ylabel("log10(loss)")
plt.xlabel("Evaluation number");
To see the parameter values at different evaluations:
evaluation, parameters = job.results.get_running_active_parameters()
plt.plot(evaluation, parameters['rmin'])
plt.xlabel("Evaluation id")
plt.ylabel("Value of rmin");
You can make a scatter plot of reference vs. predicted forces with the help of the get_data_set_evaluator() function, which returns a DataSetEvaluator:
dse = job.results.get_data_set_evaluator()
forces = dse.results['forces']
plt.plot(forces.reference_values, forces.predictions, '.')
plt.xlabel(f"Reference force ({forces.unit})")
plt.ylabel(f"Predicted force ({forces.unit})");
# overlay the y = x diagonal without letting it rescale the axes
plt.xlim(auto=True)
plt.autoscale(False)
plt.plot([-10, 10], [-10, 10], linewidth=5, zorder=-1, alpha=0.3, c='red')
plt.show()
For all the ways the DataSetEvaluator can be used, see the Data Set Evaluator documentation.
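As a small illustration, you can also compute simple error metrics directly from the reference_values and predictions arrays used in the scatter plot above (plain NumPy here, not DataSetEvaluator methods; this assumes the arrays are flat, as in the plot):
import numpy as np
# difference between predicted and reference forces
errors = np.asarray(forces.predictions) - np.asarray(forces.reference_values)
print(f"MAE:  {np.mean(np.abs(errors)):.4f} {forces.unit}")
print(f"RMSE: {np.sqrt(np.mean(errors**2)):.4f} {forces.unit}")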
2.3.1.3. Call PLAMS finish()
If you used PLAMS to run jobs and called init() at the beginning, the finish() function should be called at the end.
finish()
[22.03|13:46:17] PLAMS run finished. Goodbye
More results to extract can be found in the ParAMSResults
API.
2.3.2. Setting up a ParAMSJob
To follow along, either
Download plamsjob_settings.py, or
Download plamsjob_settings.ipynb
(see also: how to install JupyterLab)
A ParAMSJob has a settings attribute which contains the settings for the job. The settings will be converted to the params.in file before the job is run.
See also
Documentation for PLAMS Settings
Documentation for preparing input for AMSJobs. ParAMSJobs use the same input preparation as AMSJobs.
The first example below shows how to set all the input settings manually, which can be quite tedious. Further down, some helper functions are shown that make it easier to set up the settings.
2.3.2.1. Manually set ParAMSJob Settings
Call the get_input() function to see what the params.in file will look like.
Note:
All paths must be absolute paths. This is needed to be able to run the job through PLAMS.
DataSet and Optimizer are recurring blocks, so they are initialized as lists. The DataSet[0] syntax means that the settings are set for the first dataset, etc.
from scm.plams import *
from scm.params import *
import os
job = ParAMSJob()
job.settings.input.Task = 'Optimization'
job.settings.input.JobCollection = '/path/job_collection.yaml' # absolute path
job.settings.input.DataSet = [Settings(), Settings()] #DataSet is a recurring block
job.settings.input.DataSet[0].Name = 'training_set'
job.settings.input.DataSet[0].Path = '/path/training_set.yaml' # absolute path
job.settings.input.DataSet[1].Name = 'validation_set'
job.settings.input.DataSet[1].Path = '/path/validation_set.yaml' # absolute path
job.settings.input.LoggingInterval.General = 10
job.settings.input.SkipX0 = 'No' # Booleans are specified as strings "Yes" or "No"
job.settings.input.Optimizer = [Settings()] # Optimizer is a recurring block
job.settings.input.Optimizer[0].Type = 'CMAES'
job.settings.input.Optimizer[0].CMAES.Sigma0 = 0.01
job.settings.input.Optimizer[0].CMAES.Popsize = 8
job.settings.input.ParallelLevels.Optimizations = 1 # ParallelLevels is NOT a recurring block
print(job.get_input())
Task Optimization
DataSet
Name training_set
Path /path/training_set.yaml
end
DataSet
Name validation_set
Path /path/validation_set.yaml
end
JobCollection /path/job_collection.yaml
LoggingInterval
General 10
end
Optimizer
CMAES
Popsize 8
Sigma0 0.01
End
Type CMAES
end
ParallelLevels
Optimizations 1
end
SkipX0 No
2.3.2.2. Load a job from a params.in file
If you already have a params.in file (for example created by the GUI or by hand), you can simply load it into a ParAMSJob using from_inputfile().
Note that any paths in the params.in file get converted to absolute paths.
params_in_file = os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar/params.in')
job = ParAMSJob.from_inputfile(params_in_file)
print(job.get_input())
task Optimization
parameterinterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml
dataset
name training_set
path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end
jobcollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml
exitcondition
timelimit 120
type TimeLimit
end
exitcondition
maxoptimizersconverged 1
type MaxOptimizersConverged
end
optimizer
scipy
algorithm Nelder-Mead
End
type Scipy
end
parallellevels
jobs 1
optimizations 1
parametervectors 1
processes 1
end
2.3.2.3. Create a ParAMSJob from a directory with .yaml files
In the input file you need to specify many paths to different .yaml files. This can be tedious to set up manually. If you have a directory with .yaml files (e.g. the jobname.params directory created by the GUI), you can initialize a ParAMSJob to read those .yaml files using from_yaml(). The files need to have the default names:
job_collection.yaml
training_set.yaml
validation_set.yaml
job_collection_engines.yaml or engine_collection.yaml
parameter_interface.yaml or parameters.yaml
Note 1: When you run a ParAMSJob, any .yaml files in the current working directory will be used if they have the default names and the corresponding settings are unset. In this way, you do not need to specify the paths in the settings if you place the .yaml files in the same directory as the .py script that runs the job (see the sketch after these notes).
Note 2: from_yaml() only sets the settings for the .yaml files and leaves all other settings empty. from_inputfile() reads all the settings from the params.in file.
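The following sketch illustrates Note 1. It assumes (hypothetically) that the script sits in a directory that already contains job_collection.yaml, training_set.yaml, and parameter_interface.yaml with their default names, so those settings can be left unset:
from scm.plams import *
from scm.params import *
init()
job = ParAMSJob()                              # no .yaml paths set explicitly
job.add_optimizer("CMAES", {'Sigma0': 0.01})   # see the recurring-blocks section below
job.add_exit_condition("TimeLimit", 120)
job.run()
finish()
Returning to from_yaml(), loading the example directory looks like this: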
job = ParAMSJob.from_yaml(os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar'))
print(job.get_input())
ParameterInterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml
DataSet
Name training_set
Path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end
JobCollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml
2.3.2.4. Input validation
The allowed settings blocks, keys, and values are described in the documentation. If you make a mistake in the block or key names, get_input() will raise an error:
job.settings.input.NonExistingKey = 3.14159
try:
    print(job.get_input())
except Exception as e:
    print(e)
Input error: unrecognized entry "nonexistingkey" found in line 10
If you want to print the input anyway, use get_input(validate=False):
print(job.get_input(validate=False)) # print the input anyway
ParameterInterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml
DataSet
Name training_set
Path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end
JobCollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml
NonExistingKey 3.14159
2.3.2.5. Delete a block or key
To delete an entry from the Settings, use del:
del job.settings.input.NonExistingKey
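Deleting a whole block works the same way. For example, if you had set a LoggingInterval block earlier (hypothetical here, since this job does not have one), you could remove it with:
# remove an entire block from the settings
del job.settings.input.LoggingInterval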
2.3.2.6. Attributes for easier setup of .yaml files
ParAMSJob has some special attributes which make it easier to set up the settings.
job = ParAMSJob()
job.job_collection = 'my_job_collection.yaml' # will be converted to absolute path if it exists
job.training_set = 'my_training_set.yaml' # will be converted to absolute path if it exists
print(job.get_input())
DataSet
Name training_set
path my_training_set.yaml
end
JobCollection my_job_collection.yaml
Note that job.training_set and job.validation_set are quite special: when you assign a string to them as above, it will set the corresponding path in the settings. But when you read them, you will get the corresponding Settings block:
print(job.training_set)
print(type(job.training_set))
Name: training_set
path: my_training_set.yaml
<class 'scm.plams.core.settings.Settings'>
print(job.training_set.path)
print(type(job.training_set.path))
my_training_set.yaml
<class 'str'>
Assigning to job.validation_set will create another item in the job.settings.input.DataSet list:
job.validation_set = 'validation_set.yaml'
print(job.get_input())
DataSet
Name training_set
path my_training_set.yaml
end
DataSet
Name validation_set
path validation_set.yaml
end
JobCollection my_job_collection.yaml
To set other settings for the training set or validation set, use the standard dot-notation:
job.validation_set.EvaluateEvery = 100
job.settings.input.LoggingInterval.General = 100
print(job.get_input())
DataSet
Name training_set
path my_training_set.yaml
end
DataSet
EvaluateEvery 100
Name validation_set
path validation_set.yaml
end
JobCollection my_job_collection.yaml
LoggingInterval
General 100
end
You can also use job.parameter_interface and job.engine_collection in the same way as job.job_collection:
job = ParAMSJob()
job.parameter_interface = 'my_parameter_interface.yaml' # will be converted to absolute path if it exists
job.engine_collection = 'my_engine_collection.yaml' # will be converted to absolute path if it exists
print(job.get_input())
# note: job.training_set is always defined, this is why a DataSet block is printed below
ParameterInterface my_parameter_interface.yaml
DataSet
Name training_set
end
EngineCollection my_engine_collection.yaml
2.3.2.7. Functions for recurring blocks: Optimizers, Stoppers, ExitConditions
Use the functions below to easily add optimizers, stoppers, or exit conditions:
job = ParAMSJob()
job.add_exit_condition("MaxTotalFunctionCalls", 100000)
job.add_exit_condition("TimeLimit", 24*60*60)
job.add_exit_condition("StopsAfterConvergence", {'OptimizersConverged': 3, 'OptimizersStopped': 1})
job.add_optimizer("CMAES", {'Sigma0': 0.01, 'PopSize': 8})
job.add_optimizer("Scipy")
job.add_stopper("BestFunctionValueUnmoving", {'Tolerance': 0.1})
job.add_stopper("MaxFunctionCalls", 1000)
print(job.get_input())
DataSet
Name training_set
end
ExitCondition
MaxTotalFunctionCalls 100000
Type MaxTotalFunctionCalls
end
ExitCondition
TimeLimit 86400
Type TimeLimit
end
ExitCondition
StopsAfterConvergence
OptimizersConverged 3
OptimizersStopped 1
End
Type StopsAfterConvergence
end
Optimizer
CMAES
PopSize 8
Sigma0 0.01
End
Type CMAES
end
Optimizer
Scipy
End
Type Scipy
end
Stopper
BestFunctionValueUnmoving
Tolerance 0.1
End
Type BestFunctionValueUnmoving
end
Stopper
MaxFunctionCalls 1000
Type MaxFunctionCalls
end
To delete an added recurring block, use pop together with zero-based indices:
job.settings.input.ExitCondition.pop(1) # 2nd exit condition
job.settings.input.Optimizer.pop(1) # 2nd optimizer
job.settings.input.Stopper.pop(0) # first stopper
print(job.get_input())
DataSet
Name training_set
end
ExitCondition
MaxTotalFunctionCalls 100000
Type MaxTotalFunctionCalls
end
ExitCondition
StopsAfterConvergence
OptimizersConverged 3
OptimizersStopped 1
End
Type StopsAfterConvergence
end
Optimizer
CMAES
PopSize 8
Sigma0 0.01
End
Type CMAES
end
Stopper
MaxFunctionCalls 1000
Type MaxFunctionCalls
end
Note: the ExitConditionBooleanCombination and StopperBooleanCombination blocks work with indices starting at 1.