2.3. Getting Started: Python

Important

This tutorial is only compatible with ParAMS 2023.1 or later.

2.3.1. Quickstart

To follow along, run the code snippets below in your own Python script or interactive session.

2.3.1.1. Run a ParAMSJob for Lennard-Jones

ParAMS uses PLAMS to run jobs through Python. PLAMS offers many functions for handling jobs. To run jobs through PLAMS, you can either

  • use the $AMSBIN/plams program

  • use the $AMSBIN/amspython program. You must then call init() before running jobs.

Here, we use the second approach.

# first import all plams and params functions and classes
from scm.plams import *
from scm.params import *
import os

# call PLAMS init() to set up a new directory for running jobs
# set path=None to use the current working directory
# the default folder name is 'plams_workdir'
init(path='/tmp', folder='demo_paramsjob')
PLAMS working folder: /tmp/demo_paramsjob

Below, we show how to set up and run a ParAMSJob using a params.in file taken from the Getting Started tutorial. The job should take less than 2 minutes to finish.

# load all the settings for the job from a "params.in" file
params_in_file = os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar/params.in')
job = ParAMSJob.from_inputfile(params_in_file)

# set a name for the job
job.name = "LJ_Ar"

# run the job
job.run();
[22.03|13:45:53] JOB LJ_Ar STARTED
[22.03|13:45:54] JOB LJ_Ar RUNNING
[22.03|13:46:17] JOB LJ_Ar FINISHED
[22.03|13:46:17] JOB LJ_Ar SUCCESSFUL
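
Before extracting results, you may want to verify that the job finished without errors. ParAMSJob is run through PLAMS, so the usual PLAMS job check can be used; a minimal sketch, assuming the standard PLAMS ok() method applies to ParAMSJob:

# check that the optimization finished successfully (standard PLAMS job check)
if job.ok():
    print("Job finished successfully")
else:
    print("Something went wrong, check the .err and .out files in the job directory")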

To find out where the job and its results are stored:

print(f"The job was run in: {job.path}")
print(f"Contents of the job directory: {os.listdir(job.path)}")
print(f"The results are stored in: {job.results.path}")
print(f"Contents of the results directory: {os.listdir(job.results.path)}")
The job was run in: /tmp/demo_paramsjob/LJ_Ar
Contents of the job directory: ['LJ_Ar.out', 'LJ_Ar.run', 'LJ_Ar.dill', 'results', 'LJ_Ar.in', 'LJ_Ar.err']
The results are stored in: /tmp/demo_paramsjob/LJ_Ar/results
Contents of the results directory: ['settings_and_initial_data', 'optimization']

2.3.1.2. Access the results

When a job has finished, you will usually want to access its results. The job may have been run via the GUI or with a ParAMSJob as above. Typically, you would write another Python script and load the finished (or still running) job:

# job = ParAMSJob.load_external(results_dir)
# in this example it would be
# job = ParAMSJob.load_external('/tmp/demo_paramsjob/LJ_Ar/results')

In this tutorial, there is no need to explicitly load the job again with load_external since the job was run in the same script, so the lines above are commented out.

The results can be accessed with job.results, which is of type ParAMSResults.

Below we print a table with the initial and best Lennard-Jones parameters eps and rmin, and the corresponding loss function values.

# compare the results
initial_interface = job.results.get_parameter_interface(source='initial')
initial_loss = job.results.get_loss(source='initial')
best_interface = job.results.get_parameter_interface(source='best')
best_loss = job.results.get_loss(source='best')

print("{:12s} {:>12s} {:>12s} {:>12s}".format("", "eps", "rmin", "loss"))
print("{:12s} {:12.7f} {:12.5f} {:12.5f}".format("Initial",
                                                 initial_interface['eps'].value,
                                                 initial_interface['rmin'].value,
                                                 initial_loss))
print("{:12s} {:12.7f} {:12.5f} {:12.5f}".format("Best",
                                                 best_interface['eps'].value,
                                                 best_interface['rmin'].value,
                                                 best_loss))
                      eps         rmin         loss
Initial         0.0003000      4.00000    572.18867
Best            0.0001961      3.65375      0.00251

Let’s also plot the running loss function value vs. evaluation number:

import matplotlib.pyplot as plt
import numpy as np

evaluation, loss = job.results.get_running_loss()
plt.plot(evaluation, np.log10(loss), '-')
plt.ylabel("log10(loss)")
plt.xlabel("Evaluation number");
[Figure: running log10(loss) vs. evaluation number]

To see the parameter values at different evaluations:

evaluation, parameters = job.results.get_running_active_parameters()
plt.plot(evaluation, parameters['rmin'])
plt.xlabel("Evaluation id")
plt.ylabel("Value of rmin");
[Figure: value of rmin vs. evaluation number]

You can make a scatter plot of reference vs. predicted forces with the help of the get_data_set_evaluator() function, which returns a DataSetEvaluator:

dse = job.results.get_data_set_evaluator()
forces = dse.results['forces']
plt.plot(forces.reference_values, forces.predictions, '.')
plt.xlabel(f"Reference force ({forces.unit})")
plt.ylabel(f"Predicted force ({forces.unit})");
plt.xlim(auto=True)
plt.autoscale(False)
plt.plot([-10,10],[-10,10], linewidth=5, zorder=-1, alpha = 0.3, c='red')
plt.show()
[Figure: scatter plot of predicted vs. reference forces]
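
If you prefer to work with the raw numbers, the same arrays can be turned into simple error metrics with NumPy. This is a minimal sketch based only on the reference_values, predictions, and unit attributes shown above; the DataSetEvaluator itself may also provide such statistics directly (see its documentation):

# compute simple error metrics from the reference and predicted forces
ref = np.asarray(forces.reference_values)
pred = np.asarray(forces.predictions)
mae = np.mean(np.abs(pred - ref))           # mean absolute error
rmse = np.sqrt(np.mean((pred - ref) ** 2))  # root-mean-square error
print(f"Forces MAE:  {mae:.5f} {forces.unit}")
print(f"Forces RMSE: {rmse:.5f} {forces.unit}")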

For all the ways the DataSetEvaluator can be used, see the Data Set Evaluator documentation.

2.3.1.3. Call PLAMS finish()

If you called init() at the beginning to run jobs through PLAMS, you should call finish() at the end.

finish()
[22.03|13:46:17] PLAMS run finished. Goodbye

For more results that can be extracted, see the ParAMSResults API.

2.3.2. Setting up a ParAMSJob

To follow along, run the code snippets below in your own Python script or interactive session.

A ParAMSJob has a settings attribute which contains the settings for the job. The settings will be converted to the params.in file before the job is run.

The first example below shows how to set all the input settings manually, but this can be quite tedious. Further down, some helper functions are shown that make it easier to set up the settings.

2.3.2.1. Manually set ParAMSJob Settings

Call the get_input() function to see what the params.in file will look like.

Note:

  • All paths must be absolute paths. This is required to be able to run the job through PLAMS.

  • DataSet and Optimizer are recurring blocks, so they are initialized as lists. The DataSet[0] syntax means that the settings are set for the first dataset, etc.

from scm.plams import *
from scm.params import *
import os

job = ParAMSJob()
job.settings.input.Task = 'Optimization'
job.settings.input.JobCollection = '/path/job_collection.yaml' # absolute path
job.settings.input.DataSet = [Settings(), Settings()] #DataSet is a recurring block
job.settings.input.DataSet[0].Name = 'training_set'
job.settings.input.DataSet[0].Path = '/path/training_set.yaml' # absolute path
job.settings.input.DataSet[1].Name = 'validation_set'
job.settings.input.DataSet[1].Path = '/path/validation_set.yaml' # absolute path
job.settings.input.LoggingInterval.General = 10
job.settings.input.SkipX0 = 'No' # Booleans are specified as strings "Yes" or "No"
job.settings.input.Optimizer = [Settings()] # Optimizer is a recurring block
job.settings.input.Optimizer[0].Type = 'CMAES'
job.settings.input.Optimizer[0].CMAES.Sigma0 = 0.01
job.settings.input.Optimizer[0].CMAES.Popsize = 8
job.settings.input.ParallelLevels.Optimizations = 1 # ParallelLevels is NOT a recurring block

print(job.get_input())
Task Optimization

DataSet
  Name training_set
  Path /path/training_set.yaml
end
DataSet
  Name validation_set
  Path /path/validation_set.yaml
end

JobCollection /path/job_collection.yaml

LoggingInterval
  General 10
end

Optimizer
  CMAES
    Popsize 8
    Sigma0 0.01
  End
  Type CMAES
end

ParallelLevels
  Optimizations 1
end

SkipX0 No

2.3.2.2. Load a job from a params.in file

If you already have a params.in file (for example created by the GUI or by hand), you can simply load it into a ParAMSJob using from_inputfile().

Note that any paths in the params.in file get converted to absolute paths.

params_in_file = os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar/params.in')
job = ParAMSJob.from_inputfile(params_in_file)
print(job.get_input())
task Optimization

parameterinterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml

dataset
  name training_set
  path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end

jobcollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml

exitcondition
  timelimit 120
  type TimeLimit
end
exitcondition
  maxoptimizersconverged 1
  type MaxOptimizersConverged
end

optimizer
  scipy
    algorithm Nelder-Mead
  End
  type Scipy
end

parallellevels
  jobs 1
  optimizations 1
  parametervectors 1
  processes 1
end
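
A job loaded from a params.in file can still be adjusted with the same dot-notation used in the manual example above before it is run. A minimal sketch (the LoggingInterval block here is an illustrative addition, not part of the example params.in file):

# modify or extend the loaded settings before running the job
job.settings.input.LoggingInterval.General = 50  # hypothetical extra setting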

2.3.2.3. Create a ParAMSJob from a directory with .yaml files

In the input file you need to specify many paths to different .yaml files. This can be tedious to set up manually. If you have a directory with .yaml files (e.g. the jobname.params directory created by the GUI), you can initialize a ParAMSJob to read those yaml files using from_yaml(). The files need to have the default names:

  • job_collection.yaml

  • training_set.yaml

  • validation_set.yaml

  • job_collection_engines.yaml or engine_collection.yaml

  • parameter_interface.yaml or parameters.yaml

Note 1: When you run a ParAMSJob, any .yaml files with the default names in the current working directory are used automatically if the corresponding settings are unset. This means you do not need to specify any paths in the settings if the .yaml files are in the same directory as the .py script that runs the job (see the sketch at the end of this subsection).

Note 2: from_yaml() only sets the settings for the yaml files and leaves all other settings empty. from_inputfile() reads all the settings from the params.in file.

job = ParAMSJob.from_yaml(os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar'))
print(job.get_input())
ParameterInterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml

DataSet
  Name training_set
  Path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end

JobCollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml
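
As mentioned in Note 1 above, if the default-named .yaml files are in the directory from which you run the script, you do not need to set any paths at all. A minimal sketch, assuming the script is started from that directory:

# run from a directory containing job_collection.yaml, training_set.yaml,
# parameter_interface.yaml, ... with their default names
from scm.plams import *
from scm.params import *

init()
job = ParAMSJob()                          # no paths set explicitly
job.settings.input.Task = 'Optimization'
job.run()                                  # the default-named .yaml files in the cwd are picked up
finish()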

2.3.2.4. Input validation

The allowed settings blocks, keys, and values are described in the documentation. If you make a mistake in the block or key names, get_input() will raise an error:

job.settings.input.NonExistingKey = 3.14159
try:
    print(job.get_input())
except Exception as e:
    print(e)
Input error: unrecognized entry "nonexistingkey" found in line 10

If you want to print the input anyway, use get_input(validate=False):

print(job.get_input(validate=False)) # print the input anyway
ParameterInterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml

DataSet
  Name training_set
  Path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end

JobCollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml

NonExistingKey 3.14159

2.3.2.5. Delete a block or key

To delete an entry from the Settings, use del:

del job.settings.input.NonExistingKey
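
The same del syntax also removes an entire block, not just a single key. A minimal sketch (the LoggingInterval block is added here only so that there is something to delete):

job.settings.input.LoggingInterval.General = 10  # add a block...
del job.settings.input.LoggingInterval           # ...and remove the whole block again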

2.3.2.6. Attributes for easier setup of .yaml files

ParAMSJob has some special attributes which make it easier to set up the settings.

job = ParAMSJob()
job.job_collection = 'my_job_collection.yaml' # will be converted to absolute path if it exists
job.training_set = 'my_training_set.yaml' # will be converted to absolute path if it exists
print(job.get_input())
DataSet
  Name training_set
  path my_training_set.yaml
end

JobCollection my_job_collection.yaml

Note that job.training_set and job.validation_set are quite special: assigning a string to them, as above, sets the corresponding path in the settings. Reading them, however, returns the corresponding Settings block:

print(job.training_set)
print(type(job.training_set))
Name:       training_set
path:       my_training_set.yaml

<class 'scm.plams.core.settings.Settings'>
print(job.training_set.path)
print(type(job.training_set.path))
my_training_set.yaml
<class 'str'>

Assigning to job.validation_set will create another item in the job.settings.input.DataSet list:

job.validation_set = 'validation_set.yaml'
print(job.get_input())
DataSet
  Name training_set
  path my_training_set.yaml
end
DataSet
  Name validation_set
  path validation_set.yaml
end

JobCollection my_job_collection.yaml

To set other settings for the training set or validation set, use the standard dot-notation:

job.validation_set.EvaluateEvery = 100
job.settings.input.LoggingInterval.General = 100
print(job.get_input())
DataSet
  Name training_set
  path my_training_set.yaml
end
DataSet
  EvaluateEvery 100
  Name validation_set
  path validation_set.yaml
end

JobCollection my_job_collection.yaml

LoggingInterval
  General 100
end

You can also use job.parameter_interface and job.engine_collection in the same way as job.job_collection:

job = ParAMSJob()
job.parameter_interface = 'my_parameter_interface.yaml' # will be converted to absolute path if it exists
job.engine_collection = 'my_engine_collection.yaml' # will be converted to absolute path if it exists
print(job.get_input())
# note: job.training_set is always defined, which is why a DataSet block is printed below
ParameterInterface my_parameter_interface.yaml

DataSet
  Name training_set
end

EngineCollection my_engine_collection.yaml

2.3.2.7. Functions for recurring blocks: Optimizers, Stoppers, ExitConditions

Use the functions below to easily add optimizers, stoppers, or exit conditions:

job = ParAMSJob()

job.add_exit_condition("MaxTotalFunctionCalls", 100000)
job.add_exit_condition("TimeLimit", 24*60*60)
job.add_exit_condition("StopsAfterConvergence", {'OptimizersConverged': 3, 'OptimizersStopped': 1})

job.add_optimizer("CMAES", {'Sigma0': 0.01, 'PopSize': 8})
job.add_optimizer("Scipy")

job.add_stopper("BestFunctionValueUnmoving", {'Tolerance': 0.1})
job.add_stopper("MaxFunctionCalls", 1000)

print(job.get_input())
DataSet
  Name training_set
end

ExitCondition
  MaxTotalFunctionCalls 100000
  Type MaxTotalFunctionCalls
end
ExitCondition
  TimeLimit 86400
  Type TimeLimit
end
ExitCondition
  StopsAfterConvergence
    OptimizersConverged 3
    OptimizersStopped 1
  End
  Type StopsAfterConvergence
end

Optimizer
  CMAES
    PopSize 8
    Sigma0 0.01
  End
  Type CMAES
end
Optimizer
  Scipy
  End
  Type Scipy
end

Stopper
  BestFunctionValueUnmoving
    Tolerance 0.1
  End
  Type BestFunctionValueUnmoving
end
Stopper
  MaxFunctionCalls 1000
  Type MaxFunctionCalls
end

To delete an added recurring block, use pop together with zero-based indices:

job.settings.input.ExitCondition.pop(1) # 2nd exit condition
job.settings.input.Optimizer.pop(1) # 2nd optimizer
job.settings.input.Stopper.pop(0) # first stopper
print(job.get_input())
DataSet
  Name training_set
end

ExitCondition
  MaxTotalFunctionCalls 100000
  Type MaxTotalFunctionCalls
end
ExitCondition
  StopsAfterConvergence
    OptimizersConverged 3
    OptimizersStopped 1
  End
  Type StopsAfterConvergence
end

Optimizer
  CMAES
    PopSize 8
    Sigma0 0.01
  End
  Type CMAES
end

Stopper
  MaxFunctionCalls 1000
  Type MaxFunctionCalls
end

Note: ExitConditionBooleanCombination and StopperBooleanCombination work with indices starting at 1 (unlike pop above, which is zero-based).