2.3. Getting Started: Python
Important
This tutorial is only compatible with ParAMS 2023.1 or later.
See also
Documentation for ParAMSJob and ParAMSResults.
Documentation for DataSetEvaluator
2.3.1. Quickstart
To follow along, either
Download plamsjob_quickstart.py, or
Download plamsjob_quickstart.ipynb
(see also: how to install JupyterLab)
2.3.1.1. Run a ParAMSJob for Lennard-Jones
ParAMS uses PLAMS to run jobs through Python. PLAMS offers many functions for handling jobs. To run jobs through PLAMS, you can either
use the $AMSBIN/plams program, or
use the $AMSBIN/amspython program. You must then call init() before running jobs.
Here, we use the second approach.
# first import all plams and params functions and classes
from scm.plams import *
from scm.params import *
import os
# call PLAMS init() to set up a new directory for running jobs
# set path=None to use the current working directory
# the default folder name is 'plams_workdir'
init(path='/tmp', folder='demo_paramsjob')
PLAMS working folder: /tmp/demo_paramsjob
Below we show how to set up and run a ParAMSJob using a params.in file taken from the Getting Started tutorial. The job should take less than 2 minutes to finish.
# load all the settings for the job from a "params.in" file
params_in_file = os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar/params.in')
job = ParAMSJob.from_inputfile(params_in_file)
# set a name for the job
job.name = "LJ_Ar"
# run the job
job.run();
[22.03|13:45:53] JOB LJ_Ar STARTED
[22.03|13:45:54] JOB LJ_Ar RUNNING
[22.03|13:46:17] JOB LJ_Ar FINISHED
[22.03|13:46:17] JOB LJ_Ar SUCCESSFUL
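You can also check programmatically whether the job finished successfully. The sketch below uses the standard PLAMS ok() method, which is available for all PLAMS jobs and is not specific to ParAMS:
# returns True if the job finished without errors
print(job.ok())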
To find out where the job and its results are stored:
print(f"The job was run in: {job.path}")
print(f"Contents of the job directory: {os.listdir(job.path)}")
print(f"The results are stored in: {job.results.path}")
print(f"Contents of the results directory: {os.listdir(job.results.path)}")
The job was run in: /tmp/demo_paramsjob/LJ_Ar
Contents of the job directory: ['LJ_Ar.out', 'LJ_Ar.run', 'LJ_Ar.dill', 'results', 'LJ_Ar.in', 'LJ_Ar.err']
The results are stored in: /tmp/demo_paramsjob/LJ_Ar/results
Contents of the results directory: ['settings_and_initial_data', 'optimization']
2.3.1.2. Access the results
When a job has finished, you will want to access its results. The job may have been run via the GUI or with ParAMSJob as above. Typically, you would write another Python script and load the finished (or running) job:
#job = ParAMSJob.load_external(results_dir)
#in this example it would be
#job = ParAMSJob.load_external('/tmp/demo_paramsjob/LJ_Ar/results')
In this tutorial, there is no need to explicitly load the job again with load_external since the job was run in the same script, so the lines above are commented out.
The results can be accessed with job.results, which is of type ParAMSResults.
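If the job had instead been run elsewhere (for example through the GUI), a minimal sketch of a separate analysis script could look like this (the results path is the one from this example):
from scm.params import ParAMSJob
# load the finished job from its results directory
job = ParAMSJob.load_external('/tmp/demo_paramsjob/LJ_Ar/results')
# print the best loss function value found during the optimization
print(job.results.get_loss(source='best'))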
Below we print a table with the initial and best Lennard-Jones parameters eps and rmin, and the corresponding loss function values.
# compare the results
initial_interface = job.results.get_parameter_interface(source='initial')
initial_loss = job.results.get_loss(source='initial')
best_interface = job.results.get_parameter_interface(source='best')
best_loss = job.results.get_loss(source='best')
print("{:12s} {:>12s} {:>12s} {:>12s}".format("", "eps", "rmin", "loss"))
print("{:12s} {:12.7f} {:12.5f} {:12.5f}".format("Initial",
initial_interface['eps'].value,
initial_interface['rmin'].value,
initial_loss))
print("{:12s} {:12.7f} {:12.5f} {:12.5f}".format("Best",
best_interface['eps'].value,
best_interface['rmin'].value,
best_loss))
eps rmin loss
Initial 0.0003000 4.00000 572.18867
Best 0.0001961 3.65375 0.00251
Let’s also plot the running loss function value vs. evaluation number:
import matplotlib.pyplot as plt
import numpy as np
evaluation, loss = job.results.get_running_loss()
plt.plot(evaluation, np.log10(loss), '-')
plt.ylabel("log10(loss)")
plt.xlabel("Evaluation number");
To see the parameter values at different evaluations:
evaluation, parameters = job.results.get_running_active_parameters()
plt.plot(evaluation, parameters['rmin'])
plt.xlabel("Evaluation id")
plt.ylabel("Value of rmin");
You can make a scatter plot of reference vs. predicted forces with the help of the get_data_set_evaluator() function, which returns a DataSetEvaluator:
dse = job.results.get_data_set_evaluator()
forces = dse.results['forces']
plt.plot(forces.reference_values, forces.predictions, '.')
plt.xlabel(f"Reference force ({forces.unit})")
plt.ylabel(f"Predicted force ({forces.unit})");
# overlay the y = x diagonal without letting it rescale the axes
plt.xlim(auto=True)
plt.autoscale(False)
plt.plot([-10, 10], [-10, 10], linewidth=5, zorder=-1, alpha=0.3, c='red')
plt.show()
For all the ways the DataSetEvaluator can be used, see the Data Set Evaluator documentation.
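As a small illustration, you can also compute simple error metrics directly from the reference_values and predictions arrays used in the scatter plot above (plain NumPy here, not DataSetEvaluator methods; this assumes the arrays are flat, as in the plot):
import numpy as np
# difference between predicted and reference forces
errors = np.asarray(forces.predictions) - np.asarray(forces.reference_values)
print(f"MAE:  {np.mean(np.abs(errors)):.4f} {forces.unit}")
print(f"RMSE: {np.sqrt(np.mean(errors**2)):.4f} {forces.unit}")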
2.3.1.3. Call PLAMS finish()
If you used PLAMS to run jobs and called init() at the beginning, the finish() function should be called at the end.
finish()
[22.03|13:46:17] PLAMS run finished. Goodbye
More results to extract can be found in the ParAMSResults
API.
2.3.2. Setting up a ParAMSJob
To follow along, either
Download plamsjob_settings.py, or
Download plamsjob_settings.ipynb
(see also: how to install JupyterLab)
A ParAMSJob has a settings attribute which contains the settings for the job. The settings will be converted to the params.in file before the job is run.
See also
Documentation for PLAMS Settings
Documentation for preparing input for AMSJobs. ParAMSJobs use the same input preparation as AMSJobs.
The first example below shows how to set all the input settings manually, which can be quite tedious. Further down, some helper functions are shown that make it easier to set up the settings.
2.3.2.1. Manually set ParAMSJob Settings
Call the get_input() function to see what the params.in file will look like.
Note:
All paths must be absolute paths. This is needed to be able to run the job through PLAMS.
DataSet and Optimizer are recurring blocks, so they are initialized as lists. The DataSet[0] syntax means that the settings are set for the first dataset, etc.
from scm.plams import *
from scm.params import *
import os
job = ParAMSJob()
job.settings.input.Task = 'Optimization'
job.settings.input.JobCollection = '/path/job_collection.yaml' # absolute path
job.settings.input.DataSet = [Settings(), Settings()] #DataSet is a recurring block
job.settings.input.DataSet[0].Name = 'training_set'
job.settings.input.DataSet[0].Path = '/path/training_set.yaml' # absolute path
job.settings.input.DataSet[1].Name = 'validation_set'
job.settings.input.DataSet[1].Path = '/path/validation_set.yaml' # absolute path
job.settings.input.LoggingInterval.General = 10
job.settings.input.SkipX0 = 'No' # Booleans are specified as strings "Yes" or "No"
job.settings.input.Optimizer = [Settings()] # Optimizer is a recurring block
job.settings.input.Optimizer[0].Type = 'CMAES'
job.settings.input.Optimizer[0].CMAES.Sigma0 = 0.01
job.settings.input.Optimizer[0].CMAES.Popsize = 8
job.settings.input.ParallelLevels.Optimizations = 1 # ParallelLevels is NOT a recurring block
print(job.get_input())
Task Optimization
DataSet
Name training_set
Path /path/training_set.yaml
end
DataSet
Name validation_set
Path /path/validation_set.yaml
end
JobCollection /path/job_collection.yaml
LoggingInterval
General 10
end
Optimizer
CMAES
Popsize 8
Sigma0 0.01
End
Type CMAES
end
ParallelLevels
Optimizations 1
end
SkipX0 No
2.3.2.2. Load a job from a params.in file
If you already have a params.in file (for example created by the GUI or by hand), you can simply load it into a ParAMSJob using from_inputfile().
Note that any paths in the params.in file get converted to absolute paths.
params_in_file = os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar/params.in')
job = ParAMSJob.from_inputfile(params_in_file)
print(job.get_input())
task Optimization
parameterinterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml
dataset
name training_set
path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end
jobcollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml
exitcondition
timelimit 120
type TimeLimit
end
exitcondition
maxoptimizersconverged 1
type MaxOptimizersConverged
end
optimizer
scipy
algorithm Nelder-Mead
End
type Scipy
end
parallellevels
jobs 1
optimizations 1
parametervectors 1
processes 1
end
2.3.2.3. Create a ParAMSJob from a directory with .yaml files
In the input file you need to specify many paths to different .yaml files. This can be tedious to set up manually. If you have a directory with .yaml files (e.g. the jobname.params directory created by the GUI), you can initialize a ParAMSJob to read those .yaml files using from_yaml(). The files need to have the default names:
job_collection.yaml
training_set.yaml
validation_set.yaml
job_collection_engines.yaml or engine_collection.yaml
parameter_interface.yaml or parameters.yaml
Note 1: When you run a ParAMSJob, any .yaml files in the current working directory will be used if they have the default names and the corresponding settings are unset. In this way, you do not need to specify the paths in the settings if you place the .yaml files in the same directory as the .py script that runs the job (see the sketch after these notes).
Note 2: from_yaml() only sets the settings for the .yaml files and leaves all other settings empty. from_inputfile() reads all the settings from the params.in file.
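The following sketch illustrates Note 1. It assumes (hypothetically) that the script sits in a directory that already contains job_collection.yaml, training_set.yaml, and parameter_interface.yaml with their default names, so those settings can be left unset:
from scm.plams import *
from scm.params import *
init()
job = ParAMSJob()                              # no .yaml paths set explicitly
job.add_optimizer("CMAES", {'Sigma0': 0.01})   # see the recurring-blocks section below
job.add_exit_condition("TimeLimit", 120)
job.run()
finish()
Returning to from_yaml(), loading the example directory looks like this: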
job = ParAMSJob.from_yaml(os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar'))
print(job.get_input())
ParameterInterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml
DataSet
Name training_set
Path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end
JobCollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml
2.3.2.4. Input validation
The allowed settings blocks, keys, and values are described in the documentation. If you make a mistake in the block or key names, get_input() will raise an error:
job.settings.input.NonExistingKey = 3.14159
try:
    print(job.get_input())
except Exception as e:
    print(e)
Input error: unrecognized entry "nonexistingkey" found in line 10
If you want to print the input anyway, use get_input(validate=False):
print(job.get_input(validate=False)) # print the input anyway
ParameterInterface /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml
DataSet
Name training_set
Path /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml
end
JobCollection /home/user/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml
NonExistingKey 3.14159
2.3.2.5. Delete a block or key
To delete an entry from the Settings, use del:
del job.settings.input.NonExistingKey
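Deleting a whole block works the same way. For example, if you had set a LoggingInterval block earlier (hypothetical here, since this job does not have one), you could remove it with:
# remove an entire block from the settings
del job.settings.input.LoggingInterval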
2.3.2.6. Attributes for easier setup of .yaml files
ParAMSJob has some special attributes which make it easier to set up the settings.
job = ParAMSJob()
job.job_collection = 'my_job_collection.yaml' # will be converted to absolute path if it exists
job.training_set = 'my_training_set.yaml' # will be converted to absolute path if it exists
print(job.get_input())
DataSet
Name training_set
path my_training_set.yaml
end
JobCollection my_job_collection.yaml
Note that job.training_set and job.validation_set are quite special: when you assign a string to them as above, it will set the corresponding path in the settings. But when you read them, you will get the corresponding Settings block:
print(job.training_set)
print(type(job.training_set))
Name: training_set
path: my_training_set.yaml
<class 'scm.plams.core.settings.Settings'>
print(job.training_set.path)
print(type(job.training_set.path))
my_training_set.yaml
<class 'str'>
Assigning to job.validation_set will create another item in the job.settings.input.DataSet list:
job.validation_set = 'validation_set.yaml'
print(job.get_input())
DataSet
Name training_set
path my_training_set.yaml
end
DataSet
Name validation_set
path validation_set.yaml
end
JobCollection my_job_collection.yaml
To set other settings for the training set or validation set, use the standard dot-notation:
job.validation_set.EvaluateEvery = 100
job.settings.input.LoggingInterval.General = 100
print(job.get_input())
DataSet
Name training_set
path my_training_set.yaml
end
DataSet
EvaluateEvery 100
Name validation_set
path validation_set.yaml
end
JobCollection my_job_collection.yaml
LoggingInterval
General 100
end
You can also use job.parameter_interface and job.engine_collection in the same way as job.job_collection:
job = ParAMSJob()
job.parameter_interface = 'my_parameter_interface.yaml' # will be converted to absolute path if it exists
job.engine_collection = 'my_engine_collection.yaml' # will be converted to absolute path if it exists
print(job.get_input())
# note: job.training_set is always defined, this is why a DataSet block is printed below
ParameterInterface my_parameter_interface.yaml
DataSet
Name training_set
end
EngineCollection my_engine_collection.yaml
2.3.2.7. Functions for recurring blocks: Optimizers, Stoppers, ExitConditions
Use the functions below to easily add optimizers, stoppers, or exit conditions:
job = ParAMSJob()
job.add_exit_condition("MaxTotalFunctionCalls", 100000)
job.add_exit_condition("TimeLimit", 24*60*60)
job.add_exit_condition("StopsAfterConvergence", {'OptimizersConverged': 3, 'OptimizersStopped': 1})
job.add_optimizer("CMAES", {'Sigma0': 0.01, 'PopSize': 8})
job.add_optimizer("Scipy")
job.add_stopper("BestFunctionValueUnmoving", {'Tolerance': 0.1})
job.add_stopper("MaxFunctionCalls", 1000)
print(job.get_input())
DataSet
Name training_set
end
ExitCondition
MaxTotalFunctionCalls 100000
Type MaxTotalFunctionCalls
end
ExitCondition
TimeLimit 86400
Type TimeLimit
end
ExitCondition
StopsAfterConvergence
OptimizersConverged 3
OptimizersStopped 1
End
Type StopsAfterConvergence
end
Optimizer
CMAES
PopSize 8
Sigma0 0.01
End
Type CMAES
end
Optimizer
Scipy
End
Type Scipy
end
Stopper
BestFunctionValueUnmoving
Tolerance 0.1
End
Type BestFunctionValueUnmoving
end
Stopper
MaxFunctionCalls 1000
Type MaxFunctionCalls
end
To delete an added recurring block, use pop together with zero-based indices:
job.settings.input.ExitCondition.pop(1) # 2nd exit condition
job.settings.input.Optimizer.pop(1) # 2nd optimizer
job.settings.input.Stopper.pop(0) # first stopper
print(job.get_input())
DataSet
Name training_set
end
ExitCondition
MaxTotalFunctionCalls 100000
Type MaxTotalFunctionCalls
end
ExitCondition
StopsAfterConvergence
OptimizersConverged 3
OptimizersStopped 1
End
Type StopsAfterConvergence
end
Optimizer
CMAES
PopSize 8
Sigma0 0.01
End
Type CMAES
end
Stopper
MaxFunctionCalls 1000
Type MaxFunctionCalls
end
Note: the ExitConditionBooleanCombination and StopperBooleanCombination blocks work with indices starting at 1.