Cookbook¶
This is a collection of code snippets showing how to perform recurrent PLAMS tasks.
Settings and input¶
Create an input block with an header¶
These Settings
sett = Settings()
sett.input.ams.SomeInputBlock['_h'] = 'MyHeader'
sett.input.ams.SomeInputBlock.SomeOption = 2
will generate the following text input when used for an AMSJob
:
SomeInputBlock MyHeader
SomeOption 2
End
Create an empty input block¶
These Settings
sett = Settings()
sett.input.ams.SomeInputBlock
will generate the following text input when used for an AMSJob
:
SomeInputBlock
End
Create an input block with repeating keys¶
These Settings
sett = Settings()
sett.input.ams.constraints.atom = [1,2,3]
will generate the following text input when used for an AMSJob
:
Constraints
Atom 1
Atom 2
Atom 3
End
Repeating input block¶
These Settings
block_1 = Settings()
block_1.SomeOption = 1
block_2 = Settings()
block_2.SomeOption = 7
sett = Settings()
sett.input.ams.SomeInputBlock = [block_1, block_2]
will generate the following text input when used for an AMSJob
:
SomeInputBlock
SomeOption 1
End
SomeInputBlock
SomeOption 7
End
Convert an AMS text input into an AMS job¶
The from_input()
function can be used to convert text input into AMSJob
instances.
job = AMSJob.from_input(name='jobname', text_input='''
Task SinglePoint
System
Atoms
H 0. 0. 0.
H 0. 0. 1.
End
End
Engine Band
Basis
Type DZ
Core None
End
EndEngine
''')
Convert an AMS text input into settings object¶
The Settings
object corresponding to a piece of text input can easily be obtained as the .settings
attribute of the AMSJob
instance returned by the from_input()
method.
settings = AMSJob.from_input("""
Task SinglePoint
Engine Band
Basis
Type DZ
Core None
End
EndEngine
""").settings
Convert an AMS .run file into an AMS job¶
The from_inputfile()
method can be used to convert a .run file generated by the AMS GUI into a PLAMS AMSJob
.
job = AMSJob.from_inputfile('/path/to/job.run')
Note
This function does not work on PLAMS-generated .run files. You can instead use the PLAMS-generated .in file.
Specify paths to files in the input¶
With PLAMS, you cannot specify relative paths to input files, because every PLAMS job launches in a new directory, which makes the relative paths invalid. To specify an absolute path, use os.path.abspath
:
import os
sett = Settings()
sett.input.reaxff.forcefield = os.path.abspath('../my-forcefield.ff')
Restart from a previous job¶
To use restart features in AMS, for example the EngineRestart
, or to read the InitialVelocities
from the final velocities of a previous molecular dynamics run, you can use a convenient shortcut and simply assign the job to the corresponding settings entry:
sett = Settings()
sett.input.ams.EngineRestart = (previous_ams_job, 'engine') # resolves to the engine.rkf
sett2 = Settings()
sett2.input.ams.MolecularDynamics.InitialVelocities.Type = 'FromFile'
sett2.input.ams.MolecularDynamics.InitialVelocities.File = previous_ams_job # resolves to ams.rkf
Alternatively, call the rkfpath()
method on the previous job’s AMSResults
:
sett = Settings()
sett.input.ams.EngineRestart = previous_ams_job.results.rkfpath(file='engine')
sett2 = Settings()
sett2.input.ams.MolecularDynamics.InitialVelocities.Type = 'FromFile'
sett2.input.ams.MolecularDynamics.InitialVelocities.File = previous_ams_job.results.rkfpath()
Molecules¶
Generate a molecule from a SMILES string¶
See function from_smiles()
.
# Compute 10 conformers, optimize with UFF and pick the lowest in energy.
ethane = from_smiles('C-C', nconfs=10, forcefield='uff')[0]
Load all files in a folder as molecules¶
See function read_molecules()
.
molecules = read_molecules('/some_path/folder_containing_structure_files/')
for name, mol in molecules.items():
print("Name of the file (without extension): ", name)
print(mol)
Generate a liquid or gas mixture¶
# Generate a 2-to-1 water-acetonitrile mixture with a density of 0.9 g/cm^3 and approximately 200 atoms
water = from_smiles('O')
acetonitrile = from_smiles('CC#N')
mixture = packmol(molecules=[water, acetonitrile],
mole_fractions=[0.667, 0.333],
n_atoms=200,
density=0.9)
See Packmol example for more examples on how to construct liquid or gas mixtures and solid/liquid or solid/gas interfaces.
Write an ams.rkf-like trajectory¶
If you have a list of molecules, it can be convenient to write them to an AMS-like .rkf file so that you can visualize them in the GUI module AMSmovie.
from scm.plams import molecules_to_rkf, from_smiles
molecule_list = [from_smiles('C'), from_smiles('CC')]
molecules_to_rkf(molecule_list, 'output.rkf', overwrite=True)
Convert a trajectory to ams.rkf with bond guessing¶
Pre-optimize a molecule¶
# pre-optimize with UFF inside AMS
optimized_mol = preoptimize(mol)
See Quick jobs for details and options.
Counting rings¶
Rings inside molecules can be counted in various ways, which are not all giving the same results.
With the help of the RDKit library, a vast variety of ring counting approaches is readily available.
The general approach to using these functions in a PLAMS scripts is to convert your PLAMS Molecule
into an RDKit molecule, see the page on the RDKit interface.
This is how one searches for the smallest set of rings in a molecule:
# import RDKit
from rdkit import Chem
# create a PLAMS molecule and convert it to an RDKit Mol
dicyclopentadiene = from_smiles('C1C=CC2C1C3CC2C=C3')
rdmol = to_rdmol(dicyclopentadiene)
# Calculate smalles set of rings
for atoms in Chem.GetSymmSSSR(rdmol):
print ([atom_id for atom_id in atoms], len(atoms))
For more information see also the RDKit manual.
Extracting Results¶
You can use the following snippets to retrieve results after running the required calculations:
Directly from Functions¶
Results can be either red from previous calculations (see Accessing Old Jobs) or from an AMSResults
instance of a computation just executed within the same workflow.
In either case an AMSResults
object should be present at runtime:
myAMSJob.run()
myAMSResults = myAMSJob.results if myAMSJob.ok() else None
Warning
Access to any results data should only occur under the condition that AMSJob.ok() indicate a successful termination of the computation.
Examples: Total Energy and Final Structure¶
Multiple functions of the AMSResults
API allow for simple access of the most common results.
myAMSEnergy = myAMSResults.get_energy(unit='au')
myAMSStructure = myAMSResults.get_main_molecule()
AMSResults API Functions¶
The following members of an AMSResults
instance can be used as shown in the above examples to read results:
Property |
Function |
Return Type |
Details |
---|---|---|---|
Structure |
get_molecule(section) |
Structure from section |
|
get_input_molecule() |
Input structure |
||
get_main_molecule() |
Final structure from any AMS task |
||
get_history_molecule(step) |
Structure from history section at step # step |
||
Energy |
get_energy() |
Float |
Final energy |
Gradients |
get_gradients() |
Array (numpy) |
Gradients from engine calculation |
Stress tensor |
get_stresstensor() |
Array (numpy) |
Stress tensor from periodic engine calculation |
Hessian |
get_hessian() |
Array (numpy) |
Hessian from frequency calculation (AMS/engine) |
Elastic tensor |
get_elastictensor() |
Array (numpy) |
Elastic tensor from periodic calculation |
Frequencies |
get_frequencies() |
Array (numpy) |
Vibrational frequencies |
Atomic Charges |
get_charges() |
Array (numpy) |
Atomic partial charges |
Dipole vector |
get_dipolemoment() |
Array (numpy) |
Electric dipole moment |
Nuclear gradients of dipole vector |
get_dipolegradients() |
Array (numpy) |
Nuclear Gradients of Electric dipole moment |
From the RKF Interface¶
Other properties not listed in the table above should be retrieved using the readrkf()
function:
myProperty = myAMSResults.readrkf(section, variable)
It is the responsibility of the user to provide the correct names for section and variable under which the required result is stored in the rkf file.
Finding Section/Variable Pairs¶
Looking up the names of the needed sections and variable within rkf files is typically needed for more intricate properties when writing a new PLAMS workflow. There are two main approaches to search for this information.
From Python Directories¶
The AMSResults
member function get_rkf_skeleton()
returns a dictionary containing the available sections as keys and the containing variable names as values
KFBrowser¶
KFBrowser is a GUI module used to inspect rkf files.
- 1. Open KFBrowser in the GUI via SCM → KFBrowser2. By default KFBrowser opens the ams.rkf file. Where neccessary, switch to File → open → <engine>.rkf3. Press ctrl + e or select File → Expert Mode to display the stored file contents4. Find the entry of interest. While this is a sometimes not trivial step, most often the required variable is found in either the
Properties
orAMSResults
sections.5. Once found, the names for section and variable listed in the rkf file directly corresponds to the section/variable pair to be used in the readrkf function as shown above.
Note
When reading results from a different rkf file than ams.rkf the filename has to be specified as:
myEngineProperty = myAMSResults.readrkf(section, variable, file=<engine>)
whereas <engine> corresponds to the file <engine>.rkf present in the calculation directory.
From molecular dynamics trajectories¶
General MD properties¶
The KFHistory
class can be used to iterate through the History or MDHistory of a trajectory.
In this example the energy, temperature and pressure per frame are read and printed.
kf = KFReader(mdjob.results['ams.rkf'])
hist = KFHistory(kf, "History")
mdhist = KFHistory(kf, "MDHistory")
frame = 0
for E, T, p in zip(hist.iter("Energy"), mdhist.iter("Temperature"), mdhist.iter("Pressure")):
frame += 1
print("Frame: {} Energy: {} Temperature: {} Pressure: {}".format(frame, E, T, p))
Properties that can be iterated in this way are
Property |
Return type |
Unit |
---|---|---|
Coords |
List of float |
bohr |
nLatticeVectors |
Int |
n.a. |
LatticeVectors |
List of float |
bohr |
Energy |
Float |
hartree |
Gradients |
List of float |
hartree/bohr |
StressTensor |
List of float |
atomic units |
Note
For AMS MD simulations you must set MolecularDynamics.Trajectory.WriteGradients = "True"
to store the gradients on the ams.rkf file.
Property |
Return type |
Unit |
---|---|---|
Step |
Integer |
n.a. |
Time |
Float |
fs |
TotalEnergy |
Float |
Hartree |
PotentialEnergy |
Float |
Hartree |
KineticEnergy |
Float |
Hartree |
Temperature |
Float |
Kelvin |
ConservedEnergy |
Float |
Hartree |
Velocities |
List of float |
bohr/fs |
Charges |
List of float |
n.a. |
PressureTensor |
List of float |
hartree/bohr3 |
Pressure |
Float |
hartree/bohr3 |
Density |
Float |
dalton/bohr3 |
Number of molecules |
Float |
n.a. |
To read a single property into a numpy array, you can run
import numpy as np
# mdjob is a finished AMSJob
coords = mdjob.results.get_history_property('Coords', history_section='History')
coords = np.array(coords).reshape(len(coords), -1, 3) # in bohr
print(coords.shape)
Set history_section='MDHistory'
to read from the MDHistory section.
Molecules from trajectories¶
The coordinates of an MD trajectory can efficiently be obtained by creating an RKFTrajectoryFile
.
To create an instance of RKFTrajectoryFile
, simply pass the according ams.rkf file to it. In this example, the atomic coordinates and lattice vectors are read via RKFTrajectoryFile
while the PLAMS Molecule function get_center_of_mass()
to calculate the center of mass for every frame.
rkf = RKFTrajectoryFile(mdjob.results['ams.rkf'])
mol = rkf.get_plamsmol()
for i in range(rkf.get_length()):
coords,cell = rkf.read_frame(i,molecule=mol)
print(coords, cell, mol.get_center_of_mass())
It is also possible to iterate through the History section of trajectory file. This can be useful in cases were the numbers of atoms is changing per frame or the coordinates per single molecule are needed. Here’s an example where the molecule types present in that particular frame are read for every frame:
kf = KFReader(mdjob.results['ams.rkf'])
mdhist = KFHistory(kf, "MDHistory")
hist = KFHistory(kf, "History")
# get number of distinct molecule types and all their formulas
number_of_molecules = kf.read('Molecules','Num molecules')
formulas = [ kf.read('Molecules',f'Molecule name {i+1}') for i in range(number_of_molecules) ]
for mols, step in zip( hist.iter("Mols.Type"), mdhist.iter("Step")):
line = f"{step:8d} "
for i in sorted(set(mols)): line += f"{formulas[i-1]:s} "
print(line)
Accessing Old Jobs¶
The following illustrates how to load data from previously executed jobs.
Binding Native PLAMS Jobs¶
Warning
The jobs should be loaded with a version of PLAMS that is consistent with the version originally used to run the jobs.
From an existing PLAMS working directory with the contents
OLDDIR/
├── OLDJOB1/
| ├── ams.log
| ├── ams.rkf
| ├── OLDJOB1.dill
| ├── OLDJOB1.err
| ├── OLDJOB1.in
| ├── OLDJOB1.out
| ├── OLDJOB1.run
| ├── engine.rkf
| ├── output.xyz
├── input
└── logfile
we can bind an instance of the AMSJob
class by making use of the .dill file.
The AMSJob
object in turn contains a results object, which gives access to the data previously calculated.
This can be achieved using the load()
function as illustrated in the following snippet:
path = "OLDDIR/OLDJOB1/OLDJOB1.dill"
single_JOB = load(path) # AMSJob instance
if single_JOB.ok():
energy = single_JOB.results.get_energy() # load the desired properties
structure = single_JOB.results.get_main_molecule()
propertyX = single_JOB.results.readrkf('AMSResults', 'DipoleMoment', file='engine')
More often than not, the working directory will include multiple individual subdirectories, each containing individual PLAMS job.
OLDDIR/
├── OLDJOB1/
| ├── ams.log
| ├── ams.rkf
| ├── OLDJOB1.dill
| ├── OLDJOB1.err
| ├── OLDJOB1.in
| ├── OLDJOB1.out
| ├── OLDJOB1.run
| ├── engine.rkf
| ├── output.xyz
├── OLDJOB2/
| ├── ams.log
| ├── ams.rkf
| ├── OLDJOB2.dill
| ├── OLDJOB2.err
| ├── OLDJOB2.in
| ├── OLDJOB2.out
| ├── OLDJOB2.run
| ├── engine.rkf
| ├── output.xyz
├── OLDJOB3/
| ├── ams.log
| ├── ams.rkf
| ├── OLDJOB3.dill
| ├── OLDJOB3.err
| ├── OLDJOB3.in
| ├── OLDJOB3.out
| ├── OLDJOB3.run
| ├── engine.rkf
| ├── output.xyz
├── input
└── logfile
These can be loaded using the load_all()
function and by providing only the path to the top-level directory:
path = "OLDDIR"
all_JOBS = load_all(path)
Note that load_all()
wraps the load()
function used above and therefore requires existing .dill files in each of the loaded subdirectories.
The load_all()
function yields a dictionary with the paths of the .dill files as keys and the corresponding job object as values:
print(all_JOBS)
{'/home/user/OLDDIR/OLDJOB1/OLDJOB1.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baad340b8>,
'/home/user/OLDDIR/OLDJOB2/OLDJOB2.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baacf24a8>,
'/home/user/OLDDIR/OLDJOB3/OLDJOB3.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baad06cf8>}
We can now access these AMSJob
instances:
for this_JOB in all_JOBS.values():
if this_JOB.ok():
energy = this_JOB.results.get_energy()
structure = this_JOB.results.get_main_molecule()
propertyX = this_JOB.results.readrkf('AMSResults', 'DipoleMoment', file='engine')
Binding old RKF Files¶
In cases where the .dill files are not available any more, it is still possible to load the contents of previously generated .rkf files into a PLAMS workflow:
path = "OLDDIR/OLDJOB1/"
ext_JOB = AMSJob.load_external(path)
if ext_JOB.ok():
energy = ext_JOB.results.get_energy()
structure = ext_JOB.results.get_main_molecule()
If the .rkf file does originate from some other source than any of the direct AMS engines, also an instance of the more generic SingleJob
class can be used:
path = "OLDDIR/OLDJOB1/ams.rkf"
ext_JOB = SingleJob.load_external(path)
The downside of this latter approach is that the accessibility to the data is very limited and has to be implemented mostly in terms of pattern-matching searches in the output files.
An alternative way is to make use of the KFReader class:
path = "OLDDIR/OLDJOB1/ams.rkf"
rkf_reader = KFReader(path)
n_steps = rkf_reader.read("History", "nEntries")
energy = rkf_reader.read("History", "Energy({})".format(n_steps))
structure = rkf_reader.read("History", "Coords({})".format(n_steps))
Note that also the KFReader class lacks most of the shortcut functions of a proper AMSResults
object so that the access to the data has to be specified manually.
Parallelization¶
Parallel job execution¶
PLAMS supports running multiple jobs in parallel. Details on the synchronization between parallel job executions can be found here. To make sure your PLAMS script can take maximum advantage of parallel job execution there is a simple rule: Make sure to create and run as many jobs as possible before starting to access any results. (This is because the access to results of a job may block until that job has finished, preventing you to submit more independent jobs in the meantime.)
In the common case where there are no dependencies between jobs, this means that we should set up and run all jobs before starting to access any results. The script below shows how to parallelize the trivially parallel task of just executing the same job on a set of molecules.
mols = read_molecules('my_molecules')
# mols is a dictionary mapping filenames (without
# extension) to plams.Molecule instances, e.g.:
# my_molecules/benzene.xyz would become mols['benzene']
sett = Settings()
# ... all your settings go here ...
config.default_jobrunner = JobRunner(parallel=True, maxjobs=8)
sett.runscript.nproc = 4
# run up to 8 jobs (using 4 cores each) in parallel
jobs = { n: AMSJob(name=n, settings=sett, molecule=m) for n,m in mols.items() }
results = { n: j.run() for n,j in jobs.items() }
# make and run all jobs before accessing any results
for n,r in results.items():
print(n, r.get_energy() if r.ok() else 'Failed')
Obviously, runscript.nproc
could also be set on a per-job basis.
This is useful if some of your jobs do not scale well with the number of CPU cores, or if you have jobs of very different computational cost.
PLAMS scripts under Slurm¶
A PLAMS script can be run under the Slurm batch system and execute jobs within the resource allocation that is created for the script. The script shown in the previous section should work just fine when run under a batch system. The only change you should make is not to set the maximum number of jobs.
config.default_jobrunner = JobRunner(parallel=True)
Technical
The execution of the job is implemented as a Slurm job step, which may need to wait for free resources before actually starting.
As Slurm is taking care of limiting the number of simultaneously executing jobs, we no longer need to do that through the PLAMS JobRunner
and can therefore skip the maxjobs
argument of its constructor.
Furthermore you may need to wrap the call to the PLAMS launch script in a job script, in which we recommend you cd
to the directory from which the job was submitted.
This makes sure you will find the PLAMS working directory in the normal location.
(If $AMSBIN
is not already in your environment, this is also the place to set up the AMS environment.)
#!/bin/sh
cd "$SLURM_SUBMIT_DIR"
$AMSBIN/plams myscript.plms
The above job script can then simply be submitted to the batch system:
sbatch [...] myscript.sh
Alternatively you can also skip the job script, and submit the PLAMS launch script itself:
sbatch [...] --chdir=. $AMSBIN/plams myscript.plms
Warning
The integration of PLAMS, AMS and Slurm will only work on Slurm versions >=15.
Furthermore AMS needs to use an MPI implementation that is integrated with the Slurm.
This is the case for the IntelMPI builds of AMS, but not the OpenMPI builds.
Please refer to the installation manual for details on the capabilities of the different MPI versions.
If the batch system integration does not work for you, you can still run PLAMS scripts via the batch system, but you will be restricted to running on a single node and will need to use the maxjobs
argument in the constructor of the JobRunner
to limit the number of simultaneously running jobs.