AMS worker
The PLAMS interface to the AMS driver and its engines through the AMSJob and AMSResults classes is technically quite similar to how one would run calculations from the command line or from the GUI: the AMS input files are written to disk, the AMS driver starts up, reads its input, performs the calculations and writes the results to disk in the form of human-readable text files as well as machine-readable binary files, usually in KF format. This setup has the advantage that any calculation that can be performed with AMS can be set up from PLAMS as an AMSJob, and that any result from any calculation can be accessed from PLAMS through the corresponding AMSResults instance. Furthermore, the resulting files on disk can often be visualized using the AMS GUI, as if the job had been set up and run through the graphical user interface. As such, this way of running AMS offers maximum flexibility and convenience to users.
However, for simple and fast jobs where we only care about some basic results, this flexibility comes at a cost: input files need to be created on disk, a process is launched, possibly reading all kinds of configuration and parameter files. The process writes more files to disk, which we later need to open again to extract (in the worst case) just a single number. The overhead might be irrelevant for sufficiently slow engines, but for a very fast force field this overhead can easily become the performance bottleneck.
Starting with the AMS2019.3 release, the AMS driver implements a special task in which the running process listens for calculation requests on a named pipe (FIFO) and communicates the results of the calculations back on another pipe. This avoids the overhead of starting processes and eliminates all file-based I/O. You can find more information about the pipe interface in the AMS driver in the corresponding part of the documentation. In PLAMS the AMSWorker class is used to represent this running AMS driver process. The AMSWorker class handles all communication with the process and hides the technical details of the underlying communication protocol.
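To make the idea of a long-running process serving requests over two named pipes more concrete, here is a minimal sketch using only the Python standard library. It is not the AMS pipe protocol: the JSON message format and the names toy_worker and run_requests are invented for this illustration, the "worker" is just a thread, and os.mkfifo is only available on POSIX systems.

```python
import json
import os
import tempfile
import threading


def toy_worker(call_path, reply_path):
    """Answer requests until the caller closes its end of the call pipe."""
    with open(call_path, "r") as calls, open(reply_path, "w") as replies:
        for line in calls:
            request = json.loads(line)
            # Stand-in for a real calculation: add up the supplied numbers.
            reply = {"name": request["name"], "value": sum(request["data"])}
            replies.write(json.dumps(reply) + "\n")
            replies.flush()  # make the reply visible to the caller right away


def run_requests(requests):
    """Start one worker, send it all requests and collect the replies."""
    with tempfile.TemporaryDirectory(prefix="toyworker") as workdir:
        call_path = os.path.join(workdir, "call_pipe")
        reply_path = os.path.join(workdir, "reply_pipe")
        os.mkfifo(call_path)
        os.mkfifo(reply_path)
        thread = threading.Thread(
            target=toy_worker, args=(call_path, reply_path), daemon=True
        )
        thread.start()
        answers = []
        # Both sides open the pipes in the same order, so neither blocks forever.
        with open(call_path, "w") as calls, open(reply_path, "r") as replies:
            for request in requests:
                calls.write(json.dumps(request) + "\n")
                calls.flush()
                answers.append(json.loads(replies.readline()))
        thread.join()  # the worker sees end-of-file on the call pipe and exits
        return answers
```

The worker is started once and then answers any number of requests, which is the property that removes the per-calculation startup and file I/O cost described above.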
Consider the following short PLAMS script that calculates and prints the total GFN1-xTB energy for all molecules found in a folder full of xyz-files. Using the regular AMSJob, this can be written as:
    molecules = read_molecules('folder/with/xyz/files')
    sett = Settings()
    sett.input.ams.Task = 'SinglePoint'
    sett.input.dftb.Model = 'GFN1-xTB'
    for name, mol in molecules.items():
        results = AMSJob(name=name, molecule=mol, settings=sett).run()
        print('Energy of {} = {}'.format(name, results.get_energy()))
In order to switch this script over to using the AMSWorker, we need to make only a couple of changes:
    molecules = read_molecules('folder/with/xyz/files')
    sett = Settings()
    sett.input.dftb.Model = 'GFN1-xTB'
    with AMSWorker(sett) as worker:
        for name, mol in molecules.items():
            results = worker.SinglePoint(name, mol)
            print('Energy of {} = {}'.format(name, results.get_energy()))
With the first AMSJob-based version, both the Settings instance and the Molecule instance were passed into the constructor of the AMSJob, while the AMSWorker constructor only accepts the Settings instance. The Molecule instance is only later passed into the SinglePoint method. This shows the basic usage of the AMSWorker class: create it once, supplying the desired Settings, and use these fixed settings for calculations on multiple molecules. It is not possible to change the Settings on an already running AMSWorker instance. If you have to switch Settings, you need to create a new AMSWorker with the new settings. It therefore only makes sense to use the AMSWorker if one has to do calculations on many molecules using the same settings.
Note that when using the AMSWorker, the type of results in the above example is not actually AMSResults anymore: the call to SinglePoint returns an instance of AMSWorkerResults, which only implements a small subset of the methods available in the full AMSResults. This is the concession we have to make for using AMSWorker instead of AMSJob: after all, the AMSResults class has many methods to extract arbitrary data from the result files of an AMS calculation. Since none of these files exist when directly communicating with the AMS process over a pipe, the AMSWorkerResults class cannot support any of these file-based methods.
Given these restrictions we recommend that users first try the traditional route of running the AMS driver via the AMSJob class, and only switch to the AMSWorker alternative if they observe a significant slowdown due to the startup and I/O cost. The overhead is likely only relevant for simple tasks (single points, geometry optimizations) using rather fast engines such as semi-empirical methods and force fields.
In case the worker process fails to start up or terminates unexpectedly, an AMSWorkerError exception will be raised. The standard output and standard error output from the failed worker process are stored in the stdout and stderr attributes of the AMSWorkerError.
If an AMSWorkerError or AMSPipeRuntimeError exception occurs during SinglePoint, it will be internally caught and stored in the error attribute of the returned AMSWorkerResults object for further inspection. These two types of exceptions are typically related to the calculation being performed (the combination of the Molecule and Settings), so they are not allowed to propagate out of SinglePoint, to match the behavior of AMSJob in similar situations. However, other types of exceptions derived from AMSPipeError may also occur in AMSWorker. These correspond to other errors defined by the pipe protocol and will propagate normally, because they represent programming and logic errors, protocol incompatibilities, or unsupported features. In any case, AMSWorker will be ready to handle another call to SinglePoint after an error.
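Because calculation-related errors are stored on the returned object instead of being raised, calling code has to check each result explicitly. The following sketch shows one way to do that using only the name attribute and the ok(), get_energy() and get_errormsg() methods mentioned on this page. The summarize helper and the FakeResults class are hypothetical: FakeResults merely mimics AMSWorkerResults so that the example runs without an AMS installation.

```python
def summarize(results):
    """Return a one-line report for a results object that offers
    name, ok(), get_energy() and get_errormsg()."""
    if results.ok():
        return "Energy of {} = {}".format(results.name, results.get_energy())
    return "Calculation {} failed: {}".format(results.name, results.get_errormsg())


class FakeResults:
    """Hypothetical stand-in for AMSWorkerResults, for demonstration only."""

    def __init__(self, name, energy=None, error=None):
        self.name = name
        self._energy = energy
        self.error = error  # a caught exception, or None on success

    def ok(self):
        return self.error is None

    def get_energy(self):
        return self._energy

    def get_errormsg(self):
        return None if self.error is None else str(self.error)


good = FakeResults("water", energy=-5.0)
bad = FakeResults("broken", error=RuntimeError("SCF did not converge"))
for r in (good, bad):
    print(summarize(r))
```

With a real worker, the same helper could be applied to the object returned by worker.SinglePoint(name, mol).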
PLAMS also provides the AMSWorkerPool class, which represents a pool of running AMSWorker instances that dynamically pick tasks from a queue of calculations to be performed. This is useful for workflows that require the execution of many trivially parallel simple tasks. Using the AMSWorkerPool we could write the above example as:
    molecules = read_molecules('folder/with/xyz/files')
    sett = Settings()
    sett.input.dftb.Model = 'GFN1-xTB'
    sett.runscript.nproc = 1 # every worker is a serial process now
    with AMSWorkerPool(sett, num_workers=4) as pool:
        results = pool.SinglePoints(molecules.items())
    for r in results:
        print('Energy of {} = {}'.format(r.name, r.get_energy()))
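The scheduling idea behind such a pool, where idle workers pull the next task from a shared queue, can be sketched with the standard library alone. This is only an illustration of the concept: the run_pool function is hypothetical, it uses threads instead of separate AMS processes, and it makes no claim about how AMSWorkerPool is implemented internally.

```python
import queue
import threading


def run_pool(tasks, calculate, num_workers=4):
    """Run calculate(name, payload) for every (name, payload) task and
    return the results in the order in which the tasks were given."""
    tasks = list(tasks)
    todo = queue.Queue()
    for index, task in enumerate(tasks):
        todo.put((index, task))
    results = [None] * len(tasks)

    def worker_loop():
        while True:
            try:
                index, (name, payload) = todo.get_nowait()
            except queue.Empty:
                return  # no work left, this worker shuts down
            results[index] = calculate(name, payload)

    threads = [threading.Thread(target=worker_loop) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Since every worker fetches a new task as soon as it finishes the previous one, a few slow tasks do not leave the other workers idle, which is why this pattern suits many small, independent calculations.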
AMSWorker API
class AMSWorker(settings, workerdir_root=None, workerdir_prefix='amsworker', use_restart_cache=True, keep_crashed_workerdir=False)
A class representing a running instance of the AMS driver as a worker process.
Users need to supply a Settings instance representing the input of the AMS driver process (see Preparing input), but not including the Task keyword in the input (the input.ams.Task key in the Settings instance). The Settings instance should also not contain a system specification in the input.ams.System block, the input.ams.Properties block, or the input.ams.GeometryOptimization block. Often the settings of the AMS driver in worker mode will come down to just the engine block.
The AMS driver will then start up as a worker, communicating with PLAMS via named pipes created in a temporary directory (determined by the workerdir_root and workerdir_prefix arguments). This temporary directory might also contain temporary files used by the worker process. Note that while an AMSWorker instance exists, the associated worker process can be assumed to be running and ready: if it crashes for some reason, it is automatically restarted.
The recommended way to start an AMSWorker is as a context manager:
    with AMSWorker(settings) as worker:
        results = worker.SinglePoint('my_calculation', molecule)
    # clean up happens automatically when leaving the block
If it is not possible to use the AMSWorker as a context manager, cleanup should be manually triggered by calling the stop() method.
stop(keep_workerdir=False)
Stops the worker process and removes its working directory.
This method should be called when the AMSWorker instance is not used as a context manager and the instance is no longer needed. Otherwise proper cleanup is not guaranteed to happen, the worker process might be left running and files might be left on disk.
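When the context manager form cannot be used, a try/finally block is a common way to make sure stop() is still called if the work in between raises an exception. The sketch below shows the pattern; run_with_cleanup, make_worker and work are hypothetical names, and FakeWorker is a stand-in that only records whether stop() was called, so that the example runs without AMS.

```python
def run_with_cleanup(make_worker, work):
    """Create a worker, run work(worker) and always stop the worker afterwards."""
    worker = make_worker()
    try:
        return work(worker)
    finally:
        worker.stop()  # runs on success and on error alike


class FakeWorker:
    """Hypothetical stand-in for AMSWorker, for demonstration only."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


demo_worker = FakeWorker()
outcome = run_with_cleanup(lambda: demo_worker, lambda worker: "done")
print(outcome, demo_worker.stopped)
```

With PLAMS, make_worker would presumably be something like lambda: AMSWorker(sett), and work would contain the calls to SinglePoint.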
SinglePoint(name, molecule, prev_results=None, quiet=True, gradients=False, stresstensor=False, hessian=False, elastictensor=False, charges=False, dipolemoment=False, dipolegradients=False)
Performs a single point calculation on the geometry given by the Molecule instance molecule and returns an instance of AMSWorkerResults containing the results.
Every calculation should be given a name. Note that the name must be unique for this AMSWorker instance: one should not attempt to reuse calculation names with a given instance of AMSWorker.
By default only the total energy is calculated, but additional properties can be requested using the corresponding keyword arguments:
- gradients: Calculate the nuclear gradients of the total energy.
- stresstensor: Calculate the clamped-ion stress tensor. This should only be requested for periodic systems.
- hessian: Calculate the Hessian matrix, i.e. the second derivative of the total energy with respect to the nuclear coordinates.
- elastictensor: Calculate the elastic tensor. This should only be requested for periodic systems.
- charges: Calculate atomic charges.
- dipolemoment: Calculate the electric dipole moment. This should only be requested for non-periodic systems.
- dipolegradients: Calculate the nuclear gradients of the electric dipole moment. This should only be requested for non-periodic systems.
Users can pass an instance of a previously obtained AMSWorkerResults as the prev_results keyword argument. This can trigger a restart from previous results in the worker process, the details of which depend on the computational engine used: for example, a DFT-based engine might restart from the electronic density obtained in an earlier calculation on a similar geometry. This is often useful to speed up series of sequentially dependent calculations:
    mol = Molecule('some/system.xyz')
    with AMSWorker(sett) as worker:
        last_results = None
        for i in range(num_steps):
            results = worker.SinglePoint(f'step{i}', mol, prev_results=last_results, gradients=True)
            # modify the geometry of mol using results.get_gradients()
            last_results = results
Note that restarting is disabled if the AMSWorker instance was created with use_restart_cache=False. It is still permitted to pass previous AMSWorkerResults instances as the prev_results argument, but no restarting will happen.
The quiet keyword can be used to obtain more output from the worker process. Note that the output of the worker process is not printed to the standard output but instead ends up in the ams.out file in the temporary working directory of the AMSWorker instance. This is mainly useful for debugging.
GeometryOptimization(name, molecule, prev_results=None, quiet=True, gradients=False, stresstensor=False, hessian=False, elastictensor=False, charges=False, dipolemoment=False, dipolegradients=False, method=None, coordinatetype=None, usesymmetry=None, optimizelattice=False, maxiterations=None, pretendconverged=None, calcpropertiesonlyifconverged=True, convenergy=None, convgradients=None, convstep=None, convstressenergyperatom=None)
Performs a geometry optimization on the Molecule instance molecule and returns an instance of AMSWorkerResults containing the results from the optimized geometry.
The geometry optimizer can be controlled using the following keyword arguments:
- method: String identifier of a particular optimization algorithm.
- coordinatetype: Select a particular kind of optimization coordinates.
- usesymmetry: Enable the use of symmetry when applicable.
- optimizelattice: Optimize the lattice vectors together with atomic positions.
- maxiterations: Maximum number of iterations allowed.
- pretendconverged: If set to true, non-converged geometry optimizations will be considered successful.
- calcpropertiesonlyifconverged: Calculate properties (e.g. the Hessian) only if the optimization converged.
- convenergy: Convergence criterion for the energy (in Hartree).
- convgradients: Convergence criterion for the gradients (in Hartree/Bohr).
- convstep: Convergence criterion for displacements (in Bohr).
- convstressenergyperatom: Convergence criterion for the stress energy per atom (in Hartree).
AMSWorkerResults API
class AMSWorkerResults(name, molecule, results, error=None)
A specialized class encapsulating the results from calls to an AMSWorker.
Technical
AMSWorkerResults is not a subclass of Results or AMSResults. It does however implement some commonly used methods of the AMSResults class, so that results calculated by AMSJob and AMSWorker can be accessed in a uniform way.
name
The name of a calculation.
That is the name that was passed into the AMSWorker method when this AMSWorkerResults object was created. It cannot be changed after the AMSWorkerResults instance has been created.
ok()
Check if the calculation was successful. If not, the error attribute contains a corresponding exception.
Users should check if the calculation was successful before using the other methods of the AMSWorkerResults instance, as using them might raise a ResultsError exception otherwise.
get_errormsg()
Attempts to retrieve a human-readable error message from a crashed job. Returns None for jobs without errors.
get_gradients(energy_unit='au', dist_unit='au')
Return the nuclear gradients of the total energy, expressed in energy_unit / dist_unit.
get_hessian()
Return the Hessian matrix, i.e. the second derivative of the total energy with respect to the nuclear coordinates, expressed in atomic units.
get_dipolegradients()
Return the nuclear gradients of the electric dipole moment, expressed in atomic units. This is a (3*numAtoms x 3) matrix.
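As a worked example of what the energy_unit and dist_unit arguments of get_gradients mean, the sketch below converts gradients from atomic units (Hartree/Bohr) to eV/Angstrom by hand. The conversion factors are the CODATA 2018 values; the function name and the sample numbers are made up for illustration. In PLAMS the same result could presumably be obtained directly by passing the corresponding unit names to get_gradients, but the exact unit strings accepted are an assumption and not stated on this page.

```python
HARTREE_TO_EV = 27.211386245988    # CODATA 2018
BOHR_TO_ANGSTROM = 0.529177210903  # CODATA 2018


def gradients_au_to_ev_per_angstrom(gradients):
    """Convert a list of per-atom [x, y, z] gradients from Hartree/Bohr
    to eV/Angstrom."""
    # Energy goes in the numerator and distance in the denominator, so the
    # energy factor is divided by the distance factor.
    factor = HARTREE_TO_EV / BOHR_TO_ANGSTROM
    return [[component * factor for component in atom] for atom in gradients]


# Made-up gradients for a two-atom system, in Hartree/Bohr.
sample = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
converted = gradients_au_to_ev_per_angstrom(sample)
print(converted[0][0])  # roughly 51.42 eV/Angstrom per Hartree/Bohr
```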
AMSWorkerPool API
class AMSWorkerPool(settings, num_workers, workerdir_root=None, workerdir_prefix='awp', keep_crashed_workerdir=False)
A class representing a pool of AMS worker processes.
All workers of the pool are initialized with the same Settings instance, see the AMSWorker constructor for details.
The number of spawned workers is determined by the num_workers argument. For optimal performance on many small jobs it is recommended to spawn a number of workers equal to the number of physical CPU cores of the machine the calculation is running on, and to let every worker instance run serially:
    import psutil
    molecules = read_molecules('folder/with/xyz/files')
    sett = Settings()
    # ... more settings ...
    sett.runscript.nproc = 1 # <-- every worker itself is serial (aka export NSCM=1)
    with AMSWorkerPool(sett, psutil.cpu_count(logical=False)) as pool:
        results = pool.SinglePoints([ (name, molecules[name]) for name in sorted(molecules) ])
As with the underlying AMSWorker class, the location of the temporary directories can be changed with the workerdir_root and workerdir_prefix arguments.
It is recommended to use the AMSWorkerPool as a context manager in order to ensure that cleanup happens automatically. If it is not possible to use the AMSWorkerPool as a context manager, cleanup should be manually triggered by calling the stop() method.
SinglePoints
(items)[source]¶ Request to pool to execute single point calculations for all items in the iterable items. Returns a list of
AMSWorkerResults
objects.The items argument is expected to be an iterable of 2-tuples
(name, molecule)
and/or 3-tuples(name, molecule, kwargs)
, which are passed on to theSinglePoint
method of the pool’sAMSWorker
instances. (Herekwargs
is a dictionary containing the optional keyword arguments and their values for this method.)As an example, the following call would do single point calculations with gradients and (only for periodic systems) stress tensors for all
Molecule
instances in the dictionarymolecules
.results = pool.SinglePoint([ (name, molecules[name], { "gradients": True, "stresstensor": len(molecules[name].lattice) != 0 }) for name in sorted(molecules) ])
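Building the list of tuples can also be factored out into a small helper, which keeps the per-molecule keyword logic in one place. The sketch below is one way to do this; build_items and kwargs_for are hypothetical names, and the stand-in molecules are plain objects carrying only a lattice attribute so that the example runs without PLAMS.

```python
from types import SimpleNamespace


def build_items(molecules, kwargs_for=None):
    """Turn a {name: molecule} dictionary into a sorted list of
    (name, molecule) or (name, molecule, kwargs) tuples."""
    items = []
    for name in sorted(molecules):
        molecule = molecules[name]
        if kwargs_for is None:
            items.append((name, molecule))
        else:
            items.append((name, molecule, kwargs_for(molecule)))
    return items


# Stand-in molecules: one periodic (has lattice vectors), one not.
fake_molecules = {
    "crystal": SimpleNamespace(lattice=[[3.0, 0.0, 0.0]]),
    "benzene": SimpleNamespace(lattice=[]),
}
items = build_items(
    fake_molecules,
    lambda mol: {"gradients": True, "stresstensor": len(mol.lattice) != 0},
)
for name, _, kwargs in items:
    print(name, kwargs)
```

The resulting list has the shape expected by the items argument described above.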
GeometryOptimizations(items)
Requests the pool to execute geometry optimizations for all items in the iterable items. Returns a list of AMSWorkerResults objects for the optimized geometries.
The items argument is expected to be an iterable of 2-tuples (name, molecule) and/or 3-tuples (name, molecule, kwargs), which are passed on to the GeometryOptimization method of the pool's AMSWorker instances. (Here kwargs is a dictionary containing the optional keyword arguments and their values for this method.)
stop()
Stops all worker processes and removes their working directories.
This method should be called when the AMSWorkerPool instance is not used as a context manager and the instance is no longer needed. Otherwise proper cleanup is not guaranteed to happen, worker processes might be left running and files might be left on disk.