Automating Workflows¶
Identifying systems with certain physical properties out of a large database of molecules is a typical task that can be easily automatized. In this tutorial we will show how this can be achieved with a simple PLAMS script.
Note: this tutorial focuses on scripting aspects rather than studying physical phenomena. Thus, the computational methods used might not be entirely appropriate to describe the physical properties of interest.
Introducing the case study¶
Imagine that, out of a large database of test systems, we want to identify those molecules that absorb light with an energy smaller than 4 eV.
A simple approach would be to calculate the excitation energies (and the corresponding oscillator strengths) using ADF’s time-dependent DFT (TD-DFT) for all the systems of the database. As TD-DFT is a computationally expensive method, this procedure can be quite demanding for larger numbers of molecules.
A feasible approach may thus be realized by pre-screening the large database of molecules first, e.g. by using a less accurate but more efficient method like time-dependent DFTB (TD-DFTB). The (computationally expensive) TD-DFT calculations with ADF need then to be conducted only for those systems exhibiting the most promising adsorption energies in the previous TD-DFTB calculations. Thus, the basic workflow may look as follows:
- Step 1
- Calculate excitation energies and oscillator strengths with TD-DFTB.
- Select those molecules with electronic excitations of energies below 6 eV and nonvanishing oscillator strengths (since TD-DFTB is less accurate than TD-DFT, we opt for a larger energy threshold in this step). This should significantly reduce the number of molecules to be considered in the following steps.
- Step 2
- Optimize the structures of the promising molecules with ADF.
- Step 3
- Compute the electronic excitations energies using ADF’s TD-DFT and select the molecules with excitations of energies below 4 eV and significant oscillator strengths.
Depending on your command of the Python language, you may want to read the following Python tutorials before proceeding:
Workflow script¶
Click here
to download a tarball file (workflow_tutorial.tar) containing the scripts workflow.py, my_settings.py and a folder molecules
with the molecular structures in .xyz format.
You can extract the files by typing in your terminal:
tar -xf workflow_tutorial.tar
The workflow.py script performs the three steps described above:
# workflow.py
# Workflow automation with PLAMS
import os
import sys
sys.path.append(os.getcwd()) # make sure we can import from the current dir
import my_settings # my_settings.py contains some predefined settings
def get_good_excitations(job, energy, osc_str):
"""Returns the excitations satisfying the condition e < en_thresh and o > osc_thresh.
Units: energies in eV and oscillator strengths in Debye"""
good_excitations = []
# Before reading the results, make sure that the job terminated normally
if job.check():
# The results on the binary KF file are in atomic units
exc = [Units.convert(e,'au','eV')
for e in job.results.readkf('Excitations SS A','excenergies')]
osc = [Units.convert(o,'au','Debye')
for o in job.results.readkf('Excitations SS A','oscillator strengths')]
for e,o in zip(exc,osc):
# Check whether the excitation satisfies our condition:
if e < energy and o > osc_str:
good_excitations.append((e,o))
return good_excitations
@add_to_class(DFTBResults)
def get_opt_molecule(self):
return self.get_molecule('Molecule','Coords')
@add_to_class(ADFResults)
def get_opt_molecule(self):
return self.get_molecule('Geometry','xyz')
################################ Step 0 ################################
# Create a list of Molecules from the .xyz files in the directory 'molecules':
molecules_dir = 'molecules'
molecules = []
# Loop over all the file in the directory
for mol_name in os.listdir(molecules_dir):
mol = Molecule(os.path.join(molecules_dir,mol_name))
# the properties attribute of the Molecule object is a PLAMS’ Settings object
mol.properties.name = os.path.splitext(mol_name)[0]
molecules.append(mol)
################################ Step 1 ################################
print("Step 1: optimize molecules and calculate excitations with DFTB")
promising_molecules = []
for mol in molecules:
# The settings are defined in the module ``my_settings``,
# which we imported at the beginning of our script
dftb_job = DFTBJob(molecule=mol, settings=my_settings.dftb_GO_plus_excitations(),
name='DFTB_' + mol.properties.name)
dftb_job.run()
good_excitations = get_good_excitations(dftb_job, energy=6.0, osc_str=1.0E-4)
if good_excitations:
print ("Found a molecule with a promising excitation: " + mol.properties.name)
promising_molecules.append(dftb_job.results.get_opt_molecule())
print("Number of promising molecules: %i " % len(promising_molecules))
################################ Step 2 ################################
print("Step 2: Geometry optimizations with ADF")
optimized_molecules = []
for mol in promising_molecules:
adf_GO_job = ADFJob(molecule=mol, settings=my_settings.adf_GO(basis='DZP', XC='LDA'),
name='ADF_GO_' + mol.properties.name)
adf_GO_job.run()
# Before reading the results, make sure that the job terminated normally
if adf_GO_job.check():
optimized_molecules.append(adf_GO_job.results.get_opt_molecule())
################################ Step 3 ################################
print("Step 3: Calculate the excitations with ADF")
for mol in optimized_molecules:
adf_exci_job = ADFJob(molecule=mol, settings=my_settings.adf_excitations(n_excitations=5),
name='ADF_EXCI_' + mol.properties.name)
adf_exci_job.run()
good_excitations = get_good_excitations(adf_exci_job, energy=4.0, osc_str=1.0E-4)
if good_excitations:
print ("Found a molecule with a good excitation: " + mol.properties.name)
print ("Excitations (energy [eV], oscillator strength [Debye]:" + str(good_excitations))
Go to the folder workflow_tutorial
and execute the script by typing:
$ADFBIN/plams workflow.py
The calculation will take about a minute and should yield:
Step 1: optimize molecules and calculate excitations with DFTB
[13:34:33] Job DFTB_AlF3 started
...
[13:34:41] Job DFTB_S2Cl2 finished with status 'successful'
Found a molecule with a promising excitation: S2Cl2
Number of promising molecules: 2
Step 2: Geometry optimizations with ADF
[13:34:41] Job ADF_GO_CSCl2 started
...
[13:34:53] Job ADF_GO_S2Cl2 finished with status 'successful'
Step 3: Calculate the excitations with ADF
[13:34:53] Job ADF_EXCI_CSCl2 started
...
[13:35:18] Job ADF_EXCI_S2Cl2 finished with status 'successful'
Found a molecule with a good excitation: S2Cl2
Excitations (energy [eV], oscillator strength [Debye]):
[(3.426, 0.011), (3.563, 0.015), (3.568, 0.001)]
In this simple example the large database of molecules consists of 5 systems. 2 of the 5 molecules pass the TD-DFTB screening step (Step 1), and eventually only one molecule, S2Cl2, has electronic excitations satisfying our condition of an excitation energy smaller than 4 eV with a significant corresponding oscillator strength.
Settings library¶
The Settings objects specifying the input option for the computational engines are defined in my_settings.py
.
We encourage you to develop your own library of input options.
# my_settings.py
# Small library of predefined settings
from scm.plams import *
# Simple immutable settings:
def dftb_GO_plus_excitations():
sett = Settings()
sett.input.Task.RunType = 'GO'
sett.input.DFTB.ResourcesDir = 'QUASINANO2015'
sett.input.Properties.Excitations.TDDFTB.calc = 'singlet'
sett.input.Properties.Excitations.TDDFTB.lowest = 10
return sett
# You can improve flexibility by having optional arguments:
def adf_GO(basis='TZP', XC='GGA PBE'):
sett = Settings()
sett.input.Basis.Type = basis
sett.input.XC['_1'] = XC
sett.input.NumericalQuality = 'Basic'
sett.input.Geometry
return sett
def adf_excitations(basis='TZP', XC='GGA PBE', n_excitations=10):
sett = Settings()
sett.input.Basis.Type = basis
sett.input.XC['_1'] = XC
sett.input.Excitations.lowest = n_excitations
sett.input.Excitations.OnlySing = ''
sett.input.NumericalQuality = 'Basic'
sett.input.Symmetry = 'NoSym'
return sett
Miscellaneous remarks¶
Before accessing a Results object of a PLAMS’ job you should assert the correct termination of the corresponding job via the
check()
member function:job.run() if job.check(): r = job.results.readkf(section_name, variable_name)
Further details about the usage of decorator functions can be found in the corresponding section of the PLAMS documentation.