2.1. Conformers¶
There are four types of conformer classes, which differ in the approach to duplicate recognition,
and all have largely the same interface.
A description of the full interface is provided below for the UniqueConformersCrest class,
followed by more abbreviated descriptions of the UniqueConformersRMSD, UniqueConformersTFD, and UniqueConformersAMS classes.
Overall, we recommend the UniqueConformersCrest class for its good accuracy/efficiency ratio and its ability to find and store rotamers.
The UniqueConformersAMS class is slow in filtering out duplicates for large symmetric molecular systems,
but is a good choice if conformer sets need to be compared and clustered.
The UniqueConformersRMSD class is only able to filter out the most obvious duplicates.
It is not able to identify duplicates if they involve symmetric images (e.g. rotations around methyl groups).
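The limitation of plain RMSD for symmetric images can be illustrated with a small sketch. It does not use the scm.conformers API; the coordinates, the `rmsd` helper, and the toy fragment are all hypothetical. The off-axis backbone atom ensures that no rigid rotation of the whole molecule can undo the methyl rotation, and both geometries are assumed to be in the same frame already.

```python
import itertools
import math


def rmsd(a, b):
    """Plain RMSD between two equally ordered coordinate lists (no superposition)."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


# Toy fragment: a methyl carbon, a neighbour on the rotation axis, and an
# off-axis atom that breaks the overall rotational symmetry of the molecule.
backbone = [(0.0, 0.0, 0.0), (0.0, 0.0, -1.5), (1.2, 0.3, -2.0)]
hydrogens = [(1.0, 0.0, 0.0), (-0.5, 0.866, 0.0), (-0.5, -0.866, 0.0)]

conf_a = backbone + hydrogens
# A 120 degree methyl rotation: same structure, but the hydrogen labels are cycled.
conf_b = backbone + hydrogens[1:] + hydrogens[:1]

# Index-by-index RMSD treats the two as very different geometries.
plain = rmsd(conf_a, conf_b)

# Minimizing over permutations of the equivalent hydrogens recovers the match.
best = min(
    rmsd(conf_a, backbone + list(perm))
    for perm in itertools.permutations(conf_b[len(backbone):])
)

print(f"plain RMSD: {plain:.3f}, permutation-aware RMSD: {best:.3f}")
```

With an RMSD threshold of 0.125 Angstrom, the plain comparison would keep both geometries as separate conformers, although they are the same structure.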
2.1.1. UniqueConformersCrest¶
A class holding the conformers of a molecule, using CREST duplicate recognition to filter out duplicates.
-
class
UniqueConformersCrest
(energy_threshold=0.05, rmsd_threshold=0.125, bconst_threshold=0.003)¶ Class representing a set of unique conformers
An instance of this class has the following attributes:
molecule
– A PLAMS molecule object defining the connection data of the molecule
geometries
– A list containing the coordinates of all conformers in the set
energies
– A list containing the energies of all conformers in the set
rotamers
– A list with UniqueConformersCrest objects representing the rotamer-set for each conformer
generator
– A conformer generator object. Has to be set with set_generator(). The default generator is of the CRESTGenerator type.
settings
– All the user definable settings:
* check_for_duplicates – Only accept new conformer if candidate is not a duplicate
* accept_isomers – Don’t reject isomers (default is to reject them)
* accept_all – Accept any candidate in the set without checks
A simple example of (parallel) use:
>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersCrest
>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersCrest()
>>> conformers.prepare_state(mol)
>>> # Set up PLAMS settings
>>> init()
>>> # Create the generator and run
>>> conformers.generate(nproc=4)
>>> finish()
>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()
Note
The default generator for this conformer class is the CRESTGenerator, using the GFN1-xTB engine. This will generally take a lot of time. To speed things up, set a generator with a different engine prior to running generate():
>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='crest', engine_settings=engine, nproc=4)
-
__init__
(energy_threshold=0.05, rmsd_threshold=0.125, bconst_threshold=0.003)¶ Creates an instance of the conformer class
energy_threshold
– The energy difference above which conformers are always considered unique (kcal/mol).
rmsd_threshold
– RMSD below which conformers are considered duplicates (Angstrom).
bconst_threshold
– Relative rotational constant difference used to determine if conformers are unique or not.
Note: the Grimme code uses 0.01 as bconst_threshold, but this leads to a lot of misclassifications (i.e. different conformers are classified as equivalent rotamers), so a smaller default value is used here.
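How the three thresholds could interact is sketched below. This is an assumed decision order based only on the parameter descriptions above, not the actual implementation; the function `classify_pair` and its arguments are hypothetical.

```python
def classify_pair(delta_e, rmsd, delta_b_rel,
                  energy_threshold=0.05, rmsd_threshold=0.125,
                  bconst_threshold=0.003):
    """Classify a candidate against one stored conformer (assumed logic).

    delta_e     : energy difference in kcal/mol
    rmsd        : RMSD between the two geometries in Angstrom
    delta_b_rel : relative difference in rotational constants (unitless)
    """
    # Above the energy threshold the pair is always considered unique.
    if abs(delta_e) > energy_threshold:
        return "unique"
    # Nearly identical geometry: a plain duplicate.
    if rmsd < rmsd_threshold:
        return "duplicate"
    # Different atom positions but the same overall shape: a rotamer.
    if delta_b_rel < bconst_threshold:
        return "rotamer"
    return "unique"


# The same pair under the default and under the looser 0.01 threshold.
print(classify_pair(0.01, 0.8, 0.005))
print(classify_pair(0.01, 0.8, 0.005, bconst_threshold=0.01))
```

In this sketch, a looser bconst_threshold turns a pair that was classified as unique into a rotamer, which is the kind of misclassification the smaller default is meant to avoid.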
-
add_conformer
(coords, energy, reorder=None, log_errors=True)¶ Adds the new coordinates to the list of conformers, if they are not duplicates
coords
– A coordinate array for the candidate conformer
energy
– The energy of the candidate conformer
reorder
– Boolean specifying if the conformers should be ordered based on energy after addition of candidate
Note
If the conformer is not unique, this returns the index of its duplicate. If it is unique, this returns None.
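The return convention can be mimicked with a toy class. This is a hypothetical stand-in, not the scm.conformers implementation: it compares energies and a plain RMSD only, and assumes identical atom order and a common frame.

```python
import math


class ToyConformerSet:
    """Hypothetical stand-in that mimics only the add_conformer return convention."""

    def __init__(self, energy_threshold=0.05, rmsd_threshold=0.125):
        self.energy_threshold = energy_threshold  # kcal/mol
        self.rmsd_threshold = rmsd_threshold      # Angstrom
        self.geometries = []
        self.energies = []

    @staticmethod
    def _rmsd(a, b):
        # Plain RMSD without superposition.
        return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

    def add_conformer(self, coords, energy):
        """Return the index of the stored duplicate, or None if the candidate was added."""
        for index, (geom, e) in enumerate(zip(self.geometries, self.energies)):
            close_in_energy = abs(energy - e) <= self.energy_threshold
            if close_in_energy and self._rmsd(coords, geom) < self.rmsd_threshold:
                return index
        self.geometries.append(coords)
        self.energies.append(energy)
        return None


conformers = ToyConformerSet()
first = conformers.add_conformer([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], energy=0.00)
second = conformers.add_conformer([(0.0, 0.0, 0.0), (0.0, 1.5, 0.0)], energy=1.20)
repeat = conformers.add_conformer([(0.0, 0.0, 0.0), (1.01, 0.0, 0.0)], energy=0.01)
print(first, second, repeat)
```

Because a duplicate of the first conformer yields index 0, callers following this convention should test the result with `is None` rather than relying on truthiness.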
-
get_diffs_for_candidate
(coords, energy, iconf=None)¶ Find out how much the values in the candidate molecule differ from each conformer
coords
– Coordinate array for the candidate conformer
energy
– Energy of the candidate conformer (kcal/mol)
iconf
– Optional: A single conformer index to compare the candidate with (default is to compare to all)
-
_compute_inverse_size
(coords)¶ Compute all distances in this molecule object
Note: This is a CREST distance matrix thing
-
_copy_state
()¶ Copy only the state of the conformerset to the new conformerset (not the conformers)
-
_get_family
()¶ Find the names of the family of classes to which self belongs
-
_handle_duplicates
(data, reorder, check_for_duplicates)¶ Find any duplicate/rotamer conformer, and make swap if energy is lower
-
_log_all_reading_errors
(bad_candidates=None, nstart=0, indices=None)¶ Log all reading errors that were stored
-
_log_candidate_info
(nstart, bad_candidates, duplicates, indices=None)¶ Log info on duplicates and errors for candidates
indices
– A list of candidate indices for the conformers that were just added.
Note: indices should only be passed if nstart == 0. They only correspond to the added conformers, so if some conformers were already present, multiple conformers would end up with the same index.
-
_log_energy_filtering
(high_energies, max_energy)¶ Do some logging about high energy conformers that were filtered out
-
_log_reading_errors
()¶ Log the errors from reading
-
add_conformers
(geometries, energies, max_energy=None, log_data=True)¶ Add a set of conformers and energies
-
clear
()¶ Remove all conformers
-
convert_key_to_camelcase
(key)¶ Convert the key to json format
-
convert_key_to_underscore_case
(key)¶ Revert back to original value, and see if it works
-
convert_keys_to_camelcase
(settings)¶ Convert all keys to CamelCase (except the last json ones like _type, _default, etc)
-
convert_keys_to_underscore_case
(settings)¶ Revert all the keys to the Python settings
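The key conversion between Python-style and JSON-style names could look roughly like the sketch below. The helper names and the exact rules are assumptions for illustration; in particular, keys with a leading underscore (such as _type or _default) are simply left untouched here.

```python
import re


def to_camelcase(key):
    """Convert an underscore_case key to CamelCase (illustrative rules only)."""
    if key.startswith("_"):
        # JSON bookkeeping keys such as _type or _default are left as they are.
        return key
    return "".join(part.capitalize() for part in key.split("_") if part)


def to_underscore_case(key):
    """Convert a CamelCase key back to underscore_case."""
    if key.startswith("_"):
        return key
    # Insert an underscore before every capital that is not at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()


for name in ("energy_threshold", "check_for_duplicates", "_type"):
    camel = to_camelcase(name)
    print(name, "->", camel, "->", to_underscore_case(camel))
```

In this sketch the round trip is only lossless for keys without consecutive capitals; a key written as `RMSD` would come back as `r_m_s_d`.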
-
copy
()¶ Copy the conformer set
-
static
create_json_entry
(typename, value, unique=True, choices=None, include=None)¶ Create a single entry
-
filter
(max_energy=None)¶ Filter all conformers again, possibly with a maximum allowed (relative) energy
-
find_clusters
(dist=5.0, criterion='maxclust', method='average', indices=None)¶ Assign all conformers to clusters
dist
– Either the max number of clusters (for maxclust), or the maximum distance between clusters (for distance)
criterion
– Determines how many clusters to make (maxclust or distance).
indices
– A tuple with as elements lists of indices for subsets of conformers
Note
Uses scipy’s fcluster method
-
find_nth_conformer
(i)¶ Find the index of the n-th conformer added (indices start at 0)
-
fit
()¶ Fit the lowest energy conformer onto the reference molecule. Fit all other conformers onto the lowest one in the set.
-
classmethod
from_rdkitmol
(rdmol, energies=None, reorder=None)¶ Get all the conformers from the RDKit molecule
-
generate
(method=None, nproc=1)¶ Generate conformers using the specified method
Note
Adjusts self
method
– A string, one of ['crest', 'rdkit', 'torsion'], or None to use a previously set generator
nproc
– Number of processors used in total (only used if set_generator was not called)
-
get_all_energies
()¶ Get all the energies in the set
-
get_all_geometries
()¶ Get all the geometries in the set
-
get_all_rmsds
()¶ Get the RMSD value from the lowest energy conformer for all conformers
-
get_conformers
()¶ Returns the conformers as a list of molecules
-
get_dendrogram
(method='average')¶ Gets a dendrogram reflecting the distances between conformers
Note
Uses scipy’s fcluster method
-
get_energies
()¶ Returns the energies in reference to the most stable
-
get_json_settings
()¶ Convert the settings object in self to a json style settings object
-
get_molecule
(i)¶ Return a molecule object for conformer i
-
get_plot_dendrogram
(dend, names=None, fontsize=4)¶ Makes a plot of the dendrogram
-
get_rdkitmol
()¶ Convert to RDKit molecule
-
get_rmsds_from_frame
(frame)¶ Get all RMSDs from a certain frame
-
handle_input
(inp, names, handle_nested_objects)¶ Get the settings from the input, convert to underscore case, and insert them
inp
– Input object
names
– The names (self._name) of all the classes in the family of classes of self
handle_nested_objects
– Boolean specifying if there can be encapsulated objects of the same family, for which settings are needed
-
indices_to_names
(indices1, indices2, name1='a', name2='b')¶ Convert two sets of indices to names for the conformers in self
Note
Mostly for use related to clustering features
Only works with two sets of indices.
All indices need to be represented by these two lists
-
property
molecule
¶ Return the main molecule object that contains the bonds etc
-
optimize
(convergence_level, optimizer=None, max_energy=None, engine_settings=None, nproc=1, name='go', verbose=False)¶ (Re)-Optimize the conformers currently in the set
convergence_level
– One of the convergence options ('Normal', 'Good', 'VeryGood', 'Excellent')
optimizer
– Instance of the ConformerOptimizer class. If not provided, an engine_settings object is required.
engine_settings
– PLAMS Settings object:
>>> engine_settings = Settings()
>>> engine_settings.DFTB.Model = 'GFN1-xTB'
-
pass_settings
(settings, encaps_settings=False)¶ Set the settings object into self
-
prepare_state
(mol: scm.plams.mol.molecule.Molecule)¶ Set up all the molecule data
-
read
(dirname='.', name='conformers', enfilename=None, reorder=None, filetype=None, read_rotamers=False)¶ Read a conformer set from the specified directory in DCD format
-
read_settings
(inp, names)¶ Get the settings for the conformers object from the input in the required format
inp
– Input object
names
– The classnames of all the classes in the family of classes of obj
Note: This is complicated by the fact that a single family of classes may be spread over multiple input blocks.
-
remove_conformer
(index)¶ Remove a conformer from the set
-
remove_high_energy
(max_energy)¶ Remove all high energy conformers
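The idea of an energy cutoff can be sketched with plain lists. The sketch assumes that max_energy is measured relative to the lowest-energy conformer in the set, as suggested by the description of filter(); the standalone function below is hypothetical and not the method itself.

```python
def remove_high_energy(geometries, energies, max_energy):
    """Keep only conformers within max_energy of the most stable one."""
    lowest = min(energies)
    keep = [i for i, e in enumerate(energies) if e - lowest <= max_energy]
    return [geometries[i] for i in keep], [energies[i] for i in keep]


# Placeholder labels stand in for coordinate arrays.
geoms = ["conf_a", "conf_b", "conf_c"]
energies = [-10.0, -9.5, -4.0]

kept_geoms, kept_energies = remove_high_energy(geoms, energies, max_energy=5.0)
print(kept_geoms, kept_energies)
```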
-
remove_non_minima
(save_rejected_to_file=False, rejected_filename='rejected_non_minima_conformers.xyz')¶ Perform PES point characterizations for all conformers and remove the ones that are not local minima. If save_rejected_to_file is true, rejected non-minimum conformers are saved to the file rejected_filename.
-
reorder
()¶ Reorder conformers from smallest to largest energy
-
property
rmsds
¶ Get the RMSD value from the lowest energy conformer for all conformers
-
score
(engine_settings, nproc=1, watch=True)¶ Re-score the conformers according to the energy of a single-point calculation with the specified engine settings. This method does not change the geometry of the conformers; it only computes the energy with the given engine and re-sorts them.
-
set_blocknames
(blocknames)¶ Provide the relevant blocknames for the family of classes
blocknames
– List of strings representing the input blocks relevant for this family of classes
-
set_energies
(energies)¶ Set the energies of the conformers
-
set_generator
(method, engine_settings=None, nproc=1, max_energy=None)¶ Store a generator object
Note
Overwrites previous generator object
method
– A string, one of ['crest', 'rdkit', 'torsion', 'annealing']
engine_settings
– PLAMS Settings object:
>>> engine_settings = Settings()
>>> engine_settings.DFTB.Model = 'GFN1-xTB'
nproc
– Number of processors used in total
-
property
subset_indices
¶ Return the indices of any subsets that were added in the past
-
to_json
()¶ Create a json settings object from self.settings
-
write
(dirname='.', name='conformers', filetype='rkf', write_rotamers=True)¶ Write the conformers to file
2.1.2. UniqueConformersTFD¶
A class holding the conformers of a molecule, using the torsion fingerprint difference distance (TFD) to recognize and filter out duplicates.
-
class
UniqueConformersTFD
(energy_threshold=0.05, tfd_threshold=0.05)¶ Class representing a set of unique conformers
An instance of this class has the following attributes:
molecule
– A PLAMS molecule object defining the connection data of the molecule
rdmol
– RDKit molecule object without conformers
geometries
– A list containing the coordinates of all conformers in the set
energies
– A list containing the energies of all conformers in the set
generator
– A conformer generator object. Has to be set with set_generator(). The default generator is of the CRESTGenerator type.
settings
– User definable settings:
* check_for_duplicates – Only accept new conformer if candidate is not a duplicate
* accept_isomers – Don’t reject isomers (default is to reject them)
* accept_all – Accept any candidate in the set without checks
A simple example of (parallel) use:
>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersTFD
>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersTFD()
>>> conformers.prepare_state(mol)
>>> # Set up PLAMS settings
>>> init()
>>> # Create the generator and run
>>> conformers.generate(nproc=4)
>>> finish()
>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()
Note
The default generator for this conformer class is the RDKitGenerator, using the GFN1-xTB engine. This will generally take a lot of time. To speed things up, set a different generator prior to running generate():
>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='rdkit', engine_settings=engine, nproc=4)
-
__init__
(energy_threshold=0.05, tfd_threshold=0.05)¶ Creates an instance of the conformer class
energy_threshold
– The energy difference above which conformers are always considered unique (kcal/mol).
tfd_threshold
– Torsion fingerprint difference (TFD) below which conformers are considered duplicates (unitless)
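A simplified, unweighted version of a torsion fingerprint difference is sketched below to show why the quantity is unitless. It is an illustration only: the helper names are hypothetical, and the topological weighting and symmetry handling of a full TFD are left out.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def torsion_fingerprint_difference(torsions_a, torsions_b):
    """Unweighted TFD: mean torsion deviation, normalized by the 180 degree maximum."""
    if len(torsions_a) != len(torsions_b):
        raise ValueError("both conformers need the same number of torsions")
    deviations = [
        angular_difference(a, b) / 180.0 for a, b in zip(torsions_a, torsions_b)
    ]
    return sum(deviations) / len(deviations)


reference = [60.0, 180.0]
candidate = [65.0, 178.0]
tfd = torsion_fingerprint_difference(reference, candidate)
print(f"TFD = {tfd:.4f}, below 0.05: {tfd < 0.05}")
```

Each torsion contributes a value between 0 and 1, so the average is also between 0 and 1 and can be compared directly with a threshold such as 0.05.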
-
prepare_state
(mol)¶ Set up all the molecule data
mol
– PLAMS Molecule object
-
property
rdmol
¶ Return the RDKit molecule object (which does not contain the conformers)
-
add_conformer
(coords, energy, reorder=None, log_errors=True)¶ Adds the new coordinates to the list of conformers, if they are not duplicates
coords
– A coordinate array for the candidate conformer
energy
– The energy of the candidate conformer
reorder
– Boolean specifying if the conformers should be ordered based on energy after addition of candidate
Note
If the conformer is not unique, this returns the index of its duplicate. If it is unique, this returns None.
-
get_diffs_for_candidate
(coords, energy, iconf=None)¶ Find out how much the values in the candidate molecule differ from each conformer
coords
– Coordinate array for the candidate conformer
energy
– Energy of the candidate conformer (kcal/mol)
iconf
– Optional: A single conformer index to compare the candidate with (default is to compare to all)
-
get_torsion_atoms
()¶ Returns all the torsion atoms involved in the TFD
Note
Each contribution is a list of sets of four atoms. Mostly the list has only one entry, but in case of symmetry, more sets of 4 atoms can contribute to a single torsion value.
-
get_torsion_values
(iconf)¶ Get the values of all the torsion angles for this conformer
Note
Each contribution is a list of torsion angles. Mostly the list has only one entry, but in the case of symmetry, or rings, several torsion angles contribute to a single TFP value.
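For reference, a single torsion angle can be computed from four atom positions as in the sketch below. This is a standard textbook construction written with the standard library only; it is not taken from the scm.conformers source, and the sign convention is an assumption.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees (-180 to 180) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
gauche = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, trans, gauche)
```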
2.1.3. UniqueConformersRMSD¶
A class holding the conformers of a molecule, using only RMSD to recognize and filter out duplicates.
-
class
UniqueConformersRMSD
(energy_threshold=0.05, rmsd_threshold=0.125)¶ Class representing a set of unique conformers
An instance of this class has the following attributes:
molecule
– A PLAMS molecule object defining the connection data of the molecule
geometries
– A list containing the coordinates of all conformers in the set
energies
– A list containing the energies of all conformers in the set
generator
– A conformer generator object. Has to be set with set_generator(). The default generator is of the CRESTGenerator type.
settings
– User definable settings:
* check_for_duplicates – Only accept new conformer if candidate is not a duplicate
* accept_isomers – Don’t reject isomers (default is to reject them)
* accept_all – Accept any candidate in the set without checks
A simple example of (parallel) use:
>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersRMSD
>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersRMSD()
>>> conformers.prepare_state(mol)
>>> # Set up PLAMS settings
>>> init()
>>> # Create the generator and run
>>> conformers.generate(nproc=4)
>>> finish()
>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()
Note
The default generator for this conformer class is the RDKitGenerator, using the UFF engine.
-
__init__
(energy_threshold=0.05, rmsd_threshold=0.125)¶ Creates an instance of the conformer class
energy_threshold
– The energy difference above which conformers are always considered unique (kcal/mol).
rmsd_threshold
– RMSD below which conformers are considered duplicates (Angstrom).
-
prepare_state
(mol)¶ Set up all the molecule data
mol
– PLAMS Molecule object
-
add_conformer
(coords, energy, reorder=None, log_errors=True)¶ Adds a conformer to the list if requirements are met
Note
Adds every conformer
-
get_diffs_for_candidate
(coords, energy, iconf=None)¶ Find out how much the values in the candidate molecule differ from each conformer
coords
– Coordinate array for the candidate conformer
energy
– Energy of the candidate conformer (kcal/mol)
iconf
– Optional: A single conformer index to compare the candidate with (default is to compare to all)
-
_get_trimmed_rdmol
(coords)¶ Get the trimmed RDKit Molecule from the full coordinates
-
_get_trimmed_rdmol_from_conf
(conf)¶ Get the trimmed RDKit Molecule from the trimmed conformer
2.1.4. UniqueConformersAMS¶
A class holding the conformers of a molecule, using distance matrices and torsion angles to recognize and filter out duplicates.
-
class
UniqueConformersAMS
(energy_threshold=0.2, dihedral_threshold=30.0, distance_threshold=0.1)¶ Class representing a set of unique conformers
An instance of this class has the following attributes:
molecule
– A PLAMS molecule object defining the connection data of the molecule
geometries
– A list containing the coordinates of all conformers in the set
energies
– A list containing the energies of all conformers in the set
rotamers
– A list with UniqueConformersAMS objects representing the rotamer-set for each conformer
generator
– A conformer generator object. Has to be set with set_generator(). The default generator is of the CrestGenerator type.
settings
– All the user definable settings:
* check_for_duplicates – Only accept new conformer if candidate is not a duplicate
* accept_isomers – Don’t reject isomers (default is to reject them)
* accept_all – Accept any candidate in the set without checks
A simple example of (parallel) use:
>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersAMS
>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersAMS()
>>> conformers.prepare_state(mol)
>>> # Set up PLAMS settings
>>> init()
>>> # Create the generator and run
>>> conformers.generate(nproc=4)
>>> finish()
>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()
The default generator for this conformer class is the RDKitGenerator. A list of all possible generators:
* RDKitGenerator
* TorsionGenerator
* CrestGenerator
By default the RDKitGenerator uses the UFF engine. To select a different engine, set a different generator prior to running generate():
>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='rdkit', engine_settings=engine, nproc=4)
The RDKitGenerator first uses RDKit to generate an initial set of conformer geometries. These are then subjected to geometry optimization using an AMS engine, after which duplicates are filtered out. By default, the RDKitGenerator determines the number of initial conformers based on the number of rotatable bonds in the system. For a large molecule, this will result in a very large number of conformers. To set the number of initial conformers by hand, use:
>>> conformers.set_generator(method='rdkit', nproc=4)
>>> conformers.generator.set_number_initial_conformers(100)
>>> print('Initial number of conformers: ', conformers.generator.settings.ngeoms)
-
__init__
(energy_threshold=0.2, dihedral_threshold=30.0, distance_threshold=0.1)¶ Creates an instance of the conformer class
energy_threshold
– The energy difference above which conformers are always considered unique (kcal/mol).
distance_threshold
– Maximum difference a distance between two atoms can have for a conformer to be considered a duplicate.
dihedral_threshold
– Maximum difference a dihedral can have for a conformer to be considered a duplicate.
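The role of the distance and dihedral thresholds can be sketched as follows. This is a simplified illustration under several assumptions: distances are taken to be in Angstrom and dihedrals in degrees, the atom order is identical in both geometries (no permutation mapping is attempted), and the function `is_duplicate` is hypothetical rather than part of the class.

```python
import math


def distance_matrix(coords):
    """All pairwise interatomic distances for one geometry."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def angular_difference(a, b):
    """Smallest absolute difference between two angles in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def is_duplicate(coords_a, coords_b, dihedrals_a, dihedrals_b, delta_e,
                 energy_threshold=0.2, dihedral_threshold=30.0,
                 distance_threshold=0.1):
    """True if every distance and dihedral agrees within its threshold."""
    if abs(delta_e) > energy_threshold:
        return False
    dist_a, dist_b = distance_matrix(coords_a), distance_matrix(coords_b)
    max_distance_diff = max(
        abs(x - y)
        for row_a, row_b in zip(dist_a, dist_b)
        for x, y in zip(row_a, row_b)
    )
    max_dihedral_diff = max(
        (angular_difference(a, b) for a, b in zip(dihedrals_a, dihedrals_b)),
        default=0.0,
    )
    return (max_distance_diff <= distance_threshold
            and max_dihedral_diff <= dihedral_threshold)


base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
nudged = base[:3] + [(2.05, 1.0, 0.0)]
moved = base[:3] + [(2.5, 1.0, 0.0)]

print(is_duplicate(base, nudged, [180.0], [175.0], delta_e=0.05))
print(is_duplicate(base, moved, [180.0], [175.0], delta_e=0.05))
```

Comparing the largest single deviation, rather than an average, means one strongly displaced atom or torsion is enough to keep two geometries apart.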
-
prepare_state
(mol)¶ Set up all the molecule data
mol
– A PLAMS Molecule object
-
add_conformer
(coords, energy, reorder=None, log_errors=True)¶ Adds the new coordinates to the list of conformers, if they are not duplicates
coords
– A coordinate array for the candidate conformer
energy
– The energy of the candidate conformer
reorder
– Boolean specifying if the conformers should be ordered based on energy after addition of candidate
Note
If the conformer is not unique, this method returns the index of its duplicate. If it is unique, this returns None.
-
get_diffs_for_candidate
(coords, energy=0.0, iconf=None)¶ Find out how much the values in the candidate molecule differ from each conformer
coords
– Coordinate array for the candidate conformer
energy
– Energy of the candidate conformer (kcal/mol)
iconf
– Optional: A single conformer index to compare the candidate with (default is to compare to all)
-
get_group_indices
()¶ Change format of self._groups.
-
get_cost
(mapping, j, distance_matrix, dihedral_values=None, angle_values=None)¶ Compute max cost of dihedrals and distances.
-
get_partial_cost
(i, j, mapping, jconf, distance_matrix, dihedral_values=None, angle_values=None)¶ Compare all already mapped distances from this atom (i/j). Only the distances from i/j are needed, because the rest was not affected by this particular swap.
The distances already mapped will no longer change. So, if a maximum value here is larger than any in the best option, we are not on the correct path.
-
create_tree
(start=0)¶ Create a tree for this molecular graph
-
set_partial_permutator_checks
(value)¶ Decide if the permutator should check partial values to save time
-
_compare_dihedrals_to_conformer_slow
(j, indices, dihedral_values, angle_values)¶ Find difference in dihedrals (or angles) between conformer j and new candidate
-
_get_value_indices
(mapping, groups_list)¶ Get the correct values (of dihedrals/angles) after mapping with indices
Note: Mapping may contain -1, for unmapped atoms
-
_get_dihedral_diffs_slow
(j, indices, dihedral_values, angle_values)¶ Get all the differences in dihedral angles