RDKit interface¶
RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. PLAMS interface to RDKit originates from QMFlows project and features functions for generating PLAMS molecules from string representations (SMARTS, SMILES) as well as a handful of tools for dealing with proteins and PDB files.
-
from_rdmol
(rdkit_mol, confid=-1, properties=True)[source]¶ Translate an RDKit molecule into a PLAMS molecule type. RDKit properties will be unpickled if their name ends with ‘_pickled’.
Parameters: Returns: a PLAMS molecule
Return type:
-
to_rdmol
(plams_mol, sanitize=True, properties=True, assignChirality=False)[source]¶ Translate a PLAMS molecule into an RDKit molecule type. PLAMS
Molecule
,Atom
orBond
properties are pickled if they are neither booleans, floats, integers, floats nor strings, the resulting property names are appended with ‘_pickled’.Parameters: - plams_mol (
Molecule
) – A PLAMS molecule - sanitize (bool) – Kekulize, check valencies, set aromaticity, conjugation and hybridization
- properties (bool) – If all
Molecule
,Atom
andBond
properties should be converted from PLAMS to RDKit format. - assignChirality (bool) – Assign R/S and cis/trans information, insofar as this was not yet present in the PLAMS molecule.
Returns: an RDKit molecule
Return type: rdkit.Chem.Mol
- plams_mol (
-
from_smiles
(smiles, nconfs=1, name=None, forcefield=None, rms=0.1)[source]¶ Generates PLAMS molecule(s) from a smiles strings.
Parameters: - smiles (str) – A smiles string
- nconfs (int) – Number of conformers to be generated
- name (str) – A name for the molecule
- forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
- rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations
Returns: A molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1
Return type: Molecule
or list of PLAMS Molecules
-
from_smarts
(smarts, nconfs=1, name=None, forcefield=None, rms=0.1)[source]¶ Generates PLAMS molecule(s) from a smarts strings. This allows for example to define hydrogens explicitly. However it is less suitable for aromatic molecules (use from_smiles in that case).
Parameters: - smarts (str) – A smarts string
- nconfs (int) – Number of conformers to be generated
- name (str) – A name for the molecule
- forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
- rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations.
Returns: A molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1
Return type: Molecule
or list of PLAMS Molecules
-
get_conformations
(mol, nconfs=1, name=None, forcefield=None, rms=-1, enforceChirality=False)[source]¶ Generates 3D conformation(s) for an rdkit_mol or a PLAMS Molecule
Parameters: - mol (rdkit.Chem.Mol or
Molecule
) – RDKit or PLAMS Molecule - nconfs (int) – Number of conformers to be generated
- name (str) – A name for the molecule
- forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step
- rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations.
- enforceChirality (bool) – Enforce the correct chirality if chiral centers are present
Returns: A molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1
Return type: Molecule
or list of PLAMS Molecules- mol (rdkit.Chem.Mol or
-
from_sequence
(sequence, nconfs=1, name=None, forcefield=None, rms=0.1)[source]¶ Generates PLAMS molecule from a peptide sequence. Includes explicit hydrogens and 3D coordinates.
Parameters: - sequence (str) – A peptide sequence, e.g. ‘HAG’
- nconfs (int) – Number of conformers to be generated
- name (str) – A name for the molecule
- forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
- rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations.
Returns: A peptide molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1
Return type: Molecule
or list of PLAMS Molecules
-
modify_atom
(mol, idx, element)[source]¶ Change atom “idx” in molecule “mol” to “element” and add or remove hydrogens accordingly
Parameters: Returns: Molecule with new element and possibly added or removed hydrogens
Return type:
-
apply_template
(mol, template)[source]¶ Modifies bond orders in PLAMS molecule according template smiles structure.
Parameters: Returns: Molecule with correct chemical structure and provided 3D coordinates
Return type:
-
apply_reaction_smarts
(mol, reaction_smarts, complete=False, forcefield=None, return_rdmol=False)[source]¶ Applies reaction smirks and returns product. If returned as a PLAMS molecule, thismolecule.properties.orig_atoms is a list of indices of atoms that have not been changed (which can for example be used partially optimize new atoms only with the freeze keyword)
Parameters: - mol (
Molecule
or rdkit.Chem.Mol) – molecule to be modified - reactions_smarts (str) – Reactions smarts to be applied to molecule
- complete (bool or float (value between 0 and 1)) – Apply reaction until no further changes occur or given fraction of reaction centers have been modified
- forcefield (str) – Specify ‘uff’ or ‘mmff’ to apply forcefield based geometry optimization of product structures.
- return_rdmol (bool) – return a RDKit molecule if true, otherwise a PLAMS molecule
Returns: (product molecule, list of unchanged atoms)
Return type: (
Molecule
, list of int)- mol (
-
readpdb
(pdb_file, removeHs=False, proximityBonding=False, return_rdmol=False)[source]¶ Generate a molecule from a PDB file
Parameters: Returns: The molecule
Return type: Molecule
or rdkit.Chem.Mol
-
writepdb
(mol, pdb_file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]¶ Write a PDB file from a molecule
Parameters:
-
add_Hs
(mol, forcefield=None, return_rdmol=False)[source]¶ Add hydrogens to protein molecules read from PDB. Makes sure that the hydrogens get the correct PDBResidue info.
Parameters: Returns: A molecule with explicit hydrogens added
Return type: Molecule
or rdkit.Chem.Mol
-
partition_protein
(mol, residue_bonds=None, split_heteroatoms=True, return_rdmol=False)[source]¶ Splits a protein molecule into capped amino acid fragments and caps.
Parameters: - mol (
Molecule
or rdkit.Chem.Mol) – A protein molecule - residue_bonds (tuple) – a tuple of pairs of residue number indicating which peptide bonds to split. If none, split all peptide bonds.
- split_heteroatoms (bool) – if True, all bonds between a heteroatom and a non-heteroatom across residues are removed
Returns: list of fragments, list of caps
- mol (
-
get_backbone_atoms
(mol)[source]¶ Return a list of atom indices corresponding to the backbone atoms in a peptide molecule. This function assumes PDB information in properties.pdb_info of each atom, which is the case if the molecule is generated with the “readpdb” or “from_sequence” functions.
Parameters: mol ( Molecule
or rdkit.Chem.Mol) – a peptide moleculeReturns: a list of atom indices Return type: list
-
get_substructure
(mol, func_list)[source]¶ Search for functional groups within a molecule based on a list of reference functional groups. SMILES strings, PLAMS and/or RDKit molecules can be used interchangeably in “func_list”.
Example:
>>> mol = from_smiles('OCCO') # Ethylene glycol >>> func_list = ['[H]O', 'C[N+]', 'O=PO'] >>> get_substructure(mol, func_list) {'[H]O': [(<scm.plams.mol.atom.Atom at 0x125183518>, <scm.plams.mol.atom.Atom at 0x1251836a0>), (<scm.plams.mol.atom.Atom at 0x125183550>, <scm.plams.mol.atom.Atom at 0x125183240>)]}
Parameters: Returns: A dictionary with functional groups from “func_list” as keys and a list of n-tuples with matching PLAMS
Atom
as values.