RKF trajectory files

class RKFTrajectoryFile(filename, mode='rb', fileobject=None, ntap=None)[source]

Class representing an RKF file containing a molecular trajectory

An instance of this class has the following attributes:

  • file_object – A PLAMS KFFile object, referring to the actual RKF file

  • position – The frame to which the cursor is currently pointing in the RKF file

  • mode – Designates whether the file is in read or write mode (‘rb’ or ‘wb’)

  • ntap – The number of atoms in the molecular system (needs to be constant throughout)

  • elements – The elements of the atoms in the system (needs to be constant throughout)

  • conect – The connectivity information of the current frame

  • mddata – Read mode only: A dictionary containing data from the MDHistory section in the RKF file

  • read_lattice – Read mode only: Whether the lattice vectors will be read from the file

  • read_bonds – Whether the connectivity information will be read from the file

  • saving_freq – How often the ‘wb’ file is written (default: only when close() is called)

An RKFTrajectoryFile object behaves much like a regular file object. It has read and write methods (read_next() and write_next()) that read from and write to the position of the cursor in the file_object attribute. If the file is in read mode, an additional method read_frame() can be used to move the cursor to any frame in the file and read from there. Memory use is kept to a minimum, as only information from the latest frame is ever stored.

Reading and writing to and from the files can be done as follows:

>>> from scm.plams import RKFTrajectoryFile

>>> rkf = RKFTrajectoryFile('ams.rkf')
>>> mol = rkf.get_plamsmol()

>>> rkfout = RKFTrajectoryFile('new.rkf',mode='wb')

>>> for i in range(rkf.get_length()) :
>>>     crd,cell = rkf.read_frame(i,molecule=mol)
>>>     rkfout.write_next(molecule=mol)
>>> rkfout.close()

The above script reads information from the RKF file ams.rkf into the Molecule object mol in a step-by-step manner. The Molecule object is then passed to the write_next() method of the new RKFTrajectoryFile object corresponding to the new rkf file new.rkf.

The exact same result can also be achieved by calling the instance with the Molecule object and iterating over the result:

>>> rkf = RKFTrajectoryFile('ams.rkf')
>>> mol = rkf.get_plamsmol()
>>> rkfout = RKFTrajectoryFile('new.rkf',mode='wb')
>>> for crd,cell in rkf(mol) :
>>>     rkfout.write_next(molecule=mol)
>>> rkfout.close()

This procedure requires all coordinate information to be passed to and from the Molecule object for each frame, which can be time-consuming. Some time can be saved by bypassing the Molecule object:

>>> rkf = RKFTrajectoryFile('ams.rkf')

>>> rkfout = RKFTrajectoryFile('new.rkf',mode='wb')
>>> rkfout.set_elements(rkf.get_elements())

>>> for crd,cell in rkf :
>>>     rkfout.write_next(coords=crd,cell=cell,conect=rkf.conect)
>>> rkfout.close()

The only mandatory argument to the write_next() method is coords. Further time can be saved by setting the read_lattice and read_bonds variables to False.
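The last point can be sketched as follows. This is a minimal sketch, assuming that read_lattice and read_bonds can simply be assigned on the reader before iteration starts (the text above calls them variables). The helper name copy_coordinates_only is hypothetical; it accepts any reader and writer exposing the methods used in the examples above, so the RKFTrajectoryFile objects are created by the caller:

```python
def copy_coordinates_only(reader, writer):
    """Copy all frames from reader to writer, skipping lattice and bonds.

    reader and writer are expected to behave like RKFTrajectoryFile
    objects opened in 'rb' and 'wb' mode respectively (assumption).
    """
    # Switch off the optional reads before the first frame is requested
    reader.read_lattice = False
    reader.read_bonds = False

    # The elements are constant throughout, so they are set only once
    writer.set_elements(reader.get_elements())

    nframes = 0
    for crd, _cell in reader:
        # Only the mandatory coords argument is passed on
        writer.write_next(coords=crd)
        nframes += 1
    writer.close()
    return nframes
```

With the files from the examples above, this would be called as copy_coordinates_only(RKFTrajectoryFile('ams.rkf'), RKFTrajectoryFile('new.rkf', mode='wb')).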

By default the write mode will create a minimal version of the RKF file, containing only elements, coordinates, lattice, and connectivity information. This minimal file format can be read by AMSMovie.

It is possible to store additional information, such as energies, velocities, and charges. To enable this, the method store_mddata() needs to be called after creation, and a dictionary of mddata needs to be passed to the write_next() method. The AMS trajectory analysis tools can then be used on the file. Restarting an MD run from such a file is currently not possible. For example:

>>> rkf = RKFTrajectoryFile('ams.rkf')
>>> rkf.store_mddata()
>>> mol = rkf.get_plamsmol()

>>> rkf_out = RKFTrajectoryFile('new.rkf',mode='wb')
>>> rkf_out.store_mddata(rkf)

>>> for i in range(rkf.get_length()) :
>>>     crd,cell = rkf.read_frame(i,molecule=mol)
>>>     rkf_out.write_next(molecule=mol,mddata=rkf.mddata)
>>> rkf_out.close()

__init__(filename, mode='rb', fileobject=None, ntap=None)[source]

Initializes an RKFTrajectoryFile object

  • filename – The path to the RKF file

  • mode – The mode in which to open the RKF file (‘rb’ or ‘wb’)

  • fileobject – Optionally, a file object can be passed instead (filename needs to be set to None)

  • ntap – If the file is in write mode, the number of atoms needs to be passed here

store_mddata(rkf=None)[source]

Read/write an MDHistory section

  • rkf – In write mode, an RKFTrajectoryFile object in read mode needs to be passed, from which the unit info is extracted

store_historydata()[source]

Read/write non-standard entries in the History section

close(override_molecule_section_with_last_frame=True)[source]

Execute all prior commands and cleanly close and garbage collect the RKF file

_rewrite_molecule()[source]

Overwrite the molecule section with the latest frame

_set_mddata_items()[source]

Get all the items for the mddata, if those are to be read

_move_cursor_to_append_pos()[source]

Get the instance ready for appending

_update_celldata(cell)[source]

Use the newly supplied cell to update the dimensionality of the system

get_plamsmol()[source]

Extracts a PLAMS molecule object from the RKF file

read_frame(i, molecule=None)[source]

Reads the relevant info from frame i and returns it, or stores it in molecule

  • i – The frame number to be read from the RKF file

  • molecule – Molecule object in which the new coordinates need to be stored

get_regular_connection_table()[source]

Get the connection table without the bond orders
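The conect argument of write_next() is documented with the example {1:[2],2:[1]}, which is a table without bond orders. As a sketch of how such a dictionary can be assembled by hand, the hypothetical helper bonds_to_conect below converts a list of bonded atom pairs into that form. Whether get_regular_connection_table() returns exactly this layout is an assumption:

```python
def bonds_to_conect(bonds):
    """Turn (i, j) atom-number pairs into {i: [j, ...], j: [i, ...]}.

    Atom numbers start at 1, following the {1:[2],2:[1]} example
    in the write_next() documentation.
    """
    conect = {}
    for i, j in bonds:
        # Every bond is registered on both of its atoms
        conect.setdefault(i, []).append(j)
        conect.setdefault(j, []).append(i)
    return conect


# Water with the oxygen as atom 1 and the hydrogens as atoms 2 and 3
water = bonds_to_conect([(1, 2), (1, 3)])
```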

_store_historydata_for_step(istep)[source]

Store the extra data from the History section

Note: Block format is not used in the History section

read_next(molecule=None, read=True)[source]

Reads coordinates and lattice vectors from the current position of the cursor and returns them

  • molecule – Molecule object in which the new coordinates need to be stored

  • read – If set to False the cursor will move to the next frame without reading
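The read argument makes it possible to skip frames cheaply. The sketch below keeps only every stride-th frame. It assumes that the cursor starts at the first frame and that read_next() returns a (coords, cell) pair, as read_frame() does in the class examples. The helper name read_every_nth is hypothetical:

```python
import copy


def read_every_nth(reader, stride):
    """Return a list of (coords, cell) pairs for every stride-th frame."""
    if stride < 1:
        raise ValueError("stride must be at least 1")

    frames = []
    for i in range(reader.get_length()):
        if i % stride == 0:
            # Copy the result, in case the reader reuses its buffers
            # for the next frame (assumption)
            frames.append(copy.deepcopy(reader.read_next()))
        else:
            # Only advance the cursor, nothing is read from the file
            reader.read_next(read=False)
    return frames
```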

write_next(coords=None, molecule=None, cell=[0.0, 0.0, 0.0], conect=None, historydata=None, mddata=None)[source]

Write frame to next position in trajectory file

  • coords – A list or numpy array of (ntap,3) containing the system coordinates in angstrom

  • molecule – A molecule object to read the molecular data from

  • cell – A set of lattice vectors (or cell diameters for an orthorhombic system) in angstrom

  • conect – A dictionary containing the connectivity info (e.g. {1:[2],2:[1]})

  • historydata – A dictionary containing additional variables to be written to the History section

  • mddata – A dictionary containing the variables to be written to the MDHistory section

The mddata dictionary can contain the following keys: (‘TotalEnergy’, ‘PotentialEnergy’, ‘Step’, ‘Velocities’, ‘KineticEnergy’, ‘Charges’, ‘ConservedEnergy’, ‘Time’, ‘Temperature’)

The historydata dictionary can contain for example: (‘Energy’, ‘Gradients’, ‘StressTensor’). All values must be in atomic units. Numpy arrays or lists of lists will be flattened before they are written to the file.

Note

Either coords or molecule must be provided
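The two dictionaries described above can be prepared with small helpers such as the ones sketched here. The names make_mddata and make_historydata are hypothetical. The list of keys is copied from the text above, and the conversion assumes input energies in eV and gradients in eV/angstrom, which are converted to Hartree and Hartree/Bohr:

```python
# Keys listed in the write_next() documentation for the mddata dictionary
MDDATA_KEYS = (
    'TotalEnergy', 'PotentialEnergy', 'Step', 'Velocities', 'KineticEnergy',
    'Charges', 'ConservedEnergy', 'Time', 'Temperature',
)

HARTREE_IN_EV = 27.211386245988
BOHR_IN_ANGSTROM = 0.529177210903


def make_mddata(**values):
    """Build an mddata dictionary, rejecting keys that are not documented."""
    unknown = sorted(set(values) - set(MDDATA_KEYS))
    if unknown:
        raise ValueError("Unknown mddata keys: %s" % ", ".join(unknown))
    return dict(values)


def make_historydata(energy_ev, gradients_ev_per_angstrom):
    """Build a historydata dictionary with values in atomic units."""
    # eV/angstrom -> Hartree/Bohr
    factor = BOHR_IN_ANGSTROM / HARTREE_IN_EV
    return {
        'Energy': energy_ev / HARTREE_IN_EV,
        'Gradients': [[g * factor for g in row]
                      for row in gradients_ev_per_angstrom],
    }
```

The results would then be passed as write_next(coords=crd, mddata=..., historydata=...).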

_set_energy(mddata, historydata)[source]

Checks whether an energy is passed as input, and if not, sets it to zero

_write_dictionary_to_history(data, section, counter=1)[source]

Add the entries of a dictionary to a History section

_flatten_variable(var)[source]

Make sure that the variable is a Python 1D list (not numpy)
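The flattening step can be pictured with the sketch below. It is an illustration only and not the library's private implementation; in particular, wrapping a scalar in a one-element list is a choice made here. The function name flatten_variable is hypothetical:

```python
def flatten_variable(var):
    """Return var as a flat (1D) Python list."""
    # Numpy arrays and numpy scalars provide tolist()
    if hasattr(var, 'tolist'):
        var = var.tolist()
    # A single value becomes a one-element list (choice made in this sketch)
    if not isinstance(var, (list, tuple)):
        return [var]
    flat = []
    for item in var:
        # Recurse, so that any nesting depth is handled
        flat.extend(flatten_variable(item))
    return flat
```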

rewind(nframes=None)[source]

Rewind the file either by nframes or to the first frame

  • nframes – The number of frames to rewind

get_length()[source]

Get the number of frames in the file

read_last_frame(molecule=None)[source]

Reads the last frame from the file
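When more than the final frame is needed, get_length() and read_frame() can be combined. The sketch below collects the last n frames; it assumes 0-based frame numbers, as in the range(rkf.get_length()) loops of the class examples, and that read_frame() returns a (coords, cell) pair. The helper name read_last_frames is hypothetical:

```python
import copy


def read_last_frames(reader, n):
    """Return (coords, cell) pairs for the last n frames of the file."""
    length = reader.get_length()
    # Never start before the first frame
    start = max(0, length - n)
    # Copy each result, in case the reader reuses its buffers (assumption)
    return [copy.deepcopy(reader.read_frame(i)) for i in range(start, length)]
```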

RKF history files

This subsection describes the API of the RKFHistoryFile class, which can read and write the results from simulations with changing numbers of atoms. The majority of molecular simulations explore a subspace of the canonical, micro-canonical, or isothermal-isobaric ensembles, in which the number of atoms \(N\) remains constant. However, a Grand Canonical Monte Carlo simulation is one of the exceptions in which the number of atoms in the system does change. The RKFTrajectoryFile object cannot read and write the resulting simulation history, and the derived class RKFHistoryFile was developed to handle these atypical trajectories. While the methods in this class will be slower than the ones in the parent class, the API is nearly identical. The only exception is the write_next() method, which has an additional argument elements.

class RKFHistoryFile(filename, mode='rb', fileobject=None, ntap=None)[source]

Class representing an RKF file containing a molecular simulation history with varying numbers of atoms

An instance of this class has the following attributes:

  • file_object – A PLAMS KFFile object, referring to the actual RKF file

  • position – The frame to which the cursor is currently pointing in the RKF file

  • mode – Designates whether the file is in read or write mode (‘rb’ or ‘wb’)

  • elements – The elements of the atoms in the system at the current frame

  • conect – The connectivity information of the current frame

  • mddata – Read mode only: A dictionary containing data from the MDHistory section in the RKF file

  • read_lattice – Read mode only: Whether the lattice vectors will be read from the file

  • read_bonds – Whether the connectivity information will be read from the file

An RKFHistoryFile object behaves much like a regular file object. It has read and write methods (read_next() and write_next()) that read from and write to the position of the cursor in the file_object attribute. If the file is in read mode, an additional method read_frame() can be used to move the cursor to any frame in the file and read from there. Memory use is kept to a minimum, as only information from the latest frame is ever stored.

Reading and writing to and from the files can be done as follows:

>>> from scm.plams import RKFHistoryFile

>>> rkf = RKFHistoryFile('ams.rkf')
>>> mol = rkf.get_plamsmol()

>>> rkfout = RKFHistoryFile('new.rkf',mode='wb')

>>> for i in range(rkf.get_length()) :
>>>     crd,cell = rkf.read_frame(i,molecule=mol)
>>>     rkfout.write_next(molecule=mol)
>>> rkfout.close()

The above script reads information from the RKF file ams.rkf into the Molecule object mol in a step-by-step manner. The Molecule object is then passed to the write_next() method of the new RKFHistoryFile object corresponding to the new rkf file new.rkf.

The exact same result can also be achieved by calling the instance with the Molecule object and iterating over the result:

>>> rkf = RKFHistoryFile('ams.rkf')
>>> mol = rkf.get_plamsmol()
>>> rkfout = RKFHistoryFile('new.rkf',mode='wb')
>>> for crd,cell in rkf(mol) :
>>>     rkfout.write_next(molecule=mol)
>>> rkfout.close()

This procedure requires all coordinate information to be passed to and from the Molecule object for each frame, which can be time-consuming. Some time can be saved by bypassing the Molecule object:

>>> rkf = RKFHistoryFile('ams.rkf')

>>> rkfout = RKFHistoryFile('new.rkf',mode='wb')

>>> for crd,cell in rkf :
>>>     rkfout.write_next(coords=crd,cell=cell,elements=rkf.elements,conect=rkf.conect)
>>> rkfout.close()

When the Molecule object is bypassed, the mandatory arguments to the write_next() method are coords and elements. Further time can be saved by setting the read_lattice and read_bonds variables to False.

By default the write mode will create a minimal version of the RKF file, containing only elements, coordinates, lattice, and connectivity information. This minimal file format can be read by AMSMovie.

If the original RKF file contains an MDHistory section (if it resulted from a MolecularGun simulation) it is possible to store the information from that section and write it to another file. To enable this, the method store_mddata() needs to be called after creation, and a dictionary of mddata needs to be passed to the write_next() method. The AMS trajectory analysis tools can then be used on the file. Restarting an MD run from such a file is currently not possible. For example:

>>> rkf = RKFHistoryFile('ams.rkf')
>>> rkf.store_mddata()
>>> mol = rkf.get_plamsmol()

>>> rkf_out = RKFHistoryFile('new.rkf',mode='wb')
>>> rkf_out.store_mddata(rkf)

>>> for i in range(rkf.get_length()) :
>>>     crd,cell = rkf.read_frame(i,molecule=mol)
>>>     rkf_out.write_next(molecule=mol,mddata=rkf.mddata)
>>> rkf_out.close()

__init__(filename, mode='rb', fileobject=None, ntap=None)[source]

Initializes the RKFHistoryFile object

  • filename – The path to the RKF file

  • mode – The mode in which to open the RKF file (‘rb’ or ‘wb’)

  • fileobject – Optionally, a file object can be passed instead (filename needs to be set to None)

  • ntap – If the file is in write mode, the number of atoms can be passed here

get_plamsmol()[source]

Extracts a PLAMS molecule object from the RKF file

_rewrite_molecule()[source]

Overwrite the molecule section with the latest frame (called in close())

_correct_chemical_system(elements, prev_elements, added_atoms, removed_atoms)[source]

Check if the referenced chemical system is correct, and if not, find one matching added/removed atoms

write_next(coords=None, molecule=None, elements=None, cell=[0.0, 0.0, 0.0], conect=None, historydata=None, mddata=None)[source]

Write frame to next position in trajectory file

  • coords – A list or numpy array of (ntap,3) containing the system coordinates

  • molecule – A molecule object to read the molecular data from

  • elements – The element symbols of the atoms in the system

  • cell – A set of lattice vectors (or cell diameters for an orthorhombic system)

  • conect – A dictionary containing the connectivity info (e.g. {1:[2],2:[1]})

  • historydata – A dictionary containing additional variables to be written to the History section

  • mddata – A dictionary containing the variables to be written to the MDHistory section

The mddata dictionary can contain the following keys: (‘TotalEnergy’, ‘PotentialEnergy’, ‘Step’, ‘Velocities’, ‘KineticEnergy’, ‘Charges’, ‘ConservedEnergy’, ‘Time’, ‘Temperature’)

The historydata dictionary can contain for example: (‘Energy’, ‘Gradients’, ‘StressTensor’). All values must be in atomic units. Numpy arrays or lists of lists will be flattened before they are written to the file.

Note

Either coords together with elements, or molecule, must be provided
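Writing a history in which the number of atoms changes from frame to frame can be sketched as follows. The helper name write_variable_frames is hypothetical, and writer is expected to behave like an RKFHistoryFile opened with mode='wb'. Only the documented coords and elements arguments are used:

```python
def write_variable_frames(writer, frames):
    """Write frames with varying atom counts and close the file.

    frames is a list of (elements, coords) pairs, where elements is a
    list of element symbols and coords a matching list of xyz triples.
    """
    for elements, coords in frames:
        # Each frame carries its own element list, so check it per frame
        if len(elements) != len(coords):
            raise ValueError("elements and coords differ in length")
        writer.write_next(coords=coords, elements=elements)
    writer.close()
    return len(frames)


# Two frames: a single argon atom, then two argon atoms
example_frames = [
    (['Ar'], [[0.0, 0.0, 0.0]]),
    (['Ar', 'Ar'], [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0]]),
]
```

With a real file this would be called as write_variable_frames(RKFHistoryFile('new.rkf', mode='wb'), example_frames).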

_set_system_version_elements()[source]

Store all chemical systems from the file