KF files

KF is the main format for storing binary data used in all Amsterdam Modeling Suite programs. All TAPEXX and .rkf files are KF files. PLAMS offers a dictionary-like interface to the KF format that allows KF files to be read, written, modified and created efficiently.

class KFFile(path, autosave=True)[source]

A class for reading and writing binary files in KF format.

This class acts as a wrapper around KFReader, collecting all the data written by the user in a “temporary zone” and using the Fortran binaries udmpkf and cpkf to write this data to the physical file when needed.

The constructor argument path should be a string with a path to an existing KF file or to a new KF file that you wish to create. If a path to an existing file is passed, a new KFReader instance is created, which allows all the data to be read from this file.

When the write() method is used, the new data is not immediately written to disk. Instead, it is temporarily stored in the tmpdata dictionary. When the save() method is invoked, the contents of that dictionary are written to the physical file and tmpdata is emptied.

Other methods, like read() or delete_section(), are aware of tmpdata and behave the same whether or not save() has been called.

By default, save() is automatically invoked after each write(), so the physical file on disk is always up to date. This behavior can be adjusted with the autosave constructor parameter. Having autosave enabled is usually a good idea. However, if you need to write a lot of small pieces of data to your file, the overhead of calling udmpkf and cpkf after every write() can lead to significant delays. In such a case it is advised to disable autosave and call save() manually when needed.
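The buffering behaviour described above can be pictured with a small toy model. This is only a sketch and not the PLAMS implementation: the class BufferedStore, its disk dictionary and its flushes counter are hypothetical names introduced for illustration, with flushes standing in for the cost of calling udmpkf and cpkf.

```python
class BufferedStore:
    """Toy stand-in for the write-buffer idea; not the PLAMS implementation."""

    def __init__(self, autosave=True):
        self.autosave = autosave
        self.disk = {}      # plays the role of the physical KF file
        self.tmpdata = {}   # pending writes, keyed by (section, variable)
        self.flushes = 0    # counts how often the expensive save step ran

    def write(self, section, variable, value):
        self.tmpdata[(section, variable)] = value
        if self.autosave:
            self.save()

    def read(self, section, variable):
        key = (section, variable)
        # Pending data takes precedence over what is already on "disk".
        if key in self.tmpdata:
            return self.tmpdata[key]
        return self.disk[key]

    def save(self):
        if self.tmpdata:
            self.disk.update(self.tmpdata)
            self.tmpdata.clear()
            self.flushes += 1


eager = BufferedStore(autosave=True)
lazy = BufferedStore(autosave=False)
for i in range(100):
    eager.write("Data", f"item{i}", i)
    lazy.write("Data", f"item{i}", i)

lazy_value_before_save = lazy.read("Data", "item7")  # served from tmpdata
lazy.save()                                          # one flush for all 100 writes
print(eager.flushes, lazy.flushes)
```

In this model the autosave store flushes once per write, while the manual one flushes once in total, which is the trade-off the autosave parameter controls.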

Dictionary-like bracket notation can be used as a shortcut to read and write variables:

from scm.plams import KFFile

mykf = KFFile('someexistingkffile.kf')
# all three below are equivalent
x = mykf['General%Termination Status']
x = mykf[('General','Termination Status')]
x = mykf.read('General','Termination Status')

# all three below are equivalent
mykf['Geometry%xyz'] = somevariable
mykf[('Geometry','xyz')] = somevariable
mykf.write('Geometry','xyz', somevariable)

__init__(path, autosave=True)[source]

Initialize self. See help(type(self)) for accurate signature.

read(section, variable, return_as_list=False)[source]

Extract and return data for a variable located in a section.

By default, for single-value numerical or boolean variables the returned value is a single number or bool. For longer variables this method returns a list of values. For string variables a single string is returned. This behavior can be changed by setting the return_as_list parameter to True. In that case the returned value is always a list of numbers (possibly of length 1) or a single string.
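The return convention can be summarised in a few lines. The helper below is a hypothetical illustration of the rule as stated here, not code taken from PLAMS. It assumes the raw data arrives either as a string or as a list of values.

```python
def shape_result(raw, return_as_list=False):
    """Apply the documented return convention to already-extracted data."""
    if isinstance(raw, str):
        return raw                  # strings are always returned as a single string
    if len(raw) == 1 and not return_as_list:
        return raw[0]               # single value: unwrap unless a list was requested
    return list(raw)                # longer variables (or return_as_list=True): a list


single = shape_result([3.5])
single_as_list = shape_result([3.5], return_as_list=True)
several = shape_result([1, 2, 3])
text = shape_result("NORMAL TERMINATION", return_as_list=True)
```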

write(section, variable, value)[source]

Write a variable with a value in a section. If such a variable already exists in this section, the old value is overwritten.

save()[source]

Save all changes stored in tmpdata to the physical file on disk.

delete_section(section)[source]

Delete the entire section from this KF file.

sections()[source]

Return a list with all section names, ordered alphabetically.

read_section(section)[source]

Return a dictionary with all variables from a given section.

Note

Some sections can contain a very large amount of data. Turning them into dictionaries can cause memory shortages or performance issues. Use this method carefully.

get_skeleton()[source]

Return a dictionary reflecting the structure of this KF file. Each key in that dictionary corresponds to a section name of the KF file with the value being a set of variable names.

__getitem__(name)[source]

Allows using x = mykf['section%variable'] or x = mykf[('section','variable')] instead of x = mykf.read('section', 'variable').

__setitem__(name, value)[source]

Allows using mykf['section%variable'] = value or mykf[('section','variable')] = value instead of mykf.write('section', 'variable', value).

__iter__()[source]

Iteration yields pairs of section name and variable name.

__contains__(arg)[source]

Implements the Python in operator for KFFile objects. arg can be a single string with a section name or a pair of strings (section, variable).
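The skeleton, iteration and membership behaviour described above can be pictured with a plain dictionary of sets. The sketch below does not touch a real KF file. The functions iter_pairs and contains are hypothetical stand-ins, and the sorted iteration order is an assumption made only to keep the example reproducible.

```python
# Hypothetical skeleton of the shape described for get_skeleton():
# section name -> set of variable names.
skeleton = {
    "General": {"Termination Status"},
    "Geometry": {"xyz"},
}


def iter_pairs(skel):
    """Yield (section, variable) pairs, as iteration over a KF file does."""
    for section in sorted(skel):
        for variable in sorted(skel[section]):
            yield section, variable


def contains(skel, arg):
    """Accept either a section name or a (section, variable) pair."""
    if isinstance(arg, str):
        return arg in skel
    section, variable = arg
    return variable in skel.get(section, set())


pairs = list(iter_pairs(skeleton))
has_section = contains(skeleton, "General")
has_variable = contains(skeleton, ("Geometry", "xyz"))
missing_variable = contains(skeleton, ("Geometry", "abc"))
```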

static _split(name)[source]

Ensure that a key used in bracket notation is of the form 'section%variable' or ('section','variable'). If so, return a tuple ('section','variable').
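A minimal sketch of such a key check is shown below. The function name split_key, the requirement of exactly one '%' character and the use of ValueError are assumptions for illustration. The actual validation rules and exception type may differ.

```python
def split_key(name):
    """Normalise 'section%variable' or (section, variable) to a 2-tuple."""
    if isinstance(name, tuple) and len(name) == 2:
        return name
    if isinstance(name, str) and name.count("%") == 1:
        section, variable = name.split("%")
        return section, variable
    # Exception type chosen for this sketch only.
    raise ValueError(f"expected 'section%variable' or (section, variable), got {name!r}")


from_string = split_key("General%Termination Status")
from_tuple = split_key(("Geometry", "xyz"))
```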

static _str(val)[source]

Return a string representation of val in the form that can be understood by udmpkf.

class KFReader(path, blocksize=4096, autodetect=True)[source]

An efficient Python-native reader for binary files in KF format.

This class offers read-only access to any fragment of data from a KF file. Unlike other Python KF readers, it does not use the Fortran binary dmpkf to process KF files, but instead reads and interprets the raw binary data straight from the file at the Python level. This approach results in a significant speedup (by a factor of a few hundred for large files read variable by variable).

The constructor argument path should be a string with a path (relative or absolute) to an existing KF file.

blocksize indicates the length of a basic KF file block. So far, all KF files produced by Amsterdam Modeling Suite programs have the same block size of 4096 bytes. Unless you’re doing something very special, you should not touch this value.

The organization of data inside a KF file can depend on the machine on which the file was produced. Two parameters can vary: the integer length (32 or 64 bit) and the endianness (little or big). These parameters have to be determined before any reading can take place, otherwise the results will make no sense. If the constructor argument autodetect is True, the constructor attempts to detect the format of the given KF file automatically, which allows reading files created on a machine with a different endianness or integer length. Automatic detection is enabled by default and it is advised to leave it that way. If you wish to disable it, you should set the endian and word attributes manually before reading anything (see the code for details).
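To see why these two parameters matter, the following sketch uses Python's standard struct module to decode the same bytes under different assumptions. It illustrates the general problem only and says nothing about how KFReader performs its detection.

```python
import struct

# The integer 4096 stored as a 32-bit little-endian value.
raw = struct.pack("<i", 4096)

as_little_32 = struct.unpack("<i", raw)[0]   # correct interpretation
as_big_32 = struct.unpack(">i", raw)[0]      # wrong byte order

# Two consecutive 32-bit integers misread as one 64-bit integer.
two_ints = struct.pack("<ii", 1, 2)
as_one_64 = struct.unpack("<q", two_ints)[0]

print(as_little_32, as_big_32, as_one_64)    # 4096 1048576 8589934593
```

A wrong guess does not raise an error. It silently produces different numbers, which is why the format has to be settled before any data is read.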

Note

This class consists of quite technical, low-level code. If you don’t need to modify or extend KFReader, you can safely ignore all private methods. All you need is read() and occasionally __iter__().

__init__(path, blocksize=4096, autodetect=True)[source]

Initialize self. See help(type(self)) for accurate signature.

read(section, variable)[source]

Extract and return data for a variable located in a section.

For single-value numerical or boolean variables the returned value is a single number or bool. For longer variables this method returns a list of values. For string variables a single string is returned.

__iter__()[source]

Iteration yields pairs of section name and variable name.

_autodetect()[source]

Try to automatically detect the format (integer size and endianness) of this KF file.

_read_block(f, pos)[source]

Read a single block of binary data from position pos in file f.

_parse(block, format)[source]

Translate a block of binary data into a list of values in the specified format.

format should be a list of pairs (a,t) where t is one of the following characters: 's' for string (bytes), 'i' for 32-bit integer, 'q' for 64-bit integer, 'd' for 64-bit float, and a is the number of occurrences (or the length of a string).

For example, if format is equal to [(32,'s'),(4,'i'),(2,'d'),(2,'i')], the contents of block are divided into 72-byte chunks (32*1 + 4*4 + 2*8 + 2*4 = 72), possibly dropping the last one if it is shorter than 72 bytes. Each chunk is then translated into a 9-tuple consisting of one bytes object, 4 ints, 2 floats and 2 ints. The returned value is a list of such tuples.
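The chunking described above maps naturally onto Python's struct module. The function below is a sketch under that assumption, not the actual _parse code. The name parse_block, the byteorder parameter and the sample record contents are hypothetical.

```python
import struct


def parse_block(block, fmt, byteorder="<"):
    """Split block into fixed-size records described by fmt, a list of (count, type) pairs."""
    # [(32,'s'),(4,'i'),(2,'d'),(2,'i')] becomes the struct format '<32s4i2d2i'.
    spec = byteorder + "".join(f"{count}{code}" for count, code in fmt)
    size = struct.calcsize(spec)
    usable = len(block) - len(block) % size   # drop a trailing partial record
    return [struct.unpack(spec, block[i:i + size]) for i in range(0, usable, size)]


fmt = [(32, "s"), (4, "i"), (2, "d"), (2, "i")]
record = struct.pack("<32s4i2d2i", b"General", 1, 2, 3, 4, 0.5, 1.5, 7, 8)
block = record * 2 + b"\x00" * 10             # two full records plus a short tail
parsed = parse_block(block, fmt)
```

Each entry of parsed is a 9-tuple. The first element is a 32-byte bytes object, padded with null bytes by struct.pack.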

_get_data(datablock, vtype)[source]

Extract all data of a given type from a single data block. Returned value is a list of values (int, float, or bool) or a single “bytes” object.

_create_index()[source]

Find and parse the relevant index blocks of the KF file to extract the information about the location of all sections and variables.

Two dictionaries are populated during this process. _data contains, for each section, a list of triples describing how logical blocks of data are mapped onto physical ones. For example, _data['General'] = [(3,6,12), (9,40,45)] means that logical blocks 3-8 of section General are located in physical blocks 6-11 and logical blocks 9-13 in physical blocks 40-44. This list is always sorted by the first element of each triple, allowing efficient access to an arbitrary logical block of each section.

The second dictionary, _sections, is used to locate each variable within its section. For each section, it contains another dictionary with an entry for each variable of this section, so _sections[sec][var] contains all the information needed to extract variable var from section sec. This is a 4-tuple containing the variable type, the logical block in which the variable first occurs, the position within this block where its data starts, and the length of the variable. Combining this information with the mapping stored in _data allows each variable to be extracted.
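The logical-to-physical mapping described for _data can be looked up with a binary search over the sorted triples. The sketch below reuses the example triples from the text. The function physical_block and its error handling are hypothetical and not part of KFReader.

```python
from bisect import bisect_right


def physical_block(mapping, logical):
    """Map a logical block number to a physical one.

    mapping is a list of (logical_start, physical_start, physical_end) triples,
    sorted by logical_start, with physical_end exclusive.
    """
    starts = [entry[0] for entry in mapping]
    i = bisect_right(starts, logical) - 1     # last triple starting at or before logical
    if i < 0:
        raise ValueError(f"logical block {logical} precedes the first mapped block")
    logical_start, physical_start, physical_end = mapping[i]
    physical = physical_start + (logical - logical_start)
    if physical >= physical_end:
        raise ValueError(f"logical block {logical} is not mapped")
    return physical


general = [(3, 6, 12), (9, 40, 45)]
lookups = [physical_block(general, n) for n in (3, 8, 9, 13)]
print(lookups)  # [6, 11, 40, 44]
```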

static _datablocks(lst, n=1)[source]

Transform a tuple of lists ([x1,x2,...], [(a1,b1),(a2,b2),...]) into an iterator over range(a1,b1)+range(a2,b2)+... Iteration starts from nth element of this list.

class KFHistory(kf, section)[source]

A class for reading “History” sections of files in the KF format.

This class acts as a wrapper around KFReader enabling convenient iteration over entries (frames) of History sections.

The constructor argument kf should be a KFReader instance attached to an existing KF file. The section argument is the name of the desired History-like section, such as “History” or “MDHistory”.

The read_all() method can be used to easily read all values of a particular history item into a single numpy array.

To iterate over the frames in a history section, use iter() or iter_optional(). The former raises an exception if the selected variable is not present in the history, while the latter returns a given default value instead.

Usage:

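from scm.plams import KFReader, KFHistory

kf = KFReader('somefile.rkf')
mdhistory = KFHistory(kf, 'MDHistory')

# zip() pairs the two per-frame iterators so that each step yields one (T, p) tuple
for T, p in zip(mdhistory.iter('Temperature'), mdhistory.iter_optional('Pressure', 0)):
    print(T, p)

__init__(kf, section)[source]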

Initialize self. See help(type(self)) for accurate signature.

read_all(name)[source]

Return a numpy array containing the values of history item name from all frames.

iter(name)[source]

Iterate over the values of history item name.

iter_optional(name, default=None)[source]

Iterate over the values of history item name, returning default if the item is not present.
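The difference between the two iteration methods can be illustrated without a KF file by modelling frames as a list of dictionaries. This is a toy model only. The functions iter_item and iter_optional below are hypothetical stand-ins, and KeyError is an assumed exception type.

```python
def iter_item(frames, name):
    """Yield the value of name for every frame; fails if the item is missing."""
    for frame in frames:
        yield frame[name]              # raises KeyError for a missing item


def iter_optional(frames, name, default=None):
    """Yield the value of name for every frame, or default if it is missing."""
    for frame in frames:
        yield frame.get(name, default)


# Hypothetical history with a temperature but no pressure.
frames = [{"Temperature": 300.0}, {"Temperature": 310.0}]

# zip() advances both iterators together, one frame at a time.
rows = list(zip(iter_item(frames, "Temperature"),
                iter_optional(frames, "Pressure", 0)))
print(rows)  # [(300.0, 0), (310.0, 0)]
```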