COSMO-RS

(contributed by Bas van Beek)

COSMO-RS can be run from PLAMS using the CRSJob class and the corresponding CRSResults, both respectively being subclasses of SCMJob and SCMResults.

Note

There is also a tutorial showing full code examples available in the COSMO-RS documentation. There are several templates available that can easily be customized for other problem types, workflows, etc.

Settings

For example, considering the following input file for a COSMO-RS sigma-profile calculation [1]:

compound /path/to/file.t21
    frac1 1.0
end

property puresigmaprofile
    nprofile 50
    pure
    sigmamax 0.025
end

temperature 298.15

The input file displayed above corresponds to the following settings:

>>> from scm.plams import Settings, CRSJob

>>> s = Settings()

>>> s.input.compound._h = '/path/to/file.t21'
>>> s.input.compound.frac1 = 1.0
>>> s.input.property._h = 'puresigmaprofile'
>>> s.input.property.nprofile = 50
>>> s.input.property.sigmamax = 0.025
>>> s.input.property.pure = ''
>>> s.input.temperature = 298.15

>>> my_job = CRSJob(settings=s)
>>> my_results = my_job.run()

Alternatively one can create a CRSJob instance from a runscript created by, for example, the ADF GUI (File -> Save as).

>>> from scm.plams import CRSJob

>>> filename = 'path/to/my/crs/inputfile.run'
>>> my_job = CRSJob.from_inputfile(filename)
>>> my_results = my_job.run()

Settings with multiple compound

More often than not one is interested in the properties of multi-component mixtures (e.g. a dissolved solute). In such cases one has to pass multiple compound blocks to the input file, which is somewhat problematic as Python dictionaries (including Settings) can only contain a set of unique keys.

This problem can be resolved by changing the value of compound from a Settings instance into a list of multiple Settings instances. Each item within this list is expanded into its own compound block once CRSJob.run() creates the actual input file.

Example Settings with three compounds:

>>> from scm.plams import Settings, CRSJob

>>> compound1, compound2, compound3 = Settings(), Settings(), Settings()

>>> compound1._h = '/path/to/compound1.t21'
>>> compound1.frac1 = 0.33
>>> compound2._h = '/path/to/compound2.t21'
>>> compound2.frac1 = 0.33
>>> compound3._h = '/path/to/compound3.t21'
>>> compound3.frac1 = 0.33

>>> s = Settings()
>>> s.input.compound = [compound1, compound2, compound3]

>>> my_job = CRSJob(settings=s)
>>> my_results = my_job.run()

Which yields the following input:

compound /path/to/compound1.t21
    frac1 0.33
end

compound /path/to/compound2.t21
    frac1 0.33
end

compound /path/to/compound3.t21
    frac1 0.33
end

ADF and CRSJob

A workflow is presented in the PLAMS cookbook. In this workflow, we follow the usual procedure of generating the inputs required to run COSMO-RS and COSMO-SAC calculations.

COSMO-RS Parameters

A large number of configurable parameters is available for COSMO-RS. If one is interested in running multiple jobs it can be usefull to store the paramaters in seperate dictionary / Settings instance and update the Job settings as needed, double so if one wants to use multiple different paramater sets.

An example is provided below with the default COSMO-RS paramaters (i.e. ADF Combi2005):

>>> from scm.plams import Settings

# ADF Combi2005 COSMO-RS parameters
>>> adf_combi2005 = {
...     'crsparameters': {
...         '_1': 'HB_HNOF',
...         '_2': 'HB_TEMP',
...         '_3': 'FAST',
...         '_4': 'COMBI2005',
...         'rav': 0.400,
...         'aprime': 1510.0,
...         'fcorr': 2.802,
...         'chb': 8850.0,
...         'sigmahbond': 0.00854,
...         'aeff': 6.94,
...         'lambda': 0.130,
...         'omega': -0.212,
...         'eta': -9.65,
...         'chortf': 0.816
...     },
...     'dispersion': {
...         'H': -0.0340,
...         'C': -0.0356,
...         'N': -0.0224,
...         'O': -0.0333,
...         'F': -0.026,
...         'Si': -0.04,
...         'P': -0.045,
...         'S': -0.052,
...         'Cl': -0.0485,
...         'Br': -0.055,
...         'I': -0.062
...     }
... }

>>> s_list = [Settings(), Settings(), Settings()]
>>> for s in s_list:
...     s.input.update(adf_combi2005)

>>> print([s.input == adf_combi2005 for s in s_list])
[True, True, True]

Data analyses and plotting

As COSMO-RS can produce a large variety of data series, a number of specialized methods are available in the CRSResults for their extraction and analysis. The resulting data is stored in either a dictionary of Numpy arrays or (optionally) a Pandas DataFrame.

The extracted data can be further customized by altering the subsection argument. For example, by default CRSResults.get_solubility() will extract the solubility in mol solute per liter solvent ("subsection=mol_per_L_solvent"). This can be changed in, for example, gram per liter solvent ("subsection=m_per_L_solvent") or the solute mass fraction ("massfrac").

A complete overview of all available sections and subsections can be acquired by calling the KFFile.get_skeleton() method of the KF binary file (e.g. print(my_results._kf.get_skeleton())).

Quantity

Method for data extraction

Sigma profile

CRSResults.get_sigma_profile()

Sigma potential

CRSResults.get_sigma_potential()

Vapor pressure

CRSResults.get_vapor_pressure()

Boiling point

CRSResults.get_boiling_point()

Solubility

CRSResults.get_solubility()

Binary mixture

CRSResults.get_bi_mixture()

Ternary mixtures

CRSResults.get_tri_mixture()

Solvents composition line

CRSResults.get_composition_line()

If the Matplotlib package is installed than the resulting data can easily plotted by passing it to the CRSResults.plot() method (e.g. CRSResults.plot(my_sigma_profile)):

>>> from scm.plams import Settings, CRSJob
>>> import numpy as np

>>> s = Settings()

>>> s.input.compound._h = '/path/to/Water.coskf'
>>> s.input.compound.frac1 = 1.0
>>> s.input.property._h = 'puresigmaprofile'
>>> s.input.property.nprofile = 50
>>> s.input.property.sigmamax = 0.025
>>> s.input.property.pure = ''
>>> s.input.temperature = 298.15

>>> my_job = CRSJob(settings=s)
>>> my_results = my_job.run()

>>> my_sigma_profile = my_results.get_sigma_profile()
>>> with np.printoptions(threshold=0, edgeitems=5):
...     print(my_sigma_profile)
{'Water.coskf': array([0., 0., 0., 0., 0., ..., 0., 0., 0., 0., 0.]),
 'σ (e/A**2)': array([-0.25, -0.24, -0.23, -0.22, -0.21, ...,  0.21,  0.22,  0.23,  0.24, 0.25])}

>>> my_results.plot(my_sigma_profile)
/scm-uploads/doc.trunk/plams/_images/sigma_profile.png

API

class CRSJob(**kwargs)[source]

A SCMJob subclass intended for running COSMO-RS jobs.

_result_type

alias of CRSResults

__init__(**kwargs)[source]

Initialize a CRSJob instance.

static cos_to_coskf(filename)[source]

Convert a .cos file into a .coskf file with the $AMSBIN/cosmo2kf command.

Returns the filename of the new .coskf file.

class CRSResults(job)[source]

A SCMResults subclass for accessing results of CRSJob.

get_energy(energy_type='deltag', compound_idx=0, unit='kcal/mol')[source]

Returns the solute solvation energy from an Activity Coefficients calculation.

get_activity_coefficient(compound_idx=0)[source]

Return the solute activity coefficient from an Activity Coefficients calculation.

get_sigma_profile(subsection='profil', as_df=False)[source]

Grab all sigma profiles, returning a dictionary of Numpy Arrays.

Values of \(\sigma\) are stored under the (e/A**2)" key.

Results can be returned as a Pandas DataFrame by settings as_df to True.

The returned results can be plotted by passing them to the CRSResults.plot() method.

Note

as_df = True requires the Pandas package. Plotting requires the matplotlib package.

get_sigma_potential(subsection='mu', unit='kcal/mol', as_df=False)[source]

Grab all sigma profiles, expressed in unit, and return a dictionary of Numpy Arrays.

Values of \(\sigma\) are stored under the (e/A**2)" key.

Results can be returned as a Pandas DataFrame by settings as_df to True.

The returned results can be plotted by passing them to the CRSResults.plot() method.

Note

as_df = True requires the Pandas package. Plotting requires the matplotlib package.

get_prop_names(section=None)[source]

Read the section of the .crskf file and return a list of the properties that were calculated. The section argument can be supplied to look at previously-calculated results. If no section name is supplied, the function defaults to using the most recent property that was calculated.

get_results(section=None)[source]

Read the section from the most recent calculation type and return the result as a dictionary.

get_multispecies_dist()[source]

This function returns multispecies distribution for each (compound,structure) pair. The format is a list with indices corresponding to compound indices. Each item in the list is a dictionary with a structure name : list pair, where the structure name corresponds to a structure the compound can be exist as and the list is the distribution of that compound in that structure over the number of points (mole fractions, temperatures, pressures).

get_structure_energy(as_df=False)[source]

If the OUTPUT_ENERGY_COMPONENTS is set to True in the input file, this funcion returns:

  1. The energy of each structure in multispecies (units in kcal/mol).

  2. Information related to association with other compound, if any.

It takes the following optional boolean keyword argument, as_df. If as_df is True, the result will be returned as a Pandas DataFrame. Otherwise the result will be returned as a dictionary.

The following abbreviations are used in the dictionary or Pandas DataFrame for the energy:

  • s_idx: the index for each unique structure

  • CompIdx: the compound index in multispecies

  • FormIdx: the form index in multispecies

  • SpecIdx: the species index in multispecies

  • StrucIdx: the structure index in multispecies

  • z: the equilibrium concentration in multispecies

  • coskf: the corresponding coskf file for each s_idx

  • mu_res: the residual part of the pseudo-chemical potential

  • mu_comb: the combinatorial part of the pseudo-chemical potential

  • mu_disp: the energy contribution from the dispersive interaction

  • mu_pdh: the energy contribution from the Pitzer-Debye-Hückel term

  • mu_RTlnz: the energy contribution from the ideal mixing

  • mu_Ecosmo: the Ecosmo energy

  • Assoc: True if the structure has any association with other compound

  • NumRepMonomer: the number of repeated monomers used for polymers

  • NumStrucPerComp: the number of structures per compound used for dimers, trimers

The following abbreviations are used in the dictionary or Pandas DataFrame for the information related to association with other compound:

  • ReqCompNameAssoc: the required compound name for the associating structure

  • ReqCompIdxAssoc: the required compound index (CompIdx) for the associating structure

  • NumReqCompAssoc: the number of the required compounds in the associating structure

plot(*arrays, x_axis=None, plot_fig=True, x_label=None, y_label=None)[source]

Plot, show and return a series of COSMO-RS results as a matplotlib Figure instance.

Accepts the output of, e.g., CRSResults.get_sigma_profile(): A dictionary of Numpy arrays or a Pandas DataFrame.

Returns a matplotlib Figure instance which can be further modified to the users liking. Automatic plotting of the resulting figure can be disabled with the plot_fig argument.

Note

This method requires the matplotlib package.

Note

The name of the dictionary/DataFrame key containing the index (i.e. the x-axis) can, and should, be manually specified in x_axis if a custom x_axis is passed to CRSResults._get_array_dict(). This argument can be ignored otherwise.

_get_array_dict(section, subsection, x_axis, index_subsection, unit='kcal/mol', as_df=False)[source]

Create dictionary or DataFrame containing all values in section/subsection.

Takes the following arguments:
  • The section/subsection of the desired quantity.

  • The desired name of the index (x_axis).

  • The name of subsection containing the index (index_subsection).

  • The unit of the output quanty (ignore this keyword if not applicable).

  • If the result should be returned as Pandas DataFrame (as_df).

_construct_array_dict(section, subsection, unit='kcal/mol')[source]

Construct dictionary containing all values in section/subsection.

static _dict_to_df(array_dict, section, x_axis)[source]

Attempt to convert a dictionary into a DataFrame.