COSMO-RS¶
(contributed by Bas van Beek)
COSMO-RS can be run from PLAMS using the CRSJob
class and the corresponding CRSResults
,
both respectively being subclasses of SCMJob
and SCMResults
.
Note
There is also a tutorial showing full code examples available in the COSMO-RS documentation. There are several templates available that can easily be customized for other problem types, workflows, etc.
Settings¶
For example, considering the following input file for a COSMO-RS sigma-profile calculation [1]:
compound /path/to/file.t21
frac1 1.0
end
property puresigmaprofile
nprofile 50
pure
sigmamax 0.025
end
temperature 298.15
The input file displayed above corresponds to the following settings:
>>> from scm.plams import Settings, CRSJob
>>> s = Settings()
>>> s.input.compound._h = '/path/to/file.t21'
>>> s.input.compound.frac1 = 1.0
>>> s.input.property._h = 'puresigmaprofile'
>>> s.input.property.nprofile = 50
>>> s.input.property.sigmamax = 0.025
>>> s.input.property.pure = ''
>>> s.input.temperature = 298.15
>>> my_job = CRSJob(settings=s)
>>> my_results = my_job.run()
Alternatively one can create a CRSJob
instance from a runscript created by,
for example, the ADF GUI (File -> Save as
).
>>> from scm.plams import CRSJob
>>> filename = 'path/to/my/crs/inputfile.run'
>>> my_job = CRSJob.from_inputfile(filename)
>>> my_results = my_job.run()
Settings with multiple compound¶
More often than not one is interested in the properties of
multi-component mixtures (e.g. a dissolved solute).
In such cases one has to pass multiple compound
blocks to
the input file, which is somewhat problematic as Python dictionaries
(including Settings
) can only contain a set of unique keys.
This problem can be resolved by changing the value of compound
from a Settings
instance into a list of multiple Settings
instances.
Each item within this list is expanded into its own compound
block
once CRSJob.run()
creates the actual input file.
Example Settings
with three compounds:
>>> from scm.plams import Settings, CRSJob
>>> compound1, compound2, compound3 = Settings(), Settings(), Settings()
>>> compound1._h = '/path/to/compound1.t21'
>>> compound1.frac1 = 0.33
>>> compound2._h = '/path/to/compound2.t21'
>>> compound2.frac1 = 0.33
>>> compound3._h = '/path/to/compound3.t21'
>>> compound3.frac1 = 0.33
>>> s = Settings()
>>> s.input.compound = [compound1, compound2, compound3]
>>> my_job = CRSJob(settings=s)
>>> my_results = my_job.run()
Which yields the following input:
compound /path/to/compound1.t21
frac1 0.33
end
compound /path/to/compound2.t21
frac1 0.33
end
compound /path/to/compound3.t21
frac1 0.33
end
ADF and CRSJob¶
A workflow is presented in the PLAMS cookbook. In this workflow, we follow the usual procedure of generating the inputs required to run COSMO-RS and COSMO-SAC calculations.
COSMO-RS Parameters¶
A large number of configurable parameters is available for COSMO-RS.
If one is interested in running multiple jobs it can be usefull to store the paramaters
in seperate dictionary / Settings
instance and update the Job settings as needed,
double so if one wants to use multiple different paramater sets.
An example is provided below with the default COSMO-RS paramaters (i.e. ADF Combi2005):
>>> from scm.plams import Settings
# ADF Combi2005 COSMO-RS parameters
>>> adf_combi2005 = {
... 'crsparameters': {
... '_1': 'HB_HNOF',
... '_2': 'HB_TEMP',
... '_3': 'FAST',
... '_4': 'COMBI2005',
... 'rav': 0.400,
... 'aprime': 1510.0,
... 'fcorr': 2.802,
... 'chb': 8850.0,
... 'sigmahbond': 0.00854,
... 'aeff': 6.94,
... 'lambda': 0.130,
... 'omega': -0.212,
... 'eta': -9.65,
... 'chortf': 0.816
... },
... 'dispersion': {
... 'H': -0.0340,
... 'C': -0.0356,
... 'N': -0.0224,
... 'O': -0.0333,
... 'F': -0.026,
... 'Si': -0.04,
... 'P': -0.045,
... 'S': -0.052,
... 'Cl': -0.0485,
... 'Br': -0.055,
... 'I': -0.062
... }
... }
>>> s_list = [Settings(), Settings(), Settings()]
>>> for s in s_list:
... s.input.update(adf_combi2005)
>>> print([s.input == adf_combi2005 for s in s_list])
[True, True, True]
Data analyses and plotting¶
As COSMO-RS can produce a large variety of data series,
a number of specialized methods are available in the CRSResults
for their extraction and analysis.
The resulting data is stored in either a dictionary of Numpy arrays or (optionally) a Pandas DataFrame.
The extracted data can be further customized by altering the subsection
argument.
For example, by default CRSResults.get_solubility()
will extract the solubility in mol solute
per liter solvent ("subsection=mol_per_L_solvent"
). This can be changed in, for example,
gram per liter solvent ("subsection=m_per_L_solvent"
) or the solute mass fraction ("massfrac"
).
A complete overview of all available sections and subsections can be acquired
by calling the KFFile.get_skeleton()
method of the KF binary file
(e.g. print(my_results._kf.get_skeleton())
).
Quantity |
Method for data extraction |
---|---|
|
|
|
|
|
|
|
|
|
|
|
If the Matplotlib package is installed than the resulting data can easily plotted by passing
it to the CRSResults.plot()
method (e.g. CRSResults.plot(my_sigma_profile)
):
>>> from scm.plams import Settings, CRSJob
>>> import numpy as np
>>> s = Settings()
>>> s.input.compound._h = '/path/to/Water.coskf'
>>> s.input.compound.frac1 = 1.0
>>> s.input.property._h = 'puresigmaprofile'
>>> s.input.property.nprofile = 50
>>> s.input.property.sigmamax = 0.025
>>> s.input.property.pure = ''
>>> s.input.temperature = 298.15
>>> my_job = CRSJob(settings=s)
>>> my_results = my_job.run()
>>> my_sigma_profile = my_results.get_sigma_profile()
>>> with np.printoptions(threshold=0, edgeitems=5):
... print(my_sigma_profile)
{'Water.coskf': array([0., 0., 0., 0., 0., ..., 0., 0., 0., 0., 0.]),
'σ (e/A**2)': array([-0.25, -0.24, -0.23, -0.22, -0.21, ..., 0.21, 0.22, 0.23, 0.24, 0.25])}
>>> my_results.plot(my_sigma_profile)
API¶
- class CRSJob(**kwargs)[source]¶
A
SCMJob
subclass intended for running COSMO-RS jobs.- _result_type¶
alias of
CRSResults
- class CRSResults(job)[source]¶
A
SCMResults
subclass for accessing results ofCRSJob
.- get_energy(energy_type='deltag', compound_idx=0, unit='kcal/mol')[source]¶
Returns the solute solvation energy from an Activity Coefficients calculation.
- get_activity_coefficient(compound_idx=0)[source]¶
Return the solute activity coefficient from an Activity Coefficients calculation.
- get_sigma_profile(subsection='profil', as_df=False)[source]¶
Grab all sigma profiles, returning a dictionary of Numpy Arrays.
Values of \(\sigma\) are stored under the
"σ (e/A**2)"
key.Results can be returned as a Pandas DataFrame by settings as_df to
True
.The returned results can be plotted by passing them to the
CRSResults.plot()
method.Note
as_df =
True
requires the Pandas package. Plotting requires the matplotlib package.
- get_sigma_potential(subsection='mu', unit='kcal/mol', as_df=False)[source]¶
Grab all sigma profiles, expressed in unit, and return a dictionary of Numpy Arrays.
Values of \(\sigma\) are stored under the
"σ (e/A**2)"
key.Results can be returned as a Pandas DataFrame by settings as_df to
True
.The returned results can be plotted by passing them to the
CRSResults.plot()
method.Note
as_df =
True
requires the Pandas package. Plotting requires the matplotlib package.
- get_prop_names(section=None)[source]¶
Read the section of the .crskf file and return a list of the properties that were calculated. The section argument can be supplied to look at previously-calculated results. If no section name is supplied, the function defaults to using the most recent property that was calculated.
- get_results(section=None)[source]¶
Read the section from the most recent calculation type and return the result as a dictionary.
- get_multispecies_dist()[source]¶
This function returns multispecies distribution for each (compound,structure) pair. The format is a list with indices corresponding to compound indices. Each item in the list is a dictionary with a structure name : list pair, where the structure name corresponds to a structure the compound can be exist as and the list is the distribution of that compound in that structure over the number of points (mole fractions, temperatures, pressures).
- get_structure_energy(as_df=False)[source]¶
If the OUTPUT_ENERGY_COMPONENTS is set to True in the input file, this funcion returns:
The energy of each structure in multispecies (units in kcal/mol).
Information related to association with other compound, if any.
It takes the following optional boolean keyword argument, as_df. If as_df is True, the result will be returned as a Pandas DataFrame. Otherwise the result will be returned as a dictionary.
The following abbreviations are used in the dictionary or Pandas DataFrame for the energy:
s_idx: the index for each unique structure
CompIdx: the compound index in multispecies
FormIdx: the form index in multispecies
SpecIdx: the species index in multispecies
StrucIdx: the structure index in multispecies
z: the equilibrium concentration in multispecies
coskf: the corresponding coskf file for each s_idx
mu_res: the residual part of the pseudo-chemical potential
mu_comb: the combinatorial part of the pseudo-chemical potential
mu_disp: the energy contribution from the dispersive interaction
mu_pdh: the energy contribution from the Pitzer-Debye-Hückel term
mu_RTlnz: the energy contribution from the ideal mixing
mu_Ecosmo: the Ecosmo energy
Assoc: True if the structure has any association with other compound
NumRepMonomer: the number of repeated monomers used for polymers
NumStrucPerComp: the number of structures per compound used for dimers, trimers
The following abbreviations are used in the dictionary or Pandas DataFrame for the information related to association with other compound:
ReqCompNameAssoc: the required compound name for the associating structure
ReqCompIdxAssoc: the required compound index (CompIdx) for the associating structure
NumReqCompAssoc: the number of the required compounds in the associating structure
- plot(*arrays, x_axis=None, plot_fig=True, x_label=None, y_label=None)[source]¶
Plot, show and return a series of COSMO-RS results as a matplotlib Figure instance.
Accepts the output of, e.g.,
CRSResults.get_sigma_profile()
: A dictionary of Numpy arrays or a Pandas DataFrame.Returns a matplotlib Figure instance which can be further modified to the users liking. Automatic plotting of the resulting figure can be disabled with the plot_fig argument.
Note
This method requires the matplotlib package.
Note
The name of the dictionary/DataFrame key containing the index (i.e. the x-axis) can, and should, be manually specified in x_axis if a custom x_axis is passed to
CRSResults._get_array_dict()
. This argument can be ignored otherwise.
- _get_array_dict(section, subsection, x_axis, index_subsection, unit='kcal/mol', as_df=False)[source]¶
Create dictionary or DataFrame containing all values in section/subsection.
- Takes the following arguments:
The section/subsection of the desired quantity.
The desired name of the index (x_axis).
The name of subsection containing the index (index_subsection).
The unit of the output quanty (ignore this keyword if not applicable).
If the result should be returned as Pandas DataFrame (as_df).