4.4. Convert between ParAMS and ASE formats
Important
This tutorial is only compatible with ParAMS 2024.1 or later.
This tutorial shows how to convert between the ParAMS and ASE (Atomic Simulation Environment) data formats.
See also
Documentation for ParAMSJob and ParAMSResults.
Documentation for ParAMS MachineLearning
To follow along, download either convert_ase_params.py or convert_ase_params.ipynb (see also: how to install JupyterLab).
The ASE .xyz format (extended XYZ) is convenient for storing single-point properties, such as energies and forces, of a collection of structures. Many academic machine learning potential projects support it, so you may find it useful as an exchange format.
Here, we will initialize a ResultsImporter object from ASE .xyz files and save a ResultsImporter object to ASE .xyz files.
For advanced usage you can also work directly with lists of ASE Atoms and the ParAMS JobCollection and DataSet classes. See the documentation for params_to_ase and ase_to_params.
from scm.params import ResultsImporter
import os
4.4.1. One-liner conversion from ParAMS .yaml to ASE .xyz
Here we use the examples directory that already contains files in the ParAMS .yaml format:
yaml_dir = os.path.expandvars(
    "$AMSHOME/scripting/scm/params/examples/DataSets/LiquidAr_32Atoms_100Frames"
)
# put training_set.xyz, validation_set.xyz in current working directory
xyz_target_dir = os.getcwd()
xyz_files = ResultsImporter.from_yaml(yaml_dir).store_ase(
    xyz_target_dir, format="extxyz"
)
print(xyz_files)
[PosixPath('/path/convert_ase/training_set.xyz'), PosixPath('/home/user/adfhome/scripting/scm/params/doc/source/examples/convert_ase/validation_set.xyz')]
4.4.2. One-liner conversion from ASE .xyz to ParAMS .yaml
xyz_dir = os.getcwd()
# put job_collection.yaml etc. in a folder yaml_ref_data
yaml_target_dir = "yaml_ref_data"
yaml_files = ResultsImporter.from_ase(
    f"{xyz_dir}/training_set.xyz", f"{xyz_dir}/validation_set.xyz"
).store(yaml_target_dir, backup=False)
print(yaml_files)
['yaml_ref_data/job_collection.yaml', 'yaml_ref_data/results_importer_settings.yaml', 'yaml_ref_data/training_set.yaml', 'yaml_ref_data/validation_set.yaml']
4.4.3. More about ASE .xyz format
Let’s look at the first few lines of the ASE .xyz files:
for file in xyz_files:
    print(f"--- first 5 lines of {file} ---")
    with open(file) as f:
        print("".join(f.readlines()[:5]))
--- first 5 lines of /path/convert_ase/training_set.xyz ---
32
Lattice="10.52 0.0 0.0 0.0 10.52 0.0 0.0 0.0 10.52" Properties=species:S:1:pos:R:3:forces:R:3 nAtoms=32.0 energy_weight=1.0 forces_weights=1.0 energy=-5354.689144659606 pbc="T T T"
Ar 5.22816630 -0.09999816 7.90156704 0.00827748 0.00316181 -0.00109192
Ar 5.36838670 0.11217830 2.61448917 -0.00296065 -0.01538116 0.00091621
Ar 5.47472123 5.39235886 7.71471647 -0.02373655 -0.02892664 0.00741117
--- first 5 lines of /path/convert_ase/validation_set.xyz ---
32
Lattice="10.52 0.0 0.0 0.0 10.52 0.0 0.0 0.0 10.52" Properties=species:S:1:pos:R:3:forces:R:3 nAtoms=32.0 energy_weight=1.0 forces_weights=1.0 energy=-5354.752060382252 pbc="T T T"
Ar 5.26000000 0.00000000 7.89000000 -0.00000013 0.00000037 -0.00000037
Ar 5.26000000 0.00000000 2.63000000 -0.00000013 0.00000037 0.00000013
Ar 5.26000000 5.26000000 7.89000000 -0.00000013 -0.00000013 -0.00000037
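The second line of each frame in the output above carries the per-frame metadata as key=value pairs, and the Properties entry describes the per-atom columns as name:type:count triples. As a rough illustration of that layout only, here is a minimal standard-library sketch that splits such a line into a dictionary. The helper name parse_extxyz_comment is hypothetical and is not part of ASE or ParAMS; for real work, the ASE reader handles many more cases.

```python
import shlex


def parse_extxyz_comment(line: str) -> dict:
    """Split an extended-XYZ comment line into a dict of raw string values.

    Hypothetical helper for illustration; it only handles key=value tokens
    with optional double quotes, as in the output shown above.
    """
    info = {}
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        info[key] = value
    return info


# Comment line copied from the training_set.xyz output above
header = (
    'Lattice="10.52 0.0 0.0 0.0 10.52 0.0 0.0 0.0 10.52" '
    "Properties=species:S:1:pos:R:3:forces:R:3 nAtoms=32.0 "
    "energy_weight=1.0 forces_weights=1.0 "
    'energy=-5354.689144659606 pbc="T T T"'
)

info = parse_extxyz_comment(header)
lattice = [float(x) for x in info["Lattice"].split()]  # 9 cell components
energy = float(info["energy"])

# Properties is a flat list of (column name, type code, number of columns)
fields = info["Properties"].split(":")
columns = [
    (fields[i], fields[i + 1], int(fields[i + 2]))
    for i in range(0, len(fields), 3)
]

print(lattice)
print(energy)
print(columns)
```

Read this way, each atom line holds one string column (the species) followed by three real columns for the position and three for the forces, which matches the seven fields per Ar line above.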
To read the ASE format, use ase.io.read("filename.xyz", ":") to get a list of ASE Atoms. The ":" index selects all frames in the file. For more details, see the ASE documentation.
import ase.io
list_of_ase_atoms = ase.io.read(xyz_files[0], ":")  # xyz_files[0] is training_set.xyz
atoms = list_of_ase_atoms[0]
print("First structure:")
print(atoms)
print(f"Energy: {atoms.get_potential_energy()}")
First structure:
Atoms(symbols='Ar32', pbc=True, cell=[10.52, 10.52, 10.52], forces=..., calculator=SinglePointCalculator(...))
Energy: -5354.689144659606
4.4.3.1. More about ParAMS ResultsImporter initialized from ASE .xyz
ri = ResultsImporter.from_ase(
    f"{xyz_dir}/training_set.xyz", f"{xyz_dir}/validation_set.xyz"
)
The structure is stored in the job collection:
for name in ri.job_collection:
    print(f"ID: {name}")
    print(ri.job_collection[name])
    break
ID: training_set0001
ReferenceEngineID: None
AMSInput: |
properties
gradients yes
End
system
Atoms
Ar 5.2281663000 -0.0999981600 7.9015670400
Ar 5.3683867000 0.1121783000 2.6144891700
Ar 5.4747212300 5.3923588600 7.7147164700
Ar 5.0988082900 5.4349847800 2.7366203700
Ar 7.9476603300 7.6156524900 8.0580968000
Ar 8.0131968700 8.0720767000 2.6497474100
Ar 7.8281713200 2.4490447200 7.8676092300
Ar 8.0186369800 2.4216024300 2.7003118400
Ar 2.6755162100 7.8807576800 7.8447710500
Ar 2.5484574000 7.8109744500 2.4632325200
Ar 2.5454749200 2.7008728300 7.8508382500
Ar 2.8017327000 2.4629101300 2.6664112100
Ar 7.8474635500 -0.0973865700 0.0171454400
Ar 7.8797231100 0.0391989100 5.0441430600
Ar 7.8726237500 5.3153487100 -0.2533762600
Ar 7.9047282900 5.2404472900 5.0502125000
Ar 2.9247115500 -0.2226033900 0.1389427600
Ar 2.6890580100 0.1416854000 5.4569524000
Ar 2.7625038100 5.4126025300 -0.0144396700
Ar 2.4993060000 5.1786998200 5.3291454800
Ar 0.2830784700 7.9023484600 0.0274749500
Ar -0.1270171800 8.0412894300 5.3905843300
Ar -0.2138571200 2.5109171800 0.0077376600
Ar -0.1896415200 2.7934612700 5.3088064400
Ar 5.1592133300 7.9576884700 0.0444941600
Ar 5.0858600300 7.7810555100 5.3596734400
Ar 5.2313185300 2.6624672300 -0.0623109300
Ar 5.1917182600 2.6762513100 5.0453924000
Ar -0.0610246500 0.0066595600 7.9748565400
Ar 0.0874271200 -0.0476604500 2.7572524100
Ar -0.0099775900 5.2074930100 7.9102005200
Ar -0.1261450000 5.4866211100 2.6387010300
End
Lattice
10.5200000000 0.0000000000 0.0000000000
0.0000000000 10.5200000000 0.0000000000
0.0000000000 0.0000000000 10.5200000000
End
End
task singlepoint
The reference values are stored in the data sets:
# print the first training set entry
for ds_entry in ri.get_data_set("training_set"):
    print(ds_entry)
    break
---
Expression: energy('training_set0001')
Weight: 1.0
ReferenceValue: -5354.689144659606
Unit: eV, 27.211386245988
# print the first validation set entry
for ds_entry in ri.get_data_set("validation_set"):
    print(ds_entry)
    break
---
Expression: energy('validation_set0001')
Weight: 1.0
ReferenceValue: -5354.752060382252
Unit: eV, 27.211386245988
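Each data set entry refers to a job in the job collection through the ID inside its Expression, for example training_set0001 in energy('training_set0001'). The following standard-library sketch shows one way to pull the two parts out of such a string. The function name split_expression and the regular expression are assumptions made for illustration, not ParAMS API, and the pattern covers only the simple single-job form printed above.

```python
import re

# Matches the simple form  extractor('job_id')  seen in the entries above.
# Assumption: one extractor name, one single-quoted job ID, no extra arguments.
EXPRESSION_RE = re.compile(r"^(?P<extractor>\w+)\('(?P<job_id>[^']+)'\)$")


def split_expression(expression: str) -> tuple:
    """Return (extractor, job_id) for a simple data set expression."""
    match = EXPRESSION_RE.match(expression.strip())
    if match is None:
        raise ValueError(f"Unsupported expression form: {expression!r}")
    return match["extractor"], match["job_id"]


for expression in ("energy('training_set0001')", "energy('validation_set0001')"):
    extractor, job_id = split_expression(expression)
    print(f"{extractor=} {job_id=}")
```

The extracted job ID is the same string that appears as the key when iterating over ri.job_collection, which is how a reference value is tied back to its structure.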
When you initialize a ResultsImporter using from_ase, it will automatically set the ResultsImporter units to the ASE units. If you use the results importer to import new data, the data will be stored in the ASE units (eV, eV/angstrom, etc.).
print(f"{ri.settings['units']['energy']=}")
print(f"{ri.settings['units']['forces']=}")
ri.settings['units']['energy']=('eV', 27.211386245988)
ri.settings['units']['forces']=('eV/angstrom', 51.422067476325886)
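The two numbers in these tuples are consistent with conversion factors from atomic units (hartree and hartree/bohr) to the ASE units. As a quick sanity check, the sketch below reproduces the forces factor from the energy factor. The Bohr radius value is the CODATA 2018 figure and is an assumption introduced here, since it does not appear in the output above; reading the second tuple element as a multiplier from atomic units is likewise an interpretation, not something the output states.

```python
import math

HARTREE_IN_EV = 27.211386245988    # energy factor printed above
BOHR_IN_ANGSTROM = 0.529177210903  # assumption: CODATA 2018 Bohr radius

# hartree/bohr -> eV/angstrom: scale the energy, divide by the length
force_factor = HARTREE_IN_EV / BOHR_IN_ANGSTROM
print(f"{force_factor=}")
print(math.isclose(force_factor, 51.422067476325886, rel_tol=1e-8))

# The first training set reference energy, converted back to hartree
energy_ev = -5354.689144659606
energy_hartree = energy_ev / HARTREE_IN_EV
print(f"{energy_hartree=}")
```

If the comparison holds on your machine, dividing a stored value by the factor in its unit tuple should recover the value in atomic units.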