4.14.4. Weights schemes
A weights scheme returns individual datapoint weights for data that is an array or matrix of several data points.
For example, if the data consists of the forces of the atoms, then it is an Nx3 matrix, where N is the number of atoms.
With a weights scheme, not all force components need to be weighted equally.
The weights depend only on the reference values. They do not depend on the values calculated with the parametrized engine, nor on atomic positions or any other structural information.
The parameters of a weights scheme may need to be changed depending on which unit the reference data is expressed in.
The weights scheme requires that you specify a normalization, which affects the sum of the returned weights.
Example 1: apply a weights scheme when calling a results importer.
# assumes ResultsImporter and WeightsSchemeClip have already been imported from ParAMS
ri = ResultsImporter()
# only fit force components between -0.03 and 0.03 force units
# the sum of the weights will equal the number of nonzero weights
ri.add_singlejob('finished-ams-job.results', properties={
    'forces': {
        'weights_scheme': WeightsSchemeClip(normalization='nonzero', min=-0.03, max=0.03)
    }
})
# print the data_set entry that was just added
print(ri.data_set[-1])
Example 2: apply a weights scheme to many data_set entries
# data_set is of type DataSet
# first filter the data_set to some suitable subset
# for example, all data_set entries having the "Group: Forces" metadata.
subset = data_set.from_metadata('Group', 'Forces')
# only fit force components between -0.03 and 0.03 force units
# the sum of the weights will equal the number of nonzero weights
subset.apply_weights_scheme(WeightsSchemeClip(normalization='nonzero', min=-0.03, max=0.03))
print(subset)
4.14.4.1. Types of weights schemes
Set weights to 0 outside given range (WeightsSchemeClip)
# the weights for force components smaller than -0.03 Ha/bohr or bigger than 0.03 Ha/bohr become 0
# if the forces in the reference data are expressed in Ha/bohr
# note: the min and max need to be expressed in the same unit as the force components in the data_set!
WeightsSchemeClip(min=-0.03,max=0.03)
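The unit caveat above can be made concrete with a small NumPy sketch. The function `clip_weights` is a hypothetical stand-in written for this illustration, not the ParAMS implementation. It mimics `WeightsSchemeClip` with `normalization='nonzero'` by giving every kept component a weight of 1. The force values are made up; the conversion factor is the standard value of 1 Ha/bohr in eV/angstrom.

```python
import numpy as np

HA_BOHR_TO_EV_ANGSTROM = 51.42206747  # 1 Ha/bohr expressed in eV/angstrom


def clip_weights(arr, lower=-np.inf, upper=np.inf):
    """Hypothetical helper: weight 1 inside [lower, upper], weight 0 outside."""
    arr = np.asarray(arr, dtype=float)
    return ((arr >= lower) & (arr <= upper)).astype(float)


# made-up reference forces for two atoms, in Ha/bohr
forces_ha_bohr = np.array([[0.010, -0.020, 0.050],
                           [-0.040, 0.000, 0.029]])
forces_ev_ang = forces_ha_bohr * HA_BOHR_TO_EV_ANGSTROM

# limits given in the same unit as the data
kept_ha = clip_weights(forces_ha_bohr, -0.03, 0.03)

# same numeric limits applied to eV/angstrom data: a different selection
kept_wrong_unit = clip_weights(forces_ev_ang, -0.03, 0.03)

# limits converted to the unit of the data: same selection as kept_ha
limit = 0.03 * HA_BOHR_TO_EV_ANGSTROM
kept_converted = clip_weights(forces_ev_ang, -limit, limit)

print(int(kept_ha.sum()), int(kept_wrong_unit.sum()), int(kept_converted.sum()))
```

Reusing the numeric limits on data in another unit silently changes which components are fitted. Converting the limits along with the data keeps the selection unchanged.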
Boltzmann weighting
WeightsSchemeBoltzmann(normalization=1.0, temperature=3000)
Gaussian weighting
WeightsSchemeGaussian(normalization='dim0', mean=0, stdev=0.01)
4.14.4.2. Examples of weights schemes
In this example there are
- 66 atoms
- 198 force components, all within the range [-0.06,0.06] Ha/bohr
The figures show the weight for each force component depending on its value, for different weight schemes.
The sum of the weights is printed in the figure title.
4.14.4.3. Weights schemes API
class WeightsScheme(normalization=1.0)
__init__(normalization=1.0)
Parent class for different weight schemes.
- normalization : float or str
‘numelements’: the sum of weights will equal the number of elements in the weights matrix
‘nonzero’: the sum of weights will equal the number of nonzero elements in the weights matrix
‘dim0’: the sum of weights will equal the length of the first dimension of the weights matrix (for example, the number of atoms if the data consists of forces)
‘dim1’: the sum of weights will equal the length of the second dimension of the weights matrix (for example, 3 if the data consists of forces)
float: the sum of weights will equal the given number
get_weights(arr)
Returns the weights for a given data matrix arr.
- arr : np.ndarray
A numpy array with data
normalize(weights, normalization=None)
This method does not modify the weights itself; it returns a number by which the weights are then multiplied. It is called like
weights *= self.normalize(weights)
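One way to read the normalization options together with `normalize` is as a target sum divided by the current sum of the weights. The sketch below assumes exactly that. `normalization_factor` is a hypothetical standalone function, not the ParAMS method, and raising an error for weights that sum to zero is an arbitrary choice made for this illustration.

```python
import numpy as np


def normalization_factor(weights, normalization=1.0):
    """Hypothetical helper: factor that scales `weights` to the requested sum."""
    weights = np.asarray(weights, dtype=float)
    if normalization == "numelements":
        target = weights.size                 # total number of elements
    elif normalization == "nonzero":
        target = np.count_nonzero(weights)    # number of nonzero weights
    elif normalization == "dim0":
        target = weights.shape[0]             # e.g. number of atoms for forces
    elif normalization == "dim1":
        target = weights.shape[1]             # e.g. 3 for forces
    else:
        target = float(normalization)         # explicit numeric target
    total = weights.sum()
    if total == 0:
        raise ValueError("cannot normalize weights that sum to zero")
    return target / total


# made-up unnormalized weights for 4 atoms x 3 force components
weights = np.array([[0.0, 0.0, 0.0],
                    [1.0, 2.0, 1.0],
                    [2.0, 2.0, 2.0],
                    [1.0, 1.0, 2.0]])

for option in ("numelements", "nonzero", "dim0", "dim1", 2.5):
    scaled = weights * normalization_factor(weights, option)
    print(option, scaled.sum())
```

The function returns a factor rather than scaling in place, which matches the calling pattern shown above.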
__str__()
Return str(self).
class WeightsSchemeUniform(normalization=1.0)
__init__(normalization=1.0)
All weights become the same.
get_weights(arr)
Returns the weights for a given data matrix arr.
- arr : np.ndarray
A numpy array with data
__str__()
Return str(self).
class WeightsSchemeClip(normalization=1.0, min=-inf, max=inf)
__init__(normalization=1.0, min=-inf, max=inf)
All weights for entries < min or > max become 0. The remaining weights all get the same value.
get_weights(arr)
Returns the weights for a given data matrix arr.
- arr : np.ndarray
A numpy array with data
__str__()
Return str(self).
class WeightsSchemeGaussian(normalization=1.0, mean=0.0, stdev=0.1, absolute=False)
__init__(normalization=1.0, mean=0.0, stdev=0.1, absolute=False)
Apply Gaussian weighting:
exp(-(arr-mean)**2/(2*stdev**2))
The result is then normalized according to normalization.
- normalization : float or str
See docs for WeightsScheme
- mean : float or str
Center of the Gaussian
‘max’: maximum value
‘min’: minimum value
float: numeric value
- stdev : float
Standard deviation (“sigma”) of the Gaussian
- absolute : bool
If True, there is a Gaussian centered at both +mean and -mean. The two distributions do not overlap: the weight of each datapoint is calculated from its nearest mean.
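The parameters above can be sketched in NumPy as follows. This is an unnormalized illustration under stated assumptions, not the ParAMS source: `gaussian_weights` is a hypothetical function, the 'max' and 'min' options are assumed to refer to the largest and smallest value in the data, and `absolute=True` is read as "use the distance to the nearer of +mean and -mean".

```python
import numpy as np


def gaussian_weights(arr, mean=0.0, stdev=0.1, absolute=False):
    """Hypothetical helper: unnormalized Gaussian weights for the data in arr."""
    arr = np.asarray(arr, dtype=float)
    if mean == "max":
        center = arr.max()      # assumed: center on the largest data value
    elif mean == "min":
        center = arr.min()      # assumed: center on the smallest data value
    else:
        center = float(mean)
    if absolute:
        # distance to whichever of +center and -center is nearer
        distance = np.minimum(np.abs(arr - center), np.abs(arr + center))
    else:
        distance = arr - center
    return np.exp(-distance**2 / (2 * stdev**2))


# made-up force components in the same unit as mean and stdev
forces = np.array([-0.03, 0.0, 0.03])

print(gaussian_weights(forces, mean=0.0, stdev=0.01))
print(gaussian_weights(forces, mean=0.03, stdev=0.01, absolute=True))
```

With `absolute=True`, components at -0.03 and +0.03 receive the same weight because each sits exactly on its nearest center.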
get_weights(arr)
Returns the weights for a given data matrix arr.
- arr : np.ndarray
A numpy array with data
__str__()
Return str(self).
class WeightsSchemeBoltzmann(normalization=1.0, temperature=5000, kB=3.167e-06, subtract_min=False)
__init__(normalization=1.0, temperature=5000, kB=3.167e-06, subtract_min=False)
Apply Boltzmann weighting.
If not subtract_min:
exp(-data/(kB*temperature))
If subtract_min:
exp(-(data-minimum)/(kB*temperature))
- normalization : float or str
Normalization scheme (see docs for WeightsScheme)
- temperature : float
Temperature in K
- kB : float
Boltzmann constant in data_unit/K. Default: the value in Ha/K.
- subtract_min : bool
Whether to subtract the minimum element before calculating the Boltzmann weights
Set subtract_min = True if you’re passing in raw (total) energies
Set subtract_min = False if you’re passing in relative energies (e.g. with the add_trajectory_singlepoints ResultsImporter)
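The advice on subtract_min can be illustrated with a short NumPy sketch of the two formulas above. `boltzmann_weights` is a hypothetical function that returns unnormalized weights, not the ParAMS implementation, and the energies are made-up values in Ha.

```python
import numpy as np

KB_HARTREE_PER_K = 3.167e-06  # default kB from the signature above, in Ha/K


def boltzmann_weights(data, temperature=5000, kB=KB_HARTREE_PER_K, subtract_min=False):
    """Hypothetical helper: unnormalized Boltzmann weights for the data."""
    data = np.asarray(data, dtype=float)
    if subtract_min:
        data = data - data.min()  # shift so the lowest energy becomes 0
    return np.exp(-data / (kB * temperature))


# made-up raw (total) energies in Ha
raw_energies = np.array([-76.40, -76.39, -76.38])

# without the shift the exponent is huge and exp() overflows to inf
with np.errstate(over="ignore"):
    overflowed = boltzmann_weights(raw_energies)

# with the shift the lowest-energy entry gets weight 1 and the rest decay
shifted = boltzmann_weights(raw_energies, subtract_min=True)

print(overflowed)
print(shifted)
```

For relative energies that already start near zero, the shift is unnecessary, which is consistent with the recommendation to leave subtract_min at False in that case.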
get_weights(arr)
Returns the weights for a given data matrix arr.
- arr : np.ndarray
A numpy array with data
__str__()
Return str(self).