3.13.1. Active Parameter Search
This class reduces the dimensionality of the parameter search space by performing a sensitivity analysis on the active parameters, either individually or in small groups.
Synopsis
>>> ff = ReaxParams('path/to/ffield.ff')
>>> ds = DataSet('path/to/dataset.yml')
>>> jc = JobCollection('path/to/jobcol.yml')
>>> aps = ActiveParameterSearch(ff, ds, jc)
>>> ids, fx = aps.scan(steps=[1.1], dim=1, verbose=True)
>>> ff.is_active = aps.get_is_active(n=20)
scan() returns the scanned ids of the active subset and the corresponding loss function values.
>>> # Set only the first three parameters to active:
>>> ff.is_active = len(ff)*[False]
>>> for i in range(3):
...     ff[i].is_active = True
>>> len(ff.active)
3
>>> aps = ActiveParameterSearch(ff, ds, jc)
>>> aps.scan()
(array([[0],
[1],
[2]]), array([[[-0.16769481]],
[[ 0.33069672]],
[[-0.09795433]]]))
The first return value contains the scanned ids, the second an array of the corresponding loss function values.
The parameter search can also scan combinations of several active parameters at once, rather than scanning every parameter individually:
>>> aps.scan(dim=2)
(array([[0, 1],
[0, 2],
[1, 2]]), array([[-0.28081611],
[ 0.02706811],
[-0.2683532 ]]))
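The ids returned for dim=2 above are the index pairs of a three-parameter active subset. A minimal sketch of how such combinations can be enumerated with the standard library follows; the helper name scan_ids is hypothetical and is not part of ActiveParameterSearch, and the library's internal implementation may differ.

```python
from itertools import combinations
from math import comb


def scan_ids(n_active, dim=1):
    """Enumerate all index groups of size `dim` over `n_active` parameters.

    Hypothetical helper for illustration only, not the library's code.
    """
    if not 1 <= dim <= n_active:
        raise ValueError("dim must satisfy 1 <= dim <= n_active")
    return [list(group) for group in combinations(range(n_active), dim)]


print(scan_ids(3, dim=1))  # [[0], [1], [2]]
print(scan_ids(3, dim=2))  # [[0, 1], [0, 2], [1, 2]]

# The number of groups grows as a binomial coefficient,
# e.g. 20 active parameters scanned in pairs:
print(comb(20, 2))  # 190
```

The binomial growth is the reason a scan with dim > 1 can become costly for large active subsets.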
The number of steps and their sizes are set with the steps argument. Each entry is a multiplier applied to the initial parameters, generating a new set from \(\boldsymbol{x}_\mathrm{scaled} = \mathrm{scale} \cdot \boldsymbol{x}_0\).
>>> aps.scan(steps=[0.9,1.2])
(array([[0],
[1],
[2]]), array([[-0.55754578, -0.26966971],
[-0.21234735, -0.19213127],
[-0.16746101, -0.19213127]]))
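To make the effect of steps concrete, the following sketch builds the modified parameter vectors for each scanned id group and each step. It assumes that only the scanned parameters are scaled while all others keep their initial values, and it applies the rule documented for scan() that zero-valued parameters are shifted by (step-1) instead of being scaled. The function scaled_parameter_sets is hypothetical and not part of the library.

```python
def scaled_parameter_sets(x0, ids, steps):
    """Return modified copies of x0, indexed as result[id_group][step].

    Illustrative sketch only; assumes unscanned parameters stay at x0.
    """
    result = []
    for group in ids:
        per_step = []
        for step in steps:
            x = list(x0)
            for i in group:
                if x0[i] != 0:
                    x[i] = x0[i] * step          # regular case: scale
                else:
                    x[i] = x0[i] + (step - 1)    # zero value: shift instead
            per_step.append(x)
        result.append(per_step)
    return result


x0 = [2.0, 0.0, -4.0]
sets = scaled_parameter_sets(x0, ids=[[0], [1], [2]], steps=[0.5, 1.5])
print(sets[0])  # [[1.0, 0.0, -4.0], [3.0, 0.0, -4.0]]
print(sets[1])  # [[2.0, -0.5, -4.0], [2.0, 0.5, -4.0]]
```

Each of these vectors would then be evaluated with the loss function, giving one fx value per id group, step and data set.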
The results are also stored in the attributes fx0, ids and fx after scan() has been called:
>>> aps.ids
array([[0],
[1],
[2]])
>>> aps.fx
array([[[-0.55754578, -0.26966971]],
[[-0.21234735, -0.19213127]],
[[-0.16746101, -0.19213127]]])
For relative sensitivities, divide by the fx0 attribute (here ds_id is the index of the data set of interest):
>>> rel_fx = aps.fx[:,:,ds_id] / aps.fx0[ds_id]
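The division above can be illustrated without the library by using nested lists in place of the arrays. The sketch assumes the fx layout (len(ids), len(steps), len(datasets)) documented for the return value of scan(); the numbers are made up for illustration and relative_fx is a hypothetical helper.

```python
def relative_fx(fx, fx0, ds_id=0):
    """Divide every fx value of one data set by that data set's fx0.

    fx is nested as fx[id_group][step][dataset]; fx0 holds one value per data set.
    Illustrative sketch only, with made-up inputs.
    """
    return [[step_vals[ds_id] / fx0[ds_id] for step_vals in per_id]
            for per_id in fx]


fx0 = [2.0, 4.0]                       # initial loss for two data sets
fx = [[[1.0, 4.0], [3.0, 8.0]],        # id group 0: two steps, two data sets
      [[2.0, 2.0], [2.0, 6.0]]]        # id group 1

print(relative_fx(fx, fx0, ds_id=0))   # [[0.5, 1.5], [1.0, 1.0]]
print(relative_fx(fx, fx0, ds_id=1))   # [[1.0, 2.0], [0.5, 1.5]]
```

A relative value of 1.0 means the loss did not change for that step, while values further from 1.0 indicate a more sensitive parameter.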
Once a scan is complete, get_is_active() returns a list of bools that can be assigned to the parameter interface’s is_active attribute:
>>> ff.is_active = aps.get_is_active(n=20)
Multiple Data Sets can be evaluated with one Parameter Search instance, provided that all of them can be calculated with the same Job Collection.
To do so, pass a list of data sets when instantiating.
This results in the attribute shapes fx0.shape == (len(datasets),) and fx.shape == (len(ids), len(steps), len(datasets)), matching the shape documented for the return value of scan().
>>> aps = ActiveParameterSearch(ff, [ds1, ds2], jc)
>>> ids, fx = aps.scan()
>>> fx_ds1 = fx[:,:,0] # select scanned results of the first data set
>>> fx_ds2 = fx[:,:,1] # select scanned results of the second data set
In such cases, the dataset_id argument of get_is_active() specifies which data set's results are used for the evaluation:
>>> aps = ActiveParameterSearch(ff, [ds1, ds2], jc)
>>> ids, fx = aps.scan()
>>> active_based_on_ds1 = aps.get_is_active(10, dataset_id=0)
>>> active_based_on_ds2 = aps.get_is_active(10, dataset_id=1)
API
-
class ActiveParameterSearch(parameterinterface, datasets, jobcollection, file=None)
Scans for the most sensitive parameters of a ParameterInterface instance, given a Data Set.
Note
Will only scan the active subset of parameters.
The following are available after scan() has been called:
Attributes:
- fx0 : float
  The fx value of the initial parameters
- ids : ndarray
  The last return value of scan()[0]
- fx : ndarray
  The last return value of scan()[1]
-
__init__(parameterinterface, datasets, jobcollection, file=None)
Initialize a Parameter Search instance with the given interface, datasets and jobcollection.
Previous results can be loaded by providing the optional file argument.
The datasets argument can either be a single DataSet instance or a list of them. The latter assumes that all Data Sets in the list can be calculated from the jobcollection. If multiple Data Sets are provided, the dataset_id argument of the get_is_active() method can be used to specify which of the sets is used for the best parameter evaluation.
-
scan(steps: Sequence = [1.05], dim=1, loss='sse', parallel=None, verbose=True)
Start the scan.
Note
Parameters that have a value of zero will be shifted by (step-1) instead.
After calling this method, the get_is_active() and save() methods can be called.
Parameters:
- steps : Sequence[float]
  Number of steps and the respective scaling for each step
- dim : 1 <= int <= len(parameterinterface.active)
  If dim > 1, will scan dim parameters at once on a combinatorial grid of len(parameters) choose dim points. Possibly costly, as \(N_\mathrm{evals} = \binom{N_\mathrm{params}}{\mathrm{dim}}\).
- loss : str, Loss
  The Loss function to be used for the Data Set evaluation.
- parallel : ParallelLevels
  Calculate parallel.parametervectors parameter sets at once, each set running parallel.jobs jobs in parallel. Defaults to ParallelLevels(parametervectors=NCPU).
Returns:
- self.ids : ndarray
  2d array of indices for the parameterinterface.active subset of parameters; each element i maps to the scanned parameter(s) of parameterinterface.active[i].
- self.fx : ndarray
  Array of shape (len(ids), len(steps), len(datasets)). Contains the fitness function values for the modified parameter sets, in the same order as ids. Will contain multiple fx values per entry if len(steps) > 1.
-
get_is_active(n: Union[int, slice], dataset_id: int = 0, mode: str = 'highest_absolute') → List
Can only be called after scan().
Given the initial parameter interface, return the ParameterInterface.is_active attribute with the n most sensitive parameters marked as active. The returned List can be used to set the parameter interface:
>>> params.is_active = ActiveParameterSearch.get_is_active(10)
Valid mode argument values are
- 'lowest_relative': Will determine the best parameters by selecting the lowest values as determined by (fx/fx0).mean(-1)
- 'highest_absolute': Will determine the best parameters by selecting the highest values as determined by abs(fx-fx0).mean(-1)
If dim>1 was requested during the scan, the number of active parameters will be equal to the number of unique parameters among the n selected combinations, which is at most dim*n.
When multiple datasets have been provided at init, the dataset_id can be used to specify which of the sets should be used for the best parameters evaluation.
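The two mode values can be illustrated with a small stand-alone sketch. It is not the library's implementation: the function sketch_is_active is hypothetical, it only handles an integer n (not a slice), it assumes the mean is taken over the steps of the selected data set, and it assumes that ids index directly into a parameter list of length n_params.

```python
def sketch_is_active(n_params, ids, fx, fx0, n, dataset_id=0,
                     mode="highest_absolute"):
    """Return a list of bools marking the n most sensitive id groups.

    fx is nested as fx[id_group][step][dataset]; fx0 holds one value per data set.
    Illustrative sketch under the assumptions stated in the text above.
    """
    f0 = fx0[dataset_id]
    scores = []
    for per_id in fx:
        vals = [step_vals[dataset_id] for step_vals in per_id]
        if mode == "highest_absolute":
            scores.append(sum(abs(v - f0) for v in vals) / len(vals))
        elif mode == "lowest_relative":
            scores.append(sum(v / f0 for v in vals) / len(vals))
        else:
            raise ValueError(f"unknown mode: {mode!r}")

    # Highest scores first for 'highest_absolute', lowest first otherwise.
    descending = mode == "highest_absolute"
    order = sorted(range(len(ids)), key=lambda k: scores[k], reverse=descending)

    # With dim > 1 each selected group contributes several parameters,
    # so the union can hold up to dim * n unique indices.
    chosen = set()
    for k in order[:n]:
        chosen.update(ids[k])
    return [i in chosen for i in range(n_params)]


ids = [[0], [1], [2]]
fx0 = [1.0]
fx = [[[1.5]], [[1.0]], [[3.0]]]
print(sketch_is_active(3, ids, fx, fx0, n=1))
# [False, False, True]
print(sketch_is_active(3, ids, fx, fx0, n=1, mode="lowest_relative"))
# [False, True, False]
```

In this made-up example, parameter 2 changes the loss the most in absolute terms, while parameter 1 yields the lowest loss relative to fx0.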
-
save(fname)
Saves ids, fx0 and fx to fname
-
static load(fname)
Loads and returns a triplet of ids, fx and fx0 from fname