6.5. Task: GenerateReference
If some training or validation set entries do not have reference values, set Task GenerateReference to generate a training_set.ref.yaml or validation_set.ref.yaml file with reference values.
Task GenerateReference will run all the needed jobs for any training set entries with missing reference values. The engine settings are determined by looking up the ReferenceEngineID for the job in the engine collection.
If you use the GUI, then the calculated reference values will automatically be loaded when the job has finished.
To continue from an interrupted run, you can set a RestartDirectory to the results/reference_jobs directory from the previous run.
RestartDirectory
- Type:
String
- Default value:
- GUI name:
Load jobs from:
- Description:
Specify a directory to continue interrupted GenerateReference or SinglePoint calculations. The directory depends on the task:
GenerateReference: results/reference_jobs
SinglePoint: results/single_point/jobs
Note: If you use the GUI, this directory will be COPIED into the results folder and its name will be prefixed with ‘dep-’. This can take up a lot of disk space, so you may want to remove the ‘dep-’ folder after the job has finished.
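The task-dependent paths above can be looked up with a small helper. This is only a sketch: the function name restart_directory and the previous_run_dir argument are hypothetical, and it assumes the previous run's results folder sits directly inside the directory you pass in.

```python
from pathlib import Path

# Sub-directories named in the RestartDirectory description, keyed by task.
RESTART_SUBDIRS = {
    "GenerateReference": Path("results") / "reference_jobs",
    "SinglePoint": Path("results") / "single_point" / "jobs",
}


def restart_directory(previous_run_dir, task):
    """Return the path to use as RestartDirectory for the given task.

    previous_run_dir is assumed (hypothetically) to be the folder that
    contains the previous run's results directory.
    """
    try:
        subdir = RESTART_SUBDIRS[task]
    except KeyError:
        raise ValueError(f"RestartDirectory is not described for task {task!r}")
    return Path(previous_run_dir) / subdir


print(restart_directory("previous_run", "GenerateReference"))
```

The helper only builds the path. It does not check that the directory exists or that the earlier run actually wrote jobs there.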
Parallelization options can be set with ParallelLevels. Note: Only ParallelLevels%Jobs, ParallelLevels%Processes, and ParallelLevels%Threads are used for GenerateReference jobs.
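As an illustration, the settings discussed in this section could be collected into an input fragment as sketched below. The helper generate_reference_input is hypothetical, and the block layout (a ParallelLevels block closed by End, with one key per line) is an assumption based on the usual AMS-style input syntax. The key names and default values are the ones documented in this section.

```python
def generate_reference_input(jobs=0, processes=1, threads=1, restart_directory=None):
    """Build an input fragment for Task GenerateReference.

    Hypothetical helper: the layout is assumed, the keys and defaults
    are the ones documented for this task.
    """
    lines = ["Task GenerateReference"]
    if restart_directory is not None:
        # Only needed when continuing from an interrupted run.
        lines.append(f"RestartDirectory {restart_directory}")
    lines += [
        "ParallelLevels",
        f"  Jobs {jobs}",
        f"  Processes {processes}",
        f"  Threads {threads}",
        "End",
    ]
    return "\n".join(lines)


print(generate_reference_input(jobs=4))
```

Only the three ParallelLevels keys that GenerateReference uses are written. Check the generated text against the input reference for your version before running it.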
ParallelLevels
- Type:
Block
- GUI name:
Parallelization distribution:
- Description:
Distribution of threads/processes between the parallelization levels.
Jobs
- Type:
Integer
- Default value:
0
- GUI name:
Jobs (per loss function evaluation)
- Description:
Number of JobCollection jobs to run in parallel for each loss function evaluation.
Processes
- Type:
Integer
- Default value:
1
- GUI name:
Processes (per Job)
- Description:
Number of processes (MPI ranks) to spawn for each JobCollection job. This effectively sets the NSCM environment variable for each job. A value of -1 will disable explicit setting of related variables. We recommend a value of 1 in almost all cases. A value greater than 1 would only be useful if you parametrize DFTB with a serial optimizer and have very few jobs in the job collection.
Threads
- Type:
Integer
- Default value:
1
- GUI name:
Threads (per Process)
- Description:
Number of threads to use for each of the processes. This effectively sets the OMP_NUM_THREADS environment variable. Note that the DFTB engine does not use threads, so for DFTB the value of this variable has no effect. We recommend always leaving it at the default value of 1. Please consult the manual of the engine you are parameterizing. A value of -1 will disable explicit setting of related variables.
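The effect of Processes and Threads on each job's environment can be pictured with the following sketch. It is not how the program is implemented: job_environment is a hypothetical function that only mirrors the behaviour described above, with Processes mapped to NSCM, Threads mapped to OMP_NUM_THREADS, and -1 leaving the corresponding variable unset.

```python
def job_environment(processes=1, threads=1):
    """Return the environment variables implied by Processes and Threads.

    Hypothetical illustration of the documented behaviour; a value of -1
    means the corresponding variable is not set explicitly.
    """
    env = {}
    if processes != -1:
        env["NSCM"] = str(processes)  # MPI ranks per job
    if threads != -1:
        env["OMP_NUM_THREADS"] = str(threads)  # threads per process
    return env


print(job_environment())        # documented defaults: both variables set to 1
print(job_environment(-1, -1))  # nothing set explicitly
```

With -1 for both options the returned dictionary is empty, so each job would simply inherit whatever values are already present in the calling environment.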