5.2.7. Checkpoints
ParAMS now supports the creation of checkpoints. These are compressed files representing snapshots of an optimization at a particular point in time. You can resume an optimization from a checkpoint.
Note
Stoppers and ExitConditions are evaluated across checkpoints: progress made before a checkpoint still counts after resuming from it.
For example, consider an optimization in which we have asked for an exit after 5 optimizers have converged.
After 2000 function evaluations, a checkpoint was made (indicated by the vertical black line). If we resume the optimization from this checkpoint, it will appear to end after only 3 optimizers converge, because 2 optimizers had already converged before the checkpoint was made (indicated by asterisks).
Important exceptions to this are the TimeLimit and TimeLimitThroughRestarts exit conditions.
See here for more details.
Warning
It is not possible to resume from a checkpoint in which the exit conditions have already been met. However, any setting stored in a checkpoint can be overridden by adding input blocks for the settings you would like to replace.
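The carry-over described in the note above, together with the restriction in the warning, can be illustrated with a few lines of Python. This is only a sketch of the bookkeeping, not ParAMS code: the function name `optimizers_still_needed` and its arguments are hypothetical and chosen for this example.

```python
def optimizers_still_needed(target_converged: int, converged_in_checkpoint: int) -> int:
    """Return how many more optimizers must converge after resuming.

    Illustrative only: mimics an exit condition that counts converged
    optimizers and keeps its count across a checkpoint.
    """
    remaining = target_converged - converged_in_checkpoint
    if remaining <= 0:
        # Mirrors the warning: the exit condition is already satisfied,
        # so there is nothing left to resume.
        raise ValueError("exit condition already met in the checkpoint")
    return remaining


# Exit requested after 5 converged optimizers; 2 converged before the checkpoint.
print(optimizers_still_needed(5, 2))  # 3
```

In this picture, overriding a setting when resuming corresponds to calling the function with a larger target than the one stored in the checkpoint.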
5.2.7.1. Generate checkpoints
The CheckpointControl block controls the generation of checkpoints.
CheckpointControl
   AtEnd Yes/No
   AtInitialisation Yes/No
   CheckpointingDirectory string
   EveryFunctionCalls integer
   EverySeconds float
   KeepPast integer
   NamingFormat string
   RaiseFail Yes/No
End
CheckpointControl
- Type
Block
- GUI name
Checkpointing options:
- Description
Settings to control the production of checkpoints from which the optimization can be resumed.
AtEnd
- Type
Bool
- Default value
No
- GUI name
Checkpoint at end:
- Description
Create a checkpoint when the exit condition(s) are triggered.
AtInitialisation
- Type
Bool
- Default value
No
- GUI name
Checkpoint at start:
- Description
Create a checkpoint immediately at the start of an optimization.
CheckpointingDirectory
- Type
String
- Default value
- Description
Directory in which the checkpoints will be saved. Defaults to ‘checkpoints’ in the results directory.
EveryFunctionCalls
- Type
Integer
- GUI name
Checkpoint interval (function evaluations):
- Description
Create a checkpoint every n function evaluations. If not specified or -1, checkpoints are not created based on function calls.
EverySeconds
- Type
Float
- Default value
3600.0
- Unit
s
- GUI name
Checkpoint interval (seconds):
- Description
Create a checkpoint every n seconds. If not specified or -1, checkpoints are not created based on time.
KeepPast
- Type
Integer
- Default value
0
- GUI name
Number of older checkpoints to keep:
- Description
Number of earlier checkpoints to keep. Older ones are deleted when a new one is created. A value of -1 keeps all previous checkpoints, and 0 retains only the most recent checkpoint. This number excludes the most recent checkpoint, which is always retained, so the actual number of files is one larger than this number.
NamingFormat
- Type
String
- Default value
glompo_checkpoint_%(date)_%(time)
- Description
Convention used to name the checkpoints. The following special keys are supported:
• %(date): Current calendar date in YYYYMMDD format
• %(year): Year formatted to YYYY
• %(yr): Year formatted to YY
• %(month): Numerical month formatted to MM
• %(day): Calendar day of the month formatted to DD
• %(time): Current calendar time formatted to HHMMSS (24-hour style)
• %(hour): Hour formatted to HH (24-hour style)
• %(min): Minutes formatted to MM
• %(sec): Seconds formatted to SS
• %(count): Index count of the number of checkpoints constructed. Starts at zero, formatted to 3 digits.
RaiseFail
- Type
Bool
- Default value
No
- GUI name
Exit on failed checkpoint:
- Description
Raise an error and stop the optimization if a checkpoint fails to be constructed. Otherwise, issue a warning and continue the optimization.
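To make the NamingFormat keys and the KeepPast rule above more concrete, the following Python sketch reproduces their documented behaviour. It is an illustration under stated assumptions, not the ParAMS implementation: the helper names `expand_naming_format` and `prune_checkpoints`, the example timestamp, and the `ckpt_%(count)` format are all hypothetical.

```python
from datetime import datetime
from typing import List, Sequence


def expand_naming_format(fmt: str, now: datetime, count: int) -> str:
    """Expand the %(key) placeholders documented for NamingFormat (illustrative only)."""
    values = {
        "date": now.strftime("%Y%m%d"),   # YYYYMMDD
        "year": now.strftime("%Y"),       # YYYY
        "yr": now.strftime("%y"),         # YY
        "month": now.strftime("%m"),      # MM
        "day": now.strftime("%d"),        # DD
        "time": now.strftime("%H%M%S"),   # HHMMSS, 24-hour style
        "hour": now.strftime("%H"),       # HH, 24-hour style
        "min": now.strftime("%M"),        # MM
        "sec": now.strftime("%S"),        # SS
        "count": f"{count:03d}",          # starts at zero, 3 digits
    }
    for key, value in values.items():
        fmt = fmt.replace(f"%({key})", value)
    return fmt


def prune_checkpoints(names_oldest_first: Sequence[str], keep_past: int) -> List[str]:
    """Return the checkpoints that survive under the documented KeepPast rule."""
    if keep_past < 0:
        # -1: no previous checkpoints are deleted.
        return list(names_oldest_first)
    # The most recent checkpoint is always kept, plus keep_past earlier ones.
    return list(names_oldest_first[-(keep_past + 1):])


stamp = datetime(2024, 1, 31, 14, 5, 9)  # arbitrary example timestamp
print(expand_naming_format("glompo_checkpoint_%(date)_%(time)", stamp, 0))
# glompo_checkpoint_20240131_140509

names = [expand_naming_format("ckpt_%(count)", stamp, i) for i in range(4)]
print(prune_checkpoints(names, 1))   # ['ckpt_002', 'ckpt_003']
print(prune_checkpoints(names, 0))   # ['ckpt_003']
print(prune_checkpoints(names, -1))  # all four names are kept
```

The second example shows why the number of files on disk is one larger than KeepPast: with KeepPast set to 1, two checkpoints remain.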
5.2.7.2. Resume from checkpoint
The ResumeCheckpoint key is used to continue from previously stored checkpoints.
Note
When resuming from checkpoints, other input blocks can be used to override information stored in the checkpoint file.
ResumeCheckpoint
- Type
String
- Description
Path to checkpoint file from which a previous optimization can be resumed.
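When several checkpoints have been kept, it can be convenient to locate the newest one before writing the ResumeCheckpoint key. The Python sketch below does this with the standard library only. It assumes the default NamingFormat prefix `glompo_checkpoint_` and a hypothetical directory `results/checkpoints`; the helper names `latest_checkpoint` and `resume_line` are made up for this example and are not part of ParAMS.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(directory: str, prefix: str = "glompo_checkpoint_") -> Optional[Path]:
    """Return the most recently modified checkpoint file in `directory`, or None."""
    candidates = [p for p in Path(directory).glob(prefix + "*") if p.is_file()]
    if not candidates:
        return None
    # Use the file modification time rather than the name, since the
    # naming convention can be changed through NamingFormat.
    return max(candidates, key=lambda p: p.stat().st_mtime)


def resume_line(checkpoint: Path) -> str:
    """Format the input line that points ResumeCheckpoint at the given file."""
    return f"ResumeCheckpoint {checkpoint}"


if __name__ == "__main__":
    # "results/checkpoints" is a placeholder for your own checkpoint directory.
    newest = latest_checkpoint("results/checkpoints")
    if newest is None:
        print("No checkpoint found.")
    else:
        print(resume_line(newest))
```

Selecting by modification time is a design choice made for this sketch; if the naming format includes %(count) or a timestamp, sorting by file name would work equally well.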