Outputs¶
GloMPO produces various types of results files, which can be configured via the GloMPOManager; all or none of them can be produced. A human-readable YAML summary file is the most basic record of the optimization. Image files of the optimizer trajectories can also be produced, as well as a compressed HDF5 file with all trajectory data and optimization metadata.
Several other outputs are also produced depending on the configuration of the optimization and the optimizers.
Real-Time Status Reports¶
GloMPO supports real-time status logging of an optimization. This can be directed to a file or the console (see Logging Messages).
Printstreams¶
Often optimizers will have been implemented by other developers, and their BaseOptimizer class will simply be a wrapper around this code rather than a new implementation of the algorithm. In such circumstances, it is likely that these optimizers will incorporate print statements in their code. When GloMPO runs multiple optimizers in parallel, this can create illegible console output as statements from different optimizers are interleaved. It also makes any GloMPO logging messages a user may have set up very difficult to follow and parse.
For this reason, the GloMPOManager.split_printstreams option is provided, which automatically redirects optimizer print statements to separate printstream_xxxx.out text files. Errors are similarly redirected to printstream_xxxx.err files, where xxxx is the four-digit representation of each optimizer’s unique identification number. All these files are stored in the glompo_optimizer_printstreams directory created in GloMPOManager.working_dir.
Note
If your optimizers can be silenced in another way or do not contain print statements, it is better to use split_printstreams = False when initializing the manager. This avoids creating a large number of empty files.
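When printstreams are split, it can be handy to check afterwards which optimizers actually wrote anything to their error stream. The following is a minimal sketch using only the standard library; it relies solely on the directory and file naming described above, and the helper names collect_printstreams and non_empty_errors are hypothetical (they are not part of GloMPO).

```python
import re
from pathlib import Path

# File name pattern described above: printstream_xxxx.out / printstream_xxxx.err
_PRINTSTREAM = re.compile(r"printstream_(\d{4})\.(out|err)$")


def collect_printstreams(working_dir):
    """Map optimizer ID -> {'out': Path, 'err': Path} for the redirected streams."""
    stream_dir = Path(working_dir) / "glompo_optimizer_printstreams"
    streams = {}
    for path in sorted(stream_dir.glob("printstream_*")):
        match = _PRINTSTREAM.match(path.name)
        if match:
            opt_id, kind = int(match.group(1)), match.group(2)
            streams.setdefault(opt_id, {})[kind] = path
    return streams


def non_empty_errors(working_dir):
    """Return the IDs of optimizers whose .err file contains any output."""
    return [opt_id
            for opt_id, files in collect_printstreams(working_dir).items()
            if "err" in files and files["err"].stat().st_size > 0]
```

Calling non_empty_errors on the manager's working directory then gives a short list of optimizers worth inspecting by hand.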
Checkpoints¶
GloMPO supports creating ‘snapshots’ of the optimization in time. The checkpoint files are compressed into a single tarball from which the optimization can be resumed (see Checkpointing).
Python Result Object¶
GloMPOManager.start_manager() returns a Result object with the final minimization result and some basic optimization metadata. This allows a user to continue some operations after an optimization within the same script.
Manager Summary File¶
The most informative human-readable GloMPO output is the glompo_manager_log.yml file (produced when GloMPOManager.summary_files \(\geq\) 1). An example can be downloaded here and is shown below. It includes all GloMPO settings, the final result, computational resources used, checkpoints created, as well as time and date information. These files can also be loaded with a YAML parser at a later stage and their contents accessed like a dictionary.
Important
The manager summary file includes information about CPU usage, memory usage and system load. This is a useful check that the function is being parallelized correctly. Note that CPU usage and memory usage are reported at the process level, whereas system load is reported at the system level. The system load information is therefore only meaningful if GloMPO is the only application running on the entire system. In distributed computing systems where GloMPO is only given access to a portion of a node, this information is not useful because it is conflated with the usage of other users.
The quality of this output is limited by the psutil version installed in the Python environment and various system limitations detailed in that package’s documentation.
Assignment:
  Task: Schwefel
  Working Dir: /home/user53/glompo_runs/run_008
  Username: user53
  Hostname: node3263
  Time:
    optimization Periods:
      - Start: '2020-11-10 15:22:18.412977'
        End: '2020-11-10 15:22:56.375724'
    Total: '0:00:37.962747'
    Session: '0:00:37.962746'
Settings:
  x0 Generator:
    Generator: RandomGenerator
    n_params: 20
  Exit Conditions: |-
    MaxFuncCalls(fmax=121116)
  Stoppers: |-
    [
     [
      EvaluationsUnmoving(calls=500, tol=0.01) &
      ValueAnnealing(crit_stop_chance)
     ] |
     BestUnmoving(calls=8074, tol=0.2)
    ] |
    ParameterDistance(bounds, relative_distance=0.05, test_all=False)
  Optimizer Selector:
    Selector: CycleSelector
    Allow Spawn:
      IterSpawnStop:
        max_calls: 109004
    Available Optimizers:
      0:
        type: CMAOptimizer
        init_kwargs:
          workers: 1
          popsize: 12
        call_kwargs:
          sigma0: 500.0
  Max Jobs: 4
  Bounds:
    (-500, 500): [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
      19 ]
Counters:
  Function Evaluations: 121200
  Times Stoppers Evaluated: 10043
  Optimizers:
    Started: 18
    Stopped: 16
    Converged: 1
Run Information:
  Memory:
    Used:
      Max: --
      Ave: --
    Available: 92.78GB
  CPU:
    Cores:
      Total: 12
      IDs: [ 1, 34, 35, 5, 6, 10, 22, 23, 24, 26, 27, 30 ]
    Frequency: 0.0GHz
    Load:
      Average: [ 0 ]
      Std. Dev.: [ 0 ]
    CPU Usage(%):
      Average: 0
      Std. Dev.: 0
Solution:
  fx: -7371.403964491363
  origin:
    opt_id: 5
    type: CMAOptimizer
  exit cond.: |-
    MaxFuncCalls(fmax=121116) = True
  x: [ 420.96874696682596, 420.96874566727126, 420.96874680120084, 420.9687469639721,
    420.96874537846645, 420.9687477762165, 420.96874677352207, -500.0, -124.8293560430916,
    420.96874705250616, 420.9687462991201, 420.9687473178516, -302.52493527156935,
    420.9687464478644, 420.9687462591337, -302.52493479109353, -302.5249353300162,
    -302.52493489667165, 420.96874590791356, 420.96874612415513 ]
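Since the summary file can be read back like a dictionary, a few headline numbers can be pulled out after a run. The sketch below assumes the third-party PyYAML package and assumes that the Counters and Solution entries nest as in the example file; load_summary and optimizer_outcomes are hypothetical helper names, not GloMPO functions.

```python
def load_summary(path):
    """Load a manager summary file (requires the third-party PyYAML package)."""
    import yaml  # imported lazily so optimizer_outcomes works without PyYAML

    with open(path, "r") as stream:
        return yaml.safe_load(stream)


def optimizer_outcomes(summary):
    """Extract a few headline numbers from an already-loaded summary dictionary."""
    counters = summary["Counters"]
    optimizers = counters["Optimizers"]
    started = optimizers["Started"]
    return {
        "evaluations": counters["Function Evaluations"],
        "started": started,
        # Fraction of started optimizers that were shut down by the manager
        "stopped_fraction": optimizers["Stopped"] / started if started else 0.0,
        "best_fx": summary["Solution"]["fx"],
    }
```

A typical call would be optimizer_outcomes(load_summary("glompo_manager_log.yml")), with the path adjusted to the working directory of the run.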
Plots¶
The best way to get an overall sense of how the optimization proceeded is by using the summary trajectory plots (produced when GloMPOManager.summary_files \(\geq\) 2). This requires matplotlib. Two plots are produced: trajectories.png and trajectories_best.png. The former shows the actual function evaluation results and the latter shows the best value found so far by each optimizer. Below is an example of such a trajectory plot:
Optimizer Trajectory Log File¶
The most detailed output is stored in a compressed HDF5 file (produced when GloMPOManager.summary_files \(\geq\) 3). This includes all iteration and metadata information from the optimizers themselves. This file also contains all the manager metadata; in this way all information from an optimization can be accessed from one location. To work with these files within a Python environment, we recommend loading them with PyTables. To explore a file in a user-friendly GUI, we recommend using the vitables package.
It is within this HDF5 file that BaseFunction.detailed_call() information is saved, if this method is being used.
HDF5 log files have the following structure:
glompo_log.h5
|   This object contains manager and general optimization metadata in its attributes object.
|
+-- optimizer_1
|   |   This contains optimizer specific metadata in its attributes object.
|   |
|   +-- messages
|   |       An array of strings each representing a message sent by the optimizer to the manager.
|   |
|   +-- iter_history
|           A table of iteration results from the optimizer with the following columns:
|               call_id
|                   Unique iteration identifier across the entire optimization.
|                   The universal function evaluation number.
|               x
|                   Input space vectors.
|               fx
|                   Function results.
|               <others>
|                   Extras if the detailed_call method is used and returns extra information
|
+-- optimizer_2
|   +-- messages
|   +-- iter_history
+-- optimizer_3
|   +-- messages
|   +-- iter_history
.
.
.
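Given the layout above, a single column of an optimizer's iteration table can be read with PyTables roughly as follows. This is a sketch that assumes the third-party tables package is installed and that the nodes are named exactly as in the tree (optimizer_<id>/iter_history); iter_history_path and read_column are hypothetical helper names.

```python
def iter_history_path(opt_id):
    """Build the HDF5 node path of an optimizer's iteration table."""
    return f"/optimizer_{opt_id}/iter_history"


def read_column(h5_path, opt_id, column):
    """Read one column (for example 'fx' or 'call_id') from the log file."""
    import tables  # PyTables; imported lazily as it is a third-party package

    with tables.open_file(h5_path, mode="r") as h5_file:
        table = h5_file.get_node(iter_history_path(opt_id))
        # Table.col returns the whole column as a NumPy array
        return table.col(column)
```

For instance, read_column("glompo_log.h5", 1, "fx") would return the function values recorded for the first optimizer.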
Logger Classes¶
Within the GloMPO framework, the iteration history is tracked by a BaseLogger or a FileLogger. A BaseLogger holds all information in memory only, for use by the Stoppers and other GloMPO decision making; it is used when an HDF5 log file has not been requested by the user.
In contrast, a FileLogger (which is a child of BaseLogger) maintains records in memory (for fast access) while simultaneously writing data to file during the optimization.
Neither of these classes should ever need to be accessed directly by users. They are set up and controlled directly by the GloMPOManager, but their documentation is provided here for reference.
- class BaseLogger(n_parms, expected_rows, build_traj_plot)[source]¶
Holds iteration results in memory for faster access.
- Parameters:
- n_parms
Number of parameters in the domain of the optimization problem.
- expected_rows
Estimated number of rows in each optimizer log file. Estimated by GloMPOManager based on exit conditions and dimensionality of the optimization task.
- build_traj_plot
Flag the logger to hold trajectories in memory to construct the summary image.
- Attributes:
- build_traj_plot
True if the user has asked for a trajectory plot at the end of the optimization. Used to decide whether to hold all iterations in memory or purge them during the optimization when they would no longer be needed for Stopper purposes.
- add_iter_history(opt_id, extra_headers=None)[source]¶
Extends iteration history with all the columns required, including possible detailed calls.
- property best_iters¶
Dictionary of the best iterations for each optimizer.
- classmethod checkpoint_load(path)[source]¶
Construct a new BaseLogger from the attributes saved in the checkpoint file located at path.
- checkpoint_save(path='', block=None)[source]¶
Saves the state of the logger, suitable for resumption, during a checkpoint.
- Parameters:
- path
Directory in which to dump the generated files.
- block
Iterable of class attributes which should not be included in the log.
- clear_cache(opt_id=None)[source]¶
Removes all data associated with opt_id from memory. The data is not cleared if a summary trajectory plot has been configured.
- get_best_iter(opt_id=None)[source]¶
Returns the overall best record in history if opt_id is not provided. If it is, the best iteration of the corresponding optimizer is returned.
- get_history(opt_id, track)[source]¶
Returns data from the evaluation history of an optimizer.
- Parameters:
- opt_id
Unique optimizer identifier.
- track
Column name to return. Any column name in the logfile can be used. The following are always present:
'call_id': The overall evaluation number across all function calls.
'x': Input vectors evaluated by the optimizer.
'fx': The function response for each iteration.
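To make the lookup semantics of get_best_iter and get_history concrete, here is a toy in-memory stand-in. It is not GloMPO's implementation: the class name ToyLogger, the put method, and the dictionary row layout are all assumptions made for illustration, and only the three always-present columns are modelled.

```python
class ToyLogger:
    """Toy illustration of per-optimizer iteration histories held in memory."""

    def __init__(self):
        self._history = {}  # opt_id -> list of row dictionaries

    def put(self, opt_id, call_id, x, fx):
        """Record one function evaluation (hypothetical method name)."""
        row = {"call_id": call_id, "x": list(x), "fx": fx}
        self._history.setdefault(opt_id, []).append(row)

    def get_history(self, opt_id, track):
        """Return one column ('call_id', 'x' or 'fx') for a single optimizer."""
        return [row[track] for row in self._history.get(opt_id, [])]

    def get_best_iter(self, opt_id=None):
        """Best row of one optimizer, or of all optimizers if opt_id is None."""
        if opt_id is None:
            rows = [row for rows in self._history.values() for row in rows]
        else:
            rows = self._history.get(opt_id, [])
        return min(rows, key=lambda row: row["fx"]) if rows else None
```

With two optimizers recorded, get_best_iter() compares rows across both, while get_best_iter(1) only considers the rows of optimizer 1.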
- has_iter_history(opt_id)[source]¶
Returns True if an iteration history table has been constructed for optimizer opt_id.
- property largest_eval¶
Returns the largest (finite) function evaluation processed thus far.
- property n_optimizers¶
Returns the number of optimizers in the log.
- plot_optimizer_trials(path=None, opt_id=None)[source]¶
Generates plots of parameter value versus optimizer function evaluation number for each parameter of input space.
- Parameters:
- path
Path to directory into which the image/s will be saved.
- opt_id
Optimizer for which the plot should be made. If None, plots will be made for all optimizers.
- plot_trajectory(title, log_scale=False, best_fx=False)[source]¶
Generates a plot of function values versus the overall function evaluation number.
- Parameters:
- title
Path to file to which the plot should be saved.
- log_scale
If True the function evaluations will be converted to base 10 log values.
- best_fx
If True the best function evaluation seen thus far by each optimizer will be plotted rather than the function evaluation at the matching evaluation number.
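The difference between the raw trajectory and the best_fx trajectory is a running minimum over each optimizer's function values. A minimal standard-library sketch of that transformation follows; best_so_far is a hypothetical name, and this illustrates the idea rather than GloMPO's plotting code.

```python
from itertools import accumulate


def best_so_far(fx_values):
    """Replace each function value with the lowest value seen up to that point."""
    return list(accumulate(fx_values, min))


raw = [5.0, 3.0, 4.0, 1.0, 2.0]
print(best_so_far(raw))  # [5.0, 3.0, 3.0, 1.0, 1.0]
```

The resulting series never increases, which is why a best-value plot is easier to read when the raw evaluations are noisy.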
- class FileLogger(n_parms, expected_rows, build_traj_plot)[source]¶
Bases: BaseLogger
Extends the BaseLogger to write progress of GloMPO optimizers to disk in HDF5 format through PyTables. Results of living optimizers are still held in memory for optimizer Stopping.
- add_iter_history(opt_id, extra_headers=None)[source]¶
Creates an iteration history table in the HDF5 file.
- add_optimizer(opt_id, opt_type, t_start)[source]¶
Creates an HDF5 file and memory log for a new optimizer.
- classmethod checkpoint_load(path)[source]¶
Construct a new FileLogger from the attributes saved in the checkpoint file located at path.
- clear_cache(opt_id=None)[source]¶
Clears information held in the cache for Stopping purposes. If opt_id is provided then the corresponding optimizer is closed, else all optimizers are closed in this way.
- flush(opt_id=None)[source]¶
Writes iterations held in chunks to disk. If opt_id is provided then the corresponding optimizer is closed, else all optimizers are closed in this way.
- open(path, mode, checksum)[source]¶
Opens or creates the HDF5 file.
- Parameters:
- path
File path in which to construct the logfile.
- mode
The open mode of the file.
'w' and 'a' modes are supported.
- checksum
Unique checksum value generated by GloMPOManager and stored in checkpoints and the logfile. When a checkpoint is loaded, GloMPO will confirm a match between the checksum value in the checkpoint and in the logfile before using it (see Checkpointing).