3.6.1. CMA-ES¶
Implementation of the Covariance Matrix Adaptation Evolution Strategy [1]. The optimizer can be used out of the box:
>>> optimizer = CMAOptimizer(sigma=0.15, popsize=15)
>>> runner = Optimization('name', JobCollection, DataSet, interface, optimizer)
>>> runner.optimize()
>>> print(runner.results.x)
array([x0, ..., xN])
>>> print(runner.results.fx)
1234.5
class CMAOptimizer(sigma=0.5, sampler='full', verbose=True, **cmasettings)¶
Implementation of the Covariance Matrix Adaptation Evolution Strategy.
- Home: http://cma.gforge.inria.fr/
- Module: https://github.com/CMA-ES/pycma
Parameters:
- sigma : float, optional
  Standard deviation for all parameters (all parameters must be scaled accordingly).
  Defines the search space as the standard deviation around an initial x0; larger values sample a wider Gaussian.
- sampler : str, optional, one of 'full' (default), 'vkd' or 'vd'
  Allows the use of the GaussVDSampler and GaussVkDSampler settings.
- verbose : bool
  Produce additional output on the status of the optimization.
- cmasettings : optional kwargs
  cma module-specific settings as key-value pairs. See cma.s.pprint(cma.CMAOptions()) (below) for a list of available options. The most useful keys are timeout, tolstagnation and popsize. Additionally, the following options are supported through this class (see the example below):
  - minsigma: terminate if sigma < minsigma.
  - maxsigma: limit sigma to min(sigma, maxsigma); defaults to 10*sigma.
__init__(sigma=0.5, sampler='full', verbose=True, **cmasettings)¶
Initialize the optimizer with the above parameters.
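The parameters above can be combined freely at construction time. A minimal sketch (all values are illustrative only, not recommendations) passing a few of the commonly useful cma options (popsize, timeout, tolstagnation) together with the class-specific minsigma key:
>>> optimizer = CMAOptimizer(
...     sigma=0.2,           # standard deviation of the initial search distribution (parameters scaled accordingly)
...     sampler='vkd',       # use the GaussVkDSampler settings instead of the default full sampler
...     popsize=20,          # cma option: number of new solutions per iteration
...     timeout=2 * 60**2,   # cma option: stop after two hours (in seconds)
...     tolstagnation=500,   # cma option: stop if no improvement over 500 iterations
...     minsigma=1e-6,       # handled by this class: terminate once sigma < 1e-6
... )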
3.6.1.1. List of valid cmasettings¶
>>> import cma
>>> cma.s.pprint(cma.CMAOptions())
AdaptSigma True or False or any CMAAdaptSigmaBase class e.g. CMAAdaptSigmaTPA, CMAAdaptSigmaCSA
CMA_active True negative update, conducted after the original update
CMA_cmean 1 learning rate for the mean value
CMA_const_trace False normalize trace, 1, True, "arithm", "geom", "aeig", "geig" are valid
CMA_diagonal 0*100*N/popsize**0.5 nb of iterations with diagonal covariance matrix, True for always
CMA_eigenmethod np.linalg.eigh or cma.utils.eig or pygsl.eigen.eigenvectors
CMA_elitist False or "initial" or True, elitism likely impairs global search performance
CMA_mirrors popsize < 6 values <0.5 are interpreted as fraction, values >1 as numbers (rounded), otherwise about 0.16 is used
CMA_mirrormethod 2 0=unconditional, 1=selective, 2=selective with delay
CMA_mu None parents selection parameter, default is popsize // 2
CMA_on 1 multiplier for all covariance matrix updates
CMA_sampler None a class or instance that implements the interface of `cma.interfaces.StatisticalModelSamplerWithZeroMeanBaseClass`
CMA_sampler_options {} options passed to `CMA_sampler` class init as keyword arguments
CMA_rankmu 1.0 multiplier for rank-mu update learning rate of covariance matrix
CMA_rankone 1.0 multiplier for rank-one update learning rate of covariance matrix
CMA_recombination_weights None a list, see class RecombinationWeights, overwrites CMA_mu and popsize options
CMA_dampsvec_fac np.Inf tentative and subject to changes, 0.5 would be a "default" damping for sigma vector update
CMA_dampsvec_fade 0.1 tentative fading out parameter for sigma vector update
CMA_teststds None factors for non-isotropic initial distr. of C, mainly for test purpose, see CMA_stds for production
CMA_stds None multipliers for sigma0 in each coordinate, not represented in C, makes scaling_of_variables obsolete
CSA_dampfac 1 positive multiplier for step-size damping, 0.3 is close to optimal on the sphere
CSA_damp_mueff_exponent 0.5 zero would mean no dependency of damping on mueff, useful with CSA_disregard_length option
CSA_disregard_length False True is untested, also changes respective parameters
CSA_clip_length_value None poorly tested, [0, 0] means const length N**0.5, [-1, 1] allows a variation of +- N/(N+2), etc.
CSA_squared False use squared length for sigma-adaptation
BoundaryHandler BoundTransform or BoundPenalty, unused when ``bounds in (None, [None, None])``
bounds [None, None] lower (=bounds[0]) and upper domain boundaries, each a scalar or a list/vector
conditioncov_alleviate [1e8, 1e12] when to alleviate the condition in the coordinates and in main axes
fixed_variables None dictionary with index-value pairs like {0:1.1, 2:0.1} that are not optimized
ftarget -inf target function value, minimization
integer_variables [] index list, invokes basic integer handling: prevent std dev to become too small in the given variables
is_feasible is_feasible a function that computes feasibility, by default lambda x, f: f not in (None, np.NaN)
maxfevals inf maximum number of function evaluations
maxiter 100 + 150 * (N+3)**2 // popsize**0.5 maximum number of iterations
mean_shift_line_samples False sample two new solutions colinear to previous mean shift
mindx 0 minimal std in any arbitrary direction, cave interference with tol*
minstd 0 minimal std (scalar or vector) in any coordinate direction, cave interference with tol*
maxstd inf maximal std in any coordinate direction
pc_line_samples False one line sample along the evolution path pc
popsize 4+int(3*np.log(N)) population size, AKA lambda, number of new solution per iteration
randn np.random.randn randn(lam, N) must return an np.array of shape (lam, N), see also cma.utilities.math.randhss
scaling_of_variables None depreciated, rather use fitness_transformations.ScaleCoordinates instead (or possibly CMA_stds).
Scale for each variable in that effective_sigma0 = sigma0*scaling. Internally the variables are divided by scaling_of_variables and sigma is unchanged, default is `np.ones(N)`
seed time random number seed for `numpy.random`; `None` and `0` equate to `time`, `np.nan` means "do nothing", see also option "randn"
signals_filename None cma_signals.in
termination_callback None a function returning True for termination, called in `stop` with `self` as argument, could be abused for side effects
timeout inf stop if timeout seconds are exceeded, the string "2.5 * 60**2" evaluates to 2 hours and 30 minutes
tolconditioncov 1e14 stop if the condition of the covariance matrix is above `tolconditioncov`
tolfacupx 1e3 termination when step-size increases by tolfacupx (diverges). That is, the initial step-size was chosen far too small and better solutions were found far away from the initial solution x0
tolupsigma 1e20 sigma/sigma0 > tolupsigma * max(eivenvals(C)**0.5) indicates "creeping behavior" with usually minor improvements
tolfun 1e-11 termination criterion: tolerance in function value, quite useful
tolfunhist 1e-12 termination criterion: tolerance in function value history
tolstagnation int(100 + 100 * N**1.5 / popsize) termination if no improvement over tolstagnation iterations
tolx 1e-11 termination criterion: tolerance in x-changes
transformation None depreciated, use cma.fitness_transformations.FitnessTransformation instead.
[t0, t1] are two mappings, t0 transforms solutions from CMA-representation to f-representation (tf_pheno),
t1 is the (optional) back transformation, see class GenoPheno
typical_x None used with scaling_of_variables
updatecovwait None number of iterations without distribution update, name is subject to future changes
verbose 3 verbosity e.g. of initial/final message, -1 is very quiet, -9 maximally quiet, may not be fully implemented
verb_append 0 initial evaluation counter, if append, do not overwrite output files
verb_disp 100 verbosity: display console output every verb_disp iteration
verb_filenameprefix outcmaes output filenames prefix
verb_log 1 verbosity: write data to files every verb_log iteration, writing can be time critical on fast to evaluate functions
verb_plot 0 in fmin(): plot() is called every verb_plot iteration
verb_time True output timings on console
vv {} ? versatile set or dictionary for hacking purposes, value found in self.opts["vv"]
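Rather than printing the full listing above, the cma package also accepts a search string in CMAOptions (a pycma feature, not part of this class), which is convenient when looking up individual settings (output omitted here):
>>> import cma
>>> cma.s.pprint(cma.CMAOptions('tol'))            # only the tolerance-related termination options
>>> cma.s.pprint(cma.CMAOptions('tolstagnation'))  # default and description of a single option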
3.6.1.2. References¶
[1] Hansen, N. and Ostermeier, A., Completely Derandomized Self-Adaptation in Evolution Strategies, Evolutionary Computation, 2001, 9(2), 159-195.