Environmental modelling is complex, and models often require the calibration of several parameters that cannot be directly estimated from a physical quantity or a field measurement.
Multi-objective calibration has several advantages, such as adding constraints to a poorly constrained problem or finding a compromise between different objectives by defining a set of optimal parameters.
The caRamel optimizer has been developed to meet the need for an automatic calibration procedure that delivers not just one but a family of parameter sets that are optimal with regard to a multi-objective target. The idea behind caRamel is to rely on stochastic rules while also allowing more “local” mechanisms, such as extrapolation along vectors in the parameter space.
The caRamel algorithm is a hybrid of the multi-objective evolutionary annealing simplex (MEAS) method and the non-dominated sorting genetic algorithm II (NSGA-II).

Environmental modelling is complex, and models often require the calibration of many parameters that cannot be directly estimated from a physical quantity or a field measurement. Moreover, as models' outputs exhibit errors whose statistical structure may be difficult to characterize precisely, it is frequently necessary to use various objectives to evaluate the modelling performance. In other words, it is often difficult to find a rigorous likelihood function or sufficient statistics to be maximized/minimized

Multi-objective calibration allows for a compromise between these different objectives to be found by defining a set of optimal parameters.
Practical experience shows that single-objective calibrations are efficient for highlighting a certain property of a system, but this might lead to increasing errors in some other characteristics

Many studies have used the multi-objective approach in environmental modelling

The caRamel optimizer has been developed to meet the need for an automatic calibration procedure that delivers not only one but a family of parameter sets that are optimal with regard to a multi-objective target

The caRamel algorithm was initially developed and used for the calibration of hydrological models by studies such as

This paper aims to describe the principles of the caRamel algorithm via an analysis of its results when used for the parametrization of hydrological models. Pieces of code are provided in the Appendix.
For an analytical example and for three river case studies, a comparison with the two calibration algorithms that inspired caRamel, the non-dominated sorting genetic algorithm II (NSGA-II;

The intent of multi-objective calibration is to find sets of parameters that provide a compromise between several potentially conflicting objectives; for instance, how to achieve a good simulation of both flood and low-flow conditions in a hydrological model. Multi-objective calibration is also a means of adding some constraints to an under-constrained problem when many parameters have to be quantified. This can help to reduce the equifinality of parameter sets.

To introduce our notation, Fig.

a model with

a vector

a vector of

Notations to describe a model calibration, where

We will use the following notations: vectors or matrices are presented using bold italic and bold roman font respectively (

Figure

equifinality of a structure – the two points

equifinality related to the objective – the vectors

The purpose of a multi-objective algorithm is to approach the Pareto front,
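For concreteness, Pareto dominance and the extraction of the non-dominated set (for a maximization problem, as in the case studies below) can be sketched as follows. The Python code is purely illustrative; caRamel itself is implemented in R, and the function names here are not part of the package:

```python
import numpy as np

def dominates(a, b):
    """True if objective vector a dominates b (maximization): a is at least
    as good in every objective and strictly better in at least one."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a >= b) and np.any(a > b))

def pareto_front(objs):
    """Return the indices of the non-dominated points in objs
    (one row per parameter set, one column per objective)."""
    objs = np.asarray(objs)
    keep = []
    for i, p in enumerate(objs):
        if not any(dominates(q, p) for j, q in enumerate(objs) if j != i):
            keep.append(i)
    return keep
```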

The caRamel algorithm belongs to the genetic algorithm family. The idea is to start from an ensemble of parameter sets (called a “population”) and to make this population evolve following certain generation rules (Fig.

the multi-objective evolutionary annealing simplex method (MEAS;

the non-dominated sorting genetic algorithm II (

Flowchart of the caRamel algorithm.

This section describes the functioning of the caRamel algorithm; this algorithm has been implemented in an R package,

The caRamel algorithm has five rules for producing new solutions at each generation: (1) interpolation, (2) extrapolation, (3) independent sampling with a priori parameter variance, (4) sampling with respect to a correlation structure and (5) recombination.

The first two rules (interpolation and extrapolation) are based on a

The following two rules create new parameter sets by exploring the parameter space in a non-directional and less local way – either through independent variations in each parameter or through multivariate sampling using the covariance structure of all parameter sets located near the estimated Pareto front at the current iteration.

Finally, the recombination rule consists of creating new parameter sets using two partial subsets derived from a pair of previously evaluated parameter sets (inspired by

For rules 1 and 2, we use the notion of a simplex, which is a generalization of the notion of a triangle to higher dimensions: a 0-simplex is a point, a 1-simplex is a line segment, a 2-simplex is a triangle and a 3-simplex is a tetrahedron. A vertex is a point where two or more edges meet.
The explanation of the first rule is based on Fig.

Illustration of rules 1 and 2 based on a Delaunay triangulation in the objective space for a maximization problem with two parameters (

Let us consider a simplex with at least one vertex on the approximated Pareto front.
This simplex is the result of the function

First the triangulation is established, then simplex volumes are computed.
The probability of generating one new point with a simplex is proportional to its volume when it has at least one point on the Pareto front (otherwise it is zero).
If the simplex is selected, then a set of barycentric coordinates are computed by randomly generating
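Conceptually, the interpolation rule draws a random point inside the selected simplex as a convex combination of its vertices in the parameter space. A minimal Python sketch, assuming uniformly drawn and normalized weights for the barycentric coordinates (an illustrative choice, not necessarily the package's exact sampling scheme):

```python
import numpy as np

def interpolate_in_simplex(vertices, rng):
    """Generate one new parameter set inside the simplex spanned by
    `vertices` (shape: (d + 1, n_param)) using random barycentric
    coordinates: non-negative weights that sum to 1."""
    vertices = np.asarray(vertices, dtype=float)
    w = rng.random(len(vertices))
    w /= w.sum()          # barycentric coordinates: w_i >= 0, sum(w) = 1
    return w @ vertices   # convex combination of the vertices
```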

Extrapolation is based on the same hypothesis of continuity as interpolation.
In this case, the algorithm tests whether an improvement may be obtained by extrapolating along certain directions.
These directions are computed from the triangulation by selecting the edges that have only one vertex on the approximated Pareto front (the second vertex is dominated by the first).
These oriented edges computed from the objective space represent directions of improvement in the parameter space (Fig.

The length
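The extrapolation step can be sketched in the same spirit: stepping beyond the front vertex along the parameter-space direction defined by a dominated-to-non-dominated edge. The random scaling of the step length is an illustrative assumption:

```python
import numpy as np

def extrapolate(dominated, front_vertex, length, rng):
    """Extrapolate beyond a Pareto-front vertex along the improvement
    direction given by an edge whose other end (`dominated`) is dominated
    by `front_vertex`. `length` scales the step; the random factor is an
    illustrative choice."""
    dominated = np.asarray(dominated, dtype=float)
    front_vertex = np.asarray(front_vertex, dtype=float)
    direction = front_vertex - dominated  # direction of improvement
    return front_vertex + rng.random() * length * direction
```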

The drawback of the first two rules is that the generation of new vectors is only based on a small number of existing vectors.
To counterbalance this local, gradient-like search and to avoid convergence toward a local optimum, the third generation rule has two goals:

to make the parameters vary within a larger range than with local rules, and

to make the parameters vary independently of one another.

When considering a vector

The algorithm selects the

One generation of this rule then produces
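A minimal sketch of this third rule, assuming a Gaussian perturbation with an a priori standard deviation per parameter and clipping to the parameter bounds (both illustrative assumptions):

```python
import numpy as np

def independent_sampling(x_ref, prior_sd, bounds, rng):
    """Rule 3 sketch: perturb each parameter of a reference set
    independently of the others, using an a priori standard deviation per
    parameter, then clip the result to the feasible bounds."""
    x_ref = np.asarray(x_ref, dtype=float)
    x_new = x_ref + rng.normal(0.0, prior_sd, size=x_ref.shape)
    lo, hi = bounds
    return np.clip(x_new, lo, hi)
```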

The variance–covariance matrix

This matrix reflects the correlation structure between the parameter sets. For instance, in the case of a hydrological model, parameters are frequently not independent of each other.
This rule intends to obtain an estimate

There are many possibilities in selecting the vector for evaluating the covariance matrix:

Vectors may be selected from a library of “historical” vectors for the calibrated model. The drawback is that this library has to be previously established, and it does not take the progression of the running calibration into account.

Vectors may be selected from the archive

All vectors of the running population may be selected. This helps to maintain diversity, but it has a high computational cost as few new vectors will make the front progress.

Finally, the algorithm uses a mix between items 2 and 3: all simplexes from the first rule triangulation that have at least one vertex in the approximated Pareto front are selected.
Reference vectors for the computation of the variance–covariance matrix are defined by the ensemble

This operation significantly increases the number of points selected for the computation of the averages.
However, there is still the risk of having a variance that is too low.
To reduce this risk, the variance of all of the parameters is increased by the same factor (empirically doubled):

The new vectors are obtained from a classical procedure for multivariate generation:

computation of the upper triangular matrix

generation of vectors

This fourth rule enables us to randomly explore some area of space
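The multivariate generation described above can be sketched as follows. Doubling all variances while keeping the correlations amounts to scaling the whole covariance matrix by 2; note also that NumPy's `cholesky` returns the lower triangular factor, i.e. the transpose of the upper triangular matrix mentioned above:

```python
import numpy as np

def covariance_sampling(ref_vectors, n_new, rng, inflation=2.0):
    """Rule 4 sketch: estimate the mean and covariance of the reference
    parameter sets near the front, inflate the variances (empirically
    doubled), and draw new sets via the classical Cholesky-based
    multivariate normal generation."""
    X = np.asarray(ref_vectors, dtype=float)
    mu = X.mean(axis=0)
    cov = inflation * np.cov(X, rowvar=False)  # doubled variances, same correlations
    L = np.linalg.cholesky(cov)                # cov = L @ L.T, L lower triangular
    z = rng.standard_normal((n_new, X.shape[1]))
    return mu + z @ L.T
```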

As in rule 4, recombination considers that the parameters of a model are not independent.
In a hydrological model, they can frequently be grouped in functional blocks (for instance, rapid runoff, base flow, snow dynamics, transfer and so on).
A new parameter vector is simply generated by combining blocks of parameters from vectors of the archive
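A sketch of this block-wise recombination, where each functional block of parameters is inherited wholly from one of two parent sets; drawing the donor per block at random is an illustrative assumption:

```python
import numpy as np

def recombine(parent_a, parent_b, blocks, rng):
    """Rule 5 sketch: build a child parameter set by taking each functional
    block of parameters (e.g. rapid runoff, base flow, ...) entirely from
    one of the two parents. `blocks` is a list of index lists."""
    parent_a = np.asarray(parent_a, dtype=float)
    parent_b = np.asarray(parent_b, dtype=float)
    child = np.empty_like(parent_a)
    for idx in blocks:
        donor = parent_a if rng.random() < 0.5 else parent_b
        child[idx] = donor[idx]
    return child
```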

At the end of each generation, the population is kept under a maximum size (

The population downsizing is adapted from

Pareto ranking – the parameter vectors are sorted according to the ranking order of the Pareto level to which they belong. Points from level 1 are non-dominated, points from level 2 are dominated only by points from level 1 and so on.

Downsizing according to the chosen precision – the objective space is partitioned by an

Keeping the population size under
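The Pareto-ranking step above can be sketched as an iterative peeling of non-dominated fronts (maximization); the subsequent ε-grid partition of the objective space is omitted from this sketch:

```python
import numpy as np

def pareto_ranks(objs):
    """Assign Pareto levels (maximization): level 1 = non-dominated points,
    level 2 = points dominated only by level-1 points, and so on."""
    objs = np.asarray(objs, dtype=float)
    remaining = set(range(len(objs)))
    rank = np.zeros(len(objs), dtype=int)
    level = 0
    while remaining:
        level += 1
        # points of the current level: not dominated by any remaining point
        front = [i for i in remaining
                 if not any(np.all(objs[j] >= objs[i]) and np.any(objs[j] > objs[i])
                            for j in remaining if j != i)]
        rank[front] = level
        remaining -= set(front)
    return rank
```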

Method for population downsizing for a maximization problem with two objectives: Pareto ranking (level 1 is the current approximated Pareto front) and partition of the objective space according to the chosen

The aim is to assess the performance of the caRamel algorithm against two other optimizers using various case studies.
Two optimizers have been selected for the comparison: NSGA-II

The caRamel algorithm is used in its general form, with a generation of five new parameter sets for each rule per iteration, involving an average of 25 parameter sets per generation.

NSGA-II

The MEAS algorithm

For each optimizer, the end of one optimization is set to a maximum number of model evaluations depending on the case studies. As the algorithms use random functions, 40 optimizations of each test case have been run for each optimizer to obtain representative results. In order to focus on the evolution of the optimization, the initial population is the same for each optimizer (40 initial populations for each case study).

We chose to run a large number of model evaluations and optimizations in order to obtain representative results and assess the reproducibility of the optimization. Other benchmark methodologies would be conceivable, such as that presented by

To evaluate the optimizer performance, we chose metrics from the literature.
Evaluating optimization techniques experimentally always involves the notion of performance.
In the case of multi-objective optimization, the definition of quality is substantially more complex than for single-objective optimization problems, because the optimization goal itself consists of multiple objectives

accuracy, which is the closeness of the solutions to the theoretical Pareto front (if known) or relative closeness;

diversity, which can be described by the spread of the set (range of values covered by the solutions) and the distribution (relative distance among solutions in the set);

cardinality, which qualifies the number of Pareto-optimal solutions in the set.

To quantify these aspects, we selected three different metrics that are evaluated in the objective space:

hypervolume (HV), which is a volume-based index that takes accuracy, diversity and cardinality into account

generational distance (GD), which is a distance-based accuracy performance index (

generalized spread (GS), which evaluates the diversity of the set

The evaluation of the GS and GD metrics requires us to establish a reference front. For each case study, this reference front is built by evaluating the Pareto front on all of the final optimization results of all optimizers.
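As an illustration, GD can be computed as the average Euclidean distance from each obtained solution to its nearest point on the reference front. This is one common convention; several variants of GD (using different p-norms over the distances) exist in the literature:

```python
import numpy as np

def generational_distance(front, reference):
    """GD sketch: mean Euclidean distance from each obtained solution to
    its nearest reference-front point (0 when the fronts coincide)."""
    F = np.asarray(front, dtype=float)
    R = np.asarray(reference, dtype=float)
    # pairwise distances (n_front x n_ref), then nearest reference point
    d = np.linalg.norm(F[:, None, :] - R[None, :, :], axis=2).min(axis=1)
    return d.mean()
```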

Four case studies have been designed with increasing complexity:
case study 1 is an analytical example with a Kursawe test function

The objective of a test function is to evaluate some characteristics of optimization algorithms.
The final Pareto front has a specific shape (non-convex, asymmetric and discontinuous) with an isolated point that the optimizer has to accurately reproduce.
The Kursawe function is a benchmark test for many researchers

The optimizations are run for 50 000 model evaluations.
The R script to run the Kursawe function optimization with caRamel is available in Appendix B or as a vignette in the

The GR4J hydrological model is a widely used lumped rainfall–runoff model

Daily discharge regimes at the three catchments studied.

GR4J has four parameters to calibrate: the production store capacity

The calibration is done on the daily time series for the period from 1990 to 1999.
The Kling–Gupta efficiency (KGE,

For each component, the optimal value is 1 and the optimization consists of a maximization.
At the end of the optimization only the sets with KGE
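For reference, the three components of the Kling–Gupta efficiency (correlation r, variability ratio α and bias ratio β, each optimal at 1) and their aggregation can be sketched as follows; the aggregated KGE is 1 minus the Euclidean distance of (r, α, β) to the ideal point (1, 1, 1):

```python
import numpy as np

def kge_components(sim, obs):
    """The three KGE components: linear correlation r, variability ratio
    alpha = sd(sim)/sd(obs) and bias ratio beta = mean(sim)/mean(obs)."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return r, alpha, beta

def kge(sim, obs):
    """Aggregated KGE: 1 - Euclidean distance of (r, alpha, beta) to (1, 1, 1)."""
    r, alpha, beta = kge_components(sim, obs)
    return 1.0 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)
```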

The R script to run an optimization of the GR4J model with caRamel is available in Appendix C.

The spatially distributed MORDOR-TS rainfall–runoff model

Maps of the catchments studied:

This model was implemented at a daily time step for two French catchments with contrasting climates.
The Tarn catchment at Millau (Fig.

The hydrological meshes have been built with an average cell area of 100 km²

Parameters to calibrate for MORDOR-TS and bounds of variation.

MORDOR-TS has 22 free parameters in its comprehensive formulation.
For the Tarn case study, a simplified formulation is adopted with 12 free parameters to calibrate in order to describe the functioning of conceptual reservoirs, evapotranspiration correction and wave celerity (Table

For the Tarn catchment, the calibration is based on the Nash–Sutcliffe efficiencies (NSE;

Four aspects are considered with respect to the results of the case studies: the shape of the final Pareto fronts, the dynamics of the optimizations, the distribution of the calibrated parameters and the consequences of the latter on simulated discharges for the hydrological case studies. To illustrate the results on the simulated discharges, a “best compromise” set has been selected regarding the distance to point (1,1,1) in the objective space for each hydrological case study.
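The selection of the “best compromise” set, as the front point closest to the ideal point (1, 1, 1) in the objective space, can be sketched as:

```python
import numpy as np

def best_compromise(objectives):
    """Return the index of the Pareto-front point closest (in Euclidean
    distance) to the ideal point (1, ..., 1) of a maximization problem."""
    objs = np.asarray(objectives, dtype=float)
    ideal = np.ones(objs.shape[1])
    return int(np.argmin(np.linalg.norm(objs - ideal, axis=1)))
```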

First of all, it is important to accurately reproduce the disconnected Pareto front for the Kursawe test function, and
this is the case for all of the optimizers (Fig.

Pareto front after 50 000 model evaluations with caRamel (1183 points), NSGA-II (1780 points) or MEAS (687 points) for the Kursawe test function.

Pareto fronts over 40 optimizations with the caRamel, NSGA-II and MEAS optimizers for each hydrological case study: Blue River with GR4J

Concerning the three hydrological case studies, the solutions of the Pareto fronts look quite similar for caRamel and NSGA-II and narrower with MEAS (Fig.

Figure

Metrics' evolution over 40 optimizations with caRamel, NSGA-II and MEAS, showing the mean evolution and 10 %–90 % quantiles of the metrics with respect to the number of model evaluations:

The caRamel algorithm converges more quickly in terms of accuracy (the HV and GD metrics in most cases). caRamel's dynamics are closer to NSGA-II's than to MEAS's, as these two algorithms have almost the same final values for the three metrics. This confirms the distinct behaviour of the two classes of algorithms.

With respect to the diversity criteria, GS dynamics is different for the Kursawe test case than for the hydrological case studies. For the Kursawe test case, the optimal final front has a spread, so all optimizers give the same results. For the hydrological cases, the optimal solution is a point (1,1,1); thus, the Pareto front may get smaller with the optimization. NSGA-II and caRamel look alike, as they generate more diversity than MEAS (GS final values). On average, caRamel gives better values than NSGA-II for the three real cases.

Finally, the envelopes over the 40 optimizations are comparable for the three optimizers, which means that reproducibility is always obtained, although with different regularities depending on the case and the optimizer, without any notable pattern. In some cases, achieving a smoother statistical GS convergence would have required more optimizations.

Figure

Calibrated parameter distributions for the sets on the Pareto front (

In the parameter space, the optimizers provide very similar results that explore the equifinality of the model, meaning that different parameter sets show similar performance (Fig.

The difference in the diversity of the final sets is also visible in the parameter distributions. Distributions are quite similar for caRamel and NSGA-II but are much narrower for MEAS. This once again confirms the different behaviour of MEAS, with weaker general performance for the cases studied here.

The consequences with respect to the simulated discharges are displayed in Fig.

Observed and simulated discharges for the three case studies. “Observations” refer to the observed discharges, “Best compromise” refers to the best compromise simulated discharges and “Envelope” refers to the simulated discharges envelope using all parameter sets on the Pareto front (over 40 optimizations) with caRamel, NSGA-II and MEAS.

Figure

The caRamel function is an optimization algorithm for multi-objective calibration, and it results in a family of parameter sets that are Pareto-optimal with regard to the different objectives.
The algorithm is a hybrid of the MEAS algorithm

An optimization algorithm can be difficult to use because of the choice of input arguments, which are specific to the algorithm and might require some “expert knowledge”.
The sensitivity to caRamel internal parameters is not presented in this paper, but we have carried out some sensitivity analyses using the Morris method

Multi-objective optimization may require thousands of evaluations, which can be a limitation for the calibration of time-consuming models. To cope with this issue, parallel computation is implemented in the

Better consideration of equality or inequality constraints, such as the relationship between two parameters, could be an improvement. Another perspective would be the ability of caRamel to deal with discrete parameters.

The

creation of blocks/subsets of parameters that should be jointly recombined (for example, parameters of a same module);

choice of parallel or sequential computation;

continuation of optimization starting from an existing population;

saving of the population after each generation or only the final one;

indication of the number of parameter sets produced by generation.

As a result, the function returns a list of six elements:

success – a logical, “TRUE” if the optimization process ran with no errors,

parameters – a matrix of parameter sets from the Pareto front (dimension: number of sets in the front, number of calibrated parameters),

objectives – a matrix of associated objective values (dimension: number of sets in the front, number of objectives),

save_crit – a matrix that describes the evolution of the optimization process; for each generation, the first column is the number of model evaluations, and the following columns are the optimum of each objective taken separately (dimension: number of generations, number of objectives

total_pop – the total population (dimension: number of parameter sets, number of calibrated parameters + number of objectives).

gpp – the calling period for the third generation rule (independent sampling with a priori parameter variance). It is computed by the algorithm if the user does not set it.

Arguments of the caRamel() function. Optional arguments are shown in italic font.

The data analysis was performed with the open-source environment R (

NLM developed the algorithm in the Scilab platform. FH, FZ and CM adapted the algorithm to an R package and performed the various test cases. CM prepared the paper with contributions from all co-authors.

The authors declare that they have no conflict of interest.

The authors wish to thank the editor and the reviewers of this paper for their constructive suggestions.
Special thanks to Andreas Efstratiadis, who gave us MEAS source code, and to Guillaume Thirel, who provided the first script of

This paper was edited by Elena Toth and reviewed by Andreas Efstratiadis and Guillaume Thirel.