Relatively Robust Multicriteria Decisions

Thomas A. Weber
https://orcid.org/0000-0003-2857-4754
Chair of Operations, Economics and Strategy, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland

Published Online: 14 Aug 2025. https://doi.org/10.1287/mnsc.2025.00510

Abstract

For a general multicriteria decision problem with linear scalarization and unknown weights, we propose relatively robust decisions, which are Pareto-efficient and at the same time maximize a performance index. The latter measures the worst-case ratio, attained by the weighted objective relative to its maximum value, with respect to all possible weights. The main results include a simple boundary representation of the performance index as the minimum of criterion-specific performance ratios, and a computationally simple method of determining a relatively robust decision up to any prespecified performance tolerance by maximizing an $ε$-augmented performance index. The proposed method relies merely on the continuity of all criterion functions and the compactness of the set of feasible decisions, which may be nonconvex. This imposes no restrictions at all for any finite action set. A notable feature of our method is that it endogenously yields the tradeoffs between all criteria, including a performance guarantee relative to decisions justified by any other weighting. A number of structural results, examples, and applications are provided, as well as generalizations to allow for limited weight ambiguity, criterion ambiguity, and generalized aggregation of criteria based on an axiomatic foundation.

This paper was accepted by Peng Sun, optimization and decision analytics.

1. Introduction

In real-world decision making, evaluating alternatives often involves multiple, sometimes conflicting criteria. Whether considering investment portfolios under Environmental, Social, and Governance (ESG) parameters, selecting products based on bundles of attributes, or valuing companies for both profitability and sustainability, decision makers must navigate tradeoffs between competing objectives. Similarly, lifecycle environmental impact assessments—such as comparing vehicles with electric, combustion, or hybrid engines—require reconciling diverse metrics like emissions, cost, and resource consumption. These complex but often inevitable comparisons highlight the critical need for robust multicriteria optimization frameworks.

Multicriteria optimization involves identifying solutions that balance conflicting objectives in a manner consistent with the decision maker’s priorities. Traditional approaches often rely on scalarization techniques, where multiple criteria are combined into a single weighted objective function using weighted sums. However, these methods depend heavily on the precise specification of weights, which are rarely known a priori and can be difficult to justify. This uncertainty complicates the search for decisions that are both Pareto-efficient and robust to variations in tradeoff preferences. To address these challenges, we specialize the concept of relatively robust decisions by Weber (2023) so as to refer to decisions that achieve Pareto-efficiency among all relevant criteria while also maximizing a performance index designed to account for the ambiguity in weights. Specifically, the performance index measures the worst-case (WC) ratio attained by the weighted objective relative to its maximum value over all possible weight configurations. By focusing on worst-case performance, this framework provides guarantees of robustness and tradeoff transparency, which are particularly valuable in high-stakes or uncertain decision contexts.

1.1. Practical Examples

The approach developed here can be used for virtually all multicriteria decision problems, such as the following three example applications.¹

ESG Investing: Investors face the challenge of balancing financial returns with social and environmental impact. For example, a fund manager might evaluate portfolios based on criteria such as profitability, carbon footprint, and diversity inclusion. Weighting these criteria is inherently subjective, and the optimal portfolio might vary widely depending on the chosen weights. The relatively robust optimization framework proposed here enables the identification of investment strategies that remain robust across different weight configurations, offering a performance guarantee regardless of the specific preferences.²
Lifecycle Assessment of Vehicles: Consider evaluating the environmental impact of electric, combustion, and hybrid cars. Criteria might include greenhouse gas emissions, energy efficiency, and resource use (see, e.g., Hawkins et al. 2012). A relatively robust decision could pinpoint vehicle types or designs that perform well across a broad range of plausible weightings, resulting in a balanced and defensible choice.
Product Evaluation and Design: Companies frequently design products to optimize attributes such as cost, durability, and aesthetic appeal. For example, in designing a smartphone, decision makers must weigh the importance of battery life, screen quality, and price.³ Relatively robust multicriteria optimization helps determine design specifications that ensure competitive performance across various market segments with diverse preferences, valuing the availability of different attributes with different weights.

For the application of our method, one only needs that there exists a default action, that is, a baseline alternative that performs adequately across all criteria (which can always be achieved by reindexing evaluation scales), together with the technical assumption that criteria are continuous in actions, and that the finite-dimensional action set is closed and bounded (i.e., compact).

1.2. Literature

1.2.1. Origins of Multicriteria Decision Making.

The idea of multicriteria optimization in Economics can be traced back to the distribution of resources among different individuals, leading to a set of undominated solutions such as the “contract curve” proposed by Edgeworth (1881, p. 21) for a simple exchange economy, and more generally a set of efficient outcomes as implied by Pareto (1894; 1897, sections 721–723),⁴ which cannot be improved upon for one agent without making another agent worse off. The latter avoids a direct interpersonal comparison of individuals’ utilities (or “ophelimities” in Pareto’s terminology); see also Harsanyi (1955) as well as Keeney and Raiffa (1993, chapter 10) who explore group utility functions. The drawback of such an agnostic approach to optimality with multiple criteria is that the set of Pareto-optimal allocations, because of its typically large cardinality, offers only imperfect guidance about which solution should actually be implemented. For example, in the two-agent exchange economy, the contract curve usually includes allocations that attribute all resources to any single individual, which almost completely undermines the notion of multicriteria optimization.

In Operations Research, “goal programming” refers to the notion of minimizing the weighted deviation from criterion-specific targets (Charnes and Cooper 1961). The technique was first employed in the context of executive compensation based on different “factors” (i.e., criteria) using linear programming techniques (Charnes et al. 1955). This basic approach is usually applied to a “utopian” (or “ideal”) target point, which corresponds to the (generically infeasible) vector of individually maximized criteria. Depending on the distance measure (e.g., a weighted Chebyshev distance; see, e.g., Steuer 1986, chapter 14),⁵ the corresponding solutions trade off among criteria according to the specified weights, and they are naturally Pareto-efficient.⁶ Instead of minimizing the distance to the ideal point (in the outcome space), it is also possible to maximize the distance to a (generically infeasible) “nadir” point, which contains the minimum value of each individual criterion on the Pareto-efficient set.⁷ In this approach, known as “compromise programming” (Zeleny 1974), the appropriate choice of the weights for the different criteria remains the critical point, and in our view, very little satisfying progress has been made in the assignment of weights without imposing subjectivity, which arguably amounts to picking a solution from the Pareto-efficient set. For instance, based on an exogenous ranking of the criterion importance, it is possible to apply an ordered weighted average (Yager 1988) which in turn can be related to compromise programming (Zarghami and Szidarovszky 2010, Wang and Fu 2020).⁸ Besides being subjective from the start by requiring the imposition of a preference order on criteria, this method does not provide any nontrivial performance guarantee relative to other choices of weights and/or importance rankings that might be plausible for other decision makers.⁹

1.2.2. Relative Robustness.

By contrast, we approach the selection of weights from the standpoint of relative robustness (Weber 2023), involving only comparisons between feasible points, thus avoiding fictitious targets such as the aforementioned utopian and nadir points. Rather than minimizing a fixed distance metric with specific weights, the proposed method evaluates performance in terms of the worst-case tradeoffs among criteria, ensuring a solution that remains defensible regardless of the exact weighting chosen ultimately. The underlying robustness measure, equivalent to relative regret, has been used in computer science to evaluate the relative performance of algorithms (Sleator and Tarjan 1985, Ben-David and Borodin 1994), for the scenario-based evaluation of operational decisions (Kouvelis and Yu 1997), in price discrimination (Han and Weber 2023), robust optimization (Weber 2024), and fair resource allocation (Goel et al. 2009). The idea of absolute regret (AR) is due to Savage (1951), based on the maximin robustness approach by Wald (1945) in his general treatment of sequential decision problems. The main issue with absolute regret is that it is inherently sensitive to the scale of the reference point. This sensitivity may make it impossible to derive reasonable performance guarantees, such as ensuring positive profits in a monopoly pricing problem with unknown demand (Weber 2025), because a zero-profit reference point is intrinsically small-scale. A relative robustness approach, we argue, yields more acceptable results, particularly in the context of multicriteria optimization, where changing the units of any given criterion would usually affect solutions that are based on absolute performance measures.

1.2.3. Connection to Distributionally Robust Optimization.

Because one can reinterpret a (normalized) weight vector as a probability distribution, our approach is naturally related to distributionally robust optimization (DRO), where the true probability distribution governing uncertain parameters is unknown but assumed to lie within a known ambiguity set. Important early contributions include Delage and Ye (2010), who studied DRO with moment-based sets, and Ben-Tal et al. (2013), who developed tractable reformulations for DRO with ambiguity sets defined by phi-divergences. More recently, Blanchet and Murthy (2019) and Mohajerin Esfahani and Kuhn (2018) developed general formulations based on Wasserstein balls, offering strong out-of-sample guarantees. Whereas DRO typically focuses on expectations or risk measures over stochastic uncertainty, our approach generalizes worst-case robustness to a multicriteria setting without requiring a probabilistic model. This positions our proposed framework as a deterministic analogue to DRO, where ambiguity arises from unknown relative preferences and state-dependent criteria rather than unknown probability distributions.

1.2.4. Relative Evaluations.

The proposed approach to robust multicriteria decisions is entirely relative, in the sense that the question underlying all of our developments is “How well am I doing relative to how well I could be doing?” Indeed, the idea that the size of an object can be judged only relative to other objects goes back at least to the Taoist writings of Zhuang Zhou in the fourth century BC.¹⁰ In Economics, Cournot (1838) was among the first to note that there is no absolute value (“Il n’y a pas de valeurs absolues,” p. 22) and that inference from a social system can be likened to the observation of astronomical objects and their relative positions to each other (pp. 15–16), concluding by analogy that the concept of value is fundamentally relative (“Il y en a en ce sense que des valeurs relatives,” p. 18). In fact, human perception is inherently relative, as demonstrated by Weber (1846) and his student Fechner (1860) in extensive experiments which showed that across different senses (e.g., hearing, touch, and vision) the minimum perceptible difference is proportional to the current level of the stimulus, giving rise to the Weber-Fechner law of psychophysics. Similarly, in an economic context, it is often relative reference points such as one’s current wealth level (Kahneman and Tversky 1979) or the outcomes experienced by others (Loewenstein et al. 1989) that tend to determine human behavior. For example, when pondering whether to purchase a product from a cheaper store within walking distance, humans base decisions less on absolute gains than on the prospective relative savings in expenditure (Kahneman and Tversky 1984).

Beyond the aforementioned congruence with human perception, there are other practical arguments for relative evaluations. First, there is the insensitivity to scale, common to all relative measures such as internal rate of return (IRR), demand elasticity, profit margin, or relative regret, which allows for a direct comparison across different sizes. For example, with IRR one can readily benchmark projects of different financial magnitudes against outside investment options (of comparable risk), whereas the equivalent absolute indicator of net present value (NPV) remains silent about the required absolute investment (see, e.g., Weber 2014).¹¹ Second, normalization facilitates fairness and equity. When stakeholders have differing capacities or baselines, a relative comparison may help to ensure fairness (Goel et al. 2009). Third, relative evaluation criteria allow for comparability across contexts. For example, demand elasticity (as introduced by Marshall 1890, p. 162) is a unit-free relative measure that allows one to compare the changes of demand relative to price changes across widely differing goods, irrespective of the underlying unit of measurement (e.g., units of cars versus units of electric power). This last point is especially salient for multicriteria decision making, as the units (and inherent scale) for different criteria generally differ, so that a robustness criterion (and robust decision) should remain unaffected if the values of a given criterion are all multiplied by 10, for example. The relatively robust framework developed here uses a relative worst-case performance perspective. This methodology not only identifies solutions with reliable tradeoff characteristics, but also provides a transparent representation of the tradeoffs themselves. These tradeoffs are reflected in a robust weight vector consistent with a robust solution.

1.3. Outline

The remainder of this paper is organized as follows: Section 2 discusses the multicriteria optimization problem and associated comparative statics, together with the performance index for a robust evaluation of different decisions. Section 3 introduces pseudo-robustness and Pareto-efficiency, which together characterize relatively robust decisions. Here we also provide a computational approach for determining a relatively robust decision up to any given performance tolerance, and we allow for the possibility of close-to-arbitrary restrictions in the set of admissible weights, for example, based on a priori knowledge about physical constraints or a given priority ranking of criteria. In addition, there are extensions to criterion ambiguity and general aggregation of criteria, followed by a practical guide for how to apply the method. Section 4 focuses on discrete applications where our framework relies on virtually no assumptions, so the approach can be entirely data-driven. Section 5 concludes.

2. Basic Framework

Let $X \subset R^{m}$ be a nonempty, compact action set, for a given integer $m \geq 1$. Consider a decision maker who faces the multicriteria optimization problem of having to select an action (or decision, or point) $x \in X$ so as to “simultaneously maximize” the continuous functions $f_{i} : X \to R_{+}$, for $i \in N = \{1, \dots, n\}$, each of which is referred to as a criterion, where $n \geq 1$ is a given integer.

Remark 1.

(i) The continuity of each criterion $f_{i}$ ensures that small perturbations in the action $x$ yield correspondingly small changes in the objective value. Notably, this assumption is trivially satisfied at any isolated point of $X$ .¹² In particular, this means that there is no imposed regularity requirement when the action set is finite. (ii) The requirement that each $f_{i}$ be nonnegative is without loss of generality: any real-valued (continuous) criterion ${\hat{f}}_{i}$ can be translated as $f_{i} = {\hat{f}}_{i} - {\hat{f}}_{i}^{•} \geq 0$ , where ${\hat{f}}_{i}^{•} = \min_{x \in X} {\hat{f}}_{i} (x)$ denotes the minimum value of ${\hat{f}}_{i}$ .¹³

2.1. Decision Problem

To evaluate goal achievement for any decision $x \in X$ , the decision maker considers a scalarization of his multicriteria optimization problem by means of a weighted objective,

F (x | λ) = \sum_{i = 1}^{n} λ_{i} f_{i} (x), x \in X,

(1)

where $λ = (λ_{1}, \dots, λ_{n}) \in Δ = \{w \in R_{+}^{n} : ‖ w ‖_{1} = 1\}$ is a (normalized) vector of weights (with $λ_{i} \geq 0$ for all $i \in N$, and $λ_{1} + \dots + λ_{n} = 1$). Maximizing the weighted objective yields the set of ex post optimal decisions,

X (λ) = \arg \max_{x \in X} F (x | λ), λ \in Δ,

(2)

which by the extreme-value theorem (Rudin 1976, theorem 4.16, p. 89) is nonempty. Moreover, by the maximum theorem (Berge 1963, p. 116) the (set-valued) mapping $X : Δ ⇉ X$ is compact-valued and upper semicontinuous.
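For a finite action set, Equations (1) and (2) reduce to a matrix-vector product followed by an argmax. The following is a minimal sketch under that assumption; the array `criteria`, the function names, and the tie tolerance `tol` are illustrative choices and not part of the paper.

```python
import numpy as np

def weighted_objective(criteria, lam):
    """F(x | lam) for every action; criteria has shape (num_actions, n)."""
    return criteria @ np.asarray(lam, dtype=float)

def ex_post_optimal(criteria, lam, tol=1e-12):
    """Indices of the actions in X(lam), the maximizers of F(. | lam)."""
    scores = weighted_objective(criteria, lam)
    return np.flatnonzero(scores >= scores.max() - tol)

# Hypothetical finite action set: rows are actions, columns are f_1 and f_2.
criteria = np.array([[4.0, 1.0],
                     [3.0, 3.0],
                     [1.0, 4.0]])

print(ex_post_optimal(criteria, [1.0, 0.0]))   # all weight on f_1
print(ex_post_optimal(criteria, [0.5, 0.5]))   # equal weights
print(ex_post_optimal(criteria, [0.0, 1.0]))   # all weight on f_2
```

In this toy instance each of the three weights selects a different action, which illustrates how strongly the ex post optimal decision can depend on the weight vector.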

Remark 2.

The weighted objective is homogeneous of degree one, that is, for all $α > 0$ and $λ \in Δ$, it is $F (\cdot | α λ) = α F (\cdot | λ)$, whereas the set of ex post optimal decisions is homogeneous of degree zero, in the sense that $X (α λ) = X (λ)$, for all $α > 0$ and $λ \in Δ$. It is therefore possible to extend the definition of $F (x | \cdot)$ in Equation (1) and the definition of $X (\cdot)$ in Equation (2) to a domain containing any nonzero weight vector $w \in R_{+}^{n} \setminus \{0\}$, because a unique normalized weight $λ = α w \in Δ$, with $α = 1 / (\sum_{i = 1}^{n} w_{i}) > 0$, is always available:¹⁴

F (x | w) ≜ (1 / α) F (x | λ), x \in X,

(1′)

and

X (w) ≜ X (λ) = \arg \max_{x \in X} (1 / α) F (x | λ) .

(2′)

Remark 3.

Limited ambiguity, that is, allowing for weights in a (nonempty, compact) subset of $Δ$ , is discussed in Section 3.7. Criterion ambiguity is treated in Section 3.8, and Section 3.9 investigates the use of general multicriteria objectives. All three generalizations can be treated, after suitable adjustment, within the basic framework.

To keep matters nontrivial, we assume that there exists a (feasible) default decision ( $x_{d}$ ) such that the decision maker’s weighted objective is positive, that is,

\exists x_{d} \in X : F (x_{d} | λ) > 0, λ \in Δ .

(N)

The nontriviality condition (N) ensures that the ex post optimal objective (or value function) is positive:

F^{*} (λ) = \max_{x \in X} F (x | λ) = F (\hat{x} | λ) \geq F (x_{d} | λ) > 0, \hat{x} \in X (λ), λ \in Δ .

The sign-definiteness of the ex post optimal objective is critical for its role as a reference, against which the goal achievement of any feasible decision can be compared.

Remark 4.

The nontriviality condition (N) can be satisfied without affecting $X (\cdot)$, that is, without changing any set of ex post optimal decisions. It is sufficient to consider the translated criterion ${\hat{f}}_{i} = f_{i} + s_{i}$ instead of $f_{i}$, for all $i \in N$, using a suitable shift $s = (s_{1}, \dots, s_{n}) \in R_{+ +}^{n}$, in which case $\hat{F} (x | λ) = F (x | λ) + c_{0} (λ) > 0$, because $c_{0} (λ) = λ \cdot s$ is strictly positive (as it is bounded from below by $\min \{s_{1}, \dots, s_{n}\} > 0$).
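The effect of the shift in Remark 4 can be checked numerically on a small finite action set. The sketch below uses hypothetical criterion values for which condition (N) fails, an assumed shift `s`, and a handful of assumed test weights; it confirms that the shift adds the same constant to every action, so the maximizer is unchanged while the shifted objective becomes positive.

```python
import numpy as np

# Hypothetical criterion values: rows are actions, columns are f_1 and f_2.
# No row is positive in both columns, so condition (N) fails here.
criteria = np.array([[0.0, 2.0],
                     [0.0, 1.0],
                     [3.0, 0.0]])
shift = np.array([0.5, 2.0])     # assumed shift s with strictly positive entries
shifted = criteria + shift       # translated criteria f_i + s_i

weights = [np.array(lam) for lam in
           ([1.0, 0.0], [0.75, 0.25], [0.25, 0.75], [0.0, 1.0])]

same_optimum = True
all_positive = True
for lam in weights:
    F = criteria @ lam           # original weighted objective
    F_hat = shifted @ lam        # weighted objective after the shift
    # The shift adds the same constant c_0(lam) = lam . s to every action.
    assert np.allclose(F_hat - F, lam @ shift)
    same_optimum &= int(np.argmax(F)) == int(np.argmax(F_hat))
    all_positive &= bool((F_hat > 0).all())

print(same_optimum, all_positive)
```

The test weights were chosen so that each maximizer is unique; with ties, one would compare the full argmax sets rather than single indices.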

2.2. Comparative Statics

What happens to the criteria at the optimum when shifting weight from one criterion to another? At the optimum, one would naturally expect that a criterion which receives relatively more weight than before cannot decrease, whereas a criterion that receives relatively less weight cannot increase. The following result formalizes this intuition for a weight shift from one criterion to another.

Proposition 1.

Let $δ_{i j} = e_{i} - e_{j}$ , where $e_{i}$ and $e_{j}$ denote the i-th and j-th Euclidean unit vectors, respectively. Consider $λ, \hat{λ} \in Δ$ such that $\hat{λ} = λ + ε δ_{i j}$ for some $ε > 0$ and some $i, j \in N$ with $i \neq j$ . Then for any $(x, \hat{x}) \in X (λ) \times X (\hat{λ})$ it is $f_{i} (x) \leq f_{i} (\hat{x})$ and $f_{j} (\hat{x}) \leq f_{j} (x)$ .

A transfer from the j-th criterion weight $λ_{j}$ to the i-th criterion weight $λ_{i}$ augments the optimal value of criterion i and lowers the optimal value of criterion j (at least weakly). Criteria other than i and j, whose weights remain constant but whose values are affected when decisions change, may go either way as a result of the weight shift. Similarly, the value function $F^{*} (λ)$ could go up or down, depending primarily on the difference between $f_{i}$ and $f_{j}$ at the optimum.

Remark 5.

The conclusion of Proposition 1 can be applied multiple times. In particular, it can also be used for shifts in nonnormalized weights $w = (w_{j}, w_{- j}) \in R_{+}^{n} \setminus \{0\}$; see Remark 2. Thus, increasing $w_{j}$ is equivalent to (at most) $n - 1$ successive weight transfers in the normalized weight from the components of $λ_{- j}$ to $λ_{j}$, resulting in an increase of the j-th criterion at the optimum.

Remark 6.

For small shifts of weight from criterion j to criterion i, the value of the ex post optimal objective $F^{*} (λ) = F (x^{*} (λ) | λ)$ goes up (resp., down) when the corresponding score difference, $f_{i} (x^{*} (λ)) - f_{j} (x^{*} (λ))$ , is positive (resp., negative), at the selection $x^{*} (λ) \in X (λ)$ .

The following result states that for a simple nonnormalized increase of the i-th weight, the value function goes up, as long as the i-th criterion is always positive at an optimum.

Lemma 1.

Let $w, \hat{w} \in R_{+}^{n} \setminus \{0\}$ be nonnormalized weights, such that $\hat{w} = w + δ e_{i}$, for a given $δ > 0$ and a given $i \in N$. Provided that $f_{i} (x) > 0$, for any $x \in X (w)$, it is $F^{*} (\hat{w}) > F^{*} (w)$.¹⁵
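A small numerical instance may help to see Lemma 1 at work. The sketch below assumes a finite action set with hypothetical criterion values and evaluates the value function at nonnormalized weights, in the spirit of Equation (1′); the names `criteria`, `value_function`, `delta`, and `i` are illustrative.

```python
import numpy as np

def value_function(criteria, w):
    """F^*(w) = max over actions of sum_i w_i f_i(x), for nonnormalized w."""
    return float((criteria @ np.asarray(w, dtype=float)).max())

# Hypothetical criterion values (all strictly positive).
criteria = np.array([[4.0, 1.0],
                     [3.0, 3.0],
                     [1.0, 4.0]])

w = np.array([1.0, 1.0])
delta, i = 0.5, 0                 # raise the weight of the first criterion
w_hat = w.copy()
w_hat[i] += delta

# Consistency with the normalized weight lam = alpha * w, alpha = 1 / sum(w).
alpha = 1.0 / w.sum()
assert np.isclose(value_function(criteria, w),
                  value_function(criteria, alpha * w) / alpha)

print(value_function(criteria, w), value_function(criteria, w_hat))
```

Here the value function rises from 6.0 to 7.5, in line with the lemma, since the first criterion is positive at the optimum for `w`.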

2.3. Performance Index

The decision maker may a priori have no knowledge about which weight $λ \in Δ$ should be used to compute the weighted objective $F (\cdot | λ)$ in Equation (1).¹⁶ To deal with this ambiguity, the decision maker evaluates any feasible decision $x \in X$ with respect to any given weight $λ \in Δ$ by the performance ratio,¹⁷

φ (x | λ) = \frac{F (x | λ)}{F^{*} (λ)} \in [0, 1],

(3)

which is continuous on $X \times Δ$ and naturally bounded from above by one. Its minimum with respect to all possible weights (cf. Endnote 13),

ρ (x) = \min_{λ \in Δ} φ (x | λ) \in [0, 1],

(4)

is called the performance index (evaluated at $x$). The performance index measures the performance of the action $x$ relative to all possible weighted-sum scalarizations of the decision maker’s multicriteria decision problem. It turns out that this performance criterion depends only on the maximized individual criteria,

f_{i}^{*} = \max_{x \in X} f_{i} (x), i \in N .

(5)

By the nontriviality condition (N), it is $f_{i}^{*} = F^{*} (e_{i}) \geq F (x_{d} | e_{i}) > 0$ , for all $i \in N$ . Thus, all maximized individual criteria are strictly positive. The next result provides an important representation of the performance index, in terms of relative goal achievement of a decision, relative to the various maximized individual criteria.

Proposition 2.

The performance index is equal to

ρ (x) = \min_{i \in N} ϕ_{i} (x), x \in X,

(6)

where $ϕ_{i} (x) = f_{i} (x) / f_{i}^{*} \in [0, 1]$, for all $(x, i) \in X \times N$.

The representation in Equation (6) expresses the performance index as the minimum of the criterion-specific performance ratios $ϕ_{i}$ . Hence, to compute $ρ (x)$ as the minimum of $φ (x | λ)$ over all weights $λ \in Δ$ , it is sufficient to restrict attention to the Euclidean unit vectors $e_{i} \in Δ$ , as $ϕ_{i} (\cdot) = F (\cdot | e_{i}) / F^{*} (e_{i})$ , for all $i \in N$ . This simplification arises because $φ (x | \cdot)$ is quasiconcave for fixed $x$ , implying that its minimum over $Δ$ is attained at a vertex. This highlights a “perfect complementarity” among the criterion-specific performance ratios in the determination of the performance index.
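The representation in Proposition 2 can be compared numerically with a brute-force evaluation of Equation (4). The sketch below assumes a finite action set with $n = 3$ hypothetical criteria and minimizes the performance ratio over a regular grid on the simplex (which contains the three unit vectors); the function names and the grid resolution `steps` are illustrative assumptions.

```python
import numpy as np

def performance_ratio(criteria, k, lam):
    """phi(x | lam) = F(x | lam) / F^*(lam) for the action in row k."""
    scores = criteria @ lam
    return scores[k] / scores.max()

def index_by_vertices(criteria, k):
    """rho(x) via Equation (6): the minimum of f_i(x) / f_i^* over criteria."""
    return float((criteria[k] / criteria.max(axis=0)).min())

def index_by_grid(criteria, k, steps=50):
    """rho(x) via Equation (4), minimizing over a grid on the simplex (n = 3)."""
    best = np.inf
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            lam = np.array([a, b, steps - a - b]) / steps
            best = min(best, performance_ratio(criteria, k, lam))
    return float(best)

# Hypothetical criterion values: rows are actions, columns are f_1, f_2, f_3.
criteria = np.array([[6.0, 1.0, 2.0],
                     [2.0, 5.0, 1.0],
                     [1.0, 2.0, 4.0],
                     [3.0, 3.0, 3.0]])

for k in range(len(criteria)):
    print(k, index_by_vertices(criteria, k), index_by_grid(criteria, k))
```

For the balanced action in the last row, both evaluations give 0.5, the smallest of the ratios 3/6, 3/5, and 3/4. The vertex formula needs only $n$ divisions per action, whereas the grid search grows quickly with the number of criteria.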

Remark 7.

The idea of perfect complementarity, discussed by Cournot (1838) and Edgeworth (1897), describes elements contributing to a common goal in fixed proportions. This is equivalent to evaluating the criterion (i.e., the performance index) using the Leontief production function,¹⁸ that is, taking the minimum among the inputs $ϕ_{i}$ , resulting in $ρ (x) = \min_{i \in N} ϕ_{i} (x)$ , for all $x \in X$ , as in Proposition 2.

Example 1.

Let $X \subset R_{+}^{m}$ be a compact action set with $X \cap R_{+ +}^{m} \neq \emptyset$ , where $m \geq 2$ is an integer. Consider a decision problem with $n = 2$ criteria (containing an egalitarian and a utilitarian evaluation),

f_{1} (x) = \min \{x_{1}, \dots, x_{m}\} \quad \text{and} \quad f_{2} (x) = (1 / m) (x_{1} + \dots + x_{m}),

for all $x = (x_{1}, \dots, x_{m})$, with maximized values $f_{i}^{*} = \max f_{i} (X) > 0$, for $i \in N$. The corresponding weighted objective becomes

F (x | ℓ) = (1 - ℓ) f_{1} (x) + ℓ f_{2} (x), ℓ \in [0, 1],

where, for simplicity, the parameter $ℓ$ replaces the normalized weight $λ = (1 - ℓ, ℓ)$. The optimal value, $F^{*} (ℓ) = \max F (X | ℓ)$, can be obtained by solving a two-stage maximization problem,

F^{*} (ℓ) = \max_{t \in T} \{(1 - ℓ) t + ℓ μ (t)\}, ℓ \in [0, 1] .

Here $T = [t_{1}, t_{2}]$ is a compact interval, with $0 < t_{1} \leq t_{2} < \infty$ , and the best average coordinate (subject to all coordinates being at least of size t),

μ (t) = (1 / m) \max_{x \in X} \{x_{1} + \dots + x_{m} : t \leq x_{1}, \dots, t \leq x_{m}\}, t \in T,

is a (weakly) decreasing function on $T$, independent of $ℓ$. Because $t \leq μ (t)$, for all $t \in T$, it is clear that $F^{*} (ℓ)$ must be (weakly) increasing in $ℓ$. The interval $T$ is adapted to the problem in the sense that at $t_{1}$ no constraint is imposed on the computation of the mean $μ (t)$, and at $t_{2}$ the maximum of the smallest coordinate of any point in $X$ is reached. Accordingly, one can verify that $t_{2} = F^{*} (0) = f_{1}^{*} \leq f_{2}^{*} = F^{*} (1) = μ (t_{1})$, where $X (ℓ) = \arg \max_{x \in X} F (x | ℓ)$ and $0 < t_{1} = \min_{ξ \in X (1)} \{ξ_{1}, \dots, ξ_{m}\} \leq t_{2}$, where $ξ = (ξ_{1}, \dots, ξ_{m})$; see Figure 1. By means of Proposition 2 the performance index can then be written in the form

ρ (x) = \min \{\frac{f_{1} (x)}{f_{1}^{*}}, \frac{f_{2} (x)}{f_{2}^{*}}\} = \min \{\frac{f_{1} (x)}{t_{2}}, \frac{f_{2} (x)}{μ (t_{1})}\}, x \in X .

**Figure 1. Nonconvex Action Set $X \subset R_{+ +}^{2}$ in Example 1**

Here the criterion-specific performance ratios are $ϕ_{1} = f_{1} / t_{2}$ and $ϕ_{2} = f_{2} / μ (t_{1})$ .

3. Robust Multicriteria Decision Making

3.1. Pseudo-Robustness

We refer to a decision $\hat{x} \in X$ which maximizes the performance index $ρ (\cdot)$ as pseudo-robust. The corresponding (compact, nonempty) set of pseudo-robust decisions,¹⁹

Ψ = \arg \max_{x \in X} ρ (x),

(7)

is not necessarily a singleton, as illustrated by the following example.

Example 2.

Optimizing the performance index determined in Example 1, using the same two-stage maximization procedure, yields the optimal performance index,

ρ^{*} = \max_{x \in X} ρ (x) = \max_{t \in T} \min \{\frac{t}{t_{2}}, \frac{μ (t)}{μ (t_{1})}\} .

Here, the first term in the minimand increases continuously in t, whereas the second is (weakly) decreasing and may be discontinuous; see Figure 2. As a result, at the optimal value $\hat{t} \in T = [t_{1}, t_{2}]$ the two terms are about to cross, resulting in a “balancedness condition,”

\hat{t} = \sup \{t \in T : \frac{t}{t_{2}} < \frac{μ (t)}{μ (t_{1})}\},

(8)

so that $ρ^{*} = \hat{t} / t_{2} = \hat{t} / f_{1}^{*}$, as shown in Figure 2. Because for any $t \in T$ it is $μ (t) \geq t$, one obtains $μ (\hat{t}) \geq μ (t_{2}) \geq t_{2} = f_{1}^{*}$. This implies that the optimal performance index,

ρ^{*} = \frac{\hat{t}}{f_{1}^{*}} \in [\frac{t_{1}}{f_{1}^{*}}, \frac{μ (\hat{t})}{f_{2}^{*}}],

is always nontrivial (i.e., strictly positive). It attains its maximum possible value (of one) if and only if $μ (\cdot)$ is constant on $T$, that is, when $μ (t_{1}) = μ (t_{2})$. The set of pseudo-robust actions, $Ψ = \{x \in X : \min \{x_{1}, \dots, x_{m}\} = \hat{t} / ρ^{*}\}$, may generally contain multiple elements.

Figure 2. Qualitative Behavior of Performance Ratio $μ (t) / μ (t_{1})$ as a Function of $t / t_{2}$ in Example 2, for the Nonconvex Action Set in Figure 1
*Note.* $f_{1}^{*} = t_{2} > t_{1}$ and $f_{2}^{*} = μ (t_{1}) > μ (t_{2})$ .

3.2. Efficiency

When $Ψ$ is not a singleton, the decision maker may have good reason to prefer one pseudo-robust decision over another, based on “efficiency” (or a lack thereof). Specifically, given two feasible actions $x, x^{'} \in X$ , we say that decision $x^{'}$ is more efficient than decision $x$ with respect to the vector of criteria $f = (f_{1}, \dots, f_{n})$ , if and only if $x^{'}$ strictly improves on at least one criterion while weakly improving on all other criteria (over their values achieved at $x$ ). The corresponding preference relation $≻$ on $X$ is defined by²⁰

x^{'} ≻ x \Leftrightarrow (\exists i \in N : f_{i} (x^{'}) > f_{i} (x), f (x^{'}) \geq f (x)),

(9)

for all $x, x^{'} \in X$. The resulting set of efficient (or Pareto-optimal) decisions is

P = \{x \in X : (f (x) \leq f (x^{'}) \Rightarrow f (x) = f (x^{'})), x^{'} \in X\} .

(10)

It is easy to see that a pseudo-robust decision $\hat{x} \in Ψ$ is not necessarily efficient, in the sense that it may be possible to find a different decision which strictly improves on at least one criterion-specific performance ratio while maintaining the optimal performance index achieved by $\hat{x}$ .

Example 3.

Following up on our analysis in Example 1 and Example 2, consider the (nonconvex, compact) action set $X = {[0, 1]}^{m} \setminus {(1 / 2, 1]}^{m}$, for some integer $m \geq 2$. Because $μ (t) = 1 - 1 / (2 m)$, for all $t \in T = [t_{1}, t_{2}]$, where $t_{1} \leq t_{2} = f_{1}^{*} = 1 / 2$ and $f_{2}^{*} = μ (t_{1})$, the optimality condition $\hat{t} / t_{2} = μ (\hat{t}) / μ (t_{1}) = 1$ yields $\hat{t} = 1 / 2$. Hence, the optimal performance index attains its maximum possible value, $ρ^{*} = 1$. This makes sense, because it is feasible (in $X$) to attain simultaneously the highest possible minimum coordinate and the highest possible average coordinate, regardless of the weights assigned to the two objectives. Note also that the set of pseudo-robust actions,

Ψ = \{x \in X : \min \{x_{1}, \dots, x_{m}\} = \hat{t} / ρ^{*}\} = \{x \in {[1 / 2, 1]}^{m} : \exists i \in N \text{ s.t. } x_{i} = 1 / 2\},

has a continuum of elements. For instance, compared with the pseudo-robust action $(1, \dots, 1) / 2$, all other actions in $Ψ$ are more efficient, particularly those in the set of Pareto-optimal decisions, $P = \{(1, \dots, 1) - (e_{i} / 2) : i \in N\}$, which is a subset of $Ψ$. For $m = 2$, we obtain that $P = \{(0.5, 1), (1, 0.5)\}$. Figure 3 depicts the situation, including the set of Pareto-optimal decisions, for $m = 3$.

Figure 3. Nonconvex Action Set in Example 3, for $m = 3$
*Note.* The set of Pareto-optimal decisions is $P = \{(1, 1, 0.5), (1, 0.5, 1), (0.5, 1, 1)\}$ .

Remark 8.

It is well known that for any weight vector $λ$ with strictly positive components (ensuring all criteria are considered) an ex post optimal decision must also be efficient (see, e.g., Ehrgott 2010, proposition 3.9, p. 71). That is, for any $λ \in int (Δ)$ , the corresponding optimal decision set satisfies $X (λ) \subset P$ .

3.3. Robust Decision Set

Requiring efficiency in addition to pseudo-robustness is important, because, by avoiding unnecessary shortfalls, it can only improve the weighted objective in Equation (1)—at least weakly. A decision ${\hat{x}}^{*} \in X$ is called robust if it is pseudo-robust and efficient. Thus, our goal becomes to examine the properties of the robust decision set,

R = Ψ \cap P,

(11)

which contains all available robust decisions in $X$, and to then determine a direct method for the computation of a robust decision (or an arbitrarily close approximation thereof).

Example 4.

Following up on our analysis in Example 1 and Example 2, consider the (convex, compact) action set $X = {x \in R^{m} : ‖ x ‖_{2} \leq 1}$ , which is equal to the unit ball in the standard Euclidean distance, where the dimension of the underlying space is given by some integer $m \geq 2$ . By direct computation, $μ (t) = 1 / \sqrt{m}$ for all $t \in T = [t_{1}, t_{2}]$ , and $t_{1} = f_{1}^{*} = 1 / \sqrt{m} = f_{2}^{*} = μ (t_{2}) = t_{2}$ . As in Example 3, the balancedness condition (8) yields $\hat{t} = t_{2}$ , and thus an optimal performance index of $ρ^{*} = 1$ . The set of pseudo-robust actions,

Ψ = {x \in X : \min {x_{1}, \dots, x_{m}} = \hat{t} / ρ^{*}} = {(1 / \sqrt{m}, \dots, 1 / \sqrt{m})},

is a singleton. On the other hand, the set of Pareto-optimal actions,

P = {x \in R_{+}^{m} : ‖ x ‖_{2} = 1},

contains a continuum of elements, including the only pseudo-robust decision, $(1, \dots, 1) / \sqrt{m}$.

In Example 3, we examined a problem where $P ⊊ Ψ$ (so $R = P$ ), whereas in Example 4, for the same multicriteria decision problem with a different action set, it was $Ψ ⊊ P$ (so $R = Ψ$ ). Neither of these two extremes might apply, in which case $R \notin {Ψ, P}$ , as illustrated next.

Example 5.

Mixing and matching features from Example 3 and Example 4, let us consider the (nonconvex, compact) action set $X = {x \in R^{2} : ‖ x ‖_{2} \leq 1} \ {(1 / 2, 1]}^{2}$ in the Euclidean plane. The corresponding set of Pareto-optimal actions is given by the intersection of the unit circle with both the action set $X$ and the positive quadrant $R_{+}^{2}$ , so $P = {x \in X \cap ℝ_{+}^{2} : ‖ x ‖_{2} = 1}$ . Meanwhile, the set of pseudo-robust actions is $Ψ = {x \in X : \min {x_{1}, x_{2}} = 1 / 2}$ . Thus, by Equation (11) it is $R = Ψ \cap P = {(1 / 2, \sqrt{3} / 2), (\sqrt{3} / 2, 1 / 2)}$ , which in this setting means $R \notin {Ψ, P}$ .

The existence of robust actions is ensured by the next result, together with the fact that the robust decision set $R$ must be closed and bounded (i.e., compact).

Lemma 2.

The robust decision set $R$ is nonempty and compact.

The proof of Lemma 2 starts by noting that the set $P$ of efficient actions is compact because it must be bounded (by the boundedness of the encompassing action set $X$ ) and closed (by the continuity of the criteria). One can then construct a sequence of pseudo-robust actions in the (nonempty) compact set $Ψ$ which might be inefficient (or else $R \neq \emptyset$ holds true immediately). But each inefficient pseudo-robust action suggests the existence of a more efficient action, which incidentally must also be pseudo-robust. Given that $Ψ$ is compact, the Bolzano-Weierstrass theorem (Berge 1963, p. 67) then guarantees the existence of a converging subsequence of pseudo-robust actions, with a limit that must be a feasible decision which is both pseudo-robust and efficient, so $R \neq \emptyset$ . Compactness of the robust decision set then follows, as it has to be both closed and bounded.

Lemma 3.

The optimal performance index $ρ^{*}$ is such that $\max ρ (P) = ρ^{*} = \max ρ (X)$ and $ρ (R) = ρ (Ψ) = {ρ^{*}}$ .

The preceding result (re)states the fact that the decision maker can restrict attention to efficient actions when maximizing the performance index, meaning that there always exists an efficient action which attains the optimal performance index $ρ^{*}$ ; this action is by definition robust. Conversely, any robust action necessarily achieves a performance index of $ρ^{*}$ , which is quite straightforward in light of both Lemma 2 and the definition of the robust decision set in Equation (11).

3.4. Robust Decisions: Computation

How can one determine a (relatively) robust decision? By Lemma 3 we can limit attention to maximizing the performance index over all efficient decisions in our search for robust decisions. Thus, to be guaranteed an efficient decision which is also approximately pseudo-robust, by virtue of Proposition 2 we introduce the $ε$ -augmented performance index,

Φ_{ε} (x) = (1 - ε) \min_{i \in N} ϕ_{i} (x) + (ε / n) \sum_{i \in N} ϕ_{i} (x), (x, ε) \in X \times [0, 1],

(12)

so that $Φ_{0} = ρ = \min_{i \in N} ϕ_{i}$, and $Φ_{1} = (1 / n) \sum_{i \in N} ϕ_{i}$. Because $Φ_{ε} (\cdot)$ is a continuous function, its set of maximizers, referred to as the set of $ε$-robust actions,

ℛ_{ε} = \arg \max_{x \in X} Φ_{ε} (x), ε \in [0, 1],

(13)

is nonempty and compact, again by virtue of the extreme-value theorem and the maximum theorem. It is useful, for our further analysis, to shift the perspective from the available decisions in $X$ to their respective consequences (or “outcomes”) in terms of their robustness performance, given the prevailing weight ambiguity.
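To make Equations (12) and (13) concrete, the following sketch evaluates the $ε$-augmented performance index on a finite action set and returns its maximizers. It assumes strictly positive maximal criterion values $f_{i}^{*}$, so that the ratios $ϕ_{i} = f_{i} / f_{i}^{*}$ are well defined; the four actions and all function names are hypothetical.

```python
def performance_ratios(scores, best):
    """Criterion-specific ratios phi_i = f_i / f_i^* (assumes every f_i^* > 0)."""
    return [s / b for s, b in zip(scores, best)]


def augmented_index(phi, eps):
    """Equation (12): (1 - eps) * min_i phi_i + (eps / n) * sum_i phi_i."""
    return (1.0 - eps) * min(phi) + (eps / len(phi)) * sum(phi)


def eps_robust_actions(actions, criteria, eps, tol=1e-12):
    """Equation (13) on a finite action set: all maximizers of the augmented index."""
    scores = [criteria(x) for x in actions]
    best = [max(column) for column in zip(*scores)]  # f_i^* over the finite set
    values = [augmented_index(performance_ratios(s, best), eps) for s in scores]
    top = max(values)
    return [x for x, v in zip(actions, values) if v >= top - tol]


def coordinate_criteria(x):
    """Illustrative criteria f_i(x) = x_i."""
    return x


# Hypothetical finite action set.
actions = [(1.0, 1.0), (1.0, 2.0), (3.0, 0.5), (0.1, 3.0)]
for eps in (0.0, 0.05, 1.0):
    print(eps, eps_robust_actions(actions, coordinate_criteria, eps))
```

In this toy instance the tie between the first two actions at $ε = 0$ is broken in favor of the efficient one as soon as $ε$ is positive.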

Remark 9.

The (nonempty, compact) outcome space,

Y = {y \in {[0, 1]}^{n} : y_{i} = ϕ_{i} (x), (x, i) \in X \times N},

(14)

contains the vectors $y$ of criterion-specific performance ratios, achieved by the available decisions $x$, so $Y = ϕ (X)$, where $ϕ = (ϕ_{1}, \dots, ϕ_{n})$. As several feasible decisions might result in the same score vector, the mapping $ϕ : X \to Y$ may not be one-to-one. Consider now the $ε$-augmented performance index in the outcome space,

{\hat{Φ}}_{ε} (y) = (1 - ε) \min_{i \in N} y_{i} + (ε / n) \sum_{i \in N} y_{i}, (y, ε) \in Y \times R_{+} .

Using ideas from Example 1, maximization of this weighted objective yields

{\hat{Φ}}_{ε}^{*} = \max_{y \in Y} {\hat{Φ}}_{ε} (y) = \max_{t \in T} {(1 - ε) t + ε \hat{μ} (t)}, ε \in [0, 1],

where $T = [t_{1}, t_{2}]$ is a suitable compact interval, and

\hat{μ} (t) = \max {(1 / n) \sum_{i \in N} y_{i} : t \leq y_{i}, (y, i) \in Y \times N}, t \in T,

denotes the average (criterion-specific) performance ratio. The interval boundaries $t_{1}, t_{2} \in T$, with $0 < t_{1} \leq t_{2} \leq 1$, are given by

t_{1} = \min_{y \in Y} \min_{i \in N} y_{i} and t_{2} = \max_{y \in Y} \min_{i \in N} y_{i} .

In addition, one can easily verify that

{\hat{Φ}}_{0}^{*} = t_{2} = ρ^{*} = \max ρ (X) and {\hat{Φ}}_{1}^{*} = \hat{μ} (t_{1}) = (1 / n) \max_{y \in Y} \sum_{i \in N} y_{i} \geq t_{2} .

Proposition 1 implies that any selection $t_{ε} \in T_{ε} = \arg \max_{t \in T} {(1 - ε) t + ε \hat{μ} (t)}$ must be decreasing in $ε$, and similarly, $\hat{μ} (t_{ε})$ must be increasing in $ε$, at least weakly. The latter also follows directly from the monotonicity of $t_{ε}$, because $\hat{μ} (t)$ must be nonincreasing in $t$.
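The reduction in Remark 9 from a maximization over $Y$ to a one-dimensional maximization over $t$ can be illustrated on a finite outcome set. The sketch below assumes a hypothetical finite $Y$; in that case it suffices to scan the thresholds $t = \min_{i} y_{i}$ realized by some outcome, which is a simplification made here and not a statement from the paper.

```python
def phi_hat(y, eps):
    """Augmented index in the outcome space."""
    return (1.0 - eps) * min(y) + eps * sum(y) / len(y)


def mu_hat(Y, t):
    """Largest average ratio among outcomes whose components are all at least t."""
    return max(sum(y) / len(y) for y in Y if min(y) >= t)


def value_via_thresholds(Y, eps):
    """max_t {(1 - eps) t + eps mu_hat(t)}, scanning realized thresholds only."""
    thresholds = sorted({min(y) for y in Y})
    return max((1.0 - eps) * t + eps * mu_hat(Y, t) for t in thresholds)


# Hypothetical finite outcome set of performance-ratio vectors.
Y = [(1 / 3, 1 / 3), (1 / 3, 2 / 3), (1.0, 1 / 6), (0.1, 1.0)]
for eps in (0.0, 0.05, 0.5, 1.0):
    direct = max(phi_hat(y, eps) for y in Y)
    print(eps, direct, value_via_thresholds(Y, eps))
```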

Lemma 4.

Let $ε \in (0, 1]$ . (i) $ℛ_{0} = Ψ$ . (ii) $ℛ_{ε} \subset P$ . (iii) If $x \in ℛ_{ε} \ Ψ$ and $\hat{x} \in Ψ$ , then there exists $ε^{'} \in (0, ε)$ such that $Φ_{\hat{ε}} (x) < Φ_{\hat{ε}} (\hat{x})$ , for all $\hat{ε} \in [0, ε^{'}]$ .

Part (i) of the preceding result notes that without $ε$ -augmentation (i.e., for $ε = 0$ ) the weighted objective $Φ_{ε}$ specializes to the performance index (via Proposition 2), the maximization of which produces the set of pseudo-robust decisions, as in Equation (7). Part (ii) stipulates that whenever the $ε$ -augmentation is nontrivial (i.e., for $ε > 0$ ), maximization yields efficient actions. Finally, part (iii) means that if a decision $x$ maximizes the $ε$ -augmented performance index in Equation (12), for some nontrivial $ε \in (0, 1]$ , without being pseudo-robust, then any pseudo-robust decision $\hat{x} \in Ψ$ would strictly improve upon $x$ in terms of any $\hat{ε}$ -augmented performance index, as long as $\hat{ε}$ (smaller than $ε$ ) lies in a sufficiently small right-neighborhood of the origin.

We are now able to establish a cornerstone property for the practice of robust multicriteria optimization, in the sense that a robust decision (i.e., an element of $R$ ) can be obtained as a lower limit of the set of $ε$ -robust actions, for $ε \to 0^{+}$ .²¹ In Section 3.5, we then show that $ε > 0$ can be chosen so as to guarantee an approximation of the optimal performance $ρ^{*}$ up to any desired precision.

Proposition 3.

Let $Q = {\underline{Lim}}_{ε \to 0^{+}} ℛ_{ε}$ . Then $Q \subset R$ and $Q \neq \emptyset$ .

The following example shows that it is possible that $Q ⊊ R$ . In other words, some points in the robust decision set might not be reached using the proposed approximation procedure.

Example 6.

Consider a (nonconvex, compact) action set as shown in Figure 4(a), specified by

X = {x \in [0, 1] \times R_{+} : x_{2} \leq 2 + \sqrt{1 - x_{1}}} \cup {(2, \frac{3}{2})} .

Figure 4. Action Sets in Examples 6 and 7
*Notes.* (a) Action set in Example 6 with ${\underline{Lim}}_{ε \to 0^{+}} ℛ_{ε} = {(2, \frac{3}{2})} ⊊ R = {(1, 2), (2, \frac{3}{2})}$ . (b) Action set in Example 7 with ${\underline{Lim}}_{ε \to 0^{+}} ℛ_{ε} = {(1, 2)} = R$ .

There are $n = 2$ criteria that simply measure the coordinate achievement, so $f_{i} (x) = x_{i}$ , for all $(x, i) \in X \times N$ . The set of efficient actions is

P = {x \in [0, 1] \times R_{+} : x_{2} = 2 + \sqrt{1 - x_{1}}} \cup {(2, \frac{3}{2})},

whereas the set of pseudo-robust actions is given by

Ψ = {(1, x_{2}) : \frac{3}{2} \leq x_{2} \leq 2} \cup {(2, \frac{3}{2})} .

By its definition in Equation (11) the set of robust actions can be obtained as the intersection of the preceding sets:

R = Ψ \cap P = {(1, 2), (2, \frac{3}{2})} .

At this point, let us consider the $ε$ -augmented performance index in Equation (12). By Lemma 4 we have $ℛ_{0} = Ψ$ , and for $ε \in (0, 1]$ the maximizer of $Φ_{ε}$ is efficient, so

Φ_{ε}^{*} = \max_{x \in P} Φ_{ε} (x) = \max {\max_{x_{1} \in [0, 1]} {\frac{(1 - ε) x_{1}}{2} + \frac{ε}{2} (\frac{x_{1}}{2} + \frac{2 + \sqrt{1 - x_{1}}}{3})}, Φ_{ε} (2, \frac{3}{2})},

where $ϕ_{i} = f_{i} / f_{i}^{*}$, for $i \in N$, with $(f_{1}^{*}, f_{2}^{*}) = (2, 3)$, and $Φ_{ε} (2, \frac{3}{2}) = (2 + ε) / 4$. By direct computation one finds that the maximizer of the (nontrivially) $ε$-augmented performance index,²²

ℛ_{ε} = {(2, \frac{3}{2})}, ε \in (0, 1],

is a singleton. Proposition 3 then guarantees that the lower limit of this maximizer is robust:

Q = {\underline{Lim}}_{ε \to 0^{+}} ℛ_{ε} = {(2, \frac{3}{2})} ⊊ {(1, 2), (2, \frac{3}{2})} = R .

The fact that $(2, \frac{3}{2})$ is an isolated point (cf. Endnote 12) is not important, because one could easily connect it to $X$ by adding points to the action set that are always suboptimal.²³
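The claim in Example 6 that the isolated point is the unique maximizer for every $ε \in (0, 1]$ can be checked numerically. The sketch below compares the value at $(2, \frac{3}{2})$ with a grid search along the efficient curve $x_{2} = 2 + \sqrt{1 - x_{1}}$; the grid resolution and the function names are arbitrary choices made for illustration.

```python
import math

F_STAR = (2.0, 3.0)      # maximal criterion values (f_1^*, f_2^*) in Example 6
ISOLATED = (2.0, 1.5)    # the isolated action (2, 3/2)


def phi_eps(x, eps):
    """Equation (12) with n = 2 and coordinate criteria f_i(x) = x_i."""
    ratios = [x[0] / F_STAR[0], x[1] / F_STAR[1]]
    return (1.0 - eps) * min(ratios) + (eps / 2.0) * sum(ratios)


def best_on_curve(eps, steps=20001):
    """Grid search over the efficient curve x_2 = 2 + sqrt(1 - x_1), x_1 in [0, 1]."""
    grid = (k / (steps - 1) for k in range(steps))
    return max(phi_eps((x1, 2.0 + math.sqrt(1.0 - x1)), eps) for x1 in grid)


for eps in (0.01, 0.1, 0.5, 1.0):
    print(eps, phi_eps(ISOLATED, eps), best_on_curve(eps))
```

At $ε = 0$ the two robust actions $(1, 2)$ and $(2, \frac{3}{2})$ tie at a performance index of $1 / 2$, which is why only the augmentation separates them.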

3.5. Approximation Error of $ε$ -Robust Decisions

Maximizing the $ε$-augmented performance index $Φ_{ε}$ in Equation (12) enables us to approximate a robust decision to an arbitrary prespecified precision, as a function of $ε \in (0, 1]$. The quality of any approximate decision $x_{ε} \in ℛ_{ε}$ in Equation (13) is gauged by its approximation error,

ψ_{ε} = ρ^{*} - ρ_{ε} \in [0, 1], ε \in [0, 1],

(15)

that is, the difference between the optimal performance index, $ρ^{*} = \max ρ (X)$, and the achieved performance index, $ρ_{ε} = ρ (x_{ε})$. This quantity measures the loss in robustness from using the $ε$-approximation rather than the true robust decision. For the remainder of this discussion, we keep the selection $x_{ε} \in ℛ_{ε}$ fixed, for all $ε \in [0, 1]$. If we denote by $μ_{ε} = μ (x_{ε}) = (1 / n) \sum_{i \in N} ϕ_{i} (x_{ε})$ the average performance ratio achieved by $x_{ε}$, the optimal $ε$-augmented performance index becomes

Φ_{ε}^{*} = \max Φ_{ε} (X) = (1 - ε) ρ_{ε} + ε μ_{ε}, ε \in [0, 1] .

(16)

The following result guarantees continuity, as well as first- and second-order monotonicity of the optimal $ε$ -augmented performance index.

Lemma 5.

$Φ_{ε}^{*}$ is continuous, increasing, and convex in $ε \in [0, 1]$ .

The proof of the first-order monotonicity uses the fact that $μ_{ε} \geq ρ_{ε}$ (for all $ε \in [0, 1]$ ), together with natural properties of an optimal solution to Equation (13) in order to establish that $Φ_{ε}^{*}$ in Equation (16) must increase in $ε$ (at least weakly). That the optimal $ε$ -augmented performance index also exhibits a second-order monotonicity means that tightening the approximation parameter further and further leads to progressively slower decreases of $Φ_{ε}^{*}$ toward $Φ_{0}^{*} = ρ^{*}$ (as $ε \to 0^{+}$ ). The convexity of the optimal $ε$ -augmented performance index also implies its smoothness (almost everywhere), as pointed out next.

Remark 10.

By the Rademacher theorem (Villani 2008, theorem 10.8, p. 222), the convexity of $Φ_{ε}^{*}$ in $ε \in [0, 1]$ , which implies Lipschitz continuity, guarantees that the optimal $ε$ -augmented performance index is differentiable almost everywhere (a.e.) on [0, 1]. The Alexandrov theorem (Villani 2008, theorem 14.25, p. 402) goes even further by establishing its second-order differentiability a.e., in the sense of having a Taylor expansion with a smaller-than-quadratic local error at almost every point $ε \in [0, 1]$ . By the envelope theorem (see, e.g., Mas-Colell et al. 1995, theorem M.L.1, pp. 965–966), at points of differentiability one therefore obtains:

\frac{d Φ_{ε}^{*}}{d ε} = μ_{ε} - ρ_{ε} \geq 0, \dot{\forall} ε \in [0, 1] .

In other words, the gradient of the optimal $ε$ -augmented performance index is (a.e.) equal to the difference between the average performance ratio ( $μ_{ε}$ ) and the minimum performance ratio ( $ρ_{ε}$ ) attained by the approximately robust decision $x_{ε}$ .

The maximized weighted objective in Equation (16) is a convex combination of $ρ_{ε}$ and $μ_{ε}$ ; the latter both exhibit “natural” comparative statics as implied by the reweighting result in Proposition 1.

Lemma 6.

(i) $ρ_{ε}$ is decreasing in $ε \in [0, 1]$ . (ii) $ρ^{*} \geq ρ_{ε}$ , for all $ε \in [0, 1]$ . (iii) $\lim_{ε \to 0^{+}} ρ_{ε} = ρ^{*}$ . (iv) $μ_{ε}$ is increasing in $ε \in [0, 1]$ . (v) $μ_{ε} \geq ρ^{*}$ , for all $ε \in [0, 1]$ .

Parts (i) and (iv) of Lemma 6 assert that decreasing the augmentation parameter $ε$ can only decrease the average criterion-specific performance ratio $μ_{ε}$ and at the same time increase the performance index $ρ_{ε}$, where both are achieved at a given $ε$-robust selection $x_{ε}$. Meanwhile, by parts (ii) and (v) the optimal performance index $ρ^{*} = \max ρ (X)$ is bracketed by these two values, in the sense that $ρ_{ε} \leq ρ^{*} \leq μ_{ε}$, for all $ε \in [0, 1]$. Finally, parts (i) and (iii) establish the monotone convergence of $ρ_{ε}$ to $ρ^{*}$, as $ε \to 0^{+}$.

The following result provides a performance guarantee for any $ε$ -approximation of our robust multicriteria decision problem, alluded to at the outset of our discussion.

Proposition 4.

Fix any $δ > 0$ . If $ε \leq δ / μ_{1}$ , then $ψ_{ε} \leq δ$ .

A special case of the preceding result is that for any $δ > 0$ the approximation error $ψ_{δ / μ_{1}}$ cannot exceed $δ$ . In other words, any $ε$ -robust decision $x_{ε}$ , for $ε = δ / μ_{1}$ , attains a performance index $ρ_{ε} \in [ρ^{*} - δ, ρ^{*}]$ , where $ρ^{*}$ is the optimal performance index. The following example applies this result in an already familiar context.

Example 7.

Somewhat similar to Example 6, we consider the (nonconvex, compact) action set

X = {x \in [0, 1] \times R_{+} : x_{2} \leq 2 + \sqrt{1 - x_{1}}} \cup {(3, \frac{1}{2})},

together with the same $n = 2$ criteria $f_{i} (x) = x_{i}$, for all $(x, i) \in X \times N$. Consequently, the set of efficient actions is $P = {x \in [0, 1] \times R_{+} : x_{2} = 2 + \sqrt{1 - x_{1}}} \cup {(3, \frac{1}{2})}$, and the set of pseudo-robust actions is $Ψ = {(1, x_{2}) : 1 \leq x_{2} \leq 2}$. Thus, by Equation (11) the set of robust actions amounts to $R = Ψ \cap P = {(1, 2)}$, which is a singleton. By Lemma 4 any maximizer of the $ε$-augmented performance index $Φ_{ε}$ in Equation (12) is efficient (i.e., $ℛ_{ε} \subset P$), for all $ε \in (0, 1]$, and $ℛ_{0} = Ψ$. In particular, the optimal value of the $ε$-augmented performance index is

Φ_{ε}^{*} = \max_{x \in P} Φ_{ε} (x) = \max {\max_{x_{1} \in [0, 1]} {\frac{(1 - ε) x_{1}}{3} + \frac{ε}{2} (\frac{x_{1}}{3} + \frac{2 + \sqrt{1 - x_{1}}}{3})}, Φ_{ε} (3, \frac{1}{2})},

where $ϕ_{i} = f_{i} / f_{i}^{*}$, for $i \in N$, with $(f_{1}^{*}, f_{2}^{*}) = (3, 3)$, and $Φ_{ε} (3, \frac{1}{2}) = (2 + 5 ε) / 12$. For sufficiently small $ε$, the maximizer,

ℛ_{ε} = {(1 - {(\frac{ε / 2}{2 - ε})}^{2}, 2 + \frac{ε / 2}{2 - ε})}, 0 < ε < \frac{4}{7} (2 - \frac{1}{\sqrt{2}}) \approx 0.7388,

is single-valued, achieving $Φ_{ε}^{*} = (1 / 24) (16 - 3 ε^{2}) / (2 - ε)$.²⁴ By Proposition 3 the lower limit of this maximizer for $ε \to 0^{+}$ is robust, and in our case:

Q = {\underline{Lim}}_{ε \to 0^{+}} ℛ_{ε} = {(1, 2)} = R;

see Figure 4(b). Note also that $ℛ_{1} = {(3, \frac{1}{2})}$, which in turn yields the average criterion-specific performance ratio achieved by the decision $x_{ε} \in ℛ_{ε}$, for $ε = 1$:

μ_{1} = \frac{1}{2} {(\frac{x_{1}}{f_{1}^{*}} + \frac{x_{2}}{f_{2}^{*}}) |}_{(x_{1}, x_{2}) = (3, \frac{1}{2})} = \frac{7}{12} \approx 0.5833 .

Thus, to guarantee that the approximation error $ψ_{ε}$ in Equation (15) cannot exceed $δ = 5 %$ , it is by Proposition 4 enough to find an action $x_{ε}$ that maximizes the $ε$ -augmented performance index $Φ_{ε}$ in Equation (12) for some $ε \in (0, δ / μ_{1}] \approx (0, 8.6 %]$ . Incidentally, for $ε = 8 %$ , we find $ρ_{ε} \approx 33.32 %$ , so that the realized approximation error of $ψ_{ε} \approx 0.01 %$ (with respect to $ρ^{*} = 1 / 3$ ) is actually much smaller than the prespecified 5% approximation-error bound.
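The numbers quoted in Example 7 can be reproduced with a few lines of code. The sketch below uses the closed-form maximizer stated in the example, checks it against a grid search on the efficient curve, and evaluates the tolerance rule of Proposition 4; the helper names and the grid resolution are assumptions made for illustration.

```python
import math

RHO_STAR = 1.0 / 3.0  # optimal performance index in Example 7


def ratios(x):
    """phi_i = f_i / f_i^* with f_i(x) = x_i and (f_1^*, f_2^*) = (3, 3)."""
    return (x[0] / 3.0, x[1] / 3.0)


def rho(x):
    return min(ratios(x))


def mu(x):
    return sum(ratios(x)) / 2.0


def phi_eps(x, eps):
    """Equation (12) for n = 2."""
    return (1.0 - eps) * rho(x) + eps * mu(x)


def closed_form_maximizer(eps):
    """Maximizer stated in Example 7 (valid for sufficiently small eps)."""
    s = (eps / 2.0) / (2.0 - eps)
    return (1.0 - s ** 2, 2.0 + s)


def best_on_curve(eps, steps=20001):
    """Grid search over x_2 = 2 + sqrt(1 - x_1), x_1 in [0, 1]."""
    grid = (k / (steps - 1) for k in range(steps))
    return max(phi_eps((x1, 2.0 + math.sqrt(1.0 - x1)), eps) for x1 in grid)


def eps_for_tolerance(delta, mu_1):
    """Largest eps covered by Proposition 4 for a tolerance delta."""
    return delta / mu_1


mu_1 = mu((3.0, 0.5))
eps = 0.08
x_eps = closed_form_maximizer(eps)
psi_eps = RHO_STAR - rho(x_eps)  # approximation error, Equation (15)
print(eps_for_tolerance(0.05, mu_1), rho(x_eps), psi_eps)
```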

It is also possible to derive a priori performance estimates without knowledge of the optimal performance index ( $ρ^{*}$ ) just by solving the $ε$ -approximation problem in Equation (13) for some admissible $ε$ . Indeed, Lemma 6 (ii) and Lemma 5 together imply that

ρ_{ε} \leq ρ^{*} \leq Φ_{ε}^{*}, ε \in [0, 1] .

(17)

For any $ε \in [0, 1]$ , let us now consider the midpoint estimator,

{\hat{ρ}}_{ε} = \frac{ρ_{ε} + Φ_{ε}^{*}}{2} = ρ_{ε} + ε \frac{μ_{ε} - ρ_{ε}}{2} = ρ (x_{ε}) + ε \frac{μ (x_{ε}) - ρ (x_{ε})}{2}, ε \in [0, 1] .

(18)

The latter means that ${\hat{ρ}}_{ε}$ effectively approximates $ρ^{*}$ , for $ε \to 0^{+}$ .

Lemma 7.

$| {\hat{ρ}}_{ε} - ρ^{*} | \leq ε (μ_{ε} - ρ_{ε}) / 2$ , for all $ε \in [0, 1]$ .

Because $μ_{ε} - ρ_{ε} \in [0, 1]$ , a somewhat simpler (though generally less precise) approximation inequality than the one given in the preceding Lemma 7 is $| {\hat{ρ}}_{ε} - ρ^{*} | \leq ε / 2$ , for all $ε \in [0, 1]$ .
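As a quick illustration of Equations (16) to (18) and Lemma 7, the following sketch computes the midpoint estimator for the $ε$-robust decision of Example 7 at $ε = 8 %$ and compares its error with the bound of Lemma 7. The function names are illustrative, and the value $ρ^{*} = 1 / 3$ is used only to check the bound, not to compute the estimator.

```python
def optimal_augmented_value(rho_eps, mu_eps, eps):
    """Equation (16): Phi_eps^* = (1 - eps) rho_eps + eps mu_eps."""
    return (1.0 - eps) * rho_eps + eps * mu_eps


def midpoint_estimate(rho_eps, mu_eps, eps):
    """Equation (18): midpoint of the bracket [rho_eps, Phi_eps^*] in (17)."""
    return 0.5 * (rho_eps + optimal_augmented_value(rho_eps, mu_eps, eps))


def lemma7_bound(rho_eps, mu_eps, eps):
    """Right-hand side of Lemma 7."""
    return eps * (mu_eps - rho_eps) / 2.0


# eps-robust decision from Example 7, with f_i(x) = x_i and f_i^* = 3.
eps = 0.08
s = (eps / 2.0) / (2.0 - eps)
x_eps = (1.0 - s ** 2, 2.0 + s)
rho_eps = min(x_eps[0] / 3.0, x_eps[1] / 3.0)
mu_eps = (x_eps[0] / 3.0 + x_eps[1] / 3.0) / 2.0

rho_hat = midpoint_estimate(rho_eps, mu_eps, eps)
print(rho_hat, abs(rho_hat - 1.0 / 3.0), lemma7_bound(rho_eps, mu_eps, eps))
```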

Remark 11.

For a robust decision $\hat{x} \in R$ let $\hat{y} = (f_{1} (\hat{x}), \dots, f_{n} (\hat{x}))$ ; in addition, let $y^{*} = (f_{1}^{*}, \dots, f_{n}^{*})$ be the utopian point in the outcome space. Because by construction ${\hat{y}}_{i} \geq ρ^{*} y_{i}^{*}$ , for all $i \in N$ , it is

ρ^{*} \geq \underline{ρ} = \sup {r \in [0, 1] : r y^{*} \in Y},

where $\sup \emptyset = 0$. In the balanced case, where $f_{i} (\hat{x}) = ρ^{*} f_{i}^{*}$, for all $i \in N$, the lower bound becomes tight (i.e., $ρ^{*} = \underline{ρ}$). However, in general $\underline{ρ}$ may not be very reliable (e.g., generically worse than the performance index $ρ_{d} = ρ (x_{d})$, attained by the default decision $x_{d}$).

3.6. Robust Weights

3.6.1. General Case.

An interesting and useful byproduct of a robust decision $\hat{x} \in R$ is its associated robust (normalized) weight,

\hat{λ} = (\frac{\hat{h}}{f_{1} (\hat{x})}, \dots, \frac{\hat{h}}{f_{n} (\hat{x})}) \in Δ,

(19)

where the positive constant

\hat{h} = {(\sum_{i \in N} \frac{1}{f_{i} (\hat{x})})}^{- 1}

(20)

denotes the harmonic mean of the criterion scores at the robust decision. By construction one obtains that the contribution to the robustly weighted objective $F (\cdot | \hat{λ})$, evaluated at the robust decision $\hat{x}$, is equal for all criteria, in the sense that

\max_{x \in X} \min_{i \in N} {\hat{λ}}_{i} f_{i} (x) = \hat{h} = {\hat{λ}}_{i} f_{i} (\hat{x}), i \in N .

(21)

When the action set is convex, then the chosen robust action naturally also maximizes the robustly weighted criterion, that is, $\hat{x} \in X (\hat{λ})$ . One can think of the robust weight as an endogenous belief that can be used to evaluate the expected criterion achievement. It defines the tradeoffs compatible with the robust decision, where the latter was found while remaining agnostic over all possible weights (in $Δ$ ).
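Equations (19) and (20) translate directly into code. The sketch below assumes strictly positive criterion scores at the robust decision and uses the robust decision $(1, 2)$ of Example 7 (with $f_{i} (x) = x_{i}$) as input; the function name `robust_weight` is illustrative.

```python
def robust_weight(scores):
    """Return (lambda_hat, h_hat) from Equations (19) and (20).

    Assumes every criterion score f_i(x_hat) is strictly positive.
    """
    if any(s <= 0 for s in scores):
        raise ValueError("criterion scores must be strictly positive")
    h_hat = 1.0 / sum(1.0 / s for s in scores)   # Equation (20)
    lambda_hat = [h_hat / s for s in scores]     # Equation (19)
    return lambda_hat, h_hat


# Criterion scores at the robust decision (1, 2) of Example 7.
lambda_hat, h_hat = robust_weight([1.0, 2.0])
contributions = [l * s for l, s in zip(lambda_hat, [1.0, 2.0])]
print(lambda_hat, h_hat, contributions)
```

Each entry of `contributions` equals `h_hat`, which is the equal-contribution property stated in Equation (21) at the robust decision.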

3.6.2. Balanced Case.

In the case where $f_{i} (\hat{x}) = ρ^{*} f_{i}^{*}$ , for all $i \in N$ (cf. Remark 11), we have

\hat{λ} = (\frac{h^{*}}{f_{1}^{*}}, \dots, \frac{h^{*}}{f_{n}^{*}}),

(19′)

where

h^{*} = {(\sum_{i \in N} \frac{1}{f_{i}^{*}})}^{- 1}

(20′)

is the harmonic mean of the maximum criterion scores. The balancedness condition then becomes

ρ^{*} = \frac{f_{i} (\hat{x})}{f_{i}^{*}} = \frac{\hat{h}}{h^{*}}, i \in N .

In the balanced case, the robust weight can therefore be determined based on the maximized individual criteria alone. If all maximized individual criteria are equal, the robust weight becomes uniform. Indeed, $f_{i}^{*} = c > 0$ , for all $i \in N$ , implies that $h^{*} = c / n$ , so $\hat{λ} = (1 / n, \dots, 1 / n)$ .²⁵

Example 8.

In Example 1, we determined the value function $F^{*} (ℓ) = (1 - ℓ) \hat{t} + ℓ μ (\hat{t})$ , so that by Equations (19′) and (20′) in this balanced case (as established in Example 2):

\hat{λ} = (1 - \hat{ℓ}, \hat{ℓ}) = (\frac{μ (\hat{t})}{\hat{t} + μ (\hat{t})}, \frac{\hat{t}}{\hat{t} + μ (\hat{t})}) = (\frac{f_{2}^{*}}{f_{1}^{*} + f_{2}^{*}}, \frac{f_{1}^{*}}{f_{1}^{*} + f_{2}^{*}}) .

Generically, it is $F^{*} (\hat{ℓ}) \neq F (\hat{x} | \hat{ℓ})$, even when the decision set is convex. To see this, let $m = n = 2$ and consider the convex domain, $X = {x \in R_{+}^{2} : ν_{1} x_{1} + ν_{2} x_{2} \leq 1}$, where the constants $ν_{1}, ν_{2}$ are such that $0 < ν_{1} < ν_{2} < \infty$. Then $μ (t) = (1 - ν_{2} t) / ν_{1} + t$, so that $μ^{'} (t) = 1 - (ν_{2} / ν_{1}) < 0$. Meanwhile, $F (t | ℓ) = (1 - ℓ) t + ℓ μ (t)$, for $t \in [0, f_{1}^{*}] = [0, 1 / (ν_{1} + ν_{2})]$, and

F^{'} (t | ℓ) = (1 - ℓ) + ℓ μ^{'} (t) = 1 - \frac{ℓ ν_{2}}{ν_{1}} \geq 0 \Leftrightarrow ℓ \leq \frac{ν_{1}}{ν_{2}} .

Therefore, $F^{*} (ℓ) = F (t^{*} (ℓ) | ℓ) = (ℓ / ν_{1}) + \max {0, 1 - (ℓ ν_{2} / ν_{1})} f_{1}^{*}$ , where

t^{*} (ℓ) \in \begin{cases} {f_{1}^{*}}, & \text{if } ℓ < ν_{1} / ν_{2}, \\ [0, f_{1}^{*}], & \text{if } ℓ = ν_{1} / ν_{2}, \\ {0}, & \text{if } ℓ > ν_{1} / ν_{2}, \end{cases}

and $f_{1}^{*} = 1 / (ν_{1} + ν_{2})$. Because $f_{2}^{*} = 1 / ν_{1}$, we find $ν_{1} / ν_{2} = f_{1}^{*} / (f_{2}^{*} - f_{1}^{*})$. By virtue of the fact that $\hat{ℓ} = f_{1}^{*} / (f_{1}^{*} + f_{2}^{*}) < ν_{1} / ν_{2}$, it is $t^{*} (\hat{ℓ}) = f_{1}^{*}$. The balancedness condition,

\frac{\hat{t}}{f_{1}^{*}} = \frac{μ (\hat{t})}{f_{2}^{*}},

is equivalent to $\hat{t} / f_{1}^{*} = 1 - (\hat{t} / f_{1}^{*})$, so $\hat{t} = f_{1}^{*} / 2$. Hence, we find that $\hat{t} \neq t^{*} (\hat{ℓ})$. Let us check the corresponding performance index. Indeed,

ρ (\hat{t}) = 0.5 > 0 = \min {\frac{f_{1}^{*}}{f_{1}^{*}}, 1 - \frac{f_{1}^{*}}{f_{1}^{*}}} = ρ (t^{*} (\hat{ℓ})) .

Note also,

\hat{ℓ} = \frac{f_{1}^{*}}{f_{1}^{*} + f_{2}^{*}} = \frac{1}{2 + (ν_{2} / ν_{1})} < \frac{1}{3} .

That is, the robust weight puts more than twice as much emphasis on the Rawlsian (or egalitarian) objective as on the utilitarian objective. One can conclude that in general there is no “robustness equivalence principle,” in the sense that substituting the robust parameter into the original scalarization (via maximization of the corresponding weighted objective) might not lead to a robust decision.²⁶
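The closed form of the value function $F^{*} (ℓ)$ in Example 8 and the bound $\hat{ℓ} < 1 / 3$ can be verified numerically. The sketch below fixes the hypothetical constants $ν_{1} = 1$ and $ν_{2} = 2$ and compares the closed form with a grid search over $t \in [0, f_{1}^{*}]$; because $F (t | ℓ)$ is linear in $t$ here, the grid search is exact at the endpoints.

```python
def mu(t, nu1, nu2):
    """mu(t) = (1 - nu2 t) / nu1 + t, as in Example 8."""
    return (1.0 - nu2 * t) / nu1 + t


def F(t, ell, nu1, nu2):
    """Weighted objective F(t | ell) = (1 - ell) t + ell mu(t)."""
    return (1.0 - ell) * t + ell * mu(t, nu1, nu2)


def F_star_closed(ell, nu1, nu2):
    """Closed-form value function stated in Example 8."""
    f1_star = 1.0 / (nu1 + nu2)
    return ell / nu1 + max(0.0, 1.0 - ell * nu2 / nu1) * f1_star


def F_star_grid(ell, nu1, nu2, steps=2001):
    """Grid maximization of F(t | ell) over t in [0, f_1^*]."""
    f1_star = 1.0 / (nu1 + nu2)
    return max(F(f1_star * k / (steps - 1), ell, nu1, nu2) for k in range(steps))


def robust_ell(nu1, nu2):
    """ell_hat = f_1^* / (f_1^* + f_2^*), with f_1^* = 1/(nu1 + nu2) and f_2^* = 1/nu1."""
    f1_star, f2_star = 1.0 / (nu1 + nu2), 1.0 / nu1
    return f1_star / (f1_star + f2_star)


NU1, NU2 = 1.0, 2.0  # hypothetical constants with 0 < nu1 < nu2
for ell in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(ell, F_star_closed(ell, NU1, NU2), F_star_grid(ell, NU1, NU2))
print(robust_ell(NU1, NU2))
```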

Example 9.

Consider the robust allocation of two resources to $n = 2$ agents. The total amount of each resource has been normalized to one. Allocations are determined as decisions $x = (x_{1}, x_{2}) \in {[0, 1]}^{2}$ , under which agent 1 obtains $x_{1}$ of resource 1 and $x_{2}$ of resource 2, whereas agent 2 obtains $1 - x_{1}$ of resource 1 and $1 - x_{2}$ of resource 2. Agent 1’s utility is $f_{1} (x) = x_{1}^{α} x_{2}^{1 - α}$ , and agent 2’s utility is $f_{2} (x) = {(1 - x_{1})}^{β} {(1 - x_{2})}^{1 - β}$ , where $α, β \in (0, 1)$ are given scalars. Figure 5 provides an illustration in the corresponding Edgeworth box (see, e.g., Pareto 1906, p. 187). Because a robust allocation is necessarily Pareto-optimal, both agents’ marginal rates of substitution for the two goods must be equal, so $x \in P$ if and only if

\frac{\partial f_{1} (x) / \partial x_{1}}{\partial f_{1} (x) / \partial x_{2}} = \frac{α}{1 - α} \frac{x_{2}}{x_{1}} = \frac{β}{1 - β} \frac{1 - x_{2}}{1 - x_{1}} = \frac{\partial f_{2} (x) / \partial x_{1}}{\partial f_{2} (x) / \partial x_{2}} .

(22)

Figure 5. Robust Allocation of Resources $\hat{x} = ({\hat{x}}_{1}, {\hat{x}}_{2})$ in Edgeworth Box $X = [0, 1] \times [0, 1]$
*Note.* For two agents with utility functions ( $f_{1}$ and $f_{2}$ ) specified in Example 9 (for $α > β$ ), the unique robust allocation satisfies Pareto-efficiency in Equation (22) and the balancedness condition in Equation (23), under full ambiguity.

In addition, a robust allocation must maximize the performance ratio, and one can verify that this requires the boundary performance ratios to be equal (i.e., a balancedness condition), so

f_{1} (x) = x_{1}^{α} x_{2}^{1 - α} = {(1 - x_{1})}^{β} {(1 - x_{2})}^{1 - β} = f_{2} (x),

(23)

because $f_{1}^{*} = f_{2}^{*} = 1$. In the interesting special case where $α + β = 1$, Equations (22) and (23) together yield the unique robust solution $\hat{x} = (α, β)$, resulting in an optimal performance index of $ρ^{*} = α^{α} β^{β} = α^{α} {(1 - α)}^{1 - α}$, with the last expression being a strictly convex function in $α \in (0, 1)$ achieving its minimum (of $1 / 2$) at $α = 1 / 2$ and $ρ^{*}$ going to one as $α \to {0^{+}, 1^{-}}$.²⁷
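The special case $α + β = 1$ of Example 9 lends itself to a direct numerical check. The sketch below verifies Equations (22) and (23) at $\hat{x} = (α, β)$ for the hypothetical value $α = 0.7$, and confirms on a coarse grid that no allocation achieves a larger minimum utility; the grid resolution and helper names are illustrative choices.

```python
def f1(x, a):
    """Agent 1's utility x_1^a x_2^(1 - a)."""
    return x[0] ** a * x[1] ** (1.0 - a)


def f2(x, b):
    """Agent 2's utility (1 - x_1)^b (1 - x_2)^(1 - b)."""
    return (1.0 - x[0]) ** b * (1.0 - x[1]) ** (1.0 - b)


def mrs1(x, a):
    """Left-hand side of Equation (22)."""
    return a / (1.0 - a) * x[1] / x[0]


def mrs2(x, b):
    """Right-hand side of Equation (22)."""
    return b / (1.0 - b) * (1.0 - x[1]) / (1.0 - x[0])


alpha = 0.7              # hypothetical parameter value
beta = 1.0 - alpha       # special case alpha + beta = 1
x_hat = (alpha, beta)
rho_star = alpha ** alpha * beta ** beta

# Because f_1^* = f_2^* = 1, the performance index equals min(f_1, f_2).
grid = [k / 100.0 for k in range(101)]
grid_best = max(min(f1((u, v), alpha), f2((u, v), beta)) for u in grid for v in grid)
print(rho_star, grid_best)
```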

3.7. Limited Ambiguity

Consider the (nonempty, compact) subset $Ω \subset R_{+}^{n} \ {0}$ which may reflect the decision maker’s a priori knowledge about the relevant weights for the problem at hand, and let

ρ (x | Ω) = \inf_{w \in Ω} φ (x | w), x \in X,

(24)

be the corresponding $Ω$-conditioned performance index. The different weight combinations considered feasible may not necessarily be normalized, especially at an initial stage in practice when information about reasonable weight ranges is being compiled. The function $π : R_{+}^{n} \ {0} \to Δ$, with $π (w) = w / ‖ w ‖_{1}$, describes the radial projection of any nonzero weight $w \in R_{+}^{n}$ onto the unit simplex $Δ$. It allows for the normalization under weight ambiguity; see Remark 2. The radial projection is a nonlinear function which is homogeneous of degree zero (i.e., $π (α w) = π (w)$, for all $α > 0$ and all $w \in R_{+}^{n} \ {0}$). It can be represented using the conical closure (see, e.g., Berge 1963, p. 14), $C (Ω) = {\hat{w} \in R_{+}^{n} : \hat{w} = α w, w \in Ω, α \in R_{+}}$, as described, among other useful properties of $π (\cdot)$, in the following auxiliary result.

Lemma 8.

Let $Ω$ be a (nonempty, compact) subset of $R_{+}^{n} \ {0}$ . Then (i) (a) $π (Ω) = C (Ω) \cap Δ$ is nonempty and compact, and (b) $π (Ω) = π (\partial Ω)$ . (ii) If $Ω^{'} \subset Ω$ is open (in $R_{+}^{n}$ ), then $π (Ω^{'})$ is open (in $Δ$ ). (iii) If $Ω^{'} \subseteq Ω$ and $Ω^{'} \neq \emptyset$ , then $\partial π (Ω^{'}) = \partial π (\partial Ω^{'})$ is nonempty and compact. (iv) If $Ω$ is convex, then $π (Ω)$ is convex. (v) If $Ω^{'} \subset Ω$ is a straight line segment, then $π (Ω^{'})$ is a straight line segment (or a point). (vi) $π (co (Ω)) = co (π (\partial Ω))$ . (vii) If $Ω = Ω^{'} \cup Ω^{″}$ for some $Ω^{'}, Ω^{″} \subset R_{+}^{n}$ , then $π (Ω) = π (Ω^{'}) \cup π (Ω^{″})$ . (viii) If $Ω = Ω^{'} \cap Ω^{″}$ for some $Ω^{'}, Ω^{″} \subset R_{+}^{n}$ , then $π (Ω) = C (Ω^{'} \cap Ω^{″}) \cap Δ$ .

Part (i) of Lemma 8 notes that (a) the radial projection of $Ω$ onto $Δ$ can be obtained by simply intersecting the conical closure of $Ω$ with $Δ$, and (b) one can limit attention to the boundary $\partial Ω$ (ignoring interior points of $Ω$). Although a continuous function generally does not map open sets to open sets (e.g., a constant function would not), part (ii) guarantees that $π (\cdot)$ does exactly that, provided the “interiority” of a point in $Ω$ is assessed in $R_{+}^{n}$ and then, after its projection, in the lower-dimensional $Δ$. Part (iii) ensures that weight ambiguity in any (nonempty) subset $Ω^{'}$ of $Ω$ leads to a compact domain $\partial π (\partial Ω^{'})$ for a robust decision according to Proposition 5 below. Part (iv) establishes that convex sets are projected to convex sets.²⁸ Following (v) and (vi), the radial projection leaves straight-line geometries intact, thus, for example, converting (bounded) polyhedra in $R_{+}^{n}$ to polytopes in $Δ$. Finally, parts (vii) and (viii) note that the radial projection of a union of sets is the union of the corresponding single-set projections, but the same generally does not apply to an intersection.²⁹
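The radial projection $π (w) = w / ‖ w ‖_{1}$ and the contrast between unions and intersections noted for parts (vii) and (viii) can be illustrated with two hypothetical one-point weight sets. The sketch below is a toy illustration with made-up sets, not a proof of Lemma 8.

```python
def radial_projection(w):
    """pi(w) = w / ||w||_1 for a nonzero vector with nonnegative components."""
    if any(c < 0 for c in w) or sum(w) <= 0:
        raise ValueError("w must be nonzero with nonnegative components")
    total = sum(w)
    return tuple(c / total for c in w)


def project_set(weights):
    """Image of a finite weight set under the radial projection."""
    return {radial_projection(w) for w in weights}


# Two hypothetical weight sets on the same ray through the origin.
omega_a = {(1.0, 1.0)}
omega_b = {(2.0, 2.0)}

# Union: the projection of the union is the union of the projections.
union_ok = project_set(omega_a | omega_b) == project_set(omega_a) | project_set(omega_b)

# Intersection: the sets are disjoint, yet their projections coincide.
intersection_gap = project_set(omega_a & omega_b) != project_set(omega_a) & project_set(omega_b)
print(union_ok, intersection_gap)
```

Homogeneity of degree zero is what drives the gap: distinct points on a common ray share a single image in $Δ$.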

Example 10.

Assume the (nonnormalized) weights $w$ considered reasonable by the decision maker are such that each of its components $w_{i}$ , for $i \in N$ , is known to lie in some interval, resulting in a rectangular ambiguity set,

Ω = [\underline{w}, \bar{w}] = \prod_{i \in N} [{\underline{w}}_{i}, {\bar{w}}_{i}],

with bounds $\underline{w} = ({\underline{w}}_{1}, \dots, {\underline{w}}_{n}) \neq 0$ and $\bar{w} = ({\bar{w}}_{1}, \dots, {\bar{w}}_{n}) \geq \underline{w}$. Using Lemma 8, leveraging the preservation of straight-line geometries, it can be verified that the radial projection of $Ω$ onto $Δ$ has the form

π (Ω) = co ({\frac{\underline{w} + δ_{1} e_{1}}{‖ \underline{w} ‖_{1} + δ_{1}}, \dots, \frac{\underline{w} + δ_{n} e_{n}}{‖ \underline{w} ‖_{1} + δ_{n}}}),

(25)

where $δ = (δ_{1}, \dots, δ_{n}) = \bar{w} - \underline{w}$ and $‖ \underline{w} ‖_{1} = \sum_{l = 1}^{n} {\underline{w}}_{l}$. As an example, Figure 6 illustrates that the radial projection of a box in $n = 3$ dimensions is a triangle in $Δ$. Thus, the shape of $π (Ω)$ depends on (at most) $n$ of the original $2^{n}$ vertices of $Ω$. Moreover, if the modulus of the lower bound of the (nontrivial) box-shaped ambiguity set $Ω$ becomes very small, then the situation tends to become equivalent to full ambiguity. To see this, consider a $\bar{w}$ with only positive components and let $\underline{w} = ε e_{1}$, for a sufficiently small $ε > 0$. Then, by Equation (25) it is $π (Ω) = co ({λ^{(1)}, \dots, λ^{(n)}})$, where $λ^{(1)} = e_{1}$ and $λ^{(i)} = (ε e_{1} + {\bar{w}}_{i} e_{i}) / (ε + {\bar{w}}_{i})$, for $i \in {2, \dots, n}$. Hence, $\lim_{ε \to 0^{+}} λ^{(i)} = e_{i}$, for all $i \in N$, implying full ambiguity in the limit, as $Δ = co ({e_{1}, \dots, e_{n}}) = π (R_{+}^{n} \ {0})$.

Figure 6. Radial Projection of Box-Shaped Ambiguity Set $Ω = [\underline{w}, \bar{w}]$
*Note.* With $\underline{w} = (1, 1, 1)$ and $\bar{w} = (3, 2, 2)$ , a radial projection onto $Δ$ yields $π (Ω) = co ({(3 / 5, 1 / 5, 1 / 5), (1 / 4, 1 / 2, 1 / 4), (1 / 4, 1 / 4, 1 / 2)})$ , as described in Example 10.
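
As a numerical check of Equation (25), the following Python sketch computes the vertices of $π (Ω)$ for a box-shaped ambiguity set. The helper name `box_projection_vertices` and the use of exact fractions are illustrative assumptions, not part of the original exposition.

```python
from fractions import Fraction


def box_projection_vertices(w_lo, w_hi):
    """Vertices of pi(Omega) per Equation (25) for the box [w_lo, w_hi].

    Hypothetical helper: assumes 0 <= w_lo <= w_hi componentwise, w_lo != 0.
    """
    norm_lo = sum(Fraction(w) for w in w_lo)  # 1-norm of the lower bound
    vertices = []
    for i in range(len(w_lo)):
        delta_i = Fraction(w_hi[i]) - Fraction(w_lo[i])
        # Raise only component i to its upper bound, then normalize to the simplex.
        point = [Fraction(w) for w in w_lo]
        point[i] += delta_i
        vertices.append(tuple(p / (norm_lo + delta_i) for p in point))
    return vertices


# Data from the note to Figure 6.
figure6 = box_projection_vertices([1, 1, 1], [3, 2, 2])

# Small lower bound eps * e_1: the vertices approach the unit vectors e_i.
eps = Fraction(1, 1000)
near_full = box_projection_vertices([eps, 0, 0], [3, 2, 2])
```

With the Figure 6 data, the three vertices coincide with those stated in the note, and for the small lower bound the first vertex equals $e_{1}$ exactly while the others lie within roughly $ε / 2$ of $e_{2}$ and $e_{3}$ .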

Under limited ambiguity, a coordinate-wise decomposition is no longer available. However, because $φ (x | \cdot)$ is quasiconcave, for any given $x \in X$ , as established in the proof of Proposition 2, the upper contour sets of the performance ratio in the space of weights are necessarily convex. This in turn allows restricting attention, for the representation of the performance index, to the extreme points of the convex hull of the ambiguity set, which yields a representation much in the same spirit as before.

Proposition 5.

The $Ω$ -conditioned performance index in Equation (24) is such that

ρ (x | Ω) = \min_{λ \in \partial π (\partial Ω)} φ (x | λ) = \min_{λ \in Λ} φ (x | λ), x \in X,

for some “extremal base”

Λ

which is a (compact) subset of

\partial π (\partial Ω)

, such that

co (Λ) = co (π (Ω))

Although choosing $Λ = \partial π (Ω) = \partial π (\partial Ω)$ (cf. Lemma 8 (i) (b)) is always possible, it is usually advantageous to opt for the smallest possible extremal base $Λ$ , so it consists only of the “extreme points” of $co (π (Ω))$ , where the latter cannot be represented as convex combinations of other points in $Λ$ (see, e.g., Rockafellar 1970, section 18, p. 162).³⁰ If $Ω$ is finite or a (finite) polytope, then the smallest extremal base $Λ$ of extreme points of $co (π (Ω))$ is also finite.³¹ Because any bounded convex set can be approximated by a finite polytope (Bronstein 2008), a finite extremal base can be used to represent the convex hull of the given ambiguity set (or its radial projection onto $Δ$ ) up to any desired precision. That $π (Ω)$ may be nonconvex is not important because the level sets of the performance ratio are convex, so that the minima of $φ (x | \cdot)$ on $π (Ω)$ are attained on the boundary of its convexification, $co (π (Ω))$ .
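
For a finite action set and a finite extremal base, the representation in Proposition 5 can be evaluated by direct enumeration. The sketch below is a minimal illustration under these assumptions; the function name `conditioned_index` and the score data are hypothetical.

```python
from fractions import Fraction


def conditioned_index(scores, base):
    """rho(j | Omega) = min over base of F(j | lam) / F*(lam), for each action j.

    scores[i][j] is criterion i at action j (finite action set, an assumption);
    base is a finite list of normalized weight vectors (the extremal base).
    """
    n_actions = len(scores[0])

    def weighted(j, lam):
        return sum(l * row[j] for l, row in zip(lam, scores))

    rho = []
    for j in range(n_actions):
        ratios = []
        for lam in base:
            best = max(weighted(k, lam) for k in range(n_actions))  # F*(lam)
            ratios.append(weighted(j, lam) / best)
        rho.append(min(ratios))
    return rho


# Hypothetical scores: two criteria (rows), three actions (columns).
scores = [[4, 2, 1], [1, 2, 4]]

# Full ambiguity: the extremal base consists of the unit vectors.
base_full = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]
# Box [(1, 1), (3, 2)] projected via Equation (25): (3, 1)/4 and (1, 2)/3.
base_box = [(Fraction(3, 4), Fraction(1, 4)), (Fraction(1, 3), Fraction(2, 3))]

rho_full = conditioned_index(scores, base_full)
rho_box = conditioned_index(scores, base_box)
```

In this toy instance the index of every action weakly increases when ambiguity is limited to the box, and the maximizer changes from the second action (under full ambiguity) to the first.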

Example 11.

Consider the robust allocation of resources to $n = 2$ agents as in Example 9, given the box-shaped ambiguity set $Ω$ as in Example 10, with its radial projection specified by Equation (25),

π (Ω) = co (Λ) = co ({\frac{({\bar{w}}_{1}, {\underline{w}}_{2})}{{\bar{w}}_{1} + {\underline{w}}_{2}}, \frac{({\underline{w}}_{1}, {\bar{w}}_{2})}{{\underline{w}}_{1} + {\bar{w}}_{2}}}) ⊊ Δ,

where $Λ = {λ^{(1)}, λ^{(2)}}$ . Thus, by virtue of Proposition 5 the $Ω$ -conditioned performance index becomes

ρ (x | Ω) = \min {\frac{{\bar{w}}_{1} f_{1} (x) + {\underline{w}}_{2} f_{2} (x)}{\max_{\hat{x} \in X} {{\bar{w}}_{1} f_{1} (\hat{x}) + {\underline{w}}_{2} f_{2} (\hat{x})}}, \frac{{\underline{w}}_{1} f_{1} (x) + {\bar{w}}_{2} f_{2} (x)}{\max_{\hat{x} \in X} {{\underline{w}}_{1} f_{1} (\hat{x}) + {\bar{w}}_{2} f_{2} (\hat{x})}}}, x \in X .

For a pseudo-robust decision $x$ (in $Ψ$ ), balancedness must hold, so that, by substituting $f_{1}, f_{2}$ from Example 9,

\frac{F (x | λ^{(1)})}{F (x | λ^{(2)})} = \frac{{\bar{w}}_{1} x_{1}^{α} x_{2}^{1 - α} + {\underline{w}}_{2} {(1 - x_{1})}^{β} {(1 - x_{2})}^{1 - β}}{{\underline{w}}_{1} x_{1}^{α} x_{2}^{1 - α} + {\bar{w}}_{2} {(1 - x_{1})}^{β} {(1 - x_{2})}^{1 - β}} = \frac{F^{*} (λ^{(1)})}{F^{*} (λ^{(2)})} .

(26)

Meanwhile, a feasible allocation decision $x = (x_{1}, x_{2})$ is Pareto-optimal (in $P$ ) if and only if it satisfies Equation (22), as before. When the box-shaped ambiguity set becomes smaller, the elements of the extremal base $Λ$ become more similar and coincide in the limit (i.e., $λ^{(1)} - λ^{(2)} \to 0$ as $‖ \bar{w} - \underline{w} ‖_{1} \to 0^{+}$ ). Figure 7 shows the robust allocation $\hat{x} = \hat{x} (Ω)$ on the contract curve between the Pareto-optimal allocations $x^{*} (λ^{(i)})$ that maximize the weighted objective $F (x | λ^{(i)})$ for $i \in N$ . Under full ambiguity (i.e., for $Ω = Δ$ ), it is $λ^{(i)} \equiv e_{i}$ ; see Example 9.

Figure 7. Robust Allocation of Resources, ${\hat{x} (Ω)} = \arg \max_{x \in P} ρ (x | Ω)$ , to Two Agents in Example 11 (for $α > β$ )
*Note.* Under limited ambiguity $Ω = [\underline{w}, \bar{w}]$ as in Example 10, the robust allocation $\hat{x} (Ω)$ satisfies Pareto-efficiency in Equation (22) and balancedness in Equation (26); it differs from the allocation $\hat{x} (Δ)$ under full ambiguity, which maximizes $ρ (x) = ρ (x | Δ)$ on $P$ .

Example 12.

In some applications, the weights are naturally ranked by importance (see, e.g., Wang and Fu 2020). The ordered set of weights, $Ω^{'} = {(w_{1}, \dots, w_{n}) \in R_{+}^{n} : w_{1} \geq w_{2} \geq \dots \geq w_{n}} = C (Ω^{'})$ , has a radial projection onto $Δ$ of the form $π (Ω^{'}) = co ({λ^{' (i)} : i \in N})$ , where $λ^{' (i)} = \frac{1}{i} \sum_{j = 1}^{i} e_{j}$ , for all $i \in N$ . Combining this importance-ranking with a box-shaped ambiguity set $Ω^{″} = [\underline{w}, \underline{w} + δ]$ as in Example 10, with $π (Ω^{″}) = co ({λ^{″ (i)} : i \in N})$ , where $λ^{″ (i)} = (\underline{w} + δ_{i} e_{i}) / (‖ \underline{w} ‖_{1} + δ_{i})$ , for all $i \in N$ , leads to

Λ = {𝟙_{{\underline{w} + δ_{i} e_{i} \notin Ω^{'}}} λ^{' (i)} + 𝟙_{{\underline{w} + δ_{i} e_{i} \in Ω^{'}}} λ^{″ (i)} : i \in N} .

This extremal base (of cardinality n) suits practical applications where a decision maker has plausible ranges for the weights to be placed on the criteria, together with a ranking of their importance (cf. Section 4.3.2).

Remark 12.

(i) Limiting ambiguity can only increase robustness performance. That is, if $Ω, \hat{Ω} \subset R_{+}^{n} \ {0}$ are nonempty and compact and satisfy $π (\hat{Ω}) \subseteq π (Ω)$ , then $ρ (\cdot | Ω) \leq ρ (\cdot | \hat{Ω})$ , which follows directly from the definition of the $Ω$ -conditioned performance index in Equation (24). (ii) In the absence of ambiguity, the optimal robustness performance has to be maximal. That is, if $\hat{Ω} = {w}$ , then $ρ^{*} (\hat{Ω}) = \max_{x \in X} ρ (x | {w}) = F^{*} (w) / F^{*} (w) = 1$ .

3.8. Criterion Ambiguity

Given a (nonempty, compact) action set $X \subset R^{m}$ as in Section 2, we now allow for ambiguity in each of the $n^{'}$ criteria, given by the continuous functions $g_{i} : X \times Θ \to R_{+}$ , for $i \in N^{'} = {1, \dots, n^{'}}$ , which map $(x, θ)$ to real numbers, where $x \in X$ is a feasible action and $θ$ is an ex ante unknown state in the (nonempty, compact) state space $Θ \subset R^{l}$ , and where $l, m, n^{'} \geq 1$ are given integers. The unknown state $θ$ represents the decision maker’s uncertainty about the exact value that each of his $n^{'}$ objectives may attain under a chosen action $x$ .

Remark 13.

(i) Continuity of the criteria in the state implies that a small perturbation in the state can have only a small impact on the decision maker’s objective. This continuity is automatically satisfied when the state space $Θ$ is finite.³² (ii) The unknown state may introduce dependencies between the different $g_{i}$ . Indeed, even if $θ = (θ_{1}, \dots, θ_{n^{'}})$ (for $l = n^{'}$ ) and the criteria are such that $g_{i} (x, θ) \equiv g_{i} (x, θ_{i})$ , for all $i \in N^{'}$ , the different criteria’s ambiguity may still be linked via common constraints in $Θ$ .³³ But if in addition the state space is a Cartesian product, so $Θ = Θ_{1} \times \dots \times Θ_{n^{'}}$ , and $θ_{i} \in Θ_{i}$ , for all $i \in N^{'}$ , then the decoupling of ambiguity across the different criteria is complete, in the sense that observing the part of the state that determines one criterion does not reduce ambiguity for any other criterion.

To assess goal achievement of an action in a given state, the decision maker considers a scalarization of his multicriteria optimization problem by means of a weighted objective,

G (x, θ | λ^{'}) = \sum_{i = 1}^{n^{'}} λ_{i}^{'} g_{i} (x, θ), (x, θ, λ^{'}) \in X \times Θ \times Δ^{'},

where

Δ^{'} = {w^{'} = (w_{1}, \dots, w_{n^{'}}) \in R_{+}^{n^{'}} : w_{1} + \dots + w_{n^{'}} = 1}

denotes the set of all (normalized) weights. As in the nontriviality condition (N) (cf. Section 2), we assume that there exists a default decision ( $x_{d}$ ) such that the decision maker’s weighted objective is positive across all states, that is,

\exists x_{d} \in X : G (x_{d}, θ | λ^{'}) > 0, (θ, λ^{'}) \in Θ \times Δ^{'} .

(N′)

The modified nontriviality condition (N′) ensures that the ex post optimal objective is always positive, so

G^{*} (θ | λ^{'}) = \max_{x \in X} G (x, θ | λ^{'}) \geq G (x_{d}, θ | λ^{'}) > 0,

for all $(θ, λ^{'}) \in Θ \times Δ^{'}$ .

Remark 14.

The modified nontriviality condition (N′) can always be satisfied, without altering the set of ex post optimal decisions, by using a (positive, translated) criterion ${\hat{g}}_{i} = g_{i} + ε$ , for some $ε > 0$ ; see also Remark 4.

Given a (nonempty, compact) ambiguity set $Ω^{'} \subset R_{+}^{n^{'}}$ , the decision maker’s robustness objective, as in Section 3.7, is to maximize the $(Ω^{'}, Θ)$ -conditioned performance index,

ρ (x | Ω^{'}, Θ) = \min_{(λ^{'}, θ) \in π (Ω^{'}) \times Θ} \frac{G (x, θ | λ^{'})}{G^{*} (θ | λ^{'})}, x \in X .

The following result recasts the robustness objective into a by-now-familiar representation.

Proposition 6.

Let $Λ^{'} \subset Δ^{'}$ be an extremal base of $co (π (Ω^{'}))$ . Then

ρ (x | Ω^{'}, Θ) = \min_{λ^{'} \in Λ^{'}} {\min_{θ \in Θ} \frac{G (x, θ | λ^{'})}{G^{*} (θ | λ^{'})}}, x \in X .

(27)

Based on Proposition 6, if the (smallest) extremal base is finite, we can reduce the general multicriteria decision problem to our basic framework in Section 2, with full weight ambiguity and no criterion ambiguity.

Corollary 1.

Assume that there exists a (finite, smallest) integer $n \geq 1$ so that $Λ^{'} = {λ^{' (1)}, \dots, λ^{' (n)}}$ and $co (Λ^{'}) = co (π (Ω^{'}))$ , and let $f_{i} (x) = \min_{θ \in Θ} {G (x, θ | λ^{' (i)}) / G^{*} (θ | λ^{' (i)})}$ , for all $(x, i) \in X \times N$ . Then

ρ (x) = \min_{i \in N} {\frac{f_{i} (x)}{f_{i}^{*}}}, x \in X,

(28)

where $f_{i}^{*} = \max_{x \in X} f_{i} (x) = 1$ , for all $i \in N = {1, \dots, n}$ , represents $ρ (\cdot | Ω^{'}, Θ)$ in Equation (27).
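
When the action set, the state space, and the extremal base are all finite, the reduction in Proposition 6 and Corollary 1 amounts to a few nested loops. The following sketch assumes such a finite setting; the function name `reduced_criteria` and the data are hypothetical and serve only to illustrate the construction of the reduced criteria $f_{i}$ .

```python
from fractions import Fraction


def reduced_criteria(g, base):
    """f_i(x) = min over states of G(x, t | lam_i) / G*(t | lam_i).

    g[k][x][t] is criterion k at action x in state t (finite X and Theta,
    an assumption of this sketch); base is a finite extremal base.
    """
    n_actions, n_states = len(g[0]), len(g[0][0])

    def weighted(x, t, lam):
        return sum(l * g[k][x][t] for k, l in enumerate(lam))

    f = []
    for lam in base:
        # Ex post optimum G*(t | lam) for each state t.
        best = [max(weighted(x, t, lam) for x in range(n_actions))
                for t in range(n_states)]
        f.append([min(weighted(x, t, lam) / best[t] for t in range(n_states))
                  for x in range(n_actions)])
    return f


# Hypothetical data: 2 criteria, 2 actions, 2 states.
g = [
    [[4, 2], [2, 2]],  # criterion 1: rows are actions, columns are states
    [[1, 1], [2, 4]],  # criterion 2
]
base = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]  # full ambiguity

f = reduced_criteria(g, base)
# Equation (27): worst case over the extremal base of the reduced criteria.
rho = [min(f_i[x] for f_i in f) for x in range(2)]
```

Here the second action attains the larger index, because its worst performance ratio across criteria and states is one half, compared with one quarter for the first action.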

Example 13.

(i) In the case of full ambiguity, it is $Λ^{'} = Δ^{'}$ , so that $n = n^{'}$ and $Δ^{'} = co ({e_{1}, \dots, e_{n}}) = Δ$ , with $f_{i} (x) = \min_{θ \in Θ} {g_{i} (x, θ) / G^{*} (θ | e_{i})}$ , for all $(x, i) \in X \times N$ . (ii) Consider now a multicriteria decision problem under full ambiguity, with criteria of the form $g_{i} (x, θ) = {({\hat{g}}_{i} (x))}^{θ_{i}}$ , for all $i \in N$ , where the state $θ = (θ_{1}, \dots, θ_{n})$ is only known to lie in the (nonempty, compact) set $Θ \subset R_{+}^{n}$ (with $Θ \neq {0}$ to avoid trivialities), and where the functions ${\hat{g}}_{i} : X \to R_{+ +}$ are continuous. Let ${\hat{g}}_{i}^{*} = \max {\hat{g}}_{i} (X) > 0$ and ${\bar{θ}}_{i} = \max_{θ \in Θ} θ_{i}$ ; furthermore, set

f_{i} (x) = {(\frac{{\hat{g}}_{i} (x)}{{\hat{g}}_{i}^{*}})}^{{\bar{θ}}_{i}} = \min_{θ \in Θ} (\frac{{\hat{g}}_{i}^{θ_{i}} (x)}{\max_{x \in X} {\hat{g}}_{i}^{θ_{i}} (x)}), x \in X,

for all $i \in N$ , taking into account that $ξ^{θ}$ decreases in $θ$ , for all $ξ \in [0, 1]$ . The extreme state $\bar{θ} = ({\bar{θ}}_{1}, \dots, {\bar{θ}}_{n})$ determines the robust choice, as it leads ceteris paribus to the smallest criterion values, thus following a precautionary principle. Then, by Corollary 1 we follow our basic framework, maximizing the performance index in Equation (28) with respect to all Pareto-optimal decisions in Equation (10). For instance, if $Θ = Δ$ , the reduction to the weighted objective in Equation (1) obtains with $f_{i} = {\hat{g}}_{i} / {\hat{g}}_{i}^{*}$ , for all $i \in N$ .
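
The closed form in Example 13 (ii) can be checked numerically for a single criterion. The sketch below assumes a finite action set and a finite sample of exponents ${θ}_{i}$ ; the function names and data are hypothetical.

```python
import math


def worst_case_ratios(g_hat, thetas):
    """min over theta of g_hat(x)**theta / max_x g_hat(x)**theta, for each x."""
    out = []
    for v in g_hat:
        out.append(min(v ** t / max(u ** t for u in g_hat) for t in thetas))
    return out


def closed_form(g_hat, thetas):
    """(g_hat(x) / g_hat_max) ** theta_max, the closed form in Example 13 (ii)."""
    g_star, theta_bar = max(g_hat), max(thetas)
    return [(v / g_star) ** theta_bar for v in g_hat]


# Hypothetical positive criterion values on three actions, and sampled exponents.
g_hat = [2.0, 5.0, 8.0]
thetas = [0.5, 1.0, 1.5, 2.0]

numeric = worst_case_ratios(g_hat, thetas)
exact = closed_form(g_hat, thetas)
```

Both computations agree because each ratio lies in $(0, 1]$ , so raising it to a larger exponent can only decrease it; the minimum over the sampled exponents is therefore attained at the largest one.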

Remark 15.

(i) The minimization over the state space in Example 13 reflects a robust, precautionary stance: the decision maker evaluates actions under the most adverse plausible realization of the state with respect to the achievable performance ratio, consistent with a notion of relative worst-case robustness. (ii) The treatment of criterion ambiguity in this section also bears a conceptual resemblance to models of DRO, where decisions are evaluated against worst-case distributions within a specified ambiguity set. In our framework, the uncertainty is not over probability distributions but over states $θ \in Θ$ , with performance assessed under the worst-case realization of both the state and the weights. This can be viewed as a nonprobabilistic analogue of DRO, where the ambiguity set $Ω^{'}$ over the weights plays a role similar to ambiguity sets over distributions in DRO models. In particular, the two-layer minimization in Equation (27)—over weights and states—mirrors the inner DRO minimization over distributions, highlighting a parallel structure between worst-case evaluation across distributional and multicriteria settings.

3.9. General Multicriteria Objectives

In certain practical applications, it may seem appealing to consider alternatives to the arithmetic mean in Equation (1), such as a harmonic or geometric mean, with suitable weights. One could even think of using a (weighted) power mean which would accommodate each of the earlier options as a special case; see Example 14 below, which illustrates aggregation ambiguity. Rather than commit to a particular functional form for aggregating multiple criteria, however, we propose a general class of multicriteria objectives that adhere to a set of reasonable axioms. These axioms reduce (in the case of uniform weights) to those known to characterize the quasiarithmetic mean. A key finding, notably, is that maximizing this general weighted objective is entirely equivalent to maximizing the arithmetic mean objective in Equation (1), provided the criteria are suitably transformed.

Let $H : Y \times Δ \to R$ denote a (continuous) multicriteria objective, where $Y \subseteq R^{n}$ is a (nonempty) domain such that $\hat{f} (X) \subset Y$ , with the vector of criteria $\hat{f} = ({\hat{f}}_{1}, \dots, {\hat{f}}_{n})$ , and where $Δ \subset R_{+}^{n}$ is the unit simplex; see Section 2.1. For any decision $x \in X$ and weight $λ \in Δ$ , the multicriteria objective produces an overall score $\hat{H} (x | λ) = H (\hat{f} (x) | λ)$ , which the decision maker would like to maximize by choosing an appropriate decision—in the presence of (possibly limited) ambiguity about $λ$ (and possibly also about $\hat{f}$ ) as discussed in Section 3.7 (and Section 3.8). To guide the form of a sufficiently flexible and interpretable objective, we posit five axioms (Axioms 1–5), which nest those proposed by Kolmogorov (1930) in his seminal work “On the notion of mean.”

Axiom 1

(Monotonicity). For all $y = (y_{1}, \dots, y_{n}) \in Y$ , $\hat{y} = ({\hat{y}}_{1}, \dots, {\hat{y}}_{n}) \in Y$ , $λ = (λ_{1}, \dots, λ_{n}) \in Δ$ , and $i \in N$ :

\hat{y} - y = ({\hat{y}}_{i} - y_{i}) e_{i}, {\hat{y}}_{i} > y_{i} \Rightarrow \begin{cases} H (\hat{y} | λ) > H (y | λ), & \text{if } λ_{i} > 0, \\ H (\hat{y} | λ) = H (y | λ), & \text{if } λ_{i} = 0 . \end{cases}

Axiom 2

(Symmetry). For all $y = (y_{1}, \dots, y_{n}) \in Y$ , $λ = (λ_{1}, \dots, λ_{n}) \in Δ$ , and $i, j \in N$ :

(\hat{y}, \hat{λ}) = (y, λ) + (y_{i} - y_{j}, λ_{i} - λ_{j}) (e_{j} - e_{i}) \Rightarrow H (\hat{y} | \hat{λ}) = H (y | λ) .

Axiom 3

(Reflexivity). $H (α 1_{n} | λ) = α$ , for all $α > 0$ (with $α 1_{n} \in Y$ ) and $λ \in Δ$ .³⁴

Axiom 4

(Associativity). For all $y = (y_{1}, \dots, y_{n}) \in Y$ , $λ = (λ_{1}, \dots, λ_{n}) \in Δ$ , and $m \in N \ {n}$ :³⁵

(λ_{1}, \dots, λ_{m}) \neq 0, H (\sum_{i = 1}^{m} y_{i} e_{i} | \frac{\sum_{i = 1}^{m} λ_{i} e_{i}}{\sum_{i = 1}^{m} λ_{i}}) = α \Rightarrow H (α 1_{m}, y_{m + 1}, \dots, y_{n} | λ) = H (y | λ) .

Axiom 5

(Coordinate Filter). $H (y | e_{i}) = y_{i}$ , for all $y = (y_{1}, \dots, y_{n}) \in Y$ and $i \in N$ .

The significance of these five basic requirements is as follows: Monotonicity (Axiom 1) means that as long as the weight component $λ_{i}$ is positive, increasing the value $y_{i}$ of criterion $i \in N$ must also increase the weighted objective, whereas for $λ_{i} = 0$ the weighted objective becomes insensitive to criterion i; symmetry (Axiom 2) requires that the weighted objective is invariant with respect to any joint permutation of indices belonging to criteria and their associated weights; reflexivity (Axiom 3) imposes score-consistency in the sense that if all criterion scores are identical, then that should also be the value of the weighted objective, no matter what (normalized) weights are applied; in a similar vein, associativity (Axiom 4) postulates that replacing a group of inputs with their internal weighted average leaves the overall aggregation unchanged; finally, in order to guarantee that the weights provide a homotopic relation between all criteria in isolation, the weighted objective is a coordinate filter (Axiom 5) if it yields the i-th component of $y$ when putting all weight on the i-th coordinate (i.e., for $λ = e_{i}$ ).

Based on Kolmogorov’s result,³⁶ we define the general h-mean,

H (y | λ) = h^{- 1} (\sum_{i = 1}^{n} λ_{i} h (y_{i})),

(29)

where the kernel $h : D \to R$ is a continuous strictly monotonic function, defined on a suitable domain $D \subset R$ .

Lemma 9.

The general h-mean $H : Y \times Δ \to R$ in Equation (29) satisfies Axioms 1–5.

The following example illustrates the flexibility afforded by the general h-mean.

Example 14.

Consider two well-known averages.

(i) Let $Y = R_{+ +}^{n}$ . For any weight $λ \in Δ$ and power parameter $p \in R$ , define the power mean:
$M_{p} (y | λ) = \begin{cases} {(\sum_{i = 1}^{n} λ_{i} y_{i}^{p})}^{1 / p}, & p \neq 0, \\ \prod_{i = 1}^{n} y_{i}^{λ_{i}}, & p = 0, \end{cases}$
which corresponds to Equation (29) with $h (y) = y^{p}$ for $p \neq 0$ , and $h (y) = \log y$ for $p = 0$ . This recovers the harmonic ( $p = - 1$ ), geometric ( $p = 0$ ), and arithmetic ( $p = 1$ ) means, among others. In the limit, the power mean approaches the minimum (for $p \to - \infty$ ) or maximum (for $p \to + \infty$ ) of all $y_{i}$ with positive weights. Importantly, $M_{p}$ is increasing in p (see, e.g., Hardy et al. 1934, theorem 16, p. 26) and uniquely satisfies homogeneity: $M_{p} (α y | λ) = α M_{p} (y | λ)$ for all $α > 0$ (see, e.g., Hardy et al. 1934, theorem 84, p. 68).³⁷
(ii) For $h (\cdot) = \exp (\cdot)$ , the general h-mean in Equation (29) becomes a weighted mean in the log-semiring, related to the LogSumExp (LSE) function, because $H (y | λ) = LSE (y_{1} + \log λ_{1}, \dots, y_{n} + \log λ_{n})$ , for all $(y, λ) \in R^{n} \times int (Δ)$ , where $LSE (y) = \log (\sum_{i = 1}^{n} \exp (y_{i}))$ , for all $y \in R^{n}$ . Such a formulation bears fruit in probabilistic modeling and neural networks, where LSE and softmax appear naturally because of their smoothness properties.³⁸
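
The two averages can be reproduced with a generic implementation of the h-mean in Equation (29). The sketch below is illustrative only; the function names `h_mean`, `power_mean`, and `lse`, as well as the sample data, are assumptions introduced here.

```python
import math


def h_mean(y, lam, h, h_inv):
    """General h-mean of Equation (29): h_inv(sum_i lam_i * h(y_i))."""
    return h_inv(sum(l * h(v) for l, v in zip(lam, y)))


def power_mean(y, lam, p):
    """Weighted power mean M_p; the geometric mean is the case p = 0."""
    if p == 0:
        return h_mean(y, lam, math.log, math.exp)
    return h_mean(y, lam, lambda v: v ** p, lambda s: s ** (1.0 / p))


def lse(z):
    """Numerically stable LogSumExp."""
    m = max(z)
    return m + math.log(sum(math.exp(v - m) for v in z))


# Hypothetical positive scores and interior weights.
y = [1.0, 4.0, 16.0]
lam = [0.25, 0.5, 0.25]

harmonic, geometric, arithmetic = (power_mean(y, lam, p) for p in (-1, 0, 1))
exp_mean = h_mean(y, lam, math.exp, math.log)
lse_form = lse([v + math.log(l) for v, l in zip(y, lam)])
```

For these data the harmonic, geometric, and arithmetic means are 2.56, 4, and 6.25, in line with $M_{p}$ being increasing in p, and the exponential-kernel mean agrees with its LSE form up to floating-point error.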

3.9.1. Reduction to Basic Framework.

Crucially, the general h-mean in Equation (29) reduces to our standard weighted objective in Equation (1) under a transformation of the criteria, by setting $f_{i} = h \circ {\hat{f}}_{i}$ , so that

\hat{H} (x | λ) = h^{- 1} (F (x | λ)), (x, λ) \in X \times Δ .

If the kernel h is increasing (resp., decreasing), then maximizing $\hat{H} (\cdot | λ)$ is equivalent to maximizing (resp., minimizing) $F (\cdot | λ)$ . Thus, our earlier results carry over directly—with minor adjustments in the decreasing case, as detailed in Section 3.9.2.

Example 15.

Consider Example 13 (ii) for the “diagonal” state space $Θ = {θ \in R_{+}^{n} : θ_{1} = \dots = θ_{n} \in [\underline{p}, \bar{p}]}$ , where the constants $\underline{p}, \bar{p}$ are such that $0 < \underline{p} < \bar{p} < \infty$ . By Example 14 (i), this represents a situation in which the decision maker is uncertain about the appropriate aggregation method, except that a (homogeneous) power mean $M_{p}$ should be used, for some $p \in [\underline{p}, \bar{p}]$ . The result in Example 13 (ii) suggests as robust choice the largest available option: the power mean $M_{\bar{p}}$ .

Remark 16.

(i) As Example 15 suggests, the insights about criterion ambiguity in Section 3.8 may sometimes be combined with the general representation of multicriteria objectives in Section 3.9 to handle aggregation ambiguity, that is, uncertainty over which aggregation rule or kernel should be used. (ii) For the general h-mean in Equation (29), we define $H = h^{- 1} \circ F_{h}$ with $F_{h} = \sum_{i = 1}^{n} λ_{i} (h \circ f_{i})$ . However, in general, $h^{- 1} \circ (F_{h} / F_{h}^{*}) \neq (h^{- 1} \circ F_{h}) / (h^{- 1} \circ F_{h}^{*})$ with $F_{h}^{*} (\cdot) = \max F_{h} (X | \cdot)$ , so our optimization focuses on $F_{h}$ directly. (iii) For the practically very important power mean in Example 14 (i) and Example 15 (as long as $p \neq 0$ ), the robust optimization of $F_{h}$ is equivalent to the robust optimization of $M_{p}$ , because $h (\cdot) = {(\cdot)}^{p}$ and $h^{- 1} (\cdot) = {(\cdot)}^{1 / p}$ both feature multiplicative separability.

3.9.2. Special Case: Minimization Under Decreasing Kernel.

For a decreasing kernel, $f_{i} = h \circ {\hat{f}}_{i}$ transforms larger ${\hat{f}}_{i}$ into smaller $f_{i}$ , so that the objective reflects a smaller-is-better interpretation (akin to optimizing a loss function). Assuming $R_{+ +}^{n} \subset Y$ , by Axiom 3 it is $\lim_{α \to 0^{+}} H (α 1_{n} | λ) = H (0^{+} | λ) = 0$ . Because in most practical applications (e.g., for logarithmic or inverse transformations in multiobjective loss minimization) it is $h (0^{+}) = \infty$ , we obtain $h^{- 1} (\infty) = 0^{+}$ , so that on the compact action set $X$ it is $f_{i}^{•} = \min f_{i} (X) = h ({\hat{f}}_{i}^{*}) > 0$ , for all $i \in N$ , as ${\hat{f}}_{i}^{*} = \max {\hat{f}}_{i} (X)$ is necessarily finite. Hence, we can consider the adjusted performance ratio,

φ (x | λ) = \frac{F^{•} (λ)}{F (x | λ)} \in (0, 1], (x, λ) \in X \times Δ,

(30)

where $F^{•} (λ) = \min F (X | λ)$ , for all $λ \in Δ$ . A representation of the performance index $ρ (x)$ in Equation (4) then obtains, analogous to Equation (6), for all $x \in X$ , as the minimum of the (adjusted) criterion-specific performance ratios $ϕ_{i} (x) = f_{i}^{•} / f_{i} (x)$ with respect to $i \in N$ . This defines the (nonempty, compact) set of pseudo-robust actions,

Ψ = \arg \max_{x \in X} {\min_{i \in N} \frac{f_{i}^{•}}{f_{i} (x)}} .

The set of Pareto-optimal actions $P$ needs to be based on the original vector of criteria $\hat{f}$ , so that

P = {x \in X : (\hat{f} (x) \leq \hat{f} (x^{'}) \Rightarrow \hat{f} (x) = \hat{f} (x^{'})), x^{'} \in X},

analogous to Equation (10). The robust decision set is then $R = Ψ \cap P$ as before, in Equation (11).
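
For a finite action set, the decreasing-kernel case can be sketched as follows, using the inverse kernel $h (y) = 1 / y$ as an example. The function names and the data are hypothetical; the sketch only illustrates that the adjusted ratios use the transformed criteria, whereas Pareto-optimality is checked on the original ones.

```python
from fractions import Fraction


def adjusted_index(f_hat, h):
    """rho(x) = min_i f_i_min / f_i(x) with f_i = h(f_hat_i), h decreasing."""
    f = [[h(v) for v in row] for row in f_hat]
    f_min = [min(row) for row in f]  # attained where f_hat_i is largest
    return [min(f_min[i] / f[i][x] for i in range(len(f)))
            for x in range(len(f_hat[0]))]


def pareto_set(f_hat):
    """Actions not dominated with respect to the ORIGINAL criteria f_hat."""
    n_actions = len(f_hat[0])
    cols = [tuple(row[x] for row in f_hat) for x in range(n_actions)]

    def dominated(x):
        return any(
            all(a >= b for a, b in zip(cols[k], cols[x])) and cols[k] != cols[x]
            for k in range(n_actions)
        )

    return {x for x in range(n_actions) if not dominated(x)}


# Hypothetical original criteria: two criteria (rows), four actions (columns).
f_hat = [[4, 2, 1, 2], [1, 2, 4, 1]]
rho = adjusted_index(f_hat, lambda v: 1 / Fraction(v))  # inverse kernel h(y) = 1/y

pseudo_robust = {x for x, r in enumerate(rho) if r == max(rho)}  # Psi
robust = pseudo_robust & pareto_set(f_hat)  # R = Psi intersected with P
```

With the inverse kernel, the adjusted ratio $f_{i}^{•} / f_{i} (x)$ reduces to ${\hat{f}}_{i} (x) / {\hat{f}}_{i}^{*}$ , so the index coincides with the one of the basic framework applied to $\hat{f}$ .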

3.10. Applying the Method

“What do I do? What do I get? How do I adapt it to my case?”—We now address these practitioner questions by outlining the method’s core, the interpretation of its results, and important extensions for customization, as a navigation device for approaching the multicriteria decision problem introduced in Section 2.1 and its generalizations.

(Core) To determine an approximately robust decision ${\hat{x}}_{ε}$ (at any prespecified precision $ε$ ) with respect to n criteria, one needs to solve just $n + 1$ optimization problems: one to maximize the $ε$ -augmented performance index $Φ_{ε}$ in Equations (12) and (13) and n to compute the normalization constants $f_{i}^{*}$ in Equation (5) for the criterion-specific performance ratios $ϕ_{i}$ in Equation (6). The associated normalized robust weight $\hat{λ} \in Δ$ , which encapsulates the tradeoffs between the criteria embedded in both the action set and the shape of the criterion functions (independent of any scaling), is then obtained (approximately) from Equations (19) and (20) by setting $\hat{x} = {\hat{x}}_{ε}$ .
(Interpretation) The performance index $ρ ({\hat{x}}_{ε}) \in [0, 1]$ , with $ρ (\cdot)$ given in Equation (6), guarantees a minimum percentage that the robust solution ${\hat{x}}_{ε}$ achieves of the optimal weighted objective in Equation (1), for any weight in $Δ$ . This means that the performance of ${\hat{x}}_{ε}$ is guaranteed to be no worse than $ρ ({\hat{x}}_{ε})$ times the best achievable performance under any possible weight vector, ensuring robustness to unknown or contested preferences. In this manner, the weight-induced subjectivity in multicriteria optimization can be removed. The robust weight rationalizes the robust decision as an optimum of the weighted objective in Equation (1) for $λ = \hat{λ}$ . The vector $\hat{λ}$ can be interpreted as the revealed weight structure that best justifies the robust solution, based on the problem’s internal tradeoffs.
(Customization) The method accommodates several practically important extensions:
- Partial weight information: Prior knowledge or constraints on weights can be imposed by requiring weight vectors (not necessarily normalized) to belong to a suitable subset (cf. Section 3.7).
- State-dependent criteria: Uncertain or context-dependent criteria $f_{i}$ can be captured by worst-case performance ratios, effectively reducing the problem to the core framework (cf. Section 3.8).
- Generalized aggregation: Rather than relying on the arithmetic mean in Equation (1), one may adopt alternative scalarization functions consistent with an axiomatic foundation (cf. Section 3.9).

Overall, the relatively robust methodology provides a computationally tractable and conceptually transparent toolkit for balancing multiple, possibly ambiguous objectives in diverse real-world settings (cf. Section 4.3).

4. Discrete Applications

4.1. Finite Action Set

4.1.1. Relative Robustness Criterion.

Consider a finite set of alternatives $X = {1, \dots, J}$ , evaluated according to n positive criteria $f_{1}, \dots, f_{n} : X \to R_{+ +}$ . Each alternative $j \in X$ receives a score $f_{i} (j) = s_{i j} > 0$ , for all $i \in N$ . If we set ${\hat{s}}_{i} = \max_{j \in X} s_{i j}$ , then the performance index for the j-th option becomes

ρ_{j} = \min_{i \in N} \frac{s_{i j}}{{\hat{s}}_{i}}, j \in X .

(31)

By Proposition 3, a robust decision $j^{*} \in R$ can be found by computing the lower limit, as follows:

j^{*} \in {\underline{Lim}}_{ε \to 0^{+}} \arg \max_{j \in X} {(1 - ε) ρ_{j} + \frac{ε}{n} \sum_{i \in N} \frac{s_{i j}}{{\hat{s}}_{i}}} .

(32)

This robust decision achieves the optimal performance index,

ρ^{*} = \max_{j \in X} ρ_{j} = ρ_{j^{*}} = \min_{i \in N} \frac{s_{i j^{*}}}{{\hat{s}}_{i}} .

(33)

As discussed in Section 3, simply maximizing the performance index $ρ_{j}$ yields pseudo-robust solutions, which are generically inefficient; see, for instance, Example 6. We now compare the proposed approach with several alternative robustness criteria.
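
Equations (31) and (32) translate directly into a few lines of code. The sketch below approximates the lower limit in Equation (32) by evaluating the $ε$ -augmented index at one fixed small $ε$ , which is an assumption of this illustration; the function names and the integer score data are hypothetical.

```python
from fractions import Fraction


def ratios(scores):
    """Criterion-specific performance ratios s_ij / s_hat_i."""
    return [[Fraction(v, max(row)) for v in row] for row in scores]


def performance_index(scores):
    """rho_j of Equation (31) for each option j."""
    r = ratios(scores)
    return [min(row[j] for row in r) for j in range(len(scores[0]))]


def augmented_argmax(scores, eps):
    """Maximizers of the eps-augmented index in Equation (32), for fixed eps."""
    r, rho, n = ratios(scores), performance_index(scores), len(scores)
    phi = [(1 - eps) * rho[j] + eps / n * sum(row[j] for row in r)
           for j in range(len(rho))]
    return [j for j, v in enumerate(phi) if v == max(phi)]


# Hypothetical integer scores: two criteria (rows), three options (columns).
scores = [[4, 2, 2], [1, 2, 3]]
rho = performance_index(scores)
robust = augmented_argmax(scores, Fraction(1, 100))
```

In this toy instance the last two options tie for the highest performance index, and the augmentation term selects the last option, which Pareto-dominates the other.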

4.1.2. Alternative Robustness Criteria.

The following three robustness criteria, defined in the present context of discrete action sets, are frequently used in the literature for dealing with parameter ambiguity.

The Laplace criterion corresponds to a weighted objective $F (\cdot | λ_{Laplace})$ in Equation (1) with a uniform weight, $λ_{Laplace} = (1, \dots, 1) / n \in Δ$ , leading to the “Laplace solution,”
$j_{Laplace} \in \arg \max_{j \in X} F (j | λ_{Laplace}) = \arg \max_{j \in X} \sum_{i = 1}^{n} s_{i j} .$ (34)
The worst-case criterion is defined as the minimum payoff across the admissible weights, which yields a so-called “maximin solution,”
$j_{WC} \in \arg \max_{j \in X} \min_{λ \in Δ} F (j | λ) = \arg \max_{j \in X} \min_{i \in N} s_{i j} .$ (35)
The absolute-regret criterion evaluates absolute regret, that is, the maximum difference ex post between what is and the best that could have been, resulting in the “(minimax) absolute-regret solution,”
$j_{AR} \in \arg \min_{j \in X} \max_{λ \in Δ} {F^{*} (λ) - F (j | λ)} = \arg \min_{j \in X} \max_{i \in N} {{\hat{s}}_{i} - s_{i j}} .$ (36)
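
The three alternative criteria in Equations (34) to (36) can be evaluated on a score matrix as sketched below. The function names and the data are hypothetical; ties are returned as lists of (zero-based) option indices.

```python
def argbest(values, best):
    """Indices attaining best(values); ties are all returned."""
    target = best(values)
    return [j for j, v in enumerate(values) if v == target]


def alternative_solutions(scores):
    """Laplace, worst-case, and absolute-regret solutions, Equations (34)-(36)."""
    n_options = len(scores[0])
    s_hat = [max(row) for row in scores]
    laplace = [sum(row[j] for row in scores) for j in range(n_options)]
    worst = [min(row[j] for row in scores) for j in range(n_options)]
    regret = [max(s - row[j] for s, row in zip(s_hat, scores))
              for j in range(n_options)]
    return {
        "Laplace": argbest(laplace, max),
        "WC": argbest(worst, max),
        "AR": argbest(regret, min),
    }


# Hypothetical scores: two criteria (rows), three options (columns).
solutions = alternative_solutions([[6, 3, 1], [1, 3, 5]])
```

For these data the Laplace criterion selects the first option, whereas the worst-case and absolute-regret criteria both select the second.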

4.1.3. Comparison.

The following example, which features a finite action set, compares the proposed robust solution in Equation (32) with decisions recommended by alternative robustness criteria, notably the Laplace criterion in Equation (34), the worst-case (WC, or maximin) solution in Equation (35), and the solution minimizing the (maximum) absolute regret (AR) in Equation (36).

Example 16.

Consider a discrete-choice situation for $J = 5$ options, for which the scores $s_{i j}$ across $n = 3$ criteria are recorded in Table 1. In the context of relative robustness, options 2 and 5 are tied for the highest performance index in Equation (31) and are thus both pseudo-robust, so $Ψ = {2, 5}$ . At the same time, option 5 Pareto-dominates option 2 (with $P = {1, 3, 4, 5}$ ), so that $j^{*} = 5 \in R = Ψ \cap P$ is the unique robust choice. This solution can also be obtained directly from Equation (32), written in the form

j^{*} \in {\underline{Lim}}_{ε \to 0^{+}} \arg \max_{j \in X} Φ_{ε} (j),

Table 1. Discrete Decision Options in Example 16, Evaluated with Different Robustness Criteria


Option (j)	Criteria			Performance ratios/index				Alternative evaluations
Option (j)	$s_{1 j}$	$s_{2 j}$	$s_{3 j}$	$s_{1 j} / {\hat{s}}_{1}$	$s_{2 j} / {\hat{s}}_{2}$	$s_{3 j} / {\hat{s}}_{3}$	$ρ_{j}$	Laplace	WC	AR
1	24	20	16	1	10/11	4/15	4/15	20	**16**	44
2	7	6	32	7/24	3/11	8/15	**3/11**	15	6	28
3	1	2	60	1/24	1/11	1	1/24	**21**	1	**23**
4	22	22	16	11/12	1	4/15	4/15	20	**16**	44
5	12	6	36	1/2	3/11	3/5	**3/11**	18	6	24

Note. Values in bold indicate optimality for the corresponding robustness criterion.

where $Φ_{ε} (j) = (1 - ε) ρ_{j} + (ε / n) \sum_{i = 1}^{n} (s_{i j} / {\hat{s}}_{i})$ , for all $j \in X$ and all $ε \in [0, 1]$ . Indeed, because $ρ_{2} = ρ_{5} = 3 / 11 > ρ_{1} = ρ_{4} = 4 / 15 > ρ_{3} = 1 / 24$ , for sufficiently small $ε \in (0, 1]$ the $ε$ -augmented performance index is

\max_{j \in {2, 5}} Φ_{ε} (j) > Φ_{ε} (4) > Φ_{ε} (3) .

Meanwhile, it is

Φ_{ε} (5) - Φ_{ε} (2) = \frac{ε}{3} (\frac{12 - 7}{24} + \frac{9 - 8}{15}) = \frac{ε}{3} (\frac{11}{40}) > 0, ε \in (0, 1] .

Therefore, $ℛ_{ε} = {5}$ , for all sufficiently small $ε > 0$ , so $j^{*} \in {\underline{Lim}}_{ε \to 0^{+}} ℛ_{ε} = {5}$ , and one finds $j^{*} = 5$ as the unique robust option. Regarding alternative robustness evaluation, both the Laplace criterion in Equation (34) and the absolute-regret criterion in Equation (36) produce the solution $j_{Laplace} = j_{AR} = 3$ with very poor performance in at least one criterion. In fact, changing the payoff $s_{13}$ from one to zero would even produce a zero performance index and a zero worst-case performance guarantee for that same decision (still optimal under these criteria),³⁹ which may be rather difficult to justify in any real-world scenario. Finally, maximizing the worst-criterion performance in Equation (35) leads to indifference between options 1 and 4, at a suboptimal performance index and largest absolute regret. By contrast, the proposed robust solution $j^{*} = 5$ provides (by construction) the best performance index and a reasonable compromise solution in terms of the other criteria. It guarantees a strictly positive performance across all criteria. From Equations (19) and (20) we find that the corresponding (normalized) robust weight vector is

\hat{λ} = {(\frac{1}{12} + \frac{1}{6} + \frac{1}{36})}^{- 1} (\frac{1}{12}, \frac{1}{6}, \frac{1}{36}) = (0.3, 0.6, 0.1) .

Consistent with Equation (21), it is such that

{\hat{λ}}_{i} s_{i j^{*}} = {(\sum_{i \in N} \frac{1}{s_{i j^{*}}})}^{- 1} = {(\frac{1}{12} + \frac{1}{6} + \frac{1}{36})}^{- 1} = 3.6 \geq \max_{j \in X \ {j^{*}}} \min_{i \in N} {\hat{λ}}_{i} s_{i j}, i \in {1, 2, 3},

where 3.6 corresponds to the harmonic mean of the criterion scores for the robust option $j^{*}$ , divided by the number of criteria (i.e., $10.8 / 3$ ). Consider now the robustly reweighted scores $σ_{i j} = {\hat{λ}}_{i} s_{i j}$ , for all $(i, j) \in N \times X$ , displayed in Table 2, together with the associated robustness evaluations. It is interesting that after reweighting, option 3 goes from lowest to largest absolute regret, which highlights the sensitivity of the AR criterion to uncertainty about the relative importance of the criteria. The maximin payoff (WC) solution shifts from options 1 and 4 to option 5, whereas the Laplace solution shifts from option 3 to option 4. By contrast, the performance index remains unaffected by the robust reweighting (or any other reweighting), so option 5 is still the unique robust solution.

Table 2. Discrete Decision Options in Example 16, Evaluated after Robust Reweighting


Option (j)	Criteria			Performance ratios/index				Alternative evaluations
Option (j)	$σ_{1 j}$	$σ_{2 j}$	$σ_{3 j}$	$σ_{1 j} / {\hat{σ}}_{1}$	$σ_{2 j} / {\hat{σ}}_{2}$	$σ_{3 j} / {\hat{σ}}_{3}$	${\hat{ρ}}_{j}$	Laplace	WC	AR
1	7.2	12	1.6	1	10/11	4/15	4/15	20.8	1.6	4.4
2	2.1	3.6	3.2	7/24	3/11	8/15	3/11	8.9	2.1	9.6
3	0.3	1.2	6	1/24	1/11	1	1/24	7.5	0.3	12
4	6.6	13.2	1.6	11/12	1	4/15	4/15	21.4	1.6	4.4
5	3.6	3.6	3.6	1/2	3/11	3/5	3/11	10.8	3.6	9.6

Note. Values in bold indicate optimality for the corresponding robustness criterion.
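The computations behind Example 16 and Table 2 can be retraced with a short Python sketch. It is only an illustration, not the author's implementation. The raw scores `s` are an assumption: they are back-computed from Table 2 by dividing each reweighted score $σ_{i j}$ by the robust weight ${\hat{λ}}_{i}$, and the function names are hypothetical. Exact rational arithmetic is used so that the fractions in the table can be compared directly.

```python
from fractions import Fraction as Fr

# Raw scores s[i][j] for criteria i = 1..3 (rows) and options j = 1..5 (columns).
# ASSUMPTION: back-computed from Table 2 via s_ij = sigma_ij / lambda_i.
s = [
    [24, 7, 1, 22, 12],
    [20, 6, 2, 22, 6],
    [16, 32, 60, 16, 36],
]
n, J = len(s), len(s[0])


def ratios(scores):
    """Criterion-specific performance ratios: score divided by the row maximum."""
    return [[Fr(v) / Fr(max(row)) for v in row] for row in scores]


def performance_index(scores):
    """rho_j = min_i phi_ij, the boundary representation of the index."""
    phi = ratios(scores)
    return [min(phi[i][j] for i in range(n)) for j in range(J)]


def augmented_index(scores, eps):
    """Phi_eps(j) = (1 - eps) * rho_j + (eps / n) * sum_i phi_ij."""
    phi = ratios(scores)
    rho = performance_index(scores)
    return [(1 - eps) * rho[j] + eps / n * sum(phi[i][j] for i in range(n))
            for j in range(J)]


def robust_weight(scores, j):
    """Normalized weights proportional to 1 / s_ij for the chosen option j."""
    inv = [1 / Fr(scores[i][j]) for i in range(n)]
    return [v / sum(inv) for v in inv]


rho = performance_index(s)
eps = Fr(1, 100)                                   # a "sufficiently small" epsilon
phi_eps = augmented_index(s, eps)
j_star = max(range(J), key=lambda j: phi_eps[j])   # zero-based index
lam = robust_weight(s, j_star)
sigma = [[lam[i] * s[i][j] for j in range(J)] for i in range(n)]

print("performance index:", [str(r) for r in rho])
print("robust option:", j_star + 1)
print("robust weight:", [str(w) for w in lam])
print("index invariant under reweighting:", performance_index(sigma) == rho)
```

Under the stated assumption on `s`, the sketch selects option 5, returns the weight vector (3/10, 3/5, 1/10), and confirms that the performance index is unchanged by the reweighting.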

4.2. Data-Driven Approach

Consider any (nonempty, compact) action set $X \subset R^{m}$ for some integer $m \geq 1$ , as in Section 2. Assume further that the decision maker does not know the functional form of the multicriterion $f = (f_{1}, \dots, f_{n})$ , but is still able to observe its value $s^{k} = (s_{1}^{k}, \dots, s_{n}^{k})$ for different decisions $x^{k} \in X$ over the course of $K \geq 1$ experiments, where $k \in K = {1, \dots, K}$ . Indeed, given the realized score set $\hat{S} = {s^{k} : k \in K}$ and the realized action set $\hat{X} = {x^{k} : k \in K} \subset X$ , the average criterion scores are⁴⁰

{\hat{f}}_{i} (x) = \frac{\sum_{k \in K} 1_{\{x^{k} = x\}} s_{i}^{k}}{\sum_{k \in K} 1_{\{x^{k} = x\}}}, \quad (x, i) \in \hat{X} \times N .

With this, the (data-driven) criterion-specific performance ratios are given by

{\hat{ϕ}}_{i} (x) = \frac{{\hat{f}}_{i} (x)}{{\hat{f}}_{i}^{*}}, (x, i) \in \hat{X} \times N,

where ${\hat{f}}_{i}^{*} = \max {\hat{f}}_{i} (\hat{X})$ is the $i$-th maximum average criterion score. The (data-driven) performance index,

\hat{ρ} (x) = \min_{i \in N} {\hat{ϕ}}_{i} (x), x \in \hat{X},

is defined for all observed decisions. A (data-driven) robust decision ${\hat{x}}^{*}$ (among all sampled actions in $\hat{X}$) can then be determined in the standard way by applying Proposition 3 to the observational analogues of our standard approach:

{\hat{x}}^{*} \in {\underline{Lim}}_{ε \to 0^{+}} \arg \max_{x \in \hat{X}} {(1 - ε) \hat{ρ} (x) + \frac{ε}{n} \sum_{i \in N} {\hat{ϕ}}_{i} (x)} .

(37)

In other words, $\hat{ρ} (x)$ captures the worst-case relative performance of action $x$ across all observed criteria, and ${\hat{x}}^{*}$ identifies a robust decision, which is efficient and has the best-possible worst-case performance. Note that instead of going through the motions of actually taking the limit, it is sufficient to maximize the $ε$ -augmented performance index, that is, the maximand in Equation (37), for a sufficiently small $ε \in (0, 1]$ . This approximates the optimal robustness performance, captured by the optimal performance index $ρ^{*}$ , arbitrarily closely by virtue of the $ε$ -performance guarantee provided in Proposition 4. Finally, as in Equations (19) and (20), the (data-driven) robust weight is

\hat{λ} = {(\sum_{i \in N} \frac{1}{{\hat{f}}_{i} ({\hat{x}}^{*})})}^{- 1} {(\frac{1}{{\hat{f}}_{i} ({\hat{x}}^{*})})}_{i = 1}^{n} \in Δ .

(38)

Based on the observed data, this vector provides a robust estimate of the tradeoffs between the different criteria from the vantage point of relative robustness.
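The steps of the data-driven approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `data_driven_robust`, the decision labels, and the observations are all hypothetical, and the averaged scores are assumed to be strictly positive so that all ratios and weights are well defined. Instead of taking the limit in Equation (37), the sketch maximizes the $ε$-augmented index for one small fixed $ε$.

```python
from collections import defaultdict


def data_driven_robust(observations, eps=1e-3):
    """Return (x_star, rho_hat, lam_hat) from (decision, scores) observations.

    ASSUMPTION: every averaged criterion score is strictly positive.
    """
    totals = {}
    counts = defaultdict(int)
    for x, scores in observations:
        prev = totals.get(x, [0.0] * len(scores))
        totals[x] = [a + b for a, b in zip(prev, scores)]
        counts[x] += 1

    # Average criterion scores f_hat_i(x) for every observed decision x.
    f_hat = {x: [v / counts[x] for v in t] for x, t in totals.items()}
    n = len(next(iter(f_hat.values())))

    # Criterion-wise maxima and performance ratios phi_hat_i(x).
    f_star = [max(f[i] for f in f_hat.values()) for i in range(n)]
    phi = {x: [f[i] / f_star[i] for i in range(n)] for x, f in f_hat.items()}

    # Performance index and its eps-augmented version (maximand in Eq. (37)).
    rho = {x: min(p) for x, p in phi.items()}
    augmented = {x: (1 - eps) * rho[x] + eps * sum(phi[x]) / n for x in phi}
    x_star = max(augmented, key=augmented.get)

    # Robust weight as in Equation (38): proportional to 1 / f_hat_i(x_star).
    inv = [1.0 / v for v in f_hat[x_star]]
    lam = [v / sum(inv) for v in inv]
    return x_star, rho, lam


# Hypothetical observations: (decision label, observed criterion scores).
observations = [
    ("a", (6, 2)), ("a", (6, 4)),
    ("b", (4, 4)), ("b", (4, 6)),
    ("c", (2, 7)),
]
x_star, rho_hat, lam_hat = data_driven_robust(observations)
print(x_star, rho_hat, lam_hat)
```

In this toy data set, decision "b" never attains a criterion-wise maximum but has the largest worst-case ratio, which mirrors the qualitative message of the approach.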

Example 17.

Consider the joint evaluation of human performance on a certain task (such as giving a university lecture) by K evaluators who score J individuals in N different performance dimensions (or criteria) on a Likert scale (from one to seven),⁴¹ so $K = {1, \dots, K}$ , $N = {1, \dots, N}$ , and $X = {1, \dots, J}$ . Figure 8 shows a spider plot comparing $J = 4$ different individuals. Assuming that all evaluators scored all the individuals using a seven-point Likert scale, the realized score set $\hat{S}$ has elements $s^{k} \in {1, \dots, 7}^{N}$ , for all $k \in K$ , whereas the realized action set $\hat{X}$ is equal to $X$ . Individual 3, although never achieving a highest criterion-specific average score, exhibits the best data-driven performance index ( ${\hat{ρ}}^{*} = \hat{ρ} (3) = 76.19 % = \max_{j \in {1, 2, 3, 4}} \hat{ρ} (j)$ ). Because in a noisy observation environment ties in the performance index are fairly unlikely, the set of pseudo-robust alternatives $Ψ$ tends to be a singleton, thus also resulting in a singleton set of robust options (e.g., $R = {3}$ ). This example illustrates how the data-driven approach identifies robust performance through balanced tradeoffs rather than peak performance in any one dimension, thus favoring individual 3, whose weakest dimension is relatively strong.

4.3. Real-World Applications

Relatively robust multicriteria optimization provides a flexible and transparent framework for decision-making under uncertainty. As developed in Sections 2 and 3, it is especially well suited for complex problems where the relative importance of multiple criteria is ambiguous or hard to justify. The method systematically balances competing objectives using only minimal assumptions on the decision maker’s preferences, and delivers solutions that are both Pareto-efficient and robust (cf. Proposition 3). We illustrate its practical relevance in three diverse contexts: the energy trilemma, quality-adjusted life year (QALY)-based health evaluations, and corporate resource allocation. In each case, the relatively robust methodology facilitates structured tradeoff management and data-informed robustness, whether operating in an abstract policy space or on a finite empirical action set (cf. Sections 4.1 and 4.2).

4.3.1. Energy Trilemma: Balancing Energy Security, Equity, and Sustainability.

The World Energy Council’s Energy Trilemma framework evaluates nations on three key dimensions, namely “energy security” (i.e., the reliability and resilience of energy supply), “energy equity” (i.e., the accessibility and affordability of energy for all citizens), and “environmental sustainability” (i.e., the reduction of greenhouse gas emissions and environmental impact). Nations face challenges in improving one dimension without compromising others (World Energy Council 2024).⁴² For instance, increasing energy equity by subsidizing fossil fuels may undermine sustainability, while prioritizing environmental sustainability through renewable energy investment might initially reduce energy security or equity. An optimal strategy depends on the relative importance attributed to each of the three criteria. In this setting, a relatively robust approach can

Identify Robust Policies: By modeling each nation’s energy policies and outcomes as feasible decisions, the method identifies those policies that achieve strong tradeoffs across all three dimensions. For example, a relatively robust policy might balance investment in renewable energy, grid modernization, and subsidies for low-income households, ensuring consistent performance regardless of variations of the relative importance across dimensions, which might still be importance-ranked (cf. Section 3.7).
Quantify Tradeoffs: The approach provides a clear representation of tradeoffs, such as how much equity might need to be sacrificed for a given improvement in sustainability under different weighting scenarios.
“Robustify” Strategic Goals: Policymakers can develop long-term strategies that are resilient to evolving societal priorities, such as shifts toward greater emphasis on the precautionary principle (cf. Section 3.9).

4.3.2. QALY Impact of Diseases: Evaluating Quality-Adjusted Life Years.

QALYs measure the impact of diseases and medical treatments by combining longevity and quality of life into a single index (Pliskin 1974, Pliskin and Beck 1976, Miyamoto et al. 1998).⁴³ The impact of different impairments—such as mobility loss, chronic pain, or cognitive decline—depends on how these are weighted relative to each other in calculating overall health outcomes. Subject to restrictions on weights (e.g., a priority ranking; cf. Example 12) and with a flexible aggregation of criteria (cf. Section 3.9) the method can

Handle Uncertain Weightings: When precise utility-weightings for different impairments are unavailable (as one would generally assume), the method identifies health interventions or treatment plans that remain effective across a wide range of subjective assessments. For example, it might suggest treatments that optimize outcomes for both pain relief and mobility restoration, ensuring robust quality-of-life improvements.
Optimize Resource Allocation: Health authorities can use the method to prioritize interventions that deliver the highest QALY improvements per dollar spent, even when the relative importance of various health dimensions (e.g., physical versus mental health) is uncertain.⁴⁴
Support Evidence-Based Policy: The method provides a performance guarantee for proposed policies, demonstrating their effectiveness regardless of the relative emphasis placed on specific impairments.

4.3.3. Resource Allocation in Companies: Balancing Risk, Resources, and Rival Assets.

In organizations, resource allocation involves deciding how to distribute limited resources—financial, human, or physical—across competing projects. Each project is evaluated based on multiple criteria, such as “risk of noncompletion” (i.e., the probability that a project fails because of delays or unforeseen issues), “use of human resources” (i.e., the availability and workload of skilled personnel required for the project), and “use of rival assets” (i.e., the occupation of assets that cannot be used by multiple projects simultaneously, for example, specialized machinery). Applying the framework of relatively robust decision making allows organizations to⁴⁵

Identify Balanced Portfolios: By modeling projects and their criteria as feasible decisions $x \in X$ , with performance criteria $f_{1} (x), \dots, f_{n} (x)$ , the method selects a portfolio of projects that balances competing objectives, ensuring efficient resource use even when preferences over criteria are unclear.
Manage Tradeoffs: By aggregating criterion-specific performance ratios $ϕ_{i} (x)$ into a performance index $ρ (x)$ , the method reveals, for example, how prioritizing low-risk projects impacts resource utilization and asset deployment, informing decisions about a robust balance between risk and resource efficiency.
Adapt to Uncertainty: As organizational priorities shift (e.g., toward innovation or risk aversion), the method ensures that allocation strategies can remain robust across changing criteria (cf. Section 3.8) and weight configurations for the criteria, and possibly across different aggregation methods (cf. Section 3.9).

In settings with discrete project sets and empirical observations, a data-driven approach can be applied (cf. Section 4.2).

5. Conclusion

Multicriteria optimization aims to resolve conflicts between competing objectives by finding Pareto-efficient decisions for which improving one criterion necessarily degrades at least one other. Although Pareto-efficiency sets a minimum standard of nonwastefulness, practical decision making often requires selecting a single compromise solution from the available Pareto frontier. Scalarization, such as assigning weights to criteria, is a common method for operationalizing this selection, but it assumes knowledge of the weights, which may not be available or easily justifiable, and it may also exclude Pareto-efficient solutions when the action set is nonconvex.

Here, we seek to mitigate the dependence on precise weight specifications by introducing a performance index that evaluates the worst-case weighted performance of a decision relative to its maximum potential. A Pareto-efficient decision that maximizes this index is viewed as relatively robust, balancing competing criteria in a way that offers resilience to weight uncertainty. A critical feature of the method is its computational simplicity, relying only on the compactness of the feasible set and the continuity of criterion functions to guarantee the existence and basic regularity of the solutions. This ensures broad applicability, even for complex, nonconvex problems. Criterion ambiguity and more general aggregation methods can be accommodated. In the case of finite sets, this method is completely general and can be operationalized through a data-driven approach under virtually no assumptions.

Appendix. Proofs

Proof of Proposition 1.

The proof proceeds by analyzing what happens when either one of the two inequalities is violated, and subsequently when both are violated. This yields three cases, each of which leads to a contradiction.

Case 1: If $f_{i} (\hat{x}) < f_{i} (x)$ and $f_{j} (\hat{x}) \leq f_{j} (x)$ , then
$F^{*} (\hat{λ}) = F (\hat{x} | \hat{λ}) = \sum_{l = 1}^{n} {\hat{λ}}_{l} f_{l} (\hat{x}) < \sum_{l = 1}^{n} {\hat{λ}}_{l} f_{l} (x) = F (x | \hat{λ}),$
which is a contradiction.
Case 2: If $f_{i} (\hat{x}) \geq f_{i} (x)$ and $f_{j} (\hat{x}) > f_{j} (x)$ , then
$F^{*} (λ) = F (x | λ) = \sum_{l = 1}^{n} λ_{l} f_{l} (x) < \sum_{l = 1}^{n} λ_{l} f_{l} (\hat{x}) = F (\hat{x} | λ),$
which is a contradiction.
Case 3: If $f_{i} (\hat{x}) < f_{i} (x)$ and $f_{j} (\hat{x}) > f_{j} (x)$ , then
$F^{*} (\hat{λ}) = F (\hat{x} | λ) + ε (f_{i} (\hat{x}) - f_{j} (\hat{x})) \geq F (x | \hat{λ}) = F (x | λ) + ε (f_{i} (x) - f_{j} (x)) .$

But because $ε (f_{i} (x) - f_{j} (x)) > ε (f_{i} (\hat{x}) - f_{j} (\hat{x}))$ , this implies

\begin{array}{l} F (\hat{x} | λ) \geq F (x | \hat{λ}) - ε (f_{i} (\hat{x}) - f_{j} (\hat{x})) \\ = \underbrace{F (x | λ)}_{= F^{*} (λ)} + \underbrace{ε (f_{i} (x) - f_{j} (x)) - ε (f_{i} (\hat{x}) - f_{j} (\hat{x}))}_{> 0} \\ > F^{*} (λ), \end{array}

which is a contradiction.

Hence, $f_{i} (\hat{x}) \geq f_{i} (x)$ and $f_{j} (\hat{x}) \leq f_{j} (x)$ , as claimed. □

Proof of Lemma 1.

Fix $δ > 0$ and $i \in N$ , and let $(x, \hat{x}) \in X (w) \times X (\hat{w})$ . Then

F^{*} (\hat{w}) - F^{*} (w) = F (\hat{x} | \hat{w}) - F (x | w) = \underbrace{F (\hat{x} | \hat{w}) - F (x | \hat{w})}_{\geq 0} + \underbrace{F (x | \hat{w}) - F (x | w)}_{= δ f_{i} (x)} \geq δ f_{i} (x),

because, by assumption, $\hat{w} = w + δ e_{i}$. As the preceding inequality holds for any $x \in X (w)$, it follows that

F^{*} (\hat{w}) - F^{*} (w) \geq δ (\max f_{i} (X (w))) > 0,

and thus $F^{*} (\hat{w}) > F^{*} (w)$, which completes the proof. □

Proof of Proposition 2.

We first show that, for any given decision, the performance ratio is quasiconcave in the weight. To that end, fix any feasible action $x \in X$ , and define

U_{c} (x) = \{λ \in Δ : φ (x | λ) \geq c\}

(A.1)

as the set of (normalized) weights $λ$ that yield a performance ratio $φ (x | λ)$ in Equation (3) at a level of at least $c \in [0, 1]$. In fact, the upper contour sets in Equation (A.1) are nested in their level, that is,

c \leq \hat{c} \Rightarrow U_{\hat{c}} (x) \subseteq U_{c} (x),

(A.2)

for any $\hat{c} \in [0, 1]$. Furthermore, for sufficiently small values of $c$, we have $U_{c} (x) = Δ$.⁴⁶ Thus, we may choose $c \in [0, 1]$ such that $U_{c} (x)$ is nonempty, and consider any two weights $λ, \hat{λ} \in U_{c} (x)$. For any $θ \in (0, 1)$, their convex combination $λ_{θ} = θ λ + (1 - θ) \hat{λ}$ belongs to $U_{c} (x)$ if and only if

φ (x | λ_{θ}) = \frac{F (x | λ_{θ})}{F^{*} (λ_{θ})} \geq c .

(A.3)

Because $φ (x | λ) \geq c$ and $φ (x | \hat{λ}) \geq c$ , the definition of the weighted objective in Equation (1) implies

F (x | λ_{θ}) = F (x | θ λ + (1 - θ) \hat{λ}) = θ F (x | λ) + (1 - θ) F (x | \hat{λ}) \geq c (θ F^{*} (λ) + (1 - θ) F^{*} (\hat{λ})),

(A.4)

where we use that $\min \{F^{*} (λ), F^{*} (\hat{λ})\} > 0$ by the nontriviality condition (N). Moreover, the convex combination of the individually optimized objectives $F^{*} (λ)$ and $F^{*} (\hat{λ})$ (each obtained from possibly different decisions) cannot be smaller than the value of the jointly optimized objective (from a single decision):

\begin{array}{rl} θ F^{*} (λ) + (1 - θ) F^{*} (\hat{λ}) & = \max_{x, \hat{x} \in X} \{θ F (x | λ) + (1 - θ) F (\hat{x} | \hat{λ})\} \\ & \geq \max_{x \in X} \{θ F (x | λ) + (1 - θ) F (x | \hat{λ})\} \\ & = \max_{x \in X} F (x | θ λ + (1 - θ) \hat{λ}) = F^{*} (λ_{θ}) . \end{array}

(A.5)

Therefore, as long as $c \geq 0$ , the fact that $F^{*} (λ_{θ}) > 0$ , together with Equations (A.4) and (A.5), implies that Equation (A.3) holds. As an immediate consequence, it is $λ_{θ} \in U_{c} (x)$ , which in turn means that $U_{c} (x)$ is convex for all $c \in [0, 1]$ and all $x \in X$ , as claimed.

The convexity of the upper contour sets $U_{c} (x)$ implies that $φ (x | λ)$ is quasiconcave in $λ$ (see, e.g., Arrow and Enthoven 1961, p. 780). Hence, the minimum of the performance ratio over all weights in $Δ$ is attained on the boundary:

\min_{λ \in Δ} φ (x | λ) = \min_{λ \in \partial Δ} φ (x | λ) .

The same quasiconcavity argument applies to any face of $Δ$ , so the minimum must also lie on the boundary of each face. Repeating this argument recursively over edges and vertices, we conclude that the minimum is attained in the set $Λ = {e_{1}, \dots, e_{n}}$ of the n vertices of $Δ$ . Thus, we obtain a form of “perfect complementarity,”

ρ (x) = \min_{λ \in {e_{1}, \dots, e_{n}}} {\frac{F (x | λ)}{F^{*} (λ)}} = \min_{i \in N} {\frac{F (x | e_{i})}{F^{*} (e_{i})}} = \min_{i \in N} {\frac{f_{i} (x)}{f_{i}^{*}}} = \min_{i \in N} ϕ_{i} (x),

as claimed in Equation (6), which concludes the proof. □
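The boundary representation just proved can be checked numerically on a finite action set. The following Python sketch is an illustration only: the criterion values in `F` are randomly generated (hypothetical) positive numbers, and the helper names are not from the paper. It compares the vertex formula $\min_{i} f_{i} (x) / f_{i}^{*}$ with performance ratios evaluated at randomly sampled weights in the simplex; no sampled weight should yield a smaller ratio than the vertex minimum.

```python
import random


def weighted(f_x, lam):
    """Weighted objective F(x | lam) = sum_i lam_i * f_i(x)."""
    return sum(l * v for l, v in zip(lam, f_x))


def performance_ratio(f_x, lam, F):
    """phi(x | lam) = F(x | lam) / max_y F(y | lam) over the finite set F."""
    return weighted(f_x, lam) / max(weighted(f_y, lam) for f_y in F)


def boundary_index(f_x, F):
    """rho(x) = min_i f_i(x) / f_i^*, the representation in Equation (6)."""
    return min(f_x[i] / max(f_y[i] for f_y in F) for i in range(len(f_x)))


def random_weight(n):
    """Draw a weight vector from the unit simplex (normalized exponentials)."""
    raw = [random.expovariate(1.0) for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


random.seed(0)
n = 3
# Hypothetical criterion vectors f(x) for eight feasible decisions.
F = [[random.uniform(0.1, 1.0) for _ in range(n)] for _ in range(8)]

gaps = []
for f_x in F:
    rho = boundary_index(f_x, F)
    sampled_min = min(performance_ratio(f_x, random_weight(n), F)
                      for _ in range(2000))
    gaps.append(sampled_min - rho)   # should never be negative

print("smallest gap between sampled minimum and vertex minimum:", min(gaps))
```

A nonnegative smallest gap is consistent with the claim that the worst case over all weights is attained at a vertex of $Δ$; sampling cannot prove the claim, it merely fails to contradict it.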

Proof of Lemma 2.

For any $x \in X$, Proposition 2 provides a representation of the performance index, which can be rewritten in the form $ρ (x) = ϕ_{ι (x)} (x)$, where $ι (x) \in ℐ (x) = \arg \min_{i \in N} ϕ_{i} (x)$. The set of pseudo-robust decisions $Ψ$ in Equation (7) is nonempty by the extreme-value theorem (Rudin 1976, theorem 4.16, p. 89), and it is compact by the maximum theorem (Berge 1963, p. 116). These two theorems also guarantee that the correspondence $ℐ (\cdot)$ is upper semicontinuous and nonempty-valued. Consider now $P$, which by Equations (9) and (10) can be written as

P = \{x \in X : ∄ x^{'} \in X \ \text{s.t.} \ x^{'} ≻ x\} = \{x \in X : x^{'} ⪯ x \ \text{for all} \ x^{'} \in X\},

where $⪯$ is the negation of $≻$. By the continuity of $f$, the preference $≻$ is continuous, so that $P$ is compact. We now show that it must also be nonempty. For this, start with any $ξ^{0} \in Ψ$: If $ξ^{0} \in P$, then immediately $ξ^{0} \in R$, so the claim holds. Otherwise, if $ξ^{0} \notin P$, then there exists $ξ^{1} \in X$ such that $ξ^{1} ≻ ξ^{0}$. By Equation (9) there exists $i \in N$ such that $f_{i} (ξ^{1}) > f_{i} (ξ^{0})$ and $f (ξ^{1}) \geq f (ξ^{0})$. Equivalently, $ϕ_{i} (ξ^{1}) > ϕ_{i} (ξ^{0})$ and $ϕ_{j} (ξ^{1}) \geq ϕ_{j} (ξ^{0})$, for all $j \in N$, so that $ρ (ξ^{1}) \geq ρ (ξ^{0})$. Because by assumption $ξ^{0} \in Ψ$, by Equation (7) it must be $ρ (ξ^{1}) \geq ρ (ξ^{0}) = ρ^{*} = \max ρ (X)$, so $ξ^{1} \in Ψ$ as well. In this manner one can now construct a sequence of decisions $ξ^{k}$, for $k \in \{0, 1, \dots\}$, continuing until $ξ^{k} \in P$ for some $k \geq 0$, which then establishes that $Ψ \cap P$ is indeed nonempty. Failing that, by the Bolzano-Weierstrass theorem (Berge 1963, p. 67) the sequence ${(ξ^{k})}_{k \geq 0}$ must have a convergent subsequence in the compact set $X$. Because the convergence is monotone in the (continuous) preference, there can only be a single accumulation point, which is equal to the limit of the sequence: $\hat{x} = \lim_{k \to \infty} ξ^{k}$. Given that the set of all pseudo-robust decisions $Ψ$ is compact (i.e., in particular closed) as noted earlier, it is $\hat{x} \in Ψ$. In addition, $x^{'} ⪯ \hat{x}$ for all $x^{'} \in X$, as the upper and lower contour sets of a continuous preference are closed. But this means that $\hat{x} \in Ψ \cap P$, so the robust decision set $R$ is both nonempty and compact, for it is the intersection of two compact sets. □

Proof of Lemma 3.

By Lemma 2, the robust decision set is nonempty and compact. Hence, by the extreme-value theorem there exists ${\hat{x}}^{*} \in R = Ψ \cap P$, and Equation (7) yields $ρ (R) = ρ (Ψ \cap P) = \{ρ^{*}\} = ρ (Ψ)$, as well as $\max ρ (P) = ρ^{*} = \max ρ (X)$, establishing the claim. □

Proof of Lemma 4.

(i) The claim follows by combining Equations (7) and (12). (ii) Fix $ε \in (0, 1]$ , and—following the outcome-based logic discussed in Remark 9—consider a selection

y_{ε}^{*} \in Y_{ε} = \arg \max_{y \in Y} {\sum_{i \in N} y_{i} : t_{ε} \leq y_{1}, \dots, t_{ε} \leq y_{n}, t_{ε} \in T_{ε}} .

We now show that $y_{ε}^{*}$ is efficient (with respect to the coordinates of points in $Y$ ).⁴⁷ Suppose it is not; then there exists a feasible $\hat{y} ≻ y_{ε}^{*}$ which achieves a strictly greater payoff, a contradiction, which in turn establishes the claimed efficiency of $y_{ε}^{*}$ . Moreover, it is clear that $Y_{ε} = ϕ (ℛ_{ε})$ , so that

ℛ_{ε} = ϕ^{- 1} (Y_{ε}) = {x \in X : ϕ (x) \in Y_{ε}} .

As a result, $ℛ_{ε} \subset P$ , for all $ε \in (0, 1]$ , as claimed. (iii) Let $ε \in (0, 1]$ , and consider any $(x, \hat{x}) \in ℛ_{ε} \times Ψ$ . If $x \notin Ψ$ , then

ρ = \min_{i \in N} ϕ_{i} (x) < \min_{i \in N} ϕ_{i} (\hat{x}) = ρ^{*} = \max ρ (X) .

On the other hand, $x \in ℛ_{ε}$ implies that $Φ_{ε} (x) \geq Φ_{ε} (\hat{x})$ . Let $κ = \sum_{i \in N} (ϕ_{i} (x) - ϕ_{i} (\hat{x}))$ be the total coordinate-wise difference in performance. Then for any $\hat{ε} > 0$ ,

Φ_{\hat{ε}} (x) \geq Φ_{\hat{ε}} (\hat{x}) \Leftrightarrow \hat{ε} \cdot κ \geq ρ^{*} - ρ > 0 \Leftrightarrow \hat{ε} \geq \frac{ρ^{*} - ρ}{κ} > 0 .

This establishes the claim in part (iii), for any $ε^{'} \in (0, \min {ε, (ρ^{*} - ρ) / κ})$ and $\hat{ε} \in (0, ε^{'})$ . For $\hat{ε} = 0$ , the claim follows from part (i). □

Proof of Proposition 3.

We first note that $ℛ_{ε}$ is upper semicontinuous in $ε \in [0, 1]$ , and set $Q = {\underline{Lim}}_{ε \to 0^{+}} ℛ_{ε}$ . The proof proceeds in two steps. We first show that $Q \subset R$ and then $Q \neq \emptyset$ .

Step 1: $Q \subset R$. By upper semicontinuity of $ℛ_{ε}$ we have that $Q \subset ℛ_{0} = Ψ$ (see, e.g., Aubin and Frankowska 1990, p. 41), taking into account that $Ψ$ is compact; cf. Endnote 19 for Section 3.1. Because $ℛ_{ε} \subset P$, for all $ε \in (0, 1]$, we further obtain $Q \subset cl P$. Assume that there is a point $x \in Q \setminus P$, which means that $x = \lim_{ε \to 0^{+}} x_{ε}$, for some selection $x_{ε} \in ℛ_{ε}$, where $ε \in (0, 1]$. Because $x \notin P$, there exists $\hat{x} \in P$ such that $\hat{x} ≻ x$. That is, $ϕ_{j} (\hat{x}) > ϕ_{j} (x)$ for some $j \in N$, and $ϕ (\hat{x}) \geq ϕ (x)$. If we denote by $A = (1 / n) \sum_{i \in N} ϕ_{i}$ the average performance over all criteria, then $x \notin P$ implies that $A (x) < A (\hat{x})$. By the continuity of $A (\cdot)$ on $X$, it is therefore $\lim_{ε \to 0^{+}} A (x_{ε}) = A (x) < A (\hat{x})$. Hence, there exists a $\bar{ε} \in (0, 1)$ such that $A (x_{ε}) < A (\hat{x})$ for all $ε \in (0, \bar{ε})$. Thus, for any given $ε \in (0, \bar{ε})$:

Φ_{ε} (\hat{x}) - Φ_{ε} (x_{ε}) = (1 - ε) (ρ (\hat{x}) - ρ (x_{ε})) + ε (A (\hat{x}) - A (x_{ε})) \geq ε (A (\hat{x}) - A (x_{ε})) > 0,

where we have taken into account that $\hat{x} \in Ψ$, so $ρ (\hat{x}) = \max ρ (X) \geq ρ (x_{ε})$. But this means $Φ_{ε} (x_{ε}) < Φ_{ε} (\hat{x})$, contradicting $x_{ε} \in ℛ_{ε}$. Hence, we have shown that necessarily $Q \setminus P = \emptyset$. Because we already know that $Q \subset Ψ \cap (cl P)$, one obtains $Q \subset Ψ \cap P = R$, as claimed.

Step 2: $Q \neq \emptyset$ . Consider a monotone sequence ${(ε_{k})}_{k = 0}^{\infty} \subset (0, 1)$ with $ε_{k + 1} < ε_{k}$ , for all $k \geq 0$ , and such that $\lim_{k \to \infty} ε_{k} = 0$ . Using the same selection $x_{ε}$ as in Step 1, the sequence ${(x_{ε_{k}})}_{k = 0}^{\infty} \subset X$ is a sequence of points contained in the compact set $X$ , so that by the Bolzano-Weierstrass theorem (Berge 1963, p. 67) there exists a convergent subsequence ${(x_{ε_{k_{j}}})}_{j = 0}^{\infty} \subset {(x_{ε_{k}})}_{k = 0}^{\infty}$ (with limit in $X$ ). That is, there exists $q \in X$ , such that $q = \lim_{j \to \infty} x_{ε_{k_{j}}}$ , and by the definition of $Q$ we therefore obtain that $q \in Q$ and thus, $Q \neq \emptyset$ , as stated in the result.

This concludes the proof. □

Proof of Lemma 5.

Continuity of the optimal $ε$ -augmented performance index $Φ_{ε}^{*}$ over $ε \in [0, 1]$ follows from the maximum theorem (Berge 1963, p. 116), given the continuity of $Φ_{ε}$ in both arguments and the compactness of $X$ . To establish monotonicity, fix $ε^{'}, ε^{″} \in [0, 1]$ with $ε^{″} > ε^{'}$ . By optimality of $x_{ε^{″}}$ for $Φ_{ε^{″}}$ and feasibility of $x_{ε^{'}}$ at $ε^{″}$ , we obtain

Φ_{ε^{″}}^{*} = (1 - ε^{″}) ρ_{ε^{″}} + ε^{″} μ_{ε^{″}} \geq (1 - ε^{″}) ρ_{ε^{'}} + ε^{″} μ_{ε^{'}} = Φ_{ε^{″}} (x_{ε^{'}}) .

Subtracting $Φ_{ε^{'}}^{*} = (1 - ε^{'}) ρ_{ε^{'}} + ε^{'} μ_{ε^{'}}$ , we find

Φ_{ε^{″}}^{*} - Φ_{ε^{'}}^{*} \geq Φ_{ε^{″}} (x_{ε^{'}}) - Φ_{ε^{'}}^{*} = (ε^{″} - ε^{'}) (μ_{ε^{'}} - ρ_{ε^{'}}) \geq 0,

because $μ_{ε^{'}} \geq ρ_{ε^{'}}$ by definition. This confirms that $Φ_{ε}^{*}$ is nondecreasing on [0, 1]. To establish convexity, we show that regular increments in $ε$ increase the corresponding differences of the value function. If $\check{ε} = (ε^{'} + ε^{″}) / 2$ denotes the midpoint between $ε^{'}$ and $ε^{″}$, it is sufficient to show that the difference of successive differences,

Δ = (Φ_{ε^{″}}^{*} - Φ_{\check{ε}}^{*}) - (Φ_{\check{ε}}^{*} - Φ_{ε^{'}}^{*}),

must be nonnegative. Indeed, setting $δ = (ε^{″} - ε^{'}) / 2 > 0$, it is

Φ_{ε^{″}}^{*} - Φ_{\check{ε}}^{*} \geq δ (μ_{\check{ε}} - ρ_{\check{ε}}) \geq Φ_{\check{ε}}^{*} - Φ_{ε^{'}}^{*},

where the first inequality follows from $Φ_{ε^{″}}^{*} \geq Φ_{ε^{″}} (x_{\check{ε}})$, and the second from $Φ_{ε^{'}}^{*} \geq Φ_{ε^{'}} (x_{\check{ε}})$. Together, these imply $Δ \geq 0$, which proves that $Φ_{ε}^{*}$ is convex on [0, 1]. □
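Monotonicity and convexity of the optimal augmented index can be illustrated on a finite example. The sketch below reuses the performance ratios listed in Table 2 and evaluates $Φ_{ε}^{*}$ on an equidistant grid; the grid size and variable names are arbitrary choices made for this illustration. Because the value function is a maximum of finitely many functions that are affine in $ε$, nonnegative first and second differences on the grid are what one would expect.

```python
from fractions import Fraction as Fr

# Performance ratios (phi_1j, phi_2j, phi_3j) for options j = 1..5, from Table 2.
phi = [
    (Fr(1), Fr(10, 11), Fr(4, 15)),
    (Fr(7, 24), Fr(3, 11), Fr(8, 15)),
    (Fr(1, 24), Fr(1, 11), Fr(1)),
    (Fr(11, 12), Fr(1), Fr(4, 15)),
    (Fr(1, 2), Fr(3, 11), Fr(3, 5)),
]
# (rho_j, mu_j): worst-case ratio and average ratio of each option.
pairs = [(min(p), sum(p) / len(p)) for p in phi]


def phi_star(eps):
    """Optimal eps-augmented index: max_j (1 - eps) * rho_j + eps * mu_j."""
    return max((1 - eps) * rho + eps * mu for rho, mu in pairs)


grid = [Fr(k, 20) for k in range(21)]            # eps = 0, 0.05, ..., 1
values = [phi_star(e) for e in grid]
first_diff = [b - a for a, b in zip(values, values[1:])]
second_diff = [b - a for a, b in zip(first_diff, first_diff[1:])]

nondecreasing = all(d >= 0 for d in first_diff)
convex_on_grid = all(d >= 0 for d in second_diff)
print(nondecreasing, convex_on_grid, values[0], values[-1])
```

At $ε = 0$ the value equals the optimal performance index $ρ^{*} = 3/11$, and at $ε = 1$ it equals the largest average performance ratio.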

Proof of Lemma 6.

(i)/(iv) The function $Φ_{ε} (\cdot)$ conforms to Equation (1) as a weighted objective comprising the two criteria $ρ (\cdot)$ and $μ (\cdot)$. By Proposition 1, increasing $ε \in [0, 1)$, which shifts weight from the first criterion to the second, can only decrease the optimal value of the first and increase that of the second. Thus, $ρ_{ε}$ is nonincreasing and $μ_{ε}$ is nondecreasing in $ε \in [0, 1]$, as claimed. (ii) Because $Φ_{0} = ρ$, we have that $ρ_{0} = \max_{x \in X} Φ_{0} (x) = ρ^{*}$. Hence, by part (i) we have that $ρ^{*} \geq ρ_{ε}$, for all $ε \in [0, 1]$. (iii) Because $ρ_{ε}$ is bounded above by $ρ^{*}$ and nonincreasing in $ε$, the family ${(ρ_{ε})}_{ε \in [0, 1]}$ converges as $ε \to 0^{+}$ by the monotone convergence theorem (see, e.g., Rudin 1976, theorem 3.14, p. 55). Moreover, because $ρ^{*}$ is the least upper bound, the limit of $ρ_{ε}$ as $ε \to 0^{+}$ must equal $ρ^{*}$. (v) This result follows from the monotonicity of $Φ_{ε}^{*}$ established in Lemma 5. Because $Φ_{ε}^{*} = (1 - ε) ρ_{ε} + ε μ_{ε}$ is nondecreasing in $ε$, and $ρ_{ε} \leq ρ^{*}$, by parts (i) and (ii), it must be that $μ_{ε} \geq ρ^{*}$ for all $ε \in [0, 1]$; otherwise, the convex combination would decrease, contradicting monotonicity. □

Proof of Proposition 4.

By Equations (15) and (16), along with Lemma 5, we have

0 \leq ψ_{ε} = ρ^{*} - ρ_{ε} = Φ_{0}^{*} - \frac{Φ_{ε}^{*} - ε μ_{ε}}{1 - ε} \leq Φ_{0}^{*} - Φ_{ε}^{*} + ε μ_{ε} \leq ε μ_{ε}, ε \in [0, 1] .

Thus, given a desired approximation-error bound $δ > 0$ , for any $\hat{ε} \in (0, 1]$ , it is

ε \leq \min {\hat{ε}, δ / μ_{\hat{ε}}} \Rightarrow ψ_{ε} \leq δ .

Because, by Lemma 6 (iv), the function $\hat{ε} \mapsto δ / μ_{\hat{ε}}$ is decreasing, a simpler implication follows by choosing $\hat{ε} = 1$ :

ε \leq δ / μ_{1} \Rightarrow ψ_{ε} \leq δ .

This establishes the claimed approximation guarantee. □
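The simpler implication $ε \leq δ / μ_{1} \Rightarrow ψ_{ε} \leq δ$ can be illustrated with the ratios from Table 2. In the following sketch the tolerance values in `deltas` are arbitrary choices, the function name `shortfall` is hypothetical, and ties in the maximization (none occur here) would be broken by taking the first maximizer. For each tolerance, the sketch sets $ε = δ / μ_{1}$, maximizes the augmented index, and reports the resulting shortfall $ψ_{ε} = ρ^{*} - ρ_{ε}$.

```python
from fractions import Fraction as Fr

# Performance ratios for options j = 1..5, copied from Table 2.
phi = [
    (Fr(1), Fr(10, 11), Fr(4, 15)),
    (Fr(7, 24), Fr(3, 11), Fr(8, 15)),
    (Fr(1, 24), Fr(1, 11), Fr(1)),
    (Fr(11, 12), Fr(1), Fr(4, 15)),
    (Fr(1, 2), Fr(3, 11), Fr(3, 5)),
]
rho = [min(p) for p in phi]                 # performance index per option
mu = [sum(p) / len(p) for p in phi]         # average performance per option
rho_star = max(rho)
mu_1 = max(mu)                              # mu_eps evaluated at eps = 1


def shortfall(eps):
    """psi_eps = rho* - rho(x_eps) for a maximizer x_eps of the augmented index."""
    aug = [(1 - eps) * r + eps * m for r, m in zip(rho, mu)]
    j = max(range(len(phi)), key=lambda k: aug[k])
    return rho_star - rho[j]


deltas = [Fr(1, 1000), Fr(1, 200), Fr(1, 20), Fr(1, 2)]   # desired tolerances
results = [(d, d / mu_1, shortfall(d / mu_1)) for d in deltas]
for d, eps, psi in results:
    print(f"delta={float(d):.4f}  eps={float(eps):.4f}  psi={float(psi):.4f}")
```

In this example the shortfall is zero for the two smallest tolerances and equals $3/11 - 4/15 = 1/165$ for the two larger ones, which stays below the respective tolerance in every case.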

Proof of Lemma 7.

Fix any $ε \in [0, 1]$ . By Equation (16), the difference between the upper and lower bounds in Equation (17) is given by

d_{ε} = Φ_{ε}^{*} - ρ_{ε} = ε (μ_{ε} - ρ_{ε}) \geq 0 .

The midpoint estimator ${\hat{ρ}}_{ε}$ defined in Equation (18) is the arithmetic average of these bounds:

{\hat{ρ}}_{ε} = \frac{Φ_{ε}^{*} + ρ_{ε}}{2} .

By construction, this implies

| {\hat{ρ}}_{ε} - ρ^{*} | \leq \frac{1}{2} (Φ_{ε}^{*} - ρ_{ε}) = \frac{d_{ε}}{2} .

Therefore, ${\hat{ρ}}_{ε}$ approximates $ρ^{*}$ to within $d_{ε} / 2$ , as claimed. □

Proof of Lemma 8.

Let $Ω \subset R_{+}^{n} \setminus \{0\}$ be a nonempty compact set, which is not reduced to the origin.

(a) Because $Ω$ is nonempty, its radial projection $π (Ω)$ is nonempty as well. Moreover, $π$ is continuous on the compact set $Ω$ , so $π (Ω)$ is compact (Apostol 1974, theorem 4.25, p. 82). We now show that $π (Ω) = C (Ω) \cap Δ$ . For the inclusion $π (Ω) \subset C (Ω) \cap Δ$ , take any $λ \in π (Ω)$ . Then there exists $w \in Ω$ such that $π (w) = λ$ , i.e., $λ = w / ‖ w ‖_{1}$ . Because $Ω \subset R_{+}^{n} \ {0}$ , it follows that for any $α > 0$ , we have $α w \in C (Ω)$ , and in particular, $λ = α w$ for $α = 1 / ‖ w ‖_{1}$ . Hence $λ \in C (Ω) \cap Δ$ . Conversely, suppose $\hat{λ} \in C (Ω) \cap Δ$ . Then there exists $\hat{α} > 0$ such that $\hat{w} = \hat{α} \hat{λ} \in Ω$ . Applying the radial projection yields $π (\hat{w}) = \hat{w} / ‖ \hat{w} ‖_{1} = \hat{λ}$ , because $‖ \hat{λ} ‖_{1} = 1$ . Therefore, $\hat{λ} \in π (Ω)$ , proving $C (Ω) \cap Δ \subset π (Ω)$ . Together, we obtain $π (Ω) = C (Ω) \cap Δ$ . (b) Because $\partial Ω \subset Ω$ , we trivially have $π (\partial Ω) \subset π (Ω)$ . To prove the reverse inclusion, let $\hat{λ} \in π (Ω)$ . Then $\hat{λ} = π (\hat{w})$ for some $\hat{w} = \hat{α} \hat{λ} \in Ω$ , with $\hat{α} > 0$ . Because $Ω$ is compact and $0 \notin Ω$ , the ray $α \hat{λ}$ intersects $Ω$ in a bounded interval of positive length. That is, there exist real numbers $\underline{α}, \bar{α}$ , with $0 < \underline{α} \leq \hat{α} \leq \bar{α} < \infty$ , such that $\underline{α} \hat{λ}, \bar{α} \hat{λ} \in \partial Ω$ . Hence, $\hat{λ} = π (\underline{α} \hat{λ}) \in π (\partial Ω)$ , so $π (Ω) \subset π (\partial Ω)$ , completing the proof that $π (Ω) = π (\partial Ω)$ .
Let $w \in int (Ω)$ . Because $Ω \subset C (Ω)$ , it follows that $w \in int (C (Ω))$ . Because $Δ$ is a smooth manifold and the intersection of an open set with it is open (in the relative topology), we obtain that $π (int (Ω)) = int (C (Ω)) \cap Δ$ is open in $Δ$ ,⁴⁸ and thus $π (w) \in int (π (Ω))$ . This proves that the radial projection maps interior points of $Ω$ to interior points of $π (Ω)$ . It follows that if $Ω^{'} \subset Ω$ is open in $R_{+}^{n}$ , then $π (Ω^{'})$ is open in $Δ$ .
Let $Ω^{'} \subset Ω$ be nonempty. Then $π (Ω^{'}) \subset Δ$ is nonempty. Its closure is compact because $Ω^{'}$ is contained in the compact set $Ω$ . Using continuity of $π$ and part (i) (b) we have that
$\partial π (Ω^{'}) = \partial (cl (π (Ω^{'}))) = \partial π (cl (Ω^{'})) = \partial π (\partial (cl (Ω^{'}))) = \partial π (\partial Ω^{'}) .$
Hence, the boundary of $π (Ω^{'})$ equals the boundary of the projection of the boundary of $Ω^{'}$ .
(iv) Suppose $Ω$ is convex. If $π (Ω)$ were not convex, then there would exist $λ^{'}, λ^{″} \in π (Ω)$ and $μ \in (0, 1)$ such that $λ = μ λ^{'} + (1 - μ) λ^{″} \notin π (Ω)$. Let $w^{'} \in π^{- 1} (\{λ^{'}\})$, and $w^{″} \in π^{- 1} (\{λ^{″}\})$. Then for any $α > 0$,
$α λ = \frac{α μ}{‖ w^{'} ‖_{1}} w^{'} + \frac{α (1 - μ)}{‖ w^{″} ‖_{1}} w^{″} .$
In particular, define
$α = {\left( \frac{μ}{‖ w^{'} ‖_{1}} + \frac{1 - μ}{‖ w^{″} ‖_{1}} \right)}^{- 1} > 0, \quad \text{and} \quad \hat{μ} = \frac{α μ}{‖ w^{'} ‖_{1}} \in (0, 1) .$
Then
$α λ = \hat{μ} w^{'} + (1 - \hat{μ}) w^{″} \in Ω,$
by convexity of $Ω$ . Hence, $π (α λ) = λ \in π (Ω)$ , contradicting our assumption. Therefore, $π (Ω)$ must be convex.
(v) We now show that the image $π (Ω^{'})$ of a straight line segment $Ω^{'} \subset Ω$ is a straight line segment in $Δ$, with the possibility of a degenerate case when the straight line segment is projected to a single point in $Δ$. Indeed,
$\begin{array}{l} π (\hat{μ} w^{'} + (1 - \hat{μ}) w^{″}) = \frac{\hat{μ} w^{'} + (1 - \hat{μ}) w^{″}}{‖ \hat{μ} w^{'} + (1 - \hat{μ}) w^{″} ‖_{1}} = \frac{\hat{μ} ‖ w^{'} ‖_{1} λ^{'} + (1 - \hat{μ}) ‖ w^{″} ‖_{1} λ^{″}}{‖ \hat{μ} w^{'} + (1 - \hat{μ}) w^{″} ‖_{1}} \\ = \frac{\hat{μ} ‖ w^{'} ‖_{1}}{\hat{μ} ‖ w^{'} ‖_{1} + (1 - \hat{μ}) ‖ w^{″} ‖_{1}} λ^{'} + (1 - \frac{\hat{μ} ‖ w^{'} ‖_{1}}{\hat{μ} ‖ w^{'} ‖_{1} + (1 - \hat{μ}) ‖ w^{″} ‖_{1}}) λ^{″} \\ = μ (\hat{μ}) λ^{'} + (1 - μ (\hat{μ})) λ^{″}, \end{array}$
where $λ^{'} = w^{'} / ‖ w^{'} ‖_{1}, λ^{″} = w^{″} / ‖ w^{″} ‖_{1}$ , and
$μ (\hat{μ}) = \frac{\hat{μ} ‖ w^{'} ‖_{1}}{\hat{μ} ‖ w^{'} ‖_{1} + (1 - \hat{μ}) ‖ w^{″} ‖_{1}}, \hat{μ} \in [0, 1] .$
The latter function is continuously differentiable, with
$\frac{d μ (\hat{μ})}{d \hat{μ}} = \frac{‖ w^{'} ‖_{1} ‖ w^{″} ‖_{1}}{{(\hat{μ} ‖ w^{'} ‖_{1} + (1 - \hat{μ}) ‖ w^{″} ‖_{1})}^{2}} > 0, \hat{μ} \in [0, 1],$
which implies that $μ (\cdot)$ is strictly increasing on [0, 1]. Thus, provided that $λ^{'} \neq λ^{″}$ , the radial projection of a straight line is a bijection, which also preserves the orientation of any straight path between two different points in $Ω$ , as long as they do not map to the same point in $Δ$ (for $λ^{'} = λ^{″}$ ).
(vi) Consider first the case where $co (Ω)$ is a bounded polyhedron (i.e., a polytope; see, e.g., Nemhauser and Wolsey 1999, definition 2.2, p. 86). By part (i) (b), we have $π (co (Ω)) = π (\partial (co (Ω)))$. By part (iv), the set $π (co (Ω))$ is convex. Part (v) then guarantees that the straight lines between any two vertices of $co (Ω)$ remain straight lines in their radial projection onto $Δ$, which implies it is enough to project the vertices of $co (Ω)$ and then take the convex hull. Because the vertices are included in $\partial Ω$, we therefore obtain that $π (co (Ω)) = co (π (\partial Ω))$ as claimed. If the bounded convex set $co (Ω)$ is not a polytope, then it can be approximated arbitrarily closely by a polytope (see, e.g., Bronstein 2008), so that the claim (omitting some of the convergence-specific details) follows in the limit.
(vii) Let $Ω^{'}, Ω^{″} \subset R_{+}^{n}$ such that $Ω = Ω^{'} \cup Ω^{″}$. Then $C (Ω^{'} \cup Ω^{″}) = C (Ω^{'}) \cup C (Ω^{″})$, which implies the claim by part (i) (a) (cf. Endnote 48): $π (Ω) = (C (Ω^{'}) \cap Δ) \cup (C (Ω^{″}) \cap Δ) = π (Ω^{'}) \cup π (Ω^{″})$.
(viii) Let $Ω^{'}, Ω^{″} \subset R_{+}^{n}$ such that $Ω = Ω^{'} \cap Ω^{″}$. Then, by part (i) (a), we have $π (Ω) = C (Ω^{'} \cap Ω^{″}) \cap Δ$, as claimed. □
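
The constructions used for parts (iv) and (v) can be checked numerically. The following is a minimal sketch, not part of the original proof: the points `w1` and `w2` are hypothetical, the helper names are illustrative, and exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction as Fr

def l1(w):
    # 1-norm of a vector
    return sum(abs(c) for c in w)

def radial_projection(w):
    # pi(w) = w / ||w||_1, defined for nonzero w in the nonnegative orthant
    s = l1(w)
    if s == 0:
        raise ValueError("radial projection is undefined at the origin")
    return tuple(c / s for c in w)

def mix(t, a, b):
    # convex combination t*a + (1 - t)*b
    return tuple(t * p + (1 - t) * q for p, q in zip(a, b))

# Hypothetical data (assumption): two points of a convex set Omega
w1 = (Fr(3), Fr(1), Fr(0))
w2 = (Fr(1), Fr(1), Fr(6))
lam1, lam2 = radial_projection(w1), radial_projection(w2)

# Part (iv): a mixture of the projections is the projection of a mixture
mu = Fr(1, 3)
lam = mix(mu, lam1, lam2)
alpha = 1 / (mu / l1(w1) + (1 - mu) / l1(w2))
mu_hat = alpha * mu / l1(w1)
w_mix = mix(mu_hat, w1, w2)  # lies in Omega whenever Omega is convex
assert 0 < mu_hat < 1
assert w_mix == tuple(alpha * c for c in lam)
assert radial_projection(w_mix) == lam

# Part (v): the reparametrization mu(mu_hat) is strictly increasing on [0, 1]
def mu_of(t):
    num = t * l1(w1)
    return num / (num + (1 - t) * l1(w2))

grid = [Fr(k, 10) for k in range(11)]
values = [mu_of(t) for t in grid]
assert all(a < b for a, b in zip(values, values[1:]))
# The projected segment is the segment between the projected endpoints
assert all(radial_projection(mix(t, w1, w2)) == mix(mu_of(t), lam1, lam2)
           for t in grid)
```

For these data, $‖ w^{'} ‖_{1} = 4$ and $‖ w^{″} ‖_{1} = 8$, so $μ = 1/3$ gives $α = 6$ and $\hat{μ} = 1/2$.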

Proof of Proposition 5.

Recall that the ambiguity set $Ω \subset R_{+}^{n}$ is compact, nonempty, and not reduced to the origin. For any feasible decision $x \in X$ , the $Ω$ -conditioned performance index corresponds to its worst-case relative performance. Using the radial projection of $Ω$ and invoking Lemma 8 (i), we obtain:⁴⁹

$ρ (x | Ω) = \min_{w \in Ω} \frac{F (x | w)}{F^{*} (w)} = \min_{λ \in π (Ω)} \frac{F (x | λ)}{F^{*} (λ)} = \min_{λ \in π (\partial Ω)} \frac{F (x | λ)}{F^{*} (λ)} .$

As shown in the proof of Proposition 2, the performance ratio $φ (x | λ) = F (x | λ) / F^{*} (λ)$ is quasiconcave in $λ$ for fixed $x \in X$ . Because $π (Ω)$ is compact, the minimum is attained on the boundary $\partial π (Ω)$ , which equals $π (\partial Ω)$ by Lemma 8 (i) (b). Moreover, the quasiconcavity of $φ (x | \cdot)$ implies that all upper contour sets are convex, so the minimum is also attained on the boundary of the convex hull $co (π (\partial Ω))$ . Thus, if there exists an extremal base $Λ$ such that $co (Λ) = co (π (Ω))$ , then the minimum is attained on the compact set $Λ$ , as claimed. □
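
The reduction to an extremal base can be illustrated on a small finite action set. The sketch below is an assumption-laden example rather than the paper's implementation: the criterion vectors in `outcomes` are made up, the projected ambiguity set is taken to be a line segment in $Δ$ (so its extremal base consists of the two endpoints), and the grid search only serves as a brute-force comparison.

```python
from fractions import Fraction as Fr

# Hypothetical criterion vectors f(x) for three actions and n = 2 criteria
outcomes = {"a": (Fr(4), Fr(1)), "b": (Fr(3), Fr(3)), "c": (Fr(1), Fr(4))}

def F(x, lam):
    # weighted-sum objective F(x | lambda)
    return sum(l * f for l, f in zip(lam, outcomes[x]))

def F_star(lam):
    # best attainable weighted objective F*(lambda)
    return max(F(x, lam) for x in outcomes)

def ratio(x, lam):
    # performance ratio F(x | lambda) / F*(lambda)
    return F(x, lam) / F_star(lam)

def rho(x, weights):
    # worst-case performance ratio over a finite collection of weights
    return min(ratio(x, lam) for lam in weights)

# Assumed extremal base: endpoints of the segment pi(Omega) in the simplex
base = [(Fr(1, 4), Fr(3, 4)), (Fr(3, 4), Fr(1, 4))]
# Brute-force discretization of the same segment (includes both endpoints)
grid = [(Fr(k, 40), 1 - Fr(k, 40)) for k in range(10, 31)]

for x in outcomes:
    # quasiconcavity in lambda: the minimum over the segment sits at an endpoint
    assert rho(x, base) == rho(x, grid)

best = max(outcomes, key=lambda x: rho(x, base))
print(best, rho(best, base))
```

With these numbers, action `b` attains the largest conditioned performance index, 12/13, whereas `a` and `c` each guarantee only 7/13.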

Proof of Proposition 6.

Fix any $θ \in Θ$ , and consider the state-contingent $Ω^{'}$ -conditioned performance index,

$ϕ (x, θ | Ω^{'}) = \min_{λ^{'} \in π (Ω^{'})} \frac{G (x, θ | λ^{'})}{G^{*} (θ | λ^{'})}, \quad x \in X,$ (A.6)

which measures the performance of an action relative to all feasible weighted-sum scalarizations of the multicriteria decision problem, in state $θ$. Because $Λ^{'}$ is an extremal base of $π (Ω^{'})$, we have $co (Λ^{'}) = co (π (Ω^{'}))$. The quasiconcavity of $G (x, θ | λ^{'}) / G^{*} (θ | λ^{'})$ in $λ^{'}$ for fixed $(x, θ)$, which obtains as in the proof of Proposition 2, ensures that the minimum in Equation (A.6) is attained at an extreme point of $co (π (Ω^{'}))$. Thus, as in Proposition 5, we obtain an equivalent representation,

$ϕ (x, θ | Ω^{'}) = \min_{λ^{'} \in Λ^{'}} \frac{G (x, θ | λ^{'})}{G^{*} (θ | λ^{'})}, \quad x \in X .$ (A.7)

By Equation (27), the performance index is the minimum of the state-contingent $Ω^{'}$-conditioned performance index in Equation (A.7) over all $θ \in Θ$, that is,

$ρ (x | Ω^{'}, Θ) = \min_{θ \in Θ} ϕ (x, θ | Ω^{'}) = \min_{θ \in Θ} \left\{ \min_{λ^{'} \in Λ^{'}} \frac{G (x, θ | λ^{'})}{G^{*} (θ | λ^{'})} \right\}, \quad x \in X .$ (A.8)

Thus, reversing the order of minimization in Equation (A.8) yields Equation (27), as claimed. □
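
The interchange of the two minimizations can be made concrete with a small table. The following sketch rests on assumptions that are not taken from the paper: the state-contingent criterion values in `g`, the state labels, and the two-element extremal base are all hypothetical, and $G$ is modeled as a weighted sum of the state-contingent criteria.

```python
from fractions import Fraction as Fr

# Hypothetical criterion values g[theta][x] for two states and two actions
g = {
    "low":  {"a": (Fr(4), Fr(1)), "b": (Fr(2), Fr(2))},
    "high": {"a": (Fr(1), Fr(3)), "b": (Fr(3), Fr(2))},
}
# Assumed finite extremal base of the projected weight set
base = [(Fr(1), Fr(0)), (Fr(1, 2), Fr(1, 2))]

def G(x, theta, lam):
    # weighted objective in state theta (illustrative weighted-sum form)
    return sum(l * v for l, v in zip(lam, g[theta][x]))

def ratio(x, theta, lam):
    # state-contingent performance ratio G / G*
    return G(x, theta, lam) / max(G(z, theta, lam) for z in g[theta])

def rho_states_outer(x):
    # min over states of (min over weights), as in Equation (A.8)
    return min(min(ratio(x, th, lam) for lam in base) for th in g)

def rho_weights_outer(x):
    # min over weights of (min over states), the reversed order
    return min(min(ratio(x, th, lam) for th in g) for lam in base)

for x in ("a", "b"):
    assert rho_states_outer(x) == rho_weights_outer(x)
    print(x, rho_states_outer(x))
```

Both orders give 1/3 for action `a` and 1/2 for action `b`, because each expression is the minimum of the same finite collection of ratios.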

Proof of Corollary 1.

Let $Λ^{'} = \{λ^{' (1)}, \dots, λ^{' (n)}\}$ be the (smallest) extremal base of $π (Ω^{'})$ (or, equivalently, of $co (π (Ω^{'}))$), which is finite by hypothesis, with $n \geq 1$. Then Equation (27) becomes

$ρ (x | Ω^{'}, Θ) = \min_{λ^{'} \in Λ^{'}} \left\{ \min_{θ \in Θ} \frac{G (x, θ | λ^{'})}{G^{*} (θ | λ^{'})} \right\} = \min_{i \in N} \left\{ \min_{θ \in Θ} \frac{G (x, θ | λ^{' (i)})}{G^{*} (θ | λ^{' (i)})} \right\} = \min_{i \in N} \{ f_{i} (x) \}, \quad x \in X,$ (A.9)

where we have set $f_{i} (x) = \min_{θ \in Θ} \{ G (x, θ | λ^{' (i)}) / G^{*} (θ | λ^{' (i)}) \}$, for all $(x, i) \in X \times N$. Because by definition $\max_{x \in X} G (x, θ | λ^{' (i)}) = G^{*} (θ | λ^{' (i)})$, it is $f_{i}^{*} = \max_{x \in X} f_{i} (x) = 1$, for all $i \in N$, so that Equation (A.9) is in fact equivalent to Equation (28), which completes the proof. □

Proof of Lemma 9.

Fix $i, j, m \in N$ (with $m \neq n$ ), $λ \in Δ$ , and $δ > 0$ . Then for any $y \in Y$ : If $\hat{y} = y + δ e_{i} \in Y$ , then

$H (y | λ) = h^{- 1} \left( \sum_{l = 1}^{n} λ_{l} h (y_{l}) \right) = h^{- 1} \left( \sum_{l = 1}^{n} λ_{l} h ({\hat{y}}_{l}) - λ_{i} (h ({\hat{y}}_{i}) - h (y_{i})) \right) \begin{cases} < H (\hat{y} | λ), & \text{if } λ_{i} > 0, \\ = H (\hat{y} | λ), & \text{if } λ_{i} = 0 . \end{cases}$

This implies Axiom 1.⁵⁰ Consider now the case where $(\hat{y}, \hat{λ}) = (y, λ) + (y_{i} - y_{j}, λ_{i} - λ_{j}) (e_{j} - e_{i})$ . Then for $i = j$ , Axiom 2 holds trivially. For $i \neq j$ , it is

$H (\hat{y} | \hat{λ}) = h^{- 1} \left( \sum_{l \in N \setminus \{i, j\}} λ_{l} h (y_{l}) + λ_{i} h (y_{i}) + λ_{j} h (y_{j}) \right) = H (y | λ),$

which also yields Axiom 2. Because $λ$ is by assumption normalized, we also obtain that Axiom 3 holds, as

$H (α 1_{n}) = h^{- 1} \left( \sum_{l = 1}^{n} λ_{l} h (α) \right) = h^{- 1} (h (α)) = α,$

for any $α > 0$. As far as associativity is concerned, let us assume that $(λ_{1}, \dots, λ_{m}) \neq 0$, and let

$α = H \left( \sum_{l = 1}^{m} y_{l} e_{l} \,\Big|\, \frac{\sum_{l = 1}^{m} λ_{l} e_{l}}{\sum_{l = 1}^{m} λ_{l}} \right) = h^{- 1} \left( \frac{\sum_{l = 1}^{m} λ_{l} h (y_{l})}{\sum_{l = 1}^{m} λ_{l}} \right) .$

With this, one obtains

$H (α 1_{m}, y_{m + 1}, \dots, y_{n} | λ) = h^{- 1} \left( h \circ h^{- 1} \left( \frac{\sum_{l = 1}^{m} λ_{l} h (y_{l})}{\sum_{l = 1}^{m} λ_{l}} \right) \cdot \left( \sum_{l = 1}^{m} λ_{l} \right) + \sum_{l = m + 1}^{n} λ_{l} h (y_{l}) \right) = H (y | λ),$

whence Axiom 4 must hold (taking into account that $h \circ h^{- 1} (\cdot) = (\cdot)$). Finally, we observe that

$H (y | e_{i}) = h^{- 1} \circ h (y_{i}) = y_{i},$

so the coordinate-filter requirement, which constitutes Axiom 5, is also satisfied. This concludes the proof. □
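
The identities used in this proof can be verified on a concrete kernel. The sketch below is illustrative only: it assumes the strictly decreasing kernel $h (y) = 1 / y$ on positive reals (which yields the weighted harmonic mean), uses hypothetical values for `y` and `lam`, and relies on exact rational arithmetic so that the equalities hold without rounding.

```python
from fractions import Fraction as Fr

def H(y, lam, h, h_inv):
    # quasi-arithmetic mean H(y | lambda) = h^{-1}(sum_l lambda_l * h(y_l))
    return h_inv(sum(l * h(v) for l, v in zip(lam, y)))

def h(v):
    # assumed kernel, strictly decreasing on positive reals
    return 1 / v

def h_inv(z):
    # inverse of the assumed kernel
    return 1 / z

# Hypothetical outcome vector and normalized weights
y = (Fr(1), Fr(2), Fr(4))
lam = (Fr(1, 2), Fr(1, 4), Fr(1, 4))

# Axiom 1 (monotonicity): raising y_1 raises H because lambda_1 > 0
y_up = (y[0] + 1,) + y[1:]
assert H(y_up, lam, h, h_inv) > H(y, lam, h, h_inv)

# Axiom 3: a constant vector is mapped to its common value
assert H((Fr(5),) * 3, lam, h, h_inv) == 5

# Axiom 4 (associativity) with m = 2: replace the first m outcomes
# by their partial mean alpha, computed with renormalized weights
m = 2
s = sum(lam[:m])
partial_weights = tuple(l / s for l in lam[:m]) + (Fr(0),) * (len(y) - m)
alpha = H(y, partial_weights, h, h_inv)
assert H((alpha,) * m + y[m:], lam, h, h_inv) == H(y, lam, h, h_inv)

# Axiom 5 (coordinate filter): a unit weight vector returns that coordinate
e2 = (Fr(0), Fr(1), Fr(0))
assert H(y, e2, h, h_inv) == y[1]
```

For these values, $H (y | λ) = 16 / 11$ and the partial mean of the first two outcomes is $α = 6 / 5$.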

Endnotes

¹ Practical examples for multicriteria decision making are legion; see, for example, the collections of case studies by Berbel et al. (2018) in agriculture, Masri et al. (2018) in financial decision making, and Ravindran (2016) for supply chain management, to name just a few.

² Extant ESG metrics differ widely. In their approach to “quantifying the impact of impact investing,” Lo and Zhang (2024) remain agnostic about the impact factors to be used, taking them as given and thus keeping at bay the difficulties of attributing weights to different ESG criteria and of dealing with this model uncertainty.

³ Lancaster (1966) already noted that a product can be viewed as a bundle of its attributes, with consumer valuations often empirically assessed using conjoint analysis (see, e.g., Green and Srinivasan 1990).

⁴ Example 9 in Section 3.6 discusses robust allocation in a two-agent exchange economy using the proposed framework.

⁵ This is notwithstanding the “norm equivalence” in a finite-dimensional Euclidean space, in the sense that any norm $‖ \cdot ‖$ can be bracketed by any other norm $\hat{‖} \cdot \hat{‖}$ , so $χ_{1} \hat{‖} \cdot \hat{‖} \leq ‖ \cdot ‖ \leq χ_{2} \hat{‖} \cdot \hat{‖}$ for suitable scalars $χ_{1}, χ_{2} > 0$ .

⁶ For the computation of Pareto sets, see Kung et al. (1975), Gabow et al. (1984), and Bentley et al. (1993).

⁷ See Yamamoto (2002) for details about maximizing a function on a Pareto-efficient set in a polyhedral setting.

⁸ A lexicographic evaluation of criteria based on perceived importance may justify an “elimination by aspects” (Tversky 1972), or more nuanced partial elimination heuristics using “attribute filters” (Kimya 2018).

⁹ Numerous ad hoc methods for determining weights exist, for example, in reliability engineering based on standards (Jiang and Chen 2020).

¹⁰ In his “Discussion on Making Things Equal,” Zhuang Zhou points out that “[t]here is nothing in the world bigger than the tip of an autumn hair, and Mount T’ai is tiny” (Watson 1968, p. 44), where Mount Tai is the highest point in the Shandong province of China.

¹¹ For example, getting $100 in a month, in addition to repayment of the invested principal, would be attractive if obtained by investing a principal of $1 today, but not if it required investing a principal of $10 million today.

¹² A point $x \in X$ is said to be isolated if there exists an open set $O \subset R^{m}$ such that $O \cap X = \{x\}$.

¹³ By the extreme-value theorem (Rudin 1976, theorem 4.16, p. 89), the minimum of a continuous function on a compact set exists and is finite.

¹⁴ It is $R_{+}^{n} \setminus \{0\} = \cup_{α > 0} α Δ$.

¹⁵ More precisely, as shown in the proof of Lemma 1, we have $F^{*} (\hat{w}) \geq F^{*} (w) + δ (\max f_{i} (X (w)))$ .

¹⁶ This “complete ignorance” (or full ambiguity) is relaxed in Section 3.7 where we allow the decision maker to face limited ambiguity.

¹⁷ Because $F^{*} (λ) \geq \min F^{*} (Δ) > 0$ , the function $φ : X \times Δ \to [0, 1]$ is well defined.

¹⁸ Leontief (1941) employed this aggregation in fixed proportions as a simplification for his analysis of a larger economy.

¹⁹ By the extreme-value theorem, $Ψ$ is nonempty, and by the maximum theorem, it is compact.

²⁰ Given two vectors $z, \hat{z} \in R^{n}$, where $z = (z_{1}, \dots, z_{n})$ and $\hat{z} = ({\hat{z}}_{1}, \dots, {\hat{z}}_{n})$, the standard vector inequalities are defined as follows: (i) $z \leq \hat{z} \Leftrightarrow z_{i} \leq {\hat{z}}_{i}, \forall i \in N$; (ii) $z ≪ \hat{z} \Leftrightarrow z_{i} < {\hat{z}}_{i}, \forall i \in N$; (iii) $z < \hat{z} \Leftrightarrow (z \leq \hat{z} \text{ and } \exists j \in N \text{ such that } z_{j} < {\hat{z}}_{j})$. Here, $N = \{1, \dots, n\}$, with $n \geq 2$.

²¹ The relevant lower (set) limit is given by ${\underline{Lim}}_{ε \to 0^{+}} ℛ_{ε} = \{x \in X : \lim_{ε \to 0^{+}} (\inf_{x^{'} \in ℛ_{ε}} ‖ x - x^{'} ‖) = 0\}$ (see, e.g., Aubin and Frankowska 1990, definition 1.4.6, p. 41). Given a sequence ${(ε_{k})}_{k = 1}^{\infty} \subset (0, 1]$ with $\lim_{k \to \infty} ε_{k} = 0$, this lower limit contains the accumulation points of any sequence ${(x_{k})}_{k = 1}^{\infty}$ with elements $x_{k} \in ℛ_{ε_{k}}$, for all $k \geq 1$.

²² Note that $Φ_{ε} (2, \frac{3}{2}) - \max_{x_{1} \in [0, 1]} \left\{ \frac{(1 - ε) x_{1}}{2} + \frac{ε}{2} \left( \frac{x_{1}}{2} + \frac{2 + \sqrt{1 - x_{1}}}{3} \right) \right\} = \frac{ε}{3} \frac{1 - (7 / 12) ε}{2 - ε} > 0$, for all $ε \in (0, 1]$.

²³ The larger (path-connected, compact) action set $X^{'} = X \cup \{(2, \frac{3}{2}) ζ : ζ \in [0, 1]\}$ leaves results unchanged.

²⁴ It is $\frac{1}{24} (16 - 3 ε^{2}) / (2 - ε) - Φ_{ε} (3, \frac{1}{2}) = \frac{1}{24} (8 - 16 ε + 7 ε^{2}) / (2 - ε) > 0$ if and only if $ε < \frac{4}{7} (2 - \frac{1}{\sqrt{2}})$ .

²⁵ Normalizing the maximum criteria to the same score (e.g., $c \in \{1, 10, 100\}$) is natural in many applications.

²⁶ The term “robustness equivalence principle” is analogous to the well-known “certainty equivalence principle,” for example, in linear-quadratic optimal control problems (Bertsekas 1995, p. 23), where it is optimal (i.e., maximizing the expected value of a quadratic objective functional) to replace uncertain parameters by their means.

²⁷ For $α = β$ , Equations (22) and (23) imply the unique robust allocation $\hat{x} = (1 / 2, 1 / 2)$ , achieving $ρ^{*} = 1 / 2$ .

²⁸ The converse of part (iv) does not hold. That is, if $Ω$ is nonconvex, then $π (Ω)$ may still be convex.

²⁹ The analysis in Example 10 yields an extreme counterexample for $Ω = \{w\} = Ω^{'} \cap Ω^{″}$, with $Ω^{'} = [\underline{w}, w]$, $Ω^{″} = [w, \bar{w}]$, and $\underline{w} ≪ w ≪ \bar{w}$. Indeed, for $‖ \underline{w} ‖_{1} \to 0^{+}$ one obtains $int (Δ) \subset π (Ω^{'})$, so $π (Ω^{'}) \cap π (Ω^{″}) = π (Ω^{″})$, whereas $π (Ω) = π (\{w\})$ is merely a singleton.

³⁰ The underlying justification is Minkowski’s theorem, which states that any compact convex set is equal to the convex hull of its extreme points; see, for example, Rockafellar (1970, corollary 18.5.1, p. 167) or Schneider (2014, corollary 1.4.5, p. 17).

³¹ In general, the smallest $Λ$ may not be finite (e.g., if $Ω \subset R_{+}^{3}$ is a unit ball, then the smallest $Λ = \partial π (Ω)$ has a continuum of elements).

³² For analogous comments about the assumed continuity of the criteria in the action, see Remark 1 (i) in Section 2.

³³ For example, in the case where $Θ = \{(1, 0), (0, 1)\}$ for $l = n^{'} = 2$, observing $θ_{1} = 1$ implies $θ_{2} = 0$ (and vice versa).

³⁴ For any $m \in N$ , we define $1_{m} = (1, \dots, 1) \in R_{+}^{m}$ .

³⁵ It is $(\sum_{i = 1}^{m} λ_{i} e_{i}) / (\sum_{i = 1}^{m} λ_{i}) = π (\sum_{i = 1}^{m} λ_{i} e_{i})$, with the radial projection $π (\cdot)$ from $R_{+}^{n} \setminus \{0\}$ onto $Δ$; see Section 3.8.

³⁶ Kolmogorov (1930) showed that if Axioms 1–4 are satisfied for $λ = 1_{n} / n$ (cf. Endnote 34), then there exists a (continuous) strictly monotonic function $h : D \to R$ , defined on a suitable domain $D \subset R$ , such that

$H (y | 1_{n} / n) = h^{- 1} \left( \sum_{i = 1}^{n} (1 / n) h (y_{i}) \right), \quad y \in Y .$

Nagumo (1930) obtained a similar result, and the preceding “quasiarithmetic mean” is therefore also known as the “Kolmogorov-Nagumo mean.”

³⁷ Homogeneity is not required by Axioms 1–5, because relatively robust decisions (cf. Proposition 2) are invariant under rescaling.

³⁸ The gradient of LSE is the softmax function (Boltzmann 1871; Gibbs 1902, chapter XIV), widely used in machine learning (Bridle 1990).

³⁹ The purpose of restricting all criteria to strictly positive values was merely to skip checking the nontriviality condition (N) in Section 2.1 which is satisfied here (e.g., $x_{d} = 2$ ). Hence, setting $s_{13} = 0$ poses no problem.

⁴⁰ Under some mild additional assumptions (which we omit), the law of large numbers would guarantee convergence of the average scores to the true criterion-function values, at least when the number of available score observations for each realized action (i.e., each element of $\hat{X}$ ) goes to infinity, assuming that decisions can be properly recognized, as may naturally be the case on a sufficiently discrete underlying action set $X$ .

⁴¹ Nathan and Lord (1983) propose “knowledge,” “delivery,” “relevance,” “interpersonal,” and “organization” as criteria for the quality of a university lecturer; see also DeNisi and Murphy (2017) for a survey.

⁴² This is not the only “energy trilemma”; Tilman et al. (2009) discuss a food-energy-environment trilemma in the context of biofuels.

⁴³ Klarman et al. (1968) recognized the need for quality-of-life considerations in cost-effective treatment options (for chronic renal disease).

⁴⁴ Cutler and Richardson (1997), Buxton (2005), and Newhouse (2021) discuss QALY for assessing cost-effectiveness in healthcare.

⁴⁵ Baker and Freeland (1975) provide an overview of early scoring models. Stewart (1991) proposes a multicriteria decision support system for project selection using essentially equal weights for different objectives (akin to the Laplacian approach; cf. Section 4.1). Finally, Hall et al. (2015) suggest a project evaluation with an “underperformance riskiness index” relying on (typically unavailable) past statistical data.

⁴⁶ Indeed, because $φ (x | λ) \geq \min_{i \in N} \{f_{i} (x) / \underline{F} (x_{d})\} = \underline{c} (x) \in [0, 1]$, for all $λ \in Δ$, Equation (A.2) implies that $c \leq \underline{c} (x) \Rightarrow U_{c} (x) = Δ$.

⁴⁷ Efficiency in the outcome set corresponds directly to efficiency in the action set (cf. Section 3.2), because the latter is evaluated via a criterion vector that produces points in the outcome set.

⁴⁸ The representation $π (Ω) = C (Ω) \cap Δ$ was given in part (i) for a compact set $Ω$ , but it also applies to any subset of $Ω$ .

⁴⁹ By part (a) of Lemma 8 (i), $π (Ω)$ is nonempty and compact, so the minimum with respect to $λ$ must be attained by the extreme-value theorem.

⁵⁰ When $h$ is increasing, the argument of $h^{- 1}$ cannot become smaller by dropping the term $λ_{i} (h ({\hat{y}}_{i}) - h (y_{i}))$. Conversely, when $h$ is decreasing, the argument of $h^{- 1}$ weakly decreases, so the value of that inverse cannot go down.

References

Apostol TM (1974) Mathematical Analysis, 2nd ed. (Addison-Wesley, Reading, MA).
Arrow KJ, Enthoven AC (1961) Quasi-concave programming. Econometrica 29(4):779–800.
Aubin J-P, Frankowska H (1990) Set-Valued Analysis (Birkhäuser, Boston).
Baker N, Freeland J (1975) Recent advances in R&D benefit measurement and project selection methods. Management Sci. 21(10):1164–1175.
Ben-David S, Borodin A (1994) A new measure for the study of on-line algorithms. Algorithmica 11(1):73–91.
Ben-Tal A, den Hertog D, Waegenaere AD, Melenberg B, Rennen G (2013) Robust solutions of optimization problems affected by uncertain probabilities. Management Sci. 59(2):341–357.
Bentley JL, Clarkson KL, Levine DB (1993) Fast linear expected-time algorithms for computing maxima and convex hulls. Algorithmica 9(2):168–183.
Berbel J, Bournaris T, Manos B, Matsatsinis N, Viaggi D, eds. (2018) Multicriteria Analysis in Agriculture: Current Trends and Recent Applications (Springer, New York).
Berge C (1963) Topological Spaces (Oliver and Boyd, Edinburgh, UK).
Bertsekas DP (1995) Dynamic Programming and Optimal Control, vol. 1 (Athena Scientific, Belmont, MA).
Blanchet J, Murthy K (2019) Quantifying distributional model risk via optimal transport. Math. Oper. Res. 44(2):565–600.
Boltzmann L (1871) Über das Wärmegleichgewicht zwischen mehratomigen Gasmolekülen. Sitzungsberichte der Mathematisch-Naturwissenschaftlichen Classe der Kaiserlichen Akademie der Wissenschaften, Wien 63(3):397–418.
Bridle JS (1990) Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. Fogelman Soulié F, Hérault J, eds. Neurocomputing: Algorithms, Architectures and Applications, NATO ASI Series, vol. 68 (Springer, New York), 227–236.
Bronstein EM (2008) Approximation of convex sets by polytopes. J. Math. Sci. 153(6):727–762. [Translation of Russian original, which appeared in Sovremennaya Matematika / Fundamental’nye Napravleniya (Contemporary Mathematics / Fundamental Directions), Vol. 22, Geometry, 2007.]
Buxton MJ (2005) How much are health-care systems prepared to pay to produce QALY? Eur. J. Health Econom. 6(4):285–287.
Charnes A, Cooper WW (1961) Management Models and Industrial Applications of Linear Programming, vol. 1 (Wiley, New York).
Charnes A, Cooper WW, Ferguson RO (1955) Optimal estimation of executive compensation by linear programming. Management Sci. 1(2):138–151.
Cournot A (1838) Recherches sur les Principes Mathématiques de la Théorie des Richesses (Hachette, Paris).
Cutler DM, Richardson E (1997) Measuring the health of the U.S. population. Brookings Papers Econom. Activity Microeconom. 28(1997):217–282.
Delage E, Ye Y (2010) Distributionally robust optimization under moment uncertainty with application to data-driven problems. Oper. Res. 58(3):595–612.
DeNisi AS, Murphy KR (2017) Performance appraisal and performance management: 100 years of progress? J. Appl. Psych. 102(3):421–433.
Edgeworth FY (1881) Mathematical Psychics (C. Kegan Paul, London).
Edgeworth FY (1897) La teoria pura del monopolio. Giornale degli Economisti 15(Anno 8):13–31.
Ehrgott M (2010) Multicriteria Optimization, 2nd ed. (Springer, Berlin).
Fechner GT (1860) Elemente der Psychophysik (Part II) (Breitkopf & Härtel, Leipzig, Germany).
Gabow HN, Bentley JL, Tarjan RE (1984) Scaling and related techniques for geometry problems. Proc. 16th Annual ACM Sympos. Theory Comput. (Association for Computing Machinery, New York), 135–143.
Gibbs JW (1902) Elementary Principles in Statistical Mechanics (General Publishing, Toronto).
Goel A, Meyerson A, Weber TA (2009) Fair welfare maximization. Econom. Theory 41(3):465–494.
Green PE, Srinivasan V (1990) Conjoint analysis in marketing: New developments with implications for research and practice. J. Marketing 54(4):3–19.
Hall NG, Long DZ, Qi J, Sim M (2015) Managing underperformance risk in project portfolio selection. Oper. Res. 63(3):660–675.
Han J, Weber TA (2023) Price discrimination with robust beliefs. Eur. J. Oper. Res. 306(2):795–809.
Hardy GH, Littlewood JE, Pólya G (1934) Inequalities (Cambridge University Press, Cambridge, UK).
Harsanyi JC (1955) Cardinal welfare, individualistic ethics, and interpersonal comparisons of utility. J. Political Econom. 63(4):309–321.
Hawkins TR, Gausen OM, Strømman AH (2012) Environmental impacts of hybrid and electric vehicles: A review. Internat. J. Life Cycle Assessment 17(8):997–1014.
Jiang R, Chen Z (2020) A standard-based approach for multi-criteria performance evaluation of engineered systems. Reliability Engrg. System Safety 202(107001):1–12.
Kahneman D, Tversky A (1979) Prospect theory: An analysis of decision under risk. Econometrica 47(2):263–291.
Kahneman D, Tversky A (1984) Choices, values, and frames. Amer. Psychologist 39(4):341–350.
Keeney RL, Raiffa H (1993) Decisions with Multiple Objectives (Cambridge University Press, Cambridge, UK).
Kimya M (2018) Choice, consideration sets, and attribute filters. Amer. Econom. J. Microeconom. 10(4):223–247.
Klarman HE, Francis JOS, Rosenthal GD (1968) Cost effectiveness analysis applied to the treatment of chronic renal disease. Medical Care 6(1):48–54.
Kolmogorov AN (1930) Sur la notion de la moyenne. Atti della Accademia Nazionale dei Lincei, Classe di Scienze Fisiche, Matematiche e Naturali Rendiconti 12(9):388–391.
Kouvelis P, Yu G (1997) Robust Discrete Optimization and Its Applications (Springer, New York).
Kung H, Luccio F, Preparata F (1975) On finding the maxima of a set of vectors. J. ACM 22(4):469–476.
Lancaster KJ (1966) A new approach to consumer theory. J. Political Econom. 74(2):132–157.
Leontief WW (1941) The Structure of American Economy 1919–1929: An Empirical Application of Equilibrium Analysis (Harvard University Press, Cambridge, MA).
Lo AW, Zhang R (2024) Quantifying the impact of impact investing. Management Sci. 70(10):7161–7186.
Loewenstein GF, Thompson L, Bazerman MH (1989) Social utility and decision making in interpersonal contexts. J. Personality Soc. Psych. 57(3):426–441.
Marshall A (1890) Principles of Economics (Macmillan, London).
Mas-Colell A, Whinston MD, Green JR (1995) Microeconomic Theory (Oxford University Press, Oxford, UK).
Masri H, Pérez-Gladish B, Zopounidis C, eds. (2018) Financial Decision Aid Using Multiple Criteria: Recent Models and Applications (Springer, New York).
Miyamoto JM, Wakker PP, Bleichrodt H, Peters HJM (1998) The zero-condition: A simplifying assumption in QALY measurement and multiattribute utility. Management Sci. 44(6):839–849.
Mohajerin Esfahani P, Kuhn D (2018) Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Programming 171(1–2):115–166.
Nagumo M (1930) Über eine Klasse der Mittelwerte. Japanese J. Math. 7:71–79.
Nathan BR, Lord RG (1983) Cognitive categorization and dimensional schemata: A process approach to the study of halo in performance ratings. J. Appl. Psychol. 68(1):102–114.
Nemhauser GL, Wolsey LA (1999) Integer and Combinatorial Optimization (Wiley Interscience, New York).
Newhouse JP (2021) An ounce of prevention. J. Econom. Perspect. 35(2):101–118.
Pareto V (1894) Il massimo di utilità dato dalla libera concorrenza. Giornale degli Economisti 9(2):48–66.
Pareto V (1897) Cours d’Économie Politique, vol. 2 (F. Rouge, Lausanne, Switzerland).
Pareto V (1906) Manuale di Economia Politica (Società Editrice Libraria, Milan).
Pliskin JS (1974) The management of patients with end-stage renal failure: A decision-theoretic approach. Doctoral thesis, Harvard University, Boston.
Pliskin JS, Beck CH (1976) A health index for patient selection: A value function approach with application to chronic renal failure patients. Management Sci. 22(9):1009–1021.
Ravindran AR, ed. (2016) Multiple Criteria Decision Making in Supply Chain Management (CRC Press, New York).
Rockafellar RT (1970) Convex Analysis (Princeton University Press, Princeton, NJ).
Rudin W (1976) Principles of Mathematical Analysis, 3rd ed. (McGraw-Hill, New York).
Savage LJ (1951) The theory of statistical decision. J. Amer. Statist. Assoc. 46(253):55–67.
Schneider R (2014) Convex Bodies: The Brunn-Minkowski Theory, 2nd ed. (Cambridge University Press, Cambridge, UK).
Sleator DD, Tarjan RE (1985) Amortized efficiency of list update and paging rules. Comm. ACM 28(2):202–208.
Steuer RE (1986) Multiple Criteria Optimization: Theory, Computation, and Application (Wiley, New York).
Stewart TJ (1991) A multi-criteria decision support system for R&D project selection. J. Oper. Res. Soc. 42(1):17–26.
Tilman D, Socolow R, Foley JA, Hill J, Larson E, Lynd L, Pacala S, et al. (2009) Beneficial biofuels: The food, energy, and environment trilemma. Science 325(5938):270–271.
Tversky A (1972) Elimination by aspects: A theory of choice. Psych. Rev. 79(4):281–299.
Villani C (2008) Optimal Transport: Old and New, Grundlehren der Mathematischen Wissenschaften, vol. 338 (Springer, Berlin).
Wald A (1945) Statistical decision functions which minimize the maximum risk. Ann. Math. 46(2):265–280.
Wang T, Fu Y (2020) Constructing composite indicators with individual judgements and best-worst method. Soc. Indicators Res. 149(1):1–14.
Watson B (1968) The Complete Works of Chuang Tzu (Columbia University Press, New York).
Weber EH (1846) Der Tastsinn und das Gemeingefühl. Wagner R, ed. Handwörterbuch der Physiologie mit Rücksicht auf die physiologische Pathologie, vol. 3, section 2 (Vieweg, Braunschweig, Germany), 481–588.
Weber TA (2014) On the (non-)equivalence of IRR and NPV. J. Math. Econom. 52:25–39.
Weber TA (2023) Relatively robust decisions. Theory Decision 94(1):35–62.
Weber TA (2024) Optimal depth of discharge for electric batteries with robust capacity-shrinkage estimator. Proc. 2024 4th Internat. Conf. Smart Grid Renewable Energy (SGRE) (IEEE, New York), 1–5.
Weber TA (2025) Monopoly pricing with unknown demand. Scandinavian J. Econom. 127(1):235–285.
World Energy Council (2024) World Energy Trilemma 2024: Evolving with resilience and justice. Technical report, World Energy Council, London.
Yager RR (1988) On ordered weighted averaging aggregation operators in multi-criteria decision making. IEEE Trans. Systems Man Cybernetics 18(1):183–190.
Yamamoto Y (2002) Optimization over the efficient set: Overview. J. Global Optim. 22(1–4):285–317.
Zarghami M, Szidarovszky F (2010) On the relation between compromise programming and ordered weighted averaging operator. Inform. Sci. 180(11):2239–2248.
Zeleny M (1974) A concept of compromise solutions and the method of the displaced ideal. Comput. Oper. Res. 1(3):479–496.

Volume 72, Issue 4

April 2026

Pages 2681-3628, iv-vi

Received: February 07, 2025
Accepted: May 28, 2025
Published Online: August 14, 2025

Cite as

Thomas A. Weber (2025) Relatively Robust Multicriteria Decisions. Management Science 72(4):3175-3203.

https://doi.org/10.1287/mnsc.2025.00510
