Organizational Selection of Innovation
Abstract
Budgetary constraints force organizations to pursue only a subset of possible innovation projects. Identifying which subset is most promising is an error-prone exercise, and involving multiple decision makers may be prudent. This raises the question of how to most effectively aggregate their collective nous. Our model of organizational portfolio selection provides some first answers. We show that portfolio performance can vary widely. Delegating evaluation makes sense when organizations employ the relevant experts and can assign projects to them. In most other settings, aggregating the impressions of multiple agents leads to better performance than delegation. In particular, letting agents rank projects often outperforms alternative aggregation rules—including averaging agents’ project scores and counting their approval votes—especially when organizations have tight budgets and can select only a few project alternatives out of many.
Funding: L. Böttcher acknowledges financial support from hessian.AI and the Army Research Office [Grant W911NF-23-1-0129].
Supplemental Material: The data files are available at https://doi.org/10.1287/orsc.2023.17357.
1. Introduction
We examine rules for aggregating individual project evaluations to make organizational portfolio selection decisions. Such resource-allocation decisions have specific characteristics (Levinthal 2017, Klingebiel 2018, Sengul et al. 2019). They are made intermittently, with organizations considering multiple resource-allocation alternatives at a time. They are subject to budget constraints, limiting the number of alternatives an organization can pursue. They are made under uncertainty, exposing organizations to allocation errors.
Consider how organizations select from a slate of innovation ideas (Brasil and Eggers 2019, Kavadias and Hutchison-Krupat 2020). A group of executives with different backgrounds meet periodically to review funding proposals. The resources they are permitted to allocate suffice for only a fraction of the proposals before them. Many of the proposals are outside of executives’ prior experience, leading to noisy assessments. They may thus consider delegating the portfolio-selection decisions to the relatively most qualified person, or to combine their limited expertise in various ways to arrive at better decisions. Which approach to arriving at an organizational portfolio will yield the best results in expectation?
The study of organizational aggregation processes is a resurgent scholarly pursuit (Knudsen and Levinthal 2007, Puranam 2018). Of particular relevance for our study is the finding that aggregating project approval decisions through majority voting usually leads to better outcomes than attainable through averaging scores (Csaszar and Eggers 2013). Delegation is effective when relevant executive expertise is available and evident or when organizations struggle to afford coordinating among a wider set of decision makers (Healey et al. 2023).
We advance such insights into aggregation by studying portfolio selection. Choosing a subset of available project alternatives is different from project approval in two ways. First, portfolio selection requires decision makers to observe a budget constraint. Second, to identify the best subset of projects to be funded, portfolio selection involves discrimination.
One implication of these unique features is a different performance dimension. What matters for portfolio selection is maximizing the expected performance of the projects chosen for funding, not the performance of every project approvable in isolation. Many of the latter may in fact not make the cut. Another implication is that portfolio selection, unlike isolated project approval, involves prioritization. Rank-order approaches discussed in the social-choice literature on multiwinner voting (Elkind et al. 2017, Pacuit 2019) thus become relevant.
The mathematical model of portfolio selection we develop contains heterogeneously informed, nonstrategic agents who are given the task of selecting from a list of independent project proposals. Different rules for aggregating agents’ selection decisions produce variation in performance. As our model is intended as a first step for studying organizational decision making at the portfolio level, we leave potentially interesting richness such as project interdependence, decision-maker interaction, and strategic behavior to future research.
We find that relying on individuals is almost always inferior to aggregating multiple opinions. Majority voting performs poorly when resource constraints require organizations to be highly selective. Averaging performs better but is often outdone by a simple process of having agents produce a ranked preference list. Totaling such ranks is the best-performing aggregation method in many scenarios, inferior to delegation only when firms know that they have the right experts for evaluation.
The dominance of ranking—based on an aggregation process known as Borda count (Brandt et al. 2016)—is due to its robustness against project-quality misclassification that degrades the performance of other selection methods like averaging more substantially. Organizational selection of innovation thus benefits from a relatively crude ordering process that differs from the voting procedures prior work would recommend for the isolated approval of individual projects.
Our work provides insights into how organizations can harness collective decision-making processes to effectively allocate resources. In the attempt to understand the management of innovation portfolios—of patents, drug candidates, or technology ventures (Eklund 2022, Toh and Ahuja 2022, Kumar and Operti 2023), for example—empirical work honed in on group-decision biases such as those concerning novelty (Criscuolo et al. 2017) or commitment (Klingebiel and Esser 2020). Gauging the meaningfulness of selection biases requires outlining the performance that can be achieved in the absence of bias. Our work provides such expectations for multiple selection procedures. It offers a structured answer to organizations searching for and experimenting with different aggregation methods (Luo et al. 2021, Sharapov and Dahlander 2021, Carson et al. 2023). Our work thus opens up avenues for future research on the performance of decision-making structures, particularly as regards rules for aggregating selection under uncertainty.
2. Innovation Portfolio Selection
The starting point of our work is the model of Csaszar and Eggers (2013). They compare the performance of collective decisions—voting and averaging—with that of individuals—anyone and experts. For detail on the research preceding the Csaszar and Eggers model, we refer to the authors’ comprehensive review of the field. Since then, much work, mostly nonorganizational, has focused on how weighted algorithms can improve crowd wisdom (Kameda et al. 2022, Xia 2022, Budescu et al. 2024). Csaszar and Eggers’ work remains the most relevant baseline for our purposes, because it considers the organizational context of projects with uncertain payoffs1 and variously informed decision makers, central features of organizational reality and part of what motivates our research.
We extend the Csaszar and Eggers model to the organizational selection of multiple project candidates, subject to resource constraints. Concurrent consideration is common when organizations or investors review a list of innovative proposals and select only those alternatives they deem most worthy of receiving resources from limited budgets. They rarely approve proposals without considering funding limits and opportunity costs (Wheelwright and Clark 1992, Cooper and Kleinschmidt 1993) because each dollar spent is one not spent elsewhere.
Organizations instead aim to make the most of the few resources at their disposal.2 Therefore, for projects to be selected into the portfolio, they need to not only clear a quality threshold such as an expected rate of return, but also be of higher quality than concurrently reviewed alternatives. Discrimination among projects not only complicates the application of the decision rules discussed in prior work. It also gives rise to an additional class of rules involving relative preferences that Csaszar and Eggers did not have to consider. This departure likely affects which rule helps organizations perform best.
Relative preferences and the general problem of selecting a subset of alternatives feature in the social-choice literatures on multiwinner voting systems (Nitzan and Paroush 2017) and participatory budgeting (Goel et al. 2019, Benade et al. 2021). A primary subject of inquiry in such social-choice research is how closely collective decisions reflect individual preferences, but a notable substream additionally examines how collectives reach “correct” decisions (Nitzan and Paroush 1982, Austen-Smith and Banks 1996).
Identifying the single best option in a set of more than two uncertain alternatives is a task in which majority voting still performs well with rules for tie-breaking (Young 1988, Hastie and Kameda 2005). We extend this insight by asking which aggregation method should be used if organizations want to select multiple projects—the best subset of opportunities that the organizational budget allows them to afford. Here, the multiwinner literature already foreshadows the usefulness of ranking methods (Procaccia et al. 2012). Large sets of homogeneous voters identify the correct order of noisily perceived choices more often when ranking, rather than approving, the choices (Faliszewski et al. 2017, Boehmer et al. 2023, Rey and Endriss 2023).
Generating similar insights for portfolio decision rules matters. Organizations have few decision makers, and with heterogeneous expertise. Aggregating their impressions is a problem that receives attention: Firms have been found to engage in costly trial-and-error search for innovation-selection processes (Sharapov and Dahlander 2021). Some venture capitalists deliberately adopt minority voting (Malenko et al. 2023), for example, in the attempt to improve performance by reducing errors of omission in environments where success follows a power-law distribution. Broadly speaking, however, empirical work in this area suggests that firms are not particularly effective in making selection decisions (Klingebiel and Adner 2015, Criscuolo et al. 2017, Sommer et al. 2020). Our work thus aims to establish conditions under which one can expect different forms of aggregation to improve the performance of organizational portfolio selection.
3. Modeling Portfolio Selection
Portfolio selection occurs whenever multiple candidates vie for limited resources. Although one can easily imagine a court case to be judged in isolation, with culpability determined irrespective of the outcomes from other concurrent cases, it is harder to imagine a company deciding funding for an innovation project irrespective of superior alternatives. Organizations will want to spend scarce funds on innovation projects only if they perceive future payoffs to be in excess of those of other projects. Even when organizations proceed with a single project only, the decision likely results from a process of selection rather than an isolated instance of project assessment (Si et al. 2022).
Therefore, we introduce selection into the organizational decision framework of Csaszar and Eggers (2013) by adding a budget constraint of m projects, chosen from n alternatives. Agents' evaluations of projects inform an organization's selection of a subset (Figure 1).

Notes. N agents consider a list of n projects of types t_i and qualities q_i. Agents compile preference lists that are ordered based on their project-quality perceptions. Finally, an aggregation rule combines the individual preference lists.
We consider both m and n exogenous. In established organizations, top management sets aside a portion of organizational resources for innovation. Top management determines this budget m by gauging the need for rejuvenation and considering rival demands for the same resources (Schilling 2023). Innovation executives, who are to be our model agents, then decide on how to spend the given budget. In real-world organizations, innovation executives might influence the budget-setting process and occasionally request increases alongside emerging opportunities (Klingebiel and De Meyer 2013). We leave the examination of such exceptions to future research.
Likewise, we treat n as independent from our agents. The project candidates reviewed at an innovation-board meeting are typically generated by personnel other than the decision makers (Wheelwright and Clark 1992). The possibility that innovation executives are partial to the generation and evaluation of some but not other opportunities (Dushnitsky et al. 2023), or that they may revisit initial decisions at later points (Furr and Eggers 2021), are extension areas for future research.
Following Csaszar and Eggers (2013), we characterize project candidates with two stochastic variables: t_i represents the type, and q_i the quality of project i. The type and quality distributions ψ and φ have supports [t_-, t_+] and [q_-, q_+], respectively (Table 1).
Table 1. Model Components

| Symbol | Definition |
| --- | --- |
| n | Number of available projects |
| m | Number of selected projects |
| N | Number of agents |
| q_i | Quality of project i |
| t_i | Type of project i |
| q̂_ij | Quality of project i as perceived by agent j |
| e_j | Expertise value of agent j |
| η_ij | Perceptual noise of agent j w.r.t. project i |
| φ | Distribution of project qualities q_i with support [q_-, q_+] |
| ψ | Distribution of project types t_i with support [t_-, t_+] |
| χ | Distribution of expertise values e_j |

Note. An overview of the main model variables and distributions.
Project type t_i can be viewed as a variable describing knowledge domains. Incorporating such domains means that agents cannot assess all projects equally well, a departure from social-choice models of multiwinner elections such as that of Procaccia et al. (2012).
Agents are characterized by expertise values e_j, which are distributed according to a distribution function χ with bounded support. Accordingly, q̂_ij = q_i + η_ij denotes the quality of project i as perceived by agent j, where η_ij is distributed according to a normal distribution with zero mean and a standard deviation that grows with the distance between project type and agent expertise. The inclusion of a noise term accounts for uncertainty in project evaluation. Noise thus varies with domain expertise; the quantity |t_i − e_j| captures the degree to which project type matches agent expertise.
This operationalization of variation in judgement quality is a plausible approximation of the organizational reality in portfolio decision making under uncertainty (Kornish and Hutchison-Krupat 2017). It endows agents with equally imperfect capabilities but recognizes that they may come from different backgrounds. For example, the innovation boards at pharmaceutical companies encompass experts from different therapeutic classes (Aaltonen 2020). Those experts’ judgements are more accurate for proposals in their own class. Domain-specific expertise has been documented to similarly influence innovation decision quality at device manufacturers (Vinokurova and Kapoor 2023) and service firms (Klingebiel and Esser 2020). We thus follow Csaszar and Eggers (2013) in recognizing this feature of evaluative precision in our model.3
Building on previous work on multiwinner electoral systems (Brandt et al. 2016, Elkind et al. 2017), we represent the quantities used to produce aggregated preference lists by an ordered triplet (q, t, e), where q, t, and e denote one realization of the sets of project qualities, project types, and expertise values, respectively. Each agent sorts n projects in descending order of perceived quality. For example, in the case of n = 2 available projects, agent j strictly prefers the first over the second project if her perception of project qualities satisfies q̂_1j > q̂_2j. In general, the relation i ≻_j k means that agent j strictly prefers project i over k, which is the case if and only if q̂_ij > q̂_kj. To denote the position of project i in the preference order of agent j, we use the notation pos_j(i).
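To make the evaluation stage concrete, the following Python sketch generates one realization of (q, t, e) and the resulting preference lists. The parameter values, the uniform supports, and the reading of the noise term as having standard deviation |t_i − e_j| are illustrative assumptions for this sketch only; the full implementation is available in the repository referenced in Section 5.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumed for this sketch, not the paper's exact values).
n, N = 5, 3                                  # projects and agents
q = rng.uniform(-5.0, 5.0, size=n)           # true project qualities q_i
t = rng.uniform(0.0, 1.0, size=n)            # project types t_i
e = rng.uniform(0.0, 1.0, size=N)            # agent expertise values e_j

# Perceived quality: true quality plus noise whose standard deviation equals
# the type-expertise mismatch |t_i - e_j| (our reading of the noise term).
sigma = np.abs(t[None, :] - e[:, None])                  # shape (N, n)
q_hat = q[None, :] + rng.normal(size=(N, n)) * sigma     # perceptions q_hat[j, i]

# Preference lists: each agent sorts projects by descending perceived quality.
pref = np.argsort(-q_hat, axis=1)            # pref[j] lists project indices, best first
pos = np.argsort(pref, axis=1) + 1           # pos[j, i] = position of project i for agent j

print("true qualities:", np.round(q, 2))
print("preference lists (best first):\n", pref)
print("positions pos_j(i):\n", pos)
```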
An aggregation rule R is a function that maps a realization of (q, t, e) to a corresponding subset of m selected projects. Ties occur if the selection rule produces multiple outcomes of the same cardinality. If ties occur in our simulations, we uniformly at random select one outcome.
We use f_R to denote the probability density function (PDF) of the summed qualities of the m projects selected under rule R. The support of f_R is [m q_-, m q_+]. The expected portfolio performance associated with selection rule R thus is
$$V_R = \int_{m q_-}^{m q_+} q\, f_R(q)\, \mathrm{d}q. \qquad (1)$$
Dividing this quantity by m would yield the corresponding expected quality per selected project. Analytically evaluating the integral in Equation (1) is not tractable for general selection rules R. Hence, our main approach involves running Monte Carlo simulations of independent and identically distributed (i.i.d.) realizations of the model under the various selection rules.
Appendix A derives the bounds within which we can expect portfolio performance to vary in the simulations. For a uniform quality distribution with support [q_-, q_+], the theoretical maximum of the expected portfolio performance is
$$V^{\max}(m, n) = m\, q_- + (q_+ - q_-)\, \frac{m\,(2n - m + 1)}{2\,(n + 1)}.$$
It constitutes an upper limit against which we can evaluate different selection rules R.
4. Aggregation Rules
The aggregation rules that we adapt and examine in our Monte Carlo simulations are simple and distinctive classics, a subset of potentially endless method variants (Elkind et al. 2017). They encompass the voting and scoring rules considered in Csaszar and Eggers (2013) plus a simple ranking rule known in the social-choice literature as Borda count (Elkind et al. 2017).
All our rules preserve the balance of type I and type II errors in expectation (Klingebiel 2018); we therefore disregard methods involving consensus, sequencing, or hierarchies that would skew this balance (Sah and Stiglitz 1988, Christensen et al. 2022, Malenko et al. 2023). We also assume nonstrategic agents (in contrast to Piezunka and Schilke (2023) or Marino and Matsusaka (2005), for example) and independent projects, disregarding potential benefits from composing portfolios with projects of varying novelty or knapsack efficiency (Faliszewski et al. 2017, Si et al. 2022). Our decision makers neither communicate nor learn from one another or across projects (Flache et al. 2017, Elhorst and Faems 2021, Becker et al. 2022). Relaxing some of these constraints would be a natural next step for considering additional aggregation rules in future research.
To transport classic aggregation rules to a portfolio-selection context, we modify them such that they impose a funding constraint at the organizational level. Selection criteria, therefore, are not based on thresholds, such as a positive average evaluation, or a majority of yes votes, that one would find in the context of isolated project approvals. Instead, organizations select into the portfolio m projects with the relatively highest scores.4 The subsequent definitions thus incorporate organizational discrimination.
4.1. Individual
All projects are evaluated by a single agent whose expertise value is the mean of the expertise distribution. The organization then ranks projects based on the agent's quality perceptions and selects the top m projects. This selection rule implements the individual rule of Csaszar and Eggers (2013) in a portfolio context.
4.2. Delegation
Each project is evaluated by the agent whose expertise is most closely aligned with the project's type. These are the agents whose expertise values e_j minimize the mismatch |t_i − e_j| and thus the evaluation noise. The organization then ranks projects based on the experts' quality perceptions and selects the top m projects. This selection rule implements the delegation rule of Csaszar and Eggers (2013) in a portfolio context.5
4.3. Voting
All projects are evaluated by all agents. Agents allocate a vote to each project for which they have a positive perception of quality. The organization then ranks projects based on the number of agent votes and selects the top m projects. This selection rule implements in a portfolio context the voting rule used by Csaszar and Eggers (2013)6 and others (Li et al. 2001, Keuschnigg and Ganser 2017, Oraiopoulos and Kavadias 2020).
4.4. Averaging
All projects are evaluated by all agents. The organization then ranks projects based on agents’ mean quality perceptions—scores, effectively—and selects the top m projects. This selection rule implements the averaging rule of Csaszar and Eggers (2013) in a portfolio context.
4.5. Ranking
All projects are evaluated by all agents. Each agent j places the projects in a descending order of perceived quality. Each project i thus receives a position pos_j(i) from each agent j. The organization then ranks projects based on the sum of agents' reversed project positions and selects the top m projects. This selection rule implements the Borda rule of the social-choice literature (Elkind et al. 2017) in a portfolio context.7
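For concreteness, the following Python sketch implements the five rules as functions of a perception matrix q_hat (agents in rows, projects in columns). The function names, the tie-breaking jitter, and the toy inputs are our own illustrative choices rather than the code used for the reported simulations.

```python
import numpy as np

def _top_m(scores, m, rng):
    """Select the m highest-scoring projects; tiny jitter breaks ties at random."""
    return np.argsort(-(scores + rng.random(len(scores)) * 1e-9))[:m]

def individual(q_hat_single, m, rng):
    """A single agent with central expertise scores all projects."""
    return _top_m(q_hat_single, m, rng)

def delegation(q_hat, t, e, m, rng):
    """Each project is scored by the agent whose expertise is closest to its type."""
    expert = np.argmin(np.abs(t[None, :] - e[:, None]), axis=0)
    return _top_m(q_hat[expert, np.arange(len(t))], m, rng)

def voting(q_hat, m, rng):
    """One vote per agent for every project perceived to have positive quality."""
    return _top_m((q_hat > 0).sum(axis=0).astype(float), m, rng)

def averaging(q_hat, m, rng):
    """Projects ranked by mean perceived quality across agents."""
    return _top_m(q_hat.mean(axis=0), m, rng)

def ranking(q_hat, m, rng):
    """Borda count: an agent's best project gets n - 1 points, the worst gets 0."""
    n = q_hat.shape[1]
    pos = np.argsort(np.argsort(-q_hat, axis=1), axis=1) + 1   # 1 = most preferred
    return _top_m((n - pos).sum(axis=0).astype(float), m, rng)

# Toy example: N = 3 agents, n = 10 projects, budget m = 3.
rng = np.random.default_rng(1)
q_hat = rng.normal(size=(3, 10))
print("ranking selects projects:", ranking(q_hat, m=3, rng=rng))
```

The random jitter simply emulates the uniform tie-breaking described in Section 3.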
5. Results
Our base-case analyses use the parameter values of Csaszar and Eggers (2013) to enable comparisons. The number of decision makers is set to N = 3; the type, quality, and noise distributions take the values used there. We additionally set the number of available projects to n = 100.
The expertise of the agent in the individual rule is set to the central value of the type distribution. To represent the collective knowledge of an organization's decision makers, we assign each agent an expertise value spread around this central value over a range governed by β, which denotes the knowledge breadth of an organization.
For the given distributions, we generate i.i.d. realizations of the underlying model quantities to perform Monte Carlo simulations of all aggregation rules presented in Section 4. We then compare their portfolio performances, V_R, and explore variation in the parameter space to probe the generality of our results.
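The following self-contained sketch shows the structure of such a Monte Carlo comparison for voting, averaging, and ranking. The quality support of [−5, 5], the unit type range, the linear spread of expertise values with β, and the trial count are assumptions made for illustration; the sketch mirrors the logic of, but does not reproduce, the simulations behind our figures.

```python
import numpy as np

def expected_portfolio_performance(rule, n=100, m=10, N=3, beta=0.0,
                                   trials=2000, seed=0):
    """Monte Carlo estimate of the expected sum of true qualities of the m
    selected projects under one aggregation rule (illustrative distributions)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(trials):
        q = rng.uniform(-5.0, 5.0, n)                        # assumed quality distribution
        t = rng.uniform(0.0, 1.0, n)                         # assumed type distribution
        e = 0.5 + beta * (np.linspace(0.0, 1.0, N) - 0.5)    # assumed expertise spread
        sigma = np.abs(t[None, :] - e[:, None])
        q_hat = q[None, :] + rng.normal(size=(N, n)) * sigma
        if rule == "averaging":
            scores = q_hat.mean(axis=0)
        elif rule == "ranking":                              # Borda count
            pos = np.argsort(np.argsort(-q_hat, axis=1), axis=1)
            scores = (n - 1 - pos).sum(axis=0)
        elif rule == "voting":
            scores = (q_hat > 0).sum(axis=0)
        else:
            raise ValueError(f"unknown rule: {rule}")
        selected = np.argsort(-(scores + rng.random(n) * 1e-9))[:m]
        total += q[selected].sum()
    return total / trials

for rule in ("voting", "averaging", "ranking"):
    print(rule, round(expected_portfolio_performance(rule, beta=0.5), 2))
```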
All implementation details are provided at https://gitlab.com/ComputationalScience/multiwinner-selection.
5.1. Aggregation-Rule Performance
In the base case, ranking is the highest-performing aggregation rule for low to moderate knowledge breadth β (Figure 2). Averaging approaches the performance of ranking for smaller values of β as the number of selected projects, m, increases.

Notes. The panels show the total performance of project portfolios selected by the individual, delegation, voting, averaging, and ranking rules as a function of knowledge breadth β, with panel (a) showing a budget of m = 10 and panel (b) a budget of m = 30 selected projects.
Delegation to project experts is the most effective selection protocol for larger values of knowledge breadth β (Figure 2). In our portfolio-selection setting, the knowledge breadth β at which a delegation protocol begins to outperform ranking is larger than the value of β at which delegation outperforms other protocols in the project-approval setting of Csaszar and Eggers (2013), at least for small budgets (i.e., small values of m). This observation is insensitive to project-type variations, as we elaborate in Section 5.5.
The delegation rule comes close to the maximum possible performance for both m = 10 and m = 30 (104.0 for the latter) at intermediate levels of knowledge breadth β. For β = 0, all decision makers have the same expertise as the agent in the individual rule. As β increases, the expertise values of the decision makers cover a broader range of project types. Hence, decision makers with expertise values close to specific project types can be selected for intermediate values of β. If β is too large, the distance between available and required expertise grows.
For a general uniform type distribution with support [t_-, t_+], the maximum performance of the delegation protocol is achieved if the N decision makers' expertise values are spread evenly across the type support,
$$e_j = t_- + \frac{2j - 1}{2N}\,(t_+ - t_-), \qquad j = 1, \ldots, N.$$
For N = 3 decision makers, the maximum performance of delegation is thus realized at expertise values located at one sixth, one half, and five sixths of the type support.
In contrast to project approval, voting is not very effective in portfolio selection. To see why, consider that it aggregates binary signals only. The aggregate scale for totaling the votes of N = 3 decision makers contains four levels only. Voting thus often fails to discriminate between many projects.
To gauge the discrimination limitation of the voting protocol, we conduct additional simulations with i.i.d. realizations of the model. Among n = 100 projects, 31 on average receive a full three votes from N = 3 decision makers with knowledge breadth β = 0, 28 projects receive three votes at an intermediate knowledge breadth, and 23 projects do so with β = 5. Therefore, with m = 10, the voting rule typically selects only projects receiving a full three votes. The quality difference between the best and the 20th-best project can be large, but aggregate votes tend not to reveal this.8 Voting thus underperforms more discriminating rules such as ranking. With greater budgets such as m = 30, voting does relatively better.
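A quick way to see this coarseness is to count how many projects share the top vote level. The snippet below does so for one illustrative parameterization (assumed quality distribution and noise level); its counts will not match the figures quoted above exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N, trials = 100, 3, 1000
top_level_counts = []
for _ in range(trials):
    q = rng.uniform(-5.0, 5.0, n)                              # assumed quality distribution
    q_hat = q[None, :] + rng.normal(0.0, 2.0, size=(N, n))     # assumed evaluation noise
    votes = (q_hat > 0).sum(axis=0)                            # each project gets 0..3 votes
    top_level_counts.append((votes == N).sum())
# Far more projects tie at three votes than a budget of m = 10 can accommodate.
print("average number of projects with a full three votes:", np.mean(top_level_counts))
```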
Greater knowledge breadth decreases the performance of voting but less so than that of other aggregation rules. Therefore, as β and m increase, averaging and voting can achieve similar performance (Figure 2(b)). In such situations, voting caps the influence of erroneous classifications made by single agents with an unsuitable expertise. Averaging suffers relatively more quickly from the aggregation of erroneous estimates provided by agents with unsuitable expertise.
These results remain stable even with extremely small budgets that permit the selection of m = 1 project only (see Appendix A.2).
5.2. Discrimination Effectiveness
Why does ranking outperform averaging? For some intuition, consider N = 3 agents, n = 3 available projects, and knowledge breadth β = 0. In one realization of the agents' quality perceptions (Table 2), we have the preference orders 1 ≻_1 3 ≻_1 2, 2 ≻_2 3 ≻_2 1, and 1 ≻_3 2 ≻_3 3. The organization would select Project 1 first, as its sum of reversed project positions is four. Project 2 is second-most attractive, with a sum of three. Project 3 would be least attractive, with a sum of two.
Table 2. Aggregation Example

| | Agent 1 | Agent 2 | Agent 3 | Averaging | Ranking |
| --- | --- | --- | --- | --- | --- |
| Project 1 | 7.1 (1) | −11.7 (3) | 4.4 (1) | −0.07 | 4 |
| Project 2 | 2.0 (3) | 2.0 (1) | 2.0 (2) | 2.00 | 3 |
| Project 3 | 5.5 (2) | −4.1 (2) | −1.8 (3) | −0.13 | 2 |

Notes. The data are from a sample realization with n = 3 projects. A knowledge breadth of β = 0 endows all N = 3 agents with the same central expertise value. Each agent column reports the perceived quality q̂_ij of project i, with the project's position pos_j(i) in that agent's preference list in parentheses; the averaging column reports the mean perceived quality, and the ranking column the sum of reversed positions. In the case of m = 1, the averaging rule would select Project 2, whereas ranking would select Project 1.
If the organization instead used averaging for the same data, it would select Project 2 first, as it receives a mean agent assessment of two. Project 1 would be second-most attractive, with a mean assessment of −0.07. Project 3 would be least attractive, with a mean assessment of −0.13. The aggregate organizational preference list produced by averaging would not list the best project first because it is vulnerable to a single agent’s misclassification.
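The aggregate columns of Table 2 can be reproduced directly from the three agents' perceptions. The short sketch below recomputes the averaging scores and the Borda (reversed-position) sums for this example; it is a check of the worked example, not part of our simulation code.

```python
import numpy as np

# Perceived qualities from Table 2: rows = agents 1-3, columns = Projects 1-3.
q_hat = np.array([[  7.1, 2.0,  5.5],
                  [-11.7, 2.0, -4.1],
                  [  4.4, 2.0, -1.8]])
n = q_hat.shape[1]

avg = q_hat.mean(axis=0)                                   # averaging scores
pos = np.argsort(np.argsort(-q_hat, axis=1), axis=1) + 1   # positions, 1 = most preferred
borda = (n - pos).sum(axis=0)                              # sums of reversed positions

print("averaging:", np.round(avg, 2))   # [-0.07  2.   -0.13] -> Project 2 ranked first
print("ranking:  ", borda)              # [4 3 2]             -> Project 1 ranked first
```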
In our base model with n = 100, m = 10, β = 0, and N = 3, ranking identifies the highest-quality project in about 63% of realizations, whereas averaging does so in about 58% of cases (see the selection probabilities in Figure 3). The reason is that agents with an outlying impression of project quality can sway the aggregate selection more readily in the averaging protocol than in the ranking protocol. Ranking absorbs extreme inputs, because uncapped quality differences are translated into capped score differences in rank orders (the maximum rank-order score difference is n − 1 per agent). For the ranking protocol to misclassify a project in aggregate, a relatively greater number of individual agents would have to concurrently misclassify it.

Notes. The figure charts the probability that ranking and averaging select projects of a given quality rank for the base case with n = 100, m = 10, β = 0, and N = 3.
Ranking is thus particularly effective in identifying projects of extreme quality. This ability to discriminate is crucial for portfolio selection with tight budgets. Discrimination effectiveness is less relevant for larger budgets: If a 15th-best project is misclassified as 17th best, for instance, the impact on portfolio performance is marginal. Consequently, averaging gains in relative performance when budgets also permit the selection of more moderate-quality projects. The flatter project-selection probability distribution of averaging is more suited to more munificent budgets. Selecting more projects balances the impact of misclassifications. As m approaches n, the performance of all selection protocols becomes equal.
The performance dynamic of ranking and averaging resembles that observed for sample mean and sample median as gauges of population values. For normal distributions, the sample mean is more efficient than the sample median in estimating the mean value of the underlying population (i.e., the variance of the sample mean is smaller than that of the sample median; Kenney and Keeping 1962). However, the sample median is known to be less sensitive to small-sample outliers that can introduce unwanted bias in the sample mean. In accordance with these arguments, we show in Section 5.6 that averaging achieves higher portfolio performance than ranking for a large number of agents N. Most organizational selection committees, however, consist of only a small number of decision makers. For them, ranking is a more effective aggregation rule than averaging.
5.3. Budgets and Choice Sets
The size of an organization's innovation budget determines the number of projects m it can select. The project alternatives n that an organization identifies as available compose the choice set from which it can select. Although the former typically pales in comparison with the latter (Klingebiel 2018), both numbers can vary across organizations. Such variance could matter in principle because the values of m and n bound the theoretically attainable portfolio performance. The supplemental analyses reported in Appendix A, however, show that they rarely change the relative ordering of aggregation-rule performance observed in our base-case analysis.
An interesting edge case is that of small choice sets. When the choice set of candidate projects in our model contains fewer than about n = 10 projects and N = 3 agents with knowledge breadth β = 0 select m = 1 project, averaging outperforms ranking (Figure 4(a)), as it has access to more information on the underlying project quality. For larger numbers of candidate projects and small values of m, averaging is more likely to misclassify a project.

Notes. The panels show regions in (m, n)-space where either averaging or ranking achieves the higher portfolio performance, with panel (a) showing knowledge breadth β = 0 and panel (b) β = 5.
Another edge case to ranking’s dominance is found for generous budgets and zero knowledge breadth. When the number of projects m that the budget permits is not much smaller than the number of projects n available in the choice set, averaging outperforms ranking for β = 0 (Figure 4(a)). In such cases, the benefit of greater information provision outweighs the low risk of misclassification. This effect is restricted to organizations with homogeneous decision makers.
If the knowledge breadth β takes on more realistic values above zero, such that available expertise values are better aligned with the underlying project types, the advantage of averaging over ranking diminishes. Figure 4(b) illustrates that with β = 5, ranking outperforms averaging even for the smallest possible choice set with n = 2 elements. Further simulations indicate that this dominance begins at even lower knowledge breadths; results for smaller positive values of β are consistent with those obtained for β = 5.
5.4. Delegation Errors
Innovation projects contain novel elements for which past data offers limited guidance. Experts from some domains will have relevant experience and, through associations, may gauge the promise of novelty better than other experts. However, organizations may not always know ex ante who these most suitable experts are, leading to errors in delegation. Such likelihood of delegation error is one reason why academic journals, as well as grant institutions (Bian et al. 2022), for example, seek the opinion of multiple expert reviewers without fully delegating decisions to any.
In Figure 2, we show the selection performance of delegating to project experts as a function of delegation error r. When r = 0, projects are always assigned to the most qualified expert, whereas with r = 1, projects are randomly distributed among the three available agents. In more mathematical terms, organizations assign projects with probability r/3 to either of the two least suitable agents and with probability 1 − 2r/3 to the most suitable one (Csaszar and Eggers 2013).
Detailed simulations for a larger number of values of r show that for r ≳ 0.2 the delegation protocol no longer provides a substantially better performance than ranking for a small budget (Figure 2(a)). The influence of delegation errors diminishes with larger budgets (Figure 2(b)). An error of r = 0.2 means that 87% of projects are evaluated by an appropriate expert.
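The following sketch spells out this error process for N = 3 agents. The r/3 versus 1 − 2r/3 split follows the description above (always the expert at r = 0, uniform assignment at r = 1); the project types and expertise values are illustrative assumptions.

```python
import numpy as np

def assign_evaluators(t, e, r, rng):
    """Assign each project to one of N = 3 agents with delegation error r:
    probability 1 - 2r/3 for the most suitable agent (expertise closest to the
    project type) and r/3 for each of the two other agents."""
    n = len(t)
    suitability = np.argsort(np.abs(t[:, None] - e[None, :]), axis=1)  # best agent first
    probs = np.array([1.0 - 2.0 * r / 3.0, r / 3.0, r / 3.0])
    draw = rng.choice(3, size=n, p=probs)            # 0 = best agent, 1/2 = the others
    return suitability[np.arange(n), draw]

rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, 100)                       # assumed project types
e = np.array([0.2, 0.5, 0.8])                        # assumed expertise values
assigned = assign_evaluators(t, e, r=0.2, rng=rng)
best = np.argmin(np.abs(t[:, None] - e[None, :]), axis=1)
print("share evaluated by the most suitable expert:", np.mean(assigned == best))  # about 0.87
```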
Although ascertaining delegation-error rates in prior empirical work is limited by the lack of counterfactuals, it is not hard to imagine that innovation projects, covering novel terrain by definition, are often mismatched to expertise in existing terrain. Ambiguity about the suitability of experts in evaluating innovation thus renders delegation an unattractive aggregation rule.
In an alternative approach, organizations could try delegating project evaluation to a single portfolio expert, whose expertise minimizes the expected uncertainty across all projects. In our main specification, this would be the agent whose expertise equals the mean project type. A portfolio expert would thus perform as well as an individual. Erroneously designating as portfolio expert one of the other agents, whose expertise values lie away from the mean project type, would yield a performance that is worse than that of the individual protocol for β > 0.
5.5. Environmental Turbulence
The performance of different selection rules does not depend only on the level of knowledge breadth in the group of decision makers but also on the distribution and range of project types. When market environments shift, the relevance of organizations’ existing knowledge base diminishes. Csaszar and Eggers (2013) exemplify such shifts with the technological transition from analog to digital photography, which rendered some of Polaroid’s expertise less useful for project selection (Tripsas and Gavetti 2000). Considering additional type distributions helps us examine how different aggregation rules cope with environmental shifts.
Figure 5 reports the portfolio performance of the aggregation rules for two additional type distributions. Performance generally decreases when the distance between required and available expertise increases. If the expertise of decision makers is close to the type of the project under evaluation, selection errors are small. Consequently, a given expertise level yields smaller errors for a type distribution centered nearby than for one shifted further away.

Notes. The panels show the total performance of project portfolios selected by the individual, delegation, voting, averaging, and ranking rules as a function of knowledge breadth β for the two additional project-type distributions.
Ranking, however, is relatively less impacted by the risk of misclassification when project-type distributions shift. The ranking rule's performance surpasses that of error-free delegation up to a certain knowledge breadth for the base-case type distribution, and up to an even greater knowledge breadth for more distant type distributions. The further the project-type distribution moves from agents with relevant expertise, the greater the knowledge breadth at which ranking outperforms even perfect delegation. Relatively homogeneous organizations facing disruptive change would thus fare best with ranking.
5.6. Crowds vs. Experts
Up to this point, we kept the number of decision makers at a constant N = 3. Relaxing this constraint can reveal relative differences in the marginal benefit of additional decision makers. Increasing crowd size also allows collectives to outperform experts even in settings where delegation error is absent and expertise broadly distributed (Davis-Stober et al. 2014).
Through approaches such as open innovation or open strategy (Chesbrough 2006, Stadler et al. 2021) organizations can enlarge their pool of internal decision makers, and it would be instructive to know how large such collectives would need to be to outperform delegation to three knowledgeable project experts. IBM, a large technology firm with an in-house crowd effort, managed to have 25 colleagues review projects of its iFundIT program, although not everyone evaluated all projects (Feldmann et al. 2014). We could take this observation as an upper bound of the number of suitable agents that organizations might feasibly recruit to the collective task of portfolio selection.9
We thus examine the number of decision makers required for collective protocols to outperform delegation to the three project experts of our base-case parameterization (Figure 6 illustrates crowds of N = 15 and N = 45). Averaging outperforms delegation to project experts as the number of decision makers N nears 15; ranking already does at around N = 13. Voting can compete with delegation over the whole range of knowledge breadth only with 45 or more decision makers.

Notes. The panels display the portfolio performance of the crowd-aggregation rules voting, averaging, and ranking as a function of knowledge breadth β for crowds of N = 15 and N = 45 decision makers, alongside delegation to the three project experts of the base-case parameterization.
Although ranking outperforms averaging with about 10 or fewer decision makers, the order reverses with bigger crowds (Figure 6), even at large values of knowledge breadth. In simulations with progressively larger crowds, this performance gap grows (2.57%, 2.86%, 3.06%, and 3.12% at β = 0). The magnitude of the gap might nonetheless be insufficient to justify the use of averaging, given that such large crowds would be hard to manage and well in excess of those observed as feasible in the IBM study of Feldmann et al. (2014).
In all studied scenarios, voting is inferior to averaging and ranking. In particular, for β = 0, the performance of voting changes very little as crowd size increases, even by an order of magnitude. This is because in voting, agents make binary choices, where all projects perceived to yield positive payoffs receive approval. When noise is within bounds and expertise overlaps, there is limited benefit to soliciting more near-identical decisions from a crowd. Ranking and averaging gain more from homogeneous crowds because these rules elicit more fine-grained information for selection.
In a converse scenario with considerable noise and/or knowledge breadth, voting (very) slowly gains in performance with an increasing number of decision makers. Each additional decision maker adds granularity to the aggregation scale (three decision makers mean that a project can receive zero, one, two, or three votes; ten decision makers would classify a project anywhere between zero and ten votes, and so on). Ranking and averaging provide granular aggregation scales even with few decision makers.
5.7. Batching
In the aggregation protocols we study, agents evaluate each project on its own. One could alternatively imagine agents directly comparing projects and making relative judgments—at least when there is no strict need to first provide separate assessments, such as with voting or ranking. In such cases, cognitive limitations might weaken comparison effectiveness as the number of candidate projects grows. At some level of n, agent evaluations may become unreliable.
To guard against such a scenario, one could design an evaluation regime in which individuals receive no more projects than they are able to compare reliably. The precise magnitude of such a cognitive limit c is unknown and varies with context (Scheibehenne et al. 2010).10 The illustrative analysis reported below sets the limit to a conservative batch size of c = 10. The idea is that, when an organization's choice set is as large as the n = 100 projects considered in our base-case analyses, agents could share the load and each evaluate only c = 10 projects.
Reducing agents’ cognitive load requires proportionally more of them. The number of agents in our base-case analyses would have to go up by a factor of n/c to ensure that each project gets the same number of evaluations in the cognition-conscious batching regime.
If little is known ex ante about projects and agents, agents will receive a randomly drawn subset of c projects. The evaluation could also be shared among an organization’s cohort of evaluators based on preference (Bentert and Skowron 2020). A more directed approach is to allocate c projects each to N agents such that there is a match between the types of expertise required and available. The organization would ask its relatively most experienced colleagues to vote, estimate, or rank.11 This makes most sense when evaluators and projects are known to span a comparable range of expertise.
Finally, the organization may authorize those subgroups of evaluators to make decisions on its behalf. Innovating organizations often acknowledge limits to the comparability of projects of different departments, subdividing the overall budget and allowing departments to make their own decisions about which projects to select (Chao and Kavadias 2008).
Figure 7 reports the analysis for these approaches to batching. As in the main analyses, batching is based on uniform type, quality, and expertise distributions, maintaining the number of project candidates at n = 100 and the number of selected projects at m = 10. We multiply the number of agents involved in each selection rule by n/c, yielding N = 10 agents for individual and delegation and N = 30 for voting, averaging, and ranking.12 Each agent receives a batch of c = 10 projects to assess.

Notes. The panels show the total performance of project portfolios selected by the individual, delegation, voting, averaging, and ranking rules under batching, with and without expertise matching and with centralized and decentralized selection.
Without expertise matching, the assignment of c = 10 projects is uniformly at random without replacement from the pool of n = 100 projects. Expertise matching is a hard problem and a thorough review of the multitude of implementation possibilities goes beyond the scope of our work. We here employ simple ordinal matching. We begin by arranging projects in ascending type order and agents in ascending expertise order. We then assign the first batch of c = 10 projects to the first agent, in the case of individual and delegation, or the first three agents, in the case of voting, averaging, and ranking. The second batch goes to the second agent(s), and so on.
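A minimal version of this ordinal matching can be written as follows; the batch size, the numbers of projects and evaluator groups, and the uniform draws are assumptions for illustration only.

```python
import numpy as np

def ordinal_batches(t, e, c):
    """Sort projects by type and agents (or agent groups) by expertise, then hand
    the first c projects to the lowest-expertise group, the next c to the next
    group, and so on. Returns a dict mapping group index -> assigned projects."""
    project_order = np.argsort(t)       # projects in ascending type order
    group_order = np.argsort(e)         # groups in ascending expertise order
    return {group: project_order[slot * c:(slot + 1) * c]
            for slot, group in enumerate(group_order)}

rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, 100)          # assumed types of n = 100 projects
e = rng.uniform(0.0, 1.0, 10)           # assumed expertise of 10 evaluator groups
batches = ordinal_batches(t, e, c=10)
print({g: len(p) for g, p in sorted(batches.items())})   # each group receives 10 projects
```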
Agents normally submit their project votes, estimates, or ranks to a central organization for the final aggregate selection decision. In a decentralized setting, by contrast, each of n/c agents, or sets of agents, selects m/c projects. In the analysis of Figure 7, this means one project each. Collectively, these m selected projects make up an organization’s portfolio.
The results reported in Figure 7 show that the performance of averaging improves relative to ranking, at least at lower levels of knowledge breadth β. This is because aggregating 10 project ranks from three agents yields less granular distinctions than aggregating precise project estimates. Although agents’ detailed project estimates may be flawed, the random tiebreakers often necessary in aggregating rankings are relatively more detrimental to portfolio selection. Therefore, if cognitive limitations are a concern, evaluation noise moderate, and agents plentiful, averaging may offer a more effective batch-selection method than ranking.
The results reported in Figure 7 also show that random batching unsurprisingly underperforms expertise batching, especially when knowledge breadth β increases. Real-world organizations will find themselves somewhere in between the random and perfect expertise assignment.
Decentralizing decision rights, too, is usually a bad idea, owing to the lost ability to optimize at the portfolio rather than the subportfolio level. Ranking, however, suffers less from decentralization than other rules. This is because the projects that would have been selected at the subportfolio level also often end up being selected at the portfolio level. The top projects of each batch also have the top scores in the portfolio. It is rare that the second-placed project in one batch has a greater sum of inverted ranks than the first-placed project of another batch. Therefore, if cognitive limitations were a concern and addressed with batching, organizations that use a ranking rule could decentralize with less of a performance sacrifice.
6. Discussion
We extend earlier work on aggregating project approval to the context of selecting projects for resource-limited portfolios. We show that ranking, an aggregation process specific to portfolio selection, is often more effective than averaging and voting, processes also available in a single-project approval context. These findings contribute to the literature on resource allocation and aggregation, respectively.
6.1. Resource Allocation Decisions
The earlier work of Csaszar and Eggers (2013) highlights how the choice of rules for aggregating individual decisions into an organizational one can produce meaningful performance differences. Its insights are applicable to contexts in which the (dis)approval of one project is viewed independent of the (dis)approval of other projects (see the assumptions in Sharapov and Dahlander (2021), Malenko et al. (2023), Piezunka and Schilke (2023), and Criscuolo et al. (2017), for example). Also relevant for the isolated approval of projects are attempts to aggregate project forecasts into decisions through polls or markets (Lamberson and Page 2012, Atanasov et al. 2017).
Acknowledging, however, that organizations are resource constrained means that not all projects that would be approvable in isolation can be funded. The challenge for organizations is to identify the subset of the many possible projects that most likely maximizes the return on investment (Sharpe and Keelin 1998, Archer and Ghasemzadeh 2007, Kleinmuntz 2007). Solving such optimization problems involves preference orders, derived from aggregating individual agent preferences.
In this portfolio-selection context, the relative performance differences among aggregation rules reported by Csaszar and Eggers (2013) do not hold. Although the earlier study is justifiably concerned with the performance of all approved projects, the focus for portfolio selection is on the performance of only those projects that organizations can afford to fund. That is because resource allocation in organizations is not only about correctly identifying projects with positive returns but about selecting the subset of projects that deliver the greatest return on the investable resources (Klingebiel 2018, Brasil and Eggers 2019).
Our work reveals how totaling project ranks provided by agents offers the highest aggregation-rule performance in many circumstances that one might find in organizations. The ranking-rule performance is below the optimum that omniscient decision makers could attain, but it is often above the performance of other rules for aggregating decisions with limited information.
By highlighting performance dynamics of decision aggregation rules, our work provides a normative foundation for descriptive research on resource allocation. Crucially, it provides a baseline benchmark for work attempting to highlight behavioral inefficiencies in portfolio selection (Criscuolo et al. (2017) and Sommer et al. (2020), for example). It also provides a reference point without which empirical observations of portfolio-selection rule performance (Sharapov and Dahlander 2021, Malenko et al. 2023) are hard to interpret.
In future research, it would be valuable to expand upon our work by considering additional factors such as differential project types and costs (Goel et al. 2019) or dynamic features (Si et al. 2022). Further opportunities arise from merging our insights with those on managing portfolios under uncertainty, including the partial allocation and potential reallocation of resources over time (Klingebiel and Rammer 2021), the allocation of resources by more than one organization (Folta 1998), or the incentive structures used to populate choice sets for portfolio selection (Klingebiel 2022).
6.2. Organizational Decision Aggregation
Our work further contributes to the resurgent interest in aggregation structures (Keum and See 2017, Böttcher and Kernell 2022, Christensen et al. 2022). In particular, we shed further light on situations in which one might expect expert decision makers to outperform variously aggregated crowds (Mannes et al. 2014, Csaszar and Laureiro-Martínez 2018, Scott et al. 2020). Specifically, choosing the best subset from a range of options nontrivially departs from previously studied contexts due to its greater need for discrimination.
Although delegation performs highest in settings where experts can be found, the often imperfect organizational process of matching uncertain projects with the right domain specialists in turbulent environments calls for alternative approaches. Having multiple imperfectly informed decision makers weigh in on the same project propositions typically improves on the eventual performance that an organization can expect from its portfolio. Ranking does so most effectively.
When agents rank projects, they provide an assessment of how the quality of one project compares with that of others. Most rankings in real-world organizations are necessarily imperfect amalgamations of multiple criteria, ranging from profit forecasts and strategic fit to short-term versus long-term considerations.13 Using subjective rankings as an input to the ultimate organizational decision thus makes intuitive sense. In contrast to seemingly more precise project-value appraisals, crude rankings often help select higher-performing project portfolios.
Future field research on aggregating agents’ preference lists may benefit from the fact that ranking endogenizes a concern over strategic behavior. Employees who know of their organization’s resource constraints may not provide project assessments or votes that reflect their true beliefs, in an attempt to lift some projects above the cutoff (Bassi 2015). With ranking, agents maximize the chances of their organization funding their favorite projects by ranking projects in the preferred order. There is little room for gaming by submitting preference orders that fail to reflect beliefs (unlike with averaging, for example, where agents could inflate forecasts for their preferred projects, and deflate those for less preferred candidates).
Similarly beneficial is that ranking appears more tolerant of biased inputs, requiring fewer agents to select optimal sets than alternative aggregation methods (Boehmer et al. 2023). Ranking methods that additionally reflect preference intensity (Skowron et al. 2020) might be similarly robust to strategy and bias, presenting a straightforward extension possibility for our current work.
Further opportunities for future research include the extension of our work by accounting for quadratic voting effects (Eguia et al. 2019). An alternative direction to consider involves devising algorithms that can help identify effective selection rules akin to algorithmic solutions of multiwinner election problems (Peters 2018, Xia 2022). Future work may also explore project-cost distributions (Benade et al. 2021), skill heterogeneity and weighted aggregation (Ben-Yashar et al. 2021, Manouchehrabadi et al. 2022), strategic and coordinated selection behavior (Myatt 2007), and vote trading (Casella and Macé 2021). Further potential exists in recognizing the impact of organizational competition, which may favor contrarian rules such as minority voting (Arrieta and Liu 2023, Malenko et al. 2023).
6.3. Managerial Application
The performance of aggregation rules depends on the availability of information about the knowledge held by employees and the size of the innovation budget and choice set. Organizations looking for simplified guidance on which rule to adopt may consider the illustrative decision tree presented in Figure 8. In portfolio-selection situations with many choices, tight budgets, and unclear expertise, our work recommends the ranking rule.

Notes. When asking more than one person to decide on which projects to select for an innovation portfolio, organizations stand to benefit from adopting a ranking rule. Averaging is a good choice in specific situations. Simple voting is less suited to the selection of innovation projects.
The performance of the ranking protocol is good news for two reasons. One is that many organizations already informally aggregate rankings in some form when they meet in a committee setting. Such committee meetings often involve discussions that contribute to the convergence of individuals’ assessments of projects (Lane et al. 2022). The ranking protocol deals well with low belief heterogeneity, attenuating the potential impact of convergent beliefs. Therefore, organizational reality may not be too far from feasible aggregation optima.
The second reason is that organizations probably have an easier time implementing a ranking protocol than some of the other aggregation mechanisms reviewed here. Rather than having to submit seemingly precise project-value assessments, decision makers simply have to put projects into a preference order. This may become more taxing as the number of projects to consider increases, but aggregation through ranking is somewhat forgiving of the accuracy of the assessments that lead to the preference orders. It often produces innovation portfolios with the relatively highest performance outcomes.
Given these advantages, what could go wrong? A few aspects of the ranking rule’s practical application might be worth paying attention to in future empirical research. A first step would be studies of safeguards against loss in judgement quality that stems from the greater cognitive load of comparing a potentially large number of candidates simultaneously (Cui et al. 2019, Gelauff and Goel 2024). Our main models sidestep this issue by having agents score projects individually, which only later amounts to a ranked project list for each agent (the aforementioned procedure for the aggregate ranking of Top Universities does the same). Innovating organizations might get close to this ideal by having project proposals presented one at a time, making comparisons easier to avoid (Mussweiler and Epstude 2009).
To then further mitigate potential order effects, whereby evaluators compare a focal proposal to what they can remember about those evaluated previously (Elhorst and Faems 2021, Klingebiel and Zhu 2022), organizations might wish to shuffle the sequence of proposals for each evaluator. The setting of a committee meeting does not easily lend itself to different evaluation sequences but asynchronous online assessments would. One challenge for such asynchronous assessments would be to ensure that assessors evaluate all candidates. Incomplete rankings akin to those submitted on participatory budgeting platforms such as Stanford’s,14 for example—where assessors receive no compensation and thus prioritize attention—not only provide less information (as per Section 5.7) but also open the door to herding and influencing.
Moreover, future research could examine the effectiveness with which organizations are able to aggregate the rankings that their employees provide. Without an explicit aggregation rule, managers’ processing of rank information may differ from their processing of scores. For example, Chun and Larrick (2022) suggest that people sometimes treat rankings as a shortcut heuristic for separating top candidates from a cohort, forfeiting more fine-grained discrimination. Automating aggregation may thus prove useful in guarding against processing biases. In any case, adding ranking to the list of aggregation methods to be examined behaviorally (Niederberger et al. 2023) seems apt given its conceptual benefits for innovation-portfolio selection.
6.4. Conclusion
Our work contributes to the understanding of resource allocation in innovation portfolios. Increasing data availability and scholarly interest in the topic have revealed interesting patterns of behavior when multiple organizational actors make joint decisions. Yet, interpreting their relevance requires a normative foundation. In providing one, we show that some insights, such as about the effects of knowledge breadth and delegation error, apply in the context of portfolio project selection just as they do in the better-known but less applicable context of isolated project approvals. However, portfolio selection additionally requires discrimination between projects and the relative performance ordering of suitable decision-aggregation rules thus changes.
Our results indicate that ranking is the most effective selection rule, especially in unstable market environments, and often outperforms averaging even for small values of knowledge breadth. In many scenarios, ranking is preferable to other aggregation rules. Delegation makes sense when companies can assign each project to a relevant expert. But environmental turbulence can cause ranking to outperform even perfect delegation.
Multicandidate selection may be relevant not only in the context of innovation, but also for other organizational decisions under uncertainty (Klingebiel and Zhu 2023), including investments in personnel or technology. Our work thus contributes to a better understanding of selection regimes within organizations. The choice of an appropriate aggregation rule is a discretionary element in the design of resource allocation processes that has substantial performance implications.
Our source code is publicly available at https://gitlab.com/ComputationalScience/multiwinner-selection.
Appendix A. Performance Sensitivity to Budget and Choice Set
A.1. Theoretical Performance Limits
For the m selected projects, we use $\bar{q}_{\max}(m,n)$ to denote the theoretical performance maximum per selected project. It can be derived from the order statistics (David and Nagaraja 2004) of the underlying project quality distribution.15 For n realizations of the random variable $q$, one obtains the order statistics by sorting the realizations $q_i$ in ascending order, $q_{(1)} \le q_{(2)} \le \dots \le q_{(n)}$. The value of $\bar{q}_{\max}(m,n)$ is found by evaluating

$$\bar{q}_{\max}(m,n) = \frac{1}{m} \sum_{j=n-m+1}^{n} \mathrm{E}\!\left[q_{(j)}\right]. \qquad \text{(A.1)}$$

The PDF of the order statistic in Equation (A.1) is given by

$$f_{(j)}(q) = \frac{n!}{(j-1)!\,(n-j)!}\, F(q)^{\,j-1} \left[1 - F(q)\right]^{\,n-j} f(q), \qquad \text{(A.2)}$$

where $f$ and $F$ denote the PDF and CDF of the quality distribution. For a uniform quality distribution $\mathcal{U}(q_-, q_+)$, the rescaled order statistic follows a beta distribution,

$$\frac{q_{(j)} - q_-}{q_+ - q_-} \sim \mathrm{Beta}(j,\, n - j + 1). \qquad \text{(A.3)}$$

The mean of the beta distribution is $j/(n+1)$, so we can compute the theoretical performance maximum for any uniform distribution according to

$$\bar{q}_{\max}(m,n) = \frac{1}{m} \sum_{j=n-m+1}^{n} \left[ q_- + (q_+ - q_-)\,\frac{j}{n+1} \right] = q_- + (q_+ - q_-)\,\frac{2n - m + 1}{2(n+1)}. \qquad \text{(A.4)}$$

According to Equation (A.4), for any uniform quality distribution $\mathcal{U}(q_-, q_+)$, the performance measures $\bar{q}_{\max}(m,n)$ and $V_{\max}(m,n) = m\,\bar{q}_{\max}(m,n)$ depend on both the number of projects m that an organization’s budget permits to select and the total number of projects n in the choice set.

In the limit of a large number of candidate projects, the proportion of selected projects, or selectiveness (Klingebiel and Rammer 2021), at which the portfolio performance $V_{\max}$ reaches its peak value is16

$$\frac{m^*}{n} = \min\!\left\{\frac{q_+}{q_+ - q_-},\, 1\right\}. \qquad \text{(A.5)}$$

In the same limit, the maximum expected quality per selected project can be approximated by $\bar{q}_{\max}(m,n) \approx q_+ - (q_+ - q_-)\,m/(2n)$ (see Equation (A.4)). As the selectiveness m/n approaches one, the quantity $\bar{q}_{\max}$ approaches the distribution mean, which equals zero for the uniform quality distribution $\mathcal{U}(-5,5)$. Additionally, $\bar{q}_{\max}$ approaches the upper limit of the underlying uniform quality distribution, $q_+$, as m/n approaches zero.

Figure A.1(a) shows $\bar{q}_{\max}(m,n)$ as a function of m for different values of n and for a uniform quality distribution $\mathcal{U}(-5,5)$. The largest value of $\bar{q}_{\max}$ for constant n is $\bar{q}_{\max}(1,n) = q_+ - (q_+ - q_-)/(n+1)$, and it is obtained for m = 1. For n = 20, 50, and 100, the corresponding values are about 4.5, 4.8, and 4.9, respectively. As m approaches n, the maximum performance approaches zero (Figure A.1(a)).

Figure A.1.
Notes. (a) Theoretical maximum of the expected quality per selected project, $\bar{q}_{\max}(m,n)$, as a function of the number of projects m permitted by an organization’s budget, for choice sets with different numbers of project alternatives n. (b) Theoretical maximum performance of a single selected project, $\bar{q}_{\max}(1,n)$. With the uniform quality distribution $\mathcal{U}(-5,5)$ considered in our base-case analysis, the performance measure approaches $q_+ = 5$ in the limit $n \to \infty$.
To visualize the dependence of $\bar{q}_{\max}$ on n for constant m, we show in Figure A.1(b) the performance measure for a single selected project, $\bar{q}_{\max}(1,n)$, as a function of n. The quality distribution is again $\mathcal{U}(-5,5)$. We observe that an increase in the number of available projects from 0 to 10 is associated with a large increase in $\bar{q}_{\max}(1,n)$ from zero to more than four. Increasing n from 10 to 100 yields a much smaller increase in $\bar{q}_{\max}(1,n)$ of about 0.8. In the limit $n \to \infty$, the maximum performance approaches $q_+ = 5$. Although more available projects yield a larger value of $\bar{q}_{\max}$ for a given m, the performance gains associated with further increasing n may be negligible if $n \gg m$.

In addition to achieving a high performance per selected project, one often wishes to optimize the overall portfolio performance, whose theoretical maximum is $V_{\max}(m,n) = m\,\bar{q}_{\max}(m,n)$. For a uniform quality distribution $\mathcal{U}(q_-, q_+)$, we have $V_{\max}(m,n) \approx m\left[q_+ - (q_+ - q_-)\,m/(2n)\right]$ in the large-n limit [see Equation (A.4)]. Figure A.2 shows $V_{\max}$ as a function of m for three different uniform quality distributions. The optimum of $V_{\max}$ is attained for

$$m^* = (n+1)\,\frac{q_+}{q_+ - q_-} - \frac{1}{2}, \qquad \text{(A.6)}$$

rounded to the nearest integer between 1 and n.

Figure A.2.
Notes. The figure charts the theoretical maximum of the expected portfolio performance, $V_{\max}(m,n)$, as a function of the number of projects m permitted by an organization’s budget, for uniform quality distributions with different support. The number of available project candidates is n = 100.
For the quality distributions used in Figure A.2, the corresponding rounded values of $m^*$ are 40, 50, and 60. Using Equation (A.6), the optimal selectiveness $m^*/n$ approaches $q_+/(q_+ - q_-)$ in the limit of large n. Given the constraint $m^* \le n$ for the optimal selectiveness, we obtain Equation (A.5) in the large-n limit.
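For readers who wish to verify Equations (A.4) and (A.6) numerically, the following minimal sketch computes $\bar{q}_{\max}(m,n)$, a Monte Carlo estimate of the same quantity, and the rounded optimum $m^*$ for a uniform quality distribution; the function names and the Monte Carlo setup are illustrative assumptions and not part of our published model code.

```python
# Minimal numerical check of Equations (A.4) and (A.6); illustrative names.
import numpy as np

def q_max_per_project(m, n, q_lo=-5.0, q_hi=5.0):
    """Equation (A.4): expected quality per project when the m best of n
    i.i.d. uniform(q_lo, q_hi) draws are selected."""
    return q_lo + (q_hi - q_lo) * (2 * n - m + 1) / (2 * (n + 1))

def optimal_m(n, q_lo=-5.0, q_hi=5.0):
    """Equation (A.6): budget that maximizes V_max = m * q_max_per_project."""
    return round((n + 1) * q_hi / (q_hi - q_lo) - 0.5)

def monte_carlo_q_max(m, n, q_lo=-5.0, q_hi=5.0, n_trials=20000, seed=0):
    """Monte Carlo estimate of the same per-project maximum."""
    rng = np.random.default_rng(seed)
    q = rng.uniform(q_lo, q_hi, size=(n_trials, n))
    top_m = np.sort(q, axis=1)[:, -m:]        # m largest draws per trial
    return top_m.mean()

print(q_max_per_project(1, 100))              # ~4.90, cf. Figure A.1
print(monte_carlo_q_max(1, 100))              # close to 4.90
print(optimal_m(100))                         # 50 for U(-5, 5), cf. Figure A.2
```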
A.2. Relative Performance Ordering
The maximum performance per selected project, $\bar{q}_{\max}(m,n)$, provides an upper bound for the performance per selected project achieved by any aggregation rule. Figure A.3, (a) and (b), shows that the (m, n)-dependence of the performance associated with the different aggregation rules is similar to the (m, n)-dependence of $\bar{q}_{\max}(m,n)$.

Figure A.3.
Notes. The panels show the performance per selected project and the portfolio performance associated with the aggregation rules of individual, voting, averaging, ranking, and delegation.
The performance measure exhibits a pronounced initial increase with n, gradually diminishing in magnitude for larger values of n (Figure A.3, (c) and (d)). In accordance with the results presented in the main text (Section 5.1), ranking and averaging perform well for a small knowledge breadth (Figure A.3, (a) and (c)), whereas delegation (without delegation errors) is closer to the maximum performance for large values of knowledge breadth (Figure A.3, (b) and (d)). The relative performance ordering of aggregation rules is consistent with the results reported in the main text.
A.3. Relative Performance with Very Small Budgets
In Section 5.1, we studied the portfolio performance of different aggregation rules for m = 10, 30. In Figure A.4, we compare the portfolio performance of all the aggregation rules considered for smaller budgets with m = 1, 3. The relative positioning of the aggregation rules in these two cases aligns with the case where m = 10. However, for m = 1 and intermediate knowledge breadths, there are no discernible performance differences between ranking and error-free delegation.

Figure A.4.
Notes. The panels show the total performance of project portfolios selected by the aggregation rules of individual, voting, averaging, ranking, and delegation for small budgets (m = 1, 3).
Appendix B. Performance Sensitivity to Project Distributions
In addition to the uniform quality distribution discussed in the main text, we explore the impact of variations in the quality distribution on portfolio performance. We examine two additional quality distributions: (i) a truncated normal distribution with a mean of zero and unit variance and (ii) a power-law distribution with an exponent of .
Both additional distributions have support $[q_-, q_+]$, and for our analysis, we set $q_- = -5$ and $q_+ = 5$, consistent with the base-case analysis in the main text.
In contrast to a uniform quality distribution, in which projects occur with equal probability regardless of their quality, the truncated normal distribution leads to fewer occurrences of projects with large negative or positive qualities. Projects with qualities close to zero have higher probabilities of occurrence under this distribution.

For the power-law distribution we consider, approximately 70% of the projects will, on average, have negative quality. Moreover, only about 5% of the project qualities will exceed a value of four. This distribution represents scenarios in which only a relatively small number of projects are associated with relatively large positive qualities.
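As an illustration of how such a bounded quality distribution can be simulated (a minimal sketch assuming the support [-5, 5] stated above; the use of scipy.stats.truncnorm and the variable names are our illustrative choices rather than our published model code), one may draw project qualities from the truncated standard normal as follows. An analogous draw for the power-law distribution would additionally require its exponent.

```python
# Minimal sketch: draw project qualities from a standard normal truncated to
# the assumed support [-5, 5]; most probability mass lies near zero.
import numpy as np
from scipy.stats import truncnorm

q_lo, q_hi = -5.0, 5.0                              # assumed support (base case)
a, b = (q_lo - 0.0) / 1.0, (q_hi - 0.0) / 1.0       # standardized truncation bounds
qualities = truncnorm.rvs(a, b, loc=0.0, scale=1.0, size=10_000, random_state=42)

print(qualities.min(), qualities.max())             # stays within [-5, 5]
print(np.mean(np.abs(qualities) < 1.0))             # roughly two thirds of projects lie near zero
```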
For the two quality distributions under consideration, Figure B.1 charts the performance of aggregation rules for n = 100 projects and m = 10, 30 selected projects as a function of knowledge breadth. Both distributions encompass fewer high-quality projects than the uniform distribution analyzed in the main text, resulting in lower overall portfolio performance. The differences in portfolio performance shown there are consistent with the findings reported in the main text.

Figure B.1.
Notes. The panels show the total performance of project portfolios selected by the aggregation rules of individual, voting, averaging, ranking, and delegation for the truncated normal and power-law quality distributions.
Voting performs substantially better in the simulations with truncated normal distributions centered on zero than in simulations with uniform project-quality distributions. This is because of the many projects with near-zero quality. Whereas uniform distributions favor decision rules that detect relative quality differences between projects, narrow zero-centered normal distributions predominantly require detection of whether a project has a positive value. Voting’s coarseness more easily achieves the latter. The effectiveness of voting in portfolio selection from normally distributed projects thus comes to resemble its effectiveness in approving uniformly distributed projects in isolation (Csaszar and Eggers 2013).
1 Oraiopoulos and Kavadias (2020) also model the isolated approval of uncertain projects. They examine the effect of preference diversity on the performance of majority voting. In our model, we account for more aggregation rules, but exclude strategic behavior.
2 Firms may want to maximize returns at some level of risk. For example, financial portfolios often contain assets with potentially suboptimal return expectations to diversify sources of risk. Our present work, however, does not require the additional consideration of hedging goals. The payoffs from projects in our model are independent of each other and none is structurally more at risk than others. Relaxing these constraints would require arbitration among multiple goals (Faliszewski et al. 2017), a phenomenon worthy of further empirical research on preferences.
3 Our design choice of domain-specific expertise mirrors Hotelling models, in which actors have different distances to a focal point (Hotelling 1929, Novshek 1980). We refer the reader to Adner et al. (2014) for a corresponding review. Alternatives to the Hotelling approach include belief-updating models, in which decision makers share identical priors but receive different signals (Li et al. 2001, Oraiopoulos and Kavadias 2020) that together result in project assessments. This approach produces judgment-specific, rather than expert-specific, variation in precision (Einhorn et al. 1977, Hogarth 1978). Alternatively, one could conceive of expertise as a vector (Csaszar and Levinthal 2016). For instance, one dimension of expertise may pertain to environmental sustainability aspects and another to mechanical design aspects of project value. Multidimensional representations might reflect the microfoundations of the decision-making challenge in more detail—yet we do not expect the replacement of the Hotelling expedience with greater knowledge dimensionality to materially affect aggregation rules’ efficacy in dealing with judgment imprecision. We would welcome future research on this topic.
4 Our selection rules could additionally impose a project-quality threshold. For example, few executives would suggest committing to projects that they expect to yield negative payoffs. Because the parameterization of our main model effectively prevents such projects from being among the top m (see Section 5.1 and Appendix B), we chose to minimize rule complexity. Future work may adopt hurdle rates as required.
5 Instead of delegating projects to different experts, organizations may consider assigning the responsibility to a single portfolio expert. In this scenario, all projects would be assessed by the agent whose expertise minimizes the overall uncertainty. The organization then ranks projects based on the expert’s quality perceptions and selects the top m projects. The expertise that minimizes the overall uncertainty is equal to the mean type. The approach is thus equivalent in expectation to the individual rule stated previously.
6 The model of Csaszar and Eggers (2013) is not multicandidate voting because decision makers never consider projects concurrently. They rather (dis)approve each project in isolation. The model of Csaszar (2018), with crowds voting for one of two projects, is closer to a multicandidate setting.
7 Although not in the realm of uncertain innovation projects, a public example application of the ranking rule is the Aggregate Ranking of Top Universities (https://research.unsw.edu.au/artu/methodology). In it, the University of New South Wales aggregates the preference lists of three agents: Times Higher Education, Quacquarelli Symonds, and ShanghaiRanking Consultancy. They each form their quality perceptions for hundreds of universities based on a list of different criteria. The rank that agents assign to a university automatically results from these scores. The Aggregate Ranking of Top Universities could be used to create a portfolio of m best universities.
8 The discrimination limitation of voting might be partially remedied by asking agents to approve m projects only. With small budgets and large choice sets, such m-approval voting (Elkind et al. 2017) limits the number of projects that are sanctioned by all agents, providing more discrimination in the top section of the aggregated preference list of projects and less at the bottom.
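A minimal sketch of such m-approval aggregation (illustrative only; the function and the example scores are hypothetical and not the voting rule parameterization used in our simulations): each agent approves only the m projects they rate highest, and the organization funds the m projects receiving the most approvals.

```python
# Illustrative m-approval voting: each agent approves their top m projects;
# the organization selects the m projects with the most approvals.
# Ties are broken by insertion order here.
from collections import Counter

def m_approval_select(scores_by_agent, m):
    """scores_by_agent: list of dicts mapping project id -> an agent's score."""
    approvals = Counter()
    for scores in scores_by_agent:
        top_m = sorted(scores, key=scores.get, reverse=True)[:m]
        approvals.update(top_m)
    return [project for project, _ in approvals.most_common(m)]

scores_by_agent = [{"A": 2.1, "B": 0.4, "C": -1.0, "D": 1.7},
                   {"A": 1.2, "B": 1.9, "C": 0.3, "D": -0.8},
                   {"A": 0.9, "B": -0.2, "C": 2.2, "D": 1.1}]
print(m_approval_select(scores_by_agent, m=2))   # ['A', 'D']
```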
9 Open-science initiatives may worry less about innovation appropriation (Arora et al. 2016, Altman et al. 2022) and could thus attract larger numbers of assessors from outside the organization than IBM managed from within. EteRNA (https://eternagame.org), for example, enlists outsiders to select the most promising molecule designs for resource-intensive testing. Governments are another type of organization that could tap a greater pool of decision makers for selecting projects in participative-budgeting exercises, such as through the Consul project (https://consulproject.org).
10 The members of the Academy of Motion Picture Arts and Sciences, for example, rank between 5 and 10 candidates to collectively select the Best Picture (Economist 2015). In the laboratory, participants predicting league tables appear able to rank 30-odd sports teams without apparent difficulty (Lee et al. 2014). Other laboratory participants appeared to struggle with the comparisons necessary for the ranking of eight Kickstarter project candidates (Cui et al. 2019).
11 Practical examples of delegating a subset of candidates to assessors based on perceived expertise include the selection process for the shortlist of the Academy of Management’s Technology and Innovation Management division Best Dissertation award. Also documented in the literature is the ranking-based selection of treatments by groups of orthodontists before a final selection is made (see Li et al. 2022). Moods of Norway had employees rank products of a category with which they were familiar to estimate future demand for apparel (Salikhov and Rudi 2021). Geographically separate juries also select the semifinalists for the Eurovision Song Contest through ranking; each jury fills a quota (Ginsburgh and Moreno-Ternero 2023).
12 If the value of n/c is not an integer, it is rounded to the nearest integer. The same goes for m/c.
13 In the project and portfolio literatures, rankings already feature heavily: They are outputs of organizational prioritization efforts (Kornish and Hutchison-Krupat 2017, Schilling 2023). Our work underlines that rankings also have a place as inputs to those efforts.
14 See https://pbstanford.org.
15 Order statistics have also been used by Einhorn et al. (1977) to mathematically characterize an aggregation rule in which N individuals with varying levels of expertise evaluate a single project (n = 1).
16 Real-world organizations cannot confidently gauge the shape of the distribution of payoffs from the innovation projects proposed to them, and they consequently determine the size of their budget more pragmatically (Stein 2003, Sengul et al. 2019). Additionally, Appendix B shows how portfolio performance depends on the shape of the underlying quality distribution.
References
- (2020) Selection in R&D project portfolio management. Working paper, Helsinki University of Technology, Finland.
- (2014) Positioning on a multiattribute landscape. Management Sci. 60(11):2794–2815.
- (2022) The translucent hand of managed ecosystems: Engaging communities for value creation and capture. Acad. Management Ann. 16(1):70–101.
- (2007) Project portfolio selection and management. Morris PWG, Pinto JK, eds. The Wiley Guide to Project, Program & Portfolio Management (Wiley, New York), 94–112.
- (2016) The paradox of openness revisited: Collaborative innovation and patenting by UK innovators. Res. Policy 45(7):1352–1361.
- (2023) Championing the flawed gems: In search of contrarian opportunities through minority ruling. Working paper, Amsterdam University, Amsterdam.
- (2017) Distilling the wisdom of crowds: Prediction markets vs. prediction polls. Management Sci. 63(3):691–706.
- (1996) Information aggregation, rationality, and the Condorcet jury theorem. Amer. Political Sci. Rev. 90(1):34–45.
- (2015) Voting systems and strategic manipulation: An experimental study. J. Theoretical Politics 27(1):58–85.
- (2022) The crowd classification problem: Social dynamics of binary-choice accuracy. Management Sci. 68(5):3949–3965.
- (2021) Skill, power and marginal contribution in committees. J. Theoretical Politics 33(2):225–235.
- (2021) Preference elicitation for participatory budgeting. Management Sci. 67(5):2813–2827.
- (2020) Comparing election methods where each voter ranks only few candidates. Proc. 34th AAAI Conf. Artificial Intelligence (AAAI Press, Washington, DC), 2218–2225.
- (2022) Good to go first? Position effects in expert evaluation of early-stage ventures. Management Sci. 68(1):300–315.
- (2023) Subset selection based on multiple rankings in the presence of bias: Effectiveness of fairness constraints for multiwinner voting score functions. Preprint, submitted June 16, https://arxiv.org/abs/2306.09835.
- (2022) Examining the limits of the Condorcet jury theorem: Tradeoffs in hierarchical information aggregation systems. Collective Intelligence 1(2).
- (2016) Introduction to computational social choice. Brandt F, Conitzer V, Endriss U, Lang J, Procaccia AD, eds. Handbook of Computational Social Choice (Cambridge University Press, Cambridge, UK).
- (2019) Product and innovation portfolio management. Oxford Research Encyclopedia of Business and Management (Oxford University Press, Oxford, UK).
- (2024) Introduction to the special issue on judgment and decision research on the wisdom of the crowds. Decision (Washington, DC) 11(1):1.
- (2023) Choose your moments: Peer review and scientific risk taking. NBER Working Paper No. 31409, National Bureau of Economic Research, Cambridge, MA.
- (2021) Does vote trading improve welfare? Annu. Rev. Econom. 13:57–86.
- (2008) A theoretical framework for managing the new product development portfolio: When and how to use strategic buckets. Management Sci. 54(5):907–921.
- (2006) Open innovation: A new paradigm for understanding industrial innovation. Chesbrough H, Vanhaverbeke W, West J, eds. Open Innovation: Researching a New Paradigm (Oxford University Press, Oxford, UK).
- (2022) Context and aggregation: An experimental study of bias and discrimination in organizational decisions. Organ. Sci. 34:2163–2181.
- (2022) The power of rank information. J. Personality Soc. Psych. 122(6):983.
- (1993) Screening new products for potential winners. Long Range Planning 26(6):74–81.
- (2017) Evaluating novelty: The role of panels in the selection of R&D projects. Acad. Management J. 60(2):433–460.
- (2018) Limits to the wisdom of the crowd in idea selection. Adv. Strategic Management 40(Organization Design):275–297.
- (2013) Organizational decision making: An information aggregation view. Management Sci. 59(10):2257–2277.
- (2018) Individual and organizational antecedents of strategic foresight: A representational approach. Strategy Sci. 3(3):513–532.
- (2016) Mental representation and the discovery of new strategies. Strategic Management J. 37(10):2031–2049.
- (2019) Scoring vs. ranking: An experimental study of idea evaluation processes. Production Oper. Management 28(1):176–188.
- (2004) Order Statistics (John Wiley & Sons, Hoboken, NJ).
- (2014) When is a crowd wise? Decision (Washington, DC) 1(2):79.
- (2023) Randomisation as a tool for organisational decision-making: A debatable or debilitating proposition? Industry and Innovation 30(10):1275–1293.
- Economist (2015) The Economist explains: How Oscar winners are decided. The Economist (January 21), https://www.economist.com/the-economist-explains/2015/01/21/how-oscar-winners-are-decided.
- (2019) Quadratic voting with multiple alternatives. Working paper, Michigan State University, East Lansing, MI.
- (1977) Quality of group judgment. Psych. Bull. 84(1):158.
- (2022) The knowledge-incentive tradeoff: Understanding the relationship between research and development decentralization and innovation. Strategic Management J. 43(12):2478–2509.
- (2021) Evaluating proposals in innovation contests: Exploring negative scoring spillovers in the absence of a strict evaluation sequence. Res. Policy 50(4):1–13.
- (2017) Properties of multiwinner voting rules. Soc. Choice Welfare 48(3):599–632.
- (2017) Multiwinner rules on paths from k-Borda to Chamberlin-Courant. Sierra C, ed. Proc. 26th Internat. Joint Conf. Artificial Intelligence (IJCAI), 192–198.
- (2014) Idea assessment via enterprise crowdfunding: An empirical analysis of decision-making styles. Working paper, Karlsruhe Institute of Technology.
- (2017) Models of social influence: Toward the next frontiers. J. Artificial Soc. Soc. Simulation 20(4).
- (1998) Governance and uncertainty: The trade-off between administrative control and commitment. Strategic Management J. 19(11):1007–1028.
- (2021) Behavioral innovation and corporate renewal. Strategic Management Rev. 2(2):285–322.
- (2024) Rank, pack, or approve: Voting methods in participatory budgeting. Preprint, submitted January 24, https://arxiv.org/abs/2401.12423.
- (2019) Computational Statistics (Springer, Berlin).
- (2023) The Eurovision Song Contest: Voting rules, biases and rationality. J. Cultural Econom. 47(2):247–277.
- (2019) Knapsack voting for participatory budgeting. ACM Trans. Econom. Comput. 7(2):1–27.
- (2005) The robust beauty of majority rules in group decisions. Psych. Rev. 112(2):494.
- (2023) Costs of collective wisdom: How resources influence information aggregation in organizational decision making. Strategic Organ. 21(2):283–310.
- (1978) A note on aggregating opinions. Organ. Behav. Human Performance 21(1):40–46.
- (1929) Stability in competition. Econom. J. (London) 39(153):41–57.
- (2022) Information aggregation and collective intelligence beyond the wisdom of crowds. Nature Rev. Psych. 1(6):345–357.
- (2020) A framework for managing innovation. TutORials in Operations Research, 202–228.
- (1962) Mathematics of Statistics, Pt. 1, 3rd ed. (D. Van Nostrand Company, Princeton, NJ).
- (2017) The influence of hierarchy on idea generation and selection in the innovation process. Organ. Sci. 28(4):653–669.
- (2017) Crowd wisdom relies on agents’ ability in small groups with a voting aggregation rule. Management Sci. 63(3):818–828.
- (2007) Resource allocation decisions. Edwards W, Miles Jr RF, von Winterfeldt, eds. Advances in Decision Analysis: From Foundations to Applications (Cambridge University Press, Cambridge, UK), 400–418.
- (2018) Risk-type preference shifts in response to performance feedback. Strategic Organ. 16(2):141–166.
- (2022) Motivating innovation: Tunnels vs funnels. Strategy Sci. 7(4):300–316.
- (2015) Real options logic revisited: The performance effects of alternative resource allocation regimes. Acad. Management J. 58(1):221–241.
- (2013) Becoming aware of the unknown: Decision making during the implementation of a strategic initiative. Organ. Sci. 24(1):133–153.
- (2020) Stage-gate escalation. Strategy Sci. 5(4):311–329.
- (2021) Optionality and selectiveness in innovation. Acad. Management Discovery 7(3):328–342.
- (2022) Sample decisions with description and experience. Judgment Decision Making 17(5):1146–1175.
- (2023) Ambiguity aversion and the degree of ambiguity. J. Risk Uncertainty 67(3):299–324.
- (2007) Two faces of search: Alternative generation and alternative evaluation. Organ. Sci. 18(1):39–54.
- (2017) Research on idea generation and selection: Implications for management of technology. Production Oper. Management 26(4):633–651.
- (2023) Missed chances and unfulfilled hopes: Why do firms make errors in evaluating technological opportunities? Strategic Management J. 44(13):3067–3097.
- (2012) Optimal forecasting groups. Management Sci. 58(4):805–810.
- (2022) Conservatism gets funded? A field experiment on the role of negative information in novel project evaluation. Management Sci. 68(6):4478–4495.
- (2014) A cognitive model for aggregating people’s rankings. PLoS One 9(5):e96431.
- (2017) Resource allocation and firm boundaries. J. Management 43(8):2580–2587.
- (2001) Conflicts and common interests in committees. Amer. Econom. Rev. 91(5):1478–1497.
- (2022) Bayesian analysis of rank data with covariates and heterogeneous rankers. Statist. Sci. 37(1):1–23.
- (2021) Judgment aggregation in creative production: Evidence from the movie industry. Management Sci. 67(10):6358–6377.
- (2023) Catching outliers: Committee voting and the limits of consensus when financing innovation. Working paper, Harvard Business School, Boston.
- (2014) The wisdom of select crowds. J. Personality Soc. Psych. 107(2):276.
- (2022) Democratic vs. elite governance for project selection decisions in executive committees. Eur. J. Oper. Res. 297(3):1126–1138.
- (2005) Decision processes, agency problems, and information: An economic analysis of capital budgeting procedures. Rev. Financial Stud. 18(1):301–325.
- (2009) Relatively fast! Efficiency advantages of comparative thinking. J. Experiment. Psych. General 138(1):1.
- (2007) On the theory of strategic voting. Rev. Econom. Stud. 74(1):255–281.
- (2023) The good, the bad, and the average: Comparing individual and collective effects of voting and averaging in organizational decision making. Working paper, ETH Zurich, Zürich, Switzerland.
- (1982) Optimal decision rules in uncertain dichotomous choice situations. Internat. Econom. Rev. 23(2):289–297.
- (2017) Collective decision making and jury theorems. Parisi F, ed. The Oxford Handbook of Law and Economics: Volume 1 Methodology and Concepts (Oxford University Press, Oxford, UK), 494–516.
- (1980) Equilibrium in simple spatial (or differentiated product) models. J. Econom. Theory 22(2):313–326.
- (2020) Is diversity (un)biased? Project selection decisions in executive committees. Manufacturing Service Oper. Management 22(5):906–924.
- (2019) Voting methods. Zalta EN, ed. The Stanford Encyclopedia of Philosophy (Metaphysics Research Laboratory, Stanford University, Stanford, CA).
- (2018) Single-peakedness and total unimodularity: New polynomial-time algorithms for multi-winner elections. McIlraith SA, Weinberger KQ, eds. Proc. AAAI Conf. Artificial Intelligence, vol. 32 (AAAI Press, Washington, DC), 1169–1176.
- (2023) The dual function of organizational structure: Aggregating and shaping individuals’ votes. Organ. Sci. 34(5):1914–1937.
- (2012) A maximum likelihood approach for selecting sets of alternatives. de Freitas N, Murphy KP, eds. Proc. 28th Conf. Uncertainty Artificial Intelligence (AUAI Press, Arlington, VA), 695–704.
- (2018) The Microstructure of Organizations (Oxford University Press, Oxford, UK).
- (2023) Epistemic selection of costly alternatives: The case of participatory budgeting. Preprint, submitted September 4, https://arxiv.org/abs/2304.10940.
- (1988) Committees, hierarchies and polyarchies. Econom. J. (London) 98(391):451–470.
- (2021) Forecasting demand for new products: Combining subjective rankings with sales data. Preprint, submitted February 18, https://dx.doi.org/10.2139/ssrn.3780420.
- (2010) Can there ever be too many options? A meta-analytic review of choice overload. J. Consumer Res. 37(3):409–425.
- (2023) Strategic Management of Technological Innovation, 7th ed. (McGraw-Hill, New York).
- (2020) Entrepreneurial uncertainty and expert evaluation: An empirical analysis. Management Sci. 66(3):1278–1299.
- (2019) The allocation of capital within firms. Acad. Management Ann. 13(1):43–83.
- (2021) Selection regimes and selection errors. Working paper, ESMT Berlin, Germany.
- (1998) How SmithKline Beecham makes better resource-allocation decisions. Harvard Bus. Rev. 76(2):45–53.
- (2022) Managing innovation portfolios: From project selection to portfolio design. Production Oper. Management 31(12):4572–4588.
- (2020) Participatory budgeting with cumulative votes. Preprint, submitted September 6, https://arxiv.org/abs/2009.02690.
- (2020) How do you search for the best alternative? Experimental evidence on search strategies to solve complex problems. Management Sci. 66(3):1395–1420.
- (2021) Open Strategy: Mastering Disruption from Outside the C-Suite (MIT Press, Boston).
- (2003) Agency, information and corporate investment. Constantinides GM, Harris M, Stulz RM, eds. Handbook of the Economics of Finance (Elsevier, New York), 111–165.
- (2022) Integration and appropriability: A study of process and product components within a firm’s innovation portfolio. Strategic Management J. 43(6):1075–1109.
- (2000) Capabilities, cognition, and inertia: Evidence from digital imaging. Strategic Management J. 21(10–11):1147–1161.
- (2023) Kodak’s surprisingly long journey toward strategic renewal: A half century of exploring digital transformation that culminated in failure. Preprint, submitted March 4, https://dx.doi.org/10.2139/ssrn.4373683.
- (1992) Revolutionizing Product Development: Quantum Leaps in Speed, Efficiency, and Quality (Simon and Schuster, New York).
- (2022) Group decision making under uncertain preferences: Powered by AI, empowered by AI. Ann. New York Acad. Sci. 1511(1):22–39.
- (1988) Condorcet’s theory of voting. Amer. Political Sci. Rev. 82(4):1231–1244.
Lucas Böttcher is assistant professor of computational science at the Frankfurt School of Finance and Management and a courtesy professor in the Laboratory for Systems Medicine at the University of Florida. His research interests include applied mathematics, computational biology, statistical mechanics, and machine learning.
Ronald Klingebiel is professor of strategy at the Frankfurt School of Finance and Management in Germany. He studies the dynamics of strategic decision making under uncertainty, focusing on resource allocation in innovation.