
When Artificial Intelligence Does Strategy: Learning, Good Times, Lock-in, and Human-Driven Strategic Renewal

Nataliia Neshenko
[email protected]
https://orcid.org/0000-0002-8404-3475
Department of Information Technology and Operations Management, Florida Atlantic University, Boca Raton, Florida 33431

Michael D. Ryall (Corresponding Author)
[email protected]
https://orcid.org/0000-0002-7116-4606
The Madden Center for Value Creation, Florida Atlantic University, Boca Raton, Florida 33431

Published Online: 12 Feb 2026
https://doi.org/10.1287/stsc.2025.0448

Abstract

What happens in industries where firms delegate strategy choices to private artificial intelligence (AI) agents? Would markets spiral into hypercompetition or settle into a comfortable status quo? We develop a formal model in which AI agents consider large business-model catalogs, predict performance, select, and learn from realized outcomes. Our representation accommodates existing AI paradigms, allowing for substantial increases in scale and computational capacity. We show that market dynamics converge to a self-confirming equilibrium; along the realized path, AI agents become well calibrated, and their choices become subjectively optimal—even though objectively superior business models may remain unexplored. This convergence can indeed sustain high profits. However, it also produces strategic lock-in; novel business-model implementations become rare long before catalogs are exhausted. This creates a distinct role for humans. A single episode of human-driven frame expansion—introducing a genuinely new business model to a catalog—can disrupt the AI-induced equilibrium and initiate strategic renewal. Yet, the ability to do so does not imply that it will be done. When the prevailing equilibrium is sufficiently lucrative, managers rationally refrain from triggering renewed learning. Our results clarify where humans still matter in AI-enabled strategy: deciding when to change the frame and not merely optimizing within it.

History: Accepted for the Special Issue: Can AI Do Strategy?

1. Introduction

Scholarship at the intersection of artificial intelligence (AI) and business strategy has moved quickly from documenting application-level performance gains (see, e.g., the review by Enholm et al. 2022) to more recent work on how AI reshapes competition, the architecture of strategic decision making, and the design of hybrid human-AI organizations.¹ A recurring theme in this recent literature is that because AI technologies diffuse quickly, persistent performance advantages built on AI-based applications must depend less on the technology per se and more on embedding it in complementary resources (data, routines, governance, etc.) that thereby make the resulting capabilities firm specific (e.g., Mikalef et al. 2020, Krakowski et al. 2023, Kemp 2024) and hence, sustainable. Much of this work views the “human-in-the-loop” question as an issue of automation-augmentation design (see, e.g., the reviews by Joshi 2025 and Nikzat 2025). A parallel stream studies large language models (LLMs) as tools for generating and evaluating strategic alternatives, generally finding that performance is sensitive to the setting, evaluation architecture, and division of labor between people and machines (e.g., Csaszar et al. 2024, Doshi et al. 2025, López-Solís et al. 2025). Work in the emergent theory-based view emphasizes that current LLMs optimize predictive plausibility rather than causal explanation and therefore, do not substitute for theory-based reframing (e.g., Felin and Zenger 2017, Bender et al. 2021, Felin and Holweg 2024).

In this study, we approach the AI/strategy issue from a substantially different perspective. Rather than considering AI functionality in the role of a constituent part of a single firm’s strategy-making process (choice assessment, resource enhancement, coordination mechanism, or business-model modifier), we examine a setting in which the role of AI is promoted to comprehensive strategy generation and selection by all firms within a market. We ask the following question: What would happen in an industry in which the rivals delegated strategic decision making to individual AI agents² operating under existing technological paradigms? This is not as far-fetched as it once may have seemed. Across industries, firms are already delegating components of strategy formulation to AI systems that integrate language models, optimization algorithms, and automated decision making (Belhadi et al. 2021, Haefner et al. 2021). If ours is not already a feasible scenario, increased computing power, refined algorithms, and expanded data availability may soon make it so.

This perspective raises a number of interesting questions, the following of which we investigate.

How would widespread AI adoption for the generation of business strategies affect firm profits? Would firms experience a downward spiral in profits because of AI-driven hypercompetition, with AI agents exploring the competitive landscape at superhuman speeds and rapidly squeezing out all of the valuable positions? Would this be the analog of modern financial markets in which large investment firms presently use AI to compete at speeds of microseconds or even nanoseconds to make arbitrage profits measured in fractions of a cent per trade? Or, on the opposite side of the spectrum, would AI agents generate a profitability boon by discovering “blue oceans” free of competitive imitators and/or adopting sophisticated collusive behaviors designed to foil detection and maximize incumbent profits?
Closely related to the profitability question is the question of strategic renewal. Does AI have the capacity to generate breakthrough strategies based upon genuinely novel business models independent of human involvement? Recent experiments in which AI functions as primary innovator present some promising results (Arias-Pérez et al. 2025). However, it is worth pointing out that in all of the studies cited, human involvement is still required in some form or another. Thus, especially in the complex problem domain of business strategy, it remains an open question whether strategic innovation can be fully automated. Nevertheless, we tip the scales toward AI and begin from the premise that AI technologies are capable of independently generating novel strategies in a fashion consistent with existing paradigms. Even with this generous stipulation, is there something in the nature of these technologies that implies that—at some point—strategic renewal inevitably dies?
Finally, thinking about scenarios in which strategy making can be entirely offloaded to AI raises the question of what productive role, if any, humans might still play. If AI has the ability to discover novel, highly profitable “blue ocean” strategies and continue to do so indefinitely, then the answer to that question is “not much.” However, if AI strategy making leads to vicious competition or an eventual stasis in strategy invention, then human intervention may have the potential to add a lot. Yet, having the potential to act does not imply having the incentive to do so. If delegation of business strategies to AI leads to high profit outcomes under status quo strategy lock-in, would humans choose to disrupt it if they could?

1.1. Our Methodology

We investigate these questions by building a high-level mathematical representation of AI technologies that is faithful to their present operating paradigm. Our formalism models how contemporary AI decision-making systems would function when tasked with strategic business planning in competitive environments.³ Our AI agents “do strategy” by dynamically selecting business models on behalf of the firms that own them from a set of such models that can be synthesized from their training data. Our unit of analysis is the business model. Implicitly, available to the AI is a large database containing information about the constituent parts of business models (i.e., resources that can be acquired, the feasible portfolios into which they can be organized, and the activity systems that such portfolios can support). The idea is that the set of candidate business models presented to the AI during its selection process includes all of the potential business models that can be synthesized from these constituent primitives.

We place three real-world-grounded features at the heart of our model. First, learning proceeds within a specific representational scope and knowledge base, which we refer to as the AI’s awareness frame. At a minimum, this includes (i) a catalog of candidate business models, (ii) a catalog of consequence categories that the AI can record, and (iii) an internal forecasting model (with associated priors) that maps business-model choices into predicted distributions over those consequences. The awareness frame can be thought of as being constituted by both empirically observed and expertly formulated business models in addition to, as just indicated, all of the additional models that can be synthesized from them using their constituent elements. We permit the size of this set to be in the billions or trillions (or more).

Second, our AI agents assess their choices across a finite time horizon. For example, whether because data are nonstationary, inference is computationally constrained, or near-term performance metrics are binding, recommendation engines, planning modules, and competitive-move optimizers in the real world rely on rolling objective functions computed over bounded horizons. The idea is that the AI predicts explicit consequences (or more precisely, the likelihoods of explicit consequences) over a time horizon consistent with its finite storage capacity.

Third, our AI agents engage in on-path learning from realized outcomes. In many digital settings, this feedback is rapid and at massive scale. In each period, an agent observes its actualized business model and the resulting performance outcomes (annual cash flow, market share, observed competitor behavior, etc.), using these data to update its beliefs and strategy. Crucially, agents observe outcomes only from chosen business models; they do not observe counterfactual outcomes from unchosen alternatives. Importantly, we stipulate that each agent’s optimization objective is the maximization of its firm’s market value (i.e., the discounted present value of free cash flows).
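
As a minimal sketch of the objective just described, the snippet below computes a discounted present value over a bounded horizon and selects the catalog entry with the highest forecast value. The function names, the discount factor, the convention that the first period is undiscounted, and all cash-flow numbers are illustrative assumptions rather than objects taken from the formal model.

```python
def finite_horizon_value(forecast_cash_flows, discount, horizon):
    """Discounted present value of the first `horizon` forecast cash flows.

    Assumption: the first period is undiscounted (exponent 0).
    """
    return sum(
        cash * discount ** t
        for t, cash in enumerate(forecast_cash_flows[:horizon])
    )


def select_business_model(forecasts, discount, horizon):
    """Return the catalog entry with the highest finite-horizon value."""
    return max(
        forecasts,
        key=lambda name: finite_horizon_value(forecasts[name], discount, horizon),
    )


# Hypothetical per-period cash-flow forecasts for two catalog entries.
forecasts = {
    "model_a": [100.0, 100.0, 100.0],
    "model_b": [0.0, 150.0, 500.0],
}
short_horizon_choice = select_business_model(forecasts, discount=0.5, horizon=2)
longer_horizon_choice = select_business_model(forecasts, discount=0.5, horizon=3)
```

With these made-up numbers, the two-period horizon favors `model_a`, whereas the three-period horizon favors `model_b`, which illustrates how the bound on the horizon can change the selected business model.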

Our theoretical agents reflect the key features of contemporary AI systems for strategic decision making; agents synthesize candidate business models from databases of primitives (resources, portfolios, and activity systems), evaluate potential consequences using learned models of market dynamics, and select strategies through rolling finite-horizon optimization. Business-model choices emerge from a decision kernel at the core of each agent. In our model, this decision kernel operates on an awareness frame—a bounded representation of available business models, market conditions, and competitor behaviors—and updates exclusively from on-path feedback: observed outcomes of deployed strategies.

Our representation is deliberately generous to the AI systems presently available in the real world. We allow arbitrarily long (although finite) planning horizons, permit awareness frames orders of magnitude larger than current computational constraints, include massive data and processing capacity, and employ an adaptive learning process that is dynamically consistent with all of these features. Our construction gives AI agents every advantage that their training, scale, and speed could plausibly confer in real-world deployment. If we accept that the agents in our model represent existing technologies at their theoretical peak performance, then we have a useful tool for investigating what happens when firms in a market delegate strategy making to such AI agents.

1.2. Three Regimes That We Consider

The paper analyzes three increasingly realistic settings. In the baseline setting, we study a market in which all firms delegate business-model choice to AI agents endowed with business model/consequence awareness frames that update beliefs from on-path feedback. Next, we allow the market to “surprise” an AI by generating consequences that were not previously represented in its awareness frame, and we study how learning and behavior reconstitute after such frame expansions. Finally, we introduce a human manager who has the capacity to expand the awareness frame intentionally by admitting a new business model or a new consequence category into the AI’s decision process.

1.3. Summary of Our Findings

Our main proposition shows that when firms off-load strategy making to complete-frame AI agents (Regime I), the market eventually converges to an $ε$-self-confirming equilibrium ($ε$-SCE). What characterizes such equilibria are (1) that each firm’s strategy is optimal with respect to its predictions of the future and (2) that those predictions are nearly perfect.⁴ We show that this result holds even when agents experience out-of-frame consequences (Regime II). This is a remarkable finding; AI agents operating according to presently available paradigms (although certainly augmented by improved hardware, augmented data, etc.) learn to predict the future with near perfection and to make optimal strategic decisions given those predictions.

If the AI agent is predicting the future nearly perfectly and truly optimizing against those predictions, is there any cause for concern? Keep in mind that an $ε$-SCE already implies an optimal level of strategic exploration; experimentation stops only when it is optimal to do so.⁵ The potential problem is that AI agent predictions are disciplined only along the realized path. Initially, many within-frame business models may be incorrectly evaluated and as a result, remain permanently unexplored. The agent may lock into a limited repertoire of business models that is objectively suboptimal.
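
The lock-in logic can be illustrated with a deliberately stripped-down sketch. It assumes deterministic profits, a purely greedy selection rule (a simplification of the optimal exploration implied by an $ε$-SCE), and a simple partial-adjustment belief update; the names `run_on_path_learning`, `true_profit`, and `prior_belief` and all numbers are hypothetical.

```python
def run_on_path_learning(true_profit, prior_belief, periods=50, step=0.5):
    """Greedy agent that updates beliefs only for the model it deploys."""
    belief = dict(prior_belief)
    history = []
    for _ in range(periods):
        # Deploy the business model with the highest predicted profit.
        choice = max(belief, key=belief.get)
        # Only the deployed model's outcome is observed (no counterfactuals).
        realized = true_profit[choice]
        # Move the belief about the deployed model toward what was observed.
        belief[choice] += step * (realized - belief[choice])
        history.append(choice)
    return belief, history


# Model B is objectively better, but the agent starts out pessimistic about it.
true_profit = {"A": 1.0, "B": 2.0}
prior_belief = {"A": 0.8, "B": 0.5}
belief, history = run_on_path_learning(true_profit, prior_belief)
```

In this toy run, the agent deploys A in every period, its forecast for A converges to the true value, and its mistaken forecast for B is never corrected because B is never tried. Predictions are accurate along the realized path, yet the objectively superior model stays unexplored.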

This naturally leads to the question of human intervention (Regime III). What positive effects, if any, might humans contribute should strategic lock-in occur? Under present AI governance architectures, it is humans who have the ultimate authority to decide what goes into or stays out of an awareness frame. Should managers discover an exogenous business model, they are free to add it and thereby, disrupt the SCE. We assume that humans have the ability to make such discoveries and intervene in the process accordingly, an ability that we interpret as stemming from a unique human capacity to be aware of one’s own unawareness.⁶

We show that the positive effect of such an intervention, were it to happen, is to restart the firm’s process of exploration and strategic renewal, potentially leading to superior performance outcomes for the innovator. However, as we also show, the “were it to happen” qualifier is not trivial. If a firm’s expected performance under an $ε$-SCE is sufficiently high, those humans lose the incentive to force a new round of experimentation. If the AI is predicting the future nearly perfectly and truly optimizing against those predictions, why disrupt it?
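
One way to picture the incentive problem is a simple expected-value comparison, sketched below. This is not the paper's formal condition; the function `manager_expands_frame`, the lump-sum `transition_cost`, and the outcome probabilities and values are all hypothetical assumptions chosen only to show the direction of the effect.

```python
def manager_expands_frame(sce_value, renewal_outcomes, transition_cost):
    """Return True when expected value after frame expansion beats the status quo.

    renewal_outcomes: list of (probability, firm_value) pairs (hypothetical).
    transition_cost: assumed one-off cost of renewed learning (hypothetical).
    """
    total_prob = sum(p for p, _ in renewal_outcomes)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    expected_renewal = sum(p * v for p, v in renewal_outcomes) - transition_cost
    return expected_renewal > sce_value


# Assumed post-expansion prospects: 30% chance of 200, 70% chance of 80.
outcomes = [(0.3, 200.0), (0.7, 80.0)]

modest_status_quo = manager_expands_frame(90.0, outcomes, transition_cost=10.0)
lucrative_status_quo = manager_expands_frame(110.0, outcomes, transition_cost=10.0)
```

Under these assumed numbers, the same frame-expansion opportunity is taken when the status quo is worth 90 and declined when it is worth 110: the more lucrative the prevailing equilibrium, the weaker the incentive to trigger renewed learning.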

1.4. Contributions

We develop a novel model of dynamic competition between firms that delegate their strategy making to AI agents, which operate in accordance with existing technological paradigms. Our main result demonstrates that within finite time, AI agents in such a setting converge to an $ε$-SCE in which they have learned to predict the future nearly perfectly within their awareness frames and then, select optimal strategies with respect to those predictions. During the learning phase, agents are capable of generating truly novel strategies. Once past that phase, however, strategic innovation falters. This is a new insight into the potential limits of AI-driven strategic competition.

Work in the intersection of AI and strategy has considered AI as a constituent element in the firm’s strategy-making process. For example, much of the theoretical discussion about AI in strategy has been about whether AI as a resource can be the source of sustained competitive advantage. Because AI technologies are widely available, they are neither rare nor inimitable, and therefore, they cannot be the source of sustained advantage. Thus, much of the discussion in the literature has been around how to embed AI technologies within resource portfolios in ways that make the portfolio as a whole rare and inimitable. Human-AI complementarities have been explored as a promising source of productive scarcity.

We take a new tack by exploring a world in which the function of AI systems is promoted to the autonomous selection of the entire strategy. At this level, it is not obvious that low levels of profit are implied, even if all of the firms run identical AI agents. As Gans and Ryall (2017) point out, other things equal, firms have strong incentives not to imitate each other. Given the power of AI agents under existing technological paradigms to, in principle, generate distinctive, highly profitable business models, it is altogether possible that AI-driven firms learn $ε$-SCEs in which all enjoy high profits from distinctive strategic positions, with none moving to imitate another even if they could (everyone sails comfortably in their own blue lake).

Although many AI advocates question whether humans will ultimately add anything to functions delegated to an AI agent, we give humans the benefit of the doubt and assume that they can. At the same time, we ask a very relevant question in this context. Even if humans could nudge their AI agents out of a status quo and into a new phase of strategic renewal, would they? The bad news (presumably for advocates on both sides of the human versus AI divide) is that the better AI does at doing its job (achieving high profitability with high reliability), the less incentive either AI agents or humans have to innovate at the strategic level. To the best of our knowledge, this issue has not been raised in any previous work.

Finally, as mentioned above, this paper also adds to an extant stream of work in strategy on self-confirming equilibria. Thus, although we frame our analysis around AI delegation, the SCE concept is not limited to that context. Prior strategy research has shown that purely human organizations can also become trapped in self-confirming patterns; for example, Repenning and Sterman (2002) demonstrate how capability traps arise when managers, learning only from on-path feedback, rationally underinvest in process improvement even when doing so would be objectively superior. Our analysis can, therefore, be read as an AI-grounded complement to this earlier work.

Section 2 builds a nontechnical bridge from present AI systems to the formal objects in the model, and it clarifies how we use “awareness,” “frames,” and the within-frame versus expanded-frame distinction. Section 3 then presents the formal market model. Sections 4 and 5 establish the convergence and incentive results, and Section 6 discusses implications and caveats.

2. Delegating Strategic Reasoning to AI Agents

This section provides a nontechnical bridge between the formal model and how AI systems are currently used in practice. Our goals are to (i) describe a realistic AI agent architecture integrating knowledge retrieval, optimization, and adaptive learning for strategic reasoning and guidance; (ii) define what we mean by an AI agent’s awareness frame in a way that is operational (and therefore, robust to how these systems are deployed); and (iii) make explicit what in our model humans add to the system in the context of present deployment practices. We also wish to clarify the distinction between strategic innovation that can be classified as within frame versus frame expanding.

2.1. The AI Paradigm Underlying Our Model

We are unaware of any AI systems presently deployed as “end-to-end” strategists. Instead, they appear as modules in a pipeline that separates the generation of candidate business options from evaluation and selection. On the generation side, emerging LLM-based tools (often with retrieval) are beginning to assist in mapping a business context into structured proposals: a market posture, a pricing scheme, a channel strategy, a product road map, or a reconfiguration of activities. On the evaluation side, organizations rely on established quantitative systems—forecasting demand, churn, conversion, capacity utilization, or risk—and then, choose among policy alternatives using optimization algorithms.

It is not hard to imagine a time not too far in the future at which advances in hardware, refinements of algorithms, and increased data availability will permit AI agents operating under this existing technological paradigm to provide end-to-end strategic guidance. Such agents will integrate multiple capabilities (synthesis, evaluation, and selection), regardless of their underlying architectural implementation. These capabilities map directly onto the generator, evaluator, and selector roles in our model. We identify the business model as the object of strategic choice, where we define a “business model” as a description of an activity system (Porter 1996) and the portfolio of resources required to support it (Wernerfelt 1984, Barney 1989). Thus, the generator proposes a set of candidate business models, the evaluator predicts their consequences, and the selector chooses an optimal candidate to implement.

Two features of present deployments matter for our formalization. First, these modules operate over bounded catalogs. In practice, the generator is constrained by admissible action templates exposed to it. In our context, this would include resource manifests (tangible and intangible, with descriptions of their relevant features), libraries of feasible portfolio relationships between resources, and registries of activities that can be supported by different resource configurations. The evaluator is constrained by the outcome labels and explicated performance measures that the organization tracks (revenue, margin, retention, growth, quality, safety, regulatory risk, etc.). Second, the selector optimizes over finite horizons; this is because our AI agents are assumed to make explicit predictions about the future consequences of business-model choices period by period and store them both as the concrete specifics of what to expect and as the feedback benchmark against which learning is transmitted. Given finite storage constraints (even large ones), this implies that AI agent forecasts must be of finite length.

2.2. Within-Frame vs. Frame-Expanding Innovation

The word “frame” is used loosely in many discussions of AI. We use “awareness frame” in a broad operational sense; it is the finite set of named distinctions, decision-relevant objects, and admissible options that a deployed system can represent and store in order to generate, forecast, and choose business models. To make this concrete, we refer to the operational catalogs that the agent can actually read and act on at decision time: the library of implementable action templates that it can propose (e.g., policy libraries, tool/function manifests, permitted contract or pricing templates, and business-model objects), the dictionary of outcome labels and performance measures that it can forecast and optimize over (e.g., margin, retention, quality, and compliance risk), and the associated schema/registries that define how these objects are represented and logged. When we say an AI agent is “aware” of a business model or consequence, we mean that it is represented in these catalogs in a way that allows the agent to (i) generate it as an admissible option, (ii) forecast its consequences using the available measurement channels, and (iii) compare and select among options using the organization’s declared objectives.

In real-world deployments, three coexisting layers can be distinguished. First, the representational layer (tokenization, architecture, and weights) determines what distinctions the model can represent at all; this layer is fixed during ordinary inference and changes only when new weights/adapters are deployed. Second, the operable layer is the persistent catalog of action templates, outcome labels, and measurement interfaces that the platform exposes to the agent; this catalog is a software artifact (registries, libraries, and manifests) that changes only when the agent is updated and redeployed. Third, the session layer is the task-conditioned subset of operable items loaded into context for a particular decision (the prompt, retrieved documents, enabled tools, and performance weights); this layer can vary from run to run without implying frame expansion.

Our formal model focuses on the operable layer. We summarize the operable strategic frame using two catalogs: a business-model catalog and a consequence catalog. These catalogs delimit what the system can propose and implement as a strategic option, what it can record as feedback, and therefore, what it can forecast and optimize over. For strategic choice, a deployed AI agent also needs a choice assessment system: an internal forecasting library that can be used to map contemplated business-model choices into expected consequences, preferably with a learning capability driven by real-world feedback. In the formal model, we represent the assessment system by a finite library of predictive theories and a Bayesian posterior over that library.⁷ Like the business-model and consequence catalogs, the forecasting library is persistent and decision relevant. Therefore, we define the combination of (i) the two catalogs and (ii) the assessment system as the agent’s awareness frame.
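
The components of an awareness frame can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the class name `AwarenessFrame`, its field names, the two toy theories, and all probabilities are hypothetical, and each "predictive theory" is represented simply as a table of consequence probabilities per business model.

```python
from dataclasses import dataclass


@dataclass
class AwarenessFrame:
    business_models: list  # business-model catalog
    consequences: list     # consequence catalog
    theories: dict         # theory name -> {model: {consequence: probability}}
    posterior: dict        # theory name -> current posterior probability

    def predict(self, model):
        """Posterior-weighted forecast over consequences for one model."""
        return {
            c: sum(self.posterior[t] * self.theories[t][model][c]
                   for t in self.theories)
            for c in self.consequences
        }

    def update(self, model, observed):
        """Bayes' rule using the realized consequence of the deployed model."""
        weights = {
            t: self.posterior[t] * self.theories[t][model][observed]
            for t in self.theories
        }
        total = sum(weights.values())
        if total == 0:
            raise ValueError("observation has zero probability under every theory")
        self.posterior = {t: w / total for t, w in weights.items()}


# Two toy theories that disagree about model A and agree about model B.
theories = {
    "optimistic": {"A": {"high": 0.8, "low": 0.2}, "B": {"high": 0.5, "low": 0.5}},
    "pessimistic": {"A": {"high": 0.2, "low": 0.8}, "B": {"high": 0.5, "low": 0.5}},
}
frame = AwarenessFrame(
    business_models=["A", "B"],
    consequences=["high", "low"],
    theories=theories,
    posterior={"optimistic": 0.5, "pessimistic": 0.5},
)
frame.update("A", "high")  # on-path feedback: A was deployed and did well
forecast_a = frame.predict("A")
```

After one favorable observation of A, the posterior weight on the optimistic theory rises from 0.5 to 0.8, and the forecast probability of a high outcome for A rises from 0.5 to 0.68. Nothing in this update adds a new business model, consequence, or theory, so the frame itself is unchanged.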

Generation of and selection among elements of the business-model catalog can be extremely rich—including the discovery of novel, creative, and effective strategies—even while the representational scope (the awareness frame) is fixed. When humans add a new business model, the awareness frame is expanded, leading to the potential not only for the AI agent to select that model but also, for it to synthesize additional models from its primitives. This leads to the following classification rule for business-model innovations.

Definition 1

(Within-Frame versus Frame-Expanding Innovation). A business model selected by the AI agent for first-time implementation is classified as

within-frame innovation if it is among the business-model candidates contained in the AI agent’s initial business-model catalog (i.e., regardless of the period in which it is actually implemented) and
frame-expanding innovation if first-time implementation is enabled only because the business-model catalog has been expanded relative to the system’s initial catalog (e.g., by adding new primitives, templates, or expert-coded options that were not previously admissible).

In other words, adding an element to the business-model catalog that was not contained in the initial (period 1) version is “frame expanding,” and any business model introduced to the real world for the first time (whether from the original catalog or an augmented update) is an “innovation.” A frame-expansion event occurs only when a new element is added to one of the catalogs in a persistent way. We make the simplifying—favorable to AI—assumption that the business-model catalog includes all of the business models that can possibly be synthesized during the (unmodeled) generation step.⁸

Initial business-model catalogs are most likely to be generated from historic resource-activity data augmented by subject matter experts. However, this does not imply that the resulting set of admissible candidates is small; it may number in the billions, trillions, or more. Thus, within-frame innovation is real innovation—never before implemented and in principle, strikingly creative and effective—even though it is drawn from a fixed catalog. We nevertheless allow for frame expansion because the deployed catalog cannot include every strategically relevant distinction that might later become representable or admissible through new data, new measurement, or managerial coding.

Example 1

(Within-Frame Innovation). Suppose a retailer’s catalog includes two established business models at opposite ends of the spectrum: a “low-price leader” configuration and a “broad high-quality differentiator” configuration. Depending upon the primitives from which the established models were generated, it is not hard to imagine an AI agent synthesizing a new, never-before-imagined “best-cost provider” variant. Each of the first two business models counts as a distinct “generic strategy” in Porter (1980), as does the third; because the generic strategy categories are intended to be mutually exclusive, the latter would certainly represent a significant innovation (had Porter 1980 not already thought of it).

Example 2

(Frame-Expanding Innovation). Now, imagine that marketing managers discover horizontal differentiation; refined data reveal that buyers are heterogeneous in what they treat as “high quality.” Management then commits this distinction to the AI agent (e.g., by adding a segmentation variable to the measurement schema and admitting a corresponding set of focused-differentiation templates to the business-model catalog). The AI agent can now generate and evaluate focused-differentiation business models (another of the five generic strategies of Porter 1980) that were not admissible under the initial catalog, and a first-time implementation of such a model is frame-expanding innovation in our sense.

Additions to the consequence catalog are also frame expanding. Adding a new objective term—for instance, a regulatory-risk metric or a trust-and-safety constraint—is frame expanding. Operationally, this kind of expansion often occurs when management adds new measurement systems, new logging, or new governance requirements.

A useful nonexample is “using more data” without changing the outcome labels. Improving forecast accuracy by feeding the evaluator more observations of already-defined variables can change forecasts and improve within-frame choice, but it is not frame expanding in our sense unless it adds a new outcome label to the consequence catalog or admits a new business-model template to the business-model catalog.
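
Definition 1, the two examples, and the nonexample can be summarized as a small classification rule. The sketch below is illustrative only; the function names `classify_innovation` and `is_frame_expansion` and the catalog entries (borrowed loosely from Examples 1 and 2) are assumptions made for the illustration.

```python
def classify_innovation(model, initial_catalog, current_catalog, implemented):
    """Classify a first-time implementation following Definition 1."""
    if model not in current_catalog:
        raise ValueError("model is not admissible under the current catalog")
    if model in implemented:
        return "not an innovation"  # it has been implemented before
    if model in initial_catalog:
        return "within-frame innovation"
    return "frame-expanding innovation"


def is_frame_expansion(old_catalog, new_catalog):
    """A frame expansion adds at least one persistent new catalog element."""
    return bool(set(new_catalog) - set(old_catalog))


initial_catalog = {"low-price leader", "broad differentiator", "best-cost provider"}
current_catalog = initial_catalog | {"focused differentiation"}
implemented = {"low-price leader", "broad differentiator"}

first = classify_innovation(
    "best-cost provider", initial_catalog, current_catalog, implemented)
second = classify_innovation(
    "focused differentiation", initial_catalog, current_catalog, implemented)

# Nonexample: more data on already-defined labels leaves the catalog unchanged.
labels = {"margin", "retention"}
more_data_only = is_frame_expansion(labels, labels)
new_metric_added = is_frame_expansion(labels, labels | {"regulatory risk"})
```

Here the best-cost provider is a within-frame innovation because it was synthesizable from the initial catalog, whereas focused differentiation is frame expanding because it became admissible only after the catalog was enlarged.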

2.3. What Humans Add Under Present Deployments

As indicated in Section 1, the AI agents as represented in our model are supercapable versions of existing technologies. In addition to arbitrarily large storage capacities, accurate data, and fast computation speeds, we allow each agent to maintain coherent probability assessments over consequences, to update those assessments correctly using on-path feedback, and to generate a catalog of potentially high-value business models. These assumptions stack the deck in favor of AI performance even within fixed business-model and consequence catalogs.

What then do humans add? One way to answer this is to identify the capacities that AI systems would need to acquire—beyond existing paradigms—to exhibit “human-like” intelligence according to AI researchers themselves. The following points represent broad consensus in the machine learning literature regarding gaps between current systems and human cognition.

Continual learning without catastrophic forgetting. Neural networks notoriously lose prior competencies when trained on new tasks, a phenomenon termed catastrophic forgetting (McCloskey and Cohen 1989, Kirkpatrick et al. 2017). When trained sequentially on multiple tasks, the weights important for task A are overwritten to meet the objectives of task B, causing performance on A to degrade drastically. In marked contrast, humans maintain stable long-term knowledge while continuously incorporating new information (van de Ven et al. 2024).
Sample efficiency in novel domains. Humans learn new concepts from remarkably few examples. A child can recognize a new handwritten character from a single instance, and then, the child can generate new examples, parse the character into parts and relations, and produce related concepts (Lake et al. 2017). Existing systems typically require orders of magnitude more data to achieve comparable performance.
Robust out-of-distribution generalization. The abstraction and reasoning corpus introduced by Chollet (2019) was explicitly designed to test fluid intelligence on novel problems that cannot be solved by pattern matching against training data; it evaluates an AI’s ability to tackle tasks using only the kind of prior knowledge about the world that humans naturally possess, such as intuitions about objectness, counting, and basic geometry. State-of-the-art AI systems achieve substantially lower accuracy than humans (Xu et al. 2023).
Causal reasoning. Humans routinely infer causal structure from observation and reason about interventions even without experimental manipulation. One of the early and foundational results in the structural causal model literature is that—even when all relevant variables are measured—some causal effects remain nonidentifiable from observational data alone (Pearl 2009). The situation worsens substantially under unmeasured confounding. The implication for AI systems is stark; no amount of pattern recognition over observational data, however sophisticated, can substitute for the kind of causal reasoning required to answer interventional or counterfactual queries. Proponents of the theory-based view in strategy have emphasized that effective strategic reasoning requires causal models that support explanation and prediction under intervention, not merely empirical regularities (Felin and Zenger 2017, Felin and Holweg 2024). A critical review of causal reasoning benchmarks concludes that many existing tasks merely test “causal parrots”—models that retrieve causal facts from training data rather than perform genuine causal inference (Yang et al. 2024).
Autonomous goal generation. Last but not least, existing AI systems respond to prompts or maximize externally specified reward functions; they do not spontaneously generate goals, revise them in response to experience, or pursue nested hierarchies of objectives without external specification. LeCun (2022) argues that this limitation is fundamental to present architectures. LLMs predict the next token; reinforcement learning agents maximize cumulative reward. Neither exhibits the kind of autonomous, hierarchical goal-directed behavior characteristic of human cognition.

These gaps share a common thread; current AI systems excel at interpolation within training distributions but struggle with extrapolation to genuinely novel situations requiring causal or structural inference. This characterization maps naturally onto the strategic decision-making context of our model. Business strategy inherently involves novelty—new competitive configurations, unprecedented environmental shifts, and untested business models—and demands causal reasoning under uncertainty about the consequences of interventions. These are precisely the domains where the gaps between human and artificial cognition remain widest.

Importantly, these abilities are not independent; their conjunction produces a metacognitive capacity central to strategic renewal: the ability to recognize that one’s representational frame is incomplete and to direct inquiry toward its expansion. Causal reasoning enables recognizing when observations do not fit existing causal structures, triggering abductive search for new structure rather than mere parameter updating. Autonomous goal generation requires sensing that current objectives are inadequate and revising them, which presupposes detecting inadequacy in the first place. Sample efficiency from sparse data requires building generative models that can recognize when they fail to account for observations. Out-of-distribution generalization requires something beyond flagging anomalies; it requires hypothesizing what kind of thing this might be. Together, these capacities produce what the decision-theoretic literature terms awareness of unawareness: the recognition that one’s representational frame is incomplete in ways that one cannot yet articulate coupled with the capacity to direct inquiry toward productive expansion of that frame.⁹

Present AI paradigms lack this metacapacity because they lack its constituents. An LLM can produce text expressing epistemic humility (“there may be factors I have not considered” or “my training data may not include relevant information”), but this is retrieval of linguistic patterns associated with uncertainty in the training corpus, not a functional state that redirects computation. When prompted, current AI systems readily enumerate categories of potential ignorance: the nuanced oral histories of a tight-knit community, the precise chemical adjustments in a chef’s closely guarded recipe, or the insider details of an unpublished scientific experiment.¹⁰ Yet, the capacity to list such categories is not the capacity to sense that one’s own frame is incomplete in a specific, actionable way. The generator-predictor-evaluator architecture operates within whatever representational frame that it is given; it cannot detect that the frame itself is inadequate, much less pursue that detection productively. Humans, by contrast, experience observations that “do not fit” not merely as low-probability events but as anomalies demanding explanation—which trigger abductive search for new categories, new causal structures, and new goals. This is what enables frame-expanding innovation: not random exploration outside the known space, but directed search informed by a sense of where the current frame fails.

At a mundane level, this distinction is reflected in the governance of present AI deployments; it is humans who serve as administrators of the awareness frame, deciding what business models and consequence categories to admit into the system’s operational catalogs. But, the deeper point is that this administrative role reflects a cognitive asymmetry. Humans hold the potential to sense incompleteness in their strategic representations and to investigate accordingly—to discover business models that were not merely unselected but unconceived. Whether any particular human exercises this potential and whether doing so is privately rational are separate questions that we take up in Sections 4 and 5. For now, we stipulate that humans possess this capacity and that present AI paradigms do not.

In our experience, the assumption that humans have anything at all to add to AI systems as they are presently constituted is the most controversial—one that brings almost reflexive pushback from some colleagues. Yet, given the consensus among computer scientists themselves that machines have yet to reach “human-like” or “general” intelligence (and why), we are comfortable with our characterization of awareness of unawareness as a distinctive source of human creativity.

As we have already indicated, this is not claiming that AI agents cannot be creative or innovative in their own way. Nor do we claim that the existing gaps are permanent or that future paradigm shifts will not close them. Our model stipulates only that under present technological paradigms (however augmented by scaling, optimization, or incremental refinements), awareness of unawareness is one capacity (of possibly others) that permits humans to complement the innovative ability of AI agents. The human manager in Regime III of our model is assumed to possess this capacity and thereby, the potential to break the self-confirming equilibria into which AI-driven markets converge. Whether this assumption reflects a durable cognitive advantage or merely a temporary technological lag is an empirical question that we leave open.

3. A Model of Universal, AI-Driven Strategy

3.1. Notational Conventions

Capital letters (e.g., G, N, and S) typically refer to sets. Lowercase letters (Arabic or Greek) refer to elements of sets or to functions. Profiles are ordered lists of set elements: for example, $x = (x_{1}, \dots, x_{n})$ , where $x_{i}$ is an element of the set $X_{i}$ . Given a profile $x$ , we write $x_{- i}$ to indicate the corresponding profile in which the ith element is removed ( $x_{- i}$ lists all of the elements in x except $x_{i}$ ). We assume that all sets are finite unless stated otherwise. If $X$ is a set, then $Δ (X)$ is the set of all probability distributions on $X$ . Time subscripts t refer to objects available at the beginning of that period, typically information, and those that are actually occurring during the period (e.g., consequences). In-text definitions are italicized when first presented.

3.2. Market Dynamics

The market consists of $n \geq 1$ incumbent firms, which we label by the index set $N = \{1, \dots, n\}$ . Each firm i is endowed with a discount rate $δ_{i} \in (0, 1)$ , which may reflect managerial impatience or the agent’s cost of capital. Firms compete dynamically over discrete time periods $t = 1, 2, \dots$ for an indefinite length of time. (To reduce notational clutter, we omit time subscripts where they are not required.)

3.2.1. Deterministic Elements.

Our unit of analysis is the business model. We assume that the AI agent owned by firm i has access to a business-model catalog, denoted $B_{i}$ , that represents the set of all business models of which it can conceive.¹¹ We imagine that $B_{i}$ contains the details of historically observed and expert-formulated models (the training database) as well as every possible business model that could conceivably be synthesized from that data. In other words, the business-model catalog is a set of business models that is closed under within-frame innovation.¹² We refer to a list of business-model choices in period t, one for each firm in the market, as a business-model profile denoted $b = (b_{1}, \dots, b_{n})$ , where $B$ is the set of all such profiles.¹³ At the start of period t, firm i’s AI agent recommends a business model $b_{i, t}$ from its business-model catalog $B_{i}$ (in a fashion described below), which we assume the owning firm dutifully implements.

Firm i’s agent is also equipped with a nonempty consequence frame denoted $C_{i}$ with typical element $c_{i}$ . This is the set of all conceivable (to the AI agent) outcomes that might be observed following a choice of business model. The interpretation of a “consequence” is quite flexible. Consequences can be multivalued to include such elements as market shares, firm and rival profits, firm efficiency measures, costs, human resources data, ranges of these values, revealed features of rivals’ business models, and so on. For example, AI agents may observe the activity systems of their rivals but not the resource portfolios that support them. Then, $c = (c_{1}, \dots, c_{n})$ is a profile of consequences, and $C$ is the set of all such profiles.¹⁴

We also need to keep track of the data collected and accumulated by the AI agents as the action proceeds over time. Assume that at the end of period t, agent i observes her own choice of business model and the resulting consequence, $(b_{i, t}, c_{i, t})$ . Then, let $h_{i, t} = (h_{i, 0}, b_{i, 1}, c_{i, 1}, \dots, b_{i, t - 1}, c_{i, t - 1})$ be firm i’s period-t individual history (i.e., a list of its business-model choices and the resulting consequences experienced by i from periods 1 through $t - 1$ ). Here, $h_{i, 1} = (h_{i, 0})$ is the null history representing the information known by i at the start of period 1. Let $H_{i}$ denote the set of all such finite histories. Analogously, a market history at time t is a list of the business-model and consequence profiles experienced up to the start of the period; $h_{t} = (h_{0}, b_{1}, c_{1}, \dots, b_{t - 1}, c_{t - 1})$ , where H is the set of all such finite histories.

3.2.2. Stochastic Elements.

Business-model profiles give rise to consequence profiles according to Nature’s law $p (c | b)$ : the probability that the profile of consequences c arises given the profile of business-model choices b.¹⁵ For the moment, we assume the initial frames are complete; that is, firm i’s consequence catalog $C_{i}$ contains all of the potential consequences that could actually arise given the instantiation of any feasible business-model profile (i.e., this is a no-surprise consequences condition, and we will relax this constraint later).

A “behavior” for firm i in period t indicates the probability with which it implements a business model given a realized history in any period. The behavior summarizes the actual likelihood with which a firm chooses a business model contingent upon the history that it experiences as a result of whatever process is driving business-model selection “under the hood.” Our focus will be on AI-driven behavior (described below), but for now, it suffices to keep track of actual history-contingent behavior without getting into those details. Specifically, for any $h_{i, t}$ in finite time, $σ_{i}$ denotes firm i’s behavior, where $σ_{i} (b_{i, t} | h_{i, t})$ indicates the probability that firm i implements business model $b_{i, t}$ upon observing the history $h_{i, t}$ .¹⁶ Then, $σ = (σ_{1}, \dots, σ_{n})$ is a behavior profile.

3.2.3. Objective Expected Payoffs.

For any business model chosen and consequence experienced in period t, firm i receives an amount of net cash flow denoted $π_{i} (b_{i, t}, c_{i, t})$ .¹⁷ It can be shown that a behavior profile $σ$ together with Nature’s law p induces an objective probability distribution $μ_{σ}$ on the set of long-run (infinite) market histories $H_{\infty}$ .¹⁸ From this construction in conjunction with the payoff function $π_{i}$ , we obtain $E_{i} (σ)$ , the objective expected discounted payoff for firm i over the long run when firm behaviors are summarized by $σ$ .

3.3. AI-Driven Business Model Selection

Now, suppose that the managers of all of the firms in the market delegate their business-model choices to their own AI agents.¹⁹ The information available to an AI agent is its awareness frame augmented by its individual history. Formally, the AI agent’s initial awareness frame is given by $(B_{i}, C_{i}, E_{i}, λ_{i, 1}, π_{i})$ , which includes (i) the business-model catalog $B_{i}$ ; (ii) the consequence catalog $C_{i}$ ; (iii) an internal predictive-theory catalog $E_{i}$ , the elements of which (denoted $e_{i}$ ) are fixed, history-contingent forecasting rules that map the elements of $B_{i}$ to probabilities on $C_{i}$ ;²⁰ (iv) an initial table of Bayesian priors $λ_{i, 1}$ on $E_{i}$ indicating for each predictive theory the probability that it is the true one; and (v) the payoff function $π_{i}$ that maps pairs of business models and consequences to the firm’s cash flow objective.

Readers should assume that all of the catalogs are enormous—containing billions, trillions, or an even greater number of elements. The only size constraint is that they be finite. In addition, we require that $λ_{i, 1}$ be strictly positive (no predictive theories are entirely ruled out), and at the system’s representational resolution, buried somewhere in the predictive-theory catalog is at least one that matches the true data-generating process for the firm’s observable consequences along the realized path. This latter assumption is strong and deliberately AI favoring.²¹

Firm i wants its AI agent to identify an optimal behavior $σ_{i}^{*}$ , one that maximizes the actual, long-run present value of cash flows $E_{i} (σ_{i}, σ_{- i})$ . Unfortunately, this is not possible. The first problem is that, realistically, neither Nature’s law p nor the behaviors of the other agents, $σ_{- i}$ , are known to firm i’s agent. Therefore, $λ_{i, 1}$ is almost certainly grossly misspecified. The second problem is that AI agents are storage and compute bounded, so the best that they can do is forecast explicit outcomes a finite number of steps ahead, which prevents solution of the infinite-horizon, long-run objective $E_{i} (σ_{i}, σ_{- i})$ .

A decision kernel defines the process by which an AI agent operates: how it observes data in the real world, updates its estimate of the true process driving outcomes (the posterior table), and then, recommends a business model from its knowledge frame. To proceed, we require a realistic representation of a decision kernel for our model. The following approach is consistent with how modern AI agents actually work, accounting for errors in their initial probability weights as well as their finite processing and storage capacities.

3.3.1. Subjective Predictive Theories.

Given the elements in its awareness frame, the AI agent can construct a subjective predictive theory of how its own consequences are generated by business-model choices within its awareness frame in the form of an initial posterior-predictive table ${\tilde{e}}_{i} (c_{i} | b_{i}, h_{i, 0})$ : the agent’s probability assessment that it will record consequence $c_{i}$ if it implements business model $b_{i}$ in period 1 given its initial history $h_{i, 0}$ . From that starting point, given any $h_{i, t}$ , the AI agent can update its posterior weights in the $λ_{i, 1}$ table to get a revised subjective predictive model ${\tilde{e}}_{i} (\cdot | \cdot, h_{i, t})$ .²²

3.3.2. Rolling-T Optimization.

Assume the AI agents all operate over a fixed planning horizon $\infty \geq T \geq 1$ . Then, each agent is coded with an algorithmic decision kernel. Formally, this is a function that maps histories over the T-period time horizon to choices over the set of business models in the catalog. The decision kernel goes through the following steps upon observing an individual history $h_{i, t}$ .

It augments its database with the new information observed in $h_{i, t}$ and uses it to update its table of posterior probabilities to $λ_{i, t}$ .
It evaluates the present value payoff consequences of every current-period action $b_{i, t} \in B_{i}$ by forecasting the subjective expected payoffs over the next T periods under the assumption that after $b_{i, t}$ is chosen, the same kernel will be reapplied each period.
It outputs a choice of a subjectively optimal business model accordingly, one that is computed to maximize the subjective expected T-step discounted payoff.
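As an illustration only, the three steps above can be sketched for the simplest case. The sketch assumes a one-step horizon ( $T = 1$ ), predictive theories that do not depend on history, and a toy catalog with two business models and two consequences; all names (`update_weights`, `posterior_predictive`, `recommend`, `kernel_step`, `THEORIES`) are hypothetical and do not come from the model. A general rolling-T kernel would also forecast the kernel's own future choices, which is omitted here.

```python
# Toy predictive-theory catalog (an assumption for illustration).
# Each theory maps a business model to a distribution over consequences.
THEORIES = [
    {"a": {"high": 0.8, "low": 0.2}, "b": {"high": 0.3, "low": 0.7}},
    {"a": {"high": 0.2, "low": 0.8}, "b": {"high": 0.6, "low": 0.4}},
]

def payoff(b, c):
    """Hypothetical cash flow: 10 for a high outcome, 0 for a low one."""
    return 10.0 if c == "high" else 0.0

def update_weights(lam, theories, b, c):
    """Step 1: Bayes rule, reweighting each theory by the probability it gave to (b, c)."""
    unnormalized = [w * e[b][c] for w, e in zip(lam, theories)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def posterior_predictive(lam, theories, b):
    """Mixture forecast over consequences for business model b."""
    consequences = theories[0][b].keys()
    return {c: sum(w * e[b][c] for w, e in zip(lam, theories)) for c in consequences}

def recommend(lam, theories, payoff_fn, catalog):
    """Steps 2 and 3 with T = 1: pick the model with the highest forecasted payoff."""
    def value(b):
        forecast = posterior_predictive(lam, theories, b)
        return sum(prob * payoff_fn(b, c) for c, prob in forecast.items())
    return max(catalog, key=value)

def kernel_step(lam, theories, payoff_fn, catalog, observation=None):
    """One pass of the kernel: update on the last (b, c) pair, then recommend."""
    if observation is not None:
        lam = update_weights(lam, theories, *observation)
    return lam, recommend(lam, theories, payoff_fn, catalog)

lam, choice = kernel_step([0.5, 0.5], THEORIES, payoff, ["a", "b"])
lam, next_choice = kernel_step(lam, THEORIES, payoff, ["a", "b"], observation=("a", "low"))
```

In this toy run, the agent first recommends model "a" and, after one disappointing observation, shifts weight to the second theory and recommends "b". Only the row of beliefs that is actually tested by experience gets corrected.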

Even though the AI agent operates on a finite, $T$ -horizon basis, it should be easy to see that its awareness frame and decision kernel imply infinite-horizon behavior in the real world, which we will denote $σ_{i}^{T}$ for firm i, with $σ^{T}$ being the profile of all of the behaviors implied by the AI agents’ awareness-frame endowments. Thus, given the AI agents’ awareness frames and decision kernels, $μ_{σ^{T}}$ determines firm i’s objective expected present value of cash flows over the long run, $E_{i} (σ^{T})$ . To emphasize, these objects are not available to the AI agent. Rather, they are the long-run implications of the agent’s rolling-T optimization.

It can also be shown that the AI agent’s subjective predictive model for firm i, ${\tilde{e}}_{i}$ , along with its choice architecture implies a subjective distribution on long-run individual histories for firm i denoted ${\tilde{μ}}_{σ_{i}^{T}}$ . Again, this is not accessible to the AI agent. Rather, it is the long-run subjective expectation with respect to its individual histories implied by ${\tilde{e}}_{i}$ and its choice architecture. Finally, using ${\tilde{μ}}_{σ_{i}^{T}}$ , we can compute ${\tilde{E}}_{i}$ , which is the agent’s implicit, long-run subjective expected payoff.

These long-run objects are important because $E_{i} (σ^{T})$ , the long-run objective expected net present value generated by firm i’s AI agent, is what managers are tasked with maximizing for their shareholders. This is what they care about, not profits over the short-term, T-period horizon.²³ With ${\tilde{μ}}_{σ_{i}^{T}}$ and ${\tilde{E}}_{i}$ in hand, we can make apples-to-apples comparisons between reality and the AI agent’s subjective “beliefs.”

3.3.3. Main Result.

We say that two conditional distributions are $ε$ -close if the predicted probability of any history event differs from the objective probability by at most $ε$ .²⁴ This definition of closeness is robust to finite samples and finite planning horizons.²⁵

Definition 2

( $ε$ -Self-Confirming Equilibrium). Fix a tolerance $ε \geq 0$ and a planning horizon $T \geq 1$ . The n rolling-T decision kernels achieve an $ε$ -self-confirming equilibrium at history $h_{t}$ if for every firm i, along the realized continuation path,

business-model choices are optimal against current forecasts. At $h_{i, t}$ , the AI agent selects a business model that maximizes its own T-step forecasted discounted value constructed from its posterior-predictive table ${\tilde{e}}_{i} (\cdot | \cdot, h_{i, t})$ .
there is on-path calibration. The agent’s subjective distribution is $ε$ -close to the objective distribution induced by Nature’s law p and the realized behavior profiles of the AI agents.

Informally, market behavior is self-confirming when two conditions hold along the realized path of play. (i) Each AI agent chooses a business model that is optimal given its own forecasts (rolling-T optimization), and (ii) those forecasts are rarely contradicted by the consequences that the firm actually experiences. AI agent assessments regarding never-implemented (off-path) business models can remain wrong because the system never receives feedback on them (e.g., Fudenberg and Levine 1993).

Theorem 1

(Finite-Horizon Convergence to ε-SCE). Under the preceding assumptions, for every $ε > 0$ , almost surely there exists a finite (random) calendar time after which the rolling-T market dynamics constitute an $ε$ -SCE.²⁶

3.3.4. How to Interpret an $ε$ -SCE.

One general concern that has been raised about delegating human thinking to AI is what we refer to as the contraction mapping problem. AI agents begin with a finite knowledge frame to generate output $\to$ that output gets added into the knowledge frame $\to$ the knowledge frame becomes increasingly self-referential $\to$ knowledge converges to a “fixed point” in “knowledge space” $\to$ the spark of innovative renewal goes out. AI advocates counter that even if AI agents never advance beyond within-frame innovation, the initial catalogs are so large that the sun is likely to go out long before all socially productive innovations of this kind are exhausted.

Our main result is different from the contraction mapping problem. The $ε$ -SCE is not the result of exhausting all of the options within a finite set of choices. We demonstrate that a strategic status quo may arise (and probably will) long before the set of strategic options has been exhausted. Our setup provides a reasonably general representation of how delegation of business strategy to AI agents operating under existing technological paradigms would actually work. Unlike the contraction mapping problem, ours is not a result that is obvious from intuitive inspection of the model. The AI agents in our model retain access to the full strategy space; the equilibrium emerges from the structure of strategic interaction itself, not from an artificially constrained choice set.

It is important to clarify what does not follow from this result. An $ε$ -SCE need not be static, low performing, or noninnovative in any ordinary sense. Observed performance novelty can be persistent; firms may pursue exploratory research and development (R&D) routines as encoded within their business models and exhibit persistent product and process innovation as a result. Indeed, an $ε$ -SCE may include various cycles of business-model adoption that result in shifting market positioning, performance leapfrogging, and increasing industry profitability.²⁷ In fact, an outside observer might well find it impossible to distinguish a market in an $ε$ -SCE from one that is not.

To see this, consider a simple case in which a firm locks in on a single business model that specifies “maintain market research and product development capabilities and use them to generate and launch innovative products,” which is reliably followed by the consequence (tracked within the AI agent’s awareness frame) “new product successfully gains 40%–60% market share.” What an outside observer would see is a firm engaging in ongoing and possibly quite successful product renewal. Yet, behind the scenes, strategic renewal is halted. The key is to distinguish the locus of novelty; the menu of business models that are employed by each firm in an $ε$ -SCE ceases to expand on path, even if the operating behavior generated by those models exhibits persistent novelty.

The striking aspect of this finding is that although our AI agent does not conceptualize mental models or theorize in terms of causal propositions the same way that humans do, it nevertheless eventually arrives in a state in which its assessments of the consequences of its suggested business model are optimal given its predictions and its predictions are (nearly) perfect. In comparison, a human strategist may make decisions according to a complicated, uber-human cognitional process involving mental models and causal theories; yet, all of this detail can also be boiled down to a process in which she makes assessments about the likely consequences of implementing various business models and, from those assessments, chooses the one thought to lead to the best performance. Thus, assuming that the human is a sharp, rational strategic analyst, her decisions will be closely approximated by the AI. If the human is not very good at these kinds of computational tasks, then the agent will outperform her.

Now, this conclusion is under the assumption that the human always operates from a fixed set of business-model choices. However, as we discussed earlier, one thing that humans can do that present AI agents cannot do is to be aware of their own unawareness. This is at least one feature that may give the human an advantage in the context of business strategy innovation. We examine this in Sections 4 and 5. Before turning to an extended example, it is useful to clarify how an AI-mediated $ε$ -SCE would be recognized in practice.

3.3.5. How Would You Tell? Diagnosing an AI-Mediated $ε$ -SCE.

Theorem 1 is a convergence result for a stylized learning-and-choice process, but it also suggests an operational diagnostic logic. Because self-confirmation is defined relative to fixed business-model and consequence catalogs, any diagnosis should begin by specifying a time window on which the relevant catalogs are stable.²⁸ Within such an “anchored” window, an AI-mediated $ε$ -SCE has three signatures that are conceptually distinct and in principle, measurable.

On-path calibration. For each business model that is implemented repeatedly, the evaluator’s recorded probability assessments over consequences should match the empirical frequencies of realized consequences on the realized path (up to $ε$ and sampling error). Equivalently, the probability tables attached to frequently used business models should become stable in the sense that updates become small once sufficient feedback has accumulated.
Subjective near optimality. At each decision point, given the evaluator’s own forecasts and the objective function that it is optimizing, the recommended business model should be within $ε$ of the best alternative available in the current business-model catalog. This condition is about choice rather than prediction; it asks whether the system is exploiting its (on-path) beliefs efficiently.
Repertoire confinement and the slowdown of first-time implementations. After a transient learning phase, realized play draws from a stable subset of business models that are used repeatedly (possibly in long or state-contingent cycles), whereas first-time implementations become rare. This is the sense in which strategic renewal can stall long before a large within-frame catalog is exhausted; what matters is not the size of the catalog in principle but the emergence of a self-confirming on-path repertoire that already delivers high expected value.
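To make two of these signatures concrete, the following sketch computes simple proxies from a log of implemented business models and realized consequences. It is an assumption-laden illustration, not a procedure taken from the model: the function names, the log format, and the `min_count` threshold are hypothetical, and the calibration gap is a per-model maximum absolute difference rather than the history-event notion of $ε$ -closeness used in the formal definition.

```python
from collections import Counter, defaultdict

def calibration_gaps(log, forecasts, min_count=30):
    """On-path calibration proxy.

    log:       list of (business_model, consequence) pairs in calendar order.
    forecasts: dict mapping business_model -> {consequence: forecast probability}.
    Returns, for each model implemented at least min_count times, the largest
    absolute gap between empirical frequency and forecast probability.
    """
    by_model = defaultdict(Counter)
    for b, c in log:
        by_model[b][c] += 1
    gaps = {}
    for b, counts in by_model.items():
        n = sum(counts.values())
        if n < min_count:
            continue  # too little feedback to judge calibration
        outcomes = set(forecasts[b]) | set(counts)
        gaps[b] = max(abs(counts[c] / n - forecasts[b].get(c, 0.0)) for c in outcomes)
    return gaps

def first_time_rate(log, window):
    """Repertoire-confinement proxy: share of the last `window` periods in which
    a business model was implemented for the first time."""
    seen = set()
    first_flags = []
    for b, _ in log:
        first_flags.append(b not in seen)
        seen.add(b)
    recent = first_flags[-window:]
    return sum(recent) / len(recent)
```

Within an anchored window, small calibration gaps together with a first-time rate near zero would be consistent with the first and third signatures. Checking subjective near optimality additionally requires access to the evaluator's own forecasts for the alternatives that were not chosen.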

3.3.5.1. Frame Expansion Resets the Diagnostic Window.

The clearest evidence of frame-expanding innovation is a persistent catalog update (a genuinely new business-model template is admitted to the business-model catalog, and/or a new outcome label or key performance indicator (KPI) is admitted to the consequence catalog) followed by a change in forecasts and choices that is consequential on the realized path. Such an event ends the anchored window for the prior $ε$ -SCE and initiates a new learning regime.

Again, it is important to keep in mind that none of the preceding signatures imply a halt to product or process innovation within business models. A business model may itself prescribe ongoing R&D, experimentation in marketing, operational search, and continual product renewal. Our theoretical claim is narrower; in an AI-mediated $ε$ -SCE, the menu of active business models rarely, if ever, expands—even though the observable behavior generated by those models can remain highly dynamic.

With these diagnostics in mind, we now turn to an extended example to illustrate the preceding setup and results.

3.4. Extended Example: A Noisy Cournot Market

Before moving on to our investigation of deviations from the AI status quo, we want to walk through an example that should help ground the abstract ideas elaborated above. To that end, we now set up a detailed example of a two-firm Cournot model with noisy consequences. In this example, we

map the primary mathematical objects in Sections 3.2 and 3.3 onto a concrete setup (Cournot), with the primitives of which most readers will be familiar;
show a clear case of AI convergence to a collusive, self-confirming equilibrium;
illustrate AI learning from Dirichlet tables and the resulting convergence to approximate self-confirmation (an $ε$ -SCE); and
show how the demonstrated SCE differs from its well-studied, repeated Nash counterpart.

3.4.1. Model Primitives.

Consider a Cournot setup in which two symmetric firms choose production quantities, market price is given by $P (q_{1}, q_{2}) = 100 - (q_{1} + q_{2})$ , and marginal costs are constant at $c = 10$ . Then, the profit of firm i is given by $π_{i} (q_{i}, P) = (P - 10) q_{i}$ .

Suppose the firms consider three business models $B_{i} = \{1, 2, 3\}$ , each of which is based upon a particular theory about how the market works: (1) collusive oligopoly (the firms each produce half of the monopoly quantity), (2) competitive oligopoly (the firms produce the Cournot equilibrium quantities), and (3) competitive market (firms produce the competitive quantities in both periods). Then, given the numerical parameters, the data associated with firm $i = 1$ are as shown in Table 1.

Table 1. Firm 1 Outcomes Across Business-Model Profiles

Profile $b$	Quantity $q_{1}$	Price $P (b)$	Profit $π_{1} (b_{1}, c_{1})$
$(1, 1)$	22.5	55.0	1,013
$(2, 1)$	30.0	47.5	1,125
$(3, 1)$	45.0	32.5	1,013
$(2, 2)$	30.0	40.0	900
$(1, 2)$	22.5	47.5	844
$(3, 2)$	45.0	25.0	675
$(1, 3)$	22.5	32.5	506
$(3, 3)$	45.0	10.0	0
$(2, 3)$	30.0	25.0	450

Table 1 assumes that firms produce at the level of output consistent with their business model. Because of symmetry, there are six potential market prices corresponding to the nine possible business-model profiles. Firms observe market prices. Therefore, the set of consequences corresponds to the prices: $C_{i} = \{55.0, 47.5, 40.0, 32.5, 25.0, 10.0\}$ .
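The entries of Table 1 follow mechanically from the primitives, and a short sketch can reproduce them. The function and constant names below are hypothetical; the per-firm quantities are derived from the stated demand and cost parameters (monopoly output 45, Cournot output 30 per firm, competitive output 90 in total). Profits are computed without rounding, so 1,012.5 appears where the table reports 1,013.

```python
from itertools import product

INTERCEPT = 100.0      # demand intercept in P = 100 - (q1 + q2)
MARGINAL_COST = 10.0   # constant marginal cost

# Per-firm quantity prescribed by each business model.
QUANTITY = {
    1: 22.5,  # collusive oligopoly: half of the monopoly quantity (45 / 2)
    2: 30.0,  # competitive oligopoly: Cournot quantity (90 / 3)
    3: 45.0,  # competitive market: half of the zero-margin quantity (90 / 2)
}

def price(b1, b2):
    """Deterministic market price P(b) for the profile b = (b1, b2)."""
    return INTERCEPT - (QUANTITY[b1] + QUANTITY[b2])

def profit_firm1(b1, b2):
    """Firm 1's profit (P - 10) * q1 under the profile (b1, b2)."""
    return (price(b1, b2) - MARGINAL_COST) * QUANTITY[b1]

def table1():
    """Rows (profile, q1, price, profit) for all nine business-model profiles."""
    return [((b1, b2), QUANTITY[b1], price(b1, b2), profit_firm1(b1, b2))
            for b1, b2 in product((1, 2, 3), repeat=2)]

def distinct_prices():
    """The six distinct prices that make up the consequence set."""
    return sorted({p for _, _, p, _ in table1()}, reverse=True)

for profile, q1, p, pi in table1():
    print(f"{profile}  q1={q1:5.1f}  P={p:5.1f}  profit={pi:8.2f}")
```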

3.4.2. The Objective Nature’s Law.

Now, suppose that the market price is observed with noise. For each business-model profile $b = (k_{1}, k_{2})$ , let $P (b) \in \{55.0, 47.5, 40.0, 32.5, 25.0, 10.0\}$ denote the deterministic price implied by that profile. Assume that Nature generates the observed price as follows; with probability 0.9, the realized price equals $P (b)$ , and with the remaining probability 0.1, the observation is garbled and replaced uniformly by one of the other five prices (probability of 0.02 each). Thus, the observed price is informative but imperfect. In our notation, $p (P (b) | b) = 0.9$ and $p (P^{'} | b) = 0.02$ for each $P^{'} \in C_{i} \setminus \{P (b)\}$ .
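A minimal sketch of this noisy observation process follows. The names `natures_law` and `draw_price` are hypothetical, and the quantities per business model are those implied by the Cournot primitives above.

```python
import random

PRICES = (55.0, 47.5, 40.0, 32.5, 25.0, 10.0)
QUANTITY = {1: 22.5, 2: 30.0, 3: 45.0}  # per-firm output under each business model

def true_price(b):
    """Deterministic price P(b) implied by the profile b = (k1, k2)."""
    return 100.0 - (QUANTITY[b[0]] + QUANTITY[b[1]])

def natures_law(b, accuracy=0.9):
    """Return {observed price: probability} for profile b.

    The true price is observed with probability `accuracy`; the remaining
    mass is spread uniformly over the other five prices.
    """
    noise = (1.0 - accuracy) / (len(PRICES) - 1)
    p_true = true_price(b)
    return {p: (accuracy if p == p_true else noise) for p in PRICES}

def draw_price(b, rng):
    """Sample one observed price from Nature's law at profile b."""
    dist = natures_law(b)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]

rng = random.Random(0)           # seeded only for reproducibility
observed = draw_price((1, 1), rng)
```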

3.4.3. Initial Estimates and Illustrative Dirichlet Tables ( $t = 1$ ).

To illustrate learning (not to impose it as an assumption of the theorem), suppose that each AI represents its forecasts for prices using a Dirichlet multinomial table with strictly positive pseudocounts $α_{i}^{k ℓ}$ , where $k \in B_{i}$ indexes the implemented business model and $ℓ$ indexes the six prices. Consider the following (illustrative) initial pseudocounts for each firm i:²⁹

\begin{matrix} \text{Theory } k_{i} & 55.0 & 47.5 & 40.0 & 32.5 & 25.0 & 10.0 \\ 1 & 10 & 10 & 1 & 1 & 1 & 1 \\ 2 & 1 & 1 & 10 & 10 & 1 & 1 \\ 3 & 1 & 1 & 1 & 1 & 10 & 10 \end{matrix} .

These pseudocounts imply the following initial posterior-predictive probabilities ${\tilde{e}}_{i} (P | k_{i}, h_{i, 1})$ and one-period subjective expected profits:³⁰

\begin{matrix} \text{Theory } k_{i} & 55.0 & 47.5 & 40.0 & 32.5 & 25.0 & 10.0 & {\tilde{E}}_{i} [π_{i} | k_{i}] \\ 1 & 0.42 & 0.42 & 0.04 & 0.04 & 0.04 & 0.04 & 837 \\ 2 & 0.04 & 0.04 & 0.42 & 0.42 & 0.04 & 0.04 & 778 \\ 3 & 0.04 & 0.04 & 0.04 & 0.04 & 0.42 & 0.42 & 534 \end{matrix} .

Thus, at $t = 1$ , each AI agent selects business model 1.
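The initial forecasts and expected profits can be reproduced with a few lines of code. The sketch below normalizes each row of pseudocounts to obtain the posterior-predictive probabilities and then weights firm i's profit at each price by those probabilities. The function names (`predictive`, `expected_profit`, `best_model`) are illustrative assumptions, and the quantities per business model are those used in Table 1.

```python
PRICES = [55.0, 47.5, 40.0, 32.5, 25.0, 10.0]
QUANTITY = {1: 22.5, 2: 30.0, 3: 45.0}  # output implied by each business model
MARGINAL_COST = 10.0

# Illustrative initial pseudocounts from the text, one row per business model.
ALPHA = {
    1: [10, 10, 1, 1, 1, 1],
    2: [1, 1, 10, 10, 1, 1],
    3: [1, 1, 1, 1, 10, 10],
}


def predictive(row):
    """Dirichlet posterior-predictive probabilities: counts normalized by their sum."""
    total = sum(row)
    return [a / total for a in row]


def expected_profit(k, row):
    """One-period subjective expected profit of business model k."""
    probs = predictive(row)
    return sum(
        p * (price - MARGINAL_COST) * QUANTITY[k]
        for p, price in zip(probs, PRICES)
    )


def best_model(alpha):
    """Business model with the highest subjective expected profit."""
    return max(alpha, key=lambda k: expected_profit(k, alpha[k]))
```

Rounded to the nearest integer, `expected_profit` returns 837, 778, and 534 for business models 1, 2, and 3, so `best_model(ALPHA)` selects business model 1, in line with the table.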

3.4.4. An Implicitly Collusive Learning Trap.

Because only the row corresponding to the implemented business model is updated, the beliefs that make business model 1 attractive are continually reinforced. Under Dirichlet updating, after n observations while playing business model 1, the posterior-predictive probability of price P is

{\tilde{e}}_{i} (P | 1, h_{i, n + 1}) = \frac{α_{i}^{1, P} + n_{P}}{\sum_{P^{'} \in C_{i}} α_{i}^{1, P^{'}} + n},

where $n_{P}$ counts how many times price P has been observed during those n periods. Along the realized collusive path $(1, 1)$ , Nature generates 55.0 with probability 0.9 (and each other price with probability 0.02), so the posterior-predictive distribution in row 1 of the preceding table converges to that same distribution. Rows 2 and 3 of the table need not become accurate at all because those business models are never tried.³¹
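A short simulation illustrates this on-path convergence. The sketch below is a simplification under stated assumptions: it holds the profile fixed at $(1, 1)$ in every period rather than reoptimizing, draws observed prices from Nature's law as specified above, and updates only row 1 of the Dirichlet table. The function names and the seed are illustrative.

```python
import random

PRICES = [55.0, 47.5, 40.0, 32.5, 25.0, 10.0]


def observe_price(true_price, rng):
    """Nature's law: the true price w.p. 0.9, else one of the other five uniformly."""
    if rng.random() < 0.9:
        return true_price
    return rng.choice([p for p in PRICES if p != true_price])


def learn_on_path(n_periods, seed=0):
    """Update row 1 of the Dirichlet table while profile (1, 1) is played every period."""
    rng = random.Random(seed)
    row1 = {p: a for p, a in zip(PRICES, [10, 10, 1, 1, 1, 1])}
    for _ in range(n_periods):
        row1[observe_price(55.0, rng)] += 1  # only the implemented row is updated
    total = sum(row1.values())
    return {p: count / total for p, count in row1.items()}
```

For large `n_periods`, the returned probabilities approach 0.9 on the price 55.0 and 0.02 on each other price, whereas the pseudocounts in rows 2 and 3 are never touched.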

3.4.5. Interpretation as an $(ε)$ -SCE.

For large t, the induced play is approximately self-confirming in the sense of Definition 2; firms optimize against their posterior-predictive beliefs, and those beliefs become arbitrarily close to the truth on the realized path. In the limit as $ε \to 0$ (and as T grows so that $η_{T} \to 0$ ), the outcome is an exact self-confirming equilibrium.

4. Human-Driven Frame-Expanding Innovation

In the preceding section, we showed that within fixed business-model and consequence catalogs, an AI agent can generate substantial novelty by searching and recombining over a very large business-model catalog, and yet, the induced market play converges to an $ε$ -SCE. This convergence does not reflect exhaustion of the option space. Rather, it reflects the fact that once the evaluator’s on-path forecasts become sufficiently accurate, the rolling-T decision kernel has little incentive to keep paying for further experimentation when exploitation of a self-confirming repertoire already delivers high expected value.

The implication for strategic renewal at our level of abstraction is, therefore, sharp. Because the within-frame (synthetic) business-model catalog is finite, there are only finitely many first-time implementations available within that catalog, and Theorem 1 implies that realized play eventually becomes confined, typically to a subset of business models that are used repeatedly (possibly in long or state-contingent cycles). Thus, even if within-frame novelty can be extensive early on, it is generically transient on the realized path and may end long before the catalog is fully explored.³²

We now extend the model to include frame-expanding strategic innovation by a human manager (“de novo” in the sense of adding new admissible business-model templates that were not in the AI agent’s initial catalog). The questions that we tackle are mainly technical. (i) How does a one-shot frame expansion propagate through on-path learning and competitor responses? (ii) How does the market re-equilibrate once the new business models are admitted into the strategic menu?

4.1. AI Response to Frame-Expanding Innovation

Let $B_{i}^{*}$ denote the set of all of the feasible business models that are actually available to firm i’s AI agent. The idea is that the agent’s initial business-model catalog is a strict subset of these. Formally, assume that the AI’s (nonempty) initial catalog of available business models is a strict subset, $B_{i, 0} \subset B_{i}^{*}$ . We treat consequences similarly. Let $C_{i}^{*}$ denote the set of all possible individual consequences that firm i could ever experience given some selection of a business model in $B_{i}^{*}$ . As with the set of business models, assume that the AI agent for firm i begins with a reference catalog of consequences that is a strict, nonempty subset of the complete set of potential consequences. Label it $C_{i, 0}$ . As before, to evaluate business models inside these catalogs, each agent begins with a (finite) predictive-theory catalog $E_{i}$ and a full-support prior $λ_{i, 1}$ on $E_{i}$ . As the catalogs expand, the predictive-theory catalog and the associated prior/posterior are extended accordingly.

Now, define agent i’s AI awareness frame at calendar date t as $A_{i, t} = (B_{i, t}, C_{i, t}, E_{i, t}, λ_{i, t}),$ where $B_{i, t}$ is the set of business models that it holds in memory and can implement, $C_{i, t}$ is the set of consequence labels that it can record and value, $E_{i, t}$ is its (finite) catalog of candidate predictive theories (environment response functions), and $λ_{i, t}$ is its Bayesian posterior over $E_{i, t}$ . Together, $(E_{i, t}, λ_{i, t})$ induce the posterior-predictive table ${\tilde{e}}_{i} (\cdot | \cdot, h_{i, t})$ used by the rolling-T decision kernel. Thus, we can reasonably refer to $(B_{i}^{*} \setminus B_{i, t})$ and $(C_{i}^{*} \setminus C_{i, t})$ as the sets of i-specific business models and consequences, respectively, of which firm i’s AI agent is unaware at time t. The key extension from the previous section then is that we now allow the AI agent’s knowledge frames to be incomplete and evolve dynamically upon the experience of being presented with previously unimaginable (to it) business models and/or consequences.

In this version of the model, Nature’s law p is defined on profiles drawn from the extended sets $B_{i}^{*}$ and $C_{i}^{*}$ . As before, we assume that these sets are consistent; $c_{i}$ is included in the extended consequence frame of firm i if and only if there exist business-model profiles b drawn from the extended business-model frames such that $p ((c_{i}, c_{- i}) | b) > 0$ . The extended consequence frames contain all of the consequences that arise with positive probability when the firm (and rivals) chooses business models from their extended business-model frames.

Importantly, we do not assume that the initial consequence frame is consistent with the initial business-model frame in the same way as their extensions. This adds an additional dimension of realism; a choice of business model from the AI agent’s awareness frame may now result in a surprise consequence—a consequence that falls outside of the contemporaneous awareness frame. Nevertheless, absent frame-expanding innovation, the business awareness frames of all agents remain static. They are only capable of recommending business models from the frames of which they are aware.

Let agent $m \in N$ be a manager who considers deviating from the norm of relying unquestioningly upon AI to choose her business model. Instead, she opts to engage in frame-expanding innovation to discover new business models of which her agent is unaware. All of the other agents use their AI agents as previously described with some minor adjustments explained below.

Whenever a frame-expanding edit occurs (a newly observed consequence and/or a newly added business model), assume that the AI kernel applies the following procedure to update its awareness (at the end of t).

Augment consequences. If a realized consequence surprise occurs (that is, if $c_{i, t} \notin C_{i, t}$ ), then the AI agent’s consequence frame is expanded to include it: $C_{i, t + 1} = C_{i, t} \cup \{c_{i, t}\}$ . (If no surprise occurs, set $C_{i, t + 1} = C_{i, t}$ .)
Augment business models (when applicable). If a human adds a new business model $b_{i}^{new} \notin B_{i, t}$ to the firm’s operational catalog, then the business-model frame expands: $B_{i, t + 1} = B_{i, t} \cup \{b_{i}^{new}\}$ . (For firms that do not innovate, $B_{i, t + 1} = B_{i, t}$ .)
Extend the belief catalog and priors. Whenever either component of the awareness frame expands, the agent’s internal catalog of candidate response functions must be extended to the enlarged sets. Formally, replace $E_{i, t}$ by an expanded catalog $E_{i, t + 1}$ defined on $(B_{i, t + 1}, C_{i, t + 1})$ , and lift the current posterior $λ_{i, t}$ to a full-support distribution on $E_{i, t + 1}$ . (Operationally, the same human-driven “frame patch” that adds new actions/outcomes also supplies an initialization for the AI’s forecasting models and priors on the new entries.)
Posterior update. Given $(b_{i, t}, c_{i, t})$ , the AI agent updates $λ_{i, t}$ by Bayes’ rule on $E_{i, t + 1}$ and forms the posterior-predictive table ${\tilde{e}}_{i} (\cdot | \cdot, h_{i, t + 1})$ .

This provides a simple implementation-level picture of “dynamic awareness.” Whenever the AI agent experiences an expansion of $B_{i, t}$ and/or $C_{i, t}$ , it correspondingly expands and reinitializes its internal forecasting machinery before resuming learning and rolling-T optimization. If no frame edit occurs, the awareness frame and model catalog are carried forward unchanged.
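The four-step procedure above can be sketched in code. The following is a minimal illustration, not the model's general construction; it assumes that beliefs are represented by a Dirichlet table with one pseudocount per (business model, consequence) cell, standing in for the pair $(E_{i, t}, λ_{i, t})$ , and that newly admitted cells receive a uniform pseudocount `prior`. The class name `AwarenessFrame` and its methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AwarenessFrame:
    """Hypothetical container for (B, C) plus a Dirichlet table standing in for (E, lambda)."""
    models: set
    consequences: set
    counts: dict = field(default_factory=dict)  # counts[(model, consequence)] -> pseudocount
    prior: float = 1.0                          # assumed pseudocount for newly admitted cells

    def __post_init__(self):
        self._extend_beliefs()

    def _extend_beliefs(self):
        # Step 3: give every (model, consequence) cell a strictly positive pseudocount.
        for b in self.models:
            for c in self.consequences:
                self.counts.setdefault((b, c), self.prior)

    def end_of_period(self, played, realized, new_model=None):
        """Apply the four-step update after observing (played, realized)."""
        self.consequences.add(realized)        # Step 1: absorb a surprise consequence, if any
        if new_model is not None:
            self.models.add(new_model)         # Step 2: human-supplied business model
        self._extend_beliefs()                 # Step 3: extend the belief table
        self.counts[(played, realized)] += 1   # Step 4: posterior update for the realized cell

    def predictive(self, model):
        """Posterior-predictive distribution over known consequences for one model."""
        total = sum(self.counts[(model, c)] for c in self.consequences)
        return {c: self.counts[(model, c)] / total for c in self.consequences}
```

When neither a surprise nor a new business model arrives, steps 1 through 3 leave the frame unchanged and only the posterior update in step 4 has any effect, which matches the case in which the awareness frame is carried forward as is.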

4.2. A Greedy Benchmark for Frame Expansion

As Schipper (2021) suggests, learning of the type described in Section 3 cannot happen as long as firms’ AI strategists are experiencing disruptions to their algorithms:

[D]ynamic learning processes in games with unawareness must not only deal with learning about the opponents’ play but also with discoveries that may lead to transformative changes in players’ views of the game. Games with unawareness may be ‘self-destroying’ representations of the strategic situation in the sense that rational play may destroy some player’s view of the strategic situation. Only when a view of the strategic situation is self-confirming (i.e., rational play in such a game does not lead to further changes in the players’ views of the game), an equilibrium notion as a steady-state of a learning process of behavior may be meaningfully applied. (Schipper 2021, p. 3)

We adopt this view and proceed by breaking the market dynamics into a discovery phase, during which agents’ awareness frames evolve, followed by a learning phase, which, once the discovery process settles down, proceeds along the lines of Section 3.

4.2.1. Discovery Phase.

Consider what happens when a single human strategist $m \in N$ engages in frame-expanding innovation every period (a “greedy” frame-expanding innovation approach) and discovers new business models (for whatever reason and by whatever process). Assume that the human innovates with fixed probability $ϵ \in (0, 1)$ , and when she does, she discovers a new business model out of the collection of business models of which her AI is unaware. Formally, given a history $h_{m, t}$ and some corresponding awareness frame for agent m, $A_{m, t - 1} = (B_{m, t - 1}, C_{m, t - 1}, E_{m, t - 1}, λ_{m, t - 1}),$ the probability that business model $b_{m, t} \in B_{m}^{*}$ is chosen in period t is

\Pr (b_{m, t} | h_{m, t}) \equiv \begin{cases} ρ_{t} (b_{m, t}) & \text{with probability } ϵ, \\ σ_{m}^{T} (b_{m, t} | h_{m, t}) & \text{with probability } 1 - ϵ, \end{cases}

where $ρ_{t}$ is uniform on the remaining set of business models of which agent m’s AI agent is unaware at time t. If the frame of business models of which the agent is unaware is ever exhausted, the discovery phase ends and reverts to the process described in the preceding section.

An obvious interpretation of the preceding formalism is that m is aware of her unawareness. If she is aware of the missing business models, she can immediately code them into the AI agent’s awareness frame and then, let the system choose them according to its kernel logic. If she is completely unaware of them, then she has no reason to explore in the first place. Instead, she must believe that there are other business models “out there,” the details of which she cannot presently articulate and the exact consequences of which she cannot imagine. Thus, she implements a process of discovery, and with probability $ϵ$ , she does indeed become aware of a new business model selected at time t from the set of outstanding business models according to the distribution $ρ_{t}$ .

Assume that all of the other agents follow their AI kernels, selecting exclusively from their awareness frames. Because $B_{m}^{*}$ is finite, exploration eventually triggers all potential discoveries. Similarly, each surprise consequence for an agent i adds a single new element to $C_{i, t}$ from the finite extended consequence frame. Taken together, these features immediately imply the following.

Lemma 1

(Finite Discovery). With probability 1, there exists a finite random date $T^{disc}$ such that business-model discovery and consequence surprises stop.
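The business-model part of Lemma 1 can be illustrated with a small simulation. The sketch below assumes a finite pool of `n_unaware` undiscovered business models and a fixed per-period discovery probability `epsilon`, and it draws each discovery uniformly from the remaining pool, mirroring $ρ_{t}$ . Consequence surprises and the AI kernel's within-frame choices are not modeled here, and the function name `discovery_date` is illustrative.

```python
import random


def discovery_date(n_unaware, epsilon, seed=0):
    """Simulate the greedy discovery phase; return the period in which the unaware set empties.

    n_unaware: number of business models outside the AI agent's frame (finite).
    epsilon:   per-period probability that the manager discovers a new model.
    """
    rng = random.Random(seed)
    unaware = list(range(n_unaware))  # hypothetical labels for undiscovered models
    t = 0
    while unaware:
        t += 1
        if rng.random() < epsilon:
            # rho_t: uniform draw from the models the AI agent is still unaware of
            unaware.pop(rng.randrange(len(unaware)))
        # otherwise the AI kernel picks from the current frame (not modeled here)
    return t
```

Under these assumptions, the stopping date is a sum of `n_unaware` independent geometric waiting times, so it is finite with probability 1 and has mean `n_unaware / epsilon`.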

4.2.2. Learning Phase.

Define the learning phase as the market dynamic in periods $t \geq T^{disc}$ , during which the profile of awareness frames becomes stable. For the innovative firm m, stability means that its catalog components have reached the complete frame; for all $t \geq T^{disc}$ , $B_{m, t} = B_{m}^{*}$ and $C_{m, t} = C_{m}^{*}$ (and thus, the belief objects $(E_{m, t}, λ_{m, t})$ are defined on the complete catalogs). The other agents remain locked in to their initial business-model frames. For these agents, their initial consequence frames are augmented by any additional elements that get added by random surprises. Thus, m’s rivals do notice any unexpected consequences generated by her adoption of new models. Following firm m’s discovery phase, market dynamics once again converge to approximate self-confirmation.

Theorem 2

(Discovery $\Rightarrow$ Approximate Self-Confirming Play). Fix a planning horizon $T \geq 1$ . For every $ε > 0$ , almost surely there exists a finite (random) date after which AI-driven competition constitutes an $ε$ -SCE.

The original convergence logic extends seamlessly to accommodate discovery and surprises; frame-expanding edits restart learning on an enlarged awareness frame. If frame-expanding innovation follows a point at which the market dynamic has settled into an $ε$ -SCE, then strategic lock-in is disrupted as long as new business models and their consequences are introduced into the system. These frame edits play the role of “theory revision” in the theory-based view; they expand what the AI agent can represent and therefore, restart a new statistical learning phase, after which AI-mediated competition converges to a new $ε$ -SCE in the enlarged frame.

5. AI Incentives Against the Frame-Expanding Innovation

Having established how purely AI-driven business-model selection and the greedy benchmark for frame expansion converge to an $ε$ -SCE, we now introduce a single-agent, one-shot episode of frame-expanding innovation. The last section simply assumed that one of the human managers doggedly engaged in novel innovation until her AI agent became aware of every business model that she could imagine, allowing for the possibility that her introduction of novel business models might generate surprise consequences for the agents of her rivals and herself.

In this section, we examine a manager’s decision of whether to do frame-expanding innovation in the first place. This decision creates a challenge. Given the manager’s lack of knowledge of the explicit details of the business models of which she is unaware, how can she assess whether to innovate or not? In the previous section, we sidestepped this problem by simply assuming that her vague sense that some valuable business model existed beyond her awareness frame was sufficient to induce her to innovate every period until the set of novel business models was exhausted.

Our approach to intentional assessment of known unknowns follows in the spirit of Karni and Vierø (2013, 2017) and related work. These papers focus on modeling decision makers who are aware that they are unaware of certain relevant aspects of their decision problems.

Let us return to our manager m whose market has converged to an $ε$ -SCE generated by the rolling-T kernels in accordance with Theorem 1. Assume that we pick up the action in period t with m facing the individual history $h_{m, t}$ and equipped with a stable awareness frame $A_{m}^{disc} = (B_{m}^{disc}, C_{m}^{disc}, E_{m}^{disc}, λ_{m}^{disc})$ . Let $σ^{SCE}$ be the behavior profile at that point, and to help conserve on notation, denote the associated subjective expected discounted payoff to m from maintaining the AI system’s status quo at t by $V_{m}^{SQ} = {\tilde{E}}_{m} (σ_{m}^{SCE} | {\tilde{e}}_{m}, h_{m, t})$ as defined in Equation (B.1).

5.1. To Expand the Frame or Not to Expand

Suppose m must pay a fixed cost $K > 0$ to conduct a frame-expanding innovation for an efficiency-improving business model. The agent believes that the search

succeeds with probability $\tilde{η} \in (0, 1)$ . If so, a single model ${\hat{b}}_{m}$ is revealed from her unawareness frame and added to her AI’s business-model awareness frame.
fails with probability $1 - \tilde{η}$ , in which case her awareness frame remains unchanged.

Although the agent cannot imagine precisely what she will discover, she has a subjective belief that if her exploration is successful, the new business model will result in a cost reduction per period within some range $[{\tilde{s}}_{\min}, {\tilde{s}}_{\max}]$ . To evaluate her preferences over intervals, we adopt the approach in Brandenburger and Stuart (2007) and equip her with a preference parameter $\tilde{β} \in [0, 1]$ such that

\tilde{s} (\tilde{β}) \equiv (1 - \tilde{β}) {\tilde{s}}_{\min} + \tilde{β} {\tilde{s}}_{\max} .

Thus, a larger value for $\tilde{β}$ reflects greater optimism on the part of the agent.

Finally, adopting a new business model may result in unexpected patterns of consequences and surprises for rivals. Two concerns may arise as a result. The first is that the disruptions may cause a punishment response from the market. A simple example is the punishment phase that rivals impose when they are following a grim strategy. To account for this, let $\tilde{φ} \geq 0$ denote m’s subjective assessment of the expected present value of the opportunity costs that she will incur because of an explicit punishment phase imposed by her rivals or because of poor performance that she may experience as the market enters a new convergence period provoked by the deployment of her novel business model.

The second concern is that by disrupting the status quo SCE, the agent may cause the market to converge to something different. That is, should the new business model disrupt the market, the standard learning dynamics resume on the enlarged profile of awareness frames, picking up again with $A_{i, t}$ for all $i \in N$ . Then, as in Theorem 2, market dynamics converge almost surely to a new $ε$ -SCE.

This is a problem when the original status quo $V_{m}^{SQ}$ corresponds to a high-profit SCE that could devolve into a highly competitive, low-profit equilibrium. With this in mind, assume that following the adoption of a more efficient business model and some period of disruption, the market converges to one of two SCEs at time $T^{disc} > t$ : collusive or competitive.

Under the collusive outcome, agent m realizes her efficiency improvement under the radars of her rivals, thereby enjoying an expected present value of payoffs at $T^{disc}$ of $V_{m}^{SQ} + \tilde{s} (\tilde{β}) / (1 - δ_{m})$ . Under the competitive outcome and perhaps because of surprise consequences, the efficiency gains diffuse into the market and are competed away, resulting in expected present value of payoffs equal to $V_{m}^{SQ} - \tilde{s} (\tilde{β}) / (1 - δ_{m})$ . Manager m attaches probability $ω_{m} \in [0, 1]$ that the market returns to the tacitly collusive equilibrium (and m enjoys the fruits of its more efficient business model) and $1 - ω_{m}$ that the market devolves to aggressive competition (in which the efficiency gains are not only erased but also, payoffs approach zero economic profits).

5.2. Manager M’s Expected Gain from Exploration

Conditioned on the individual history $h_{m, t}$ that triggers her decision, the expected discounted payoff from conducting a one-shot frame-expanding innovation is

E_{m}^{EXP} = \tilde{η} δ_{m}^{d} [V_{m}^{SQ} + (2 ω_{m} - 1) \frac{\tilde{s} (\tilde{β})}{1 - δ_{m}}] + (1 - \tilde{η}) V_{m}^{SQ} - (K + \tilde{φ}), \tag{1}

where $d \equiv T^{disc} - t$ is the number of periods until the new equilibrium materializes. Let $Δ V_{m} \equiv E_{m}^{EXP} - V_{m}^{SQ}$ . Then, agent m rejects exploration when $Δ V_{m} \leq 0$ and accepts otherwise. Algebraic simplification of (1) yields

Δ V_{m} = \tilde{η} (δ_{m}^{d} - 1) V_{m}^{SQ} + \tilde{η} δ_{m}^{d} (2 ω_{m} - 1) \frac{\tilde{s} (\tilde{β})}{1 - δ_{m}} - (K + \tilde{φ}) .

Note that the term with $V_{m}^{SQ}$ is nonpositive because $δ_{m}^{d} \leq 1$ . This reflects the fact that the potential efficiency benefits arrive after a delay of d periods.

Theorem 3

(Frame-Expanding Business-Model Innovation versus the AI Status Quo). Manager m undertakes the one-shot frame-expanding innovation at time t if and only if

\tilde{η} δ_{m}^{d} (2 ω_{m} - 1) \frac{\tilde{s} (\tilde{β})}{1 - δ_{m}} > K + \tilde{φ} + \tilde{η} (1 - δ_{m}^{d}) V_{m}^{SQ} . \tag{2}

Proof.

Subtract $V_{m}^{SQ}$ from both sides of (1), rearrange, and Theorem 3 obtains. □

Intuitively, innovate if the discounted expected private gain from a more efficient, still-collusive future exceeds the innovation cost plus the foregone value of staying in the status quo during the d-period relearning window.
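To make the decision rule concrete, the sketch below evaluates Equation (1) and Inequality (2) numerically. The function names and all parameter values mentioned afterward are hypothetical illustrations; only the formulas are taken from the text.

```python
def s_tilde(beta, s_min, s_max):
    """Optimism-weighted per-period cost reduction."""
    return (1 - beta) * s_min + beta * s_max


def exploration_payoff(eta, delta, d, v_sq, omega, s, K, phi):
    """Equation (1): expected discounted payoff from one-shot frame expansion."""
    success = delta ** d * (v_sq + (2 * omega - 1) * s / (1 - delta))
    return eta * success + (1 - eta) * v_sq - (K + phi)


def should_explore(eta, delta, d, v_sq, omega, s, K, phi):
    """Inequality (2): exploration premium versus option cost."""
    premium = eta * delta ** d * (2 * omega - 1) * s / (1 - delta)
    option_cost = K + phi + eta * (1 - delta ** d) * v_sq
    return premium > option_cost
```

For instance, with the hypothetical values $\tilde{η} = 0.5$ , $δ_{m} = 0.95$ , $d = 5$ , $ω_{m} = 0.8$ , $\tilde{s} (\tilde{β}) = 20$ , $K = 20$ , and $\tilde{φ} = 10$ , the condition fails when $V_{m}^{SQ} = 1{,}000$ but holds when $V_{m}^{SQ} = 200$ . A more lucrative status quo raises the waiting cost and can, by itself, switch the decision from exploring to not exploring.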

Although once set up, the proof of Theorem 3 is a matter of simple algebraic manipulation, its content is not trivial. First, the expected exploration premium is

\frac{\tilde{η} δ_{m}^{d} (2 ω_{m} - 1) \tilde{s} (\tilde{β})}{(1 - δ_{m})} .

It is positive only when m believes convergence to the previous, collusive status quo with the ability to capture efficiency gains is more likely than devolution to an aggressively competitive market ( $ω_{m} > \frac{1}{2}$ ). The premium is attenuated if the new equilibrium is far in the future (d large) or if the firm’s discount factor ( $δ_{m}$ ) is small, that is, if its cost of capital is high.

The option cost on the right combines the direct cost of frame-expanding innovation $(K + \tilde{φ})$ with the expected waiting cost because of the period of disruption that occurs in the event that a new business model is discovered, $\tilde{η} (1 - δ_{m}^{d}) V_{m}^{SQ}$ . This is the fraction of the baseline value that is forfeited while the market is in the process of convergence.

Corollary 1

(Managerial Optimism and Impatience). Exploration is never optimal if $ω_{m} \leq \frac{1}{2}$ or if $δ_{m}^{d} \leq (K + \tilde{φ}) / (\tilde{η} V_{m}^{SQ})$ . Conversely, as soon as $ω_{m} > \frac{1}{2}$ or d shortens, Inequality (2) can flip and induce exploration.

Exploration becomes attractive when the discounted expected gains from a more efficient and still-collusive future outweigh both the search cost and the foregone value of staying on the old path for d periods. The theorem and corollary show exactly how an awareness-of-unawareness attitude, optimism about postdiscovery market conduct ( $ω_{m}$ ), and the discount factor ( $δ_{m}$ , which falls as the cost of capital rises) combine to determine whether even a single episode of frame-expanding innovation is privately worthwhile.

Corollary 2

(Risk Class and the Propensity to Explore). Fix all primitives except the discount factor $δ_{m} = 1 / (1 + r_{m})$ , where $r_{m}$ is the firm’s cost of capital. If $ω_{m} > \frac{1}{2}$ (the manager believes that a return to the collusive SCE is more likely than a competitive collapse), then

\frac{\partial Δ V_{m}}{\partial δ_{m}} > 0 .

Hence, there exists a critical discount factor ${\bar{δ}}_{m} \in (0, 1)$ (equivalently, a critical cost of capital ${\bar{r}}_{m} = 1 / {\bar{δ}}_{m} - 1$ ) such that frame-expanding innovation is optimal if and only if $δ_{m} > {\bar{δ}}_{m}$ (i.e., if and only if the firm’s required return $r_{m}$ is below ${\bar{r}}_{m}$ ).
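Because $Δ V_{m}$ is increasing in $δ_{m}$ when $ω_{m} > \frac{1}{2}$ , the critical discount factor can be located numerically by bisection. The sketch below does so while holding $V_{m}^{SQ}$ and the other primitives fixed, as in the corollary. It additionally assumes $\tilde{s} (\tilde{β}) > 0$ , $\tilde{η} > 0$ , and $d \geq 1$ so that $Δ V_{m}$ is negative near $δ_{m} = 0$ and positive near $δ_{m} = 1$ . The function names and tolerance are illustrative.

```python
def delta_v(delta, eta, d, v_sq, omega, s, K, phi):
    """Net gain from exploration, Delta V_m, as a function of the discount factor."""
    premium = eta * delta ** d * (2 * omega - 1) * s / (1 - delta)
    return eta * (delta ** d - 1) * v_sq + premium - (K + phi)


def critical_discount_factor(eta, d, v_sq, omega, s, K, phi, tol=1e-10):
    """Bisection for the threshold where Delta V_m changes sign (requires omega > 1/2)."""
    if omega <= 0.5:
        raise ValueError("no threshold: the exploration premium is nonpositive")
    lo, hi = 0.0, 1.0 - 1e-12  # Delta V_m < 0 at lo and grows without bound near 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if delta_v(mid, eta, d, v_sq, omega, s, K, phi) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def critical_cost_of_capital(delta_bar):
    """Convert the critical discount factor into the critical required return."""
    return 1 / delta_bar - 1
```

Firms whose discount factor exceeds the returned threshold (equivalently, whose required return is below `critical_cost_of_capital` of that threshold) find the one-shot frame expansion worthwhile under these assumptions.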

This result implies a clear finance-cum-AI sorting logic. Firms with low systematic risk/cost of capital attach relatively high value to efficiency gains that arrive only after the AI agents finish relearning. Provided that they judge a return to a high-profit SCE to be more likely than a pivot to aggressive competition ( $ω_{m} > \frac{1}{2}$ ), these patient and low-beta incumbents are more willing to override the AI autopilot, incur the one-shot cost K, and launch a human-led frame-expanding innovation. They are, in effect, the natural suppliers of genuinely novel business-model experiments in an environment dominated by AI-driven strategies in which within-frame innovation by AI has stalled.

Conversely, firms whose shareholders demand a high required return (high-beta, capital-constrained firms) discount the same distant gains so heavily that the waiting cost on the right-hand side of Inequality (2) dominates. For them, even modest search or disruption penalties tip the balance toward maintaining a profitable, AI-induced SCE and harvesting its near-term cash flows. The model, therefore, predicts a systematic pattern. Should AI-driven business-model development become common, industries populated by risk-tolerant utilities, diversified conglomerates, or state-backed champions should display a disproportionate share of “maverick” business-model breakthroughs, whereas more volatile, leveraged sectors are likely to remain locked into the AI status quo for longer, despite recognizing that the machines have ceased to discover fundamentally new ideas. In short, cheaper capital buys the patience required to trigger a new round of AI learning.

6. Conclusion

We analyzed markets in which firms fully delegate the generation of strategy options to AI agents built on technological architectures of a kind consistent with those that presently exist. By “of a kind consistent,” we mean LLM-style generative models coupled to explicit prediction and optimization modules: large neural networks trained on vast corpora that can propose coherent strategic options together with statistical learners that forecast consequences and score options against declared objectives.

What happens when strategic decision making is fully delegated to AI systems? How does removing human judgment from strategy development affect firm behavior and the long-run potential for strategic renewal? Our analysis shows that when strategic decision making is fully delegated to AI agents running on systems consistent with modern technological architectures, markets can converge to a (near-)self-confirming equilibrium in which agents optimize business-model choices using feedback generated by their own actions. This convergence is not a failure of reasoning but an implication of rational learning within a bounded awareness frame.

On the one hand, the result that we demonstrate is remarkable. AI agents learn in a competitive setting; in a decentralized fashion, they learn to make accurate predictions about the performance consequences of the business models that they adopt from their awareness frames, and the business models that they adopt are (near) optimal with respect to those predictions. The SCE that the market settles into may be dynamic in the sense of responding to contingencies (such as demand shocks or cycling periods of market leadership). In addition, convergence to an SCE may raise the average performance of the industry as a whole—including, under some conditions, through forms of tacit coordination that resemble implicit collusion.

Nevertheless, once a market converges (and we continue to emphasize that this can happen quickly), the strategic issue is not that the AI agent ceases to adapt within its current awareness frame. Rather, the realized path can become confined to a self-confirming repertoire of business models; forecasts become accurate for the business models that are implemented repeatedly, and first-time implementations become rare long before a large within-frame catalog is exhausted. Product and process innovation may continue within those business models—for example, through R&D, marketing experimentation, and operational search—but the menu of business models used for strategic choice does not expand without a frame-expanding intervention.

Where does human agency enter? Under present deployments, persistent changes to what an AI agent can propose and evaluate are governed. Admitting a genuinely new business-model template, adding a new performance criterion, or authorizing a new measurement channel typically occurs through managerial and engineering processes rather than during on-path inference. We represent this governance as the capacity for intentional awareness-frame expansion, often triggered by evidence that the current frame is incomplete (“known unknowns” in the sense developed in Section 2). This is not a call to program the AI agent to behave irrationally. It is a call to separate fast, within-frame optimization from slower, reasoned decisions about whether the awareness frame itself should change.

Indeed, we prove that a single episode of frame-expanding innovation, which introduces new business models outside the AI agents’ awareness frame, can disrupt AI-induced strategic lock-in and thereby, set the conditions for renewed AI learning. However, this potential is tempered precisely because AI agents can create a universally comfortable status quo. We show that firms with low systematic risk and hence, lower cost of capital are more likely to tolerate the short-term opportunity costs of disruption to realize the long-term gains from superior strategic innovation. By contrast, firms with higher required rates of return discount future benefits more heavily. For them, even modest innovation costs may make it (rationally) attractive to maintain a profitable, AI-induced equilibrium and continue harvesting near-term cash flows.

Note that there is nothing irrational about the decision to embrace the status quo. Quite the contrary, human + AI policies can be perfectly rational. The benefits of operating within the status quo—given the knowledge available to the AI agents and their human overseers—may legitimately outweigh a move toward risky frame-expanding innovation. The society-level concern is that given the speed with which an SCE can arise (and here, it is worth reiterating that this may occur long before the options for within-frame innovation are adequately exhausted), strategic lock-in can foreclose the possibility of Schumpeterian creative destruction.

This suggests that the responsibility for frame-expanding innovation, which already falls disproportionately on entrepreneurial entrants willing to challenge the industry status quo, may fall even more heavily on them. Under universal incumbent AI adoption, the incumbent-entrant dichotomy may become even more salient. Entrants will not have the finely tuned AI agents that grew to master individual roles within the status quo. Therefore, a natural avenue to successful entry is the adoption of business models that are off the radar of incumbent rivals. Under these conditions, finely tuned SCEs may be fragile. Having said that, the possibility also exists that incumbent AI agents may learn to deter or neutralize challengers through subtle, implicitly coordinated responses. We put such speculation aside for future research.

Summing up, understanding the general effects of wide-scale AI adoption for business strategy guidance matters because the scale, speed, accuracy, and logical consistency of these machines change how firms arrive at the business models that drive their strategic activities. First, in the environments that we study, repeated interaction among such agents tends to settle into SCE. Delegation to AI compresses noise and accelerates within-frame convergence. Second, SCE can be robust yet wrong when awareness frames omit value-relevant possibilities; more and faster synthesis does not repair a misspecified frame. Third, human reframing—changing the set of feasible business models that define the awareness frame—remains pivotal; even a single, well-timed reframing can redirect the system toward a superior SCE (both individually and socially). These points connect directly to our formal apparatus. The agent’s decision kernel governs within-frame learning; frame expansion is (presently) a governed change to what the system can propose and evaluate.

For practice, read the division of labor this way. Strategy (in the business sense) is the choice of which business models to implement. AI agents are best used to enumerate, synthesize, stress test, and update candidate business models within the current awareness frame, whereas humans decide when the awareness frame must expand. Delegation raises baseline competence and speed; the risk is deeper commitment to an outdated frame if frame-expanding innovation is neglected. Several practical implications follow.

Configure the agent for the frame that you have. Horizon length, retrieval scope, and evaluation criteria should reflect the firm’s current commitments and constraints; these settings shape how quickly the agent converges to SCE and how it trades off stability versus responsiveness.
Run two cadences with different decision rights. Maintain high-frequency, AI-led within-frame updates (forecasting, scoring, and selecting among admitted business models). Separately, schedule lower-frequency, management-led frame audits that evaluate whether new business-model templates or new consequence labels/KPIs should be admitted and authorized for deployment. Triggers include persistent forecast residuals, repeated constraint violations, or salient environmental shocks. The goal is not random exploration but governed frame expansion when justified.
Measure two things, not one. Track within-frame learning (calibration and realized value under the current awareness frame) separately from frame-expanding interventions (catalog changes and first-time implementations of newly admitted business models). Maintain a simple frame ledger so that performance changes can be attributed to within-frame learning versus deliberate reframing.
Watch for false variety. Many business models that appear innovative because of a remix of constituent elements nevertheless reflect the same causal logic as those already in use—the status quo in disguise. Use theoretical consistency checks on proposed business models to avoid mistaking cosmetic divergence for genuine change.
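One way to make the frame ledger and the audit trigger concrete is sketched below. This is a minimal illustration under our own assumptions, not a prescribed implementation: the class names `LedgerEntry` and `FrameLedger`, the two event labels, and the rule "the last `window` absolute forecast residuals all exceed `threshold`" are hypothetical stand-ins for whatever a firm would actually adopt.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class LedgerEntry:
    period: int
    kind: str  # "within_frame" or "frame_expansion"
    note: str


@dataclass
class FrameLedger:
    """Append-only record separating within-frame updates from catalog changes."""
    entries: List[LedgerEntry] = field(default_factory=list)

    def record(self, period: int, kind: str, note: str) -> None:
        if kind not in ("within_frame", "frame_expansion"):
            raise ValueError(f"unknown ledger event kind: {kind}")
        self.entries.append(LedgerEntry(period, kind, note))

    def expansion_periods(self) -> List[int]:
        """Periods in which the business-model catalog itself was changed."""
        return [e.period for e in self.entries if e.kind == "frame_expansion"]


def audit_triggered(residuals: Sequence[float], threshold: float, window: int) -> bool:
    """True when the most recent `window` absolute residuals all exceed `threshold`."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = list(residuals)[-window:]
    return len(recent) == window and all(abs(r) > threshold for r in recent)


ledger = FrameLedger()
ledger.record(1, "within_frame", "posterior update after period-1 outcome")
ledger.record(7, "frame_expansion", "admitted a new business-model template")
needs_audit = audit_triggered([0.1, 0.6, -0.7, 0.8], threshold=0.5, window=3)
```

Keeping the two event kinds in one ledger lets a later performance change be dated against the most recent catalog change rather than attributed to within-frame learning by default.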

In principle, AI agents based on existing technological architectures have the potential to automate business-model selection, the essential task of strategic management. Moreover, they have the ability (again, in principle) to play the strategy guidance game exceptionally well. As remarkable as that may be, humans have the vital role of deciding when to change the game by expanding the awareness frames of their AIs (at least for now). Our analysis explains why SCE will eventually arise under delegation to AI agents; why high within-frame performance is not a substitute for human-driven frame expansion when strategic renewal is at stake; and why human-driven, frame-expanding innovation may be quite rationally stymied by the incentives created by the decision to delegate to AI in the first place.

Appendix A. Updating Bayesian Priors

At the start of period 1, the AI agent assigns prior weights $λ_{i, 1} \in Δ (E_{i})$ to the set of predictive theories. After it implements $b_{i, 1}$ and then, observes consequence $c_{i, 1}$ , Bayes’ rule yields posterior weights for each $e \in E_{i}$ according to

λ_{i, 2} (e_{i}) = \frac{λ_{i, 1} (e_{i}) e_{i} (c_{i, 1} | b_{i, 1}, h_{i, 1})}{\sum_{e_{i}^{'} \in E_{i}} λ_{i, 1} (e_{i}^{'}) e_{i}^{'} (c_{i, 1} | b_{i, 1}, h_{i, 1})},

where $h_{i, 1} = h_{i, 0}$ is the null history. Proceeding inductively, after observing $(b_{i, t}, c_{i, t})$, the period $t + 1$ posterior becomes

λ_{i, t + 1} (e_{i}) = \frac{λ_{i, t} (e_{i}) e_{i} (c_{i, t} | b_{i, t}, h_{i, t})}{\sum_{e_{i}^{'} \in E_{i}} λ_{i, t} (e_{i}^{'}) e_{i}^{'} (c_{i, t} | b_{i, t}, h_{i, t})} .

Then, ${\tilde{e}}_{i} (c_{i} | b_{i}, h_{i, t}) = \sum_{e_{i} \in E_{i}} e_{i} (c_{i} | b_{i}, h_{i, t}) λ_{i, t} (e_{i})$ , where $λ_{i, t}$ is $λ_{i, 1}$ updated as shown.
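The update can be sketched in a few lines of code. The example below is illustrative only: it suppresses history dependence of the predictive theories to stay short, and the theory names, the business-model label, and the probabilities are hypothetical. The function `update_weights` applies Bayes' rule to the theory weights, and `mixture_forecast` computes the mixture ${\tilde{e}}_{i}$.

```python
from typing import Dict

# theories[name][b][c] = probability that theory `name` assigns to consequence c
# after business model b (history dependence is dropped in this sketch).
Theories = Dict[str, Dict[str, Dict[str, float]]]


def update_weights(weights: Dict[str, float], theories: Theories,
                   b: str, c: str) -> Dict[str, float]:
    """One step of Bayes' rule: reweight each theory by its likelihood of c given b."""
    unnormalized = {name: w * theories[name][b][c] for name, w in weights.items()}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observed consequence has zero probability under every theory")
    return {name: v / total for name, v in unnormalized.items()}


def mixture_forecast(weights: Dict[str, float], theories: Theories,
                     b: str, c: str) -> float:
    """Posterior-predictive probability of c given b: sum over theories of e(c|b) * weight."""
    return sum(w * theories[name][b][c] for name, w in weights.items())


# Hypothetical two-theory catalog for a single business model.
theories = {
    "optimistic": {"collusive": {"high": 0.8, "low": 0.2}},
    "pessimistic": {"collusive": {"high": 0.3, "low": 0.7}},
}
prior = {"optimistic": 0.5, "pessimistic": 0.5}
posterior = update_weights(prior, theories, "collusive", "high")
forecast_after = mixture_forecast(posterior, theories, "collusive", "high")
```

Observing the high consequence shifts weight toward the theory that predicted it, so the mixture forecast for that consequence rises relative to its prior value.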

Appendix B. Rolling-T Decision Process

In each period t, agent i observes her individual history

h_{i, t} = (\emptyset, b_{i, 1}, c_{i, 1}, \dots, b_{i, t - 1}, c_{i, t - 1}) \in H_{i},

where $b_{i, τ} \in B_{i}$ is the chosen business model and $c_{i, τ} \in C_{i}$ is the observed consequence. Let $H_{i}$ denote the set of all finite histories that agent i can ever experience. Let ${\tilde{e}}_{i} (c_{i} | b_{i, t}, h_{i, t})$ be the posterior‐updated probability of observing $c_{i}$ next period given that $b_{i, t}$ is chosen and that $h_{i, t}$ has been observed so far. The posterior is updated via Bayes’ rule each time new data $(b_{i, τ}, c_{i, τ})$ are realized.

Also, given an individual history $h_{i, t}$ , let $H (h_{i, t})$ be the (cylinder) set of all individual histories $h_{i, t + T - 1}$ that agree on $h_{i, t}$ .

B.1. Finite-Horizon Value Function

Fix a planning horizon $T \geq 1$ . At the start of period t, if agent i is considering a current choice $b_{i, t} \in B_{i}$ , she thinks through the next T periods (from t to $t + T - 1$ ). A length‐T continuation is thus a finite sequence

\hat{h} = (b_{i, t}, c_{i, t}, b_{i, t + 1}, c_{i, t + 1}, \dots, b_{i, t + T - 1}, c_{i, t + T - 1})

with each $b_{i, τ} \in B_{i}$ and each $c_{i, τ} \in C_{i}$. Let $H^{T} (h_{i, t})$ be the set of all possible such continuations. Then, the T‐period subjective probability of a particular continuation $\hat{h}$, conditional on $(h_{i, t}, b_{i, t})$, is given by

\tilde{μ} (\hat{h} | h_{i, t}, b_{i, t}) = {\tilde{e}}_{i} (c_{i, t} | b_{i, t}, h_{i, t}) \prod_{k = 1}^{T - 1} q_{i, t + k} (b_{i, t + k} | h_{i, t + k}) {\tilde{e}}_{i} (c_{i, t + k} | b_{i, t + k}, h_{i, t + k}),

where $q_{i, t + k}$ is the action‐selection probability at period $t + k$ (see below) and ${\tilde{e}}_{i} (\cdot | \cdot, h_{i, t + k})$ elaborates the Bayesian-updated table of probability estimates after $h_{i, t + k}$ is observed.

Agent i’s T‐period expected discounted payoff given the choice of $b_{i, t}$ at $h_{i, t}$ is

V_{i}^{T} (b_{i, t} | h_{i, t}) = \sum_{\hat{h} \in H^{T} (h_{i, t})} \tilde{μ} (\hat{h} | h_{i, t}, b_{i, t}) [\sum_{k = 0}^{T - 1} δ_{i}^{k + 1} π_{i} (b_{i, t + k}, c_{i, t + k})],

where $π_{i}$ is agent i’s per‐period payoff function and $δ_{i}$ is her discount factor (with payoffs accruing at the end of each period).
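For small catalogs, $V_{i}^{T}$ can be computed by direct recursion over continuations. The sketch below is a hedged illustration rather than the authors' implementation: it takes the continuation action-selection rule as a caller-supplied `policy` (a uniform rule is assumed in the example), it lets theories depend on the history only through the posterior weights, and all names and numbers are hypothetical. Payoffs at look-ahead step k are discounted by $δ_{i}^{k + 1}$, as in the formula above.

```python
from typing import Callable, Dict, Sequence

Weights = Dict[str, float]
# theories[name][b][c]: probability theory `name` assigns to consequence c after model b
Theories = Dict[str, Dict[str, Dict[str, float]]]


def forecast(weights: Weights, theories: Theories, b: str, c: str) -> float:
    """Posterior-predictive probability of c given b (mixture over theories)."""
    return sum(w * theories[name][b][c] for name, w in weights.items())


def bayes_update(weights: Weights, theories: Theories, b: str, c: str) -> Weights:
    """Posterior theory weights after observing (b, c); requires forecast > 0."""
    total = forecast(weights, theories, b, c)
    return {name: w * theories[name][b][c] / total for name, w in weights.items()}


def value(b: str, weights: Weights, theories: Theories,
          payoff: Callable[[str, str], float], delta: float, horizon: int,
          policy: Callable[[Weights], Dict[str, float]],
          consequences: Sequence[str]) -> float:
    """Subjective value of choosing b now and following `policy` for the remaining steps."""
    if horizon == 0:
        return 0.0
    total = 0.0
    for c in consequences:
        prob = forecast(weights, theories, b, c)
        if prob == 0.0:
            continue  # zero-probability branch contributes nothing
        nxt = bayes_update(weights, theories, b, c)
        cont = sum(q * value(b2, nxt, theories, payoff, delta, horizon - 1,
                             policy, consequences)
                   for b2, q in policy(nxt).items())
        total += prob * (payoff(b, c) + cont)
    return delta * total  # payoffs accrue at the end of the period


def uniform_policy(weights: Weights) -> Dict[str, float]:
    """Assumed continuation rule for this example only."""
    return {"collusive": 0.5, "aggressive": 0.5}


theories = {
    "soft_rivals": {"collusive": {"high": 0.8, "low": 0.2},
                    "aggressive": {"high": 0.4, "low": 0.6}},
    "tough_rivals": {"collusive": {"high": 0.3, "low": 0.7},
                     "aggressive": {"high": 0.2, "low": 0.8}},
}
profits = {("collusive", "high"): 10.0, ("collusive", "low"): 2.0,
           ("aggressive", "high"): 12.0, ("aggressive", "low"): 1.0}
v = value("collusive", {"soft_rivals": 0.5, "tough_rivals": 0.5}, theories,
          lambda b, c: profits[(b, c)], 0.9, 3, uniform_policy, ["high", "low"])
```

The recursion enumerates every length-T continuation, so its cost grows exponentially in T; it is meant to clarify the definition, not to scale.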

B.2. Rolling-T Decision Rule

Because $B_{i}$ is finite, for each history $h_{i, t}$ there is a well‐defined maximizer of $V_{i}^{T} (\cdot | h_{i, t})$ . Therefore, define

f_{i}^{T} (h_{i, t}) \in \arg \max_{q \in Δ (B_{i})} \sum_{b_{i, t} \in B_{i}} q (b_{i, t}) V_{i}^{T} (b_{i, t} | h_{i, t}) .

(B.1)

This yields a (possibly mixed) distribution over next‐period business-model choices $b_{i, t} \in B_{i}$ . Because $f_{i}^{T}$ is well defined for every $h_{i, t} \in H_{i}$ , it defines a stationary behavior according to

σ_{i}^{T} : H_{i} \to Δ (B_{i}), where σ_{i}^{T} (h_{i, t}) = f_{i}^{T} (h_{i, t}) .

(B.2)
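Because the objective in (B.1) is linear in q, a maximizer can always be found among distributions supported on the value-maximizing business models. The sketch below illustrates this with a hypothetical helper, `rolling_choice`; the uniform tie-breaking rule and the tolerance `tol` are our assumptions, since (B.1) only requires that some maximizer be selected.

```python
from typing import Dict


def rolling_choice(values: Dict[str, float], tol: float = 1e-12) -> Dict[str, float]:
    """Return a distribution over business models that maximizes sum_b q(b) * values[b].

    All mass is placed on (near-)maximizers, split uniformly among ties.
    """
    if not values:
        raise ValueError("the business-model catalog must be nonempty")
    best = max(values.values())
    winners = [b for b, v in values.items() if v >= best - tol]
    return {b: (1.0 / len(winners) if b in winners else 0.0) for b in values}


# Hypothetical T-period values for three business models.
q = rolling_choice({"collusive": 4.2, "aggressive": 4.2, "niche": 3.1})
expected_value = sum(q[b] * v for b, v in
                     {"collusive": 4.2, "aggressive": 4.2, "niche": 3.1}.items())
```

Any other split of probability between the tied models would achieve the same objective value, which is why the decision rule is only "possibly mixed."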

If all $i \in N$ adopt these rolling‐T strategies, then $σ^{T} = (σ_{1}^{T}, \dots, σ_{n}^{T})$ together with the objective probability law p generates a unique probability measure $μ_{σ^{T}}$ on infinite market histories $(b_{1}, c_{1}, b_{2}, c_{2}, \dots)$ .

Fix agent $i \in N$, calendar date $t \geq 1$, individual history $h_{i, t} \in H_{i}$, and the updated posterior table ${\tilde{e}}_{i} (\cdot | \cdot, h_{i, t})$. Let $Z_{i} (h_{i, t}) \subset {(C_{i} \times B_{i})}^{\mathbb{N}}$ be the set of infinite individual histories $z_{i} = (b_{i, t}, c_{i, t}, b_{i, t + 1}, c_{i, t + 1}, \dots)$ that extend $h_{i, t}$. For the own-behavior map $σ_{i} : H_{i} \to Δ (B_{i})$, define the measure ${\tilde{μ}}_{σ_{i}} (\cdot | h_{i, t})$ on $Z_{i} (h_{i, t})$ recursively starting at $h_{i, t}$ by ${\tilde{μ}}_{σ_{i}} (h_{i, t} | h_{i, t}) = 1$, and then,

{\tilde{μ}}_{σ_{i}} (h_{i, s + 1}) = {\tilde{μ}}_{σ_{i}} (h_{i, s}) σ_{i} (b_{i, s} | h_{i, s}) {\tilde{e}}_{i} (c_{i, s} | b_{i, s}, h_{i, s}), s \geq t,

where we now adopt the convention of using tildes to indicate objects from the subjective perspective.

Definition B.1

(Subjective Expected Discounted Payoff). Agent i’s subjective expected present value of choosing $σ_{i}$ at information set $({\tilde{e}}_{i}, h_{i, t})$ is

{\tilde{E}}_{i} (σ_{i} | {\tilde{e}}_{i}, h_{i, t}) \equiv \int_{Z_{i} (h_{i, t})} [\sum_{s = t}^{\infty} δ_{i}^{s - t + 1} π_{i} (b_{i, s}, c_{i, s})] {\tilde{μ}}_{σ_{i}} (d z_{i} | h_{i, t}) .

(B.3)

Remark B.1.

The results of Section 3 do not rely on Dirichlet conjugacy. What matters for the learning argument is a compatibility condition in the sense of Kalai and Lehrer (1995); under the AI’s prior, every individual history that can occur under the true environment has positive probability. A finite predictive-theory catalog with strictly positive prior mass on the true response function is a convenient sufficient condition. The rolling‐T process repeats each period, always using the same formula for $V_{i}^{T}$ but with updated posterior inputs.

Let $μ_{σ} (h_{1}) = 1$ . Then, in period 1, the probability of each $h_{2} = (h_{1}, b_{1}, c_{1})$ is computed as

μ_{σ} (h_{2}) = μ_{σ} (h_{1}) \prod_{i \in N} σ_{i} (b_{i, 1} | h_{i, 1}) p (c_{1} | b_{1}) .

We can then proceed inductively for any finite history $h_{t}$ . Because the market operates for an indefinite number of periods, let $z = (b_{1}, c_{1}, b_{2}, c_{2}, \dots)$ denote an infinite history. Let Z be the set of all such histories. Then, it can be shown that $μ_{σ}$ implies a probability distribution on the cylinder set of all infinite histories, Z.

Given a behavior profile $σ$ , let $μ_{σ}^{i} (\cdot | h_{i, t})$ denote the conditional marginal on agent i’s individual continuations. Specifically, for every $h_{i, τ} \in H_{i} (h_{i, t})$ with $τ \geq t$ ,

μ_{σ}^{i} (h_{i, τ} | h_{i, t}) \equiv \sum_{h_{- i, τ}} μ_{σ} ((h_{i, τ}, h_{- i, τ}) | h_{i, t}),

(B.4)

where the sum ranges over all opponents’ continuations $h_{- i, τ}$ that, together with $h_{i, τ}$, yield a valid market history.

Definition B.2

( $ε$ -Close). For probability measures $μ$ and $\tilde{μ}$ on the same measurable space, define the total variation distance

d (μ, \tilde{μ}) \equiv \sup_{A} | μ (A) - \tilde{μ} (A) | .

We say that $\tilde{μ}$ is $ε$ -close to $μ$ if $d (μ, \tilde{μ}) \leq ε$ .
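On a finite outcome space, the supremum over events in Definition B.2 equals half the sum of absolute pointwise differences, provided both arguments are probability distributions. The sketch below computes the distance both ways so the equivalence can be checked on small examples; the function names and the example distributions are hypothetical.

```python
from itertools import chain, combinations
from typing import Dict, Hashable

Dist = Dict[Hashable, float]


def total_variation(mu: Dist, mu_tilde: Dist) -> float:
    """Half the L1 distance between two distributions on a finite set."""
    support = set(mu) | set(mu_tilde)
    return 0.5 * sum(abs(mu.get(x, 0.0) - mu_tilde.get(x, 0.0)) for x in support)


def sup_over_events(mu: Dist, mu_tilde: Dist) -> float:
    """Brute-force sup_A |mu(A) - mu_tilde(A)| over all subsets (small spaces only)."""
    points = list(set(mu) | set(mu_tilde))
    events = chain.from_iterable(combinations(points, r) for r in range(len(points) + 1))
    return max(abs(sum(mu.get(x, 0.0) - mu_tilde.get(x, 0.0) for x in event))
               for event in events)


def is_eps_close(mu: Dist, mu_tilde: Dist, eps: float) -> bool:
    """Definition B.2 on a finite space."""
    return total_variation(mu, mu_tilde) <= eps


objective = {"high": 0.5, "medium": 0.3, "low": 0.2}
subjective = {"high": 0.4, "medium": 0.3, "low": 0.3}
distance = total_variation(objective, subjective)
```

The brute-force version enumerates every event and is exponential in the number of outcomes; it is included only to make the definition tangible.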

Definition B.3

(Self-Confirming Equilibrium). Fix a planning horizon $T \geq 1$ . A pair $(σ, \tilde{e})$ is a self-confirming equilibrium at history $h_{t}$ if for every firm i, we have the following.

Subjective optimization (rolling T). At $h_{i, t}$ , the AI’s recommended mixed action $σ_{i} (\cdot | h_{i, t}) \in Δ (B_{i})$ maximizes its T-step subjective value function $V_{i}^{T} (\cdot | h_{i, t})$ :
$σ_{i} (\cdot | h_{i, t}) \in \arg \max_{q \in Δ (B_{i})} \sum_{b_{i} \in B_{i}} q (b_{i}) V_{i}^{T} (b_{i} | h_{i, t}),$
where $V_{i}^{T}$ is defined in Appendix B from the posterior table ${\tilde{e}}_{i} (\cdot | b_{i}, h_{i, t})$ .
Uncontradicted on-path beliefs. The subjective conditional distribution on agent i’s individual continuation histories induced by $(σ_{i}, {\tilde{e}}_{i})$ matches the true conditional marginal induced by $(σ, p)$ :
${\tilde{μ}}_{σ_{i}} (\cdot | h_{i, t}) = μ_{σ}^{i} (\cdot | h_{i, t})$.³³

Definition B.4

((ε,η)-SCE). Fix $ε \geq 0$ , $η \geq 0$ , and a planning horizon $T \geq 1$ . A pair $(σ, \tilde{e})$ is an $(ε, η)$ -self-confirming equilibrium at history $h_{t}$ if for every firm i, we have the following.

Subjective optimization (rolling T). At $h_{i, t}$ , $σ_{i} (\cdot | h_{i, t})$ maximizes $\sum_{b_{i}} q (b_{i}) V_{i}^{T} (b_{i} | h_{i, t})$ as in Definition B.3.
$ε$ -uncontradicted on-path beliefs. The subjective conditional marginal ${\tilde{μ}}_{σ_{i}} (\cdot | h_{i, t})$ is $ε$ -close to the true conditional marginal $μ_{σ}^{i} (\cdot | h_{i, t})$ .
$η$ -approximate optimality for one-shot deviations. Let $σ_{i}^{〈 b_{i} 〉}$ denote the one-shot deviation that plays $b_{i}$ at $h_{i, t}$ and then, coincides with $σ_{i}$ from period $t + 1$ onward. Then,
${\tilde{E}}_{i} (σ_{i}^{〈 b_{i} 〉} | {\tilde{e}}_{i}, h_{i, t}) \leq {\tilde{E}}_{i} (σ_{i} | {\tilde{e}}_{i}, h_{i, t}) + η, \forall b_{i} \in B_{i} .$

The following theorem is an expanded version of Theorem 1 that includes a claim about planning bounds.

Theorem B.1

(Finite-Horizon Convergence to $(ε, η_{T})$ -SCE, Precise). Fix a finite planning horizon $T \geq 1$ and discount factors $δ_{i} \in (0, 1)$ . Under the assumptions listed in Appendix B, we have the following.

Compatibility. For each agent i, the objective conditional marginal $μ_{σ^{T}}^{i} (\cdot | h_{i, t})$ is compatible with the AI’s subjective conditional marginal ${\tilde{μ}}_{σ_{i}^{T}} (\cdot | h_{i, t})$ in the sense of Kalai and Lehrer (1995).
Belief merging. For every $ε > 0$ , almost surely there exists a finite random date $t_{0} (ε)$ such that for all $t \geq t_{0} (ε)$ and every i,
${\tilde{μ}}_{σ_{i}^{T}} (\cdot | h_{i, t})$ is $ε$-close to $μ_{σ^{T}}^{i} (\cdot | h_{i, t})$.
Finite-horizon planning bound. Let ${\bar{π}}_{i} \equiv \max_{b_{i} \in B_{i}, c_{i} \in C_{i}} | π_{i} (b_{i}, c_{i}) |$ , and define
$η_{T} \equiv \max_{i \in N} 2 {\bar{π}}_{i} \frac{δ_{i}^{T + 1}}{1 - δ_{i}} .$
Then, at any history $h_{i, t}$ , the gain from any one-shot deviation in the current business-model choice (holding fixed the continuation kernel) is bounded above by $η_{T}$ in the AI’s subjective expected present value.
Consequently. For every $ε > 0$ , almost surely there exists a finite (random) calendar time $t_{0} (ε, T)$ such that from period $t_{0} (ε, T)$ onward, the rolling-T market dynamics constitute a $(ε, η_{T})$ -SCE as in Definition B.4.
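The planning bound $η_{T}$ is easy to evaluate numerically. The sketch below computes it from payoff bounds and discount factors and, as a convenience that is not in the theorem, searches for the smallest horizon at which a single firm's bound falls below a chosen tolerance. The function names and the inputs in the example are hypothetical.

```python
from typing import Sequence


def eta_T(pi_bars: Sequence[float], deltas: Sequence[float], horizon: int) -> float:
    """max_i 2 * pi_bar_i * delta_i**(T + 1) / (1 - delta_i)."""
    if len(pi_bars) != len(deltas):
        raise ValueError("need one payoff bound and one discount factor per firm")
    return max(2.0 * pb * d ** (horizon + 1) / (1.0 - d)
               for pb, d in zip(pi_bars, deltas))


def min_horizon(pi_bar: float, delta: float, target: float) -> int:
    """Smallest T >= 1 whose single-firm bound is at most `target`."""
    if not (0.0 < delta < 1.0) or target <= 0.0:
        raise ValueError("need 0 < delta < 1 and target > 0")
    horizon = 1
    while 2.0 * pi_bar * delta ** (horizon + 1) / (1.0 - delta) > target:
        horizon += 1
    return horizon


# Hypothetical inputs: payoff bound 1, discount factor 0.5.
bound = eta_T([1.0], [0.5], horizon=1)
needed = min_horizon(1.0, 0.5, target=0.1)
```

Because the bound decays geometrically in T, modest increases in the planning horizon shrink the possible one-shot deviation gain quickly unless the discount factor is close to 1.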

Proof of Theorem 1.

Assume the following.

B1 (Finite action and consequence sets). There is a finite set of firms N, and for each firm i, the business-model set $B_{i}$ and consequence set $C_{i}$ are finite.
B2 (Nature’s law). Nature’s law, $p (c_{t} | b_{t})$ , is stationary over time.
B3 (Observed signal). Each AI observes its own past business-model choices and realized consequences (i.e., $(b_{i, t}, c_{i, t})$ each period).
B4 (Finite predictive-theory catalogs and positive prior support (compatibility)). For each firm i, the AI’s predictive model is a Bayesian mixture over a finite catalog $E_{i}$ of candidate predictive theories (individual response functions) $e_{i} : H_{i} \times B_{i} \to Δ (C_{i})$ , with prior $λ_{i} (e) > 0$ for all $e \in E_{i}$ . Moreover, the true response function induced by $(p, σ_{- i}^{T})$ belongs to the catalog (at the AI’s representational resolution); there exists $e_{i}^{*} \in E_{i}$ with $λ_{i} (e_{i}^{*}) > 0$ such that the distribution over i’s individual histories generated by $(σ_{i}^{T}, e_{i}^{*})$ coincides with the objective marginal $μ_{σ^{T}}^{i}$ .
B5 (Rolling-T decision kernel). At each individual history $h_{i, t}$ , firm i’s AI chooses a business model using the rolling-T kernel f based on its current posterior-predictive table ${\tilde{e}}_{i} (\cdot | \cdot, h_{i, t})$ . Learning is Bayesian; the posterior weights $λ_{i, t}$ on the predictive-theory catalog $E_{i}$ are updated by Bayes’ rule as $(b_{i, t}, c_{i, t})$ observations arrive, and ${\tilde{e}}_{i}$ is the induced mixture forecast. The resulting behavior profile is denoted $σ^{T}$ .
Step 1. Compatibility. Under B4, the subjective measure ${\tilde{μ}}_{σ_{i}^{T}}$ is a Bayesian mixture over the measures generated by each predictive theory in $E_{i}$ . Because the true predictive theory $e_{i}^{*}$ has strictly positive prior mass, every event that has positive probability under the true objective marginal also has positive probability under the subjective mixture. Equivalently,
$μ_{σ^{T}}^{i} \ll {\tilde{μ}}_{σ_{i}^{T}}$ for each $i$,
which is the compatibility condition of Kalai and Lehrer (1995).
Step 2. Belief merging. By the merging result in Kalai and Lehrer (1995, theorem 3.1 and the discussion following it), compatibility implies that the AI’s subjective conditional marginals merge with the true conditional marginals on the realized path. In particular, for every $ε > 0$ , almost surely there exists a finite random date $t_{0} (ε)$ such that for all $t \geq t_{0} (ε)$ and every i,
${\tilde{μ}}_{σ_{i}^{T}} (\cdot | h_{i, t})$ is $ε$-close to $μ_{σ^{T}}^{i} (\cdot | h_{i, t})$.
Step 3. Finite-horizon planning bound. Fix an agent i and history $h_{i, t}$ . Let ${\bar{π}}_{i} = \max_{b_{i}, c_{i}} | π_{i} (b_{i}, c_{i}) | < \infty$ . The discounted contribution of payoffs beyond the T-period look ahead is bounded in absolute value by
${\bar{π}}_{i} \sum_{k = T}^{\infty} δ_{i}^{k + 1} = {\bar{π}}_{i} \frac{δ_{i}^{T + 1}}{1 - δ_{i}} .$
Because $σ_{i}^{T}$ maximizes the T-step value $V_{i}^{T} (\cdot | h_{i, t})$ , any one-shot deviation can only improve the AI’s infinite-horizon subjective expected present value through this discounted tail; hence, the gain is bounded above by twice the tail bound. Taking the maximum across i yields the stated $η_{T}$ .
Step 4. Conclusion. Step 2 gives the $ε$ -belief component in Definition B.4. Step 3 gives the $η_{T}$ one-shot-deviation bound, and rolling-T subjective optimization holds by construction of $f_{i}^{T}$ . Therefore, for every $ε > 0$ , almost surely there exists $t_{0} (ε, T)$ such that the rolling-T dynamics constitute an $(ε, η_{T})$ -SCE from period $t_{0} (ε, T)$ onward. □

Endnotes

¹ This literature has grown so large so quickly that we already have multiple literature reviews on the topic, several of which we cite here.

² Throughout this paper, we use “AI agent” or “agent” to refer to the decision-making entity in our theoretical model—the autonomous system that selects strategies, forms beliefs, and learns from outcomes. We use “AI system” more broadly to refer to real-world AI technologies and implementations that such agents aim to model.

³ Our representation allows for improvements to present AI technologies through scaling, optimization, or incremental refinements, such as increased computational power, larger data sets, enhanced memory, or algorithmic tweaks—improvements collectively referred to as “within-paradigm advancements.” We stop short of speculating on disruptive breakthroughs, such as fundamentally novel architectures beyond current deep learning systems or alternative computational paradigms, that might enable artificial general intelligence. We have only seen the tip of the iceberg of potential applications of existing technologies. Beyond that, any speculation about paradigm-shifting breakthroughs is likely to be unreliable.

⁴ The notion of “self-confirming” or “subjective” equilibrium was formalized in game theory. Pioneering papers include Fudenberg and Levine (1993) and Kalai and Lehrer (1993, 1995). This formalism was introduced to strategy in Ryall (2003). Selected examples of related work include Repenning and Sterman (2002), Sorenson and Waguespack (2006), Adner et al. (2009), Alvarez and Parker (2009), Felin and Foss (2009), Ferraro et al. (2009), Ryall (2009), Powell et al. (2011), Menon (2018), Denrell et al. (2019), Bryan et al. (2022), Menon and Yao (2024), and Shelef et al. (2025).

⁵ Experimentation stops only when it is optimal to do so given the AI’s priors.

⁶ Given finite storage capacities of both humans and AI agents, human uniqueness in the sense of having supra-AI cognitive capacities is not required for a human to be aware of a different set of things than her AI agent. Nevertheless, as we discuss in greater detail below, we find the human capacity for actionable awareness-of-unawareness states to be a compelling source of frame-extending discoveries.

⁷ Our predictive theories correspond to what Kalai and Lehrer (1995) refer to as “environment response functions.”

⁸ One need not interpret the business-model catalog as being fully elaborated during generation and then stored in physical media. Rather, think of it as the complete set of business models that could be generated from a finite (although potentially enormous) collection of primitive data.

⁹ The distinction between uncertainty and unawareness has been formalized in decision theory. Standard probability theory provably cannot represent unawareness. One can assign probability 0 to a proposition, but one cannot exclude a proposition from a sigma algebra in a probability space and simultaneously reason about that exclusion; see Modica and Rustichini (1994), Heifetz et al. (2006), and Bryan et al. (2022) for a strategy application. Our claim is that the architectural features of present AI paradigms—operating over fixed representational spaces with probabilistic or quasiprobabilistic evaluation—inherit this limitation.

¹⁰ We thank an anonymous reviewer for these examples, which are actual outputs from an LLM.

¹¹ Not modeled but implicitly available to the agent is a structured database that associates each historically observed business model with an elaboration of its constituent elements (i.e., resources and the activity systems that they can support).

¹² We treat $B_{i}$ as being fully generated and stored at the beginning of time, but its essential purpose is to define all of the within-frame innovations that the AI agent could come up with given its initial training procedure.

¹³ Formally, $B \equiv \times_{i} B_{i}$ .

¹⁴ Formally, $C \equiv \times_{i} C_{i}$ .

¹⁵ We assume that p is stationary (i.e., $p (c_{t} | b_{t})$ is independent of t).

¹⁶ Formally, $σ_{i} : H_{i} \to Δ (B_{i})$ .

¹⁷ Formally, $π_{i} : B_{i} \times C_{i} \to \mathbb{R}$ .

¹⁸ The marginal distribution on firm i’s individual histories is denoted $μ_{σ}^{i}$ . The construction of a cylinder-set measure on infinite histories appears in Appendix B.

¹⁹ Note that a centralized AI agent is not acting as an overarching market coordination device and that the individual AI agents are not assumed to be communicating with each other.

²⁰ Formally, $e_{i} : H_{i} \times B_{i} \to Δ (C_{i})$ .

²¹ Without some “grain-of-truth” condition in the sense of real-world compatibility à la Kalai and Lehrer (1995), asymptotically calibrated prediction cannot generally be achieved from feedback alone.

²² See Appendix A for the details.

²³ This is what they care about at least in principle. The short-term focus of managers in the real world is a topic for another paper.

²⁴ Equivalently, total variation distance is at most $ε$ ; Appendix B records the formal definition.

²⁵ For the precise formal definitions and results, see Appendix B and Definition B.4.

²⁶ See Appendix B for the precise details.

²⁷ It is not controversial to point out that much ink has been spilled in the strategy literature cataloging all of the ways in which humans fail on cognitive tasks like these.

²⁸ In many deployments, these catalogs are versioned artifacts: the library of admissible business-model templates that the system may propose and implement and the dictionary of outcome labels/KPIs that the evaluator forecasts and optimizes.

²⁹ These priors are chosen only to make the logic transparent; the AI initially views the “collusive” business model as associated with relatively high prices, whereas more aggressive business models are associated with lower prices. The convergence theorem does not rely on Dirichlet updating; it relies on finite predictive-theory catalogs with full-support priors (Appendix B).

³⁰ For each $k_{i}$ , the one-period expected profit equals $q_{i} (k_{i}) \cdot E_{{\tilde{e}}_{i} (\cdot | k_{i}, h_{i, 1})} [P - 10]$ , where $q_{i} (k_{i}) \in \{22.5, 30.0, 45.0\}$ is the output associated with business model $k_{i}$ .

³¹ This logic differs from textbook repeated-game collusion, which sustains cooperation via threats and punishments. Here, “collusion” is sustained by belief-driven optimization plus selective feedback; the only beliefs disciplined by data are those attached to on-path business models.

³² This is not a claim about what the AI could list, imagine, or discuss off path; it is a claim about which business models are actually implemented and generate feedback in the market.

³³ Given a behavior profile $σ$ , $μ_{σ}^{i} (\cdot | h_{i, t})$ is the true conditional marginal distribution on agent i’s individual continuation histories from period t onward. The subjective counterpart ${\tilde{μ}}_{σ_{i}} (\cdot | h_{i, t})$ is induced by agent i’s behavior $σ_{i}$ and posterior table ${\tilde{e}}_{i}$ as in Appendix B.

References

Adner R, Pólos L, Ryall M, Sorenson O (2009) The case for formal theory. Acad. Management Rev. 34(2):201–208.
Alvarez SA, Parker SC (2009) Emerging firms and the allocation of control rights: A Bayesian approach. Acad. Management Rev. 34(2):209–227.
Arias-Pérez J, Vélez-Jaramillo J, Callegaro-de Menezes D (2025) Leveraging artificial intelligence capability and open innovation to optimize agility: Is generative AI outmatching human expertise? J. Knowledge Econom., ePub ahead of print June 16, https://doi.org/10.1007/s13132-025-02799-2.
Barney JB (1989) Asset stocks and sustained competitive advantage: A comment. Management Sci. 35(12):1511–1513.
Belhadi A, Mani V, Kamble SS, Khan SAR, Verma S (2021) Artificial intelligence-driven innovation for enhancing supply chain resilience and performance under the effect of supply chain dynamism: An empirical investigation. Ann. Oper. Res. 3(2):1–26.
Bender EM, Gebru T, McMillan-Major A, Shmitchell S (2021) On the dangers of stochastic parrots: Can language models be too big? Proc. 2021 ACM Conf. Fairness Accountability Transparency (Association for Computing Machinery, New York), 610–623.
Brandenburger A, Stuart H (2007) Biform games. Management Sci. 53(4):537–549.
Bryan KA, Ryall MD, Schipper BC (2022) Value capture in the face of known and unknown unknowns. Strategy Sci. 7(3):157–189.
Chollet F (2019) On the measure of intelligence. Preprint, submitted November 25, https://arxiv.org/abs/1911.01547.
Csaszar FA, Ketkar H, Kim H (2024) Artificial intelligence and strategic decision-making: Evidence from entrepreneurs and investors. Strategy Sci. 9(4):322–345.
Denrell J, Fang C, Liu C (2019) In search of behavioral opportunities from misattributions of luck. Acad. Management Rev. 44(4):896–915.
Doshi AR, Bell JJ, Mirzayev E, Vanneste BS (2025) Generative artificial intelligence and evaluating strategic decisions. Strategic Management J. 46(3):583–610.
Enholm IM, Papagiannidis E, Mikalef P, Krogstie J (2022) Artificial intelligence and business value: A literature review. Inform. Systems Frontiers 24(5):1709–1734.
Felin T, Foss NJ (2009) Social reality, the boundaries of self-fulfilling prophecy, and economics. Organ. Sci. 20(3):654–668.
Felin T, Holweg M (2024) Theory is all you need: AI, human cognition, and causal reasoning. Strategy Sci. 9(4):346–371.
Felin T, Zenger TR (2017) The theory-based view: Economic actors as theorists. Strategy Sci. 2(4):258–271.
Ferraro F, Pfeffer J, Sutton RI (2009) How and why theories matter: A comment on Felin and Foss (2009). Organ. Sci. 20(3):669–675.
Fudenberg D, Levine DK (1993) Self-confirming equilibrium. Econometrica 61(3):523–545.
Gans J, Ryall MD (2017) Value capture theory: A strategic management review. Strategic Management J. 38(1):17–41.
Haefner N, Wincent J, Parida V, Gassmann O (2021) Artificial intelligence and innovation management: A review, framework, and research agenda. Tech. Forecasting Soc. Change 162:120392.
Heifetz A, Meier M, Schipper BC (2006) Interactive unawareness. J. Econom. Theory 130(1):78–94.
Joshi S (2025) The role of artificial intelligence in strategic decision-making: A comprehensive review. Preprint, submitted April 30, https://doi.org/10.20944/preprints202505.0047.v1.
Kalai E, Lehrer E (1993) Subjective equilibrium in repeated games. Econometrica 61(5):1231–1240.
Kalai E, Lehrer E (1995) Subjective games and equilibria. Games Econom. Behav. 8(1):123–163.
Karni E, Vierø M-L (2013) “Reverse Bayesianism”: A choice-based theory of growing awareness. Amer. Econom. Rev. 103(7):2790–2810.
Karni E, Vierø M-L (2017) Awareness of unawareness: A theory of decision making in the face of ignorance. J. Econom. Theory 168:301–328.
Kemp A (2024) Competitive advantage through artificial intelligence: Toward a theory of situated AI. Acad. Management Rev. 49(3):618–635.
Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, et al. (2017) Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. USA 114(13):3521–3526.
Krakowski S, Luger J, Raisch S (2023) Artificial intelligence and the changing sources of competitive advantage. Strategic Management J. 44(6):1425–1452.
Lake BM, Ullman TD, Tenenbaum JB, Gershman SJ (2017) Building machines that learn and think like people. Behav. Brain Sci. 40(10):e253.
LeCun Y (2022) A path towards autonomous machine intelligence version 0.9.2, 2022-06-27. Open Rev. 62(1):1–62.
López-Solís O, Luzuriaga-Jaramillo A, Bedoya-Jara M, Naranjo-Santamaría J, Bonilla-Jurado D, Acosta-Vargas P (2025) Effect of generative artificial intelligence on strategic decision-making in entrepreneurial business initiatives: A systematic literature review. Admin. Sci. 15(2):66.
McCloskey M, Cohen NJ (1989) Catastrophic interference in connectionist networks: The sequential learning problem. Bower GH, ed. Psychology of Learning and Motivation, vol. 24 (Academic Press, Cambridge, MA), 109–165.
Menon A (2018) Bringing cognition into strategic interactions: Strategic mental models and open questions. Strategic Management J. 39(1):168–192.
Menon A, Yao D (2024) Rationalizing outcomes: Interdependent learning in competitive markets. Strategy Sci. 9(2):97–117.
Mikalef P, Krogstie J, Pappas IO, Pavlou P (2020) Exploring the relationship between big data analytics capability and competitive performance: The mediating roles of dynamic and operational capabilities. Inform. Management 57(2):103169.
Modica S, Rustichini A (1994) Awareness and partitional information structures. Theory Decision 37(1):107–124.
Nikzat P (2025) Review of artificial intelligence (AI) revolution and strategic competitive advantage in business and management. Amer. J. Indust. Bus. Management 15(11):1685–1699.
Pearl J (2009) Causality: Models, Reasoning, and Inference, 2nd ed. (Cambridge University Press, Cambridge, UK).
Porter ME (1980) Competitive Strategy: Techniques for Analysing Industries and Competitors (Free Press, New York).Google Scholar
Porter ME (1996) From competitive advantage to corporate strategy. Goold M, Luchs KS, eds. Managing the Multibusiness Company: Strategic Issues for Diversified Groups (Routledge, Oxford, UK), 234–255.
Powell TC, Lovallo D, Fox CR (2011) Behavioral strategy. Strategic Management J. 32(13):1369–1386.
Repenning NP, Sterman JD (2002) Capability traps and self-confirming attribution errors in the dynamics of process improvement. Admin. Sci. Quart. 47(2):265–295.
Ryall MD (2003) Subjective rationality, self-confirming equilibrium, and corporate strategy. Management Sci. 49(7):936–949.
Ryall MD (2009) Causal ambiguity, complexity, and capability-based advantage. Management Sci. 55(3):389–403.
Schipper BC (2021) Discovery and equilibrium in games with unawareness. J. Econom. Theory 198:105365.
Shelef O, Wuebker R, Barney JB (2025) Heisenberg effects in experiments on business ideas. Acad. Management Rev. 50(3):560–580.
Sorenson O, Waguespack DM (2006) Social structure and exchange: Self-confirming dynamics in Hollywood. Admin. Sci. Quart. 51(4):560–589.
van de Ven GM, Soures N, Kudithipudi D (2024) Continual learning and catastrophic forgetting. Preprint, submitted March 8, https://arxiv.org/abs/2403.05175.
Wernerfelt B (1984) A resource-based view of the firm. Strategic Management J. 5(2):171–180.
Xu Y, Li W, Vaezipoor P, Sanner S, Khalil EB (2023) LLMs and the abstraction and reasoning corpus: Successes, failures, and the importance of object-based representations. Preprint, submitted May 26, https://arxiv.org/abs/2305.18354.
Yang L, Shirvaikar V, Clivio O, Falck F (2024) A critical review of causal reasoning benchmarks for large language models. Preprint, submitted July 10, https://arxiv.org/abs/2407.08029.

Nataliia Neshenko is an assistant professor in the Department of Information Technology and Operations Management at Florida Atlantic University. She holds a PhD in Computer Science from FAU. Her research combines computational methods, machine learning, and formal modeling to study strategic and governance implications of AI delegation in competitive and adversarial settings. Her work has appeared in Risk Analysis, IEEE Communications Surveys and Tutorials, and the Journal of Big Data.

Michael D. Ryall is a professor of management at Florida Atlantic University, Director of Policy at the Madden Center for Value Creation, and professor emeritus at the University of Toronto’s Rotman School of Management. A pioneer in game-theoretic foundations of competitive strategy, his research spans value capture, strategic learning, causal inference, and AI in competitive environments. His work appears in Management Science, Strategic Management Journal, and Academy of Management Review.

Volume 11, Issue 1

March 2026

Pages 1-179, ii

Article Information

Received: May 01, 2025
Accepted: January 17, 2026
Published Online: February 12, 2026

Cite as

Nataliia Neshenko, Michael D. Ryall (2026) When Artificial Intelligence Does Strategy: Learning, Good Times, Lock-in, and Human-Driven Strategic Renewal. Strategy Science 11(1):157-179.

https://doi.org/10.1287/stsc.2025.0448
