March 5, 2026 in Healthcare Analytics

Digital First Responder: A Scalable AI to Uncover Hidden Patient Risk

https://doi.org/10.1287/orms.2026.01.15

A horse! A horse! My kingdom for a horse! 
– William Shakespeare, Richard III

Prioritizing patients for treatment is often a life-and-death decision. Frontline clinicians must decide with incomplete information, and an incorrect decision could result in delayed care or even death. An artificial intelligence (AI) assistant could “triage at scale” to help clinicians in challenging environments, such as disaster response after an earthquake.

Gaps in Current Triage
Consider this scenario: A 69-year-old male with a minor cut on his head from a collapsed ceiling is conscious, reports mild dizziness and walks around to check on others. To the field clinician, he appears to be a low-risk patient. Unfortunately, he is taking warfarin (a blood thinner that reduces his blood's ability to clot). He could be bleeding inside his skull and therefore needs a head scan immediately. This is where an AI triage assistant could synthesize his full clinical history to inform the frontline staff and reduce their cognitive load.

What Is Triage, and Why Is It Important?
Triage is the process of sorting patients by the urgency of their condition. Emergency departments (EDs) are often overcrowded, so triage clinicians frequently have to prioritize patients based on observation alone. We therefore propose a large language model (LLM) tool that informs users of a patient's relevant clinical history to achieve “triage at scale,” which is particularly critical during large-scale crises.

Beyond crisis management, this technology can help clinicians navigate the daily complexities of caring for an aging population by offering physicians the most relevant information at the point of care. Ultimately, deploying such a tool could improve the health system’s allocative efficiency, providing a scalable strategy to tackle rapidly growing health care expenditures.

The Components and Rationale
The proposed solution has three elements: a data layer, a triage decision engine and a validation layer. The data layer uses Fast Healthcare Interoperability Resources (FHIR), a modern standard for electronic health information exchange. The triage LLM needs to ingest and understand the data in this native format, which supports standardization and auditability.
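As a rough illustration of what the data layer might hand to the triage LLM, the sketch below builds a tiny FHIR-style Bundle for the patient in the earlier scenario and lists his active medications. The resource types and field names (Bundle, Patient, MedicationStatement, medicationCodeableConcept) follow FHIR R4; the patient id, the birth date and the helper active_medications are hypothetical and are not taken from the authors' prototype.

```python
import json

# Hypothetical minimal FHIR R4-style Bundle; the id and birth date are made up.
bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"resource": {
            "resourceType": "Patient",
            "id": "example-patient",
            "gender": "male",
            "birthDate": "1957-01-01",
        }},
        {"resource": {
            "resourceType": "MedicationStatement",
            "status": "active",
            "subject": {"reference": "Patient/example-patient"},
            "medicationCodeableConcept": {"text": "warfarin"},
        }},
    ],
}


def active_medications(bundle):
    """Return the display text of every active MedicationStatement."""
    meds = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if (resource.get("resourceType") == "MedicationStatement"
                and resource.get("status") == "active"):
            concept = resource.get("medicationCodeableConcept", {})
            meds.append(concept.get("text", "unknown"))
    return meds


# Serializing the bundle unchanged keeps the model input easy to audit.
prompt_payload = json.dumps(bundle, indent=2)
print(active_medications(bundle))
```

Passing the serialized bundle as-is, rather than a hand-written summary, is one way to keep the model input traceable to the source record.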

The second element is a triage decision engine that uses an open-source or open-weight LLM where possible. Clinical data is sensitive, and it may not be permissible to send it to a commercial LLM. Another benefit is consistency: a locally controlled model is not affected by unplanned changes.

The third element is a validation layer (LLMs-as-judges) that provides quality assurance for the triage LLM's suggestions. During the development phase, the reasoning behind each recommendation from the triage LLM is compared with the explanations (ground truth) in the test data to establish validity. During the implementation phase, the reasoning is checked for consistency. It is not practical to validate every decision, but validation should run at regular intervals to catch unexpected model degradation. Figure 1 illustrates how an ensemble of judge LLMs, drawn from different model families for diversity and to minimize inherent bias, validates the output of a specialized LLM as a form of “AI Peer Review.”

Figure 1: An illustrative diagram on using LLMs-as-judges (AI Peer Review).
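One simple way to combine the judges' outputs is a majority vote with a disagreement score. The sketch below assumes each judge returns a boolean verdict on whether the triage reasoning is consistent; the function name peer_review, the judge names and the boolean format are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def peer_review(judge_verdicts):
    """Aggregate verdicts from several judge models.

    judge_verdicts maps a judge name to True (reasoning judged consistent)
    or False (judged inconsistent). Returns the majority verdict and the
    share of judges that disagree with it.
    """
    if not judge_verdicts:
        raise ValueError("at least one judge verdict is required")
    counts = Counter(judge_verdicts.values())
    majority, majority_votes = counts.most_common(1)[0]
    disagreement = 1 - majority_votes / len(judge_verdicts)
    return majority, disagreement


# Hypothetical judges from three different model families.
verdict, disagreement = peer_review(
    {"judge_a": True, "judge_b": True, "judge_c": False}
)
print(verdict, round(disagreement, 2))
```

Using an odd number of judges avoids ties; with an even number, this sketch falls back to whichever verdict was seen first.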

Prototype Performance Comparable to Baseline Human Accuracy
The lead author created a proof-of-concept prototype using 100 clinical cases from an emergency medicine handbook [1], which were transformed into FHIR using Gemma 3 27B (an open-weight LLM). MedGemma 27B (4-bit quantized), a clinically trained open-weight LLM from Google, serves as the triage LLM because it understands clinical terms and can directly ingest FHIR data.

The overall exact-match accuracy between the actual and predicted Emergency Severity Index (ESI) is 56%, which is on par with the 59% achieved by humans [2]. The LLM tends to be more cautious (guardrail bias), with only 2 out of 100 cases being undertriaged. The clinical reasoning behind the predicted severity is consistent with the handbook explanations in 90% of cases, where consistency was assessed by checking whether the model cited the same risk factors as the handbook rationale. Because clinicians remain the decision-makers, they need to see and trust the clinical reasoning, which is a core concern of explainable AI.
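The metrics above can be computed along the following lines. This is a minimal sketch on made-up toy labels, not the prototype's data; it relies on the fact that ESI runs from 1 (most urgent) to 5 (least urgent), so a predicted level numerically higher than the actual level counts as undertriage. The function name triage_metrics is hypothetical.

```python
def triage_metrics(actual, predicted):
    """Exact-match accuracy plus undertriage and overtriage rates.

    ESI runs from 1 (most urgent) to 5 (least urgent), so predicting a
    numerically higher level than the actual one is an undertriage.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    exact = sum(a == p for a, p in zip(actual, predicted))
    under = sum(p > a for a, p in zip(actual, predicted))
    over = sum(p < a for a, p in zip(actual, predicted))
    return {
        "accuracy": exact / n,
        "undertriage": under / n,
        "overtriage": over / n,
    }


# Toy labels for illustration only.
actual_esi = [2, 3, 3, 4, 1]
predicted_esi = [2, 2, 3, 3, 2]
metrics = triage_metrics(actual_esi, predicted_esi)
print(metrics)
```

If every case that is neither exact nor undertriaged counts as overtriaged, the reported figures (56 exact and 2 undertriaged out of 100) would imply 42 overtriaged cases; that number is an inference, not one stated by the authors.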

Patient safety is the priority because emergency triage errors have asymmetric costs: a false negative (underestimating risk) can be fatal, whereas a false positive (overestimating risk) may only result in additional tests. Therefore, the model is tuned to minimize adverse outcomes and to prioritize patient safety over resource efficiency. We accept increased utilization of precautionary checks to prevent the catastrophic failure of missing life-threatening conditions. This is why keeping a “human-in-the-loop” as the decision-maker is paramount, even as the tool makes the process more efficient.

To move from prototype to practice, it is essential to follow responsible AI principles to ensure that the tool will not inadvertently discriminate against any group. The reasoning output should be detailed enough to understand but simple enough to avoid cognitive overload. Clinical decisions are high stakes: Extensive testing across varied situations and edge cases is needed to ensure consistency. In production, the judge ensemble should regularly run on a stratified sample and route high-disagreement cases to clinical review.
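The production check described above could look roughly like this. The sketch assumes that cases are stratified by predicted ESI level and that each case already carries a judge disagreement score between 0 and 1; the field names, the 0.3 threshold and both function names are hypothetical choices for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(cases, per_stratum, seed=0):
    """Draw up to per_stratum cases from each predicted ESI level."""
    rng = random.Random(seed)  # fixed seed keeps the audit reproducible
    by_level = defaultdict(list)
    for case in cases:
        by_level[case["predicted_esi"]].append(case)
    sample = []
    for level in sorted(by_level):
        group = by_level[level]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


def route_for_review(sample, threshold=0.3):
    """Return the cases whose judge disagreement exceeds the threshold."""
    return [case for case in sample if case["disagreement"] > threshold]


# Synthetic cases for illustration only: four per ESI level.
cases = [
    {"id": i, "predicted_esi": i % 5 + 1, "disagreement": (i % 4) / 4}
    for i in range(20)
]
sample = stratified_sample(cases, per_stratum=2)
flagged = route_for_review(sample)
print(len(sample), [case["id"] for case in flagged])
```

Stratifying by severity level keeps rare but critical categories in every audit, which a simple random sample could miss.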

To sum up, this LLM-enabled scalable triage tool could save time and lives, especially under intense pressure; our approach supports a safer priority rule by surfacing hidden high-risk signals from longitudinal history in seconds.

Aaron Lai
Li-Lin Liang
