INFORMS Journal on Data Science

Articles In Advance

Received: August 21, 2025
Accepted: January 24, 2026
Published Online: March 05, 2026

Cite as

Congjing Zhang, Ryan Feng Lin, Xinyi Zhao, Pei Guo, Wei Li, Lin Chen, Chaoyue Zhao, Shuai Huang (2026) ALARM: Automated MLLM-Based Anomaly Detection in Complex-Environment Monitoring with Uncertainty Quantification. INFORMS Journal on Data Science 0(0).

https://doi.org/10.1287/ijds.2025.0107

Acknowledgments

The authors thank the editor, associate editor, and anonymous reviewers for invaluable comments on this research. C. Zhang and R. F. Lin contributed equally.
