OptiChat: Bridging Optimization Models and Practitioners with Large Language Models

Published Online: https://doi.org/10.1287/ijds.2025.0074

Supplemental Material

Software and Data: ijds.2025.0074.cd.zip


Description of Software and Data

The software and data in the zip file referenced above are a snapshot of the software and data that were used in the research reported in the paper "OptiChat: Bridging Optimization Models and Practitioners with Large Language Models" by Hao Chen, Gonzalo Esteban Constante Flores, Krishna Sri Ipsit Mantri, Sai Madhukiran Kompalli, Akshdeep Singh Ahluwalia, and Can Li. This repository is also available at https://github.com/li-group/OptiChat.

The goal of this repository is to replicate the numerical experiments in the paper.

Dependencies

The code in this repository requires the following dependencies. The dependency version number corresponds to the version of the package with which the code was tested.

Installation

  1. Install Python 3.10.16.
  2. Install the Python packages: pip install -r requirements.txt
  3. Install Gurobi by following the instructions at "How do I install Gurobi Optimizer?" On Windows without admin access, follow the instructions at "How do I install Gurobi without administrator credentials?"
  4. Apply for an OpenAI API key at https://platform.openai.com/docs/overview and add the key to your environment variables as OPENAI_API_KEY.
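
Before launching a long experiment, it can help to confirm that the installation steps above took effect. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, the Python check compares only the major and minor version, and it assumes that Gurobi's Python bindings are importable as gurobipy.

```python
import importlib.util
import os
import sys


def check_environment(environ=None, min_python=(3, 10)):
    """Return a list of setup problems; an empty list means none were found."""
    environ = os.environ if environ is None else environ
    problems = []

    # The installation steps name Python 3.10.16; only major.minor is compared.
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d or newer is expected" % min_python)

    # The OpenAI key is expected in the OPENAI_API_KEY environment variable.
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")

    # Assumption: Gurobi's Python bindings are importable as `gurobipy`.
    if importlib.util.find_spec("gurobipy") is None:
        problems.append("gurobipy is not importable")

    return problems
```

Calling check_environment() with no arguments inspects the current process environment and returns a list of messages to print. This check does not verify that a Gurobi license is valid or that the API key is accepted.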

Reproducibility Workflow

Each entry below names the result to reproduce, followed by the data files, the code file with its settings, the output, and the run time at the above-specified computer conditions.

Table 1 in the paper
  • Data File: Feas/, Infeas/
  • Code File: run_exp.py (interpreter_experiment=True)
  • Output: stats.json, including the time in Table 1
  • Run Time: 20 minutes (“gpt-4.1”)

Table 2, Diagnosing query
  • Data File: Infeas/, test_set/tool_testset/
  • Code File: run_exp.py (internal_experiment=True)
  • Output: stats.csv, including the accuracy and time of the diagnosing query in Table 2; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 40 minutes (“gpt-4.1” + “gpt-4o-mini” + “gpt-4o” + “o3”)

Table 2, Retrieval, Sensitivity, What-if query
  • Data File: Feas/, test_set/tool_testset/
  • Code File: run_exp.py (internal_experiment=True)
  • Output: stats.csv, including the accuracy and time of the retrieval, sensitivity, and what-if queries in Table 2; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 70 minutes (“gpt-4.1” + “gpt-4o-mini” + “gpt-4o” + “o3”)

Table 2, Why-not query
  • Data File: Feas/, test_set/code_testset/
  • Code File: run_exp.py (external_experiment=True)
  • Output: stats.csv, including the accuracy and time of the why-not query in Table 2; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 70 minutes (“gpt-4.1” + “gpt-4o-mini” + “gpt-4o” + “o3”)

Table 4, Diagnosing query, w/o Predefined Functions
  • Data File: Infeas/, test_set/tool_testset/
  • Code File: run_exp.py (internal_experiment=True; ablation=True)
  • Output: stats.csv, including accuracy and time; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 70 minutes (“gpt-4.1” + “o3”)

Table 4, Diagnosing query, w/o Syntax Reminders
  • Data File: Infeas/, test_set/tool_testset/
  • Code File: run_exp.py (internal_experiment=True; skip_syntax=True)
  • Output: stats.csv, including accuracy and time; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 20 minutes (“gpt-4.1” + “o3”)

Table 4, Diagnosing query, w/o Illustrator
  • Data File: Infeas/, test_set/tool_testset/
  • Code File: run_exp.py (internal_experiment=True; skip_description=True)
  • Output: stats.csv, including accuracy and time; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 40 minutes (“gpt-4.1” + “o3”)

Table 4, Retrieval, Sensitivity, What-if query, w/o Predefined Functions
  • Data File: Feas/, test_set/tool_testset/
  • Code File: run_exp.py (internal_experiment=True; ablation=True)
  • Output: stats.csv, including accuracy and time; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 90 minutes (“gpt-4.1” + “o3”)

Table 4, Retrieval, Sensitivity, What-if query, w/o Syntax Reminders
  • Data File: Feas/, test_set/tool_testset/
  • Code File: run_exp.py (internal_experiment=True; skip_syntax=True)
  • Output: stats.csv, including accuracy and time; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 40 minutes (“gpt-4.1” + “o3”)

Table 4, Retrieval, Sensitivity, What-if query, w/o Illustrator
  • Data File: Feas/, test_set/tool_testset/
  • Code File: run_exp.py (internal_experiment=True; skip_description=True)
  • Output: stats.csv, including accuracy and time; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 80 minutes (“gpt-4.1” + “o3”)

Table 4, Why-not query, w/o Illustrator
  • Data File: Feas/, test_set/code_testset/
  • Code File: run_exp.py (external_experiment=True; skip_description=True)
  • Output: stats.csv, including accuracy and time; stats_detail.pkl, storing the LLM-generated answers
  • Run Time: 80 minutes (“gpt-4.1” + “o3”)

 

Note

To reproduce the results reported in the main tables, the following parameters can be configured in run_exp.py to align with the specific experiment type and setting.

  • folder_name: set to “Infeas” for the Diagnosing query; set to “Feas” for the Retrieval, Sensitivity, What-if, and Why-not queries
  • interpreter_experiment: set to True for Table 1
  • internal_experiment: set to True for the Diagnosing, Retrieval, Sensitivity, and What-if queries in Tables 2 and 4
  • external_experiment: set to True for the Why-not query in Tables 2 and 4
  • ablation: set to True for the row “w/o Predefined Functions” in Table 4
  • skip_syntax: set to True for the row “w/o Syntax Reminders” in Table 4
  • skip_description: set to True for the row “w/o Illustrator” in Table 4
  • gpt_model: set to “gpt-4.1”, “gpt-4o-mini”, “gpt-4o”, or “o3” to replicate the results specific to an LLM in Tables 2 and 4
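
The parameter list above can be summarized as a lookup from query type and ablation to the values to set in run_exp.py. The sketch below is illustrative only: the helper settings_for and the dictionaries are hypothetical and do not exist in the repository, and the default of False for every switch is an assumption. Table 1 is left out because its entry uses both data folders.

```python
# Assumed defaults: every experiment switch is off until selected.
DEFAULTS = {
    "folder_name": "Feas",
    "interpreter_experiment": False,
    "internal_experiment": False,
    "external_experiment": False,
    "ablation": False,
    "skip_syntax": False,
    "skip_description": False,
    "gpt_model": "gpt-4.1",
}

# Query type -> folder and experiment switch, as described in the Note.
QUERY_SETTINGS = {
    "diagnosing": {"folder_name": "Infeas", "internal_experiment": True},
    "retrieval": {"folder_name": "Feas", "internal_experiment": True},
    "sensitivity": {"folder_name": "Feas", "internal_experiment": True},
    "what-if": {"folder_name": "Feas", "internal_experiment": True},
    "why-not": {"folder_name": "Feas", "external_experiment": True},
}

# Table 4 row -> ablation switch; None means the full (Table 2) setting.
ABLATION_SETTINGS = {
    None: {},
    "predefined_functions": {"ablation": True},
    "syntax_reminders": {"skip_syntax": True},
    "illustrator": {"skip_description": True},
}


def settings_for(query, gpt_model="gpt-4.1", without=None):
    """Return the run_exp.py parameter values for one experiment setting."""
    settings = dict(DEFAULTS)
    settings.update(QUERY_SETTINGS[query])
    settings.update(ABLATION_SETTINGS[without])
    settings["gpt_model"] = gpt_model
    return settings
```

For example, settings_for("diagnosing", gpt_model="o3", without="syntax_reminders") returns the values for the Table 4 row “w/o Syntax Reminders” on the Diagnosing query. The helper does not check whether a combination appears in the paper; for the Why-not query, the workflow above lists only the “w/o Illustrator” ablation.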

The stats_detail.pkl files that store LLM-generated answers are used to analyze the error distribution in Table 3.
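
The internal layout of stats_detail.pkl is not documented here, so a first step in analyzing it is to load the file and inspect its top-level container. The following is a minimal sketch under that assumption: load_details and summarize are hypothetical helper names, and nothing is assumed about the schema. Because unpickling can execute arbitrary code, load only files that you produced yourself or otherwise trust.

```python
import pickle
from pathlib import Path


def load_details(path):
    """Unpickle a results file. Use only on files you produced or trust."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def summarize(obj):
    """Describe the top-level container without assuming its schema."""
    if isinstance(obj, dict):
        # Show at most the first five keys to keep the summary short.
        return {"type": "dict", "size": len(obj), "keys": list(obj)[:5]}
    if isinstance(obj, (list, tuple)):
        return {"type": type(obj).__name__, "size": len(obj)}
    return {"type": type(obj).__name__}
```

A call such as summarize(load_details("stats_detail.pkl")) reports the container type and size, which indicates how to iterate over the stored answers. If the file holds objects from third-party packages, those packages must be installed for unpickling to succeed.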

Ongoing Development

This code is being developed on an ongoing basis at the author-maintained OptiChat package. In particular, the source code in this repository corresponds to v0.1.

Cite

Article: https://doi.org/10.1287/ijds.2025.0074
Software and Data Repository: https://doi.org/10.1287/ijds.2025.0074.cd

License

Copyright (c) (2025 Chen, Constante Flores, Mantri, Kompalli, Ahluwalia, Li)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
