ASU-Mayo Center for Innovative Imaging, Arizona State University, Phoenix, Arizona 85281; and Department of Neurology, Mayo Clinic, Phoenix, Arizona 85259

Todd J. Schwedt

http://orcid.org/0000-0002-7780-7086

ASU-Mayo Center for Innovative Imaging, Arizona State University, Phoenix, Arizona 85281; and Department of Neurology, Mayo Clinic, Phoenix, Arizona 85259

Jing Li (Corresponding Author)

https://orcid.org/0000-0001-7028-3681

H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332

Published Online: 7 Jan 2026

https://doi.org/10.1287/ijds.2024.0059

Supplemental Material

Software and Data: ijds.2024.0059.sm1.zip

Description of Software and Data

The code and data in the zip file referenced above are a snapshot of the software and data that were used in the research reported in the paper "Supervised Multimodal Fission Learning" by Lingchao Mao, Qi Wang, Yi Su, Fleming Lure, Catherine D Chong, Todd J Schwedt, and Jing Li. This repository is also available via Zenodo, GitHub, or others.

The goal of this repository is to replicate the numerical experiments in the paper.

Computer and Software Environment

The following describes the computer hardware conditions and software environment on which the authors produced the results reported in the paper.

Python 3.8+
R 3.6+ (optional, for JIVE and SLIDE R package integration)

Dependencies

NumPy 1.21.0
pandas 1.3.0
SciPy 1.7.0
scikit-learn 1.0.0
imbalanced-learn 0.8.0
Matplotlib 3.4.0
NetworkX 2.6.0
tqdm 4.62.0
rpy2 3.4.0 (optional - requires R installation)
jupyter 1.0.0
ipykernel 6.0.0
json5 0.9.0
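
Before running the notebooks, it can help to confirm that the installed packages match the list above. The following is a minimal sketch, not part of the repository: it assumes the listed versions are minimums (the list does not say whether they are minimums or exact pins), and the names `REQUIRED`, `version_tuple`, and `check_dependencies` are hypothetical helpers introduced here for illustration.

```python
import re
from importlib import metadata  # standard library in Python 3.8+

# Versions copied from the Dependencies list; treated here as minimums (an assumption).
REQUIRED = {
    "numpy": "1.21.0",
    "pandas": "1.3.0",
    "scipy": "1.7.0",
    "scikit-learn": "1.0.0",
    "imbalanced-learn": "0.8.0",
    "matplotlib": "3.4.0",
    "networkx": "2.6.0",
    "tqdm": "4.62.0",
    "jupyter": "1.0.0",
    "ipykernel": "6.0.0",
    "json5": "0.9.0",
}


def version_tuple(version):
    """Turn '1.21.0' into (1, 21, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '3.4' to (3, 4, 0)
        parts.append(0)
    return tuple(parts)


def check_dependencies(required=REQUIRED):
    """Return {package: status}, where status is 'ok', 'missing', or 'too old (...)'."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if version_tuple(installed) >= version_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old (found {installed})"
    return report
```

Calling `check_dependencies()` returns a dictionary that can be printed to spot missing or outdated packages. The optional rpy2 entry is left out of `REQUIRED` because it is only needed for the R baselines.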

Installation

Step 1. Clone the repository:
git clone https://github.com/yourusername/MMFL.git
cd MMFL
Step 2. Install Python dependencies:
pip install -r requirements.txt
Step 3. (Optional) Install R packages for baseline comparisons with JIVE/SJIVE/SLIDE:
# In R console or R command line
install.packages("devtools")
devtools::install_github("irinagain/SLIDE")
devtools::install_github("lockEF/r.jive")

File Structure


MMFL/
├── models/                    # Core model implementations
│   ├── MMFL.py               # Main MMFL algorithm with rank selection
│   ├── MADDi.py              # Multi-modal Attention-based Deep Learning
│   ├── IMLS.py               # Incomplete Multi-modality Latent Space 
│   └── stagewise.py          # Stagewise deep learning models
├── utils/                     # Utility functions
│   ├── train.py              # Model training and evaluation functions
│   ├── metrics.py            # Evaluation metrics (AUC, accuracy, etc.)
│   ├── prepare_dataset.py    # Data preprocessing and splitting
│   ├── generate_simulation.py # Synthetic data generation
│   ├── visualization.py      # Plotting utilities
│   ├── oversampling.py       # SMOTE oversampling for imbalanced data
│   ├── rank_selection.py     # Rank selection utilities
│   └── compare_auc_delong_xu.py # Statistical comparison methods
├── experiments/               # Experimental notebooks
│   ├── case_study_adni.ipynb # ADNI dataset experiments
│   ├── case_study_headache.ipynb # Headache dataset experiments
│   └── simulation_study.ipynb # Simulation studies
├── preprocessing/             # Data preprocessing scripts
├── data/                      # Data files
│   ├── ADNI_dataset.csv      # ADNI dataset
│   ├── ADNI_SNP_fisher_nature_p0.0005.csv # SNP data
│   └── headache_*.csv        # Headache study data
└── results/                   # Experimental results

Reproducibility Workflow

To reproduce the results in Tables 2, 3, and 4 in Section 4

Data File: None required; the simulation data are generated by the code
Code File: experiments/simulation_study.ipynb
Output: The values for Tables 2, 3, and 4 are printed in the notebook and exported to results/simA_*.json files
Run Time at the Above-Specified Computer Conditions: 15 minutes
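
The exported files can be inspected outside the notebook. The sketch below is illustrative only: the internal layout of the simA_*.json files is not documented here, so it simply loads each file with the standard `json` module and leaves interpretation to the reader. The function name `load_results` and its default arguments are assumptions, not part of the repository.

```python
import json
from pathlib import Path


def load_results(results_dir="results", pattern="simA_*.json"):
    """Load every JSON file in results_dir that matches pattern, keyed by file name."""
    loaded = {}
    for path in sorted(Path(results_dir).glob(pattern)):
        with open(path, "r", encoding="utf-8") as handle:
            loaded[path.name] = json.load(handle)
    return loaded
```

For example, `load_results()` run from the repository root returns one entry per simulation output file, and `load_results(pattern="case_*.json")` does the same for the case study outputs.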

To reproduce the results in Table 5 and Figure 1

Data File: data/ADNI_dataset.csv
Code File: experiments/case_study_adni.ipynb
Output: The values for Table 5 are printed in the notebook and exported to results/case_*.json files
Run Time at the Above-Specified Computer Conditions: 25 minutes

To reproduce the results in Table 6

Data Files:
data/headache_metadata.csv
data/headache_questionnaire.csv
data/headache_mri.csv
data/headache_t2star.csv
Code File: experiments/case_study_headache.ipynb
Output: The values for Table 6 are printed in the notebook and exported to results/case2_mri-t2star-questionnaire_wcovariates_*.json files
Run Time at the Above-Specified Computer Conditions: 25 seconds
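
The three notebooks can also be executed without opening the Jupyter interface. The sketch below is one possible way to do this, assuming `jupyter nbconvert` is available on the PATH (it is normally installed with the jupyter package listed under Dependencies) and that the command is run from the repository root. The helper names `nbconvert_command` and `run_notebooks` are hypothetical.

```python
import subprocess

# Notebook paths taken from the Reproducibility Workflow section.
NOTEBOOKS = [
    "experiments/simulation_study.ipynb",
    "experiments/case_study_adni.ipynb",
    "experiments/case_study_headache.ipynb",
]


def nbconvert_command(notebook_path):
    """Build the nbconvert command that executes one notebook and saves it in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        # Disable the per-cell timeout, since some notebooks run for many minutes.
        "--ExecutePreprocessor.timeout=-1",
        str(notebook_path),
    ]


def run_notebooks(notebooks=NOTEBOOKS):
    """Execute each notebook in order, stopping at the first failure."""
    for notebook in notebooks:
        subprocess.run(nbconvert_command(notebook), check=True)
```

Calling `run_notebooks()` executes all three notebooks in sequence; passing a shorter list runs only the selected ones.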

Note

All data files are in the data folder. Running the simulation code overwrites previously saved simulation results. The code saves figures in the "results" folder. The data used to produce our results are also stored in the data_backup folder, so they are preserved if files in the data folder are overwritten when the simulation code runs.
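
If files in the data folder are overwritten, they can be copied back from data_backup. The following is a minimal sketch under the assumption that data_backup holds the files in a flat layout mirroring data; the function `restore_data` is a hypothetical helper, not part of the repository.

```python
import shutil
from pathlib import Path


def restore_data(backup_dir="data_backup", data_dir="data", overwrite=False):
    """Copy files from backup_dir into data_dir and return the names that were copied.

    Existing files are left untouched unless overwrite=True.
    """
    backup = Path(backup_dir)
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(backup.iterdir()):
        if not source.is_file():
            continue  # skip subfolders; a flat layout is assumed
        destination = target / source.name
        if destination.exists() and not overwrite:
            continue
        shutil.copy2(source, destination)
        copied.append(source.name)
    return copied
```

With the default `overwrite=False`, only missing files are restored; `restore_data(overwrite=True)` replaces every file in data that has a backup copy.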

Ongoing Development

A Python package named distclust has been developed that can be used to perform agglomerative clustering on empirical multivariate distributions and to perform further analysis. More information regarding this package can be found on GitHub.

Cite

To cite the contents of this repository, please cite both the paper and this repository using their respective DOIs.

Article: https://doi.org/10.1287/ijds.2024.0059
Software and Data Repository: https://doi.org/10.1287/ijds.2024.0059.cd

License

Copyright (c) 2025 Lingchao Mao, Qi Wang, Yi Su, Fleming Lure, Catherine D. Chong, Todd J. Schwedt, Jing Li

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

INFORMS Journal on Data Science, Articles in Advance

Information

Received: November 25, 2024
Accepted: November 23, 2025
Published Online: January 07, 2026

Cite as

Lingchao Mao, Qi Wang, Yi Su, Fleming Lure, Catherine D. Chong, Todd J. Schwedt, Jing Li (2026) Supervised Multimodal Fission Learning. INFORMS Journal on Data Science 0(0).

https://doi.org/10.1287/ijds.2024.0059
