Data and Code Disclosure Policy

Updated January 1, 2025

Download Data and Code Disclosure Form
Download IJDS Reproducibility Report Template

A fundamental principle of the scientific method is replication: the validity of a research finding requires that it can be reproduced by other researchers. The intent of the Data and Code Disclosure policy is to ensure the availability of the material necessary to replicate the research published in the journal.

A secondary benefit of this policy is to advance research in the fields covered by the journal. The sharing of data and code will be of value to the relevant research community, allowing its members to build on this prior work in their own pursuits. This sharing should increase the rate of scientific progress and impact.

General Policy

Authors of papers containing numerical or computational work that relies on software and/or data are required, at the time of a paper’s acceptance, to upload the data and code used in the paper for archiving on a Supplemental Page on the IJDS publication website and to complete the reproducibility workflow on that Supplemental Page. At the time of submission, authors need only complete the Data and Code Disclosure Form to acknowledge and agree to the journal's reproducibility requirement for accepted papers.

When the datasets used cannot be fully disclosed, the following guidelines are intended to communicate the expectations of the policy and to help authors develop their reproducibility plan.

Guidelines

  1. When the research relies upon licensed data from sources such as the Census Bureau, Compustat, CRSP, Factset, and WRDS, the authors should provide detailed instructions along with their own code for accessing and linking to the licensed data, sufficient for replication by others. The authors must provide a description of how previous intermediate data sets and programs were employed to create the final data set(s), if relevant.
  2. When the research relies on proprietary data covered by a Non-Disclosure Agreement, or sensitive human-subject data, or unique data sets that required an extensive time or monetary investment to compile, the authors should propose an alternative disclosure plan that is in keeping with the spirit of replicability while respecting the specific situation faced by the authors. For instance, the authors might propose to:
    1. Disguise the data in such a way that protects sensitive information yet allows for replication of the main results. For instance, add noise or apply multipliers to the variables. See Acimovic et al. (2019) for an example where SKU weekly demand is normalized such that total demand during the life cycle of a product is equal to 1; quintile bucket info is provided for each SKU to indicate fast and slow selling products. When normalizing the data, the authors provide limited precision (limited decimal places) so one cannot reconstruct the original demand values.
    2. Provide all necessary statistics to populate your model so that others can replicate the study. See Shi et al. (2016) for an example where the authors could not make the original dataset public due to a non-disclosure agreement with the collaborating hospital. Instead, they provided in the paper all necessary statistics to populate their model (including both summary statistics and distributional statistics). For instance, see Figures 3 and 7, Tables 1 and 3 for the daily/hourly patient arrival rates, the number of beds in each ward, as well as the distribution of patient length-of-stay.
    3. Post a randomly drawn subset of the paper’s data set that could be used to replicate the paper’s results, albeit with the expectation of larger standard errors.
    4. Post a synthetic data set that the authors generate so as to be representative of the actual data, at least for the purposes of replication. In this case the authors need to provide some evidence that the synthetic data is a valid surrogate for the actual data. If the authors propose to share a transformed data set, the authors should disclose to the editor the details of the process or method for creating this transformed data set.
    5. Delay the sharing of data or code, so as to have more time to harvest their investment from building the database or algorithm. As a general guideline, a delay from publication of one year for code and data would seem an acceptable balance of the competing interests of the authors and the research community. Please note, however, that a delay in releasing data/code will cause a delay in publishing the associated paper: the paper will not be published until the data/code is released.
    6. Nevertheless, in some cases, none of these options may be workable. For instance, in healthcare-related research, the sharing of patient-level data in any form may be a non-starter. And creating a synthetic database may not be meaningful and/or may be an extraordinary burden. In these cases, the authors should provide sufficient details on the data set so that other researchers could readily generate their own data set comparable to that used in the research. This would necessarily include a data dictionary that contains a description of all variables used in the paper, so that other researchers can reconstruct these variables from their own data. See Gallino and Moreno (2014) where the authors provide guidelines to help others replicate the analysis in their paper.
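As an illustration of options 1 and 3 in the list above, the following is a minimal sketch of how a demand data set might be disguised by per-SKU normalization with limited precision, and how a random subset might be drawn reproducibly. It is not the procedure used in any of the cited papers. The function names, the (sku, week, demand) row layout, and the default of three decimal places are assumptions made purely for illustration; authors should choose a transformation suited to their own data and describe it to the editor.

```python
import random
from collections import defaultdict


def normalize_demand(rows, decimals=3):
    """Disguise demand by rescaling each SKU's series to sum to 1.

    rows is a list of (sku, week, demand) tuples (a hypothetical layout).
    Each demand value is replaced by its share of that SKU's total demand,
    rounded to a limited number of decimal places.
    """
    totals = defaultdict(float)
    for sku, _week, demand in rows:
        totals[sku] += demand

    disguised = []
    for sku, week, demand in rows:
        total = totals[sku]
        # Guard against SKUs with zero total demand.
        share = demand / total if total > 0 else 0.0
        disguised.append((sku, week, round(share, decimals)))
    return disguised


def random_subset(rows, fraction, seed):
    """Draw a reproducible random subset containing about `fraction` of rows.

    A fixed seed lets others regenerate exactly the same subset.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    # Keep at least one row when the input is non-empty.
    k = min(len(rows), max(1, round(len(rows) * fraction)))
    return rng.sample(rows, k)


if __name__ == "__main__":
    raw = [("A", 1, 30), ("A", 2, 70), ("B", 1, 5), ("B", 2, 15)]
    print(normalize_demand(raw))
    print(random_subset(raw, fraction=0.5, seed=2025))
```

Recording the seed and the sampling fraction alongside the posted subset allows readers to confirm how the subset was drawn. Rounding alone is not a guarantee of confidentiality, so the adequacy of any disguise should be assessed against the sensitivity of the underlying data.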

Whether the authors’ proposed reproducibility plan is acceptable remains at the discretion of the Editor-in-Chief (EIC). Exceptions are possible but only when there are strong reasons preventing data/code from being disclosed (e.g., a legally binding NDA in force). When considering the authors’ plan, the EIC needs to weigh carefully the pros and cons of processing a paper with potentially important or impactful research contributions that might not be readily reproducible. This consideration may well entail a tradeoff between the benefits of enforcing the data disclosure policy and the cost of blocking the publication of an important paper.

Anyone interested in using the shared data or code for their own research needs to verify the permission license associated with the shared data or code. If permission for free reuse of the data/code is not explicitly given in the license, users are required to secure permission directly from the authors. Anyone downloading the file(s) and/or the code for the purpose of verifying the replicability/reproducibility of the paper’s main results does not need any extra permission. Any reuse of the data/code, other than for the purpose of verifying the replicability/reproducibility of the paper’s main results, must cite the paper and acknowledge the source of the data/code.

Acknowledgement

This policy is based on the Data and Code Disclosure policy for Management Science (/page/mnsc/datapolicy), which in turn relied extensively on existing policies for data and/or code sharing, in particular the Data Availability Policy of the American Economic Association (https://www.aeaweb.org/journals/policies/data-availability-policy); the Journal of Finance Code Sharing Policy (https://www.afajof.org/resource/resmgr/files/Submission_docs/CodePolicy.pdf); and the Marketing Science Replication and Disclosure Policy (/doi/pdf/10.1287/mksc.1120.0761).

References

Acimovic J., Erize F., Hu K., Thomas D.J., and Van Mieghem J.A. (2019) Product Life Cycle Data Set: Raw and Cleaned Data of Weekly Orders for Personal Computers. Manufacturing & Service Operations Management 21(1):171-176, https://doi.org/10.1287/msom.2017.0692

Gallino S. and Moreno A. (2014) Integration of Online and Offline Channels in Retail: The Impact of Sharing Reliable Inventory Availability Information. Management Science 60(6):1434-1451, https://doi.org/10.1287/mnsc.2014.1951

Shi P., Chou M.C., Dai J.G., Ding D., and Sim J. (2016) Models and Insights for Hospital Inpatient Operations: Time-Dependent ED Boarding Time. Management Science 62(1):1-28, https://doi.org/10.1287/mnsc.2014.2112

