Generating Experimental Data for Computational Testing with Machine Scheduling Applications
The operations research literature provides little guidance on how data should be generated for the computational testing of algorithms or heuristic procedures. We discuss several widely used data generation schemes and demonstrate that they may introduce biases into computational results. Moreover, such schemes are often not representative of the way data arises in practical situations. We address these deficiencies by describing several principles for data generation and several properties that are desirable in a generation scheme. These principles and properties enable us to provide specific proposals for generating data for a variety of machine scheduling problems. We present a generation scheme for precedence constraints that achieves a target density that is uniform throughout the precedence constraint graph. We also present a generation scheme that explicitly considers the correlation of routings in a job shop. We identify several related issues that may influence the design of a data generation scheme. Finally, two case studies illustrate, for specific scheduling problems, how our proposals can be implemented to design a data generation scheme.
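As an illustration of the kind of bias that can arise with precedence constraints, the following Python sketch implements a naive scheme. The scheme is an assumption chosen for illustration and is not the one proposed in this paper. Each arc (i, j) with i < j is included independently with the same probability p, and the sketch then measures the density of the resulting transitive closure. It assumes density means the fraction of job pairs related by precedence, and all function names are hypothetical.

```python
import random
from itertools import combinations


def random_precedence_arcs(n, p, rng):
    """Naive scheme (illustrative assumption): include each arc (i, j), i < j,
    independently with probability p."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def transitive_closure(n, arcs):
    """Return every pair (i, j) such that job i precedes job j,
    either directly or through a chain of arcs."""
    succ = [set() for _ in range(n)]
    for i, j in arcs:
        succ[i].add(j)
    # Every arc points from a lower to a higher index, so when jobs are
    # processed in decreasing index order, succ[j] is already complete
    # by the time it is merged into succ[i].
    for i in range(n - 1, -1, -1):
        for j in list(succ[i]):
            succ[i] |= succ[j]
    return {(i, j) for i in range(n) for j in succ[i]}


def density(n, pairs):
    """Fraction of the n(n-1)/2 job pairs that appear in `pairs`."""
    return len(pairs) / (n * (n - 1) / 2)


def density_by_gap(n, pairs):
    """Fraction of pairs (i, i + k) in `pairs`, for each index gap k."""
    return {
        k: sum((i, i + k) in pairs for i in range(n - k)) / (n - k)
        for k in range(1, n)
    }


rng = random.Random(0)  # fixed seed so the experiment is repeatable
n, p = 30, 0.2
arcs = random_precedence_arcs(n, p, rng)
closure = transitive_closure(n, arcs)
print("density of generated arcs:    ", density(n, arcs))
print("density of transitive closure:", density(n, closure))
print("closure density by index gap: ", density_by_gap(n, closure))
```

Because the closure always contains the generated arcs, its density can never be lower than that of the arcs themselves, and the per-gap breakdown lets one check whether the realized density varies with the distance between job indices. A scheme that targets a uniform density would need to correct for both effects.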