Multivariate Stratified Sampling by Optimization

Published Online: https://doi.org/10.1287/mnsc.29.6.715

An important, recurring problem in statistics is the determination of strata boundaries for use in stratified sampling. This paper describes a practical method for stratifying a population of observations based on optimal cluster analysis. The goal of stratification is to construct a partition such that observations within a stratum are homogeneous, as measured by the within-cluster variances of the attributes deemed important, while observations in different strata are heterogeneous. The problem is formulated as a deterministic optimization model with integer variables and is solved by means of a subgradient method. Computational tests with several examples show that the within-strata variances, and thus the accompanying standard errors, can be substantially reduced. Since the proposed model strives to minimize standard error, it is applicable to situations where a precise sample is essential, for example, microeconomic simulation studies.
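To make the objective concrete, the following is a minimal sketch of what "homogeneous within strata" means numerically. It is not the paper's integer model or its subgradient method: it swaps in Lloyd's k-means heuristic purely for illustration, and the data, the function names (`within_ss`, `lloyd_stratify`), and the choice of three strata are all assumptions made here.

```python
from statistics import fmean


def centroid(rows):
    """Attribute-wise mean of a non-empty list of observations."""
    return [fmean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two observations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def within_ss(points, labels, k):
    """Sum of squared deviations of each observation from its stratum centroid."""
    total = 0.0
    for h in range(k):
        members = [p for p, lab in zip(points, labels) if lab == h]
        if members:
            c = centroid(members)
            total += sum(sq_dist(p, c) for p in members)
    return total


def lloyd_stratify(points, k, iters=50):
    """Heuristic partition into k strata (Lloyd's k-means, a stand-in
    for the optimization method described in the abstract)."""
    # Deterministic start: evenly spaced observations after sorting.
    ordered = sorted(points)
    step = len(ordered) / k
    centers = [list(ordered[int(step * h + step / 2)]) for h in range(k)]
    labels = None
    for _ in range(iters):
        # Assign each observation to its nearest stratum center.
        new_labels = [
            min(range(k), key=lambda h: sq_dist(p, centers[h])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each center as the centroid of its current members.
        for h in range(k):
            members = [p for p, lab in zip(points, labels) if lab == h]
            if members:
                centers[h] = centroid(members)
    return labels


# Hypothetical population with two attributes per observation.
population = [
    (1.0, 2.0), (1.2, 1.8), (0.9, 2.1),
    (5.0, 6.0), (5.2, 5.9), (4.8, 6.2),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.8),
]

labels = lloyd_stratify(population, 3)
unstratified = within_ss(population, [0] * len(population), 1)
stratified = within_ss(population, labels, 3)
print(labels)
print(unstratified, stratified)
```

Comparing `stratified` with `unstratified` shows how much of the total squared deviation a partition removes from within the strata, which is the quantity that drives the standard error of a stratified estimate. In practice the attributes would usually be rescaled first so that no single attribute dominates the distance.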
