Privacy Protection in Data Mining: A Perturbation Approach for Categorical Data

Published Online: https://doi.org/10.1287/isre.1060.0095

References

  • Adam N. R., Wortmann J. C. Security-control methods for statistical databases: A comparative study. ACM Comput. Surveys (1989) 21(4):515–556
  • Agrawal R., Srikant R. Privacy-preserving data mining. Proc. 2000 ACM SIGMOD Internat. Conf. Management of Data (2000) (ACM Press, New York) 439–450
  • Atallah M., Bertino E., Elmagarmid A., Ibrahim M., Verykios V. Disclosure limitation of sensitive rules. Proc. IEEE Knowledge and Data Engineering Exchange Workshop (KEDX'99) (1999) (IEEE Computer Science Society, Washington, D.C.) 45–52
  • Blake C., Keogh E., Merz C. J. UCI repository of machine learning databases. (1998). Department of Information and Computer Science, University of California, Irvine, CA. http://www.ics.uci.edu/~mlearn/MLRepository.html
  • Brand R., Giessing S. Report on preparation of the data set and improvements on Sullivan's algorithm. (2002). http://neon.vb.cbs.nl/casc/
  • Chowdhury D. S., Duncan G. T., Krishnan R., Roehrig S. F., Mukherjee S. Disclosure detection in multivariate categorical databases: Auditing confidentiality protection through two new matrix operators. Management Sci. (1999) 45(12):1710–1723
  • Cox L. H. Network models for complementary cell suppression. J. Amer. Statist. Assoc. (1995) 90(432):1453–1462
  • Culnan M. “How did they get my name?”: An exploratory investigation of consumer attitudes toward secondary information use. MIS Quart. (1993) 17(3):341–363
  • Dalenius T., Reiss S. P. Data swapping: A technique for disclosure control. J. Statist. Planning Inference (1982) 6(1):73–85
  • Denning D. E., Schlörer J. Inference control for statistical databases. Computer (1983) 16(7):69–82
  • Dinur I., Nissim K. Revealing information while preserving privacy. Proc. 22nd ACM SIGMOD-SIGACT-SIGART Sympos. Principles Database Systems (2003) (ACM Press, New York) 202–210
  • Duda R. O., Hart P. E., Stork D. G. Pattern Classification (2001) (John Wiley & Sons, New York)
  • Duncan G. T., Lambert D. The risk of disclosure for microdata. J. Bus. Econom. Statist. (1989) 7(2):201–217
  • Duncan G. T., Mukherjee S. Optimal disclosure limitation strategy in statistical databases: Deterring tracker attacks through additive noise. J. Amer. Statist. Assoc. (2000) 95(451):720–729
  • Estivill-Castro V., Brankovic L., Mukesh M., Tjoa A. M. Data swapping: Balancing privacy against precision in mining for logic rules. Data Warehousing and Knowledge Discovery (DaWak'99) (1999) (Springer-Verlag, Berlin, Germany) 389–398
  • Evfimievski A., Srikant R., Agrawal R., Gehrke J. Privacy preserving mining of association rules. Proc. 8th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (2002) (ACM Press, New York) 217–228
  • Fienberg S. E., McIntyre J., Domingo-Ferrer J., Torra V. Data swapping: Variations on a theme by Dalenius and Reiss. Privacy in Statistical Databases (2004) (Springer-Verlag, Berlin, Germany) 14–29
  • Fienberg S. E., Makov U. E., Steele R. J. Disclosure limitation using perturbation and related methods for categorical data. J. Official Statist. (1998) 14(4):485–502
  • Galletta D. MIS faculty salary survey results. (2004). http://www.pitt.edu/~galletta/salsurv.html
  • Garfinkel R., Gopal R., Goes P. Privacy protection of binary confidential data against deterministic, stochastic, and insider threat. Management Sci. (2002) 48(6):749–764
  • Greengard S. Privacy: Entitlement or illusion? Personnel J. (1996) 75(5):74–88
  • Gouweleeuw J. M., Kooiman P., Willenborg L. C. R. J., de Wolf P.-P. Post randomization for statistical disclosure control: Theory and implementation. J. Official Statist. (1998) 14(4):463–478
  • Iyengar V. S. Transforming data to satisfy privacy constraints. Proc. 8th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (2002) (ACM Press, New York) 279–288
  • Kullback S. Information Theory and Statistics (1959) (John Wiley & Sons, New York)
  • Lerman C., Molyneaux J. W., Iswarati S. Pangemanan. The determinants of contraceptive method and service point choice. Secondary Analysis of the 1987 National Indonesia Contraceptive Prevalence Survey (1991) I (East-West Population Institute, Jakarta, Indonesia). Fertility and Family Planning
  • Li X.-B., Jacob V. S. Adaptive data reduction for large-scale transaction data. (2005). Working paper, School of Management, University of Texas at Dallas, Richardson, TX
  • Li X.-B., Sarkar S. A data perturbation approach to privacy protection in data mining. (2005). Working paper, School of Management, University of Texas at Dallas, Richardson, TX
  • Liew C. K., Choi U. J., Liew C. J. A data distortion by probability distribution. ACM Trans. Database Systems (1985) 10(3):395–411
  • Muralidhar K., Parsa R., Sarathy R. A general additive data perturbation method for database security. Management Sci. (1999) 45(10):1399–1415
  • Quinlan J. R. C4.5: Programs for Machine Learning (1993) (Morgan Kaufmann, San Mateo, CA)
  • Reiss S. P. Practical data-swapping: The first steps. ACM Trans. Database Systems (1984) 9(1):20–37
  • Rotenberg M. Protecting privacy. Comm. ACM (1992) 35(4):164
  • Sarathy R., Muralidhar K. The security of confidential numerical data in databases. Inform. Systems Res. (2002) 13(4):389–403
  • Schlörer J. Security of statistical databases: Multidimensional transformation. ACM Trans. Database Systems (1981) 6(1):95–112
  • Stanford Student Computer and Network Privacy Project. A study of student privacy issues at Stanford University. Comm. ACM (2002) 45(3):23–25
  • Sullivan G., Fuller W. A. Construction of masking error for categorical variables. Proc. Section Survey Res. Methods (1990) (American Statistical Association, Alexandria, VA) 435–439
  • Verykios V. S., Elmagarmid A. K., Bertino E., Saygin Y., Dasseni E. Association rule hiding. IEEE Trans. Knowledge Data Engrg. (2004) 16(4):434–447
  • Wang H., Lee M. K. O., Wang C. Consumer privacy concerns about Internet marketing. Comm. ACM (1998) 41(3):63–70
  • Witten I. H., Frank E. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations (2000) (Morgan Kaufmann, San Francisco, CA)