Positive-versus-Negative Classification for Model Aggregation in Predictive Data Mining

Patricia E. N. Lutu
Department of Computer Science, University of Pretoria, Pretoria 0002, South Africa

Andries P. Engelbrecht
Department of Computer Science, University of Pretoria, Pretoria 0002, South Africa

Published Online: 17 Jan 2013, https://doi.org/10.1287/ijoc.1120.0540

INFORMS Journal on Computing, Volume 25, Issue 4, Fall 2013, Pages 599–822

Received: December 01, 2010
Accepted: June 01, 2012
Published Online: January 17, 2013

Cite as

Patricia E. N. Lutu, Andries P. Engelbrecht (2013) Positive-versus-Negative Classification for Model Aggregation in Predictive Data Mining. INFORMS Journal on Computing 25(4):792-807.

https://doi.org/10.1287/ijoc.1120.0540
