Effective Active Learning Strategies for the Use of Large-Margin Classifiers in Semantic Annotation: An Optimal Parameter Discovery Perspective

Published Online: https://doi.org/10.1287/ijoc.2013.0578
