Consolidation of Subtasks for Target Task in Pipelined NLP Model

  • Son, Jeong-Woo (Broadcasting & Telecommunications Media Research Laboratory, ETRI; School of Computer Science, Kyungpook National University)
  • Yoon, Heegeun (College of IT Engineering, Kyungpook National University)
  • Park, Seong-Bae (College of IT Engineering, Kyungpook National University)
  • Cho, Keeseong (Broadcasting & Telecommunications Media Research Laboratory, ETRI)
  • Ryu, Won (Broadcasting & Telecommunications Media Research Laboratory, ETRI)
  • Received: 2014.01.29
  • Accepted: 2014.09.15
  • Published: 2014.10.01

Abstract

Most natural language processing tasks depend on the outputs of other tasks and thus involve those tasks as subtasks. The main problem with this type of pipelined model is that subtasks trained on their own data are not guaranteed to be optimal for the final target task, since they are not optimized with respect to it. As a solution to this problem, this paper proposes a consolidation of subtasks for a target task ($CST^2$). In $CST^2$, all parameters of a target task and its subtasks are optimized to fulfill the objective of the target task, and $CST^2$ finds these parameters through a backpropagation algorithm. In experiments in which text chunking is the target task and part-of-speech tagging is its subtask, $CST^2$ outperforms a traditional pipelined text chunker. The experimental results demonstrate the effectiveness of optimizing subtasks with respect to the target task.
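
The idea of optimizing subtask parameters against the target objective can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the model, so the sketch assumes two softmax layers, a subtask layer standing in for a part-of-speech tagger and a target layer standing in for a chunker, trained on synthetic data. All names (`W_sub`, `W_tgt`, `train_step`) and the toy data are hypothetical. The point it shows is that the gradient of the target loss is backpropagated through the target layer into the subtask parameters, so both are updated for the target objective.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with max-subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, W_sub, W_tgt):
    p_sub = softmax(X @ W_sub)      # subtask output (assumed: soft tag distribution)
    p_tgt = softmax(p_sub @ W_tgt)  # target output, computed from the subtask output
    return p_sub, p_tgt

def target_loss(p_tgt, y):
    # Cross-entropy of the target task only; no subtask labels are used here.
    return -np.mean(np.log(p_tgt[np.arange(len(y)), y] + 1e-12))

def train_step(X, y, W_sub, W_tgt, lr=0.2):
    n = len(y)
    p_sub, p_tgt = forward(X, W_sub, W_tgt)
    # Gradient of the target loss with respect to the target logits.
    d_tgt = p_tgt.copy()
    d_tgt[np.arange(n), y] -= 1.0
    d_tgt /= n
    grad_W_tgt = p_sub.T @ d_tgt
    # Backpropagate into the subtask output, then through its softmax.
    d_p_sub = d_tgt @ W_tgt.T
    d_sub = p_sub * (d_p_sub - (d_p_sub * p_sub).sum(axis=1, keepdims=True))
    grad_W_sub = X.T @ d_sub
    # Both parameter sets move along the gradient of the target objective.
    return W_sub - lr * grad_W_sub, W_tgt - lr * grad_W_tgt

# Synthetic data (an assumption for illustration): 6 input features,
# 4 hypothetical subtask tags, 2 hypothetical target labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
W_sub = rng.normal(scale=0.5, size=(6, 4))
W_tgt = rng.normal(scale=0.5, size=(4, 2))
W_sub_init = W_sub.copy()

initial_loss = target_loss(forward(X, W_sub, W_tgt)[1], y)
for _ in range(1000):
    W_sub, W_tgt = train_step(X, y, W_sub, W_tgt)
final_loss = target_loss(forward(X, W_sub, W_tgt)[1], y)
```

In a traditional pipeline, `W_sub` would instead be fit on its own labeled data and frozen before the target layer is trained. Here it is updated only through the target loss, which is the contrast the abstract draws.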
