Fine-tuning BERT Models for Keyphrase Extraction in Scientific Articles

  • Lim, Yeonsoo (Cognitive Intelligence Lab., Department of Computer Engineering, Kumoh National Institute of Technology)
  • Seo, Deokjin (Cognitive Intelligence Lab., Department of Computer Engineering, Kumoh National Institute of Technology)
  • Jung, Yuchul (Cognitive Intelligence Lab., Department of Computer Engineering, Kumoh National Institute of Technology)
  • Received : 2020.07.09
  • Accepted : 2020.07.26
  • Published : 2020.07.31

Abstract

Despite extensive research, performance enhancement of keyphrase (KP) extraction remains a challenging problem in modern informatics. Recently, deep learning-based supervised approaches have exhibited state-of-the-art accuracies on this problem, and several of the previously proposed methods utilize Bidirectional Encoder Representations from Transformers (BERT)-based language models. However, few studies have investigated how BERT-based fine-tuning techniques can be applied effectively to KP extraction. In this paper, we consider this problem in the context of scientific articles by investigating the fine-tuning characteristics of two distinct BERT models: BERT (the base model released by Google) and SciBERT (a BERT model pretrained on scientific text). Three datasets from the computer science domain (WWW, KDD, and Inspec) are used to compare the KP extraction performance obtained by fine-tuning BERT and SciBERT.
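For illustration only, since the abstract gives no implementation details: the sketch below shows one common way to frame BERT-based KP extraction as token-level sequence labeling (a BIO scheme over keyphrase spans) and fine-tune it with the Hugging Face Transformers library. The model identifiers (bert-base-uncased, allenai/scibert_scivocab_uncased), the label set, the toy sentence, and the learning rate are assumptions for this example, not the configuration reported in the paper.

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "bert-base-uncased"  # swap for "allenai/scibert_scivocab_uncased" to use SciBERT
LABELS = ["O", "B-KP", "I-KP"]    # assumed BIO scheme marking keyphrase spans

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=len(LABELS))

# Toy training example: one pre-tokenized sentence with word-level BIO labels.
words = ["fine-tuning", "bert", "models", "for", "keyphrase", "extraction"]
word_labels = ["B-KP", "I-KP", "I-KP", "O", "B-KP", "I-KP"]

enc = tokenizer(words, is_split_into_words=True, return_tensors="pt", truncation=True)

# Align word-level labels to word-piece tokens: label the first sub-token of each
# word and ignore special tokens / remaining sub-tokens (-100 is skipped by the loss).
label_ids, previous_word = [], None
for word_id in enc.word_ids(batch_index=0):
    if word_id is None or word_id == previous_word:
        label_ids.append(-100)
    else:
        label_ids.append(LABELS.index(word_labels[word_id]))
    previous_word = word_id
enc["labels"] = torch.tensor([label_ids])

# One fine-tuning step: the model computes token-level cross-entropy over the BIO labels.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
out = model(**enc)
out.loss.backward()
optimizer.step()
print("loss:", float(out.loss))

In practice this step would run over full titles and abstracts from the WWW, KDD, and Inspec datasets, and predicted B/I spans would be decoded back into keyphrases for evaluation.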

Acknowledgement

This work was supported by Kumoh National Institute of Technology.
