• Title/Summary/Keyword: AET library and information science

Search Result 1, Processing Time 0.01 seconds

Topic Model Augmentation and Extension Method using LDA and BERTopic (LDA와 BERTopic을 이용한 토픽모델링의 증강과 확장 기법 연구)

  • Kim, SeonWook;Yang, Kiduk
    • Journal of the Korean Society for information Management
    • /
    • v.39 no.3
    • /
    • pp.99-132
    • /
    • 2022
  • The purpose of this study is to propose AET (Augmented and Extended Topics), a novel method of synthesizing both LDA and BERTopic results, and to analyze the recently published LIS articles as an experimental approach. To achieve the purpose of this study, 55,442 abstracts from 85 LIS journals within the WoS database, which spans from January 2001 to October 2021, were analyzed. AET first constructs a WORD2VEC-based cosine similarity matrix between LDA and BERTopic results, extracts AT (Augmented Topics) by repeating the matrix reordering and segmentation procedures as long as their semantic relations are still valid, and finally determines ET (Extended Topics) by removing any LDA related residual subtopics from the matrix and ordering the rest of them by F1 (BERTopic topic size rank, Inverse cosine similarity rank). AET, by comparing with the baseline LDA result, shows that AT has effectively concretized the original LDA topic model and ET has discovered new meaningful topics that LDA didn't. When it comes to the qualitative performance evaluation, AT performs better than LDA while ET shows similar performances except in a few cases.