• Title/Summary/Keyword: Latent Dirichlet allocation

Convergence Study on Research Topics for Thyroid Cancer in Korea (국내 갑상선암 논문 토픽에 대한 융합연구)

  • Yang, Ji-Yeon
    • Journal of the Korea Convergence Society / v.10 no.2 / pp.75-81 / 2019
  • The purpose of this study was to perform a convergence study investigating the trend of research topics related to thyroid cancer in Korea. We collected related research papers from DBpia and employed an LDA-based topic model. As a result, we identified four research topics, concerning "Surgery", "Disease aggressiveness", "Survival analysis", and "Well-being of patients". With multinomial logistic regression, we found a significant time trend: the "Surgery"-related topic was popular before 2000, topics regarding "Disease aggressiveness" and "Survival analysis" were frequently addressed in the 2000s, and "Survival analysis" and especially "Well-being of patients" have been pursued since 2010. The findings can serve as a reference guide for research directions. Future work may examine whether the recent change in research topics is observed in other diseases.

A Technology Landscape of Artificial Intelligence: Technological Structure and Firms' Competitive Advantages (인공지능 기술 랜드스케이프 : 기술 구조와 기업별 경쟁우위)

  • Lee, Wangjae;Lee, Hakyeon
    • Journal of Korea Technology Innovation Society / v.22 no.3 / pp.340-361 / 2019
  • This study analyzes the technological structure of artificial intelligence (AI) and the technological capabilities of AI companies based on patent information. A total of 2,589 AI patents registered with the USPTO from 2007 to 2017 were collected and analyzed using Latent Dirichlet Allocation (LDA) to derive 20 AI technology topics. Analysis of development trends by AI technology reveals that visual understanding, data analysis, motion control, and machine learning are growing, while language understanding and speech technology are sluggish. We also investigated the leading companies in each sub-field of AI as well as the core competencies of global IT companies. The findings of this study are expected to be useful for formulating and implementing the technology strategies of AI companies.

An Analysis of the Social Phenomena and Perceptions of the Special Case of Military Service System in Korean Sports Field Using Big Data (빅데이터분석을 통한 체육계 병역특례제도의 사회적 현상 및 인식분석)

  • Lee, Hyun-Jeong;Han, Hae-Won
    • Journal of the Korea Convergence Society / v.10 no.4 / pp.229-236 / 2019
  • The purpose of this paper is to analyze social phenomena and perceptions by collecting and analyzing data on public opinion, views, and trends related to the special case of military service in the sports community through Big KINDS, operated by the Korea Press Promotion Foundation. To this end, related keywords were derived and visualized using an LDA (latent Dirichlet allocation) technique to identify problems in social phenomena based on big data analysis. The topics derived include "re-lighting special case on military service," "military service corruption controversy," "special case of military service for athletes," "alternative military service system for artists," and "parliamentary inspection of the administration." These results could be used as basic data for identifying accurate information on social controversies related to the special case of military service in the sports community and for drawing up practical measures in line with the principle of a just and equal burden.

Data Analysis of Dropouts of University Students Using Topic Modeling (토픽모델링을 활용한 대학생의 중도탈락 데이터 분석)

  • Jeong, Do-Heon;Park, Ju-Yeon
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.1 / pp.88-95 / 2021
  • This study aims to provide implications for establishing student support policies by empirically analyzing data on university student dropouts. To this end, data on students enrolled in D University after 2017 were sampled and collected. The collected data were analyzed using a topic modeling technique (LDA: Latent Dirichlet Allocation), which is a probabilistic model based on text mining. The study found topics that were characteristic of dropout students, and the classification performance between groups based on these topics was also excellent. Based on these results, a specific educational support system was proposed to prevent university students from dropping out. This study is meaningful in that it shows the use of text mining techniques in the education field and suggests an education policy based on data analysis.

Technology Development Strategy of Piggyback Transportation System Using Topic Modeling Based on LDA Algorithm

  • Jun, Sung-Chan;Han, Seong-Ho;Kim, Sang-Baek
    • Journal of the Korea Society of Computer and Information / v.25 no.12 / pp.261-270 / 2020
  • In this study, we identify promising technologies for the Piggyback transportation system by analyzing the relevant patent information. To this end, we first develop the patent database by extracting relevant technology keywords from the pioneering research papers on the Piggyback flatcar system. We then employ text mining to identify the frequently used words in the patent database, and using these words, we apply the LDA (Latent Dirichlet Allocation) algorithm to identify "topics" that correspond to "key" technologies for the Piggyback system. Finally, we employ the ARIMA model to forecast the trends of these "key" technologies and identify the promising technologies for the Piggyback system. The results show that a data-driven integrated management system, an operation planning system, and special cargo (especially fluid and gas) handling/storage technologies are the "key" promising technologies for the future of the Piggyback system, and that data reception/analysis techniques must be developed in order to improve system performance. The proposed procedure and analysis method provide useful insights for developing the R&D strategy and the technology roadmap for the Piggyback system.

Application of a Topic Model on the Korea Expressway Corporation's VOC Data (한국도로공사 VOC 데이터를 이용한 토픽 모형 적용 방안)

  • Kim, Ji Won;Park, Sang Min;Park, Sungho;Jeong, Harim;Yun, Ilsoo
    • Journal of Information Technology Services / v.19 no.6 / pp.1-13 / 2020
  • Recently, 80% of big data consists of unstructured text data. In particular, various types of documents are stored as large-scale unstructured documents through social network services (SNS), blogs, news, etc., and the importance of unstructured data is being highlighted. As the possibility of using unstructured data increases, various analysis techniques such as text mining have recently appeared. Therefore, in this study, a topic modeling technique was applied to the Korea Expressway Corporation's voice of customer (VOC) data, which includes customer opinions and complaints. Currently, VOC data is divided according to the business areas of the Korea Expressway Corporation. However, the classified categories are often not accurate, and the ambiguous ones are classified as "other". Therefore, in order to use VOC data for efficient service improvement and the like, a more systematic and efficient classification method for VOC data is required. To this end, this study proposed two approaches: a method using only latent Dirichlet allocation (LDA), the most representative topic modeling technique, and a new method combining LDA with the word embedding technique Word2vec. As a result, it was confirmed that the categories of VOC data are relatively well classified when using the new method. These results are expected to make it possible to derive implications for the Korea Expressway Corporation and to use them for service improvement.

Analysis on the Trend of The Journal of Information Systems Using TLS Mining (TLS 마이닝을 이용한 '정보시스템연구' 동향 분석)

  • Yun, Ji Hye;Oh, Chang Gyu;Lee, Jong Hwa
    • The Journal of Information Systems / v.31 no.1 / pp.289-304 / 2022
  • Purpose: The development of the network and mobile industries has induced companies to invest in information systems, leading a new industrial revolution. The Journal of Information Systems (JIS), which developed the information system field into a theoretical and practical study in the 1990s, retains a 30-year history of information systems. This study aims to identify the academic values and research trends of JIS by analyzing those trends. Design/methodology/approach: This study analyzes the trend of JIS by combining several methods, named TLS mining analysis. TLS mining analysis consists of a series of analyses including the Term Frequency-Inverse Document Frequency (TF-IDF) weight model, Latent Dirichlet Allocation (LDA) topic modeling, and text mining with Semantic Network Analysis. First, keywords are extracted from the research data using the TF-IDF weight model, and then topic modeling is performed using the LDA algorithm to identify issue keywords. Findings: The current study used the summary service for published research papers provided by the Korea Citation Index to analyze JIS. The 714 papers published from 2002 to 2021 were divided into two periods: 2002-2011 and 2012-2021. In the first period (2002-2011), the research trend in the information system field focused on e-business strategies, as most companies adopted online business models. In the second period (2012-2021), data-based information technology and new industrial revolution technologies such as artificial intelligence, SNS, and mobile were the main research issues in the information system field. In addition, keywords for improving the JIS citation index were presented.

An Exploratory Analysis of Online Discussion of Library and Information Science Professionals in India using Text Mining

  • Garg, Mohit;Kanjilal, Uma
    • Journal of Information Science Theory and Practice / v.10 no.3 / pp.40-56 / 2022
  • This paper aims to implement a topic modeling technique for extracting the topics of online discussions among library professionals in India. Topic modeling is an established text mining technique popularly used for modeling text data from Twitter, Facebook, Yelp, and other social media platforms. The present study modeled the online discussions of Library and Information Science (LIS) professionals posted on Lis Links. The text data of these posts was extracted using a program written in R with the package "rvest." The data was pre-processed to remove blank posts, posts having text in non-English fonts, punctuation, URLs, emails, etc. Topic modeling with the Latent Dirichlet Allocation algorithm was applied to the pre-processed corpus to identify each topic associated with the posts. The frequency of word occurrences in the text corpus was calculated. The results showed that the most frequent words included: library, information, university, librarian, book, professional, science, research, paper, question, answer, and management. This shows that the LIS professionals actively discussed exams, research, and library operations on the forum of Lis Links. The study categorized the online discussions on Lis Links into ten topics, i.e. "LIS Recruitment," "LIS Issues," "Other Discussion," "LIS Education," "LIS Research," "LIS Exams," "General Information related to Library," "LIS Admission," "Library and Professional Activities," and "Information Communication Technology (ICT)." It was found that the majority of the posts belonged to "LIS Exams," followed by "Other Discussion" and "General Information related to Library."

Trend Analysis of Pet Plants Before and After COVID-19 Outbreak Using Topic Modeling: Focusing on Big Data of News Articles from 2018 to 2021

  • Park, Yumin;Shin, Yong-Wook
    • Journal of People, Plants, and Environment / v.24 no.6 / pp.563-572 / 2021
  • Background and objective: The ongoing COVID-19 pandemic restricted daily life, forcing people to spend time indoors. With the growing interest in mental health issues and residential environments, 'pet plants' have been receiving attention during the unprecedented social distancing measures. This study aims to analyze the change in trends of pet plants before and during the COVID-19 pandemic and provide basic data for studies related to pet plants and directions of future development. Methods: A total of 2,016 news articles using the keyword 'pet plants' were collected on Naver News from January 1, 2018 to August 15, 2019 (609 articles) and January 1, 2020 to August 15, 2021 (1,407 articles). The texts were tokenized into words using the KoNLPy package, ultimately yielding 63,597 words. The analyses included frequency of keywords and topic modeling based on Latent Dirichlet Allocation (LDA) to identify the inherent meanings of related words and each topic. Results: Topic modeling generated three topics in each period (before and during COVID-19), and the results showed that pet plants in daily life have become the object of 'emotional support' and 'healing' during social distancing. In particular, pet plants, which had been distributed as a solution to prevent solitary deaths and depression among seniors living alone, have now been extended to help resolve the social isolation of the general public suffering from COVID-19. The new term 'plant butler' became a trend, and people increasingly shared their hobbies and information about pet plants and communicated with others online. Conclusion: Based on these findings, the trend data of pet plants before and after the outbreak of COVID-19 can provide the basis for activating research on pet plants and setting the direction for development of related industries, considering the continuous popularity and trend of indoor gardening and green hobbies.

Reviews Analysis of Korean Clinics Using LDA Topic Modeling (토픽 모델링을 활용한 한의원 리뷰 분석과 마케팅 제언)

  • Kim, Cho-Myong;Jo, A-Ram;Kim, Yang-Kyun
    • The Journal of Korean Medicine / v.43 no.1 / pp.73-86 / 2022
  • Objectives: In the health care industry, the influence of online reviews is growing. As medical services are provided mainly by providers, those services have been managed by hospitals and clinics. However, direct promotion of medical services by providers is legally forbidden. For this reason, consumers, such as patients and clients, search many reviews on the Internet to get information about hospitals, treatments, prices, etc. Online reviews can be taken to indicate the quality of hospitals, and they should be analyzed for sustainable hospital marketing. Method: Using a Python-based crawler, we collected more than 14,000 reviews written by real patients who had experienced Korean medicine. To extract the most representative words, reviews were divided into positive and negative; the reviews were then pre-processed to keep only nouns and adjectives and to compute TF (Term Frequency), DF (Document Frequency), and TF-IDF (Term Frequency - Inverse Document Frequency). Finally, to obtain topics from the reviews, the sets of extracted words were analyzed using LDA (Latent Dirichlet Allocation) methods. To avoid overlap, the number of topics is set by Davis visualization. Results and Conclusions: Six and three topics were extracted from the positive and negative reviews, respectively, using the LDA topic model. The main factors making up the topics were 1) response to patients and customers, 2) customized treatment (consultation) and management, and 3) the hospital's or clinic's environment.