• Title/Summary/Keyword: Topic Classification

A Deep Learning-based Depression Trend Analysis of Korean on Social Media (딥러닝 기반 소셜미디어 한글 텍스트 우울 경향 분석)

  • Park, Seojeong;Lee, Soobin;Kim, Woo Jung;Song, Min
    • Journal of the Korean Society for Information Management / v.39 no.1 / pp.91-117 / 2022
  • The number of patients with depression in Korea and around the world is increasing rapidly every year. However, most people with mental illness are not aware that they are suffering from the disease, so they do not receive adequate treatment. If depressive symptoms are neglected, they can lead to suicide, anxiety, and other psychological problems. Early detection and treatment of depression are therefore very important for improving mental health. To address this problem, this study presents a deep learning-based depression tendency model using Korean social media text. After collecting data from Naver Knowledge iN, Naver Blog, Hidoc, and Twitter, the DSM-5 diagnostic criteria for major depressive disorder were used to classify and annotate documents into classes according to the number of depressive symptoms. TF-IDF analysis and co-occurrence word analysis were then performed to examine the characteristics of each class of the constructed corpus. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were performed to build a depression tendency classification model from various text features: the embedded text, sentiment score, and topic number of each document were calculated and used as features. The highest accuracy, 83.28%, was achieved when depression tendency was classified with the KorBERT algorithm using the embedded text combined with both the sentiment score and the topic of the document. This study is significant in that it establishes a classification model for Korean depression tendency with improved performance using various text features, and in that it can detect potentially depressed individuals early among Korean online community users, enabling rapid treatment and prevention and thereby helping to promote the mental health of Korean society.
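The TF-IDF step mentioned in the abstract above can be sketched in a few lines. This is only a minimal illustration using the common tf × log(N/df) weighting; the function name `tf_idf` and the toy English tokens are assumptions made here, not the authors' code, corpus, or exact weighting scheme.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: tf-idf score} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical, already-tokenized posts (illustrative only).
docs = [
    ["sleep", "tired", "sad"],
    ["sleep", "happy", "walk"],
    ["sleep", "sad", "alone"],
]
for doc_scores in tf_idf(docs):
    print(sorted(doc_scores.items(), key=lambda kv: -kv[1]))
```

A term that occurs in every document receives a score of zero under this weighting, which is why TF-IDF is useful for surfacing the words that distinguish one class of documents from another.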

A Study on Image Retrieval Using Sound Classifier (사운드 분류기를 이용한 영상검색에 관한 연구)

  • Kim, Seung-Han;Lee, Myeong-Sun;Roh, Seung-Yong
    • Proceedings of the KIEE Conference / 2006.10c / pp.419-421 / 2006
  • The automatic discrimination of image data has grown in importance as a research topic in recent years. We used a feedforward neural network as a classifier over sound features extracted from the image data. Our initial tests have shown encouraging results that indicate the viability of our approach.

Comparison and Analysis of Subject Classification for Domestic Research Data (국내 학술논문 주제 분류 알고리즘 비교 및 분석)

  • Choi, Wonjun;Sul, Jaewook;Jeong, Heeseok;Yoon, Hwamook
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.178-186 / 2018
  • Subject classification at the level of individual articles is essential for delivering scholarly information services. To date, however, topic classification has been journal-based, and few article-level subject classification services exist. For domestic academic papers in particular, subject classification can be especially valuable because it allows a service to cover a wider area and to be scoped to a chosen range. However, classifying subjects by field requires experts from various fields, and various verification methods are needed to increase accuracy. In this paper, we classify topics using unsupervised learning algorithms, which can be applied when the correct answers are unknown, and compare the results of the subject classification algorithms using coherence and perplexity. The unsupervised learning algorithms compared are the well-known Hierarchical Dirichlet Process (HDP), Latent Dirichlet Allocation (LDA), and Latent Semantic Indexing (LSI).
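Of the two evaluation measures named in the abstract above, perplexity is the simpler to illustrate: it is the exponential of the negative mean log-probability that a model assigns to held-out tokens, so lower values mean the model is less "surprised" by the text. The sketch below assumes the per-token probabilities are already available; the function name and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(-(1/N) * sum(log p_i)) over N held-out tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    log_likelihood = sum(math.log(p) for p in token_probs)
    return math.exp(-log_likelihood / len(token_probs))

# Hypothetical per-token probabilities from two models on the same held-out text.
model_a = [0.25, 0.25, 0.25, 0.25]
model_b = [0.5, 0.5, 0.5, 0.5]
print(perplexity(model_a), perplexity(model_b))
```

A model that assigns every token probability 1/k has perplexity k, so the measure can be read as an effective number of equally likely choices per token.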

Recovery of the connection relationship among planar objects

  • Yao, Fenghui;Shao, Guifeng;Tamaki, Akikazu;Kato, Kiyoshi
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집) / 1996.10a / pp.430-433 / 1996
  • The shape of an object plays a very important role in pattern analysis and classification. Research on this topic can be roughly classified into three fields: (i) edge detection, (ii) dominant point extraction, and (iii) shape recognition and classification. Much work has been done in these three fields. However, research that discusses the connection relationship among objects is rarely seen, even though this problem is very important in robot assembly systems. We therefore focus on this problem and discuss how to recover the connection relationship among planar objects. Our method is based on the partial curve identification algorithm. The experimental results show the efficiency and validity of this method.

An Empirical Study on Research Diversity in "Journal of MIS Research" ("경영정보학연구"의 연구 다양성 평가)

  • Kim, Gi-Moon;Park, Choong-Shin;Kim, Joon-Seok;Lee, Ho-Geun;Im, Kun-Shin
    • Asia Pacific Journal of Information Systems / v.15 no.2 / pp.149-170 / 2005
  • This study assessed diversity in MIS research by examining 357 articles published in the Journal of MIS Research from 1991 to 2003. The classification system developed by Vessey et al. (2002) was employed to examine the research diversity of the articles. The classification system comprises four key characteristics of diversity: research topic, research method, level of analysis, and reference discipline. The results of our study were also compared with those of Vessey et al. (2002) to address problems in current MIS research and to suggest recommendations for future MIS research in Korea.

Analysis of Automatic Topic Classification using Youtube Meta Information (유튜브 메타정보를 이용한 자동 주제 분류 고찰)

  • Kim, Yong-Woo;Jeon, Seong-Bae;Jung, Yuchul
    • Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.349-351 / 2021
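  • When uploading a YouTube video, users face the burden of setting the topic themselves. In this study, we investigated classifying the topic automatically from the title and description entered by the user. To this end, we selected eight high-frequency topic categories among Korean-language content and built a training dataset of about 13,000 items through crawling. In addition, to determine the best achievable performance across various algorithms, we tried SVM and LSTM, which are representative text classification methods, as well as fine-tuning of BERT-based models. As a result, an accuracy of up to 94% was observed in the experiment that fine-tuned BERT-multilingual (base). However, because a considerable number of YouTube videos cover multiple topics, the accuracy perceived in practice is expected to be higher.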

A Study on Analysis of national R&D research trends for Artificial Intelligence using LDA topic modeling (LDA 토픽모델링을 활용한 인공지능 관련 국가R&D 연구동향 분석)

  • Yang, MyungSeok;Lee, SungHee;Park, KeunHee;Choi, KwangNam;Kim, TaeHyun
    • Journal of Internet Computing and Services / v.22 no.5 / pp.47-55 / 2021
  • Research trends in a specific subject area are usually analyzed by applying topic modeling to keywords extracted from literature information (papers, patents, etc.) and examining the related topics and how they change. Unlike these existing approaches, this paper applies LDA topic modeling to the project information of national R&D projects in the field of artificial intelligence, provided by the National Science and Technology Knowledge Information Service (NTIS), and extracts the related topics. By analyzing these topics, the study aims to identify research topics and investment directions for national R&D projects. NTIS provides a vast amount of national R&D information, from information on the projects carried out under national R&D programs to the research results (papers, patents, etc.) they generate. In this paper, search results were obtained by running artificial intelligence keyword searches and related classification searches in the NTIS integrated search, and a base dataset was built by downloading the project information for the latest three years. Using an LDA topic modeling library for Python, related topics and keywords were extracted from the base data (research goals, research content, expected effects, keywords, etc.) and analyzed to derive insights on the direction of research investment.

A Topic Analysis of Requested Books by User Types at a University Library for Patron-Driven Acquisition (이용자 요구 기반 장서개발을 위한 대학도서관 희망도서 주제 분석)

  • Sanghee Choi
    • Journal of the Korean Society for Library and Information Science / v.58 no.1 / pp.395-415 / 2024
  • In the development of a university library's collection, the concept of patron-driven acquisition refers to a collection strategy that addresses users' direct information needs. In this study, an analysis of ten years' worth of book requests by user type was conducted to understand topic preferences for efficient collection development in the university library. In collection development, identifying the subject areas of users' requested books is necessary for librarians to identify key areas of collection development and establish balanced collection development policies. To identify the major subject areas for each user group, KDC (Korean Decimal Classification) subject classifications were used, and network analysis techniques were applied to investigate the relationships between book topics in detail. The analysis revealed that "social sciences" emerged as the major topic across all user groups. However, in the analysis of sub-topics, "medicine" and "psychology" were distinctively identified as the major subject areas for graduate students, setting them apart from other user groups. The results of the network analysis further indicated that undergraduate students showed unique topics such as civil service, job placement, and career, which were not observed as major topic clusters in other user groups. On the other hand, graduate students tended to concentrate on a few specialized subjects, forming distinct topic clusters in the analysis.

The Automatic Management of Classification Scheme with Interoperability on Heterogeneous Data (이기종 데이터 간 상호운용적 분류체계 관리를 위한 분류체계 자동화 방안)

  • Lee, Won-Goo;Hwang, Myung-Gwon;Lee, Min-Ho;Shin, Sung-Ho;Kim, Kwang-Young;Yoon, Hwa-Mook;Sung, Won-Kyung;Jeon, Do-Heon
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.12 / pp.2609-2618 / 2011
  • In the knowledge-based economy of the 21st century, convergence and complexity in science and technology are increasing. Interoperability between heterogeneous domains is an important consideration in scholarly information services as well as in information standardization. We therefore propose a systematic method for flexibly extending classification schemes for content management and service organizations. In particular, this paper shows that an automatic method for interoperability between heterogeneous scholarly classification code structures will be effective in enhancing information service systems.

Effective Korean sentiment classification method using word2vec and ensemble classifier (Word2vec과 앙상블 분류기를 사용한 효율적 한국어 감성 분류 방안)

  • Park, Sung Soo;Lee, Kun Chang
    • Journal of Digital Contents Society / v.19 no.1 / pp.133-140 / 2018
  • Accurate sentiment classification is an important research topic in sentiment analysis. This study proposes an efficient method for Korean sentiment classification using word2vec and ensemble methods, both of which have been widely studied recently. For 200,000 Korean movie review texts, we generate a POS-based BOW feature, a word2vec feature, and an integrated feature that combines the two representations. For sentiment classification, we used single classifiers (Logistic Regression, Decision Tree, Naive Bayes, and Support Vector Machine) and ensemble classifiers (Adaptive Boosting, Bagging, Gradient Boosting, and Random Forest). The integrated feature representation, composed of a BOW feature including adjectives and adverbs and a word2vec feature, showed the highest sentiment classification accuracy. Empirical results show that SVM, a single classifier, performs best, while the ensemble classifiers perform similarly to or slightly worse than the single classifier.
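The "integrated feature" described in the abstract above, a bag-of-words vector combined with a word-embedding feature, might look roughly like the following. The abstract does not say how the word2vec vectors are aggregated per review, so averaging them is an assumption made here; the function names, the two-word vocabulary, and the 2-dimensional toy embeddings are likewise hypothetical.

```python
def bow_vector(tokens, vocab):
    """Count how often each vocabulary word occurs in the token list."""
    return [float(tokens.count(word)) for word in vocab]

def mean_embedding(tokens, embeddings, dim):
    """Average the embedding vectors of the tokens that have one (assumed aggregation)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def integrated_features(tokens, vocab, embeddings, dim):
    """Concatenate the BOW counts with the averaged embedding."""
    return bow_vector(tokens, vocab) + mean_embedding(tokens, embeddings, dim)

# Hypothetical vocabulary and toy 2-dimensional embeddings (illustrative only).
vocab = ["good", "bad"]
embeddings = {"good": [1.0, 0.0], "movie": [0.0, 1.0]}
print(integrated_features(["good", "movie", "good"], vocab, embeddings, dim=2))
```

The resulting vector can be passed to any of the single or ensemble classifiers listed in the abstract, since each of them accepts a fixed-length numeric feature vector.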