• Title/Summary/Keyword: 소셜태깅 (social tagging)

Search Result 39, Processing Time 0.019 seconds

Topic Modeling Insomnia Social Media Corpus using BERTopic and Building Automatic Deep Learning Classification Model (BERTopic을 활용한 불면증 소셜 데이터 토픽 모델링 및 불면증 경향 문헌 딥러닝 자동분류 모델 구축)

  • Ko, Young Soo;Lee, Soobin;Cha, Minjung;Kim, Seongdeok;Lee, Juhee;Han, Ji Yeong;Song, Min
    • Journal of the Korean Society for Information Management / v.39 no.2 / pp.111-129 / 2022
  • Insomnia is a chronic disease in modern society, with the number of new patients increasing by more than 20% in the last 5 years. It is a serious disease that requires diagnosis and treatment, because the individual and social problems caused by a lack of sleep are serious and the triggers of insomnia are complex. This study collected 5,699 data items from 'insomnia', a community on 'Reddit', a social media platform where users freely express opinions. Based on the International Classification of Sleep Disorders (ICSD-3) standard and guidelines prepared with the help of experts, an insomnia corpus was constructed by tagging the items as insomnia-tendency documents or non-insomnia-tendency documents. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained using the constructed insomnia corpus as training data. In the performance evaluation, RoBERTa showed the highest performance with an accuracy of 81.33%. For an in-depth analysis of the insomnia social data, topic modeling was performed using the recently introduced BERTopic method, which addresses weaknesses of the previously widely used LDA. The analysis identified 8 subject groups ('Negative emotions', 'Advice and help and gratitude', 'Insomnia-related diseases', 'Sleeping pills', 'Exercise and eating habits', 'Physical characteristics', 'Activity characteristics', 'Environmental characteristics'). Users expressed negative emotions and sought help and advice from the Reddit insomnia community. In addition, they mentioned diseases related to insomnia, shared discourse on the use of sleeping pills, and expressed interest in exercise and eating habits. As insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and heart; activity characteristics such as zombies, hypnic jerk, and groggy; and environmental characteristics such as sunlight, blankets, temperature, and naps.

Constructing an Evaluation Set for Korean Sentiment Analysis Systems Incorporating the Category and the Strength of Sentiment (감성 강도를 고려한 감성 분석 평가집합 구축)

  • Kim, Do-Yeon;Wu, Yong;Park, Hyuk-Ro
    • The Journal of the Korea Contents Association / v.12 no.11 / pp.30-38 / 2012
  • Sentiment analysis is concerned with extracting and analyzing different kinds of user sentiment expressed in a variety of social media such as blogs and Twitter. Although sentiment analysis techniques are actively studied these days, evaluation sets have not yet been developed for Korean sentiment analysis. In this paper, we constructed an evaluation set for Korean sentiment analysis. To evaluate sentiment analysis systems more thoroughly, each sentence in our evaluation set is tagged with the polarity of the sentiment as well as the category and the strength of the sentiment. We divide the kinds of sentiment into 7 positive categories and 15 negative categories. Each category is given a sentiment strength from 1 to 3. Our evaluation set consists of 3,270 sentences extracted from various social media. For each sentence, 5 human taggers assigned the category and the strength of the sentiment expressed in the sentence. The inter-tagger agreement ratio was 93% for polarity, 70% for category, and 58% for strength of sentiment. The inter-tagger agreement ratio of our evaluation set is a bit higher than that of other evaluation sets developed for German and Spanish. This result shows that our evaluation set can be used as a reliable resource for the evaluation of sentiment analysis systems.
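The abstract reports inter-tagger agreement ratios but does not say how they were computed. As a minimal sketch, assuming a simple average pairwise agreement (one common definition, not necessarily the authors'), the calculation for 5 taggers could look like the following. The function name and the sample labels are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(labels_per_item):
    """Average, over items, of the fraction of tagger pairs that gave the same label.

    Assumption: this is plain pairwise percent agreement, not a
    chance-corrected measure such as Fleiss' kappa.
    """
    total = 0.0
    for labels in labels_per_item:
        pairs = list(combinations(labels, 2))
        agreeing = sum(1 for a, b in pairs if a == b)
        total += agreeing / len(pairs)
    return total / len(labels_per_item)


# Hypothetical polarity labels from 5 taggers for 3 sentences.
polarity = [
    ["pos", "pos", "pos", "pos", "pos"],
    ["neg", "neg", "neg", "neg", "pos"],
    ["neg", "neg", "neg", "neg", "neg"],
]
score = pairwise_agreement(polarity)
print(f"pairwise agreement: {score:.3f}")
```

With finer-grained labels (22 categories, or strength 1 to 3) the same function applies unchanged, and agreement would be expected to drop as the label space grows, which matches the ordering of the reported ratios.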

Developing Facets for Fiction Retrieval Based on User-generated Book Tags (이용자 생성 도서정보 태그에 기반한 소설 검색의 패싯 유형 개발)

  • Shim, Jiyoung
    • Journal of the Korean Society for Information Management / v.37 no.2 / pp.225-249 / 2020
  • The purpose of this study is to identify and systematize, from book tags, the various facet elements that users require when searching for fiction, in order to improve the fiction search environment. Based on Ranganathan's PMEST formula, the basic facet system for fiction was defined as 1) the personality that forms the fiction material, 2) the content and external characteristics that compose the fiction, 3) the reader's interaction with books, 4) spatial information related to fiction and reading activities, and 5) time information related to fiction and reading activities. Out of approximately 310,000 tags assigned to 7,174 works of fiction, 3,730 core tags were selected and content-analyzed. As a result, various attributes were systematized around the top 25 categories of the fiction facets. The results of this study can be applied to facet navigation in OPACs and fiction databases in the future.

A Tag-based Music Recommendation Using UniTag Ontology (UniTag 온톨로지를 이용한 태그 기반 음악 추천 기법)

  • Kim, Hyon Hee
    • Journal of the Korea Society of Computer and Information / v.17 no.11 / pp.133-140 / 2012
  • In this paper, we propose a music recommendation method that considers users' tags created through collaborative tagging in a social music site. Since collaborative tagging allows users to add keywords of their own choosing to web resources, it captures users' preferences about those resources concretely. In particular, emotional tags, which represent human emotion, reflect users' musical preferences more directly than factual tags, which represent facts such as musical genre and artists. Therefore, to classify the tags into emotional tags and factual tags and to assign weighted values to the emotional tags, a tag ontology called UniTag is developed. After preprocessing the tags, the weighted tags are used to create user profiles, and the music recommendation algorithm is executed based on the profiles. To evaluate the proposed method, a conventional playcount-based recommendation, an unweighted tag-based recommendation, and a weighted tag-based recommendation are executed. Our experimental results show that the weighted tag-based recommendation outperforms the other two approaches in terms of precision.

A Study on Varieties of Subject Access and Usabilities of the National Library of Korea Subject Headings (주제 접근의 다양성과 국립중앙도서관 주제명 표목의 활용가능성에 관한 연구)

  • Chung, Yeon Kyoung
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.25 no.4 / pp.171-185 / 2014
  • The purposes of this study are to examine the various methods of subject access in a rapidly changing environment and to suggest the future of subject access at the National Library of Korea (NLK). First, the current status and problems of the Library of Congress Subject Headings, a representative subject headings list worldwide, and ways of improving the effectiveness of subject retrieval were examined. As ways of improving subject access, social bookmarking, folksonomy, tagging, facet applications, automatic keyword assignment, thesauri, classification systems, and an auto-assigned search box were suggested. Finally, the current status of the NLK subject headings and ways of improving their use for subject access were provided.

A Hybrid Music Recommendation System Combining Listening Habits and Tag Information (사용자 청취 습관과 태그 정보를 이용한 하이브리드 음악 추천 시스템)

  • Kim, Hyon Hee;Kim, Donggeon;Jo, Jinnam
    • Journal of the Korea Society of Computer and Information / v.18 no.2 / pp.107-116 / 2013
  • In this paper, we propose a hybrid music recommendation system combining users' listening habits and tag information in a social music site. Most commercial music recommendation systems recommend music items based on the number of plays and explicit ratings of a song. However, this approach has difficulty recommending new items with only a few ratings or recommending items to new users about whom little information is available. To resolve the problem, we use tag information generated by collaborative tagging. According to the meaning of each tag, a weighted value is assigned as the score of a tag on a music item. By combining the tag scores and the number of plays, user profiles are created and a collaborative filtering algorithm is executed. For performance evaluation, precision, recall, and F-measure are calculated for the listening habit-based recommendation, the tag score-based recommendation, and the hybrid recommendation, respectively. Our experiments show that the hybrid recommendation system outperforms the other two approaches.
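The evaluation measures named in the abstract have standard set-based definitions. A minimal sketch, assuming the F-measure is the balanced F1 and that a recommendation list is compared against a set of items the user actually found relevant (the song identifiers below are hypothetical):

```python
def precision_recall_f1(recommended, relevant):
    """Set-based precision, recall, and balanced F-measure (F1)."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)  # recommended items that are actually relevant
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical recommendation list and ground-truth relevant items.
p, r, f = precision_recall_f1(["s1", "s2", "s3", "s4"], ["s2", "s4", "s7"])
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Running the same function over the output of each recommender (listening habit-based, tag score-based, hybrid) gives directly comparable figures.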

An Efficient Technique for Image Tag Ranking using Semantic Relationship between Tags (태그간 의미관계를 이용한 효율적인 이미지 태그 랭킹 기법)

  • Hong, Hyun-Ki;Heu, Jee-Uk;Jeong, Jin-Woo;Lee, Dong-Ho
    • Proceedings of the Korean Information Science Society Conference / 2010.06c / pp.31-36 / 2010
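  • A defining feature of the recently emerging Web 2.0 is that ordinary users actively produce and share information. Folksonomy, recognized as a core element of the participatory architecture of Web 2.0, is a classification system based on social tagging, in which users collaboratively create and manage tags, rather than a classification scheme built by experts as in a traditional taxonomy. Recently, various attempts have been made to share and search images using such folksonomies. However, tag-based image sharing systems such as Flickr do not consider the syntactic and semantic ambiguity of tags, or the importance of and correlations among the tags attached to an image, so they cannot guarantee accuracy and reliability in tag-based search. To solve this problem, studies on tag propagation of suitable tags in folksonomy-based image sharing databases, and on tag ranking based on probability and occurrence frequency, are actively under way, but they still do not show satisfactory performance. In this paper, we propose an efficient tag ranking technique based on the semantic relationships between tags, using WordNet, in order to assign more suitable tags to an image from similar images in an image sharing database. In addition, we confirmed through experimental examples that, for reliable tag-based search, the proposed tag ranking technique can achieve higher accuracy than the ranking results of current image sharing systems.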

A Study on Scale Effects of the MAUP According to the Degree of Spatial Autocorrelation - Focused on LBSNS Data - (공간적 자기상관성의 정도에 따른 MAUP에서의 스케일 효과 연구 - LBSNS 데이터를 중심으로 -)

  • Lee, Young Min;Kwon, Pil;Yu, Ki Yun;Huh, Yong
    • Journal of Korean Society for Geospatial Information Science / v.24 no.1 / pp.25-33 / 2016
  • In order to visualize point-based Location-Based Social Network Services (LBSNS) data effectively on a multi-scale tile map, a tile-based clustering method must be applied, which requires determining a reasonable number and size of tiles. However, no such criteria exist, and the number and size of tiles are modified according to the data type and the purpose of the analysis. In other words, researchers' subjectivity is always involved in this type of study. This is when the Modifiable Areal Unit Problem (MAUP) occurs, which affects the results of the analysis. Among LBSNS data, geotagged Twitter data were chosen to examine the influence of the MAUP from the perspective of scale effects. For this purpose, the degree of spatial autocorrelation was altered using a spatial error model, and the change in distributions was analyzed using Moran's I. As a result, the original data showed positive spatial autocorrelation, and the spatial autocorrelation decreased as the value of the spatial autoregressive coefficient increased. Therefore, the intensity of the spatial autocorrelation of the Twitter data was adjusted to five levels, and for each level, grids of nine different sizes were created. Moran's I was calculated for each level and grid size. It was found that the spatial autocorrelation increased as the aggregation level increased and then decreased at a certain point. It was also found that the scale effect of the MAUP decreased when the spatial autocorrelation was high.
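For reference, global Moran's I is defined as $I = \frac{n}{\sum_{i}\sum_{j} w_{ij}} \cdot \frac{\sum_{i}\sum_{j} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i}(x_i - \bar{x})^2}$. The sketch below computes it for counts aggregated to a regular grid. The binary rook (edge-sharing) weights and the sample grids are assumptions for illustration; the abstract does not state which spatial weights the study used.

```python
def morans_i(grid):
    """Global Moran's I for a 2-D grid of values.

    Assumption: binary rook contiguity, i.e. w_ij = 1 when cells i and j
    share an edge and 0 otherwise. The grid must not be constant,
    otherwise the denominator is zero.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(v for row in grid for v in row) / n
    dev = [[v - mean for v in row] for row in grid]
    denom = sum(d * d for row in dev for d in row)

    cross = 0.0  # sum over neighbor pairs of w_ij * dev_i * dev_j
    w_sum = 0    # sum of all weights (each neighbor pair counted in both directions)
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    w_sum += 1
    return (n / w_sum) * (cross / denom)


# Hypothetical tile counts: a checkerboard (dispersed) and a left/right split (clustered).
checkerboard = [[1, 0], [0, 1]]
clustered = [[1, 1, 0, 0] for _ in range(4)]
print(morans_i(checkerboard), morans_i(clustered))
```

Re-aggregating the same points to coarser or finer grids and recomputing the statistic is the kind of comparison the scale-effect analysis describes.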

Component Grid: A Developer-centric Environment for Defense Software Reuse (컴포넌트 그리드: 개발자 친화적인 국방 소프트웨어 재사용 지원 환경)

  • Ko, In-Young;Koo, Hyung-Min
    • Journal of Software Engineering Society / v.23 no.4 / pp.151-163 / 2010
  • In the defense software domain, where large-scale software products in various application areas need to be built, reusing software is regarded as one of the important practices for building software products efficiently and economically. There have been many efforts to apply various methods to support software reuse in the defense software domain. However, developers in this domain still experience many difficulties and face obstacles in reusing software assets. In this paper, we analyze practical problems of software reuse in the defense software domain and define core requirements for solving those problems. To meet these requirements, we are currently developing the Component Grid system, a reuse-support system that provides a developer-centric software reuse environment. We have designed an architecture for Component Grid and defined the essential elements of the architecture. We have also developed the core approaches for building the Component Grid system: a semantic-tagging-based requirement tracing method, a reuse-knowledge representation model, a social-network-based asset search method, a web-based asset management environment, and a wiki-based collaborative and participative knowledge construction and refinement method. We expect that the Component Grid system will contribute to increasing the reusability of software assets in the defense software domain by providing an environment that supports transparent and efficient sharing and reuse of software assets.
