• Title/Summary/Keyword: text mining analysis (텍스트마이닝분석)


Exploring Feature Selection Methods for Effective Emotion Mining (효과적 이모션마이닝을 위한 속성선택 방법에 관한 연구)

  • Eo, Kyun Sun;Lee, Kun Chang
    • Journal of Digital Convergence / v.17 no.3 / pp.107-117 / 2019
  • In the era of SNS, many people rely on social media to express their emotions about various products and services. For companies eager to learn how their products and services are perceived in the market, emotion mining on SNS datasets has therefore become more important than ever. Emotion mining is a branch of sentiment analysis that is based on BOW (bag-of-words) and TF-IDF representations. However, few emotion mining studies adopt feature selection (FS) methods to find an optimal feature set that ensures better results. This study therefore proposes FS methods for conducting emotion mining tasks more effectively. The experiments use a Twitter dataset and the SemEval2007 dataset. We applied three FS methods: CFS (correlation-based FS), IG (information gain), and ReliefF. Emotion mining results were obtained by applying the selected features to nine classifiers. When DT (decision tree) was applied to the Twitter dataset, accuracy increased with the CFS, IG, and ReliefF methods. When LR (logistic regression) was applied to the SemEval2007 dataset, accuracy increased with the ReliefF method.
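
As an illustration of one of the FS methods named in the abstract above, the following is a minimal sketch of information gain (IG) scoring over a bag-of-words representation. It is not the authors' implementation: the function names, the toy documents, and the emotion labels are all assumptions made for this example, and the paper's actual tooling is not stated in the abstract.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting the documents on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in (with_term, without_term)
        if part  # skip an empty side of the split
    )
    return entropy(labels) - remainder


def select_top_k(docs, labels, k):
    """Return the k vocabulary terms with the highest information gain."""
    vocab = sorted({t for d in docs for t in d})
    return sorted(vocab, key=lambda t: information_gain(docs, labels, t), reverse=True)[:k]


# Hypothetical toy data: each document is a set of tokens (binary bag-of-words).
docs = [
    {"love", "this", "phone"},
    {"hate", "this", "phone"},
    {"love", "it"},
    {"hate", "it"},
]
labels = ["joy", "anger", "joy", "anger"]

top_terms = select_top_k(docs, labels, 2)
print(top_terms)
```

In this toy set, "love" and "hate" separate the two classes perfectly, so they receive the highest scores, while terms such as "this" that appear equally in both classes score zero and would be dropped before training a classifier.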

Analysis of Factors Affecting Surge in Container Shipping Rates in the Era of Covid19 Using Text Analysis (코로나19 판데믹 이후 컨테이너선 운임 상승 요인분석: 텍스트 분석을 중심으로)

  • Rha, Jin Sung
    • Journal of Korea Society of Industrial Information Systems / v.27 no.1 / pp.111-123 / 2022
  • In the Covid19 era, container shipping rates have surged. Many studies have investigated the factors behind this surge, but few have used text mining techniques to analyze its underlying causes. This study aims to identify the factors behind the unprecedented surge in shipping rates using network text analysis and LDA topic modeling. For the analysis, we collected data and keywords from articles published in Lloyd's List over the past two years (2020-2021). The results of the text analysis showed that the current surge is mainly due to the "US-China trade war", "rising blank sailings", "port congestion", "container shortage", and "unexpected events such as the Suez canal blockage".

Study on the Research Trend of Overseas Elderly Occupational Therapy Using Text Mining (텍스트마이닝을 활용한 국외 노인작업치료의 연구동향 분석)

  • Kim, Ah-Ram;Lee, Tae kwon;Jeong, In Jae;Park, Hae Yean
    • Therapeutic Science for Rehabilitation / v.10 no.1 / pp.7-17 / 2021
  • Objective: The purpose of this study was to analyze quantitative changes in, and the status of, overseas occupational therapy research using text mining. Methods: Using PubMed, research papers on Elderly, Health, and Occupational therapy published between 2009 and 2019 were selected, and their abstracts were analyzed. The number of papers per year, the key words, the key words by year, and the relationships between words were analyzed. Results: A total of 9,941 papers were published from 2009 to 2019. The number increased gradually from 2009, peaked in 2017 or 2018, and then decreased in 2019. Within the last five years, the most frequent words were Care, Group, Intervention, Pain, Treatment, and Work. Based on average frequency over the last 11 years, there was a strong relationship among the words function, health, event, and partition. Conclusion: This study is meaningful because it applied text mining, a new research method, to the empirical and systematic analysis of trends in occupational therapy and presented macroscopic and comprehensive results. The findings are expected to help establish new research directions at clinical and research sites for occupational therapy related to older adults.

Analysis of User Reviews of Running Applications Using Text Mining: Focusing on Nike Run Club and Runkeeper (텍스트마이닝을 활용한 러닝 어플리케이션 사용자 리뷰 분석: Nike Run Club과 Runkeeper를 중심으로)

  • Gimun Ryu;Ilgwang Kim
    • Journal of Industrial Convergence / v.22 no.4 / pp.11-19 / 2024
  • The purpose of this study was to analyze user reviews of running applications using text mining. User reviews of Nike Run Club and Runkeeper were collected from the Google Play Store with the selenium package for python3 and used as the analysis data. Morphemes were separated with the OKT analyzer, keeping only Korean nouns, and a rankNL dictionary was then created to remove stopwords. The data were analyzed using TF, TF-IDF, and LDA topic modeling. The results are as follows. First, 'record', 'app', and 'workout' were identified as the top keywords in the user reviews of both Nike Run Club and Runkeeper, and the TF and TF-IDF rankings differed. Second, LDA topic modeling identified the topics 'basic items', 'additional features', 'errors', and 'location-based data' for Nike Run Club, and the topics 'errors', 'voice function', 'running data', 'benefits', and 'motivation' for Runkeeper. Based on these results, it is recommended that errors be fixed and improvements be made to strengthen the competitiveness of the applications.
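
To show why TF and TF-IDF rankings can differ, as the abstract above reports, here is a minimal sketch using only the Python standard library. The tokenized reviews are invented English stand-ins (the study worked with Korean nouns), the function names are hypothetical, and the unsmoothed weighting tf × log(N/df) is one common variant assumed for this example, since the abstract does not state the exact formula.

```python
import math
from collections import Counter


def term_frequency(docs):
    """Total count of each term across all documents (corpus-level TF)."""
    counts = Counter()
    for doc in docs:
        counts.update(doc)
    return counts


def tf_idf(docs):
    """Sum over documents of tf * log(N / df) for each term (unsmoothed variant)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            scores[term] += count * math.log(n_docs / df[term])
    return scores


# Hypothetical tokenized reviews, already reduced to nouns.
reviews = [
    ["app", "record", "error"],
    ["app", "record", "workout"],
    ["app", "gps", "gps"],
]

tf = term_frequency(reviews)
scores = tf_idf(reviews)
print(tf.most_common(3))
print(scores.most_common(3))
```

In this toy corpus, "app" has the highest raw frequency but appears in every review, so its TF-IDF score is zero, while "gps" ranks first under TF-IDF because it is concentrated in a single review.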

Inferring Undiscovered Public Knowledge by Using Text Mining-driven Graph Model (텍스트 마이닝 기반의 그래프 모델을 이용한 미발견 공공 지식 추론)

  • Heo, Go Eun;Song, Min
    • Journal of the Korean Society for Information Management / v.31 no.1 / pp.231-250 / 2014
  • Due to the recent development of Information and Communication Technologies (ICT), the number of research publications has increased exponentially. In response to this rapid growth, demand has risen for automated text processing methods that can deal with massive amounts of text data. Biomedical text mining, which discovers hidden biological meanings and treatments in the biomedical literature, has become a pivotal methodology that helps medical disciplines reduce time and cost. Many researchers have conducted literature-based discovery studies to generate new hypotheses. However, existing approaches require either an intensive manual process or a semi-automatic procedure to find and select biomedical entities. In addition, they are limited to showing a single dimension, namely the cause-and-effect relationship between two concepts. Thus, this study proposed a novel approach that discovers various relationships among source concepts, target concepts, and their intermediate concepts by expanding the intermediate concepts to multiple levels. This study provided distinct perspectives for literature-based discovery, not only by discovering meaningful relationships among concepts in the biomedical literature through graph-based path inference but also by generating feasible new hypotheses.

A Study on the Application of Text Mining for Corporate Application form (기업 자기소개서 대상 텍스트 마이닝 적용 연구)

  • Kim, Kyoung-Sik;Kim, Seong-Bo;Kim, Ung-mo
    • Proceedings of the Korea Information Processing Society Conference / 2017.11a / pp.668-670 / 2017
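  • With the recent rise in youth unemployment in Korea, companies seeking to hire good talent are placing growing importance on the personal statement, which shows an applicant's experience and competencies, rather than on standardized certificates. Accordingly, using text mining, a representative method for analyzing unstructured data, we collect personal statements of successful applicants to Samsung, Hyundai Motor, and LG that were posted on job-seeking communities, and then perform morphological analysis with the KONLPY package. We rank the words that appear frequently in the successful statements and compare the words common to all of them, as well as the words that differ by company, against each company's ideal-candidate profile. The results are intended to help job seekers write personal statements efficiently and raise their acceptance rate.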

R&D Redundancy and Similarity Check System (클라우드 기반 R&D 연구 보고서 문서표절 및 유사도 검출 시스템)

  • Shin, Hyojoung;Park, Kiheung;Haing, Huhduck
    • Proceedings of the Korean Society of Computer Information Conference / 2016.01a / pp.31-32 / 2016
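  • As government support for R&D has grown in recent years, technology research is being carried out actively nationwide. In the course of budget execution, however, cases have emerged in which time and money are wasted on duplicated R&D projects. Solving this problem requires fundamental innovation, such as preventing duplicate research topics during the selection of government R&D projects. This paper proposes a cloud-based system for detecting plagiarism and similarity in R&D research reports that incorporates data analysis technologies such as text mining and big data analysis (Hadoop, Amazon Web Services). The system is delivered as SaaS-style "on-demand software" and can be used with web access alone.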
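
The abstract above does not say which similarity measure the proposed system uses. As one common possibility, the sketch below computes cosine similarity between word-count vectors of two report texts. The choice of cosine similarity, the whitespace tokenization, the function name, and the sample strings are all assumptions made for illustration, not details of the system itself.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts (0.0 to 1.0)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical report titles used only to exercise the function.
report_a = "cloud based text mining system"
report_b = "text mining system for patents"

similarity = cosine_similarity(report_a, report_b)
print(round(similarity, 3))
```

A screening step built on a score like this would flag pairs of reports above some threshold for manual review; the threshold itself would have to be tuned on real data.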


Analysis of trend in construction using textmining method (텍스트마이닝을 활용한 건설분야 트랜드 분석)

  • Jeong, Cheol-Woo;Kim, Jae-Jun
    • Journal of The Korean Digital Architecture Interior Association / v.12 no.2 / pp.53-60 / 2012
  • In this paper, we present new methods for identifying keywords for foresight topics. The methods use the internet and text mining techniques to draw objective, quantified information that supports experts' qualitative opinions and evaluations in foresight. By applying this procedure, we derived keywords for analyzing priorities in architectural engineering. A comparison with the technologies derived qualitatively in "The Science Technology Vision" (the control group) showed little difference between the experts' qualitative methods and quantitative methods such as text mining. Therefore, as a quantitative tool for drawing foresight keywords, text mining can supplement qualitative analysis by experts. In addition, depending on the level and type of raw data, text mining can yield better results in deriving foresight keywords. For this reason, research that incorporates Internet search results and develops text mining methods for analyzing current trends is in demand.

Time Series Analysis of Patent Keywords for Forecasting Emerging Technology (특허 키워드 시계열분석을 통한 부상기술 예측)

  • Kim, Jong-Chan;Lee, Joon-Hyuck;Kim, Gab-Jo;Park, Sang-Sung;Jang, Dong-Sick
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.650-652 / 2014
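  • Forecasting emerging technologies plays a very important role when nations and companies set strategies for R&D investment and management policy. Various methods are used for technology forecasting, and forecasting based on patents is also being actively pursued. Recently, text mining has been used for the quantitative analysis of patent data. This paper proposes a technology forecasting method that uses text mining and exponential smoothing.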
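
To make the exponential smoothing step named in the abstract above concrete, here is a minimal sketch of simple exponential smoothing applied to a yearly keyword count. The abstract does not specify which smoothing variant or parameters the authors used, so the simple (single-parameter) form, the value of `alpha`, the function name, and the keyword counts below are all assumptions.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}, with s_0 = x_0."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical yearly counts of one patent keyword.
yearly_counts = [10, 12, 15, 21, 30]

smoothed = exponential_smoothing(yearly_counts, alpha=0.5)
# In simple exponential smoothing, the last smoothed value is the one-step-ahead forecast.
forecast = smoothed[-1]
print(smoothed, forecast)
```

Because simple smoothing lags behind a rising series, a keyword with a strong upward trend would be under-forecast here; a trend-aware variant such as Holt's method is a common alternative in that case.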

Exploring Change of The Main Key Words in Nanum Broadcast News of the Koryeinmaeul Enclave in Gwangju (광주고려인마을 나눔방송의 주요 핵심어 변화 탐색)

  • Lee, Hyoung-Ha;Kwon, Choong Hoon
    • Proceedings of the Korean Society of Computer Information Conference / 2019.07a / pp.185-186 / 2019
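  • This study used text mining methodology to explore the main keywords, and the relationships among them, in the article titles of Nanum News (2010 to 2018) from the Nanum Broadcast of the Koryeinmaeul in Gwangju. Since the enactment of the Koryeinmaeul support ordinance (2013) and the opening of the Koryeinmaeul comprehensive support center (2015), a wider variety of Nanum Broadcast news has been produced than in the early settlement period. This study divides the analysis of Nanum News into period 1 (2010 to 2014) and period 2 (2015 to 2018) and explores how the main keywords changed.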
