• Title/Abstract/Keywords: Text mining analysis

Search results: 1,187 items

텍스트 마이닝 방법을 활용한 국내 학습상담 연구 동향 분석 (Analysis of Trends in Domestic Learning Counseling Research Using Text Mining Methods)

  • 현용찬;양지혜;박정환
    • 융합정보논문지 / Vol. 12, No. 3 / pp. 302-310 / 2022
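  • This study examined trends in research on learning counseling for adolescents using text mining and proposed directions for follow-up research. Learning and career are the top two concerns of Korean adolescents. Using the keywords 'learning counseling' ('학습상담') and 'academic counseling' ('학업상담'), 201 academic papers published in journals at KCI candidate level or higher were retrieved through RISS and analyzed with text mining-based modeling, a technique that can minimize researcher subjectivity and bias. The analysis yielded four topics: counseling experience [Topic 1], group counseling research [Topic 2], parent counseling [Topic 3], and learning skills program development [Topic 4]. Research on learning counseling has been developing counseling for emotional stability, group counseling, parent counseling, and learning skills programs. We expect continued research on learning counseling that addresses adolescents' concerns through integrated support based on collaboration among experts in psychological and emotional counseling, parent counseling, and learning skills.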

A Study on the Integration Between Smart Mobility Technology and Information Communication Technology (ICT) Using Patent Analysis

  • Alkaabi, Khaled Sulaiman Khalfan Sulaiman;Yu, Jiwon
    • 한국컴퓨터정보학회논문지 / Vol. 24, No. 6 / pp. 89-97 / 2019
  • This study proposes a method for investigating current patents related to information communication technology and smart mobility to provide insights into future technology trends. The method is based on text mining clustering analysis and consists of two stages: data preparation and clustering analysis. In the first stage, tokenizing, filtering, stemming, and feature selection are applied to transform the data into a usable format (structured data) and to extract useful information for the next stage. In the second stage, the structured data is partitioned into groups. The K-medoids algorithm is selected over the K-means algorithm for this analysis owing to its advantages in dealing with noise and outliers. The results of the analysis indicate that most current patents focus mainly on smart connectivity and smart guide systems, which play a major role in the development of smart mobility.
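
The two-stage pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stopword list, the crude suffix-stripping stemmer, the cosine distance, the deterministic initialization, and the four sample titles are all assumptions made for the example, and the feature selection step is omitted for brevity.

```python
import re
from collections import Counter

# Hypothetical stopword list used for the "filtering" step.
STOPWORDS = {"a", "an", "the", "of", "for", "and", "to", "in", "with", "on"}


def preprocess(text):
    """Stage 1: tokenize, filter stopwords, and apply a crude stemmer."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = []
    for tok in tokens:
        if tok in STOPWORDS:
            continue
        # Assumed toy stemmer: strip a trailing "ing" or "s" if enough remains.
        for suffix in ("ing", "s"):
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return Counter(stems)  # bag-of-words vector (the "structured data")


def cosine_distance(a, b):
    """1 - cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def assign(dist, medoids):
    """Label each document with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(vectors, k, max_iter=20):
    """Stage 2: partition documents around k medoids (actual documents)."""
    n = len(vectors)
    dist = [
        [0.0 if i == j else cosine_distance(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]
    medoids = list(range(k))  # assumption: start from the first k documents
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster is empty
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)


# Hypothetical patent titles, for illustration only.
titles = [
    "vehicle wireless connectivity network",
    "route guidance navigation display",
    "wireless network connecting vehicles",
    "navigation route guide for drivers",
]
medoids, labels = k_medoids([preprocess(t) for t in titles], k=2)
print(labels)  # [0, 1, 0, 1]
```

Because each medoid is an actual document rather than an averaged centroid, a single outlying document cannot drag the cluster center away from the bulk of its members, which is the robustness property the abstract cites.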

텍스트 마이닝을 활용한 영화흥행 예측 연구 (Study on prediction for a film success using text mining)

  • 이상훈;조장식;강창완;최승배
    • Journal of the Korean Data and Information Science Society / Vol. 26, No. 6 / pp. 1259-1269 / 2015
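  • Big data has recently established itself as a keyword in academia. Its usefulness has spread beyond academia to central government, local governments, and companies, all of which are working to extract useful information from big data. This study performs a big data analysis of movie reviews using text mining. Its purpose is to propose a model for predicting film success through logistic regression, using review data on films from portal site 'D' and the Korean Film Council, together with the mean customer rating (score) and the number of screens (screen number), as explanatory variables and whether a film succeeded at the box office as the dependent variable. The proposed prediction model achieved a correct classification rate of 95.74%.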
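
A logistic regression of the kind described in this abstract can be sketched as below. This is only an illustration under stated assumptions: the six data rows are made up, the two features (mean rating scaled to 0-1 and screen count in thousands) stand in for the paper's explanatory variables, the review-text features are left out, and plain gradient descent is used rather than whatever fitting procedure the authors applied.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (success) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0


# Hypothetical toy data: [mean rating scaled to 0-1, screens in thousands].
X = [[0.9, 0.8], [0.8, 0.9], [0.85, 0.7], [0.4, 0.2], [0.5, 0.1], [0.3, 0.3]]
y = [1, 1, 1, 0, 0, 0]  # 1 = box-office success, 0 = not

w, b = fit_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
# Correct classification rate on the toy training data.
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(accuracy)
```

The correct classification rate reported in the abstract is the same kind of quantity as `accuracy` here: the share of films whose predicted class matches the observed outcome.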

한글 음소 단위 딥러닝 모형을 이용한 감성분석 (Sentiment Analysis Using Deep Learning Model based on Phoneme-level Korean)

  • 이재준;권순범;안성만
    • 한국IT서비스학회지 / Vol. 17, No. 1 / pp. 79-89 / 2018
  • Sentiment analysis is a text mining technique that extracts the feelings of the author of a text such as a movie review. Early research on sentiment analysis identified sentiment using a dictionary of negative and positive words collected in advance. As deep learning research has become more active, sentiment analysis with deep learning models operating on morpheme or word units has been carried out. However, such models have disadvantages: the word dictionary varies by domain, and the number of morphemes or words is relatively larger than the number of phonemes. As a result, the dictionary becomes large and the complexity of the model increases accordingly. We construct a sentiment analysis model using a recurrent neural network by dividing the input data into phoneme-level units, which are smaller than morpheme-level units. To verify performance, we use 30,000 movie reviews from Naver, the largest portal in Korea. A morpheme-level sentiment analysis model is also implemented for comparison. The results show that the phoneme-level sentiment analysis model outperforms the morpheme-level model and, in particular, that the phoneme-level model using LSTM performs better than the one using GRU. Korean text processing based on a phoneme-level model is expected to be applicable to various text mining and language models.
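
One common way to obtain phoneme-level input from Korean text is to decompose each precomposed Hangul syllable into its initial, medial, and final jamo using Unicode arithmetic. The sketch below assumes that this is what "phoneme-level" means here; the abstract does not spell out the exact decomposition, and the function name `to_jamo` is introduced only for illustration.

```python
# Jamo tables in Unicode order for precomposed Hangul syllables (U+AC00..U+D7A3).
CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")          # 19 initial consonants
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")      # 21 vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 finals + none


def to_jamo(text):
    """Split Hangul syllables into jamo; pass other characters through unchanged."""
    units = []
    for ch in text:
        code = ord(ch) - 0xAC00
        if 0 <= code < 19 * 21 * 28:
            units.append(CHOSEONG[code // (21 * 28)])
            units.append(JUNGSEONG[(code % (21 * 28)) // 28])
            final = JONGSEONG[code % 28]
            if final:  # syllables without a final consonant yield two units
                units.append(final)
        else:
            units.append(ch)
    return units


print(to_jamo("한글"))  # ['ㅎ', 'ㅏ', 'ㄴ', 'ㄱ', 'ㅡ', 'ㄹ']
```

With this representation the symbol inventory is bounded by the few dozen jamo plus any non-Hangul characters, which is the small-dictionary advantage the abstract contrasts with morpheme-level and word-level vocabularies.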

사회네트워크분석과 텍스트마이닝을 이용한 배구 경기력 분석 (Performance analysis of volleyball games using the social network and text mining techniques)

  • 강병욱;허만규;최승배
    • Journal of the Korean Data and Information Science Society / Vol. 26, No. 3 / pp. 619-630 / 2015
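  • The purpose of this study is to identify the attack and pass patterns of Korean men's professional volleyball teams using social network analysis and text mining, to extract key keywords related to volleyball performance and evaluate that performance, and thereby to provide basic data for establishing teams' future game strategies. To test for differences in game outcome ('win/loss'), which is the result of the text mining technique, across the group variables derived through social network analysis, the players were regrouped into group '0' (6 players) and group '1' (11 players). Judging by the rankings of degree centrality and betweenness centrality from the social network analysis, group '1' showed better performance than group '0'. A significance test of the 'win/loss' groups generated by text mining against groups '0' and '1' as reconstructed by social network analysis showed a significant difference (p-value: 0.001). In the per-group clustering results, players 'D' and 'E' in group '0' can be said to score accurately through 'set' plays. In group '1', player 'K' often fails when making an 'attack' from a 'dig', while players 'C' and 'P' were found to make accurate 'set' plays.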
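
Degree centrality, one of the two measures this abstract uses to rank players, is the number of distinct neighbors a node has divided by the maximum possible, n - 1. The sketch below treats passes as undirected links; the player labels and the pass list are hypothetical, and betweenness centrality is not covered.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return degree / (n - 1) for every node in an undirected network."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical pass network: each pair is a pass between two players.
passes = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
centrality = degree_centrality(passes)
print(centrality["A"])  # 1.0
```

Player 'A' is linked to all three other players, so the score is the maximum of 1.0; a player linked to only one teammate would score 1/3 in this four-player network.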

Big Data Analysis on the Perception of Home Training According to the Implementation of COVID-19 Social Distancing

  • Hyun-Chang Keum;Kyung-Won Byun
    • International Journal of Internet, Broadcasting and Communication / Vol. 15, No. 3 / pp. 211-218 / 2023
  • With the implementation of COVID-19 social distancing, interest in 'home training' and the number of its users have been increasing rapidly. The purpose of this study is therefore to identify perceptions of 'home training' through big data analysis of social media channels and to provide basic data to the related business sector. Big data were collected from news and social content provided on Naver and Google. Data covering three years from March 22, 2020, when COVID-19 distancing was implemented in Korea, were collected. The collected data comprised 4,000 Naver blog posts, 2,673 news items, 4,000 cafe posts, 3,989 Knowledge iN posts, and 953 news items from the Google channel. TF and TF-IDF were computed through text mining, and semantic network analysis was then conducted on 70 keywords. Big data analysis programs such as Textom and Ucinet were used for the social big data analysis, and NetDraw was used for visualization. In the text mining analysis, 'home training' was the most frequent term by TF, appearing 4,045 times, followed by 'exercise', 'Homt', 'house', 'apparatus', 'recommendation', and 'diet'. By TF-IDF, the main keywords were 'exercise', 'apparatus', 'home', 'house', 'diet', 'recommendation', and 'mat'. Based on these results, 70 high-frequency keywords were extracted, and semantic indicators and centrality were analyzed. Finally, CONCOR analysis grouped the keywords into a 'purchase cluster', an 'equipment cluster', a 'diet cluster', and an 'execute method cluster'. From these four clusters, basic data for the 'home training' business sector were presented, based on consumers' main perceptions of 'home training' and the semantic network analysis.
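
The difference between the TF and TF-IDF rankings reported in this abstract can be illustrated with a small sketch. The weighting used here, corpus-wide term count times log(N / document frequency), is one common textbook definition and is an assumption; the exact formula applied by Textom is not given in the abstract, and the three tokenized documents are made up.

```python
import math
from collections import Counter


def tf_and_tfidf(docs):
    """Return corpus-wide term counts and TF-IDF scores for tokenized documents."""
    n_docs = len(docs)
    tf = Counter()  # total occurrences of each term
    df = Counter()  # number of documents containing each term
    for tokens in docs:
        tf.update(tokens)
        df.update(set(tokens))
    tfidf = {term: tf[term] * math.log(n_docs / df[term]) for term in tf}
    return tf, tfidf


# Hypothetical tokenized posts, for illustration only.
docs = [
    ["home", "training", "exercise"],
    ["home", "training", "mat"],
    ["home", "training", "diet", "exercise"],
]
tf, tfidf = tf_and_tfidf(docs)
print(tf.most_common(2))   # [('home', 3), ('training', 3)]
print(tfidf["home"])       # 0.0
```

A term that occurs in every document has the highest raw count but an IDF of zero, so it drops out of the TF-IDF ranking while rarer terms such as 'mat' move up.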

빅데이터 분석을 이용한 디지털 패션 테크에 대한 인식 연구 (Perceptions and Trends of Digital Fashion Technology - A Big Data Analysis -)

  • 송은영;임호선
    • 한국의류산업학회지 / Vol. 23, No. 3 / pp. 380-389 / 2021
  • This study aimed to reveal perceptions and trends of digital fashion technology through an informational approach. A big data analysis was conducted on text collected from the web between April 2019 and April 2021. Keywords were derived through text mining and network analysis, and the structure of perceptions of digital fashion technology was identified. Using Textom, we collected 8,144 texts after data refinement, conducted frequency-of-occurrence and central component analyses, and visualized the results with word clouds and N-grams. Matrices were also generated from the 70 most frequent words, and a structural equivalence analysis was performed. The results were presented with network visualizations and dendrograms. Fashion, digital, and technology were the most frequently mentioned topics, and the frequencies of platform, digital transformation, and startups were also high. Through clustering, four clusters of marketing were formed using fashion, digital technology, startups, and augmented reality/virtual reality technology. Future research on startups and smart factories with technologies based on stable platforms is needed. The results of this study contribute to increasing the fashion industry's knowledge of digital fashion technology and can serve as a foundational study for further research on related topics.

논문 서지정보를 이용한 빈산소수괴 연구 분야의 연구용어 빈도분석 (Frequency Analysis of Scientific Texts on the Hypoxia Using Bibliographic Data)

  • 이기섭;이지영;조홍연
    • Ocean and Polar Research / Vol. 41, No. 2 / pp. 107-120 / 2019
  • Frequency analysis of scientific terms using bibliographic information is a simple concept, but as relevant data become more widespread, manual analysis of all data is practically impossible or possible only to a very limited extent. In addition, as oceanographic research has expanded in scale and become much more comprehensive, the allocation of research resources across topics has become an important issue. In this study, a frequency analysis of scientific terms was performed using text mining. The data, totaling 2,878 articles, were drawn from a general-purpose scholarly database. Hypoxia, an important issue in the marine environment, was selected as the research field, and the frequencies of related words were analyzed. The most frequently used words were 'Organic matter', 'Bottom water', and 'Dead zone', and specific areas showed high frequency. The results can serve as a basis for allocating research resources according to the frequency of use of related terms in specific fields when planning a large research project represented by a single word.

A Big Data Study on Viewers' Response and Success Factors in the D2C Era Focused on tvN's Web-real Variety 'SinSeoYuGi' and Naver TV Cast Programming

  • Oh, Sejong;Ahn, Sunghun;Byun, Jungmin
    • International Journal of Advanced Culture Technology / Vol. 4, No. 2 / pp. 7-18 / 2016
  • The first D2C-era web-real variety show in Korea was broadcast via tvN of CJ E&M. The web-real variety program 'SinSeoYuGi' accumulated 54 million views, along with 50 million views on the Chinese portal site QQ. This study carries out a text mining analysis that extracts portal site blog posts, Twitter page views, and associative terms. In addition, it derives viewers' responses by extracting keywords with opinion mining techniques that separate positive, neutral, and negative words through customer sentiment analysis. The success factors of the web-real variety show were found to be reduced appearance fees and production costs, harmony between the actual cast members and the scenario characters, mobile TV programming, and pre-roll advertising. Web-real variety broadcasting is expected to increase in value as web content and to become established as a new genre, with the job of 'technical marketer' growing as well.

Data mining and Copyright

  • Kim, Kyungsuk
    • International Journal of Internet, Broadcasting and Communication / Vol. 14, No. 4 / pp. 11-19 / 2022
  • Data mining has broad applications that reach beyond scholarly and scientific research; internet search engine services, for example, are a commonly used form of text and data mining ('TDM') of websites. Exceptions and limitations for data mining provide a competitive advantage in the global race for policy innovation because they permit researchers to conduct computational analysis (TDM) on any materials to which they have access. For this purpose, Japan and the EU amended their copyright laws to add limitations on copyright that legalize some TDM research, and U.S. copyright law has allowed data mining under the fair use provision. By contrast, the Korean Copyright Act contains no explicit exceptions or limitations for data mining, and no cases have considered whether data mining is fair use. We comparatively review exceptions and limitations on copyright, which will help encourage AI-related business by allowing more data to be used smoothly through the mining process and more valuable information to be extracted.