• Title/Summary/Keyword: 텍스트 연구 (text research)

Keyword Data Analysis Using Bayesian Conjugate Prior Distribution (베이지안 공액 사전분포를 이용한 키워드 데이터 분석)

  • Jun, Sunghae
    • The Journal of the Korea Contents Association / v.20 no.6 / pp.1-8 / 2020
  • The use of text data in big data analytics has increased, and with it research on methods for text data analysis. In this paper, we study Bayesian learning based on conjugate priors for analyzing keyword data extracted from large text collections. Bayesian statistics provides a learning process that updates parameters when new data is added to existing data. This is efficient in a big data environment, where large amounts of data are created and added over time. To show the performance and applicability of the proposed method, we carry out a case study analyzing keyword data from real patent documents.
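The updating process described in this abstract can be illustrated with a minimal sketch. The abstract does not say which conjugate pair the paper uses, so the sketch assumes a Gamma prior on a Poisson rate for per-document keyword counts; the class name `GammaPosterior` and the toy counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GammaPosterior:
    """Gamma(shape, rate) belief about a Poisson keyword rate (assumed model)."""
    shape: float
    rate: float

    def update(self, counts: Iterable[int]) -> "GammaPosterior":
        # Conjugate update: add the observed total to the shape
        # and the number of observations to the rate.
        counts = list(counts)
        return GammaPosterior(self.shape + sum(counts), self.rate + len(counts))

    @property
    def mean(self) -> float:
        return self.shape / self.rate


# Hypothetical keyword counts per patent document, arriving in two batches.
prior = GammaPosterior(shape=1.0, rate=1.0)
batch1 = [2, 0, 3, 1]
batch2 = [4, 2]

post1 = prior.update(batch1)      # posterior after the first batch
post2 = post1.update(batch2)      # the old posterior serves as the new prior
all_at_once = prior.update(batch1 + batch2)
```

Under these assumptions, updating batch by batch gives the same posterior as processing all the data at once, which is why the approach suits data that accumulates over time.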

Comparison and Analysis of Domestic and Foreign Sports Brands Using Text Mining and Opinion Mining Analysis (텍스트 마이닝과 오피니언 마이닝 분석을 활용한 국내외 스포츠용품 브랜드 비교·분석 연구)

  • Kim, Jae-Hwan;Lee, Jae-Moon
    • The Journal of the Korea Contents Association / v.18 no.6 / pp.217-234 / 2018
  • In this study, big data analysis was conducted on domestic and international sports goods brands. Text mining, TF-IDF, opinion mining, and interestity graph analyses were conducted using the social matrix program Textom and the fashion data analysis platform MISP. To examine recent perceptions of sports brands, the study period was limited to one year, from January 1, 2017 to December 31, 2017. First, the analysis identified the products that represent each brand. Second, it identified the marketing that represents each brand. Third, it identified the words common to the brands. Fourth, it identified the positive and negative sentiment toward each brand.

Analysis of Real Estate Market Trend Using Text Mining and Big Data (빅데이터와 텍스트마이닝을 이용한 부동산시장 동향분석)

  • Chun, Hae-Jung
    • Journal of Digital Convergence / v.17 no.4 / pp.49-55 / 2019
  • This study examines trends in the real estate market using text mining and big data. The data were collected from internet news posted on Naver from August 2016 to August 2017. In the TF-IDF analysis, the highest-ranked words were, in order, housing, sale, household, real estate market, and region. Many policy-related words, such as loan, government, countermeasures, and regulations, were extracted, and among region-related words Seoul appeared most frequently. Among combinations of region-related words, 'Seoul - Gangnam', 'Seoul - Metropolitan area', 'Gangnam - reconstruction', and 'Seoul - reconstruction' appeared frequently. This suggests that public interest in and expectations for the reconstruction of the Gangnam area are high.
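As an illustration of the TF-IDF weighting mentioned in this abstract, here is a minimal sketch. The abstract does not specify the variant used, so this assumes length-normalized term frequency and an unsmoothed `log(N / df)` inverse document frequency; the function `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    scores = []
    for doc in docs:
        term_counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores


# Hypothetical tokenized news articles.
docs = [
    ["seoul", "gangnam", "reconstruction"],
    ["seoul", "housing", "loan"],
    ["seoul", "gangnam", "housing"],
]
scores = tf_idf(docs)
```

With this unsmoothed variant, a term that occurs in every document (here `seoul`) receives a weight of zero, so tools that report such terms as highly ranked are likely using a smoothed IDF or raw frequency alongside TF-IDF.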

Analysis of VR Game Trends using Text Mining and Word Cloud -Focusing on STEAM review data- (텍스트마이닝과 워드 클라우드를 활용한 VR 게임 트렌드 분석 -스팀(steam) 리뷰 데이터를 중심으로-)

  • Na, Ji Young
    • Journal of Korea Game Society / v.22 no.1 / pp.87-98 / 2022
  • With the development of technologies related to the fourth industrial revolution and increased demand for non-face-to-face services, VR games are attracting attention. This study collected VR game review data from the online game platform STEAM and analyzed chronological trends using text mining and word cloud analysis. The results show that experience and perceived cost were the major trends from 2016 to 2017, increased demand for FPS and rhythm games from 2018 to 2019, and story and immersion from 2020 to 2021. The study aims to help expand the VR game user base by identifying the keywords that VR users are interested in for each period.

A Study on the Analysis of Accident Types in Public and Private Construction Using Web Scraping and Text Mining (웹 스크래핑과 텍스트마이닝을 이용한 공공 및 민간공사의 사고유형 분석)

  • Yoon, Younggeun;Oh, Taekeun
    • The Journal of the Convergence on Culture Technology / v.8 no.5 / pp.729-734 / 2022
  • Various studies using accident cases are being conducted to identify the causes of accidents in the construction industry, but studies on the differences between public and private construction are scarce. In this study, web scraping and text mining were applied to analyze the causes of accidents by order type (public or private). Statistical analysis and word cloud analysis of more than 10,000 collected structured and unstructured records confirmed that the types and causes of accidents differ between public and private construction. In addition, identifying the correlations between major accident causes can contribute to the establishment of future safety management measures.

Box Office Hit Prediction Using Data mining and Text mining (데이터마이닝과 텍스트마이닝을 활용한 영화 흥행 예측)

  • Jo, Hyo-jung
    • Annual Conference of KIPS / 2021.05a / pp.316-318 / 2021
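  • Whether a film becomes a hit has a major effect on its revenue. As the film industry has grown, the factors behind box office success have become something many production companies and investors must consider, and many models for predicting box office success have been studied. The purpose of this study is to build a box office prediction model that uses online word-of-mouth variables together with intrinsic attributes that prior research found to significantly affect box office success, such as the number of screens, the director, and the production company. Instead of variables that measure the volume of online word of mouth, such as the number of articles or blog posts, audience reviews from the first week after release are text-mined and scored according to the proportion of positive reviews among all reviews, and this score is used as an independent variable. These independent variables are then fed into models built with data mining techniques to predict box office success. Finally, decision tree and logistic regression analyses were run to find the independent variables that affect box office success and to evaluate model performance. In the logistic regression, audience count and rating were selected as variables with an especially significant effect on box office success, and reviews were also selected as a significant variable. The resulting model showed a high accuracy of about 90%. In the decision tree, audience count was selected as the most important variable.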
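The review-scoring step described in this abstract can be sketched as follows. The abstract only states that reviews are scored by the proportion of positive reviews; the sentiment labels are assumed to come from an upstream classifier, and the five-level equal-width binning in `review_score` is a hypothetical choice, not the paper's stated rule.

```python
from typing import Sequence


def positive_ratio(labels: Sequence[str]) -> float:
    """Fraction of first-week reviews labeled 'pos' (labels assumed given)."""
    if not labels:
        raise ValueError("at least one review is required")
    return sum(label == "pos" for label in labels) / len(labels)


def review_score(ratio: float, levels: int = 5) -> int:
    """Map a ratio in [0, 1] to an integer score 1..levels (assumed binning)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    # Equal-width bins; a ratio of exactly 1.0 falls into the top bin.
    return min(levels, int(ratio * levels) + 1)


# Hypothetical sentiment labels for one film's first-week reviews.
labels = ["pos", "pos", "neg", "pos"]
ratio = positive_ratio(labels)
score = review_score(ratio)   # this value would be one independent variable
```

The resulting score would then sit alongside attributes such as screen count as an input to a decision tree or logistic regression model.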

Text-based Password Guessing Research Trend using Recurrent Neural Networks (순환 신경망을 사용한 텍스트 기반 패스워드 예측 연구 동향)

  • Lim, Se-Jin;Kim, Hyun-Ji;Kang, Yea-Jun;Kim, Won-Woong;Oh, Yu-Jin;Seo, Hwa-Jeong
    • Annual Conference of KIPS / 2022.11a / pp.473-474 / 2022
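  • Text-based passwords are the most widely used means of authentication across many fields. However, because such passwords depend on the user's memory, people generally choose easy-to-remember passwords such as '!iloveY0u'. This creates regularities across users' passwords, which can then be cracked by tools such as HashCat. Password guessing with deep learning is being actively studied because, unlike conventional password cracking tools, it can extract and learn patterns without prior or expert knowledge of password structure and properties. This paper surveys research trends in guessing text-based passwords with recurrent neural networks, one class of deep learning models.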

Analyzing OTT Interactive Content Using Text Mining Method (텍스트 마이닝으로 OTT 인터랙티브 콘텐츠 다시보기)

  • Sukchang Lee
    • The Journal of the Convergence on Culture Technology / v.9 no.5 / pp.859-865 / 2023
  • In a situation where service providers are increasingly focusing on content development due to the intense competition in the OTT market, interactive content that encourages active participation from viewers is garnering significant attention. In response to this trend, research on interactive content is being conducted more actively. This study aims to analyze interactive content through text mining techniques, with a specific focus on online unstructured data. The analysis includes deriving the characteristics of keywords according to their weight, examining the relationship between OTT platforms and interactive content, and tracking changes in the trends of interactive content based on objective data. To conduct this analysis, detailed techniques such as 'Word Cloud', 'Relationship Analysis', and 'Keyword Trend' are used, and the study also aims to derive meaningful implications from these analyses.

Sensibility by Weather and e-Commerce Purchase Behavior

  • Hyun-Jin Yeo
    • Journal of the Korea Society of Computer and Information / v.29 no.4 / pp.177-182 / 2024
  • A consumer's decisions are driven by affect toward a product. Affect takes several forms: evaluation, mood, emotion, and sensibility, the last meaning unconscious change. Previous research has shown that weather factors affect sensibility, which implies that weather factors may have a causal effect on consumer decision making. This research uses weather information from the KMA (Korea Meteorological Administration) together with SNS geographical information and text to build a weather sensibility model, and shows that the model is associated with a significant change in online shop customers' purchase behavior (purchase frequency). To apply the weather model, customers' address information is merged with the model's geometric information. As a result, a model that uses daily precipitation, sunshine hours, average ground temperature, and average relative humidity yields a significant result for e-commerce purchase frequency.

Enhancing Multimodal Emotion Recognition in Speech and Text with Integrated CNN, LSTM, and BERT Models (통합 CNN, LSTM, 및 BERT 모델 기반의 음성 및 텍스트 다중 모달 감정 인식 연구)

  • Edward Dwijayanto Cahyadi;Hans Nathaniel Hadi Soesilo;Mi-Hwa Song
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.617-623 / 2024
  • Identifying emotions through speech poses a significant challenge due to the complex relationship between language and emotions. Our paper takes on this challenge by employing feature engineering to identify emotions in speech through a multimodal classification task involving both speech and text data. We evaluated two classifiers, Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), both integrated with a BERT-based pre-trained model. Our assessment covers several performance metrics (accuracy, F-score, precision, and recall) across different experimental setups. The findings show that both models discern emotions accurately from text and speech data.
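The four metrics named in this abstract can be computed as in the following minimal sketch. It assumes per-class precision, recall, and F1 with overall accuracy; the abstract does not say how the paper averages across classes, and the function `per_class_metrics` and the toy labels are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def per_class_metrics(
    y_true: Sequence[str], y_pred: Sequence[str]
) -> Tuple[float, Dict[str, Dict[str, float]]]:
    """Return overall accuracy and per-class precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (zero denominators) are reported as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return accuracy, report


# Hypothetical emotion labels and model predictions.
y_true = ["happy", "sad", "angry", "happy", "sad"]
y_pred = ["happy", "sad", "happy", "happy", "angry"]
accuracy, report = per_class_metrics(y_true, y_pred)
```

Reporting per-class values alongside accuracy matters for emotion data, where a class that is never predicted correctly (here `angry`) can be hidden by a reasonable overall accuracy.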