• Title/Summary/Keyword: online mining (온라인 마이닝)

Unstructured Data Processing Using Keyword-Based Topic-Oriented Analysis (키워드 기반 주제중심 분석을 이용한 비정형데이터 처리)

  • Ko, Myung-Sook
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.521-526 / 2017
  • Big data comes in diverse formats and vast volumes and is generated very quickly, so it requires new management and analysis methods rather than traditional data processing. Text mining techniques can extract useful information from unstructured, human-language text in online documents and on social networks. Identifying trends in the political, economic, and cultural messages left on social media helps reveal which topics people are interested in. In this study, text mining was performed on online news related to a given keyword using a topic-oriented analysis technique. We use Latent Dirichlet Allocation (LDA) to extract information from web documents and analyze which subjects attract interest for a given keyword, and which topics relate to which core values.

A Study on Political Attitude Estimation of Korean OSN Users (온라인 소셜네트워크를 통한 한국인의 정치성향 예측 기법의 연구)

  • Wijaya, Muhammad Eka;Ahn, Heejune
    • Journal of Korea Society of Industrial Information Systems / v.21 no.4 / pp.1-11 / 2016
  • Recently, numerous studies have been conducted to estimate human personality from online social activities. This paper develops a comprehensive model for political attitude estimation that leverages users' Facebook Like information. We designed a Facebook crawler that collects data efficiently, overcoming the difficulties of crawling Ajax-enabled Facebook pages. We show that category-level selection can reduce the complexity of data analysis by exploiting the sparsity of the huge like-attitude matrix. For Korean Facebook users, only 28 criteria (3% of the total) suffice to estimate a user's political polarity with high accuracy (AUC of 0.82).
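
For readers unfamiliar with the AUC figure quoted in the abstract above, the following is a minimal sketch of how AUC can be computed for any binary classifier. It is not the authors' evaluation code: the function name `auc` and the labels and scores are made-up illustrative values, not data from the paper.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Ties between a positive and a negative count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values only: 1 = one polarity, 0 = the other.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a sense of where the reported 0.82 sits.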

A Study of an Efficient Retrieval System Algorithm using a Text Mining (텍스트마이닝 기술을 이용한 효율적인 검색시스템 알고리즘에 대한 연구)

  • Kim, Je-Seok;Kim, Jang-Hyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.531-534 / 2005
  • Expanding network coverage and upgrading hardware are the usual responses to network traffic and server processing speed, but they raise problems of their own, as do limited network resources and the growth of online information, which is exceeding the operating limits of existing information systems. This study proposes an architecture, an organically unified system of content optimized for retrieval, that adapts to users' varying points of view and to changes in the content of a document collection. It is based on a study of an algorithm that makes it easy to locate documents within a large volume of online data.

Online Reputation Analysis of Dietary Supplements based on Sentiment Analysis (감성 분석을 이용한 다이어트 보조 식품에 대한 온라인 평판분석)

  • Lee, So-Hee;Lee, Jin-Yeong;Kim, Hyon Hee
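    • Proceedings of the Korea Information Processing Society Conference / 2018.05a / pp.306-308 / 2018
  • To prevent the indiscriminate overuse of diet foods for weight loss and to provide information about dietary supplements, this study analyzed online reviews of dietary supplements using sentiment analysis. First, dietary supplements were classified into four types according to their characteristics, and positive and negative scores were calculated for each category. For this purpose, a sentiment lexicon for weight loss was built by text mining reviews of diet foods. In particular, to give greater weight to the negative scores of foods with side effects, the adverse-reaction terms defined in WHO-ART were weighted. The analysis showed that the protein supplement group had the highest positive score, which is interpreted as reflecting its long-standing use by people who exercise professionally, beyond dieting purposes. The appetite suppressant group had the lowest positive score and the highest negative score, which is presumed to be largely due to phentermine (펜타민), the main ingredient of appetite suppressants.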
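
The scoring scheme described in the abstract above (lexicon-based positive and negative scores, with extra weight on adverse-reaction terms) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the lexicon entries, the `ADVERSE_WEIGHT` value, and the function names are all hypothetical, and English tokens stand in for the Korean review text.

```python
# Hypothetical lexicon entries; the paper's actual lexicon is not reproduced here.
POSITIVE = {"effective", "satisfied", "recommend"}
NEGATIVE = {"expensive", "useless", "nausea", "insomnia"}
# Stand-ins for adverse-reaction terms of the kind defined in WHO-ART.
ADVERSE_TERMS = {"nausea", "insomnia"}
# Assumed multiplier; the abstract does not state the weight used.
ADVERSE_WEIGHT = 2.0


def score_review(tokens):
    """Return (positive_score, negative_score) for one tokenized review."""
    pos = 0.0
    neg = 0.0
    for token in tokens:
        if token in POSITIVE:
            pos += 1.0
        if token in NEGATIVE:
            # Adverse-reaction terms count more heavily toward the negative score.
            neg += ADVERSE_WEIGHT if token in ADVERSE_TERMS else 1.0
    return pos, neg


def score_category(reviews):
    """Average the per-review scores over all reviews in one product category."""
    if not reviews:
        raise ValueError("category has no reviews")
    totals = [score_review(tokens) for tokens in reviews]
    mean_pos = sum(p for p, _ in totals) / len(totals)
    mean_neg = sum(n for _, n in totals) / len(totals)
    return mean_pos, mean_neg


category_reviews = [
    ["effective", "nausea"],
    ["recommend", "satisfied"],
]
print(score_category(category_reviews))
```

Averaging per review is one possible aggregation; the abstract does not say whether the authors averaged, summed, or normalized scores by review length.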

A Study on the Failure Experiences of Online Fashion Shopping Mall Startups -Applying Text Mining and Grounded Theory- (온라인 패션 쇼핑몰 창업의 실패 경험에 관한 연구 -텍스트 마이닝과 근거이론을 적용하여-)

  • Min Jeong Seo
    • Journal of the Korean Society of Clothing and Textiles / v.47 no.6 / pp.1096-1112 / 2023
  • Among entrepreneurs who have launched online fashion shopping malls, more have failed than succeeded. Recognizing the importance of research that reflects reality, this study explores entrepreneurs' experiences during the failure of online fashion shopping malls. Two studies used YouTube videos documenting such failures. Study 1 employed text mining techniques, including high-frequency analysis and topic modeling, while Study 2 used a qualitative research method, specifically grounded theory. Study 1 identified the prominent experiences of operating online fashion shopping malls, while Study 2 provided a holistic perspective on the failure process. The integrated findings show that entrepreneurs' passion for fashion motivates them to establish online fashion shopping malls, yet they encounter numerous challenges in running them. Insufficient business preparation and operational capability contribute to their failure to achieve financial goals. Despite efforts to boost sales and profit, entrepreneurs often close their businesses because of inadequate funds and waning motivation. The outcomes of this study shed light on the operational challenges faced by online fashion shopping malls and offer insights for developing new strategies to sustain and improve them.

Design of Contents Curation System Based on Incremental Learning Technology for Big Data Mining (빅데이터 마이닝을 위한 점진적 학습 기반 콘텐츠 큐레이션 시스템 설계)

  • Min, Byung-Won
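    • Proceedings of the Korea Contents Association Conference / 2017.05a / pp.421-422 / 2017
  • Content curation services need an "incremental learning model for big data mining, as part of dynamic learning pipeline generation technology for large-scale document learning" to solve problems such as memory shortage and long training time that arise when learning from large volumes of data. The content curation service proposed in this paper collects, organizes, and edits related content from the vast amount of content online according to an individual's views or perspective, and provides content that is relevant to the user or that the user is likely to enjoy. Curation services can be applied in service industries where computers learn from massive amounts of data to handle human tasks or support decision-making, thereby improving work efficiency, in areas such as personal assistants, financial investment, autonomous driving, journalism, efficient work direction and supervision, automated manufacturing processes, education, content distribution, and academic information.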

The way to measure trust ratio of text in Opinion Mining (오피니언 마이닝에서의 텍스트 신뢰도 측정 방법)

  • Kim, Iee-Joon;Lim, Ji-Yeon
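    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.135-138 / 2011
  • In the information age, competing for and securing information has become the most important factor, one that can bear directly on survival. In this rapidly developing online environment, people who share information have a duty to share high-quality information. In reality, many people also consult other people's information on the web to gain confidence in their own thinking and to help them make decisions. Because not all of the countless pieces of information and opinions overflowing the web can be trusted, quantifying an author's trustworthiness to some degree could prevent in advance the harm caused by intentional opinion manipulation by particular authors. This paper proposes a method that applies opinion mining to a particular author's posts, analyzes them by category, and assigns a trust score.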

An algorithm for mining the reputation of a product based on big data analytics (빅데이터 분석 기반의 제품 평판 마이닝 알고리즘)

  • Park, Sang-Min;Park, Sae-Bit;On, Byung-Won
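    • Proceedings of the Korea Information Processing Society Conference / 2016.04a / pp.420-423 / 2016
  • Big data analysis techniques have recently come into wide use in opinion polling. To gauge preferences for a newly released product, companies need a way to grasp public taste accurately by collecting and analyzing the many kinds of data that exist online, rather than simply aggregating traditional surveys or expert opinions. This study proposes a text mining approach that automatically discovers a product's reputation from big data, evaluates the efficiency of the proposed approach using the Sonata automobile as a case, and analyzes the experimental results in detail.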

A Study on the Influence of Sentiment and Emotion on Review Helpfulness through Online Reviews of Restaurants (레스토랑의 온라인 리뷰를 통해 감성과 감정이 리뷰 유용성에 미치는 영향에 관한 연구)

  • Yao, Ziyan;Park, Jiyoung;Hong, Taeho
    • Knowledge Management Research / v.22 no.1 / pp.243-267 / 2021
  • Sentiment represents one's own state through the process of change in response to a stimulus, and emotion represents a simple psychological state felt toward a certain phenomenon. The two terms tend to be used interchangeably, but their meanings and usage differ. In this study, we classify the sentiment and emotion in online reviews, written by consumers after purchasing and using various products and services, and examine how they affect review helpfulness. Online reviews have recently become a very important factor for businesses and consumers. Helpful reviews play a key role in the decision-making of potential customers and can be assessed through review helpfulness. Review helpfulness is becoming increasingly important in practice because it informs business marketing strategies as well as consumers' purchasing decisions. Academically, research into the factors that influence review helpfulness is also growing in importance. We collected restaurant reviews from Yelp.com and studied how the sentiment and emotion of online reviews affect their helpfulness. Based on prior research, we built a research model that includes the sentiment and emotions of online reviews, used text mining to analyze how they affect review helpfulness, and verified the differences in effect among emotions. The results showed that negative sentiment and emotion had a greater effect on review helpfulness, consistent with negativity bias theory.

The Detection of Online Manipulated Reviews Using Machine Learning and GPT-3 (기계학습과 GPT3를 시용한 조작된 리뷰의 탐지)

  • Chernyaeva, Olga;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.347-364 / 2022
  • Fraudulent companies or sellers strategically manipulate reviews to influence customers' purchase decisions, so the reliability of reviews has become crucial for customer decision-making. Because customers increasingly rely on online reviews for detailed information about products or services before purchasing, many researchers focus on detecting manipulated reviews. The main obstacle is the difficulty of obtaining enough manipulated reviews to train machine learning models. In addition, manipulated reviews are far fewer than non-manipulated ones, so a class imbalance problem arises. The class with fewer examples is under-represented, which can hamper a model's accuracy, so solving the class imbalance problem is important for building an accurate detector of manipulated reviews. We therefore propose an OpenAI-based review generation model to address the imbalance and thereby improve the accuracy of manipulated review detection. In this research, we applied GPT-3, a novel autoregressive language model, to generate reviews based on manipulated reviews. We found that using GPT-3 to oversample manipulated reviews recovers a satisfactory portion of the performance loss and yields better classification performance (logit, decision tree, neural networks) than traditional oversampling methods such as random oversampling and SMOTE.
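
To make the class imbalance discussion in the abstract above concrete, here is a minimal sketch of random oversampling, one of the traditional baselines the abstract names. It is not the paper's GPT-3 approach, which generates new review text instead of duplicating existing examples. The function name `random_oversample` and the sample data are hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw the missing examples with replacement from the existing ones.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Illustrative data: label 1 = manipulated review, 0 = genuine review.
reviews = ["good food", "slow service", "nice place", "fair price", "BEST EVER BUY NOW"]
labels = [0, 0, 0, 0, 1]
balanced_reviews, balanced_labels = random_oversample(reviews, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Because random oversampling only repeats existing minority examples, it adds no new information, which is the limitation that generated or synthetic examples (GPT-3 in the paper, or SMOTE in feature space) aim to address.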