• Title/Summary/Keyword: Review data

Changes in Review Length Based on the Popularity of Movies Using Big Data (빅데이터를 활용한 영화 흥행에 따른 리뷰길이 변화)

  • Cho, Yonghee;Park, Yiseul;Kim, Hea-Jin
    • The Journal of the Korea Contents Association / v.18 no.5 / pp.367-375 / 2018
  • The study aims to determine which group leaves longer (more active) online reviews (comments) on a film by separating viewers into two groups: those satisfied with the movie and those dissatisfied with it. The data used were rating scores and reviews (comments) from the Naver Movie API and break-even point data provided by the Korea Film Commission. We analyzed the relationship between movie rating and review length before and after the movie's opening, the characteristics of review length according to box-office performance, and whether the movie rating affects review length.
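
The group comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the 10-point rating scale, and the cut-off of 6 that separates "satisfied" from "dissatisfied" viewers are all assumptions made for illustration, and review length is measured simply as the number of characters.

```python
from statistics import mean

def mean_length_by_group(reviews, threshold=6):
    """Average review length (in characters) for two rating groups.

    reviews: list of (rating, text) pairs.
    threshold: hypothetical cut-off; ratings at or above it count as "satisfied".
    """
    satisfied = [len(text) for rating, text in reviews if rating >= threshold]
    dissatisfied = [len(text) for rating, text in reviews if rating < threshold]
    return {
        "satisfied": mean(satisfied) if satisfied else 0.0,
        "dissatisfied": mean(dissatisfied) if dissatisfied else 0.0,
    }

# Made-up sample data, not taken from the study.
sample = [(9, "aaaa"), (10, "aa"), (2, "aaaaaa")]
print(mean_length_by_group(sample))
```

A real analysis would also need to control for the timing of each review (before or after release), which this sketch ignores.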

Fine-tuning BERT-based NLP Models for Sentiment Analysis of Korean Reviews: Optimizing the sequence length (BERT 기반 자연어처리 모델의 미세 조정을 통한 한국어 리뷰 감성 분석: 입력 시퀀스 길이 최적화)

  • Sunga Hwang;Seyeon Park;Beakcheol Jang
    • Journal of Internet Computing and Services / v.25 no.4 / pp.47-56 / 2024
  • This paper proposes a method for fine-tuning BERT-based natural language processing models to perform sentiment analysis on Korean review data. By varying the input sequence length during fine-tuning and comparing performance, we aim to identify the input sequence length that gives the best performance. For this purpose, text review data were collected by web scraping from the clothing shopping platform M. During the data preprocessing stage, positive and negative satisfaction scores were recalibrated to improve the accuracy of the analysis. Specifically, the GPT-4 API was used to reset the labels to reflect the actual sentiment of the review texts, and data imbalance was addressed by adjusting the data to a 6:4 ratio. The reviews on the clothing shopping platform averaged about 12 tokens in length. To find the model best suited to such short texts, five BERT-based pre-trained models were compared in the modeling stage, focusing on input sequence length and memory usage. The experimental results indicated that an input sequence length of 64 generally offered the most appropriate trade-off between performance and memory usage. In particular, the KcELECTRA model showed optimal performance and memory usage at an input sequence length of 64, achieving higher than 92% accuracy and reliability in sentiment analysis of Korean review data. Furthermore, by utilizing BERTopic, we provide a Korean review sentiment analysis process that classifies new incoming review data by category and extracts sentiment scores for each category using the final constructed model.
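
To make the role of the input sequence length concrete, here is a small library-free Python sketch of what fixing that length to 64 does to tokenized reviews. It is an illustration only, not the paper's pipeline: the helper names and the padding id of 0 are assumptions, and real BERT-style tokenizers also add special tokens, which are omitted here.

```python
def pad_or_truncate(token_ids, max_len=64, pad_id=0):
    """Return token_ids cut or right-padded to exactly max_len entries."""
    clipped = token_ids[:max_len]
    return clipped + [pad_id] * (max_len - len(clipped))

def truncation_rate(sequences, max_len=64):
    """Fraction of sequences that would lose tokens at this max_len."""
    if not sequences:
        return 0.0
    return sum(len(seq) > max_len for seq in sequences) / len(sequences)

# Hypothetical token-id sequences: one short review and one very long one.
reviews = [[101, 2054, 2003], list(range(70))]
batch = [pad_or_truncate(seq) for seq in reviews]
print(len(batch[0]), len(batch[1]), truncation_rate(reviews))
```

With reviews averaging about 12 tokens, most of a length-64 input is padding, so a longer length mainly costs memory, while a much shorter one starts truncating the longer reviews.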

FEROM: Feature Extraction and Refinement for Opinion Mining

  • Jeong, Ha-Na;Shin, Dong-Wook;Choi, Joong-Min
    • ETRI Journal / v.33 no.5 / pp.720-730 / 2011
  • Opinion mining involves the analysis of customer opinions using product reviews and provides meaningful information, including the polarity of the opinions. In opinion mining, feature extraction is important because customers normally express their opinions not about a product as a whole but separately for each of its individual features. However, previous research on feature-based opinion mining has not produced good results due to drawbacks such as selecting a feature by considering only syntactic information or treating features with similar meanings as different. To solve these problems, this paper proposes an enhanced feature extraction and refinement method called FEROM, which extracts correct features from review data by exploiting both the grammatical properties and the semantic characteristics of feature words, and refines the features by recognizing and merging similar ones. A series of experiments performed on actual online review data demonstrated that FEROM is highly effective at extracting and refining features for analyzing customer review data and thereby contributes to accurate and functional opinion mining.

Editorial for Vol. 30, Issue 1 (편집자 주 - 30권 1호)

  • Kim, Young Hyo
    • Korean journal of aerospace and environmental medicine / v.30 no.1 / pp.1-2 / 2020
  • In commemoration of Vol. 30, Issue 1, our journal has prepared four review articles and two original papers. The first review article provides guidelines for the medical treatment of emergencies in an aircraft during flight. These guidelines address the resources and medical equipment available to physicians on board, common medical conditions and how to deal with them, and legal issues. The second review article covers historically meaningful animals that have contributed to aerospace research and the role of the veterinarian. The third describes the cardiovascular, musculoskeletal, and vestibular effects of microgravity on the human body. As we are about to enter an aging society, the fourth review article introduces guidelines for safe overseas travel for senior passengers. The role of the aviation medical examiner is to maintain aircrew health and to help crew members have long and healthy careers. In this regard, Choi et al. analyzed the physical examination data and sick leave data of an airline. Han et al. investigated the aerospace medical examination data of the Republic of Korea and suggested a solution to some common health problems of the crew.

Semantic analysis via application of deep learning using Naver movie review data (네이버 영화 리뷰 데이터를 이용한 의미 분석(semantic analysis))

  • Kim, Sojin;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.19-33 / 2022
  • With the explosive growth of social media, the abundant text-based data generated by web users has become an important source for data analysis. For example, online movie reviews on 'Naver Movie' often influence whether members of the general public decide to watch a movie. This study analyzed Naver Movie's text-based review data to predict the actual ratings. After examining the distribution of movie ratings, we performed semantic analysis using Korean natural language processing. This research sought to find the best review rating prediction model by comparing machine learning and deep learning models. We also compared various regression and classification models in 2-class and multi-class cases. Lastly, we explained the causes of review misclassification in terms of the characteristics of movie review data.

Research on the Influencing Factors of the Usefulness of the Online Review and Products Sales : Based on Chinese Online Shopping Platform Data (온라인 리뷰 유용성과 상품매출에 영향을 주는 요인 : 중국 온라인 쇼핑 플랫폼 데이터를 기반으로)

  • Hwang, Chim;Kwon, Young-Jin;Lee, Sang-Yong Tom
    • Journal of Information Technology Applications and Management / v.25 no.2 / pp.53-72 / 2018
  • This empirical study explored the characteristics that affect the usefulness of online reviews on a Chinese e-commerce platform and used multiple regressions to find the factors that ultimately have a significant influence on product sales. Prior studies have examined which factors affect either the usefulness of online reviews or product sales, but only separately. The contribution of our study is that we built two-level regression models, which allowed us to analyze these two targets together. Before running the regressions, we collected 192,764 online reviews for 200 products from Jingdong, the second-biggest e-commerce platform in China. We also derived a "review sentiment score" variable from each review and used it as a core variable in our regression model, which allowed us to combine quantitative and qualitative research. The evidence from the two-level regression models showed that the extent to which a product is an experience good positively affects both the usefulness of a review and product sales, and that the usefulness of a review in turn contributes to product sales. The experience-good property also has an interaction effect at both levels of the regression models. Our main findings highlight the importance of online reviews to the business performance of e-commerce firms.

Analysis of Influencing Factors on the Outpatient Prescription of Antipsychotic Drugs in the Elderly Patients (노인환자의 항정신병 약물 원외처방 내역에 미친 영향 요인 분석)

  • Dong, Jae Yong;Lee, Hyun Ji;Lee, Tae Hoon;Kim, Yujeong
    • Korean Journal of Clinical Pharmacy / v.31 no.4 / pp.268-277 / 2021
  • Background: Studies of antipsychotic drugs have mainly addressed side effects, randomized clinical trials, utilization rates, and trends, but few have examined the factors that influence prescription in elderly patients. The purpose of this study was to analyze the factors influencing the outpatient prescription of antipsychotic drugs in elderly patients. Methods: Active ingredients of antipsychotic drugs in Korea were selected according to the Korean Pharmaceutical Information Center (KPIC) classification. The data source was the 2020 claims data of the Korean Health Insurance Review and Assessment Service (HIRA), and the target group was elderly patients. We extracted patients who had been prescribed one or more antipsychotic drugs and had visited only one medical institution. Data were analyzed using descriptive statistics, the chi-square test, the t-test, and negative binomial regression. Results: There were 245,197 outpatients and 1,379,092 prescriptions. The most common patient characteristics were age 75-85 years, female sex, health insurance as the insurance type, no diagnosis of dementia or schizophrenia, atypical drugs, and a CCI score above 2; the most common medical institution characteristics were neurology as the specialty, rural region, and general hospital. The regression results showed that both patient characteristics and medical institution characteristics had a significant effect on the outpatient prescription of antipsychotic drugs in elderly patients. Conclusion: This study suggests that a national policy on antipsychotic drugs for elderly patients is needed, one that considers the characteristics of both patients and medical institutions.

Multidimensional Analysis of Unstructured Data and Trends in Architectural Review Opinions of Small and Medium-Sized Apartment Projects (다차원 분석방법을 활용한 중소규모 공동주택 건축심의 의견의 경향과 비정형 데이터로서의 특성분석)

  • Kim, Jinhee;Hwang, Taeeon;Kim, Jae-Sik;Huh, Youngki
    • Korean Journal of Construction Engineering and Management / v.24 no.6 / pp.74-80 / 2023
  • This study examines the characteristics of architectural review opinions as unstructured data, focusing on the most challenging risk for developers of small and medium-sized apartment projects in response to the increasing number of single-person households in Korea. Using multidimensional analysis methods, the study analyzes the review opinions of 25 projects in B City. Correspondence analysis and MDS (multidimensional scaling) analysis show that, consistent with prior research, keywords related to 'structure' and 'planning' dominate architectural review opinions in B City. While the MDS model's stress is very poor at 34.4%, correspondence analysis reveals that this is due to the characteristics of unstructured data in architectural reviews. In addition, the unstructured data analyzed in this study, such as architectural review opinions, exhibited a probability distribution with low kurtosis and high skewness, because they involved various combinations and occurrences of data depending on the discretion of the review committee members and the specific formats of different local governments. This often led to the emergence of keywords that differed significantly from commonly mentioned terms. Although the study has some limitations, it provides a foundation for future detailed analysis by identifying the characteristics of architectural review opinions as unstructured data.
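
For readers unfamiliar with the stress figure quoted in this abstract, the following Python sketch computes Kruskal's stress-1, a common goodness-of-fit measure for MDS. The abstract does not say which stress formula the authors used, so treating it as stress-1 is an assumption, and the function name and inputs are hypothetical. Under Kruskal's usual rule of thumb, values around 0.20 or higher are read as a poor fit, which is consistent with 34.4% being called very poor.

```python
import math

def kruskal_stress(disparities, distances):
    """Kruskal's stress-1 for matched lists of pairwise values.

    disparities: target (possibly transformed) dissimilarities per item pair.
    distances: distances between the same pairs in the fitted MDS layout.
    """
    if len(disparities) != len(distances):
        raise ValueError("inputs must list the same item pairs")
    numerator = sum((d - t) ** 2 for d, t in zip(distances, disparities))
    denominator = sum(d ** 2 for d in distances)
    if denominator == 0:
        raise ValueError("all fitted distances are zero")
    return math.sqrt(numerator / denominator)

# Made-up example: a perfect fit gives 0.0, a distorted one gives more.
print(kruskal_stress([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(kruskal_stress([3.0], [4.0]))
```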

Product Recommendation System based on User Purchase Priority

  • Bang, Jinsuk;Hwang, Doyeun;Jung, Hoekyung
    • Journal of information and communication convergence engineering / v.18 no.1 / pp.55-60 / 2020
  • As personalized customer services create a society that emphasizes the personality of an individual, the number of product reviews and quantity of user data generated by users on the internet in mobile shopping apps and sites are increasing. Such product review data are classified as unstructured data. Unstructured data have the potential to be transformed into information that companies and users can employ, using appropriate processing and analyses. However, existing systems do not reflect the detailed information they collect, such as user characteristics, purchase preference, or purchase priority while analyzing review data. Thus, it is challenging to provide customized recommendations for various users. Therefore, in this study, we have developed a product recommendation system that takes into account the user's priority, which they select, when searching for and purchasing a product. The recommendation system then displays the results to the user by processing and analyzing their preferences. Since the user's preference is considered, the user can obtain results that are more relevant.
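
One simple way to let a user-selected priority order drive recommendations is a rank-weighted score, sketched below in Python. This is not the system described in the abstract, which does not give its scoring method: the function name, the attribute names, the 0-to-1 attribute scores, and the linear rank weights are all assumptions chosen for illustration.

```python
def rank_products(products, priorities):
    """Order product names by a score weighted by the user's priorities.

    products: dict mapping product name to a dict of attribute scores in [0, 1].
    priorities: attribute names, most important first (hypothetical input).
    """
    if not priorities:
        return list(products)
    n = len(priorities)
    # The first priority gets weight n, the last gets weight 1.
    weights = {attr: n - i for i, attr in enumerate(priorities)}
    total = sum(weights.values())

    def score(attrs):
        return sum(w * attrs.get(a, 0.0) for a, w in weights.items()) / total

    return sorted(products, key=lambda name: score(products[name]), reverse=True)

# Made-up catalogue with attribute scores derived, say, from review analysis.
catalogue = {
    "A": {"price": 0.9, "quality": 0.2},
    "B": {"price": 0.3, "quality": 0.9},
}
print(rank_products(catalogue, ["price", "quality"]))
print(rank_products(catalogue, ["quality", "price"]))
```

The same catalogue yields a different order for each user, which is the behaviour the abstract describes as more relevant results.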

A Study on the Method for Extracting the Purpose-Specific Customized Information from Online Product Reviews based on Text Mining (텍스트 마이닝 기반의 온라인 상품 리뷰 추출을 통한 목적별 맞춤화 정보 도출 방법론 연구)

  • Kim, Joo Young;Kim, Dong soo
    • The Journal of Society for e-Business Studies / v.21 no.2 / pp.151-161 / 2016
  • In the era of Web 2.0, characterized by openness, sharing, and participation, it is easy for internet users to produce and share data. The amount of unstructured data, which makes up most of the digital world's data, has increased exponentially. Personal online product reviews, one kind of unstructured data, are valuable both to the companies that produce the products and to potential customers interested in them. Extracting useful information from large amounts of scattered review data requires a process of collecting, storing, preprocessing, and analyzing the data and then drawing conclusions. We therefore introduce a text-mining methodology that applies natural language processing technology to text data such as product reviews in order to extract structured data using R programming. We also introduce data mining to derive purpose-specific customized information from the structured review information produced by the text mining.
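
The step of turning free-text reviews into structured data can be sketched as a term-frequency count. The paper works in R; the example below uses Python instead and is only an illustration. The stopword list, the English-only tokenizer, and the function name are assumptions, and Korean reviews would need a morphological analyzer rather than a regular expression.

```python
import re
from collections import Counter

# Tiny hypothetical stopword list; a real one would be far larger.
STOPWORDS = {"the", "a", "is", "and", "it", "but", "was"}

def term_frequencies(reviews):
    """Count non-stopword terms across a list of review strings."""
    counts = Counter()
    for text in reviews:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts

# Made-up reviews, not taken from the study.
tf = term_frequencies(["The battery is great", "Great screen, weak battery"])
print(tf.most_common(2))
```

The resulting table of term counts is the kind of structured input on which a later data-mining step can operate.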