Search | Korea Science

Rating and Comments Mining Using TF-IDF and SO-PMI for Improved Priority Ratings

Kim, Jinah;Moon, Nammee
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.11 / pp.5321-5334 / 2019
Data mining technology is frequently used in identifying the intention of users over a variety of information contexts. Since relevant terms are mainly hidden in text data, it is necessary to extract them. Quantification is required in order to interpret user preference in association with other structured data. This paper proposes rating and comments mining to identify user priority and obtain improved ratings. Structured data (location and rating) and unstructured data (comments) are collected and priority is derived by analyzing statistics and employing TF-IDF. In addition, the improved ratings are generated by applying priority categories based on materialized ratings through Sentiment-Oriented Point-wise Mutual Information (SO-PMI)-based emotion analysis. In this paper, an experiment was carried out by collecting ratings and comments on "place" and by applying them. We confirmed that the proposed mining method is 1.2 times better than the conventional methods that do not reflect priorities and that the performance is improved to almost 2 times when the number to be predicted is small.
https://doi.org/10.3837/tiis.2019.11.003

Impact of Word Embedding Methods on Performance of Sentiment Analysis with Machine Learning Techniques

Park, Hoyeon;Kim, Kyoung-jae
- Journal of the Korea Society of Computer and Information / v.25 no.8 / pp.181-188 / 2020
In this study, we present a comparative study to confirm the impact of various word embedding techniques on the performance of sentiment analysis. Sentiment analysis is an opinion mining technique that uses natural language processing to identify and extract subjective information from text, and it can be used to classify the sentiment of product reviews or comments. Since sentiment can be classified as either positive or negative, it can be treated as a general classification problem. For sentiment analysis, the text must be converted into a form that a computer can process. Therefore, text such as a word or document is transformed into a vector, a step known in natural language processing as word embedding. Various techniques, such as Bag of Words, TF-IDF, and Word2Vec, are used as word embedding techniques. Until now, there have not been many studies on which word embedding techniques are suitable for sentiment analysis. In this study, Bag of Words, TF-IDF, and Word2Vec are used to compare and analyze the performance of movie review sentiment analysis. The data set for this study is the IMDB data set, which is widely used in text mining. As a result, the performance of TF-IDF and Bag of Words was found to be superior to that of Word2Vec, and TF-IDF performed better than Bag of Words, but the difference was not very significant.
https://doi.org/10.9708/jksci.2020.25.08.181
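
The difference between the Bag of Words and TF-IDF representations compared in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function names, the whitespace tokenizer, the unsmoothed `ln(N / df)` weighting, and the three toy reviews are all assumptions made for the example.

```python
import math
from collections import Counter


def bag_of_words(doc: str) -> Counter:
    """Raw term counts for one document (naive whitespace tokenization)."""
    return Counter(doc.lower().split())


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """TF-IDF weight per term and document: (count / doc length) * ln(N / df)."""
    bows = [bag_of_words(d) for d in docs]
    n_docs = len(bows)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for bow in bows for term in bow)
    weights = []
    for bow in bows:
        length = sum(bow.values())
        weights.append(
            {t: (c / length) * math.log(n_docs / df[t]) for t, c in bow.items()}
        )
    return weights


# Hypothetical toy reviews, not taken from the IMDB data set.
reviews = [
    "the movie was great",
    "the movie was boring",
    "the plot was great",
]
for counts, weights in zip(map(bag_of_words, reviews), tf_idf(reviews)):
    print(dict(counts), weights)
```

Bag of Words gives every term in a review the same count of 1 here, whereas TF-IDF assigns zero weight to terms that occur in every review ("the", "was") and the largest weight to the term unique to one review ("boring").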

Sentiment analysis on movie review through building modified sentiment dictionary by movie genre (영역별 맞춤형 감성사전 구축을 통한 영화리뷰 감성분석)

Lee, Sang Hoon;Cui, Jing;Kim, Jong Woo
- Journal of Intelligence and Information Systems / v.22 no.2 / pp.97-113 / 2016
Due to the growth of internet data and the rapid development of internet technology, "big data" analysis is actively conducted to analyze enormous data sets for various purposes. In recent years especially, a number of studies have applied text mining techniques in order to overcome the limitations of existing structured data analysis. Sentiment analysis, one branch of text mining, is actively studied as a way to score opinions based on the distribution of word polarity in documents. Sentiment analysis usually relies on a sentiment dictionary that records the positivity and negativity of words. As part of this line of work, this study tries to construct a sentiment dictionary customized to a specific data domain. A common sentiment dictionary applied without regard to domain characteristics cannot reflect contextual expressions used only in that domain. We can therefore expect a sentiment dictionary customized to the data domain to improve the efficiency of sentiment analysis. This study aims to suggest a way to construct such a customized dictionary that reflects the characteristics of the data domain. Specifically, movie review data are divided by genre and genre-customized dictionaries are constructed. The performance of the customized dictionaries in sentiment analysis is compared with that of a common sentiment dictionary. IMDb data are chosen as the subject of analysis, and movie reviews are categorized by genre. Six IMDb genres, 'action', 'animation', 'comedy', 'drama', 'horror', and 'sci-fi', are selected. The five highest-ranking and five lowest-ranking movies per genre are selected as the training data set, and two years' movie data from September 2012 to June 2014 are collected as the test data set. Using the SO-PMI (Semantic Orientation from Point-wise Mutual Information) technique, we build a customized sentiment dictionary per genre and compare prediction accuracy on review ratings. The analysis shows that prediction using customized dictionaries improves prediction accuracy. The overall performance improvement is 2.82% and is statistically significant. In particular, the customized dictionary for 'sci-fi' leads to the highest accuracy improvement among the six genres. Even though this study shows the usefulness of customized dictionaries in sentiment analysis, further studies are required to generalize the results. In this study, we only consider adjectives as additional terms in the customized sentiment dictionary. Other parts of speech, such as verbs and adverbs, can be considered to improve sentiment analysis performance. We also need to apply the customized sentiment dictionary to other domains, such as product reviews.
https://doi.org/10.13088/jiis.2016.22.2.097
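
The SO-PMI scoring mentioned in this abstract can be illustrated with a small Python sketch. It uses the commonly given definition, in which a word's score is the sum of its PMI with positive seed words minus the sum of its PMI with negative seed words. Everything else is an assumption made for the example and is not taken from the paper: document-level co-occurrence, the additive smoothing constant `eps`, the seed words, and the toy reviews.

```python
import math


def so_pmi(word: str, docs: list[str], pos_seeds: list[str],
           neg_seeds: list[str], eps: float = 0.01) -> float:
    """SO-PMI(word) = sum of PMI(word, pos seed) - sum of PMI(word, neg seed)."""
    doc_sets = [set(d.lower().split()) for d in docs]
    n_docs = len(doc_sets)

    def prob(*terms: str) -> float:
        # Smoothed fraction of documents containing all given terms.
        hits = sum(1 for d in doc_sets if all(t in d for t in terms))
        return (hits + eps) / n_docs

    def pmi(a: str, b: str) -> float:
        return math.log2(prob(a, b) / (prob(a) * prob(b)))

    return (sum(pmi(word, s) for s in pos_seeds)
            - sum(pmi(word, s) for s in neg_seeds))


# Hypothetical toy reviews and seed words.
reviews = [
    "a thrilling and excellent film",
    "thrilling excellent pacing",
    "a dull and poor script",
    "poor dull acting",
]
for candidate in ("thrilling", "dull"):
    print(candidate, so_pmi(candidate, reviews, ["excellent"], ["poor"]))
```

A positive score suggests the candidate word tends to appear alongside the positive seeds, and a negative score the opposite. A genre-specific dictionary could then be built by running the same scoring over reviews from one genre at a time.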

Improvement of recommendation system using attribute-based opinion mining of online customer reviews

Misun Lee;Hyunchul Ahn
- Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.259-266 / 2023
In this paper, we propose an algorithm that can improve the accuracy performance of collaborative filtering using attribute-based opinion mining (ABOM). For the experiment, a total of 1,227 online consumer review data about smartphone apps from domestic smartphone users were used for analysis. After morpheme analysis using the KKMA (Kkokkoma) analyzer and emotional word analysis using KOSAC, attribute extraction is performed using LDA topic modeling, and the topic modeling results for each weighted review are used to add up the ratings of collaborative filtering and the sentiment score. MAE, MAPE, and RMSE, which are statistical model performance evaluations that calculate the average accuracy error, were used. Through experiments, we predicted the accuracy of online customers' app ratings (APP_Score) by combining traditional collaborative filtering among the recommendation algorithms and the attribute-based opinion mining (ABOM) technique, which combines LDA attribute extraction and sentiment analysis. As a result of the analysis, it was found that the prediction accuracy of ratings using attribute-based opinion mining CF was better than that of ratings implementing traditional collaborative filtering.
https://doi.org/10.9708/jksci.2023.28.12.259
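
For reference, the three error measures named in this abstract (MAE, MAPE, and RMSE) can be computed as in the sketch below. The formulas are the standard ones; the function names and the sample ratings are hypothetical and do not come from the paper.

```python
import math


def mae(actual: list[float], predicted: list[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: list[float], predicted: list[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: list[float], predicted: list[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical app ratings: true scores versus scores predicted by a recommender.
true_scores = [4.0, 2.0, 5.0]
predicted_scores = [3.0, 2.0, 3.0]
print(mae(true_scores, predicted_scores))
print(mape(true_scores, predicted_scores))
print(rmse(true_scores, predicted_scores))
```

Lower values indicate better predictions for all three measures. RMSE penalizes large individual errors more heavily than MAE does.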

An Opinion Document Clustering Technique for Product Characterization (제품 특징화를 위한 오피니언 문서의 클러스터링 기법)

Chang, Jae-Young
- The Journal of Society for e-Business Studies / v.19 no.2 / pp.95-108 / 2014
Opinion mining is an application domain of text mining that extracts opinions from documents, and much research is currently underway. Most related research has focused on sentiment classification, which classifies documents into positive and negative opinions. However, there has been little interest in extracting the features that characterize an individual product. In this paper, we propose a technique for classifying opinion documents according to product features and selecting the features that characterize each product. The proposed method uses document clustering and develops a new algorithm for evaluating the similarity between documents. In addition, we demonstrate the usefulness of the proposed method through experiments.
https://doi.org/10.7838/jsebs.2014.19.2.095

The effect of negated emotional words on polarity reversal and weakening value in valence (정서 단어 부정어가 정서가의 극성 전환 및 약화에 미치는 영향)

Rhee, Shin-Young;Ham, Jun-Seok;Kim, Mi-Sun;Bang, Green;Ko, Il-Ju
- Korean Journal of Cognitive Science / v.23 no.1 / pp.97-107 / 2012
Previous studies on opinion mining and sentiment analysis have assumed that the polarity and value of an emotional word are reversed when a negation word is attached. However, there are no quantitative studies on how much the polarity changes when a negation word follows. Therefore, we measured the valence and arousal dimensions for Korean emotional words and their negations. The results show that the polarity of valence and arousal was reversed on their intermediate level, and that the value was reduced by about 30% to 50%. We propose this result as a guideline for processing negation words in studies on opinion mining and sentiment analysis.
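
One way to read the guideline in this abstract is as a negation rule that flips the sign of a word's valence and shrinks its magnitude instead of simply mirroring it. The sketch below encodes that reading under stated assumptions: valence is placed on a scale centered at 0, and the default `reduction` of 0.4 is an arbitrary value picked from the reported 30% to 50% range. The function name is hypothetical, and this is not the authors' procedure.

```python
def negate_valence(valence: float, reduction: float = 0.4) -> float:
    """Flip the polarity of a valence score and weaken its magnitude.

    `valence` is assumed to lie on a scale centered at 0.
    `reduction` is the assumed fraction of magnitude lost under negation.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return -valence * (1.0 - reduction)


# Hypothetical valence scores for illustration only.
print(negate_valence(0.8))        # strongly positive word, negated
print(negate_valence(-0.6))       # negative word, negated
print(negate_valence(0.8, 0.5))   # upper end of the reported reduction range
```

Under this rule a negated positive word scores as mildly negative instead of as a full-strength antonym, which is the weakening effect the abstract describes.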

A Classification and Selection Method of Emotion Based on Classifying Emotion Terms by Users (사용자의 정서 단어 분류에 기반한 정서 분류와 선택 방법)

Rhee, Shin-Young;Ham, Jun-Seok;Ko, Il-Ju
- Science of Emotion and Sensibility / v.15 no.1 / pp.97-104 / 2012
Recently, users have produced large amounts of text data, and opinion mining, which analyzes information and opinions about users, has become a hot issue. Within opinion mining, sentiment analysis studies emotions such as positive, negative, happiness, and sadness by analyzing personal opinions or emotions about commercial products, social issues, and the opinions of politicians. For sentiment analysis, previous studies used a mapping method that sets up a distribution of emotions over two dimensions, valence and arousal. However, those studies set up the distribution of emotions arbitrarily. To solve this problem, we composed a distribution of 12 emotions by conducting a survey using a list of Korean emotion words. Also, because certain emotional states on the two dimensions overlap multiple emotions, we proposed a selection method based on the roulette wheel method using a selection probability. The proposed method is shown to classify a text into an emotion by extracting emotion terms from the text.

A Text Mining Analysis on Students' Perceptions about Capstone Design: Case of Industrial & Management Engineering (텍스트 마이닝을 활용한 캡스톤 디자인에 관한 학생 인식 탐색: 산업경영공학 사례)

Wi, Gwang-Ho;Kim, Yun-jin;Kim, Moon-Soo
- Journal of Engineering Education Research / v.25 no.5 / pp.85-93 / 2022
Capstone Design, a project-based learning technique, is the most important curriculum for clarifying major knowledge and cultivating the ability to apply it, through a process in which student project teams solve problems from the industrial field. Accordingly, various and extensive studies are being conducted on the successful implementation of capstone design courses. Unlike previous studies, this study aimed to quantitatively analyze the opinions that recorded the experiences and feelings of students who performed capstone design, and used text mining methodologies such as frequency analysis, correlation analysis, topic modeling, and sentiment analysis. Examining the overall opinions of the latter period through frequency analysis and correlation analysis revealed differences in the language students used in their opinions according to gender and project results. In the topic modeling analysis, 'topic selection' and 'the relationship between team members' showed increasing or high occupancy, while topics such as 'presentation', 'leadership', and 'feeling what they felt' tended to decrease in occupancy. Lastly, sentiment analysis found that female students showed more neutral emotions than male students, and that the passed group showed more negative emotions and fewer neutral emotions than the non-passed group. Based on these findings, students' practical perception of the curriculum was considered and implications for the improvement of capstone design were presented.
https://doi.org/10.18108/jeer.2022.25.5.85

Trend Analysis of FinTech and Digital Financial Services using Text Mining (텍스트마이닝을 활용한 핀테크 및 디지털 금융 서비스 트렌드 분석)

Kim, Do-Hee;Kim, Min-Jeong
- Journal of Digital Convergence / v.20 no.3 / pp.131-143 / 2022
Focusing on FinTech keywords, this study analyzes newspaper articles and Twitter data using text mining methodology in order to understand trends in the domestic digital financial service industry. Frequency analysis was performed around four important points in the growth of the FinTech lifecycle: Mobile Payment Service, Internet Primary Bank, Data 3 Act, and MyData Businesses. Frequency analysis combining the keywords 'China', 'USA', and 'Future' with 'FinTech' was then used to assess the current and future position of the FinTech industry. Next, sentiment analysis was conducted on Twitter to quantify consumers' expectations and concerns about FinTech services. By combining frequency analysis and sentiment analysis, this study offers a meaningful perspective in that it presents strategic directions that the government and companies can use to understand the future FinTech market.
https://doi.org/10.14400/JDC.2022.20.3.131

Visualization of movie recommendation system using the sentimental vocabulary distribution map

Ha, Hyoji;Han, Hyunwoo;Mun, Seongmin;Bae, Sungyun;Lee, Jihye;Lee, Kyungwon
- Journal of the Korea Society of Computer and Information / v.21 no.5 / pp.19-29 / 2016
This paper suggests a method to refine massive collective intelligence data and visualize it with a multilevel sentiment network, in order to understand information in an intuitive and semantic way. For this study, we first calculated the frequency of sentiment words in each movie review. Second, we designed a heatmap visualization to effectively discover the main emotions in each online movie review. Third, we formed a Sentiment-Movie Network combining the MDS Map and Social Network in order to fix the movie network topology, while creating a network graph to enable the clustering of similar nodes. Finally, we evaluated whether our method actually helps improve user cognition in a multilevel analysis experience compared to the existing network system, and concluded that it provides an improved user experience in terms of cognition and is appropriate as an alternative method for semantic understanding.
https://doi.org/10.9708/jksci.2016.21.5.019

