• Title/Summary/Keyword: Online mining

Predicting Missing Ratings of Each Evaluation Criteria for Hotel by Analyzing User Reviews (사용자 리뷰 분석을 통한 호텔 평가 항목별 누락 평점 예측 방법론)

  • Lee, Donghoon;Boo, Hyunkyung;Kim, Namgyu
    • Journal of Information Technology Services / v.16 no.4 / pp.161-176 / 2017
  • Most users can now easily access a variety of information about companies, products, and services through online channels, so online user evaluations are becoming the most powerful tool for generating word of mouth. A user's evaluation is provided in two forms: a quantitative rating and review text. The rating is further divided into an overall rating and detailed ratings for various evaluation criteria. However, because completing every rating for every criterion is a burden for the reviewer, most sites require only the overall rating and make the ratings for the other criteria optional. In fact, many users rate only some of the criteria, and about 40% of the ratings for each criterion are missing. Because these ratings are missing values, a simple average that ignores the roughly 40% of missing ratings can considerably distort the actual picture. In this study, we therefore propose a methodology that predicts the missing rating for each criterion by analyzing the user's evaluation information, including the overall rating and the review text for each criterion. Experiments were conducted on 207,968 evaluations collected from an actual hotel evaluation site. The results confirmed that the proposed methodology predicts the detailed criterion ratings much more accurately than the existing average-based method.
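
To make the missing-rating problem concrete, the following minimal sketch measures the missing rate for one criterion and fills the gaps with the observed mean. The review records and field names are hypothetical, and the mean fill is only one plausible reading of the "average-based method" used as the baseline; the text-based prediction proposed in the paper is not reproduced here.

```python
from statistics import mean

# Hypothetical reviews: None marks a criterion the reviewer skipped.
reviews = [
    {"overall": 5, "cleanliness": 5, "service": None},
    {"overall": 2, "cleanliness": None, "service": 1},
    {"overall": 4, "cleanliness": 4, "service": None},
    {"overall": 1, "cleanliness": None, "service": None},
]

def missing_rate(rows, criterion):
    """Fraction of reviews with no rating for the given criterion."""
    return sum(r[criterion] is None for r in rows) / len(rows)

def observed_mean(rows, criterion):
    """Mean over the observed ratings only, ignoring missing ones."""
    return mean(r[criterion] for r in rows if r[criterion] is not None)

def impute_with_mean(rows, criterion):
    """Assumed average-based baseline: fill each gap with the observed mean."""
    fill = observed_mean(rows, criterion)
    return [r[criterion] if r[criterion] is not None else fill for r in rows]

print(missing_rate(reviews, "cleanliness"))
print(impute_with_mean(reviews, "cleanliness"))
```

In this toy data the two reviewers who skipped the cleanliness rating gave low overall ratings, so the observed mean probably overstates their likely cleanliness rating. That is the kind of distortion the abstract describes.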

A Study on Political Attitude Estimation of Korean OSN Users (온라인 소셜네트워크를 통한 한국인의 정치성향 예측 기법의 연구)

  • Wijaya, Muhammad Eka;Ahn, Heejune
    • Journal of Korea Society of Industrial Information Systems / v.21 no.4 / pp.1-11 / 2016
  • Recently, numerous studies have been conducted to estimate human personality from online social activities. This paper develops a comprehensive model for political attitude estimation that leverages users' Facebook Like information. We designed a Facebook crawler that collects data efficiently, overcoming the difficulties of crawling Ajax-enabled Facebook pages. We show that category-level selection can reduce the complexity of data analysis by exploiting the sparsity of the huge like-attitude matrix. For Korean Facebook users, only 28 criteria (3% of the total) suffice to estimate a user's political polarity with high accuracy (AUC of 0.82).
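
For readers unfamiliar with the AUC figure quoted above, the sketch below computes it in its pairwise form: the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The scores and labels are made up for illustration and have no connection to the paper's data.

```python
def auc(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores and true polarity labels (1 or 0).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.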

Political Information Filtering on Online News Comment (정보 중립성 확보를 위한 인터넷 뉴스 댓글의 정치성향 분석)

  • Choi, Hyebong;Kim, Jaehong;Lee, Jihyun;Lee, Mingu
    • The Journal of the Convergence on Culture Technology / v.6 no.4 / pp.575-582 / 2020
  • We propose a method to estimate the political preference of users who write comments on internet news. We collected and analyzed a massive amount of news comment data to extract features that effectively characterize users' political preferences. We expect that providing the estimated political stance of news comment writers will help users obtain unbiased information from internet news and online discussion. Through comprehensive tests, we demonstrate the effectiveness of the two proposed methods, a lexicon-based algorithm and a similarity-based algorithm.

Machine Learning Based Stock Price Fluctuation Prediction Models of KOSDAQ-listed Companies Using Online News, Macroeconomic Indicators, Financial Market Indicators, Technical Indicators, and Social Interest Indicators (온라인 뉴스와 거시경제 지표, 금융 지표, 기술적 지표, 관심도 지표를 이용한 코스닥 상장 기업의 기계학습 기반 주가 변동 예측)

  • Kim, Hwa Ryun;Hong, Seung Hye;Hong, Helen
    • Journal of Korea Multimedia Society / v.24 no.3 / pp.448-459 / 2021
  • In this paper, we propose a method for predicting the next-day stock price fluctuations of 10 KOSDAQ-listed companies in the 5G, autonomous driving, and electricity sectors by training SVM, XGBoost, and LightGBM models on macroeconomic and financial market indicators, technical indicators, social interest indicators, and daily positive indices extracted from online news. In three experiments designed to assess the usefulness of the social interest indicators and daily positive indices, the average accuracy improved when each indicator and index was added to the models. In addition, when feature selection was performed to analyze the superiority of the extracted features, the average importance rankings of the social interest indicator and the daily positive index were 5.45 and 1.08, respectively, showing higher importance than the macroeconomic and financial market indicators and the technical indicators. These results confirm the effectiveness of the social interest indicators as alternative data and of the daily positive index for predicting stock price fluctuations.

User Review Analysis of Microtransactions in Freemium Massively Multiplayer Online Role-Playing Games Using Structural Topic Modeling (구조적 토픽모델링을 활용한 무료형 대규모 다중이용자 온라인 롤플레잉 게임의 소액결제에 대한 이용자 리뷰 분석)

  • Cheol Lee;Jae-Eun Chung
    • Human Ecology Research / v.61 no.3 / pp.475-492 / 2023
  • This study investigated player responses to microtransactions in freemium massively multiplayer online role-playing games (MMORPGs), focusing on the game LostArk and using English-language review data. To this end, structural topic modeling was employed, and the following six microtransaction-relevant topics were identified: microtransactions, developer issues, real money trade (RMT), the random number generator (RNG) upgrade system, game content, and collectibles & adventure. The first four topics were classified as "not recommended". However, the proportions of microtransaction-related topics were relatively lower than those of the other topics. Additionally, this study did not extract the keywords related to unfairness and unethical issues that appeared in previous microtransaction research. The last two topics, game content and collectibles & adventure, were "recommended" topics, indicating positive functions of microtransactions, such as enhancing the game experience through the purchase of virtual items. Moreover, players who do not engage in microtransactions were found to remain satisfied through continuous game content updates. Finally, an examination of the interaction effect between time and recommendation status revealed that, while the frequency with which the six microtransaction-related topics were mentioned in the reviews increased over time, the ratio of recommendations to non-recommendations varied in different ways. This study contributes to game-related research by revealing players' authentic opinions on microtransactions in freemium MMORPGs, thereby providing practical implications for game companies.

APMDI-CF: An Effective and Efficient Recommendation Algorithm for Online Users

  • Ya-Jun Leng;Zhi Wang;Dan Peng;Huan Zhang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.11 / pp.3050-3063 / 2023
  • Recommendation systems provide personalized products or services to online users by mining their past preferences. Collaborative filtering is a popular recommendation technique because it is easy to implement. However, with the rapid growth in the number of users, collaborative filtering suffers from serious scalability and sparsity problems. To address these problems, a novel collaborative filtering recommendation algorithm is proposed. The algorithm partitions the users with affinity propagation clustering and searches for the k nearest neighbors only within the partition to which the active user belongs, which narrows the search range and improves real-time performance. When predicting ratings for the active user's unrated items, the mean deviation method is used to impute the neighbors' missing ratings, which reduces sparsity and preserves recommendation quality. Experiments on two different datasets show that the proposed algorithm performs well in terms of both real-time performance and recommendation quality.
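
The two ideas in this abstract, restricting the neighbor search to the active user's partition and imputing neighbors' missing ratings, can be sketched as follows. Several parts are assumptions rather than the paper's definitions: the partition is hard-coded instead of being produced by affinity propagation, cosine similarity over co-rated items stands in for whatever similarity measure the authors use, and "mean deviation" is read here as the user's mean plus the item's average deviation from its raters' means. All data is hypothetical.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "c": 2},          # has not rated "b"
    "u3": {"a": 5, "c": 5},          # active user, has not rated "b"
    "u4": {"a": 1, "b": 5, "c": 2},
}
# Hypothetical partition; the paper derives it with affinity propagation.
cluster_of = {"u1": 0, "u2": 0, "u3": 0, "u4": 1}

def user_mean(u):
    return sum(ratings[u].values()) / len(ratings[u])

def item_mean_deviation(item):
    """Average of (rating - rater's mean) over all users who rated the item."""
    devs = [r[item] - user_mean(u) for u, r in ratings.items() if item in r]
    return sum(devs) / len(devs) if devs else 0.0

def rating_or_imputed(u, item):
    """Observed rating if present, else user mean plus item mean deviation."""
    if item in ratings[u]:
        return ratings[u][item]
    return user_mean(u) + item_mean_deviation(item)

def cosine(u, v):
    """Cosine similarity over the items both users rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    nu = sqrt(sum(ratings[u][i] ** 2 for i in common))
    nv = sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (nu * nv)

def predict(active, item, k=2):
    # Search for neighbors only inside the active user's partition.
    peers = [u for u in ratings
             if u != active and cluster_of[u] == cluster_of[active]]
    neighbors = sorted(peers, key=lambda u: cosine(active, u), reverse=True)[:k]
    weights = [cosine(active, u) for u in neighbors]
    if not neighbors or sum(weights) == 0:
        return user_mean(active)
    # Similarity-weighted average of the neighbors' deviations from their means.
    offset = sum(w * (rating_or_imputed(u, item) - user_mean(u))
                 for u, w in zip(neighbors, weights))
    return user_mean(active) + offset / sum(weights)

print(predict("u3", "b"))
```

Because `u4` sits in a different partition, it is never considered as a neighbor of `u3`, although its rating of item `b` still contributes to that item's mean deviation in this sketch.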

Development of 3D-printed Cultural Products Using Yuan Blue and White Porcelain Patterns

  • Bowei Hu;Sun Young Choi
    • Journal of the Korean Society of Clothing and Textiles / v.48 no.3 / pp.576-595 / 2024
  • Bracelets have enjoyed extensive use among the Chinese since antiquity as decorative pieces credited with warding off evil forces and inviting auspicious fortune. This study aims to integrate traditional cultural elements, such as Yuan blue and white porcelain flower patterns, into modern design using 3D printing technology to create culturally inspired bracelets. To this end, bracelet designs from the top four museums on Taobao were examined. In addition, we analyzed online reviews of culturally themed bracelets using text mining and FEA criteria, and found that Chinese consumers want bracelets that are easy to wear, suitably sized, and able to enhance cultural pride, and that they drive the demand for artistically sophisticated bracelets. The research culminates in the development of a modular bracelet design inspired by flower motifs from blue and white ware of the Yuan dynasty, with an emphasis on iterative improvements based on reviewer feedback. The final design meets consumers' expressive and aesthetic needs while maintaining cultural integrity and functionality. The aim of the study is to inspire pride in traditional culture, provide insights for the fashion accessory industry, and promote the national image through the development of culturally inspired products.

Analysis of Research Trends of 'Word of Mouth (WoM)' through Main Path and Word Co-occurrence Network (주경로 분석과 연관어 네트워크 분석을 통한 '구전(WoM)' 관련 연구동향 분석)

  • Shin, Hyunbo;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.179-200 / 2019
  • Word-of-mouth (WoM) is defined as consumer activity that shares information about consumption. WoM activities have long been recognized as important in corporate marketing processes and have received much attention, especially in the marketing field. With the development of the Internet, the ways in which people exchange information through online news and online communities have expanded, and WoM has diversified into forms such as word of mouth, scores, ratings, and likes. Social media gives online users easy access to information, and online WoM is considered a key source of information. Although this phenomenon has prompted various studies on WoM, no meta-analysis has comprehensively analyzed them. This study proposes a method that applies text mining techniques to scholarly big data to extract the major studies and grasp their main issues, in order to identify trends in WoM research. To this end, a total of 4,389 documents published from 1941 to 2018 were collected from Scopus (www.scopus.com), a citation database, using the keyword 'Word-of-mouth', and the data were refined through preprocessing such as English morphological analysis, stopword removal, and noun extraction. We adopted main path analysis (MPA) and word co-occurrence network analysis. MPA detects key studies, is used to track the development trajectory of an academic field, and presents the research trend from a macro perspective. For this, we constructed a citation network from the collected data, in which a node represents a document and a link represents a citation relation. We then detected the key-route main path by applying SPC (Search Path Count) weights. As a result, a main path composed of 30 documents was extracted from the citation network. 
The main path confirmed how the academic area developed over time, reflecting industrial changes across various industrial groups. The results of MPA revealed that WoM research can be divided into five periods: (1) establishment of aspects and critical elements of WoM, (2) relationship analysis between WoM variables, (3) beginning of research on online WoM, (4) relationship analysis between WoM and purchase, and (5) broadening of topics. Changes within the industry, such as online development and social media, were reflected in the results. Very recent studies showed that the topics and approaches related to WoM were diversifying in response to circumstantial changes. However, the results showed that even though WoM was used in diverse fields, the mainstream of WoM research, from start to end, concerned marketing and identifying the influential factors that proliferate WoM. Word co-occurrence network analysis presents the research trend from a micro perspective. A word co-occurrence network was constructed to analyze the relationships between keywords, and social network analysis (SNA) was applied. We divided the data into three periods to investigate periodic changes and trends in the discussion of WoM. SNA showed that Period 1 (1941~2008) consisted of clusters regarding relationship, source, and consumers. Period 2 (2009~2013) contained clusters of satisfaction, community, social networks, review, and internet. The clusters of Period 3 (2014~2018) involved satisfaction, medium, review, and interview. The periodic changes of the clusters showed a transition from offline to online WoM. The media of WoM have become an important factor in spreading the word. This study conducted a quantitative meta-analysis of WoM based on scholarly big data. 
The main contribution of this study is that it provides a micro perspective on the research trend of WoM as well as a macro perspective. A limitation is that the citation network constructed for MPA is based only on the direct citation relations among the collected documents.
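
As a rough illustration of the SPC weighting mentioned in this abstract, the sketch below counts, for each citation link, how many source-to-sink paths pass through it, then grows a path outward from the heaviest link. The toy network, the cited-to-citing edge orientation, and the single-route search with arbitrary tie-breaking are all assumptions; the key-route search in the paper may start from several top links and handle ties differently.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical citation links, oriented from cited (older) to citing (newer).
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "F")]

succ, pred = defaultdict(list), defaultdict(list)
for u, v in edges:
    succ[u].append(v)
    pred[v].append(u)

@lru_cache(maxsize=None)
def paths_from_sources(node):
    """Number of paths that reach node from any source (no predecessors)."""
    return 1 if not pred[node] else sum(paths_from_sources(p) for p in pred[node])

@lru_cache(maxsize=None)
def paths_to_sinks(node):
    """Number of paths from node to any sink (no successors)."""
    return 1 if not succ[node] else sum(paths_to_sinks(s) for s in succ[node])

# SPC of a link = (paths from sources into its tail) * (paths from its head to sinks).
spc = {(u, v): paths_from_sources(u) * paths_to_sinks(v) for u, v in edges}

def key_route_main_path(spc):
    """Start at the heaviest link, then extend backward and forward greedily."""
    u, v = max(spc, key=spc.get)
    path = [u, v]
    while pred[path[0]]:
        path.insert(0, max(pred[path[0]], key=lambda p: spc[(p, path[0])]))
    while succ[path[-1]]:
        path.append(max(succ[path[-1]], key=lambda s: spc[(path[-1], s)]))
    return path

print(spc)
print(key_route_main_path(spc))
```

In this toy network every source-to-sink path crosses the link from `C` to `D`, so it receives the largest weight and anchors the main path.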

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.143-163 / 2016
  • The demographics of Internet users are the most basic and important source for target marketing or personalized advertisements on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, time-consuming, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data shows which pages users visited, how long they stayed, how often and when they usually visited, which sites they prefer, what keywords they used to find a site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics, including search keywords; frequency and intensity by time, day, and month; the variety of websites visited; and text information from the web pages visited. The demographic attributes to be predicted also vary from paper to paper and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, have been used to build prediction models. However, previous research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated to build the best prediction model. 
The objective of this study is to choose, from the results of previous research, the clickstream attributes most likely to be correlated with the demographics, and then to identify which data mining method is best suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. Based on the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is composed of four steps. In the first step, we create user profiles that include the 64 clickstream attributes and 5 demographic attributes. The second step reduces the dimensionality of the clickstream variables to address the curse of dimensionality and the overfitting problem, using three approaches based on decision tree, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable using SVM, neural network, and logistic regression. The last step evaluates the alternative models in terms of accuracy and selects the best model. For the experiments, we used clickstream data covering 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for the prediction process, and 5-fold cross validation was conducted to enhance the reliability of the experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best with decision tree based dimension reduction and a neural network, whereas the prediction of gender and marital status is most accurate when SVM is applied without dimension reduction. 
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used effectively to predict their demographics and thus be applied to digital marketing.
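
The final step described above, comparing alternative models by 5-fold cross validation and keeping the most accurate one, can be sketched in plain Python. The two models here are deliberately trivial stand-ins, not the SVM, neural network, or logistic regression built in IBM SPSS Modeler, and the single feature and labels are hypothetical. The dimension reduction step is omitted.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit, X, y, k=5):
    """Mean held-out accuracy of a model-building function over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k

def fit_majority(X, y):
    """Stand-in model 1: always predict the most common training label."""
    label = max(set(y), key=y.count)
    return lambda x: label

def fit_threshold(X, y):
    """Stand-in model 2: predict 1 above the midpoint of the two class means."""
    mean0 = sum(x for x, t in zip(X, y) if t == 0) / y.count(0)
    mean1 = sum(x for x, t in zip(X, y) if t == 1) / y.count(1)
    cut = (mean0 + mean1) / 2
    return lambda x: int(x > cut)

# Hypothetical single clickstream attribute and a binary demographic label.
X = list(range(10)) + list(range(100, 110))
y = [0] * 10 + [1] * 10

candidates = {"majority": fit_majority, "threshold": fit_threshold}
results = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
best = max(results, key=results.get)
print(results, best)
```

Each model is scored only on the fold it did not see during fitting, which is what makes the comparison between candidates meaningful.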

Time Series Analysis of Park Use Behavior Utilizing Big Data - Targeting Olympic Park - (빅데이터를 활용한 공원 이용행태의 시계열분석 - 올림픽공원을 대상으로 -)

  • Woo, Kyung-Sook;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.46 no.2 / pp.27-36 / 2018
  • This study argues for the necessity of behavior analysis, because changes to a park environment that reflect user desires can be implemented only by grasping the needs of park users. Online data (blogs) were defined as the basic data of the study. After the data were collected in 5-year units, data mining was used to derive the characteristics of behavior over time, while the significance of the online data was verified through social network analysis. The results of the text mining analysis are as follows. First, the primary behaviors included 'walking', 'photography', 'riding bicycles' (inline skates, kickboards, etc.), and 'eating'. Second, in the early period of the collected data, active physical activity such as exercise was the main factor, but recently passive behaviors such as eating, using a mobile phone, playing games, and drinking coffee have also appeared as new behavior characteristics in parks. Third, the factors affecting the behavior of park users are changes in various social conditions, such as the development of the Internet and a culture of expressing unique personalities and styles. Fourth, the special behaviors appearing at Olympic Park were derived from cultural and educational activities, including watching performances and history lessons. In conclusion, people's lifestyle changes and their behavior in parks are influenced by the changing times rather than by the original purpose intended during park planning and design. Therefore, it is necessary to create an environment tailored to users by considering the main behaviors and influencing factors of Olympic Park. Text mining, used here as the analytical method, has the merit that past data can be collected. It therefore makes it possible to analyze behavior from a long-term viewpoint and to measure new behaviors and values through the derived keywords. 
In addition, the validity of the online data was verified through social network analysis to increase the legitimacy of the research results. More comprehensive behavior analysis should be carried out in the future by diversifying the types of data collected, and various methods for verifying the accuracy and reliability of large-volume data will be needed.