Search | Korea Science

Handling Incomplete Data Problem in Collaborative Filtering System

Noh, Hyun-Ju;Kwak, Min-Jung;Han, In-Goo
- Journal of Intelligence and Information Systems / v.9 no.2 / pp.51-63 / 2003
Collaborative filtering is one of the most widely used methodologies for recommendation systems. It is based on a data matrix of each customer's preferences for products. Such a preference matrix can contain many missing values, and this incompleteness is one of the reasons the accuracy of a recommendation system deteriorates. Several treatments exist for the incomplete data problem, such as case deletion and single imputation. These approaches are simple and easy to implement, but they may produce biased results. The multiple imputation method imputes m values for each missing value. It overcomes the flaws of single imputation approaches by taking the uncertainty of the missing values into account. The objective of this paper is to propose a multiple imputation-based collaborative filtering approach for recommendation systems that improves prediction accuracy. The experiments show that the proposed approach performs better than the traditional collaborative filtering approach, especially when the dataset used for recommendation contains many missing values.
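The idea of imputing m values per missing rating and pooling the results can be sketched as follows. This is only an illustration under stated assumptions, not the paper's procedure: the imputation model (a normal draw fitted to each item's observed ratings), the cosine-weighted predictor, the 1 to 5 rating scale, and all function names are hypothetical choices made here.

```python
import math
import random
import statistics

def impute_once(ratings, rng, lo=1.0, hi=5.0):
    """Return one completed copy of `ratings` (None marks a missing value).
    Assumption: each missing value is drawn from a normal distribution fitted
    to that item's observed ratings; every item has at least one rating."""
    completed = [row[:] for row in ratings]
    for j in range(len(ratings[0])):
        observed = [row[j] for row in ratings if row[j] is not None]
        mean = statistics.fmean(observed)
        std = statistics.pstdev(observed)
        for row in completed:
            if row[j] is None:
                # Clip the draw to the rating scale.
                row[j] = min(hi, max(lo, rng.gauss(mean, std)))
    return completed

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0

def predict(completed, user, item):
    """Similarity-weighted average of the other users' ratings for `item`.
    Similarity is computed on all items except the one being predicted."""
    others = [j for j in range(len(completed[0])) if j != item]
    target = [completed[user][j] for j in others]
    num = den = 0.0
    for u, row in enumerate(completed):
        if u == user:
            continue
        w = cosine(target, [row[j] for j in others])
        num += w * row[item]
        den += w
    if not den:
        return statistics.fmean([r[item] for u, r in enumerate(completed) if u != user])
    return num / den

def multiple_imputation_predict(ratings, user, item, m=5, seed=0):
    """Impute m times, predict on each completed matrix, and pool by averaging."""
    rng = random.Random(seed)
    estimates = [predict(impute_once(ratings, rng), user, item) for _ in range(m)]
    return statistics.fmean(estimates), estimates

ratings = [
    [5, 4, None, 1],
    [4, None, 2, 1],
    [1, 2, 5, None],
    [5, 5, 1, 2],
]
pooled, estimates = multiple_imputation_predict(ratings, user=0, item=2)
print(pooled, estimates)
```

The spread of the m estimates gives a rough view of how much the prediction depends on the unknown values, which is the uncertainty that a single imputation hides.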

A Study on the Real-Time Preference Prediction for Personalized Recommendation on the Mobile Device (모바일 기기에서 개인화 추천을 위한 실시간 선호도 예측 방법에 대한 연구)

Lee, Hak Min;Um, Jong Seok
- Journal of Korea Multimedia Society / v.20 no.2 / pp.336-343 / 2017
We propose a real-time personalized recommendation algorithm for mobile devices. We use unified collaborative filtering with reduced data. We use Fuzzy C-means clustering to obtain the reduced data, and a Kohonen SOM is applied to obtain initial values of the cluster centers. The proposed algorithm overcomes data sparsity because it extends the data to similar users and similar items. It also enables real-time service on the mobile device because data clustering reduces computing time. Applying the suggested algorithm to the MovieLens data, we show that it has reasonable performance in comparison with collaborative filtering. We developed an Android-based smartphone application that recommends restaurants along with coupons and restaurant information.
https://doi.org/10.9717/kmms.2017.20.2.336
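The Fuzzy C-means step mentioned in the abstract can be sketched with the standard alternating updates of memberships and centers. This is a generic textbook version, not the authors' implementation: the initial centers are simply passed in by the caller (the paper obtains them from a Kohonen SOM, which is not reproduced here), and the function names, fuzziness value, and fixed iteration count are assumptions.

```python
import math

def memberships_for(points, centers, fuzziness=2.0, eps=1e-9):
    """memberships[k][i] = degree to which points[k] belongs to cluster i."""
    power = 2.0 / (fuzziness - 1.0)
    result = []
    for p in points:
        # eps avoids division by zero when a point coincides with a center.
        dists = [max(math.dist(p, c), eps) for c in centers]
        result.append([1.0 / sum((d_i / d_j) ** power for d_j in dists)
                       for d_i in dists])
    return result

def fuzzy_c_means(points, initial_centers, fuzziness=2.0, iterations=50):
    """Alternate the membership and center updates a fixed number of times.
    Assumption: `initial_centers` come from elsewhere (e.g. a SOM)."""
    centers = [list(c) for c in initial_centers]
    dims = len(points[0])
    for _ in range(iterations):
        u = memberships_for(points, centers, fuzziness)
        for i in range(len(centers)):
            weights = [row[i] ** fuzziness for row in u]
            total = sum(weights)
            centers[i] = [sum(w * p[d] for w, p in zip(weights, points)) / total
                          for d in range(dims)]
    # Recompute memberships so they match the final centers.
    return centers, memberships_for(points, centers, fuzziness)

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, u = fuzzy_c_means(points, initial_centers=[(1, 1), (9, 9)])
print(centers)
print(u[0])
```

Each cluster center could then stand in for its members as a row of the reduced data, which is presumably where the saving in computing time comes from.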

A Comprehensive Performance Evaluation in Collaborative Filtering (협업필터링에서 포괄적 성능평가 모델)

Yu, Seok-Jong
- Journal of the Korea Society of Computer and Information / v.17 no.4 / pp.83-90 / 2012
In e-commerce systems that deal with a large number of items, personalized recommendation is essential. Collaborative filtering, a successful recommendation algorithm, suffers from sparsity, cold-start, and scalability restrictions. In addition, this work raises a new flaw of the algorithm: inconsistent recommendation performance. This flaw is not measurable by the current MAE-based evaluation, which does not consider the deviation of prediction error and is performed independently of precision and recall measurement. To evaluate collaborative filtering comprehensively, this work proposes an extended evaluation model that combines criteria such as MAE, precision, recall, and deviation, and applies it to cluster-based combined collaborative filtering.
https://doi.org/10.9708/jksci.2012.17.4.083
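A minimal sketch of reporting the four criteria named in the abstract together might look like this. The exact definitions used in the paper are not given here, so several choices are assumptions: deviation is taken as the population standard deviation of the absolute errors, an item counts as relevant or recommended when its rating is at least `like_threshold`, and the function name is hypothetical.

```python
import statistics

def evaluate(actual, predicted, like_threshold=4.0):
    """Report MAE, deviation of the error, precision, and recall together."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    mae = statistics.fmean(errors)
    # Assumption: "deviation" means the spread of the absolute errors.
    deviation = statistics.pstdev(errors)

    recommended = [p >= like_threshold for p in predicted]
    relevant = [a >= like_threshold for a in actual]
    hits = sum(1 for r, l in zip(recommended, relevant) if r and l)
    precision = hits / sum(recommended) if any(recommended) else 0.0
    recall = hits / sum(relevant) if any(relevant) else 0.0
    return {"mae": mae, "deviation": deviation,
            "precision": precision, "recall": recall}

print(evaluate(actual=[5, 4, 2, 1], predicted=[4, 5, 4, 1]))
```

Two predictors with the same MAE can differ sharply in deviation, which is the kind of inconsistency the abstract says MAE alone does not expose.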

Personalized Exhibition Booth Recommendation Methodology Using Sequential Association Rule (순차 연관 규칙을 이용한 개인화된 전시 부스 추천 방법)

Moon, Hyun-Sil;Jung, Min-Kyu;Kim, Jae-Kyeong;Kim, Hyea-Kyeong
- Journal of Intelligence and Information Systems / v.16 no.4 / pp.195-211 / 2010
An exhibition is defined as a market event of a specific duration at which exhibitors present their main product range to business or private visitors, and it also plays a key role as an effective marketing channel. In particular, because visitors' opinions after the exhibition directly affect sales and company image, exhibition organizers must consider the various needs of visitors. To meet those needs, ubiquitous technologies have been applied in some exhibitions. However, despite the development of ubiquitous technologies, their services cannot always reflect visitors' preferences because they generate information only when visitors request it. As a result, they have reached their limit in meeting visitors' needs, which might consequently lead to a loss of marketing opportunities. Recommendation systems can overcome these limitations. They can recommend booths that match visitors' preferences, helping visitors who have difficulty choosing in an exhibition environment. One of the most successful and widely used technologies for building recommender systems is Collaborative Filtering. Traditional recommender systems, however, use only neighbors' evaluations or behaviors for a personalized prediction. Therefore, they cannot reflect visitors' dynamic preferences and lack accuracy in an exhibition environment. Although much useful information for inferring visitors' preferences is available in a ubiquitous environment (e.g., visitors' current location and booth visit path), they use only limited information for recommendation. In this study, we propose a booth recommendation methodology using Sequential Association Rule, which considers the sequence of visits. Recent studies of Sequential Association Rule use constraints to improve performance.
However, since traditional Sequential Association Rule considers all rules for recommendation, it has a scalability problem when applied to a large-scale exhibition. To solve this problem, our methodology composes a confidence database before the recommendation process. To compose the confidence database, we first search for preceding rules whose frequency is above a threshold. Next, we compute the confidence of each preceding rule for each booth that is not contained in that rule. The confidence database therefore holds two kinds of information: preceding rules and their confidence for each booth. In the recommendation process, we simply generate the preceding rules of the target visitor from the visit records and recommend booths according to the confidence database. Through these steps, we expect to reduce the time spent on the recommendation process. To evaluate the proposed methodology, we use real booth visit records collected by RFID technology at an IT exhibition. The booth visit records also contain the visit sequence of each visitor. We compare the performance of the proposed methodology with a traditional Collaborative Filtering system. As a result, our proposed methodology generally shows higher performance than traditional Collaborative Filtering. The experimental results also reveal some of its features. First, it shows the highest performance when recommending one booth. It detects preceding rules from some portion of the visitors. Therefore, if a visitor moves in a pattern very different from that of visitors as a whole, it cannot give a correct recommendation for him/her even if we increase the number of recommendations. Because it is trained on all visitors, it cannot correctly give recommendations to visitors who follow a unique path. Second, the performance of general recommendation systems increases as the time period expands.
However, our methodology shows higher performance with limited information, such as one or two time periods. Therefore, it can recommend even when there is little information in the target visitor's booth visit records, and it uses only a small amount of information in the recommendation process. We expect that it can give real-time recommendations in an exhibition environment. Overall, our methodology shows higher performance than traditional Collaborative Filtering systems, and we expect that it could be applied in booth recommendation systems to satisfy visitors in an exhibition environment.
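The two phases described in this abstract, building a confidence database offline and then looking up the target visitor's preceding rule at recommendation time, can be sketched as below. The details are assumptions made for illustration and not the authors' definitions: a preceding rule is taken to be the last one or two booths visited in order, confidence is the share of times that sequence was followed directly by a given booth, and the function names and thresholds are hypothetical.

```python
from collections import Counter, defaultdict

def build_confidence_db(paths, max_len=2, min_support=2):
    """Offline step: map each frequent preceding rule to {booth: confidence}."""
    prefix_counts = Counter()
    next_counts = defaultdict(Counter)
    for path in paths:
        for end in range(1, len(path)):
            for length in range(1, max_len + 1):
                if end - length < 0:
                    break
                prefix = tuple(path[end - length:end])
                prefix_counts[prefix] += 1
                # Only booths not contained in the rule itself get a confidence.
                if path[end] not in prefix:
                    next_counts[prefix][path[end]] += 1
    return {
        prefix: {booth: n / count for booth, n in next_counts[prefix].items()}
        for prefix, count in prefix_counts.items()
        if count >= min_support
    }

def recommend(db, visited, top_n=1, max_len=2):
    """Online step: match the longest suffix of `visited` found in the database."""
    for length in range(min(max_len, len(visited)), 0, -1):
        prefix = tuple(visited[-length:])
        candidates = {b: c for b, c in db.get(prefix, {}).items()
                      if b not in visited}
        if candidates:
            ranked = sorted(candidates, key=lambda b: (-candidates[b], b))
            return ranked[:top_n]
    return []

paths = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"], ["B", "C", "D"]]
db = build_confidence_db(paths)
print(recommend(db, ["A", "B"]))
```

Because the online step is a dictionary lookup rather than a scan over all rules, its cost does not grow with the number of recorded visits, which is consistent with the time reduction the abstract expects.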

Improving Performance of Recommendation Systems Using Topic Modeling (사용자 관심 이슈 분석을 통한 추천시스템 성능 향상 방안)

Choi, Seongi;Hyun, Yoonjin;Kim, Namgyu
- Journal of Intelligence and Information Systems / v.21 no.3 / pp.101-116 / 2015
Recently, due to the development of smart devices and social media, vast amounts of information in various forms have accumulated. In particular, considerable research effort is being directed toward analyzing unstructured big data to resolve various social problems. Accordingly, the focus of data-driven decision-making is moving from structured data analysis to unstructured data analysis. In the field of recommendation systems, a typical area of data-driven decision-making, the need to use unstructured data to improve system performance has also steadily increased. Approaches to improving the performance of recommendation systems fall into two categories: improving algorithms and acquiring useful, high-quality data. Traditionally, most efforts to improve recommendation system performance have taken the former approach, while the latter has attracted relatively little attention. In this sense, efforts to utilize unstructured data from various sources are timely and necessary. In particular, because users' interests are directly connected with their needs, identifying those interests through unstructured big data analysis can be a clue for improving the performance of recommendation systems. This study therefore proposes a methodology for improving recommendation systems by measuring users' interests. Specifically, it proposes a method to quantify users' interests by analyzing their internet usage patterns and to predict their repurchases based on the discovered preferences. There are two important modules in this study. The first module predicts the repurchase probability of each category by analyzing users' purchase history. We include the first module in our research scope in order to compare the accuracy of the traditional purchase-based prediction model with that of our new model presented in the second module. This procedure extracts the purchase history of users.
The core of our methodology is the second module. This module extracts users' interests by analyzing the news articles the users have read. It constructs a correspondence matrix between topics and news articles by performing topic modeling on real-world news articles. The module then analyzes users' news access patterns and constructs a correspondence matrix between articles and users. After that, by merging the results of the previous steps, we obtain a correspondence matrix between users and topics. This matrix describes users' interests in a structured manner. Finally, using this matrix, the second module builds a model for predicting the repurchase probability of each category. In this paper, we also provide the experimental results of our performance evaluation. The data used in our experiments are outlined as follows. We acquired web transaction data for 5,000 panelists from a company that specializes in analyzing the rankings of internet sites. First, we extracted 15,000 URLs of news articles published from July 2012 to June 2013 from the original data and crawled the main contents of those articles. We then selected 2,615 users who had read at least one of the extracted news articles. Among the 2,615 users, 359 were target users who had purchased at least one item from our target shopping mall 'G'. In the experiments, we analyzed the purchase history and news access records of these 359 internet users. The performance evaluation showed that our prediction model using both users' interests and purchase history outperforms a prediction model using only users' purchase history in terms of misclassification ratio. In detail, our model outperformed the traditional one in the appliance, beauty, computer, culture, digital, fashion, and sports categories when artificial neural network-based models were used.
Similarly, our model outperformed the traditional one in the beauty, computer, digital, fashion, food, and furniture categories when decision tree-based models were used, although the improvement was very small.
https://doi.org/10.13088/jiis.2015.21.3.101
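The merging step in the second module, going from a user-by-article matrix and an article-by-topic matrix to a user-by-topic matrix, can be illustrated with a matrix product. The abstract does not say how the merge is performed, so the product, the row normalization, and the function names below are assumptions chosen for illustration; the article-topic weights would in practice come from a topic model.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def user_topic_matrix(user_article, article_topic):
    """Combine read counts with article-topic weights, one row per user.
    Assumption: rows are normalized to sum to 1 so users who read many
    articles are comparable with users who read few."""
    raw = matmul(user_article, article_topic)
    normalized = []
    for row in raw:
        total = sum(row)
        normalized.append([v / total for v in row] if total else row)
    return normalized

# Rows: users; columns: articles (number of times each article was read).
user_article = [[1, 1, 0],
                [0, 0, 2]]
# Rows: articles; columns: topics (hypothetical topic weights).
article_topic = [[0.8, 0.2],
                 [0.4, 0.6],
                 [0.1, 0.9]]
print(user_topic_matrix(user_article, article_topic))
```

Each resulting row is a structured description of one user's interests and could be used as input features for a repurchase classifier.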

Integration of Heterogeneous Models with Knowledge Consolidation (지식 결합을 이용한 서로 다른 모델들의 통합)

Bae, Jae-Kwon;Kim, Jin-Hwa
- Korean Management Science Review / v.24 no.2 / pp.177-196 / 2007
For better prediction and classification in customer recommendation, this study proposes an integrative model that efficiently combines statistical and artificial intelligence models currently in use. In particular, it suggests an integrative prediction model that integrates models such as Association Rule, Frequency Matrix, and Rule Induction. The integrated models are of four kinds: the ASFM model, which combines Association Rule (A) and Frequency Matrix (B); the ASRI model, which combines Association Rule (A) and Rule Induction (C); the FMRI model, which combines Frequency Matrix (B) and Rule Induction (C); and the ASFMRI model, which combines Association Rule (A), Frequency Matrix (B), and Rule Induction (C). The data set for the tests was collected from convenience store G, the number one brand in South Korea. The data set contains sales information on customer transactions from September 1, 2005 to December 7, 2005. About 1,000 transactions were selected for a specific item. Using this data set, the study suggests an integrated model that predicts whether or not a customer buys a specific product, for use in target marketing strategy. The performance of the integrated model is compared with that of the other models. The experimental results show that the integrated model outperforms all the individual models, namely Association Rule, Frequency Matrix, and Rule Induction.

Financial Instruments Recommendation based on Classification Financial Consumer by Text Mining Techniques (비정형 데이터 분석을 통한 금융소비자 유형화 및 그에 따른 금융상품 추천 방법)

Lee, Jaewoong;Kim, Young-Sik;Kwon, Ohbyung
- Journal of Information Technology Services / v.15 no.4 / pp.1-24 / 2016
With the innovation of information technology, non-face-to-face robo-advisors offering high accessibility and convenience are spreading. Current robo-advisors recommend appropriate investment products after inferring investment propensity from structured data entered directly or indirectly by individuals. However, asking financial consumers to report or input their own subjective investment propensity is inconvenient and obtrusive. Hence, this study proposes a way to deduce investment propensity from unstructured data that customers voluntarily expose during consultations or online. Because prediction performance on unstructured documents differs according to the characteristics of the text, this study evaluates the prediction performance of various learning discrimination algorithms, selects the classification algorithm best suited to the characteristics of text left by financial consumers, and proposes an intelligent method that automatically recommends investment products. User tests were given to MBA students. After they were shown the recommended investment and a list of investment products, they were asked about their satisfaction. Financial consumers' satisfaction was measured separately for investment propensity and for the recommended products. The results suggest that users were highly satisfied with the investment products recommended by the proposed method and that the method can be applied to non-face-to-face robo-advisors.
https://doi.org/10.9716/KITS.2016.15.4.001

Performance of Collaborative Filtering Agent System using Clustering for Better Recommendations (개선된 추천을 위해 클러스터링을 이용한 협동적 필터링 에이전트 시스템의 성능)

Hwang, Byeong-Yeon
- The Transactions of the Korea Information Processing Society / v.7 no.5S / pp.1599-1608 / 2000
Automated collaborative filtering is on the verge of becoming a popular technique for reducing information overload as well as for solving problems that content-based information filtering systems cannot handle. In this paper, we describe three different algorithms that perform collaborative filtering: GroupLens, the traditional technique; Best N, a modified one; and an algorithm that uses clustering. Based on experimental results using real data, the algorithm using clustering is compared with the existing representative collaborative filtering agent algorithms, GroupLens and Best N. The experimental results indicate that the algorithm using clustering is similar to Best N and better than GroupLens in prediction accuracy. The results also demonstrate that the algorithm using clustering produces the best performance in terms of the standard deviation of the error rate. This means that the algorithm using clustering gives the most stable and most uniform recommendations. In addition, the algorithm using clustering reduces recommendation time.
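For orientation, a GroupLens-style prediction can be sketched as the target user's mean rating plus a Pearson-weighted average of neighbors' deviations from their own means. This is the commonly cited form of that scheme, not code from the paper. The `best_n` option assumes that Best N means keeping only the N most similar neighbors, which the abstract does not spell out; the clustering variant is not shown, and all names are hypothetical.

```python
import math

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(ra, rb):
    """Pearson correlation over the items both users have rated."""
    common = [i for i in ra if i in rb]
    if len(common) < 2:
        return 0.0
    ma = mean(ra[i] for i in common)
    mb = mean(rb[i] for i in common)
    cov = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    var_a = sum((ra[i] - ma) ** 2 for i in common)
    var_b = sum((rb[i] - mb) ** 2 for i in common)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom else 0.0

def predict(ratings, user, item, best_n=None):
    """ratings maps user -> {item: rating}. Falls back to the user's mean."""
    user_mean = mean(ratings[user].values())
    neighbours = [(pearson(ratings[user], r), r)
                  for name, r in ratings.items()
                  if name != user and item in r]
    neighbours = [(w, r) for w, r in neighbours if w != 0.0]
    if best_n is not None:
        # Assumption: Best N keeps only the N most similar neighbours.
        neighbours = sorted(neighbours, key=lambda wr: -wr[0])[:best_n]
    total_weight = sum(abs(w) for w, _ in neighbours)
    if not total_weight:
        return user_mean
    offset = sum(w * (r[item] - mean(r.values())) for w, r in neighbours)
    return user_mean + offset / total_weight

ratings = {
    "ann": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 5, "m2": 3, "m3": 4, "m4": 5},
    "cat": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
print(predict(ratings, "ann", "m4"))
print(predict(ratings, "ann", "m4", best_n=1))
```

Restricting the sum to a few strong neighbors reduces the work per prediction, and grouping users into clusters beforehand would reduce it further, in line with the time saving the abstract reports.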

Genre-based Collaborative Filtering Movie Recommendation (장르 기반 Collaborative Filtering 영화 추천)

Hwang, Ki-Tae
- The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.3 / pp.51-59 / 2010
Several movie recommendation algorithms based on Collaborative Filtering (CF) have been proposed. CF selects as neighbors the users whose ratings are most similar to each other and predicts how well users will like new movies based on the neighbors' ratings. This paper proposes a new method that improves the CF prediction using the genres of the movies users have seen. The proposed method can be combined with most existing CF algorithms. In this paper, a performance evaluation compares an existing simple CF algorithm with CF-Genre, which adds the proposed genre-based method to that CF algorithm. The results show that CF-Genre improves prediction performance by 3.3% over the existing CF algorithm.

Improvement of Collaborative Filtering Algorithm Using Imputation Methods

Jeong, Hyeong-Chul;Kwak, Min-Jung;Noh, Hyun-Ju
- Journal of the Korean Data and Information Science Society / v.14 no.3 / pp.441-450 / 2003
Collaborative filtering is one of the most widely used methodologies for recommendation systems. It is based on a data matrix of each customer's preferences, and a missing data problem frequently exists. We introduce two imputation approaches (multiple imputation via the Markov Chain Monte Carlo method and multiple imputation via the bootstrap method) to improve the prediction performance of collaborative filtering and evaluate their performance using the EachMovie data.

