• Title/Summary/Keyword: online algorithm

Search Result 587, Processing Time 0.026 seconds

Design of a Recommendation System for Improving Deep Neural Network Performance

  • Juhyoung Sung;Kiwon Kwon;Byoungchul Song
    • Journal of Internet Computing and Services
    • /
    • v.25 no.1
    • /
    • pp.49-56
    • /
    • 2024
  • There have been emerging many use-cases applying recommendation systems especially in online platform. Although the performance of recommendation systems is affected by a variety of factors, selecting appropriate features is difficult since most of recommendation systems have sparse data. Conventional matrix factorization (MF) method is a basic way to handle with problems in the recommendation systems. However, the MF based scheme cannot reflect non-linearity characteristics well. As deep learning technology has been attracted widely, a deep neural network (DNN) framework based collaborative filtering (CF) was introduced to complement the non-linearity issue. However, there is still a problem related to feature embedding for use as input to the DNN. In this paper, we propose an effective method using singular value decomposition (SVD) based feature embedding for improving the DNN performance of recommendation algorithms. We evaluate the performance of recommendation systems using MovieLens dataset and show the proposed scheme outperforms the existing methods. Moreover, we analyze the performance according to the number of latent features in the proposed algorithm. We expect that the proposed scheme can be applied to the generalized recommendation systems.

Robustness Analysis of a Novel Model-Based Recommendation Algorithms in Privacy Environment

  • Ihsan Gunes
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.5
    • /
    • pp.1341-1368
    • /
    • 2024
  • The concept of privacy-preserving collaborative filtering (PPCF) has been gaining significant attention. Due to the fact that model-based recommendation methods with privacy are more efficient online, privacy-preserving memory-based scheme should be avoided in favor of model-based recommendation methods with privacy. Several studies in the current literature have examined ant colony clustering algorithms that are based on non-privacy collaborative filtering schemes. Nevertheless, the literature does not contain any studies that consider privacy in the context of ant colony clustering-based CF schema. This study employed the ant colony clustering model-based PPCF scheme. Attacks like shilling or profile injection could potentially be successful against privacy-preserving model-based collaborative filtering techniques. Afterwards, the scheme's robustness was assessed by conducting a shilling attack using six different attack models. We utilize masked data-based profile injection attacks against a privacy-preserving ant colony clustering-based prediction algorithm. Subsequently, we conduct extensive experiments utilizing authentic data to assess its robustness against profile injection attacks. In addition, we evaluate the resilience of the ant colony clustering model-based PPCF against shilling attacks by comparing it to established PPCF memory and model-based prediction techniques. The empirical findings indicate that push attack models exerted a substantial influence on the predictions, whereas nuke attack models demonstrated limited efficacy.

Analysis and Evaluation of Frequent Pattern Mining Technique based on Landmark Window (랜드마크 윈도우 기반의 빈발 패턴 마이닝 기법의 분석 및 성능평가)

  • Pyun, Gwangbum;Yun, Unil
    • Journal of Internet Computing and Services
    • /
    • v.15 no.3
    • /
    • pp.101-107
    • /
    • 2014
  • With the development of online service, recent forms of databases have been changed from static database structures to dynamic stream database structures. Previous data mining techniques have been used as tools of decision making such as establishment of marketing strategies and DNA analyses. However, the capability to analyze real-time data more quickly is necessary in the recent interesting areas such as sensor network, robotics, and artificial intelligence. Landmark window-based frequent pattern mining, one of the stream mining approaches, performs mining operations with respect to parts of databases or each transaction of them, instead of all the data. In this paper, we analyze and evaluate the techniques of the well-known landmark window-based frequent pattern mining algorithms, called Lossy counting and hMiner. When Lossy counting mines frequent patterns from a set of new transactions, it performs union operations between the previous and current mining results. hMiner, which is a state-of-the-art algorithm based on the landmark window model, conducts mining operations whenever a new transaction occurs. Since hMiner extracts frequent patterns as soon as a new transaction is entered, we can obtain the latest mining results reflecting real-time information. For this reason, such algorithms are also called online mining approaches. We evaluate and compare the performance of the primitive algorithm, Lossy counting and the latest one, hMiner. As the criteria of our performance analysis, we first consider algorithms' total runtime and average processing time per transaction. In addition, to compare the efficiency of storage structures between them, their maximum memory usage is also evaluated. Lastly, we show how stably the two algorithms conduct their mining works with respect to the databases that feature gradually increasing items. With respect to the evaluation results of mining time and transaction processing, hMiner has higher speed than that of Lossy counting. Since hMiner stores candidate frequent patterns in a hash method, it can directly access candidate frequent patterns. Meanwhile, Lossy counting stores them in a lattice manner; thus, it has to search for multiple nodes in order to access the candidate frequent patterns. On the other hand, hMiner shows worse performance than that of Lossy counting in terms of maximum memory usage. hMiner should have all of the information for candidate frequent patterns to store them to hash's buckets, while Lossy counting stores them, reducing their information by using the lattice method. Since the storage of Lossy counting can share items concurrently included in multiple patterns, its memory usage is more efficient than that of hMiner. However, hMiner presents better efficiency than that of Lossy counting with respect to scalability evaluation due to the following reasons. If the number of items is increased, shared items are decreased in contrast; thereby, Lossy counting's memory efficiency is weakened. Furthermore, if the number of transactions becomes higher, its pruning effect becomes worse. From the experimental results, we can determine that the landmark window-based frequent pattern mining algorithms are suitable for real-time systems although they require a significant amount of memory. Hence, we need to improve their data structures more efficiently in order to utilize them additionally in resource-constrained environments such as WSN(Wireless sensor network).

An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels (호텔 산업의 서비스 품질 향상을 위한 토픽 마이닝 기반 분석 방법)

  • Moon, Hyun Sil;Sung, David;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.21-41
    • /
    • 2019
  • Thanks to the rapid development of information technologies, the data available on Internet have grown rapidly. In this era of big data, many studies have attempted to offer insights and express the effects of data analysis. In the tourism and hospitality industry, many firms and studies in the era of big data have paid attention to online reviews on social media because of their large influence over customers. As tourism is an information-intensive industry, the effect of these information networks on social media platforms is more remarkable compared to any other types of media. However, there are some limitations to the improvements in service quality that can be made based on opinions on social media platforms. Users on social media platforms represent their opinions as text, images, and so on. Raw data sets from these reviews are unstructured. Moreover, these data sets are too big to extract new information and hidden knowledge by human competences. To use them for business intelligence and analytics applications, proper big data techniques like Natural Language Processing and data mining techniques are needed. This study suggests an analytical approach to directly yield insights from these reviews to improve the service quality of hotels. Our proposed approach consists of topic mining to extract topics contained in the reviews and the decision tree modeling to explain the relationship between topics and ratings. Topic mining refers to a method for finding a group of words from a collection of documents that represents a document. Among several topic mining methods, we adopted the Latent Dirichlet Allocation algorithm, which is considered as the most universal algorithm. However, LDA is not enough to find insights that can improve service quality because it cannot find the relationship between topics and ratings. To overcome this limitation, we also use the Classification and Regression Tree method, which is a kind of decision tree technique. Through the CART method, we can find what topics are related to positive or negative ratings of a hotel and visualize the results. Therefore, this study aims to investigate the representation of an analytical approach for the improvement of hotel service quality from unstructured review data sets. Through experiments for four hotels in Hong Kong, we can find the strengths and weaknesses of services for each hotel and suggest improvements to aid in customer satisfaction. Especially from positive reviews, we find what these hotels should maintain for service quality. For example, compared with the other hotels, a hotel has a good location and room condition which are extracted from positive reviews for it. In contrast, we also find what they should modify in their services from negative reviews. For example, a hotel should improve room condition related to soundproof. These results mean that our approach is useful in finding some insights for the service quality of hotels. That is, from the enormous size of review data, our approach can provide practical suggestions for hotel managers to improve their service quality. In the past, studies for improving service quality relied on surveys or interviews of customers. However, these methods are often costly and time consuming and the results may be biased by biased sampling or untrustworthy answers. The proposed approach directly obtains honest feedback from customers' online reviews and draws some insights through a type of big data analysis. So it will be a more useful tool to overcome the limitations of surveys or interviews. Moreover, our approach easily obtains the service quality information of other hotels or services in the tourism industry because it needs only open online reviews and ratings as input data. Furthermore, the performance of our approach will be better if other structured and unstructured data sources are added.

A Study on the Visualization of Data in Virtual Space utilizing Realistic Exhibition Contents - Focusing on the application of the Tamed Cloud clustering algorithm in 70mK project (전시콘텐츠에 구현된 가상공간 내 데이터 시각화 연구 - 70mK의 Tamed Cloud 군집형 알고리즘 적용을 중심으로)

  • Sungmin Kang;Daniel H. Byun
    • Trans-
    • /
    • v.15
    • /
    • pp.1-24
    • /
    • 2023
  • This study examines the application of data visualization technology using a clustered data algorithm called 'Tamed Cloud' to virtual spaces and seeks the possibility of implementing it in various types of realistic exhibition contents. To this end, we first attempt to classify virtual reality (VR) exhibition contents starting with COVID-19, and summarize technologies applied. Also, various realistic exhibition contents provide visitors with an opportunity to appreciate the artworks through online and virtual exhibitions. In this trend, virtual reality and augmented reality (AR) technologies have been introduced, allowing visitors to enjoy the artwork more immersively, and the possibility of realistic exhibition content with interaction between the artwork and the user is also being demonstrated. Based on this background, this study examines the history of exhibition contents by dividing them before and after the advent of virtual reality technology, and examines how the clustered algorithm technology called Tamed Cloud was applied to virtual space and implemented as a realistic exhibition content in <70mK> project. By synthesizing all of this, we propose a convergence method of data visualization, virtual reality, and realistic content, and propose it as a new alternative to realistic exhibition content in virtual space.

Automatic Recommendation of (IP)TV programs based on A Rank Model using Collaborative Filtering (협업 필터링을 이용한 순위 정렬 모델 기반 (IP)TV 프로그램 자동 추천)

  • Kim, Eun-Hui;Pyo, Shin-Jee;Kim, Mun-Churl
    • Journal of Broadcast Engineering
    • /
    • v.14 no.2
    • /
    • pp.238-252
    • /
    • 2009
  • Due to the rapid increase of available contents via the convergence of broadcasting and internet, the efficient access to personally preferred contents has become an important issue. In this paper, for recommendation scheme for TV programs using a collaborative filtering technique is studied. For recommendation of user preferred TV programs, our proposed recommendation scheme consists of offline and online computation. About offline computation, we propose reasoning implicitly each user's preference in TV programs in terms of program contents, genres and channels, and propose clustering users based on each user's preferences in terms of genres and channels by dynamic fuzzy clustering method. After an active user logs in, to recommend TV programs to the user with high accuracy, the online computation includes pulling similar users to an active user by similarity measure based on the standard preference list of active user and filtering-out of the watched TV programs of the similar users, which do not exist in EPG and ranking of the remaining TV programs by proposed rank model. Especially, in this paper, the BM (Best Match) algorithm is extended to make the recommended TV programs be ranked by taking into account user's preferences. The experimental results show that the proposed scheme with the extended BM model yields 62.1% of prediction accuracy in top five recommendations for the TV watching history of 2,441 people.

Improvement of a Product Recommendation Model using Customers' Search Patterns and Product Details

  • Lee, Yunju;Lee, Jaejun;Ahn, Hyunchul
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.1
    • /
    • pp.265-274
    • /
    • 2021
  • In this paper, we propose a novel recommendation model based on Doc2vec using search keywords and product details. Until now, a lot of prior studies on recommender systems have proposed collaborative filtering (CF) as the main algorithm for recommendation, which uses only structured input data such as customers' purchase history or ratings. However, the use of unstructured data like online customer review in CF may lead to better recommendation. Under this background, we propose to use search keyword data and product detail information, which are seldom used in previous studies, for product recommendation. The proposed model makes recommendation by using CF which simultaneously considers ratings, search keywords and detailed information of the products purchased by customers. To extract quantitative patterns from these unstructured data, Doc2vec is applied. As a result of the experiment, the proposed model was found to outperform the conventional recommendation model. In addition, it was confirmed that search keywords and product details had a significant effect on recommendation. This study has academic significance in that it tries to apply the customers' online behavior information to the recommendation system and that it mitigates the cold start problem, which is one of the critical limitations of CF.

Non-face-to-face online home training application study using deep learning-based image processing technique and standard exercise program (딥러닝 기반 영상처리 기법 및 표준 운동 프로그램을 활용한 비대면 온라인 홈트레이닝 어플리케이션 연구)

  • Shin, Youn-ji;Lee, Hyun-ju;Kim, Jun-hee;Kwon, Da-young;Lee, Seon-ae;Choo, Yun-jin;Park, Ji-hye;Jung, Ja-hyun;Lee, Hyoung-suk;Kim, Joon-ho
    • The Journal of the Convergence on Culture Technology
    • /
    • v.7 no.3
    • /
    • pp.577-582
    • /
    • 2021
  • Recently, with the development of AR, VR, and smart device technologies, the demand for services based on non-face-to-face environments is also increasing in the fitness industry. The non-face-to-face online home training service has the advantage of not being limited by time and place compared to the existing offline service. However, there are disadvantages including the absence of exercise equipment, difficulty in measuring the amount of exercise and chekcing whether the user maintains an accurate exercise posture or not. In this study, we develop a standard exercise program that can compensate for these shortcomings and propose a new non-face-to-face home training application by using a deep learning-based body posture estimation image processing algorithm. This application allows the user to directly watch and follow the trainer of the standard exercise program video, correct the user's own posture, and perform an accurate exercise. Furthermore, if the results of this study are customized according to their purpose, it will be possible to apply them to performances, films, club activities, and conferences

A Study on Biomass Estimation Technique of Invertebrate Grazers Using Multi-object Tracking Model Based on Deep Learning (딥러닝 기반 다중 객체 추적 모델을 활용한 조식성 무척추동물 현존량 추정 기법 연구)

  • Bak, Suho;Kim, Heung-Min;Lee, Heeone;Han, Jeong-Ik;Kim, Tak-Young;Lim, Jae-Young;Jang, Seon Woong
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.3
    • /
    • pp.237-250
    • /
    • 2022
  • In this study, we propose a method to estimate the biomass of invertebrate grazers from the videos with underwater drones by using a multi-object tracking model based on deep learning. In order to detect invertebrate grazers by classes, we used YOLOv5 (You Only Look Once version 5). For biomass estimation we used DeepSORT (Deep Simple Online and real-time tracking). The performance of each model was evaluated on a workstation with a GPU accelerator. YOLOv5 averaged 0.9 or more mean Average Precision (mAP), and we confirmed it shows about 59 fps at 4 k resolution when using YOLOv5s model and DeepSORT algorithm. Applying the proposed method in the field, there was a tendency to be overestimated by about 28%, but it was confirmed that the level of error was low compared to the biomass estimation using object detection model only. A follow-up study is needed to improve the accuracy for the cases where frame images go out of focus continuously or underwater drones turn rapidly. However,should these issues be improved, it can be utilized in the production of decision support data in the field of invertebrate grazers control and monitoring in the future.

Implementation of CNN-based Classification Training Model for Unstructured Fashion Image Retrieval using Preprocessing with MASK R-CNN (비정형 패션 이미지 검색을 위한 MASK R-CNN 선형처리 기반 CNN 분류 학습모델 구현)

  • Seunga, Cho;Hayoung, Lee;Hyelim, Jang;Kyuri, Kim;Hyeon-Ji, Lee;Bong-Ki, Son;Jaeho, Lee
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.27 no.6
    • /
    • pp.13-23
    • /
    • 2022
  • In this paper, we propose a detailed component image classification algorithm by fashion item for unstructured data retrieval in the fashion field. Due to the COVID-19 environment, AI-based online shopping malls are increasing recently. However, there is a limit to accurate unstructured data search with existing keyword search and personalized style recommendations based on user surfing behavior. In this study, pre-processing using Mask R-CNN was conducted using images crawled from online shopping sites and then classified components for each fashion item through CNN. We obtain the accuaracy for collar of the shirt's as 93.28%, the pattern of the shirt as 98.10%, the 3 classese fit of the jeans as 91.73%, And, we further obtained one for the 4 classes fit of jeans as 81.59% and the color of the jeans as 93.91%. At the results for the decorated items, we also obtained the accuract of the washing of the jeans as 91.20% and the demage of jeans accuaracy as 92.96%.