• Title/Summary/Keyword: LDA algorithm

Online Reviews Analysis for Prediction of Product Ratings based on Topic Modeling (토픽 모델링에 기반한 온라인 상품 평점 예측을 위한 온라인 사용 후기 분석)

  • Park, Sang Hyun;Moon, Hyun Sil;Kim, Jae Kyeong
    • Journal of Information Technology Services / v.16 no.3 / pp.113-125 / 2017
  • Customers are influenced by others' opinions when they make a purchase. Thanks to technological advances, people share their experiences as reviews or ratings through online and social network services. However, although ratings are intuitive for other customers, many reviews contain only text without a rating. In addition, the sheer volume of reviews means that customers and companies cannot read all of them, so a product is hard to evaluate without ratings. Therefore, in this study, we propose a methodology to predict a product's ratings from its reviews. We first estimate the topic-review matrix using Latent Dirichlet Allocation, a technique widely used in topic modeling. Next, we predict ratings from the topic-review matrix using an artificial neural network model based on the backpropagation algorithm. Experiments with actual reviews show that our methodology can predict ratings from customers' reviews, and that it performs better with reviews that include certain opinions. As a result, our study can be used by customers and companies that want an accurate picture of a product through its ratings. Moreover, we hope that it leads to future studies that combine machine learning and topic modeling.
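The second step of the pipeline described above (topic proportions in, rating out) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the topic-review matrix is a made-up toy, the ratings are rescaled to [0, 1], and the network size, learning rate, and the name `train_rating_net` are all assumptions.

```python
import math
import random


def train_rating_net(features, targets, hidden=4, epochs=1000, lr=0.05, seed=0):
    """Fit a one-hidden-layer network with plain backpropagation (squared error)."""
    rng = random.Random(seed)
    n_in = len(features[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # tanh hidden layer followed by a linear output unit.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(features, targets):
            h, out = forward(x)
            err = out - y
            total += err * err
            # Backward pass: gradients of 0.5 * err**2 for each weight.
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        losses.append(total / len(features))

    return (lambda x: forward(x)[1]), losses


# Toy topic-review matrix (hypothetical): one row per review, one column per topic.
features = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1],
            [0.1, 0.1, 0.8], [0.2, 0.1, 0.7]]
star_ratings = [5, 4, 3, 1, 2]
targets = [(r - 1) / 4 for r in star_ratings]  # rescale 1..5 stars to 0..1

predict, losses = train_rating_net(features, targets)
```

A predicted value can be mapped back to stars with `1 + 4 * predict(x)`. In practice the rows of `features` would come from an LDA model fitted to the review texts rather than being written by hand.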

A Study on the User Perception in Fashion Design through Social Media Text-Mining (소셜미디어 텍스트마이닝을 통한 패션디자인 사용자 인식 조사)

  • An, Hyosun;Park, Minjung
    • Journal of the Korean Society of Clothing and Textiles / v.41 no.6 / pp.1060-1070 / 2017
  • This study explores methods for analyzing users' perception of fashion designs shown in social media using text-mining analysis. 'Men's stripe shirts' were selected as the subject, and related texts were collected mainly from blogs. Texts from 13,648 posts published from November 1st, 2015 to October 31st, 2016 were analyzed by applying the LDA algorithm and content analysis. As a result, the wearing status per season and the topics related to men's stripe shirts were derived. Across the entire period, the main topics discussed by users were pattern, customized suits, brands, coordination, and purchase information. By season, spring showed the sharing of information on coordinating daily looks or boyfriend looks, while in winter the information shared concerned shirts suitable for special occasions such as job interviews and stripe shirts that match suits. The results show that text-mining analysis can analyze context and provide a user-centered index that responds to demands newly raised by users amid rapid changes in fashion design trends.

Feature extraction based on DWT and GA for Gesture Recognition of EPIC Sensor Signals (EPIC 센서 신호의 제스처 인식을 위한 이산 웨이블릿 변환과 유전자 알고리즘 기반 특징 추출)

  • Ji, Sang-Hun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Kim, Young-Chul
    • Proceedings of the Korea Information Processing Society Conference / 2016.04a / pp.612-615 / 2016
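  • This paper proposes a gesture classification system that applies the discrete wavelet transform (DWT), linear discriminant analysis (LDA), and a support vector machine (SVM) to motion signals acquired through an EPIC (Electric Potential Integrated Circuit) sensor. The DWT is applied to the EPIC sensor signal to obtain the wavelet coefficients, namely the approximation coefficients and the detail coefficients, and feature parameters are then extracted from each set of wavelet coefficients. The feature parameters are the best-performing ones selected by a genetic algorithm (GA) from among 14 statistical feature extraction parameters. The feature parameters extracted from the wavelet coefficients are reduced in dimension by linear discriminant analysis and used to train the SVM and perform classification. In experiments classifying EPIC sensor signals for four gestures, the proposed method achieved a classification rate of 99.75%, higher than the 97% classification rate of an HMM on the raw signals.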
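The first stage described above, splitting a signal into approximation and detail coefficients and computing statistics on each, might look like the sketch below. The abstract does not name the wavelet family or the 14 statistics, so the Haar wavelet and the three features here (mean, standard deviation, energy) are assumptions chosen for brevity; `haar_dwt` and `stat_features` are hypothetical names.

```python
import math
import statistics


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def stat_features(coeffs):
    """A few simple statistics computed on one set of wavelet coefficients."""
    return {
        "mean": statistics.fmean(coeffs),
        "std": statistics.pstdev(coeffs),
        "energy": sum(c * c for c in coeffs),
    }


# Toy signal (hypothetical values, not EPIC sensor data).
signal = [1.0, 3.0, 2.0, 2.0, 5.0, 1.0, 0.0, 4.0]
approx, detail = haar_dwt(signal)
feature_vector = {"approx": stat_features(approx), "detail": stat_features(detail)}
```

Because the Haar transform is orthonormal, the energy of the approximation and detail coefficients together equals the energy of the input signal, which is a handy sanity check before passing features on to selection, dimension reduction, and classification.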

Gate Management System by Face Recognition using Smart Phone (스마트폰을 이용한 얼굴인식 출입관리 시스템)

  • Kwon, Ki-Hyeon;Lee, Hyung-Bong
    • Journal of the Korea Society of Computer and Information / v.16 no.11 / pp.9-15 / 2011
  • In this paper, we design and implement a gate management system based on face recognition using a smart phone. We investigate various algorithms for face recognition on smart phones. The first step in any face recognition system is face detection. We investigated algorithms such as color segmentation and template matching for face detection, and Eigenface and Fisherface for face recognition. The algorithms were first profiled in MATLAB and then implemented on an Android phone. While implementing the algorithms, we made a tradeoff between accuracy and computational complexity, mainly because the face recognition system runs on a smart phone with limited hardware capabilities.

Wireless Measurement based TFRC for QoS Provisioning over IEEE 802.11 (IEEE 802.11에서 멀티미디어 QoS 보장을 위한 무선 측정 기반 TFRC 기법)

  • Pyun Jae young
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.4B / pp.202-209 / 2005
  • In this paper, a dynamic TCP-friendly rate control (TFRC) scheme is proposed to adjust coding rates according to the channel characteristics of a wireless-to-wired network whose first hop is a wireless channel. To avoid throughput degradation of multimedia flows traveling through the wireless link, the proposed rate control system employs a new wireless loss differentiation algorithm (LDA) based on packet loss statistics. This method can produce TCP-friendly rates while sharing the backbone bandwidth with TCP flows over the wireless-to-wired network. Experimental results show that the proposed rate control system can eliminate the effect of wireless losses on TFRC flow control and substantially reduce the abrupt quality degradation of video streaming caused by unreliable wireless link status.
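To see why separating wireless losses from congestion losses matters for TFRC, the sketch below evaluates the standard TCP throughput equation from RFC 5348 at two loss event rates. It does not reproduce the paper's loss differentiation algorithm: the split between wireless and congestion losses is simply assumed, and the packet size, round-trip time, and loss rates are illustrative values.

```python
import math


def tfrc_rate(s, rtt, p, b=1, t_rto=None):
    """TCP throughput equation used by TFRC (RFC 5348), in bytes per second.

    s: segment size in bytes, rtt: round-trip time in seconds,
    p: loss event rate in (0, 1], b: packets acknowledged per ACK,
    t_rto: retransmission timeout (defaults to 4 * rtt, as RFC 5348 recommends).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("loss event rate p must be in (0, 1]")
    if t_rto is None:
        t_rto = 4.0 * rtt
    denom = (rtt * math.sqrt(2.0 * b * p / 3.0)
             + t_rto * (3.0 * math.sqrt(3.0 * b * p / 8.0)) * p * (1.0 + 32.0 * p * p))
    return s / denom


# Illustrative numbers: 5% of packets lost overall, of which an assumed
# classifier attributes 4 percentage points to the wireless hop.
all_losses = 0.05
congestion_only = 0.01

rate_naive = tfrc_rate(s=1460, rtt=0.1, p=all_losses)
rate_differentiated = tfrc_rate(s=1460, rtt=0.1, p=congestion_only)
```

With these inputs the rate computed from congestion losses alone is higher than the rate computed from all losses, which is the effect the abstract describes: wireless losses no longer force the sender to back off.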

Applying Topic Modeling and Similarity for Predicting Bug Severity in Cross Projects

  • Yang, Geunseok;Min, Kyeongsic;Lee, Jung-Won;Lee, Byungjeong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.3 / pp.1583-1598 / 2019
  • Recently, software has increased in complexity and been applied in various industrial fields. As a result, the presence of software bugs cannot be avoided. Various bug severity prediction methodologies have been proposed, but their performance needs to be further improved. In this study, we propose a novel technique for bug severity prediction in cross projects such as Eclipse, Mozilla, WireShark, and Xamarin by using topic modeling and similarity (i.e., KL-divergence). First, we construct topic models from bug repositories in cross projects using Latent Dirichlet Allocation (LDA). Then, we find topics in each project that contain the most numerous similar bug reports by using a new bug report. Next, we extract the bug reports belonging to the selected topics and input them to a Naïve Bayes Multinomial (NBM) algorithm. Finally, we predict the bug severity in the new bug report. In order to evaluate the performance of our approach and to verify the difference between cross projects and single project, we compare it with the Naïve Bayes Multinomial approach; the Lamkanfi methodology, which is a well-known bug severity prediction approach; and an emotional similarity-based bug severity prediction approach. Our approach exhibits a better performance than the compared methods.
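The similarity measure named above, KL-divergence, can be sketched as follows for discrete distributions such as word distributions of topics. How the paper applies it is not detailed in the abstract, so picking the topic with the smallest divergence from a new bug report is an assumption, and `kl_divergence`, `most_similar_topic`, and the toy distributions are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    eps guards against log(0) when q assigns zero mass to a word.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def most_similar_topic(report_dist, topic_dists):
    """Index of the topic whose distribution diverges least from the report."""
    return min(range(len(topic_dists)),
               key=lambda k: kl_divergence(report_dist, topic_dists[k]))


# Toy word distributions over a four-word vocabulary (made-up values).
topics = [
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.70],
    [0.25, 0.25, 0.25, 0.25],
]
new_report = [0.60, 0.20, 0.10, 0.10]
best = most_similar_topic(new_report, topics)
```

KL-divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ; a symmetric variant would be needed if the direction of comparison should not matter.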

Changes in the Perception of Second-hand Fashion Consumption in the Post-pandemic Era (포스트 팬데믹 시대의 중고 패션 소비 인식 변화)

  • Kim, Habin;Lee, Ha Kyung
    • Fashion & Textile Research Journal / v.24 no.1 / pp.66-80 / 2022
  • Even before the Covid-19 outbreak, the second-hand fashion market had been growing as the fashion industry strove towards sustainability. Its growth has also accelerated due to the economic contraction caused by the pandemic. The second-hand market has been studied steadily; however, the research remains insufficient relative to how diversified the market has become. Therefore, this study investigates changes in consumers' perception of the second-hand fashion market brought about by Covid-19. We collected text data with the keyword 'second-hand fashion' from various blogs and analyzed 24,000 posts from before and after the Covid-19 outbreak by applying the LDA algorithm for topic modeling and content analysis. Seven topics were derived for the period before the pandemic and nine for the period after. The results revealed that during the pandemic consumers recognized the practical value of sustainability in their daily lives more than they had before. Furthermore, they tried to minimize transaction anxiety by using diverse platforms with advanced technology. They also realized economic value by buying and selling sneakers in the popular sneaker resale market. The results could help in understanding the rapidly growing second-hand fashion market during Covid-19.

Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

  • Jeong, Young-Seob;Jin, Sou-Young;Choi, Ho-Jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.1 / pp.81-98 / 2013
  • Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Because the likelihood of these models is intractable, training any topic model requires an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model is still computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of the parameters of a topic model. In traditional approximation algorithms, every random variable is sampled or obtained over a single predefined burn-in period; our method is instead based on the observation that the random variable nodes in a topic model converge over different periods. During the iterative approximation process, the proposed method allows each random variable node to be terminated, or deactivated, once it has converged. Therefore, compared to the traditional approaches, in which every node is usually deactivated concurrently, the proposed method makes inference more efficient in terms of time and memory. We do not propose a new approximation algorithm, but a new process applicable to existing approximation algorithms. Through experiments, we show the time and memory efficiency of the method, and discuss the tradeoff between the efficiency of the approximation process and parameter consistency.

PIV System for the Flow Pattern Analysis of Artificial Organs: Applied to the In Vitro Test of Artificial Heart Valves

  • Lee, Dong-Hyeok;Seh, Soo-Won;An, Hyuk;Min, Byoung-Goo
    • Journal of Biomedical Engineering Research / v.15 no.4 / pp.489-497 / 1994
  • The most serious problems related to cardiovascular prostheses are thrombosis and hemolysis. It is known that the flow pattern of a cardiovascular prosthesis is highly correlated with thrombosis and hemolysis. Laser Doppler Anemometry (LDA) is the usual method for obtaining flow patterns, but it is difficult to operate and has a narrow measurement region. Particle Image Velocimetry (PIV) can solve these problems. Because the flow speed through the valve is too high for a CCD camera to capture particles, a high-speed camera (Hyspeed : Holland-Photonics) was used. The estimated maximum flow speed was 5 m/sec and the maximum trackable length was 0.5 cm, so the shutter speed was set to 1000 frames per sec. Several image processing techniques (blurring, segmentation, morphology, etc.) were used for preprocessing. A particle tracking algorithm and a 2-D interpolation technique, which were necessary for producing a gridded velocity profile, were applied in this PIV program. The Single-Pulse Multi-Frame particle tracking algorithm solves some problems of PIV: it eliminates particles that penetrate the sheeted plane and determines the direction of particle paths. A 1-D relaxation formula was modified to interpolate the 2-D field. The Parachute artificial heart valve, developed by Seoul National University, and the Bjork-Shiley valve were tested. For each valve, a different flow pattern, velocity profile, wall shear stress, and mean velocity were obtained.
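The frame-rate figure in the abstract follows from a one-line calculation: a particle moving at the maximum speed must travel no farther than the maximum trackable length between consecutive frames. The sketch below works through that arithmetic; the function name `min_frame_rate` is hypothetical.

```python
def min_frame_rate(max_speed_m_per_s, max_track_len_m):
    """Lowest frame rate (frames/s) that keeps per-frame displacement trackable."""
    if max_speed_m_per_s <= 0 or max_track_len_m <= 0:
        raise ValueError("speed and trackable length must be positive")
    # Displacement per frame = speed / frame_rate, which must not exceed
    # max_track_len, so frame_rate >= speed / max_track_len.
    return max_speed_m_per_s / max_track_len_m


# Values from the abstract: 5 m/sec maximum speed, 0.5 cm trackable length.
required_fps = min_frame_rate(5.0, 0.5 / 100.0)
```

With the abstract's values, 5 m/sec divided by 0.005 m gives 1000 frames per second, matching the reported setting.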

Analysis of Research Topics among Library, Archives and Museums using Topic Modeling (토픽 모델링을 활용한 도서관, 기록관, 박물관간의 연구 주제 분석)

  • Kim, Heesop;Kang, Bora
    • Journal of Korean Library and Information Science Society / v.50 no.4 / pp.339-358 / 2019
  • The purpose of this study is to understand research topics in order to establish a cooperative platform among libraries, archives, and museums, which share the common task of providing knowledge information in a broad sense. To this end, 637 bibliographic records on the three types of institutions were collected from the Web version of the Scopus database. From the collected records, 5,218 words were extracted with NetMiner V.4 and analyzed using topic modeling. The results are as follows. First, in the analysis of word frequency by tf-idf weight, 'Preservation' was the hottest topic. Second, topic modeling with the LDA (Latent Dirichlet Allocation) algorithm yielded 13 topic areas. Third, when the 13 topic areas were expressed as a network, repository construction was the central topic, and research topics such as cooperation among institutions, conservation environment for collections, system and policy discovery, life cycle of collections, exhibition of information resources, and information retrieval were closely related to it. Fourth, regarding trends in the 13 topic areas by year, research up to 1998 was limited to specific subjects such as system and policy discovery, information retrieval, and life cycle of collections, while studies on the other topic areas followed after that year.
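The tf-idf ranking mentioned in the first result can be illustrated with a small sketch. The exact weighting used by NetMiner is not stated in the abstract, so the variant here (raw term count times log(N / document frequency), summed over documents) is an assumption, and the three toy documents are invented for the example.

```python
import math
from collections import Counter


def tfidf_totals(docs):
    """Sum of tf-idf weights per term across a list of tokenized documents."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    totals = Counter()
    for doc in docs:
        for term, tf in Counter(doc).items():
            totals[term] += tf * math.log(n_docs / doc_freq[term])
    return totals


# Toy tokenized records (hypothetical, not the Scopus data).
docs = [
    ["preservation", "preservation", "library"],
    ["preservation", "archive", "museum"],
    ["library", "museum", "retrieval"],
]
totals = tfidf_totals(docs)
top_term = max(totals, key=totals.get)
```

Note that under this variant a term appearing in every document receives a weight of zero, so a high-ranking term is one that is frequent in some documents without being ubiquitous.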