• Title/Summary/Keyword: 인기 예측

Search Result 76, Processing Time 0.021 seconds

A Model to Predict Popularity of Internet Posts on Internet Forum Sites (인터넷 토론 게시판의 게시물 인기도 예측 모델)

  • Lee, Yun-Jung;Jung, In-Jun;Woo, Gyun
    • The KIPS Transactions:PartD
    • /
    • v.19D no.1
    • /
    • pp.113-120
    • /
    • 2012
  • Today, Internet users can easily create and share the digital contents with others through various online content sharing services such as YouTube. So, many portal sites are flooded with lots of user created contents (UCC) in various media such as texts and videos. Estimating popularity of UCC is a crucial concern to both users and the site administrators. This paper proposes a method to predict the popularity of Internet articles, a kind of UCC, using the dynamics of the online contents themselves. To analyze the dynamics, we regarded the access counts of Internet posts as the popularity of them and analyzed the variation of the access counts. We derived a model to predict the popularity of a post represented by the time series of access counts, which is based on an exponential function. According to the experimental results, the difference between the actual access counts and the predicted ones is not more than 10 for 20,532 posts, which cover about 90.7% of the test set.

Predicting the Popularity of Post Articles with Virtual Temperature in Web Bulletin (웹게시판에서 가상온도를 이용한 게시글의 인기 예측)

  • Kim, Su-Do;Kim, So-Ra;Cho, Hwan-Gue
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.10
    • /
    • pp.19-29
    • /
    • 2011
  • A Blog provides commentary, news, or content on a particular subject. The important part of many blogs is interactive format. Sometimes, there is a heated debate on a topic and any article becomes a political or sociological issue. In this paper, we proposed a method to predict the popularity of an article in advance. First, we used hit count as a factor to predict the popularity of an article. We defined the saturation point and derived a model to predict the hit count of the saturation point by a correlation coefficient of the early hit count and hit count of the saturation point. Finally, we predicted the virtual temperature of an article using 4 types(explosive, hot, warm, cold). We can predict the virtual temperature of Internet discussion articles using the hit count of the saturation point with more than 70% accuracy, exploiting only the first 30 minutes' hit count. In the hot, warm, and cold categories, we can predict more than 86% accuracy from 30 minutes' hit count and more than 90% accuracy from 70 minutes' hit count.

Prediction Model for Popularity of Online Articles based on Analysis of Hit Count (온라인 게시글의 조회수 분석을 통한 인기도 예측)

  • Kim, Su-Do;Cho, Hwan-Gue
    • The Journal of the Korea Contents Association
    • /
    • v.12 no.4
    • /
    • pp.40-51
    • /
    • 2012
  • Online discussion bulletin in Korea is not only a specific place where user exchange opinions but also a public sphere through which users discuss and form public opinion. Sometimes, there is a heated debate on a topic and any article becomes a political or sociological issue. In this paper, we propose how to analyze the popularity of articles by collecting the information of articles obtained from two well-known discussion forums such as AGORA and SEOPRISE. And we propose a prediction model for the article popularity by applying the characteristics of subject articles. Our experiment shown that the popularity of 87.52% articles have been saturated within a day after the submission in AGORA, but the popularity of 39% articles is growing after 4 days passed in SEOPRISE. And we observed that there is a low correlation between the period of popularity and the hit count. The steady increase of the hit count of an article does not necessarily imply the final hit count of the article at the saturation point is so high. In this paper, we newly propose a new prediction model called 'baseline'. We evaluated the predictability for popular articles using three models (SVM, similar matching and baseline). Through the results of performance evaluation, we observed that SVM model is the best in F-measure and precision, but baseline is the best in running time.

'Hot Search Keyword' Rank-Change Prediction (인기 검색어의 순위 변화 예측)

  • Kim, Dohyeong;Kang, Byeong Ho;Lee, Sungyoung
    • Journal of KIISE
    • /
    • v.44 no.8
    • /
    • pp.782-790
    • /
    • 2017
  • The service, 'Hot Search Keywords', provides a list of the most hot search terms of different web services such as Naver or Daum. The service, bases the changes in rank of a specific search keyword on changes in its users' interest. This paper introduces a temporal modelling framework for predicting the rank change of hot search keywords using past rank data and machine learning. Past rank data shows that more than 70% of hot search keywords tend to disappear and reappear later. The authors processed missing rank value, using deletion, dummy variables, mean substitution, and expectation maximization. It is however crucial to calculate the optimal window size of the past rank data. We proposed an optimal window size selection approach based on the minimum amount of time a topic within the same or a differing context disappeared. The experiments were conducted with four different machine-learning techniques using the Naver, Daum, and Nate 'Hot Search Keywords' datasets, which were collected for 2 years.

A Machine Learning-based Popularity Prediction Model for YouTube Mukbang Content (머신러닝 기반의 유튜브 먹방 콘텐츠 인기 예측 모델)

  • Beomgeun Seo;Hanjun Lee
    • Journal of Internet Computing and Services
    • /
    • v.24 no.6
    • /
    • pp.49-55
    • /
    • 2023
  • In this study, models for predicting the popularity of mukbang content on YouTube were proposed, and factors influencing the popularity of mukbang content were identified through post-analysis. To accomplish this, information on 22,223 pieces of content was collected from top mukbang channels in terms of subscribers using APIs and Pretty Scale. Machine learning algorithms such as Random Forest, XGBoost, and LGBM were used to build models for predicting views and likes. The results of SHAP analysis showed that subscriber count had the most significant impact on view prediction models, while the attractiveness of a creator emerged as the most important variable in the likes prediction model. This confirmed that the precursor factors for content views and likes reactions differ. This study holds academic significance in analyzing a large amount of online content and conducting empirical analysis. It also has practical significance as it informs mukbang creators about viewer content consumption trends and provides guidance for producing high-quality, marketable content.

Prediction System of Facebook's popular post using Opinion Mining and Machine Learning (오피니언 마이닝과 머신러닝을 이용한 페이스북 인기 게시물 예측 시스템)

  • An, Hyeon-woo;Moon, Nammee
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2017.11a
    • /
    • pp.70-73
    • /
    • 2017
  • 페이스북 SNS 플랫폼에서 제공하는 데이터 수집 프로토콜을 이용해 콘텐츠들의 인기 점수와 사용자 의견들을 수집하고 수집된 정보를 가공하여 기계학습을 진행한다. 오피니언 데이터를 학습함으로 인해 인간의 관점을 모방하게 되며 결과적으로 콘텐츠의 질을 판단하는 요소로써 작용하도록 한다. 데이터의 수집은 페이스북 측에서 제공하는 Graph API 와 Python 을 이용하여 진행한다. Graph API 는 HTTP GET 방식의 프로토콜을 이용하여 요청 하고 JSON 형식으로 결과를 반환한다. 학습은 Multiple Linear Regression 과 Gradient Descent Algorithm(GDA)을 사용하여 진행한다. 이후 학습이 진행된 프로그램에 사용자 의견 데이터를 건네주면 최종인기 점수를 예측하는 시스템을 설명한다.

  • PDF

NOD Caching Strategy using User-Preference Pattern for Time-Window (구간별 사용자 요구 패턴을 이용한 NOD에서의 캐싱 방법)

  • 최태욱;박용운;김영주;정기동
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 1998.04a
    • /
    • pp.71.1-75
    • /
    • 1998
  • NOD 데이터는 VOD 데이터에 비해서 life cycle이 짧다. 그리고 사용자의 접근성이 높으며, 접근패턴도 시간에 따라 달라질 수 있다. VOD 데이터와 같이 NOD 뉴스기사의 경우 특정 기사들에 집중적으로 접근된다. 그리고 이러한 인기 있는 기사들은 시간대에 따라 변할 수 있다. 본 논문에서는 이러한 인기도의 변화를 예측하기 위해서 시계열분석방법중의 하나인 지수평활법(exponenital smoothing method)을 사용한다. 시간대별 타임윈도우로 나누고 이전의 윈도우들의 접근패턴을 분석하여 다음 접근을 예측한다. 그리고 이 예측값을 이용해서 캐시정책을 새운다. 즉 예측값이 높은 기사순으로 캐시에 배치하는 것이다. 실시간 멀티미디어데이터의 경우 데이터의 방대함으로 연산의 오버헤드가 크다. 따라서 정적인 캐싱전략을 사용하는데, 하나의 윈도우동안 재배치하는 한번으로 한다는 것이다. 전통적인 block 단위 캐싱은 멀티미디어데이터에 적합하지 않다. 따라서 기사단위의 캐시구조를 제안한다. 사용자는 기사단위로 요청을 하기 때문에 재사용을 위해서는 기사단위로 캐시되야 한다.

  • PDF

A Two-Phase On-Device Analysis for Gender Prediction of Mobile Users Using Discriminative and Popular Wordsets (모바일 사용자의 성별 예측을 위한 식별 및 인기 단어 집합 기반 2단계 기기 내 분석)

  • Choi, Yerim;Park, Kyuyon;Kim, Solee;Park, Jonghun
    • The Journal of Society for e-Business Studies
    • /
    • v.21 no.1
    • /
    • pp.65-77
    • /
    • 2016
  • As respecting one's privacy becomes an important issue in mobile device data analysis, on-device analysis is getting attention, in which the data analysis is conducted inside a mobile device without sending data from the device to outside. One possible application of the on-device analysis is gender prediction using text data in mobile devices, such as text messages, search keyword, website bookmarks, and contact, which are highly private, and the limited computing power of mobile devices can be addressed by utilizing the word comparison method, where words are selected beforehand and delivered to a mobile device of a user to determine the user's gender by matching mobile text data and the selected words. Moreover, it is known that performing prediction after filtering instances using definite evidences increases accuracy and reduces computational complexity. In this regard, we propose a two-phase approach to on-device gender prediction, where both discriminability and popularity of a word are sequentially considered. The proposed method performs predictions using a few highly discriminative words for all instances and popular words for unclassified instances from the previous prediction. From the experiments conducted on real-world dataset, the proposed method outperformed the compared methods.

Clustering of Web Objects with Similar Popularity Trends (유사한 인기도 추세를 갖는 웹 객체들의 클러스터링)

  • Loh, Woong-Kee
    • The KIPS Transactions:PartD
    • /
    • v.15D no.4
    • /
    • pp.485-494
    • /
    • 2008
  • Huge amounts of various web items such as keywords, images, and web pages are being made widely available on the Web. The popularities of such web items continuously change over time, and mining temporal patterns in popularities of web items is an important problem that is useful for several web applications. For example, the temporal patterns in popularities of search keywords help web search enterprises predict future popular keywords, enabling them to make price decisions when marketing search keywords to advertisers. However, presence of millions of web items makes it difficult to scale up previous techniques for this problem. This paper proposes an efficient method for mining temporal patterns in popularities of web items. We treat the popularities of web items as time-series, and propose gapmeasure to quantify the similarity between the popularities of two web items. To reduce the computation overhead for this measure, an efficient method using the Fast Fourier Transform (FFT) is presented. We assume that the popularities of web items are not necessarily following any probabilistic distribution or periodic. For finding clusters of web items with similar popularity trends, we propose to use a density-based clustering algorithm based on the gap measure. Our experiments using the popularity trends of search keywords obtained from the Google Trends web site illustrate the scalability and usefulness of the proposed approach in real-world applications.