• Title/Summary/Keyword: rank-based

Ranking Quality Evaluation of PageRank Variations (PageRank 변형 알고리즘들 간의 순위 품질 평가)

  • Pham, Minh-Duc;Heo, Jun-Seok;Lee, Jeong-Hoon;Whang, Kyu-Young
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.5 / pp.14-28 / 2009
  • The PageRank algorithm is an important component for ranking Web pages in Google and other search engines. While many improvements to the original PageRank algorithm have been proposed, it is unclear which variations (and which combinations of them) provide the "best" ranked results. In this paper, we evaluate the ranking quality of the well-known variations of the original PageRank algorithm and their combinations. To do this, we first classify the variations into link-based approaches, which exploit the link structure of the Web, and knowledge-based approaches, which exploit the semantics of the Web. We then propose algorithms that combine the ranking algorithms of these two approaches and implement both the variations and their combinations. For our evaluation, we perform extensive experiments using a real data set of one million Web pages. Through the experiments, we identify the algorithms, among the variations and their combinations, that provide the best ranked results.

Improved PageRank Algorithm Using Similarity Information of Documents (문서간의 유사도를 이용한 개선된 PageRank 알고리즘)

  • 이경희;김민구;박승규
    • Proceedings of the Korean Information Science Society Conference / 2003.10a / pp.169-171 / 2003
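  • Web search methods fall broadly into text-based and link-based techniques. This paper studies the PageRank algorithm, one of the link-based techniques. PageRank computes the importance of each page as a numerical value. However, because the algorithm assigns a constant value to the probability of following a link from one page to another, the value of every page is computed uniformly, which we judged to be a problem for the search effectiveness of individual pages. To address this, we measure the similarity between pages and assign different values to the damping factor, the probability of following a link, according to that similarity, thereby improving search effectiveness. We implemented and demonstrated this through two kinds of experiments.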
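
The baseline that this abstract criticizes, PageRank with one constant damping factor for every link, can be sketched as a short power iteration. This is a minimal illustration and not the authors' implementation: the function name `pagerank`, the value 0.85, the iteration count, and the four-page `toy_web` graph are all assumptions made for the example. The paper's similarity-dependent damping factor is not reproduced here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping is the (constant) probability of following a link;
    0.85 is an assumed, commonly quoted default.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Every page passes its rank, scaled by the damping factor,
        # in equal shares to the pages it links to.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)
        rank = new_rank
    return rank


# Hypothetical four-page web: D links to C, but nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_web)
```

In this toy graph page C collects links from A, B, and D and ends up with the highest score, while D, which has no incoming links, keeps only the uniform share `(1 - damping) / n`.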

Document Summarization Considering Entailment Relation between Sentences (문장 수반 관계를 고려한 문서 요약)

  • Kwon, Youngdae;Kim, Noo-ri;Lee, Jee-Hyong
    • Journal of KIISE / v.44 no.2 / pp.179-185 / 2017
  • Document summarization aims to generate a summary that is consistent and contains the most closely related sentences in a document. In this study, we implemented a document summarization method that extracts highly related sentences from a whole document by considering both the similarities and the entailment relations between sentences. Accordingly, we proposed a new algorithm, TextRank-NLI, which combines a Recurrent Neural Network based Natural Language Inference model with a graph-based ranking algorithm used in the single-document extractive summarization task. To evaluate the performance of the new algorithm, we conducted experiments using the same datasets as those used for the TextRank algorithm. The results indicated that TextRank-NLI showed a 2.3% improvement in performance compared to TextRank.

Improvement of topic modeling and case analysis through convergence of Bertopic and TextRank (버토픽과 텍스트랭크의 융합을 통한 토픽모델링의 개선 및 사례 분석)

  • Kim, Keun Hyung;Kang, Jae Jung
    • The Journal of Information Systems / v.33 no.3 / pp.105-121 / 2024
  • Purpose: The purpose of this paper is to develop a method that improves topic representation by incorporating the TextRank technique into Bertopic-based topic modeling, together with an additional indicator for determining the optimal number of topics. Design/methodology/approach: We propose a method that uses the TextRank technique to extract important documents from the documents assigned to each topic of a topic model, and that calculates secondary diversity and generates topic representations based on the results. First, we integrate the TextRank algorithm into the Bertopic-based topic modeling process to set local secondary labels for each topic. The secondary labels of each topic are derived through extractive summarization based on the TextRank algorithm. Second, we improve the accuracy of selecting the optimal number of topics by calculating the secondary diversity index from the extractive summary results of each topic. Third, we improve efficiency by using ChatGPT when deriving the labels of each topic. Findings: A case analysis and evaluation using the proposed method confirmed that topic representation based on TextRank results generated more accurate topic labels and that the secondary diversity index was a more effective index for determining the optimal number of topics.

Proposal of keyword extraction method based on morphological analysis and PageRank in Tweeter (트위터에서 형태소 분석과 PageRank 기반 화제단어 추출 방법 제안)

  • Lee, Won-Hyung;Cho, Sung-Il;Kim, Dong-Hoi
    • Journal of Digital Contents Society / v.19 no.1 / pp.157-163 / 2018
  • People who use social networking services (SNS) publish diverse ideas there every day, so the data posted on SNS contains many people's thoughts and opinions. In particular, the popular keywords served on Twitter are produced by counting frequently appearing words in user posts and ranking them. However, because this method simply counts repeated words, it is sensitive to unnecessary data. The proposed method determines the ranking based on the topic of the word using the relationship diagram between words, so that it is less influenced by unnecessary data and can stably extract the main words. We compare the proposed scheme, which is based on morphological analysis and PageRank, with the existing scheme, which is based on the number of appearances, in terms of the descending keyword rank and the ratio of meaningless keywords among the top 20 keywords. As a result, the proposed scheme and the existing scheme included 55% and 70% of meaningless keywords among the top 20 keywords, respectively, an improvement of about 15 percentage points for the proposed scheme.

Joseon's Badge System for Military Ranks and Practices (조선시대 무관의 길짐승흉배제도와 실제)

  • Lee, Eun-Joo
    • Journal of the Korean Society of Costume / v.58 no.5 / pp.102-117 / 2008
  • This study examines the badge system for military officials of the Joseon dynasty. In the 15th century, the badge system for military officials consisted of rank badges with a tiger and leopard for the first and second ranks and rank badges with a bear for the third rank. According to the code of laws, military officials in the Joseon dynasty were supposed to wear rank badges with four different kinds of animals. However, the badge system described in the code of laws does not always match the badges used in practice. Based on the literature, surviving badges, and the badges shown in portraits, six different kinds of animal badges are found. First, rank badges with a tiger and leopard were used until the late 16th century. Second, rank badges with a tiger are found in the period between the early 17th century and the late 18th century. Third, rank badges with Haechi are found in the early 17th century. Fourth, rank badges with lions can be found in remains from the mid-17th century and in the literature and a portrait of the late 18th century. Finally, rank badges with double leopards or a single leopard are found from a portrait dated to the late 18th century through the last period of the Joseon dynasty.

Hypothesis Testing for New Scores in a Linear Model

  • Park, Young-Hun
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.1007-1015 / 2003
  • In this paper we introduce a new score generating function for the rank dispersion function in a general linear model. Based on the new score function, we derive the null asymptotic theory of rank-based hypothesis testing in a linear model. In essence, we show that several rank test statistics, which are built primarily on our new score generating function and new dispersion function, are mainly distribution free and converge asymptotically to a chi-square distribution.

Recommendations Based on Listwise Learning-to-Rank by Incorporating Social Information

  • Fang, Chen;Zhang, Hengwei;Zhang, Ming;Wang, Jindong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.109-134 / 2018
  • Collaborative Filtering (CF) is widely used in the recommendation field and can be divided into rating-based CF and learning-to-rank based CF. Although many methods have been proposed based on these two kinds of CF, there is still room for improvement. Firstly, data sparsity remains a big challenge for CF algorithms. Secondly, malicious ratings given by some illegitimate users may affect recommendation accuracy. Existing CF algorithms seldom take both observations into consideration. In this paper, we propose a recommendation method based on listwise learning-to-rank that incorporates users' social information. By taking both the ratings and the order of items into consideration, the Plackett-Luce model is used to find similar users more accurately. To alleviate the data sparsity problem, an improved matrix factorization model that integrates the influence of similar users is proposed to predict ratings. On the basis of exploring the trust relationships between users according to their social information, a listwise learning-to-rank algorithm is proposed to learn an optimal ranking model, which can output a recommendation list that is more consistent with user preferences. Comprehensive experiments conducted on two public real-world datasets show that our approach not only achieves high recommendation accuracy in a relatively short runtime but also reduces the impact of malicious ratings.
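
For readers unfamiliar with the Plackett-Luce model named in this abstract, the following minimal sketch computes the standard Plackett-Luce probability of one complete ordering of items: items are picked one at a time, each with probability proportional to its positive "worth" among the items still remaining. How the paper derives worths from ratings and uses the model to find similar users is not stated in the abstract, so the function name and the worth values below are purely illustrative assumptions.

```python
from itertools import permutations


def plackett_luce_probability(ordering, worth):
    """Probability of observing `ordering` (best item first).

    worth maps each item to a positive score. At every position the
    next item is chosen with probability worth[item] divided by the
    total worth of the items not yet placed.
    """
    remaining = sum(worth[item] for item in ordering)
    prob = 1.0
    for item in ordering:
        prob *= worth[item] / remaining
        remaining -= worth[item]
    return prob


# Hypothetical worths for three items.
worth = {"a": 3.0, "b": 2.0, "c": 1.0}
p_abc = plackett_luce_probability(["a", "b", "c"], worth)  # (3/6) * (2/3) * (1/1)
total = sum(plackett_luce_probability(list(order), worth)
            for order in permutations(worth))
```

Here `p_abc` works out to 1/3, and the probabilities of all six possible orderings sum to 1, which is a quick sanity check that the function defines a proper distribution over rankings.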

Rank-based Control of Mutation Probability for Genetic Algorithms

  • Jung, Sung-Hoon
    • International Journal of Fuzzy Logic and Intelligent Systems / v.10 no.2 / pp.146-151 / 2010
  • This paper proposes a rank-based method for controlling mutation probability to improve the performance of genetic algorithms (GAs). To perform well, GAs should avoid premature convergence and, when it does occur, should be able to escape it easily without destroying good individuals. For this, it is important both to maintain the diversity of individuals and to preserve good individuals. If a method for maintaining diversity is not carefully devised, however, good individuals are destroyed as well. We therefore need a method that maintains diversity and preserves good individuals at the same time. To achieve these two objectives, we introduce a rank-based control method for mutation probability. We assign high mutation probabilities to low-ranked individuals, which maintains diversity and guards against premature convergence, and low mutation probabilities to high-ranked individuals, which avoids destroying good individuals. We tested our method on four typical function optimization problems to measure its performance. Extensive experiments showed that the proposed rank-based control method could accelerate GAs considerably.
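
The idea in this abstract, low mutation probability for highly ranked individuals and high mutation probability for lowly ranked ones, can be illustrated with a short sketch. The abstract does not give the paper's actual mapping from rank to probability, so the linear interpolation used here, the bounds `p_min` and `p_max`, and the function name are assumptions chosen only to make the idea concrete.

```python
def rank_based_mutation_probs(fitnesses, p_min=0.01, p_max=0.5):
    """Assign each individual a mutation probability from its rank.

    Higher fitness is assumed to be better. The best individual gets
    p_min, the worst gets p_max, and the rest are spaced linearly in
    between (an assumed mapping, not the paper's formula).
    """
    n = len(fitnesses)
    # Indices of individuals sorted from best to worst fitness.
    order = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
    probs = [0.0] * n
    for rank, idx in enumerate(order):
        frac = rank / (n - 1) if n > 1 else 0.0
        probs[idx] = p_min + (p_max - p_min) * frac
    return probs


# Hypothetical population of three individuals.
probs = rank_based_mutation_probs([10.0, 2.0, 7.0])
```

With these example fitness values, the fittest individual (index 0) receives the lowest probability, the weakest (index 1) the highest, and the middle one a value halfway between the two bounds.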

FolkRank++: An Optimization of FolkRank Tag Recommendation Algorithm Integrating User and Item Information

  • Zhao, Jianli;Zhang, Qinzhi;Sun, Qiuxia;Huo, Huan;Xiao, Yu;Gong, Maoguo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.1 / pp.1-19 / 2021
  • The graph-based tag recommendation algorithm FolkRank can effectively utilize the relationships between three kinds of entities, namely users, items, and tags, and thereby achieve better tag recommendation performance. However, FolkRank does not consider the internal user-user, item-item, and tag-tag relationships. As a result, FolkRank fails to effectively map tagging behavior that involves user neighbors and item neighbors onto a tripartite graph. For item-item relationships, we can find items that are very similar to the target item even though the target item may not have a strong connection to these similar items in FolkRank's user-item-tag graph. Hence this paper proposes an improved FolkRank algorithm named FolkRank++, which fully considers the user-user and item-item internal relationships in tag recommendation by adding correlation information between users or items. On top of the traditional FolkRank algorithm, an initial weight is also given to the neighbors of the target user and the target item to supply the user-user and item-item relationships. This work is carried out in two steps: (1) finding items similar to the target item according to their attribute information, and obtaining users similar to the target user according to the users' historical item-tagging behavior; (2) calculating the weighted degree of items and users to evaluate their importance, then assigning initial weights to similar items and users. Experimental results show that this method achieves better recommendation performance.