• Title/Summary/Keyword: document ranking


A dynamic web document ranking system for ICT teachers (ICT 교사를 위한 다이나믹 웹문서 랭킹시스템)

  • Lee, Mi-Sun;Chun, Seok-Ju
    • 한국정보교육학회:학술대회논문집 / 2007.08a / pp.322-327 / 2007
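  • According to the ICT education guidelines revised in December 2005, the scientific elements of computing were introduced at the 'Understanding Information Processing' stage. The revision calls for teaching data structures, algorithms, and the basics of programming, but classroom teachers do not understand this content well and have considerable difficulty teaching it. This study searches for, collects, organizes, and classifies specific web documents that help in teaching the 'Understanding Information Processing' course and provides them to ICT teachers. It also presents the design of a dynamic ranking system that evaluates the usefulness of the web documents ICT teachers have consulted and links the highest-scoring documents at the top.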


Collaboration Document Ranking System for the Control of Subject dispersion (주제 분산의 억제를 위한 협업문서 생성제어 시스템)

  • 조성웅;원용관;이도헌;이귀상
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.163-165 / 2002
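  • With the growth of the Internet, multifunctional collaboration systems that go beyond simple co-browsing have become necessary. In this respect, the wiki, a web authoring tool, is an effective system for active information exchange among researchers. However, as the volume of information grows, multiple documents on the same topic are created, which disperses the power of information sharing. This paper proposes a collaborative document creation control system composed of a parser, a document classification system, and a similarity measurement system. By controlling the creation of collaborative documents, the system enables experts in each field to share information smoothly and create knowledge effectively.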


A Document Ranking Method by Document Clustering Using Bayesian SOM and Bootstrap (베이지안 SOM과 붓스트랩을 이용한 문서 군집화에 의한 문서 순위조정)

  • Choe, Jun-Hyeok;Jeon, Seong-Hae;Lee, Jeong-Hyeon
    • The Transactions of the Korea Information Processing Society / v.7 no.7 / pp.2108-2115 / 2000
  • Although conventional Boolean retrieval systems based on the vector space model can return results quickly, they cannot accurately reflect the user's retrieval purpose, including semantic information. Consequently, retrieval results often differ greatly from what users expect, forcing them to waste much time finding the expected documents among those retrieved. In this paper, we design a Bayesian SOM (Self-Organizing feature Map) that combines Bayesian statistical methods with the Kohonen network, a kind of unsupervised learning, and then classify documents in real time according to their semantic similarity to the user query. If statistical characteristics are difficult to observe because there are fewer than 30 documents to cluster, the number of documents must be increased to at least 50. Also, to give a high rank to the documents that are semantically most similar to the user query among the generalized clusters, we compute similarity using the Kohonen centroid of each document class and adjust the secondary rank according to that similarity.


An Innovative Approach of Bangla Text Summarization by Introducing Pronoun Replacement and Improved Sentence Ranking

  • Haque, Md. Majharul;Pervin, Suraiya;Begum, Zerina
    • Journal of Information Processing Systems / v.13 no.4 / pp.752-777 / 2017
  • This paper proposes an automatic method to summarize Bangla news documents. In the proposed approach, pronoun replacement is performed for the first time to minimize dangling pronouns in the summary. After pronoun replacement, sentences are ranked using term frequency, sentence frequency, numerical figures, and title words. If two sentences have at least 60% cosine similarity, the frequency of the larger sentence is increased and the smaller sentence is removed to eliminate redundancy. Moreover, the first sentence is always included in the summary if it contains any title word. In Bangla text, numerical figures can be presented in both words and digits, in a variety of forms. All these forms are identified to assess the importance of sentences. We use a rule-based system in this approach, together with a hidden Markov model and a Markov chain model. To derive the rules, we analyzed 3,000 Bangla news documents and studied several Bangla grammar books. A series of experiments was performed on 200 Bangla news documents and 600 summaries (3 summaries for each document). The evaluation results demonstrate the effectiveness of the proposed technique over the four latest methods.
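
The redundancy step in this abstract (dropping the smaller of two sentences whose cosine similarity is at least 60%) can be illustrated with a minimal sketch. This is not the authors' implementation: whitespace tokenization, raw term counts as the vector weights, and the function names `cosine_similarity` and `drop_redundant` are all assumptions made for illustration, and the frequency boost for the retained sentence is omitted.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two sentences using raw term counts (assumed weighting)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def drop_redundant(sentences, threshold=0.6):
    """Keep one sentence per near-duplicate group, preferring the longer one."""
    kept = []
    for s in sentences:
        for i, k in enumerate(kept):
            if cosine_similarity(s, k) >= threshold:
                # Near-duplicate found: retain whichever sentence has more tokens.
                if len(s.split()) > len(k.split()):
                    kept[i] = s
                break
        else:
            # No kept sentence is similar enough, so this one is new information.
            kept.append(s)
    return kept
```

Under these assumptions, `drop_redundant(["the cat sat on the mat", "the cat sat on the mat today", "stock prices fell sharply"])` keeps the longer of the first two sentences and the unrelated third one.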

A Study on Military Costumes of Hunryeondogam in the Mid and the Late Joseon (조선 중·후기 훈련도감(訓鍊都監)의 군사복식에 관한 연구)

  • Yum, Jung Ha;Cho, Woo Hyun
    • Journal of the Korean Society of Costume / v.63 no.8 / pp.171-187 / 2013
  • This is a study of the military costumes of Hunryeondogam, which was the center of the Five Military Camps in the mid and late Joseon dynasty. I examined the characteristics and system of Hunryeondogam's military costumes through documentary and empirical research. The military organization of Hunryeondogam comprised high-ranking officers such as Hunryeondaejang, Junggun, Cheonchong, Byeoljang, and Gukbyeoljang; mid- and low-ranking officers such as Pachong and Chogwa; and soldiers. Its military costume included Gapju, Yoongbok, and Goonbok for officers and all kinds of military uniforms for soldiers. The Imjin war and the ritualized military ceremonies of peacetime influenced the military costume. Officers, for example, wore Dangap, while soldiers wore Cheolgap or Pigap depending on the branch of the army. Politically, the kings of the mid and late Joseon organized military bodies to strengthen royal authority, and I think this policy can be effectively seen by observing the military costume system. Qualitative differences in the cloth materials of Goonbok and the presence or absence of patterns, qualitative differences in the decoration of Jeonrip, and the presence or absence of Yodae distinguished identity and rank. These features may have been affected by social factors such as the King's frequent trips and a stable society. Such factors may also have influenced the substitution of Goonbok for Yoongbok among the officers of Hunryeondogam, from low-ranking to high-ranking. The societal changes of the mid and late Joseon dynasty are reflected in the military costume system of Hunryeondogam.

King's Status Reflected in The Joseon Dynasty's Document transmission System (조선 문서행이체제에 반영된 국왕의 위상)

  • Lee, Hyeongjung
    • The Korean Journal of Archival Studies / no.66 / pp.203-227 / 2020
  • This article explores the influence of the king on the Joseon dynasty's document transmission system, focusing on some exceptional cases. According to Joseon law, the form of an official document depended on the difference in rank between receiver and sender. However, some institutions did not follow these general principles, namely Byungjo (兵曹), Seungjeongwon (承政院), and Kyujanggak (奎章閣). Byungjo was the ministry in charge of military administration. Seungjeongwon was a royal secretariat, in existence from early Joseon, that assisted the king and delivered his orders. Kyujanggak was a royal library and an institution assisting the king, established in the JeongJo (正祖) era. Byungjo was treated as a relatively high-ranking institution when it sent and received military-related documents. Seungjeongwon and Kyujanggak could use Kwanmoon (關文), the document form used for institutions of the same or lower rank, when writing to higher-ranking institutions. Conversely, higher-ranking institutions used Cheobjeong (牒呈), which the law stipulated as the document form for addressing higher-ranking institutions, when sending documents to them. These institutions could enjoy such privileges in the document transmission system because Joseon had an administrative system centered on the king. Byungjo was an institution entrusted with military power by the king. Seungjeongwon and Kyujanggak were in charge of assisting the king and delivering his orders, so they could follow a different system of receiving and sending documents than other institutions. In conclusion, the Joseon dynasty operated exceptions in document administration based on the existence of the king, which means that Joseon's document transmission system was basically operated under a Confucian bureaucracy with the king at its peak.

A Folksonomy Ranking Framework: A Semantic Graph-based Approach (폭소노미 사이트를 위한 랭킹 프레임워크 설계: 시맨틱 그래프기반 접근)

  • Park, Hyun-Jung;Rho, Sang-Kyu
    • Asia Pacific Journal of Information Systems / v.21 no.2 / pp.89-116 / 2011
  • In collaborative tagging systems such as Delicious.com and Flickr.com, users assign keywords or tags to their uploaded resources, such as bookmarks and pictures, for their future use or sharing purposes. The collection of resources and tags generated by a user is called a personomy, and the collection of all personomies constitutes the folksonomy. The most significant need of folksonomy users is to efficiently find useful resources or experts on specific topics. An excellent ranking algorithm would assign higher ranking to more useful resources or experts. What resources are considered useful in a folksonomic system? Does a standard superior to frequency or freshness exist? The resource recommended by more users with more expertise should be worthy of attention. This ranking paradigm can be implemented through a graph-based ranking algorithm. Two well-known representatives of such a paradigm are PageRank by Google and HITS (Hypertext Induced Topic Selection) by Kleinberg. Both PageRank and HITS assign a higher evaluation score to pages linked to more higher-scored pages. HITS differs from PageRank in that it utilizes two kinds of scores: authority and hub scores. The ranking objects of these algorithms are limited to Web pages, whereas the ranking objects of a folksonomic system are somewhat heterogeneous (i.e., users, resources, and tags). Therefore, uniform application of the voting notion of PageRank and HITS based on the links to a folksonomy would be unreasonable. In a folksonomic system, each link corresponding to a property can have an opposite direction, depending on whether the property is in the active or the passive voice. The current research stems from the idea that a graph-based ranking algorithm could be applied to the folksonomic system using the concept of mutual interactions between entities, rather than the voting notion of PageRank or HITS.
The concept of mutual interactions, proposed for ranking Semantic Web resources, enables the calculation of importance scores of various resources unaffected by link directions. The weights of a property representing the mutual interaction between classes are assigned depending on the relative significance of the property to the resource importance of each class. This class-oriented approach is based on the fact that, in the Semantic Web, there are many heterogeneous classes; thus, applying a different appraisal standard for each class is more reasonable. This is similar to the evaluation method of humans, where different items are assigned specific weights, which are then summed up to determine the weighted average. We can check for missing properties more easily with this approach than with other predicate-oriented approaches. A user of a tagging system usually assigns more than one tag to the same resource, and there can be more than one tag with the same subjectivity and objectivity. When many users assign similar tags to the same resource, grading the users differently depending on the assignment order becomes necessary. This idea comes from studies in psychology wherein expertise involves the ability to select the most relevant information for achieving a goal. An expert should be someone who not only has a large collection of documents annotated with a particular tag, but also tends to add documents of high quality to his/her collections. Such documents are identified by the number, as well as the expertise, of users who have the same documents in their collections. In other words, there is a relationship of mutual reinforcement between the expertise of a user and the quality of a document. In addition, there is a need to rank entities related more closely to a certain entity. Considering that the popularity of a topic in social media is temporary, recent data should have more weight than old data.
We propose a comprehensive folksonomy ranking framework in which all these considerations are dealt with and that can be easily customized to each folksonomy site for ranking purposes. To examine the validity of our ranking algorithm and show the mechanism of adjusting property, time, and expertise weights, we first use a dataset designed for analyzing the effect of each ranking factor independently. We then show the ranking results of a real folksonomy site, with the ranking factors combined. Because the ground truth of a given dataset is not known when it comes to ranking, we inject simulated data whose ranking results can be predicted into the real dataset and compare the ranking results of our algorithm with those of a previous HITS-based algorithm. Our semantic ranking algorithm based on the concept of mutual interaction seems to be preferable to the HITS-based algorithm as a flexible folksonomy ranking framework. Some concrete points of difference are as follows. First, with the time concept applied to the property weights, our algorithm shows superior performance in lowering the scores of older data and raising the scores of newer data. Second, applying the time concept to the expertise weights, as well as to the property weights, our algorithm controls the conflicting influence of expertise weights and enhances overall consistency of time-valued ranking. The expertise weights of the previous study can act as an obstacle to the time-valued ranking because the number of followers increases as time goes on. Third, many new properties and classes can be included in our framework. The previous HITS-based algorithm, based on the voting notion, loses ground in the situation where the domain consists of more than two classes, or where other important properties, such as "sent through twitter" or "registered as a friend," are added to the domain. Fourth, there is a big difference in the calculation time and memory use between the two kinds of algorithms.
While the multiplication of two matrices has to be executed twice for the previous HITS-based algorithm, this is unnecessary with our algorithm. In our ranking framework, various folksonomy ranking policies can be expressed with the ranking factors combined, and our approach can work even if the folksonomy site is not implemented with Semantic Web languages. Above all, the time weight proposed in this paper will be applicable to various domains, including social media, where time value is considered important.
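
For readers unfamiliar with the HITS baseline this abstract compares against, the following is a minimal sketch of the classic hub/authority iteration on a directed link graph. It illustrates standard HITS only, not the paper's mutual-interaction framework; the adjacency-dict input, the fixed iteration count, L2 normalization, and the names `hits` and `_normalize` are assumptions made for illustration.

```python
import math


def _normalize(scores):
    """Scale scores to unit L2 norm; leave them unchanged if all are zero."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {k: v / norm for k, v in scores.items()} if norm else dict(scores)


def hits(links, iterations=50):
    """Return (hub, authority) score dicts for a graph given as {source: [targets]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of every page linking to the node.
        new_auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for t in targets:
                new_auth[t] += hub[src]
        # Hub update: sum the (new) authority scores of every page the node links to.
        new_hub = {n: sum(new_auth[t] for t in links.get(n, ())) for n in nodes}
        auth, hub = _normalize(new_auth), _normalize(new_hub)
    return hub, auth
```

With `links = {"a": ["c"], "b": ["c"], "c": []}`, page `c` receives all of the authority score while `a` and `b` share equal hub scores. Each iteration performs two passes over the links, which corresponds to the two matrix multiplications the abstract mentions for the HITS-based approach.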

XML Document Retrieval Models for Heterogeneous Data Set using Independent Regular paths (독립적인 질의 경로들을 사용하여 이질적인 문서들을 검색하는 XML 문서 검색 모델)

  • 유신재;민경섭;김형주
    • Journal of KIISE:Software and Applications / v.30 no.1_2 / pp.140-152 / 2003
  • An XML document has a structure that may be irregular, and it is difficult for end-users to comprehend an irregular document structure exactly. For such XML documents, end-users have difficulty using structured queries; therefore, they formulate either unstructured queries or queries with only a little structural information. In this context, we propose new retrieval models that use structural information for ranking and compensate for the difference between the structure of the user query and that of the document. To ease querying, we assume independence among the query paths that represent structural constraints. Since this assumption reduces the expressive power of the query language, we also propose a model that overcomes this problem. As there were no test collections for XML documents, we built a small test collection from the TIPSTER collection of TREC and ran experiments on it without structured queries. The experiments show that our models improve average precision by about 67% over the conventional vector space model.

Searching Patents Effectively in terms of Keyword Distributions (키워드 분포를 고려한 효과적 특허검색기법)

  • Lee, Wookey;Song, Justin Jongsu;Kang, Michael Mingu
    • Journal of Information Technology and Architecture / v.9 no.3 / pp.323-331 / 2012
  • With advances in knowledge and information, intellectual property, and patents in particular, has attracted growing attention. The need for efficient patent information search has become essential, but prevailing patent search engines return too much noise in their results because they rely on Boolean models. As a result, professional experts spend too much time investigating the results manually. In this paper, we describe the differences between conventional document search and patent search and analyze the limitations of existing patent search. Furthermore, we propose a search method specialized for patents that identifies the relationships between keywords within each document and the significance of each search keyword within each patent document. Using the keywords and their relationships, relevant patents are ranked in the upper positions and noise is ranked lower. This approach is therefore proposed to significantly reduce the noise ratio in search results. Finally, we demonstrate the superiority of the proposed methodology through comparison on the Kipris dataset.

A Study on STI Database Construction on Demand (이용 기반 데이터베이스 구축 방안에 관한 연구)

  • 조현양
    • Journal of the Korean Society for Information Management / v.17 no.2 / pp.155-170 / 2000
  • In this research, several ways of creating effective STI (Scientific & Technological Information) databases are suggested. We put emphasis on the selection of input data, while also addressing other factors such as standardization of data entry and the data entry system. To decide the priority of target data, the status of the document delivery service was analyzed. The results show that conference proceedings were given priority over academic journals. For journals, the ranking by number of documents requested at KORDIC (Korea R&D Information Center) and 16 Specialized Information Centers was compared with the rankings by citation frequency and impact factor reported in SCI.
