• Title/Summary/Keyword: Keyword-based Ranking

Method of Improving Personal Name Search in Academic Information Service

  • Han, Heejun;Lee, Seok-Hyoung
    • International Journal of Knowledge Content Development & Technology / v.2 no.2 / pp.17-29 / 2012
  • All academic information, on the web or elsewhere, has a creator, that is, a subject who created the information. The subject can be an individual, a group, or an institution, and can even be a nation, depending on the nature of the information. Most information is composed of a title, an author, and contents. A paper in the academic information category has metadata including a title, an author, keywords, an abstract, publication data, place of publication, ISSN, and the like. A patent has metadata including a title, an applicant, an inventor, an attorney, IPC, an application number, and the claims of the invention. Most web-based academic information services enable users to search for information by processing this metadata. An important element is searching by the author field, which corresponds to a personal name. This study proposes an efficient indexing method and a ranking algorithm for adjacency-operation results to which phrase-search-based boosting factors are applied, thereby improving the accuracy of personal name search results. It also describes a method for providing co-authors and related researchers as results when searching for personal names. This method can be applied effectively to provide accurate and additional search results in academic information services.
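
The phrase-based boosting idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the whitespace tokenizer, and the `phrase_boost` value of 2.0 are all assumptions chosen for illustration. An author field that contains every query token scores 1.0, and the score is multiplied by the boost when the tokens also appear adjacent and in order.

```python
def tokenize(name):
    """Lowercase a name string and split it into tokens on whitespace and commas."""
    return name.lower().replace(",", " ").split()


def has_adjacent_match(query_tokens, field_tokens):
    """Return True if the query tokens occur consecutively, in order, in the field."""
    n = len(query_tokens)
    return any(field_tokens[i:i + n] == query_tokens
               for i in range(len(field_tokens) - n + 1))


def score_author_field(query, author_field, phrase_boost=2.0):
    """Score one author field: term overlap, multiplied by a boost on a phrase match."""
    q = tokenize(query)
    f = tokenize(author_field)
    if not q:
        return 0.0
    # Fraction of query tokens present anywhere in the author field.
    overlap = sum(1 for t in q if t in f) / len(q)
    # Hypothetical boosting rule: full overlap plus adjacency earns the boost.
    if overlap == 1.0 and has_adjacent_match(q, f):
        return overlap * phrase_boost
    return overlap


def rank(query, records):
    """Return (score, record) pairs sorted by descending score, dropping zero scores."""
    scored = [(score_author_field(query, r), r) for r in records]
    return sorted([s for s in scored if s[0] > 0], key=lambda s: -s[0])
```

With this rule, a query for "Han Heejun" ranks the record "Han, Heejun" above a record that merely contains both tokens in another order.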

A study of investigation and improvement to classification for oriental medicine in search portal web site (검색포털 지식검색에 대한 한의학분류체계 조사 및 개선방안 연구)

  • Kim, Chul
    • Journal of the Korean Institute of Oriental Medical Informatics / v.15 no.1 / pp.1-10 / 2009
  • These days, with the rapid spread and active use of the Internet, anyone can search for information easily. Search engines were developed specifically to improve the accuracy of information retrieval, and with them users can search for information more quickly and in more varied ways. Search portal systems have come to prominence as representative, basic services. Internet users need results from text, image, video, and knowledge search. Keyword-based search is generally used to obtain retrieval results; another method is category-based search. This paper investigates how knowledge search for oriental medicine is classified in the market-leading search portal systems, selected by website ranking. As a result, each classification system is unified, and users who approach search systematically by classification may encounter considerable confusion. This paper proposes an improved oriental medicine classification system for Internet information retrieval in the knowledge search area. If service providers revise their classification systems, data compatibility can be guaranteed, and users are assured a proper access path to the knowledge they seek.

A Case Study on the Next Generation Library Catalogs (차세대 도서관 목록 사례의 고찰)

  • Yoon, Cheong-Ok
    • Journal of Korean Library and Information Science Society / v.41 no.1 / pp.5-28 / 2010
  • The purpose of this study is to investigate the major features of Next Generation Library Catalogs. 'Next Generation Melvyl Pilot' of the University of California Library System and 'SearchWorks' of Stanford University Library are examined. While the former was developed on OCLC WorldCat Local, the latter is based on Blacklight, an open-source catalog software. Both provide features including enriched content, faceted navigation, keyword searching, relevancy ranking of search results, and user contribution, but some functions vary in scope and content. Also, both appear to be still under development rather than complete implementations.

Dynamic Management of Equi-Join Results for Multi-Keyword Searches (다중 키워드 검색에 적합한 동등조인 연산 결과의 동적 관리 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions:PartA / v.17A no.5 / pp.229-236 / 2010
  • With an increasing number of documents on the Internet and in enterprises, it becomes crucial to support users' queries on those documents efficiently. In that situation, the full-text search technique is generally accepted, because it can answer uncontrolled ad-hoc queries by automatically indexing all the keywords found in the documents. The index files built for full-text search grow with the number of indexed documents, so the disk cost of processing multi-keyword queries against those enlarged index files may become too high. To solve this problem, we propose both an index file structure and a management scheme suited to processing multi-keyword queries against a large volume of index files. For this, we adopt inverted files, which are widely used in multi-keyword search, as the basic index structure and modify them into a hierarchical structure for the join and ranking operations performed during query processing. To save disk costs with that index structure, we dynamically store in main memory the results of join operations between two keywords that are highly likely to appear in users' queries. We also compare performance using a disk cost model to show the advantage of the proposed scheme.
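
As a rough illustration of keeping join results for likely keyword pairs in memory, the sketch below uses a flat inverted index and a simple query-count threshold. The class name, the `cache_threshold` parameter, and the invalidate-on-insert policy are assumptions made for this example; the hierarchical index structure and the disk cost model described in the abstract are not modeled here.

```python
from collections import Counter, defaultdict


class PairCachingIndex:
    """Inverted index that keeps intersections of frequently queried keyword pairs in memory."""

    def __init__(self, cache_threshold=2):
        self.postings = defaultdict(set)   # keyword -> set of document ids
        self.pair_counts = Counter()       # how often each keyword pair was joined
        self.pair_cache = {}               # (kw1, kw2) -> cached join result
        self.cache_threshold = cache_threshold
        self.cache_hits = 0

    def add_document(self, doc_id, text):
        """Index every whitespace-separated word of the document."""
        for word in text.lower().split():
            self.postings[word].add(doc_id)
        # New documents may invalidate cached joins, so drop them (simplest policy).
        self.pair_cache.clear()

    def search_pair(self, kw1, kw2):
        """Return ids of documents containing both keywords (an equi-join on doc id)."""
        key = tuple(sorted((kw1.lower(), kw2.lower())))
        if key in self.pair_cache:
            self.cache_hits += 1
            return self.pair_cache[key]
        result = frozenset(self.postings.get(key[0], set()) &
                           self.postings.get(key[1], set()))
        self.pair_counts[key] += 1
        # Assumed heuristic: a pair joined often enough is expected to recur.
        if self.pair_counts[key] >= self.cache_threshold:
            self.pair_cache[key] = result
        return result
```

A production design would need a smarter invalidation policy than clearing the whole cache on every insert; the point here is only that repeated pair queries can skip the posting-list join.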

Performance Evaluation of Video Recommendation System with Rich Metadata (풍부한 메타데이터를 가진 동영상 추천 시스템의 성능 평가)

  • Min Hwa Cho;Da Yeon Kim;Hwa Rang Lee;Ha Neul Oh;Sun Young Lee;In Hwan Jung;Jae Moon Lee;Kitae Hwang
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.2 / pp.29-35 / 2023
  • This paper makes it possible to search videos by sentence, improving on previous research that automatically generates rich metadata from videos and searches videos by keyword. For sentence-based search, the morphemes of each sentence are analyzed, keywords are extracted, a weight is assigned to each keyword, and videos are recommended by applying a ranking algorithm developed in the previous research. Evaluating the performance of video search in this paper requires a sufficient number of videos and a sufficient amount of user experience. Because these are currently insufficient, three indirect evaluation methods were used: evaluation of overall user satisfaction, comparison of recommendation scores with user satisfaction, and evaluation of user satisfaction by video category. The performance evaluation showed that the rich metadata construction and video recommendation implemented in this paper give users high search satisfaction.
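
The sentence-to-recommendation pipeline described above (extract keywords, weight them, rank videos) can be sketched as follows. This is only an illustration: whitespace splitting with a small English stopword list stands in for morpheme analysis, term counts stand in for the paper's keyword weights, and a summed-weight score stands in for the ranking algorithm of the previous research, none of which are specified in the abstract.

```python
def extract_weighted_keywords(sentence,
                              stopwords=frozenset({"a", "an", "the", "of", "in", "on", "with"})):
    """Stand-in for morpheme analysis: split on whitespace, drop stopwords, weight by count."""
    weights = {}
    for token in sentence.lower().split():
        token = token.strip(".,!?")
        if token and token not in stopwords:
            weights[token] = weights.get(token, 0) + 1
    return weights


def recommend(sentence, video_metadata, top_k=3):
    """Rank videos by the summed weights of query keywords found in their metadata."""
    weights = extract_weighted_keywords(sentence)
    scores = []
    for video_id, keywords in video_metadata.items():
        score = sum(w for kw, w in weights.items() if kw in keywords)
        if score > 0:
            scores.append((score, video_id))
    # Highest score first; ties broken by video id for a deterministic order.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [video_id for _, video_id in scores[:top_k]]
```

Here `video_metadata` is a hypothetical mapping from a video id to the set of keywords in its generated metadata.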

Course recommendation system using deep learning (딥러닝을 이용한 강좌 추천시스템)

  • Min-Ah Lim;Seung-Yeon Hwang;Dong-Jin Shin;Jae-Kon Oh;Jeong-Joon Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.3 / pp.193-198 / 2023
  • We study a learner-customized lecture recommendation project using deep learning. Recommendation systems are easy to find on the web and in apps; examples include recommending videos based on users' clicks and advertising items in users' areas of interest on SNS. In this study, Word2Vec-based sentence similarity was mainly used to filter twice, and courses were recommended through the Surprise library. The system conveniently provides users with the desired classification of course data. Surprise is a Python scikit (distributed as scikit-surprise) that is convenient for building recommendation systems. By analyzing the data, the system runs at high speed, and deep learning is used to produce more precise results across the course steps. When a user enters a keyword of interest, the similarity between the keyword and the course titles is computed, followed by the similarity with the extracted video data and voice text, and the highest-ranking video data is recommended through the Surprise library.

A Study on Differences of Contents and Tones of Arguments among Newspapers Using Text Mining Analysis (텍스트 마이닝을 활용한 신문사에 따른 내용 및 논조 차이점 분석)

  • Kam, Miah;Song, Min
    • Journal of Intelligence and Information Systems / v.18 no.3 / pp.53-77 / 2012
  • This study analyses the differences in contents and tones of arguments among three major Korean newspapers, the Kyunghyang Shinmun, the HanKyoreh, and the Dong-A Ilbo. It is commonly accepted that newspapers in Korea explicitly deliver their own tone of argument when they cover sensitive issues and topics. It could be problematic if readers read the news without being aware of a newspaper's tone of argument, because both the contents and the tone can easily affect readers. Thus it is very desirable to have a new tool that can inform readers of a newspaper's tone of argument. This study presents the results of clustering and classification techniques as part of text mining analysis. We focus on six main subjects in newspapers, Culture, Politics, International, Editorial-opinion, Eco-business, and National issues, and attempt to identify differences and similarities among the newspapers. The basic unit of text mining analysis is a paragraph of a news article. This study uses a keyword-network analysis tool and visualizes relationships among keywords to make the differences easier to see. Newspaper articles were gathered from KINDS, the Korean integrated news database system. KINDS preserves news articles of the Kyunghyang Shinmun, the HanKyoreh, and the Dong-A Ilbo, and these are open to the public. This study used these three major Korean newspapers from KINDS. About 3,030 articles from 2008 to 2012 were used. The International, National issues, and Politics sections were gathered for specific issues. The International section was collected with the keyword 'Nuclear weapon of North Korea.' The National issues section was collected with the keyword '4-major-river.' The Politics section was collected with the keyword 'Tonghap-Jinbo Dang.' All of the articles from April 2012 to May 2012 in the Eco-business, Culture, and Editorial-opinion sections were also collected.
All of the collected data were processed and edited into paragraphs. We removed stop-words using the Lucene Korean Module. We calculated keyword co-occurrence counts from the paired co-occurrence list of keywords in a paragraph and built a co-occurrence matrix from the list. Once the co-occurrence matrix was built, we used the cosine coefficient matrix as input for PFNet (Pathfinder Network). To analyze the three newspapers and find the significant keywords in each paper, we analyzed the list of the 10 highest-frequency keywords and the keyword networks of the 20 highest-frequency keywords to closely examine the relationships and show a detailed network map among keywords. We used NodeXL software to visualize the PFNet. After drawing all the networks, we compared the results with the classification results. Classification was first performed to identify how the tone of argument of one newspaper differs from the others. Then, to analyze tones of arguments, all the paragraphs were divided into two types of tone, Positive and Negative. To identify and classify the tones of all the paragraphs and articles we had collected, a supervised learning technique was used. The Naïve Bayesian classifier algorithm provided in the MALLET package was used to classify all the paragraphs in the articles. After classification, Precision, Recall, and F-value were used to evaluate the classification results. Based on the results of this study, three subjects, Culture, Eco-business, and Politics, showed some differences in contents and tones of arguments among the three newspapers. In addition, for National issues, the tones of arguments on the 4-major-rivers project differed from one another. It seems that the three newspapers have their own specific tone of argument in those sections. The keyword networks also had different shapes from one another for the same period and the same section.
This means that the frequently appearing keywords in the articles differ and that their contents are composed of different keywords. The Positive-Negative classification also showed that a newspaper's tone of argument can be classified relative to the others. These results indicate that the approach in this study is promising and could be extended into a new tool for identifying the different tones of arguments of newspapers.
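
The co-occurrence and cosine steps described in this abstract can be sketched in a few lines. The sketch assumes each paragraph has already been reduced to a list of keywords (stop-word removal is not shown), and the function names are hypothetical. The Pathfinder Network reduction, the NodeXL visualization, and the MALLET classifier are outside its scope.

```python
from collections import Counter
from itertools import combinations
import math


def cooccurrence_counts(paragraphs):
    """Count how many paragraphs each unordered keyword pair appears in together."""
    counts = Counter()
    for keywords in paragraphs:
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def cooccurrence_matrix(counts):
    """Build a symmetric matrix (list of rows) plus the sorted vocabulary it is indexed by."""
    vocab = sorted({kw for pair in counts for kw in pair})
    index = {kw: i for i, kw in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return vocab, matrix


def cosine(u, v):
    """Cosine coefficient between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0
```

Applying `cosine` to every pair of rows of the co-occurrence matrix yields a cosine coefficient matrix of the kind the abstract feeds into PFNet.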

A Study of Perception of Golfwear Using Big Data Analysis (빅데이터를 활용한 골프웨어에 관한 인식 연구)

  • Lee, Areum;Lee, Jin Hwa
    • Fashion & Textile Research Journal / v.20 no.5 / pp.533-547 / 2018
  • The objective of this study is to examine the perception of golfwear and related trends based on major keywords and associated words related to golfwear, using big data. For this study, data containing the keywords 'Golfwear' and 'Golf clothes' were collected from blogs, Jisikin and Tips, news articles, and web cafés on two of the most commonly used search engines (Naver & Daum). For data collection, frequency and matrix data were extracted through Textom for the period from January 1, 2016 to December 31, 2017. From the matrix created by Textom, Degree centrality, Closeness centrality, Betweenness centrality, and Eigenvector centrality were calculated and analyzed using Netminer 4.0. The analysis found that the keyword 'brand' ranked highest in web visibility, followed by 'woman', 'size', 'man', 'fashion', 'sports', 'price', 'store', 'discount', and 'equipment' in the top 10 frequency rankings. For the centrality calculations, only the top 30 keywords were included, because the density was extremely high due to the high frequency of co-occurring keywords. The centrality calculations showed that the top-ranked keywords were similar to those in the raw frequency data. When the frequency was adjusted by subtracting 100 and 500 words, the results differed: keywords that ranked low in the frequency analysis, such as J. Lindberg, ranked high, and the rankings of all centrality calculations changed. These findings provide a basis for marketing strategies and for ways to increase the awareness and web visibility of golfwear brands.
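
To make the centrality step concrete, the sketch below computes normalized degree centrality from a keyword co-occurrence matrix and ranks keywords by it. It is an illustration only: the function names are hypothetical, the normalization (links divided by n - 1) is the common textbook definition and may differ from Netminer's settings, and closeness, betweenness, and eigenvector centrality are not shown.

```python
def degree_centrality(matrix):
    """Normalized degree centrality: number of nonzero links per node divided by (n - 1)."""
    n = len(matrix)
    if n < 2:
        return [0.0] * n
    return [sum(1 for j, w in enumerate(row) if j != i and w > 0) / (n - 1)
            for i, row in enumerate(matrix)]


def rank_by(values, labels):
    """Return labels sorted by descending value, ties broken alphabetically."""
    return [label for _, label in sorted(zip((-v for v in values), labels))]
```

Comparing `rank_by` applied to centrality values with the same function applied to raw frequencies shows whether a keyword is better connected than its frequency alone suggests.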

Social Perception of Disaster Safety Education for Young Children through Big Data (빅데이터를 통해 살펴본 유아 재난안전교육에 대한 사회적 인식)

  • Kang, Min-Jung;You, Hee-Jung
    • The Journal of the Korea Contents Association / v.20 no.2 / pp.162-171 / 2020
  • The purpose of this study is to examine the social perception of disaster safety education for young children based on Textom big data and to explore the direction of young children's disaster safety education. Researchers collected and analyzed online text data from portal websites from 2014 to 2017 using the keywords 'young children+disaster+safety education'. The raw data then went through first and second data refinement processes. Based on the frequency analysis results, 50 keywords were selected and converted into matrix data for network analysis. The results of the study are as follows. First, the keyword that appeared most frequently together with young children's disaster safety education was 'education', followed by 'experience', 'kindergarten', 'prevention', and 'school.' Second, the keywords with high centrality in the centrality analysis were also 'education', 'experience', and 'prevention'. In addition, keywords such as 'prevention', 'life', and 'evacuation' rank higher in degree centrality than in frequency, which means that the degree of connection between these words is high. These results suggest that young children need education during early childhood in order to improve their disaster safety skills, and that disaster safety education should be accomplished through 'prevention' and 'experience' in early childhood education institutions.

Forecasting the Future Korean Society: A Big Data Analysis on 'Future Society'-related Keywords in News Articles and Academic Papers (빅데이터를 통해 본 한국사회의 미래: 언론사 뉴스기사와 사회과학 학술논문의 '미래사회' 관련 키워드 분석)

  • Kim, Mun-Cho;Lee, Wang-Won;Lee, Hye-Soo;Suh, Byung-Jo
    • Informatization Policy / v.25 no.4 / pp.37-64 / 2018
  • This study aims to forecast the future of Korean society through big data analysis. From two datasets, a collection of 46,000,000 news articles from 127 media outlets on the Naver portal operated by Naver Corporation and a collection of 70,000 social science papers registered in KCI (the Korea Citation Index of the National Research Foundation) between 2005 and 2017, the 40 most frequently occurring keywords were selected. Next, their temporal variations were traced and compared in terms of the number and pattern of frequencies. In addition, core issues of the future were identified through keyword network analysis. In the media news dataset, issues such as economy, polity, and technology turned out to be the top-ranked ones. In the academic paper dataset, however, the top-ranking issues are those of feeling, working, and living. In terms of the system and life-world conceptual framework suggested by Jürgen Habermas, public interest in the future inclines toward matters of 'system', while professional interest in the future leans toward those of 'life-world.' Given this disparity in future interest, a 'mismatch paradigm' is proposed as an alternative for social forecasting, one that can replace the existing paradigms based on the ideas of deficiency or deprivation.