• Title/Abstract/Keyword: Keyword weight

Search results: 61 items (processing time: 0.022 s)

A Study on Providing Relative Keyword using The Social Network Analysis Technique in Academic Database (학술DB에서 SNA(Social Network Analysis) 기법을 이용한 연관검색어 제공방안 연구)

  • Kim, Kyoung-Yong;Seo, Jung-Yun;Seon, Choong-Nyoung
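    • Annual Conference on Human and Language Technology / Korean Institute of Information Scientists and Engineers, Language Engineering Research Group, 23rd Conference on Hangul and Korean Information Processing (2011) / pp.79-82 / 2011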
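  • This paper aims to provide related search terms that are highly associated with a query by applying Social Network Analysis (SNA) to keyword information in an academic database that offers research output across diverse subject areas. To this end, we computed the weights between keywords, extracted the related keywords associated with the query through ego network analysis, and compared them with the related search terms provided by the existing academic database. We then compared and evaluated the results in terms of degree of association by applying association rule mining and similarity coefficients.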
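The two steps named in the abstract above (keyword-to-keyword weights, then an ego network around the query) can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not say how the weights are computed, so plain co-occurrence counts are assumed here, and the function names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(papers):
    """Count how often each unordered keyword pair appears in the same paper.

    Assumption: the link weight is a raw co-occurrence count.
    """
    weights = Counter()
    for keywords in papers:
        for a, b in combinations(sorted(set(keywords)), 2):
            weights[(a, b)] += 1
    return weights


def ego_network(weights, query, top_n=5):
    """Return the query's direct neighbours, strongest link first."""
    neighbours = {}
    for (a, b), w in weights.items():
        if a == query:
            neighbours[b] = w
        elif b == query:
            neighbours[a] = w
    # Sort by descending weight, then alphabetically to break ties.
    return sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical author-keyword lists, one list per paper.
papers = [
    ["keyword", "network", "centrality"],
    ["keyword", "network", "retrieval"],
    ["keyword", "retrieval"],
    ["ontology", "retrieval"],
]
weights = cooccurrence_weights(papers)
related = ego_network(weights, "keyword")
print(related)
```

Only direct neighbours of the query are returned, which is the simplest reading of an ego network; a fuller version might also keep the links among those neighbours.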

Co-author and Keyword Networks and their Clustering Appearance in Preventive Medicine Fields in Korea: Analysis of Papers in the Journal of Preventive Medicine and Public Health, 1991~2006 (국내 예방의학 분야의 공저자.핵심어 네트워크와 군집 양상 - 대한예방의학회지(1991~2006) 게재논문의 분석 -)

  • Jung, Min-Soo;Chung, Dong-Jun
    • Journal of Preventive Medicine and Public Health / Vol. 41, No. 1 / pp.1-9 / 2008
  • Objectives : This study evaluated the knowledge structure of Korea's preventive medicine sector, and the factors affecting it, by analyzing co-author and keyword networks. Methods : The data were extracted from 873 papers listed in the Journal of Preventive Medicine and Public Health and transformed into a co-author and keyword matrix, in which the existence of a 'link' was judged by impact factors calculated from weight values for each author's role and rate of participation. Research achievement was dependent upon the author's status and networking index, as analyzed by neighborhood degree, multidimensional scaling, correspondence analysis, and multiple regression. Results : Co-author networks developed as random networks centered on a few high-productivity researchers. In particular, closeness centrality was more developed than degree centrality. A power-law distribution was also found in impact factor and research productivity by college affiliation. In multiple regression, the effect of the author's role was significant for both the impact factor calculated by the participatory rate and the number of listed articles. However, the number of listed articles varied by sex. Conclusions : This study shows that the small-world phenomenon exists in co-author and keyword networks in a journal, as in citation networks. However, the differentiation of knowledge structure in the field of preventive medicine was relatively restricted by specialization.

A Study on the Rejection Capability Based on Anti-phone Modeling (반음소 모델링을 이용한 거절기능에 대한 연구)

  • 김우성;구명완
    • The Journal of the Acoustical Society of Korea / Vol. 18, No. 3 / pp.3-9 / 1999
  • This paper presents a study of rejection capability based on anti-phone modeling for a vocabulary-independent speech recognition system. The rejection system detects and rejects out-of-vocabulary words, that is, words not included in the candidate words defined when the speech recognizer is built. Rejection systems can be classified into two categories by implementation method: keyword spotting and utterance verification. The keyword spotting method uses an extra filler model as a candidate word in addition to the keyword models. The utterance verification method first constructs anti-models for all phonemes and then uses the anti-model of each phoneme to calculate a confidence score. We implemented an utterance verification algorithm that can be used with a vocabulary-independent speech recognizer. We also compared three kinds of means for calculating the confidence score and found that the geometric mean gave the best result. A sigmoid function is usually used to normalize the confidence score; using it, we compared the effect of the weight constant of the sigmoid function and determined the optimal value. We also compared the effect of the size of the cohort set, and the results showed that a larger set gave better results. Finally, we determined the optimal confidence score threshold. With this threshold, the overall recognition rate including rejection errors was about 76%. These results will be applied to a speech-recognition-based stock information system currently provided as an experimental service by Korea Telecom.
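A rough sketch of the confidence scoring described in the abstract above, under stated assumptions: each phoneme contributes a log-likelihood ratio between its phone model and its anti-model, each ratio is normalized with a sigmoid whose weight constant is `alpha`, and the per-phoneme scores are combined with a geometric mean. The order of these steps, the parameter names, and the default values are assumptions for illustration; the abstract does not specify them.

```python
import math


def sigmoid(x, alpha=1.0):
    """Sigmoid normalization; alpha plays the role of the weight constant."""
    return 1.0 / (1.0 + math.exp(-alpha * x))


def confidence(phone_loglik, anti_loglik, alpha=1.0):
    """Geometric mean of the sigmoid-normalized per-phoneme log-likelihood ratios."""
    scores = [sigmoid(p - a, alpha) for p, a in zip(phone_loglik, anti_loglik)]
    # Geometric mean computed in the log domain for numerical stability.
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


def accept(phone_loglik, anti_loglik, threshold=0.5, alpha=1.0):
    """Accept the utterance if its confidence reaches the threshold, else reject."""
    return confidence(phone_loglik, anti_loglik, alpha) >= threshold


# Hypothetical per-phoneme log-likelihoods for two utterances.
in_vocab = accept([-10.0, -12.0, -9.0], [-14.0, -15.0, -13.0])
out_of_vocab = accept([-14.0, -15.0, -13.0], [-10.0, -12.0, -9.0])
print(in_vocab, out_of_vocab)
```

In this sketch both `alpha` and `threshold` would be tuned on held-out data, which mirrors the abstract's search for an optimal weight constant and threshold.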

Keyword Extraction based on Style (스타일 기반 키워드 추출)

  • Lee, Joon-Hwi;Lee, Won-Suk
    • Proceedings of the Korea Information Processing Society Conference / Korea Information Processing Society, Proceedings of the 2002 Spring Conference (Part 2) / pp.1049-1052 / 2002
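  • Conventional keyword extraction methods have mostly assigned weights based on term frequency. For styled documents such as HTML, this paper proposes a method that extracts keywords by assigning weights that consider the style applied to each word along with its frequency. We define the style items to be weighted and the weighting method for each item, and define how these weights are summed per word and normalized, and we extract keywords based on style accordingly. Using a ranked list of domain keywords drawn from a content-specific domain as the reference, we carried out quantitative and qualitative comparisons against the conventional frequency-based keyword extraction method and showed that the proposed method is superior.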
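The idea in the abstract above, weighting each word occurrence by the style applied to it, summing per word, and normalizing, might look like the sketch below. The style items and their weights in `STYLE_WEIGHTS`, the rule of taking the strongest enclosing style, and the normalization by the total score are all hypothetical choices; the paper's actual definitions are not given in the abstract.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical style items and weights; unlisted tags count as plain text (1.0).
STYLE_WEIGHTS = {"title": 4.0, "h1": 3.0, "h2": 2.0, "b": 1.5, "strong": 1.5, "em": 1.2}


class StyleWeightParser(HTMLParser):
    """Accumulate a style-weighted frequency score for every word."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.scores = defaultdict(float)

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the most recent matching tag and anything left open inside it.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i] == tag:
                del self.open_tags[i:]
                break

    def handle_data(self, data):
        # Assumption: a word takes the strongest style among its enclosing tags.
        weight = max((STYLE_WEIGHTS.get(t, 1.0) for t in self.open_tags), default=1.0)
        for word in re.findall(r"\w+", data.lower()):
            self.scores[word] += weight


def style_keywords(html, top_n=3):
    """Return the top_n words with scores normalized to sum to 1 over all words."""
    parser = StyleWeightParser()
    parser.feed(html)
    parser.close()
    total = sum(parser.scores.values()) or 1.0
    ranked = sorted(parser.scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, score / total) for word, score in ranked[:top_n]]


sample = (
    "<html><head><title>Keyword weight</title></head>"
    "<body><h1>Keyword extraction</h1>"
    "<p>Plain text about <b>style</b> and text.</p></body></html>"
)
print(style_keywords(sample))
```

With all weights set to 1.0 the score reduces to plain term frequency, so the frequency-only baseline the abstract compares against is a special case of the same code.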

Patent data analysis using clique analysis in a keyword network (키워드 네트워크의 클릭 분석을 이용한 특허 데이터 분석)

  • Kim, Hyon Hee;Kim, Donggeon;Jo, Jinnam
    • Journal of the Korean Data and Information Science Society / Vol. 27, No. 5 / pp.1273-1284 / 2016
  • In this paper, we analyzed patents on machine learning using keyword network analysis and clique analysis. To construct a keyword network, important keywords were extracted based on their TF-IDF weights and their associations, and network structure analysis and clique analysis were performed. The density and clustering coefficient of the patent keyword network are low, which shows that patent keywords on machine learning are only weakly connected with each other. This is because the important patents on machine learning are mainly registered for application systems of machine learning rather than for machine learning techniques. Our clique analysis also showed that the keywords found by cliques in 2005 patents cover subjects such as newsmaker verification, product forecasting, virus detection, biomarkers, and workflow management, while those in 2015 patents cover subjects such as digital imaging, payment card, calling system, mammogram system, and price prediction. Clique analysis can be used not only for identifying specialized subjects, but also for choosing search keywords in patent search systems.

Effective Searchable Symmetric Encryption System using Conjunctive Keyword on Remote Storage Environment (원격 저장소 환경에서 다중 키워드를 이용한 효율적인 검색 가능한 대칭키 암호 시스템)

  • Lee, Sun-Ho;Lee, Im-Yeong
    • The KIPS Transactions:PartC / Vol. 18C, No. 4 / pp.199-206 / 2011
  • Removable storage offers excellent portability thanks to its light weight and a size small enough to fit in one's hand, and many users have recently turned their attention to high-capacity products. However, because removable storage is so easy to carry, it is frequently lost or stolen, which has caused many problems such as the leaking of private information to the public. The advent of remote storage services, where data is stored over the network, has allowed an increasing number of users to access data. The main data of many users is stored together on remote storage, which raises the problem of disclosure by an unethical administrator or attacker. To solve this problem, encryption of the data stored on the server has become necessary, and a searchable encryption system is needed for efficient retrieval of the encrypted data. However, existing searchable encryption systems suffer from inefficient document insert/delete operations and inefficient multi-keyword search. In this paper, an efficient searchable encryption system is proposed.

Development of Similar Bibliographic Retrieval System based on Neighboring Words and Keyword Topic Information (인접한 단어와 키워드 주제어 정보에 기반한 유사 문헌 검색 시스템 개발)

  • Kim, Kwang-Young;Kwak, Seung-Jin
    • Journal of Korean Library and Information Science Society / Vol. 40, No. 3 / pp.367-387 / 2009
  • In a similar bibliographic retrieval system, the search results differ considerably depending on which of the extracted index terms are selected. This research provides a method that minimizes errors in selecting among the extracted candidate index terms. It uses neighboring-word information, obtained from the candidate index terms extracted from similar literature, together with keyword topic information. By also using related author information and a method for reranking the search results, we developed a similar bibliographic retrieval system with high accuracy. We conducted experiments with the system on a collection of Korean journal articles in science and technology. Its performance was demonstrated through an experiment and a user evaluation.

Research on Function and Policy for e-Government System using Semantic Technology (전자정부내 의미기반 기술 도입에 따른 기능 및 정책 연구)

  • Go, Gwang-Seop;Jang, Yeong-Cheol;Lee, Chang-Hun
    • Korea Digital Policy Society: Conference Proceedings / Korea Digital Policy Society 2007 Spring Conference / pp.79-87 / 2007
  • This paper aims to offer a solution based on semantic document classification to improve e-Government utilization and efficiency for people using their own information retrieval system and linguistic expression. Generally, semantic document classification is an approach that classifies documents based on the diverse relationships between keywords in a document without fully describing the hierarchical concepts between keywords. Our approach considers the deep meanings within the context of the document and radically enhances information retrieval performance. We propose, experiment with, and evaluate the Concept Weight Document Classification (CoWDC) method, which goes beyond existing keyword and simple thesaurus/ontology methods by fully considering the concept hierarchy of various concepts. We recognized that, in order to verify the superiority of the semantic retrieval technology through the CoWDC test results and to integrate it efficiently into e-Government, the creation of a thesaurus, management of the operating system, expansion of the knowledge base, and improvements in search service and accuracy at the national level were needed.

Study on Extraction of Keywords Using TF-IDF and Text Structure of Novels (TF-IDF와 소설 텍스트의 구조를 이용한 주제어 추출 연구)

  • You, Eun-Soon;Choi, Gun-Hee;Kim, Seung-Hoon
    • Journal of the Korea Society of Computer and Information / Vol. 20, No. 2 / pp.121-129 / 2015
  • With the explosive growth of information about books, a growing number of customers find it difficult to pick a book. Against this backdrop, book recommendation systems are becoming more important as a way to offer appropriate information about books and ultimately encourage customers to buy. However, existing recommendation systems based on bibliographic information or user data raise reliability issues in their recommendation results. It is therefore necessary to reflect semantic information extracted from the text of a book's main body in a recommendation system. Accordingly, as a preliminary study, this paper suggests a method for extracting keywords from the main body of novels by using the TF-IDF method together with the text structure. To this end, the texts of 100 novels were collected and divided into four structural elements: preface, dialogue, non-dialogue, and closing. Then the TF-IDF weight of each keyword was calculated. The results show that keyword extraction accuracy improves by 42.1% when more weight is given to dialogue and the preface and closing are included, compared with using just the main body.
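One way to picture the structure-aware weighting in the abstract above is to scale each term occurrence by a weight for the structural element it appears in before computing TF-IDF. The sketch below assumes that reading; the values in `SECTION_WEIGHTS`, the unsmoothed `log(N / df)` form of IDF, and the toy token lists are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical per-section weights; dialogue is emphasized as in the abstract.
SECTION_WEIGHTS = {"preface": 1.0, "dialogue": 2.0, "non_dialogue": 1.0, "closing": 1.0}


def weighted_tf(novel):
    """Sum term occurrences, scaling each by the weight of its section."""
    tf = Counter()
    for section, tokens in novel.items():
        weight = SECTION_WEIGHTS.get(section, 1.0)
        for token in tokens:
            tf[token] += weight
    return tf


def tfidf_keywords(novels, index, top_n=3):
    """Rank the terms of novels[index] by section-weighted TF times IDF."""
    tfs = [weighted_tf(novel) for novel in novels]
    # Document frequency: in how many novels each term occurs at least once.
    df = Counter(term for tf in tfs for term in tf)
    n_docs = len(novels)
    scores = {term: freq * math.log(n_docs / df[term]) for term, freq in tfs[index].items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy corpus: each novel maps a structural element to its tokens.
novels = [
    {"preface": ["sea"], "dialogue": ["whale", "whale", "captain"],
     "non_dialogue": ["sea", "ship"], "closing": ["sea"]},
    {"preface": ["city"], "dialogue": ["detective", "city"],
     "non_dialogue": ["sea", "rain"], "closing": ["city"]},
]
print(tfidf_keywords(novels, 0))
```

Terms that occur in every novel get an IDF of zero with this formula, so a smoothed IDF may be preferable on a small corpus.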

Guidelines on the Operation Phases of Manual Material Handling Tasks Through Literature Reviews

  • Lee, Kyung-Sun;Jung, Myung-Chul
    • Journal of the Ergonomics Society of Korea / Vol. 36, No. 4 / pp.325-341 / 2017
  • Objective: The purpose of this study is to suggest guidelines for the operation phases of manual material handling (MMH) tasks, based on literature reviews, in order to minimize injuries and musculoskeletal disorders. The guidelines are presented for the preparing, lifting, carrying, and lowering phases. We also summarized non-numerical general guidelines for MMH tasks. Background: Manual material handling is still a main cause of musculoskeletal disorders. Method: The literature review procedure consisted of database selection, keyword search, title review, abstract review for literature selection, and guideline review and arrangement. A total of 48 papers and books were analyzed in detail after the title and abstract reviews. Results: For the preparing phase, we suggested the basic conditions in MMH, the preparing procedure, clothing and protective equipment, and education. For the lifting and carrying phases, we recommended maximal acceptable weights by frequency and body posture. For the lowering phase, we suggested the lowest weight and safe body postures. Finally, we recommended general guidelines and guideline items for MMH. The general guidelines cover worker selection, technical education, and work design. Conclusion: We suggested guidelines for the four operation phases of MMH tasks: preparing, lifting, carrying, and lowering. Application: The findings of this study can be utilized as guidelines for proactive recommendations to workers in MMH tasks.