• Title/Summary/Keyword: technical term extraction (기술용어 추출)

Construction of Test Collection for Evaluation of Scientific Relation Extraction System (과학기술분야 용어 간 관계추출 시스템의 평가를 위한 테스트컬렉션 구축)

  • Choi, Yun-Soo; Choi, Sung-Pil; Jeong, Chang-Hoo; Yoon, Hwa-Mook; You, Beom-Jong
    • Proceedings of the Korea Contents Association Conference / 2009.05a / pp.754-758 / 2009
  • Extracting information from large-scale document collections is useful not only for information retrieval but also for question answering and summarization. Although relation extraction is an important area, it is difficult to develop and evaluate a machine-learning-based system without a test collection. This study shows how to build a test collection (KREC2008) for relation extraction systems. We extracted technology terms from journal abstracts and used WordNet to select several candidate relations between them. Judges trained in the evaluation process then assigned a relation from among the candidates. The procedure allows even non-experts to build a test collection easily. KREC2008 is open to researchers and developers and will be used for developing and evaluating relation extraction systems.

Terminology Tagging System using elements of Korean Encyclopedia (백과사전 기반 전문용어 태깅 시스템)

Ontology construction for Korea Telecom(KT) Terms (KT 용어 온토로지 구축)

  • Roh, Duck-Keun; Byun, Dong-Ryul; Park, Soon-Cheol
    • Proceedings of the Korean Information Science Society Conference / 2007.10d / pp.550-555 / 2007
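  • In this paper, we extracted the main terms used at Korea Telecom (KT) and built a term ontology based on the uniqueness of the extracted terms and the relationships among them. Through example search queries over the resulting ontology, we also explored ways in which it could help manage a company's various domains. Protege, an ontology editor, was used as the construction tool, and the ontology was organized under four top-level classes: Organization, Employee, Product, and Technique. On the basis of this study, KT's diverse knowledge and information can be systematized and the KT database managed effectively. We also expect the ontology to serve as a foundation for building a future KT semantic search system.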

Con-Talky: Information Extraction and Visualization Platform for Communication of Construction Industry (Con-Talky: 건설 분야 전문가의 의사소통을 위한 정보 추출 및 시각화 플랫폼)

  • Shim, Midan; Park, Chanjun; Hur, Yuna; Lim, Heuiseok
    • Annual Conference on Human and Language Technology / 2021.10a / pp.476-481 / 2021
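  • This paper proposes Con-Talky to address the communication problems among construction-industry experts that arise from inconsistent terminology and the diversity of documents. Con-Talky is a platform that combines representative natural language processing techniques (morphological analysis, dependency parsing, and semantic role labeling) to visualize "design standard documents" in the construction field and to extract key information automatically. The platform can ease communication problems among civil engineering experts and can contribute to resolving terminology inconsistency and to standardization. This paper is also the first to apply natural language processing to the construction and civil engineering fields in Korea. To stimulate research in this area, we built a monolingual corpus and triple data specialized for the construction field and released them in full.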

Application and Process Standardization of Terminology Dictionary for Defense Science and Technology (국방과학기술 전문용어 사전 구축을 위한 프로세스 표준화 및 활용 방안)

  • Choi, Jung-Hwoan; Choi, Suk-Doo; Kim, Lee-Kyum; Park, Young-Wook; Jeong, Jong-Hee; An, Hee-Jung; Jung, Han-Min; Kim, Pyung
    • The Journal of the Korea Contents Association / v.11 no.8 / pp.247-259 / 2011
  • Defense science and technology terminology used by defense-related agencies needs to be collected, managed, and standardized. A standardized terminology dictionary can eliminate confusion about terms and make them more accessible through offline and online services. This study focuses on compiling national defense science and technology terminology, publishing a dictionary that includes it, improving information analysis in the defense area, and using offline and online services to make the terminology easily accessible. Based on the results of this study, the terminology data will be used for: 1) defense science and technology terminology databases and their publication; 2) information analysis in military fields; 3) multilingual information analysis using translated terms in the thesauri; 4) verification of the consistency of information processing; 5) language resources for terminology extraction.

Development of u-Health standard terminology and guidelines for terminology standardization (유헬스 표준용어 및 용어 표준화 가이드라인 개발)

  • Lee, Soo-Kyoung
    • Journal of the Korea Academia-Industrial cooperation Society / v.16 no.6 / pp.4056-4066 / 2015
  • To promote understanding of u-Health terminology and to stimulate the u-Health industry, a standard u-Health terminology is needed for communication. The purpose of this study is to develop a standard u-Health terminology and to provide guidelines for terminology standardization. We developed 187 standard u-Health terms through a process of data acquisition, term extraction, term refinement, term selection, and term management, based on reports, glossaries, and Telecommunications Technology Association (TTA) standards on u-Health. As a result, a standard terminology and guidelines for u-Health suited to the domestic environment were proposed. They cover the definition, classification, components, methods, and principles of the process for u-Health standard terminology. The standard terminology and standardization guidelines presented in this study would help reduce the cost of adopting and managing terminology while making information transfer easier, which in turn would promote efficient development of the u-Health industry in general.

Research of Topic Analysis for Extracting the Relationship between Science Data (과학기술용어 간 관계 도출을 위한 토픽 분석 연구)

  • Kim, Mucheol
    • The Journal of Society for e-Business Studies / v.21 no.1 / pp.119-129 / 2016
  • With the development of the web, a large amount of information is generated on the social web, and many researchers have focused on extracting and analyzing social issues from various social data. The proposed approach gathers science data and analyzes it with the LDA algorithm. It generated clusters that represent social topics related to 'health'. As a result, we could deduce the relationship between science data and social issues.

Automatic Keyword Extraction System for Korean Documents Information Retrieval (국내(國內) 문헌정보(文獻情報) 검색(檢索)을 위한 키워드 자동추출(自動抽出) 시스템 개발(開發))

  • Yae, Yong-Hee
    • Journal of Information Management / v.23 no.1 / pp.39-62 / 1992
  • In this paper, about 60 auxiliary words and 320 stopwords are selected by analyzing sample data, and the stopwords are classified into four types: left truncation, right truncation, auxiliary-word truncation, and normal. A keyword extraction system is proposed that efficiently truncates auxiliary words from words, converts Chinese words to Korean, and excludes stopwords. The keywords selected by this system show a 92.2% agreement ratio with keywords selected manually by experts. In addition, compound words of 4 to 6 characters generate twice as many additional new words, and 58.8% of those are useful as keywords.
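
    The truncate-then-filter idea in this abstract can be sketched as follows. This is a minimal illustration, not the paper's system: the suffix and stopword lists are hypothetical stand-ins for the roughly 60 auxiliary words and 320 stopwords the abstract mentions, tokenization is plain whitespace splitting, and the Chinese-to-Korean conversion step is omitted.

```python
# Hypothetical, tiny word lists for illustration only; the paper's actual lists
# (about 60 auxiliary words and 320 stopwords) are not given in the abstract.
AUXILIARY_SUFFIXES = ("에서", "으로", "은", "는", "이", "가", "을", "를", "의", "에")
STOPWORDS = {"연구", "방법"}


def strip_auxiliary(word: str) -> str:
    """Remove the longest matching auxiliary suffix, keeping at least one character."""
    for suffix in sorted(AUXILIARY_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)]
    return word


def extract_keywords(text: str) -> list[str]:
    """Split on whitespace, truncate auxiliary suffixes, and drop stopwords."""
    keywords = []
    for token in text.split():
        stem = strip_auxiliary(token)
        # Keep each surviving stem once, in order of first appearance.
        if stem not in STOPWORDS and stem not in keywords:
            keywords.append(stem)
    return keywords


print(extract_keywords("시스템에서 방법은 키워드를 추출"))
```

    Naive suffix stripping of this kind also cuts words that merely end in one of the listed characters, which is presumably one reason a curated list and several truncation types are needed.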

Analyzing and Extracting Relations between Topic Keywords Based on Word Formation (조어 중심적 주제어간 관계 추출 및 분석)

  • Jung, Han-Min; Lee, Mi-Kyoung; Sung, Won-Kyung
    • Proceedings of the Korean Society for Language and Information Conference / 2008.06a / pp.166-171 / 2008
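  • This study targets science and technology fields, especially IT, where well-known and widely used lexical semantic networks and thesauri are hard to apply, and aims to build large-scale relations between terms quickly for use in search browsing and navigation. Following a conventional thesaurus construction procedure requires meticulous work by domain experts and high cost, which makes it practically difficult to reach a sufficient size. Automatic thesaurus construction methods, in turn, require a massive corpus in which the terms occur, and the resulting relations are not easy to understand intuitively. This study applies rules based on word-formation criteria to about 370,000 topic keywords obtained from a corpus of international academic papers and its metadata in order to extract hypernym-hyponym, related-term, and sibling relations. The rules extract about 600,000 hypernym-hyponym relations, about 6.4 million related-term relations, and about 20 million sibling relations. We also manually analyze part of the extraction results to identify the types of errors that simple extraction rules produce and discuss how they can be addressed in future work.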
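
    The abstract does not spell out its word-formation rules, so the sketch below only illustrates the general idea under an assumed rule set: dropping the first word of a multi-word term yields its broader term (if that term is also in the keyword list), and terms sharing the same broader term are siblings. The function name and the example terms are hypothetical, and related-term extraction is left out.

```python
from collections import defaultdict
from itertools import combinations


def extract_relations(terms):
    """Derive hypernym and sibling pairs from shared trailing words.

    Assumed rules (not taken from the paper):
      - Dropping the first word of a term gives its broader term,
        provided that broader term is itself in the term list.
      - Terms with the same broader term are siblings.
    """
    term_set = set(terms)
    hypernyms = set()            # (narrower term, broader term)
    children = defaultdict(set)  # broader term -> narrower terms
    for term in term_set:
        words = term.split()
        if len(words) < 2:
            continue
        parent = " ".join(words[1:])
        if parent in term_set:
            hypernyms.add((term, parent))
            children[parent].add(term)
    siblings = set()
    for kids in children.values():
        # Sort so each unordered sibling pair is emitted once, in a stable order.
        for a, b in combinations(sorted(kids), 2):
            siblings.add((a, b))
    return hypernyms, siblings


terms = ["network", "sensor network", "neural network", "wireless sensor network"]
hypernyms, siblings = extract_relations(terms)
print(sorted(hypernyms))
print(sorted(siblings))
```

    Because sibling pairs grow quadratically with the number of terms under one broader term, a rule like this would plausibly produce far more sibling relations than hypernym relations, in line with the counts reported in the abstract.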

Analysis of Technology Trends from Words in Patent Titles (특허 발명의 명칭에 쓰인 단어를 이용한 기술동향 분석 연구)

  • Kim, Tae-Jung; Lee, Myung-Sun; Choi, Ho-Nam
    • The Journal of the Korea Contents Association / v.10 no.4 / pp.433-437 / 2010
  • Patents contain meaningful technical achievements. Technology trends are often explained by analyzing term frequency, but a term can have different meanings in different fields. In this paper, words from the titles of US, Japanese, Korean, PCT, and EPO patents are collected for the 5 WIPO categories. The rate of change in frequency of each word was calculated, and the highest-ranked words in the 5 categories were analyzed to find the relationship between patents and technology development as well as technology trends.
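
    A rate of change in word frequency could be computed along the following lines. The abstract does not define its formula, so this sketch assumes a simple relative change between two periods, (after - before) / before, over whitespace-split lowercase title words; the function names and sample titles are hypothetical.

```python
from collections import Counter


def word_frequencies(titles):
    """Count lowercase word occurrences across a list of patent titles."""
    counts = Counter()
    for title in titles:
        counts.update(title.lower().split())
    return counts


def frequency_change_rates(titles_before, titles_after):
    """Return {word: (after - before) / before} for words seen in the earlier period."""
    before = word_frequencies(titles_before)
    after = word_frequencies(titles_after)
    # Words absent from the earlier period are skipped to avoid dividing by zero.
    return {word: (after[word] - n) / n for word, n in before.items()}


def top_rising(rates, k=3):
    """Return the k words with the highest change rate."""
    return sorted(rates, key=rates.get, reverse=True)[:k]


earlier = ["Battery charging device", "Battery pack"]
later = ["Battery pack", "Battery cell", "Battery charging device", "Battery module"]
rates = frequency_change_rates(earlier, later)
print(top_rising(rates, k=1))
```

    In practice the counts would be computed separately for each category, since the abstract notes that the same term can mean different things in different fields.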