• Title, Summary, Keyword: 핵심어 추출 (keyword extraction)

Search Result 69, Processing Time 0.054 seconds

Syllables-based Named Entity Extraction and Automatic Corpus Construction using Bidirectional Dynamic LSTM (Bidirectional Dynamic LSTM을 이용한 음절 단위 개체명 추출 및 자동화된 말뭉치 구축)

  • Oh, Sungsik;Lim, Changdae;Ahn, Keeho;Park, Weijin
    • 한국어정보학회:학술대회논문집 / pp.317-320 / 2017
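  • Named entity recognition identifies the words in a natural-language sentence that can be given meaning through classification, such as places, works, and people, and is a core technology for semantic analysis. Many current studies on named entity analysis depend on the output of morphological analysis, so the accuracy of the morphological analysis affects the performance of the named entity analysis. This study proposes a syllable-based named entity analysis technique that skips morphological analysis, improving performance on online chat language and neologisms, for which morphological analysis is less accurate. In addition, we define a process that automatically builds a syllable-level named entity corpus and a named entity dictionary, with the aim of improving accuracy and widening the range of recognized categories. The proposed technique can classify 129 named entity types based on the Korean named entity standard, and we expect it to contribute substantially to commercialization in industries that need natural language processing.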
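The abstract above describes labeling text at the syllable level instead of the morpheme level. A minimal sketch of what syllable-level labels could look like follows, assuming a BIO tagging scheme and a small hand-made entity dictionary. The function name `tag_syllables`, the `PER`/`LOC` labels, and the longest-match rule are illustrative assumptions, not the authors' corpus-construction process, and the Bidirectional LSTM tagger itself is not shown.

```python
def tag_syllables(sentence, entity_dict):
    """Label every character (Hangul syllable) with a BIO tag.

    entity_dict maps a surface string to an entity label (hypothetical data).
    At each position the longest dictionary entry that matches wins.
    """
    tags = ["O"] * len(sentence)
    # Longest entries first, so "서울시" is preferred over its prefix "서울".
    entries = sorted(
        ((s, l) for s, l in entity_dict.items() if s),  # skip empty keys
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    i = 0
    while i < len(sentence):
        for surface, label in entries:
            if sentence.startswith(surface, i):
                tags[i] = "B-" + label
                for j in range(i + 1, i + len(surface)):
                    tags[j] = "I-" + label
                i += len(surface)
                break
        else:
            i += 1  # no entity starts here; leave the "O" tag
    return list(zip(sentence, tags))


if __name__ == "__main__":
    demo_dict = {"김철수": "PER", "서울": "LOC"}  # hypothetical dictionary
    for syllable, tag in tag_syllables("김철수는 서울에 산다", demo_dict):
        print(repr(syllable), tag)
```

Because labels attach to syllables, no tokenizer or morphological analyzer is needed before tagging, which is the property the abstract relies on for noisy text.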

A Study on the Integration of Recognition Technology for Scientific Core Entities (과학기술 핵심개체 인식기술 통합에 관한 연구)

  • Choi, Yun-Soo;Jeong, Chang-Hoo;Cho, Hyun-Yang
    • Journal of the Korean Society for Information Management / v.28 no.1 / pp.89-104 / 2011
  • Large-scale information extraction plays an important role in advanced information retrieval, question answering, and summarization. Information extraction converts unstructured documents into formalized, tabular information and consists of named-entity recognition, terminology extraction, coreference resolution, and relation extraction. Because these component technologies have so far been studied independently, integrating them is not trivial: their input/output formats and operating environments differ. As a result, it is difficult to extract both named entities and technical terms from scientific documents in one pass. To extract these entities automatically and together, we developed a framework for scientific core entity extraction that embraces all the pivotal language processors, the named-entity recognizer, and the terminology extractor.

The study on physical factors related with emotional reaction on the flying path (나는(flying) 궤적(path)에 있어서 감성반응을 일으키는 물리적 속성(요소)에 대한 연구)

  • Kim, Do-Yun;Jeong, Jea-Wook
    • Archives of Design Research / v.18 no.4 / pp.139-146 / 2005
  • Animation work has so far relied on the animator's subjective sensibility and experience. Software design has been built on intellectual data because such data is easy to objectify and digitize. In contrast, many aspects of human sensibility are hard to objectify and digitize. This study investigates how to digitize and objectify human sensibility and use it as quantitative data, taking the flying path as its subject. In the experiment, we collected sensibility words that people use to describe a flying path, extracted evaluation words from them, and collected sketch images of flying paths for those evaluation words. Based on the sketches, we produced moving-image samples, which are the core of this study. Quantification theory types III and I were then used to analyze the correlation between the sensibility words and the moving-image samples. As a result, the study identifies the structure of the sensibility words and the moving-image samples and analyzes the physical stimulus elements of a flying path. A flying path is the path an object has traveled; after seeing such a path, viewers express particular sensibility words through the interaction of several stimulus elements. These elements include physical properties such as speed, rotation, pattern, and arc length. The purpose of the study is to objectify and quantify animation work that has been created from animators' subjective thought and experience, so that the results can be used in future animation work.

Matching Agent using Automatic Weight-Control (가중치 자동 조절을 이용한 매칭 에이전트)

  • 김동조;박영택
    • Proceedings of the Korea Intelligent Information System Society Conference / pp.439-445 / 2000
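  • Machine learning, one of the artificial intelligence techniques used in data mining, can be applied to extract and use knowledge from large databases or information repositories that contain multidimensional attributes. This paper proposes a method that applies a weight to each attribute based on the query in order to classify the data set the user wants, and that improves the search results by changing the attribute weights dynamically through user feedback. To classify the data set, we used k-nearest neighbor classification with weights applied to the distance between attributes. To extract the rules that change the attribute weights dynamically, we applied decision rule generation based on decision tree construction. As an experiment on improving search results, we implemented the core part of an online couple-matching system and applied the method to it.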
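The attribute-weighted k-nearest neighbor step described above can be sketched as follows. This is a minimal illustration assuming numeric attributes and a weighted Euclidean distance; the names `weighted_distance` and `k_nearest`, the sample profiles, and the weight values are hypothetical, and the decision-tree rules that adjust the weights from user feedback are not modeled here.

```python
import math


def weighted_distance(query, item, weights):
    """Weighted Euclidean distance; a larger weight makes an attribute matter more."""
    return math.sqrt(
        sum(w * (q - x) ** 2 for q, x, w in zip(query, item, weights))
    )


def k_nearest(query, items, weights, k=3):
    """Return the names of the k items closest to the query.

    items is a list of (name, attribute_vector) pairs (hypothetical data layout).
    """
    ranked = sorted(
        items, key=lambda pair: weighted_distance(query, pair[1], weights)
    )
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical profiles: (age, height in cm).
    profiles = [("a", [30, 170]), ("b", [25, 180]), ("c", [40, 171])]
    query = [30, 172]
    print(k_nearest(query, profiles, weights=[1.0, 0.0], k=2))  # age only
    print(k_nearest(query, profiles, weights=[0.0, 1.0], k=2))  # height only
```

Shifting weight from one attribute to another reorders the neighbors without touching the data, which is why feedback-driven weight changes can alter the search results.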

A Study on Constructing Theological Thesaurus (신학 시소러스 구축에 관한 연구)

  • Yoo, Yeong-Jun
    • Journal of the Korean Society for Information Management / v.27 no.3 / pp.207-225 / 2010
  • Terms collected from English-language theological dictionaries and the Scripture are used to construct the conceptual relationships of a theological thesaurus. From these terms, the basic thesaurus relationships are constructed: equivalence, hierarchical, and associative. Equivalence relationships include Hebrew, Greek, and Latin terms as descriptors, and hierarchical relationships comprise generic, instance, whole-part, and polyhierarchical relationships. The kinds of conceptual relationships do not differ greatly between this theological thesaurus and thesauri in other subjects. Examples are drawn from Biblical Theology because of its strength in viewing the Scripture and Protestantism from a comprehensive perspective. In this context, one of the main features of the theological thesaurus is its large number of allegorical terms, a result of typology, which is the core structure.

Methods to Propel Tourism of Yeosu City Using Big Data (빅데이터를 활용한 여수관광 활성화 방안)

  • Lim, Yang-Ui;Kim, Kang-Chul
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.15 no.4 / pp.739-746 / 2020
  • The fourth industrial revolution, introduced at the World Economic Forum in 2016, has had a huge effect on the tourism industry as well as on core ICT technologies such as big data and IoT. This paper proposes methods to promote tourism in Yeosu city through big data analysis and questionnaires. Sentiment words and positive-negative trends are extracted with Social Metrics, keywords for Yeosu tourism trends are extracted and analyzed with Naver DataLab, and the results are visualized in R. Frequency, difference, factor, covariance, and regression analyses are run in SPSS on questionnaires from 493 visitors who traveled to Yeosu city. Sentiment analysis for Yeosu tourism and the maritime cable car shows that positive sentiment far outweighs negative sentiment. The SPSS analyses show that the Yeosu area is statistically significant for the tour satisfaction index and for tourism revitalization in Yeosu, and that favorite sightseeing places and the electronic devices used for searching differ by age group. Sightseeing places such as a maritime park with soft content that gives tourists joy and healing are highly attractive in both the big data and the questionnaire analyses.

Analysis of Trends of Researches in Science Education on Underrepresented Students (소외계층학생을 대상으로 한 과학교육 연구의 동향 분석)

  • Nam, Ilkyun;Rhee, Sang Won;Im, Sungmin
    • Journal of The Korean Association For Science Education / v.37 no.6 / pp.921-935 / 2017
  • This study investigates trends in science education research on underrepresented students by examining the Korean science education literature. Studies on underrepresented students were extracted from KCI-listed and KCI-candidate journals and from theses published between 1984 and February 2017, and were analyzed by source, year of publication, research design, method, and content. A total of 125 journal papers and 147 theses were extracted. Of these, 61%, 20%, and 6% concerned students with disabilities, underachievers, and North Korean defector students, respectively. Studies on other groups, such as students from multicultural or low-income families and students from rural areas, each accounted for less than 5%. By year of publication, the number of studies, which focused on students with disabilities and underachievers, grew steadily from 1984 while staying in the single digits per year. From around 2008 the number rose rapidly, to more than 20 studies per year. By research design, 58% were quantitative, 28% qualitative, and 14% hybrid. By research method, 30% were experimental research, 22% interpretive research, 20% correlation analysis, and 14% survey research. Visualizing the relationships between the research groups and the extracted keywords showed that, although science education research on underrepresented students covers varied content, no keywords have been studied continuously and intensively in this area. The structural relationship between the keywords and each research group showed that 'academic achievement' is the keyword with the highest degree of mediateness and connectedness.

A Theoretical Study on Indexing Methods using the Metadata for the Automatic Construction of a Thesaurus Browser (시소러스 브라우저 자동구현을 위한 Metadata를 이용한 색인어 처리방안에 대한 연구)

  • Seo, Whee
    • Journal of Korean Library and Information Science Society / v.35 no.4 / pp.451-467 / 2004
  • This paper presents a theoretical analysis of automatic indexing, which is vital to constructing a thesaurus browser, of clustering algorithms for building hierarchical relations among terms, and of methods for automatically constructing a thesaurus browser. Methods for automatically selecting index terms in web documents are studied by surveying ways to analyze and process metadata, which plays in web documents the bibliographic role found in traditional paper documents. Because most web documents contain no metadata, the study also suggests adding metadata to web documents with an automatic metadata editor.

Design and Implementation of Tag Coupling-based Boolean Query Matching System for Ranked Search Result (태그결합을 이용한 불리언 검색에서 순위화된 검색결과를 제공하기 위한 시스템 설계 및 구현)

  • Kim, Yong;Joo, Won-Kyun
    • Journal of the Korean Society for Information Management / v.29 no.4 / pp.101-121 / 2012
  • Because IR systems that adopt only the Boolean IR model cannot provide ranked search results, users must check huge result sets one by one, which is time-consuming. This study proposes a method that ranks search results in the Boolean IR model by using coupling information between tags instead of index weight information. Because the proposed method uses document queries instead of general user queries, the key tags that serve as queries are extracted from a relevant document. During query extraction, a variety of groups of Boolean queries are created from tag couplings. Ranked search results are then obtained by matching that uses differential information among the query groups and tag significance information. To demonstrate the usability of the proposed method, an experiment was conducted to find research trend analysis information for selected research information. Also, a service based on the proposed method was provided for a year to gather user feedback. The results showed high user satisfaction.

An Image-based Word Matching Method for Large volume Printed Hangul Document Retrieval (대용량 인쇄 한글 문서 검색을 위한 영상 기반 단어 매칭 방법)

  • 진영범;오일석
    • Proceedings of the Korean Information Science Society Conference / pp.461-463 / 2000
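  • Searching for keywords in machine-printed document images is a core technology essential to many applications, but converting the images to text by hand or with OCR software is limited by its high cost. Because original documents are now often stored as images, this paper adopts a retrieval method based on image-based matching. Features are the most important factor in character or word matching. So that the method can be used for large-volume Hangul document retrieval, as in digital libraries where the words to be matched number from tens of millions to billions, this paper proposes a fast retrieval method that uses four-direction profile features, which are relatively simple to extract and whose dimensionality is easy to adjust. Experiments showed that meaningful retrieval performance can be obtained even with simple features of about 8 dimensions.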
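One way to picture the four-direction profile features mentioned above is the sketch below. It assumes a binarized word image stored as a list of rows (1 for ink, 0 for background), measures for each row or column the distance from an edge to the first ink pixel, and averages each profile over a fixed number of bins to control the dimensionality. The function names, the binning rule, and the sample image are assumptions for illustration; the paper's exact feature definition may differ.

```python
def profile(lines):
    """Distance from the start of each line to its first ink pixel.

    A line with no ink gets the full line length.
    """
    depth = len(lines[0])
    return [next((i for i, v in enumerate(line) if v), depth) for line in lines]


def bin_means(values, bins):
    """Average a profile over `bins` nearly equal segments."""
    n = len(values)
    means = []
    for b in range(bins):
        segment = values[b * n // bins:(b + 1) * n // bins]
        means.append(sum(segment) / len(segment) if segment else 0.0)
    return means


def four_direction_features(image, bins=2):
    """Concatenate binned left, right, top, and bottom profiles (4 * bins values)."""
    columns = [list(col) for col in zip(*image)]
    left = profile(image)
    right = profile([row[::-1] for row in image])
    top = profile(columns)
    bottom = profile([col[::-1] for col in columns])
    features = []
    for prof in (left, right, top, bottom):
        features.extend(bin_means(prof, bins))
    return features


if __name__ == "__main__":
    word_image = [  # hypothetical 4x4 binarized image
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
        [1, 1, 1, 1],
    ]
    print(four_direction_features(word_image, bins=2))
```

With `bins=2` the vector has 8 values, which matches the order of magnitude the abstract reports; raising `bins` trades speed for a finer description of the word shape.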
