• Title/Summary/Keyword: automatic information extraction (자동정보 추출)


Performance and Limitations of a Korean Sentiment Lexicon Built on the English SentiWordNet (영어 SentiWordNet을 이용하여 구축한 한국어 감성어휘사전의 성능 평가와 한계 연구)

  • Shin, Donghyok;Kim, Sairom;Cho, Donghee;Nguyen, Minh Dieu;Park, Soongang;Eo, Keonjoo;Nam, Jeesun
    • 한국어정보학회:학술대회논문집 / 2016.10a / pp.189-194 / 2016
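  • This study, conducted as part of the MUSE project for building multilingual sentiment lexicons and sentiment-annotated corpora, examines the significance and limitations of constructing a Korean sentiment lexicon from SentiWordNet, a representative English sentiment lexicon. First, from the 117,659 entries of the English SentiWordNet, we extracted those with a positive or negative score of 0.5 or higher and translated them automatically with Google Translate. After removing untranslated and duplicate items, manual classification by linguistics experts yielded 3,665 sentiment words. Even this set, however, contained a considerable number of items, such as disease names, that can hardly be regarded as pure sentiment words. When the lexicon was applied to corpora to identify sentiment words automatically, recall was only 47.4% (positive) and 37.7% (negative) on a restaurant corpus and 55.2% and 32.4% on an IT corpus. F-measure was also low: 62.3% and 38.5% for positive and negative on the restaurant corpus, and 65.5% and 44.6% on the IT corpus. The SentiWordNet-based lexicon therefore proved insufficient to serve as a sentiment lexicon. On this basis, we argue that building a Korean sentiment lexicon requires a systematic approach that takes the linguistic properties of Korean into account, and we introduce the SELEX sentiment lexicon, which is currently being supplemented and extended on the basis of the Korean electronic dictionary DECO.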
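The selection and evaluation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy scores, and the token-level gold labels are all invented, and the balanced F1 formula is an assumption, since the abstract does not say which F-measure variant was used.

```python
def select_entries(scored, threshold=0.5):
    """Keep words whose positive or negative score reaches the threshold."""
    lexicon = {}
    for word, (pos, neg) in scored.items():
        if max(pos, neg) >= threshold:
            lexicon[word] = "pos" if pos >= neg else "neg"
    return lexicon


def evaluate_lexicon(lexicon, tokens, gold):
    """Compare lexicon lookups with gold labels ("pos", "neg", or None)."""
    scores = {}
    for label in ("pos", "neg"):
        predicted = [lexicon.get(tok) == label for tok in tokens]
        actual = [g == label for g in gold]
        tp = sum(p and a for p, a in zip(predicted, actual))
        n_pred, n_gold = sum(predicted), sum(actual)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold if n_gold else 0.0
        # Balanced F-measure (F1); assumed, not stated in the abstract.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Toy data, invented purely for illustration: word -> (pos score, neg score).
scored = {
    "delicious": (0.75, 0.0),
    "great": (0.625, 0.0),
    "tasty": (0.375, 0.0),   # below the 0.5 cut-off, so it is dropped
    "awful": (0.0, 0.875),
    "menu": (0.0, 0.0),
}
lexicon = select_entries(scored)
tokens = ["delicious", "great", "tasty", "awful", "bland", "menu"]
gold = ["pos", "pos", "pos", "neg", "neg", None]
result = evaluate_lexicon(lexicon, tokens, gold)
```

In this toy run every word the lexicon flags is correct, but words missing from the lexicon ("tasty", "bland") pull recall down, which is the same kind of coverage gap the abstract reports.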


The Development of Automatic Ontology Generation System Using Extended Search Keywords (검색 키워드 확장을 이용한 온톨로지 자동 생성 시스템 개발)

  • Shim, Joon;Lee, Hong-Chul
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.6 / pp.1220-1228 / 2009
  • Ontologies, the core of the Semantic Web, are usually restricted to specific domains or built by defining meanings and relationships heuristically. Creating an ontology, however, is both very difficult and very time-consuming. Unlike ontologies used in specific fields, an ontology for the Web must cover an unlimited scope of knowledge and ways of expressing information, so it is hard to build one with the methods used for domain-specific ontologies. Automatic ontology generation therefore plays a very important role in the Semantic Web. In this paper, we propose methods for automatically creating and updating ontologies: the keywords that users enter into search engines are morphologically analyzed to extract index terms, and the keywords related to those index terms are then expanded.

Similar Patent Search Service System using Latent Dirichlet Allocation (잠재 의미 분석을 적용한 유사 특허 검색 서비스 시스템)

  • Lim, HyunKeun;Kim, Jaeyoon;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.8 / pp.1049-1054 / 2018
  • Keyword searching was used in the past to find similar patents, and automated classification by machine learning has been used more recently. Keyword searching analyzes data that has been formalized through data refinement. Its accuracy is high for short text, but for long text made up of many words, such as a document, it cannot analyze the meaning contained in the sentences. At the level of semantic analysis, automatic classification is used to classify sentences composed of several words through unstructured data analysis. There have been attempts to find similar documents by combining the two methods. However, the combined algorithm is problematic because the two analysis methods differ and unstructured and structured data must be used simultaneously. In this paper, we study a method that extracts the keywords implied in a document and uses LDA (Latent Dirichlet Allocation) to classify documents efficiently without human intervention and to find similar patents.

Automatic Meeting Summary System using Enhanced TextRank Algorithm (향상된 TextRank 알고리즘을 이용한 자동 회의록 생성 시스템)

  • Bae, Young-Jun;Jang, Ho-Taek;Hong, Tae-Won;Lee, Hae-Yeoun
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.5 / pp.467-474 / 2018
  • Organizing and documenting the contents of meetings and discussions is very important in many tasks. In the past, however, people had to organize the contents manually. In this paper, we describe the development of a system that automatically generates meeting minutes using the TextRank algorithm. The proposed system records all of the speakers' utterances in real time and calculates similarity based on the frequency with which sentences appear. Then, to create the meeting minutes, it extracts important words or phrases through an unsupervised learning algorithm that finds the relations between the sentences in the document data. In particular, we improved performance by introducing a keyword weighting technique into the TextRank algorithm, which adapts the PageRank algorithm to words and sentences.
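For readers unfamiliar with TextRank, the idea of running PageRank over a sentence-similarity graph can be sketched as below. This is a generic illustration rather than the paper's system: the similarity measure is the word-overlap form from the original TextRank formulation, and the `keyword_weights` boost is a hypothetical stand-in, since the abstract does not describe how its keyword weighting is computed.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(a, b):
    """Word overlap normalized by the log of each sentence's vocabulary size."""
    wa, wb = set(a), set(b)
    if not wa or not wb:
        return 0.0
    denom = math.log(len(wa)) + math.log(len(wb))
    return len(wa & wb) / denom if denom > 0 else 0.0


def textrank(sentences, keyword_weights=None, damping=0.85, iterations=50):
    """Return sentence indices ordered from most to least important."""
    keyword_weights = keyword_weights or {}
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Edge weights of the sentence graph (no self-loops).
    w = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    # Hypothetical keyword boost: scale each score by the weights of the
    # keywords the sentence contains (an assumption, not the paper's formula).
    boosted = [
        s * (1 + sum(keyword_weights.get(t, 0.0) for t in set(tok)))
        for s, tok in zip(scores, tokens)
    ]
    return sorted(range(n), key=lambda i: boosted[i], reverse=True)


sentences = [
    "The budget review is scheduled for Monday",
    "We need to finish the budget review before the release",
    "The release date depends on the budget review",
    "Lunch was pizza",
]
ranking = textrank(sentences)
ranking_kw = textrank(sentences, keyword_weights={"monday": 5.0})
```

The sentence that shares no words with the others receives only the baseline score and ranks last, while a large weight on a keyword moves the sentence containing it to the top.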

Melody Note - Music Score Editor and Play System (악보작성 및 재생 시스템)

  • Kim, Tae-Ki;Lee, Dae-jeong;Park, Mi-Ra;Min, Jun-Ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.10a / pp.1059-1062 / 2009
  • As the electronic processing of music has developed, interest in the automatic input of music has grown, and various studies on entering music into a computer have been carried out. However, previous approaches have the drawback that only experts can use them: beginners need prior knowledge to use traditional score-production programs. To resolve this, we propose a system that extracts the voice sounds produced by amateurs and then automatically draws the music score using the bandwidth of the sound source. The system makes composing convenient for amateurs. It can also play back the music produced on the computer. With the system, amateurs can compose using their voice and simple system operations, and can make music played by the instruments they choose.


Personalized EPG Application using Automatic User Preference Learning Method (사용자 선호도 자동 학습 방법을 이용한 개인용 전자 프로그램 가이드 어플리케이션 개발)

  • Lim Jeongyeon;Jeong Hyun;Kim Munchurl;Kang Sanggil;Kang Kyeongok
    • Journal of Broadcast Engineering / v.9 no.4 s.25 / pp.305-321 / 2004
  • With the advent of digital broadcasting, audiences can access a large number of TV programs and their information through multiple channels on various media devices. Access to so many programs gives a user more chances to sort through them and select the best one. However, this information overload inevitably demands much effort and patience from users looking for their favorite programs. It is therefore useful to provide a personalized broadcasting service that helps users find their favorite programs automatically. To meet the growing demand for TV personalization, we introduce an automatic user preference learning algorithm that 1) analyzes a user's usage history of TV program content, 2) extracts the user's watching pattern for a specific time and day, and 3) automatically calculates the user's preference. We also present our automatic TV program recommendation system, which uses MPEG-7 MDS (Multimedia Description Scheme: ISO/IEC 15938-5). For our experiments, we used TV audiences' watching histories, with ages, genders, and viewing times, obtained from AC Nielsen Korea. The results show that the proposed automatic user preference learning algorithm, based on a Bayesian network, can effectively learn users' preferences over the course of their TV watching periods.

A Robust Real-time Object Detection Method using Dominant Colors in Images (이미지의 주요 색상 정보들을 이용한 실시간 객체 검출 방법)

  • Park, Kyung-Wook;Koh, Jae-Han;Park, Jae-Han;Baeg, Seung-Ho;Baeg, Moon-Hong
    • Annual Conference of KIPS / 2007.05a / pp.301-304 / 2007
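  • Automatically recognizing the objects present in an image is a very important problem in a variety of fields, such as content-based image retrieval and robot vision. To address this problem, this paper proposes an algorithm that recognizes objects in an image in real time using the objects' dominant color information. The overall structure of the proposed method is as follows. First, the dominant colors of the object are extracted using the dominant color descriptor, one of the MPEG-7 color descriptors. Through Gaussian color modeling, this information is converted into color information that is more robust to external conditions such as lighting and shadows. Next, based on the converted color information, the pixel-value difference between the target object and the input image is computed, and pixels whose difference is at or above a threshold are removed. Finally, a single region is generated from the pixels that remain in the input image. In conclusion, we carry out and analyze experimental evaluations of the proposed method and examine several of its limitations. We also describe future research plans for addressing these problems.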
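The thresholding and region-generation steps can be sketched in a few lines. This is a simplified illustration under stated assumptions: the dominant colors are supplied directly instead of being extracted with the MPEG-7 descriptor, the Gaussian color modeling step is omitted, the difference is plain Euclidean distance in RGB, and the "region" is taken to be a bounding box. The function names and the toy image are invented.

```python
import math


def color_distance(p, q):
    """Euclidean distance between two RGB triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def detect_region(image, dominant_colors, threshold):
    """Return the bounding box (x_min, y_min, x_max, y_max) of kept pixels.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Pixels whose difference from every dominant color is at or above
    the threshold are removed; None is returned if nothing remains.
    """
    kept = []
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            diff = min(color_distance(pixel, c) for c in dominant_colors)
            if diff < threshold:
                kept.append((x, y))
    if not kept:
        return None
    xs = [x for x, _ in kept]
    ys = [y for _, y in kept]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 4x4 image: a 2x2 red object on a black background.
BLACK, RED = (0, 0, 0), (255, 0, 0)
image = [
    [BLACK, BLACK, BLACK, BLACK],
    [BLACK, RED, RED, BLACK],
    [BLACK, RED, RED, BLACK],
    [BLACK, BLACK, BLACK, BLACK],
]
region = detect_region(image, dominant_colors=[(250, 10, 10)], threshold=30)
no_match = detect_region(image, dominant_colors=[(0, 0, 255)], threshold=30)
```

A single pass over the pixels is what makes this kind of approach attractive for real-time use, though a bounding box is only one possible way to form the final region.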

Study on Automatic Mapping Method for Reference of Scholarly Papers (학술논문의 참고문헌 자동매핑 방법에 관한 연구)

  • Han, Jeong-Min;Jang, Hyun-Chul;Kim, Jin-Hyun;Yea, Sang-Jun;Kim, Sang-Kyun;Kim, Chul;Song, Mi-Young
    • Journal of Information Management / v.41 no.3 / pp.155-173 / 2010
  • With the advancement of learning and the growing diversity of topics, researchers in every area keenly feel the need to find the information they require precisely and quickly at any time. This study presents a way of constructing an automatic mapping system that compares and analyzes duplicated data and reports the result, built on an effective reference extraction method, together with a way of correcting wrongly used Chinese characters with a Traditional Korean Medicine dictionary. With this approach, duplicated reference data and Chinese character errors can be fixed. Under the situation that a number of references of newly published papers that can continuously be extracted.

A Systematic Evaluation of Thinning Algorithms for Automatic Vectorization of Cartographic Maps (지리도면의 자동 벡터화를 위한 영상 세선화 알고리즘의 체계적인 성능평가)

  • Lee, Kyung-Ho;Kim, Kyong-Ho;Cho, Sung-Bae;Choy, Yoon-Chul
    • The Transactions of the Korea Information Processing Society / v.4 no.12 / pp.2960-2970 / 1997
  • Recently, interest has been growing in a variety of fields in Geographic Information Systems (GIS), which facilitate efficient storage and retrieval of geographic information. Choosing an efficient input method is extremely important, because input takes most of the time and cost of constructing a GIS. Among the several steps involved, thinning the input image to produce a skeleton of unit width is a prerequisite to the automatic input of geographic maps. In this paper, we systematically evaluate the performance of representative thinning algorithms on geographic maps such as contour, cadastral, and water and sewer maps, and suggest an appropriate algorithm for each type of map. A thorough experiment indicates that Arcelli's method is best for contour maps, Holt's method for cadastral maps, and Chen's method for water and sewer maps.


Spatial Image Information Generation of Rock Wall by Automatic Focal Length Extraction System (초점거리 자동추출 시스템에 의한 암벽의 공간영상정보 생성)

  • Lee, Jae-Kee;Lee, Kye-Dong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.25 no.5 / pp.427-436 / 2007
  • Slopes formed during the construction of other facilities carry a high risk of collapse, yet existing inspection methods for collecting information at slope construction sites require long inspection periods, incur high costs, and need access for measuring instruments, which limits data collection. To acquire images with a zoom lens from any position, this study uses a free zoomer built from data values classified by focal length and develops an Image Loader system that automatically loads not only the camera information but also the camera test data values for the focal length at which each photograph was taken, even when a variety of cameras or lenses are used for measurement. In addition, by constructing three-dimensional spatial image information from the acquired images of objects, this study provides effective basic data for slope surveying and inspection and presents an accurate surveying method for dangerous slopes that cannot be accessed.