• Title/Summary/Keyword: tagging tool (태깅 도구)


A Parser for Noun's Definition in Korean Dictionary (국어사전의 명사 뜻풀이말 Parser)

  • Hur, Jeong;Kim, Jun-Soo;Lee, Soo-Kwang;Ok, Chul-young
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.321-323 / 2000
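  • A Korean dictionary contains, in structured form, much of the information needed in natural language processing, so methods are needed for automatically acquiring various kinds of linguistic knowledge from it. As a basic tool for such automatic knowledge acquisition, this study aims to implement a parser for the definition sentences of a Korean dictionary. To this end, we first build a syntactically annotated corpus of a certain level from the dictionary's definition sentences, and then automatically extract grammar rules and their probabilities from this corpus using a statistical method. We then implement a probabilistic chart parser that applies them. The parser showed an 11.61% improvement in accuracy over the Korea University tagger, which indicates that syntactic structure information is also useful for part-of-speech tagging.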


Saken: A Korean Event Recognizer (Saken: 한국어 사건 인식 시스템)

  • You, Hyun-Jo;Kim, Moonhyung;Junho, Juliano P.;Nam, Seungho;Shin, Hyopil
    • Annual Conference on Human and Language Technology / 2009.10a / pp.25-30 / 2009
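  • We introduce the Saken tagger, which automatically recognizes events in Korean natural language text. The Saken tagger was developed as one module of the Korean TARSQI Toolkit, a system for the automatic recognition of events and time in Korean, but it can also be used independently as an event extraction tool. It aims to be a general-purpose event analyzer that does not restrict its scope to a pre-built list of events or to a specific domain. This paper introduces the linguistic background for event tagging and the submodules that make up the Saken tagger, and analyzes the results of an evaluation experiment using newspaper articles.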


Design of a Video Metadata Schema and Implementation of an Authoring Tool for User Edited Contents Creation (User Edited Contents 생성을 위한 동영상 메타데이터 스키마 설계 및 저작 도구 구현)

  • Song, Insun;Nang, Jongho
    • Journal of KIISE / v.42 no.3 / pp.413-418 / 2015
  • In this paper, we design a new video metadata schema for searching video segments to create UEC (User Edited Contents). The proposed schema employs hierarchically structured units of 'Title-Event-Place(Scene)-Shot' and defines the semantic information fields of each segment unit in structured form. Because the schema was defined by analyzing the structure of existing UECs and by experimenting with tagging and searching video segment units for UEC creation, it helps users find useful video segments for UEC more easily than MPEG-7 MDS (Multimedia Description Scheme), a general-purpose international standard for video metadata.

Constructing Tagged Corpus and Cue Word Patterns for Detecting Korean Hedge Sentences (한국어 Hedge 문장 인식을 위한 태깅 말뭉치 및 단서어구 패턴 구축)

  • Jeong, Ju-Seok;Kim, Jun-Hyeouk;Kim, Hae-Il;Oh, Sung-Ho;Kang, Sin-Jae
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.6 / pp.761-766 / 2011
  • A hedge is a linguistic device for expressing uncertainty. Hedges are used when the writer is uncertain or has doubts about the content of a sentence. Because of this uncertainty, sentences with hedges are considered non-factual. Many applications need to determine whether a sentence is factual. Hedge detection benefits information retrieval, information extraction, and QnA systems, which can target non-hedge sentences to obtain more accurate results. In this paper, we constructed a Korean hedge corpus, extracted generalized hedge cue-word patterns from it, and used them to detect hedges. In our experiments, we achieved an F1-measure of 78.6%.
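
The pattern-based detection described in this abstract can be illustrated with a minimal sketch. The cue patterns below are hypothetical English examples chosen for readability; the paper's own patterns are Korean and were extracted from its corpus, and the toy sentences and gold labels are likewise assumptions, not the paper's data.

```python
import re

# Hypothetical English cue patterns, for illustration only; the paper's
# patterns are Korean and were extracted from its own tagged corpus.
CUE_PATTERNS = [
    r"\bmay\b",
    r"\bmight\b",
    r"\bseems? to\b",
    r"\bit is (?:likely|possible) that\b",
    r"\bprobably\b",
]
CUE_RE = re.compile("|".join(CUE_PATTERNS), re.IGNORECASE)


def is_hedge(sentence: str) -> bool:
    """Return True if any cue pattern occurs in the sentence."""
    return CUE_RE.search(sentence) is not None


def f1_score(gold: list[bool], predicted: list[bool]) -> float:
    """F1 for the positive (hedge) class."""
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy evaluation set (assumed labels): the last sentence is a hedge whose
# cue word is missing from the pattern list, so it is a false negative.
sentences = [
    "The drug may reduce inflammation.",
    "The drug reduces inflammation.",
    "It is likely that the results generalize.",
    "The effect is perhaps larger in adults.",
]
gold = [True, False, True, True]
predicted = [is_hedge(s) for s in sentences]
f1 = f1_score(gold, predicted)
```

A missed cue word lowers recall but not precision, which is one reason generalized patterns extracted from a tagged corpus are preferable to a short hand-written list.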

Development of GML Map Visualization Service and POI Management Tool using Tagging (GML 지도 가시화 서비스 및 태깅을 이용한 POI 관리 도구 개발)

  • Park, Yong-Jin;Song, Eun-Ha;Jeong, Young-Sik
    • Journal of Internet Computing and Services / v.9 no.3 / pp.141-158 / 2008
  • In this paper, we developed the GML Map Server, which visualizes maps based on GML, an international standard for exchanging maps in a common format and for the interoperability of GIS information. It also transmits GML maps to mobile devices efficiently by using dynamic map partitioning and caching. To visualize the map on a mobile device in real time, it manages partitions based on the device's visualization area and serializes each partition area for transmission. The received partition area is then assembled on the mobile device and visualized by partitioning it again into four visible areas based on the device's display. The area is managed by a caching algorithm that takes the repeated use of received maps into account, so that resources are used efficiently. In addition, to prevent transmission delays in areas of the map with a high density of instances, an adaptive map partition mechanism is proposed to keep transmission time regular. The GML Map Server can trace the position of mobile devices in a WIPI environment. A field emulator can create mobile devices, move them, and trace their positions in place of the real world. We also developed POIM (POI Management), which manages POI information hierarchically and makes POI search efficient by using individual tagging technology with a visual interface.


PPEditor: Semi-Automatic Annotation Tool for Korean Dependency Structure (PPEditor: 한국어 의존구조 부착을 위한 반자동 말뭉치 구축 도구)

  • Kim Jae-Hoon;Park Eun-Jin
    • The KIPS Transactions:PartB / v.13B no.1 s.104 / pp.63-70 / 2006
  • In general, a corpus contains a great deal of linguistic information and is widely used in natural language processing and computational linguistics. Creating such a corpus, however, is expensive, labor-intensive, and time-consuming. To alleviate this problem, annotation tools for building corpora rich in linguistic information are indispensable. In this paper, we design and implement an annotation tool for building a Korean dependency tree-tagged corpus. Ideally the corpus would be created fully automatically, without annotator intervention, but in practice this is impossible. Like most other annotation tools, the proposed tool is semi-automatic; it is designed for correcting errors generated by basic analyzers such as a part-of-speech tagger and a (partial) parser. We also designed it to avoid repetitive work while correcting errors and to be easy and friendly to use. Using the proposed tool, 10,000 Korean sentences containing over 20 words were annotated with dependency structures by eight annotators working 4 hours a day for 2 months. We are confident that the tool yields accurate and consistent annotations while reducing labor and time.

An development of framework and a supporting tool for organizing Grouped Folksonomy (그룹화된 폭소노미 구축을 위한 프레임워크와 지원도구의 개발)

  • Kang, Yu-Kyung;Hwang, Suk-Hyung
    • Journal of the Korea Society of Computer and Information / v.16 no.5 / pp.109-125 / 2011
  • A folksonomy is a new classification approach in which users organize information by freely attaching one or more tags to various resources published on the web. Recently, to provide useful services and organize folksonomy data, many folksonomy-based collaborative tagging systems have offered additional functionality for grouping the elements of a folksonomy. In this paper, we propose an organization framework for grouped folksonomies. Specifically, we suggest the grouped folksonomy model, which extends a folksonomy with the concept of a "group", together with fundamental operations (Group Aggregation, Group Composition, Group Intersection, Group Difference) for grouping folksonomy elements. We also developed a supporting tool (GFO) that constructs grouped folksonomies and executes the fundamental operations, and we present some cases that use these operations to group folksonomy elements. With this approach, we can construct grouped folksonomies and organize and extract useful information from folksonomy data by grouping its elements.

On development of supporting tool for Folksonomy Mining based on Formal Concept Analysis (형식개념분석을 이용한 폭소노미 마이닝 기법과 지원도구의 개발)

  • Kang, Yu-Kyung;Hwang, Suk-Hyung;Yang, Hae-Sool
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.8 / pp.1877-1893 / 2009
  • A folksonomy is a user-generated taxonomy for organizing information, in which users assign tags to resources published on the web. In a folksonomy-based system, collaborative tagging by many users creates triadic data that capture the relations among users, tags, and resources. Such folksonomy data has been used in the Semantic Web and Web 2.0 as metadata about web resources. In this paper, we propose an FCA-based folksonomy data mining approach for extracting useful information from folksonomy data from various points of view, and we developed a tool to support it. To verify the usefulness of the proposed approach and FMT, we ran experiments on data from del.icio.us, a popular folksonomy-based bookmarking system, and we report the results.
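
As a rough illustration of what Formal Concept Analysis computes, the sketch below enumerates the formal concepts of a tiny resource-by-tag context. It is a simplification under stated assumptions: the abstract describes triadic data (users, tags, resources), whereas this sketch uses a two-dimensional projection, and the context, names, and brute-force enumeration are all hypothetical rather than the paper's method.

```python
from itertools import combinations

# Hypothetical toy context: resource -> set of tags assigned to it.
# The user dimension of the folksonomy is projected away here.
CONTEXT = {
    "r1": {"python", "nlp"},
    "r2": {"python", "web"},
    "r3": {"python", "nlp", "corpus"},
}
ALL_TAGS = set().union(*CONTEXT.values())


def common_tags(resources):
    """Tags shared by every resource in the set (all tags if the set is empty)."""
    tags = set(ALL_TAGS)
    for r in resources:
        tags &= CONTEXT[r]
    return tags


def resources_with(tags):
    """Resources that carry every tag in the set."""
    return {r for r, ts in CONTEXT.items() if tags <= ts}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every resource subset.

    Exponential in the number of resources, so only suitable for toy data.
    """
    concepts = set()
    names = sorted(CONTEXT)
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            intent = common_tags(subset)
            extent = resources_with(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts()
```

Each concept pairs a maximal set of resources with the maximal set of tags they all share; for example, `r1` and `r3` together with `python` and `nlp` form one concept in this toy context.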

Design and Implementation of Finite-State-Transducer Preprocessor for an Efficient Parsing and Translation in Korean-to-English Machine Translation (한영 기계번역에서의 효율적인 구문분석과 번역을 위한 유한상태 변환기 기반 전처리기의 설계 및 구현)

  • Park, Jun-Sik;Choi, Key-Sun
    • Annual Conference on Human and Language Technology / 1999.10e / pp.128-134 / 1999
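  • Parsing plays a very important role in the natural language processing technologies applied to machine translation, information retrieval, and similar tasks. However, parsing complexity grows sharply as sentence length increases. Among the many efforts to address this, one approach reduces the parser's burden with the support of a preprocessor. In this paper, we propose a preprocessor that uses a finite-state transducer (FST) to reduce the ambiguity and complexity of parsing. Finite-state transducers have been widely used for dictionary representation, word segmentation, part-of-speech tagging, and so on; here we design and implement a preprocessor that uses them to turn restricted expressions, such as time expressions, into syntactic constituents in morphologically analyzed sentences. To support not only the parser of a machine translation system but also the modularization of transfer knowledge, we also propose a method that uses finite-state transducers to translate partial expressions such as time expressions. In addition, we implemented a support tool for conveniently writing finite-state transducers. We show that applying the preprocessor relieves the parser's burden and can successfully take over part of the transfer component of the machine translation system.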


A Lifelog Management System Based on the Relational Data Model and its Applications (관계 데이터 모델 기반 라이프로그 관리 시스템과 그 응용)

  • Song, In-Chul;Lee, Yu-Won;Kim, Hyeon-Gyu;Kim, Hang-Kyu;Haam, Deok-Min;Kim, Myoung-Ho
    • Journal of KIISE:Computing Practices and Letters / v.15 no.9 / pp.637-648 / 2009
  • As the cost of disks decreases, PCs are soon expected to be equipped with disks of 1TB or more. Assuming that a single person generates 1GB of data per month, 1TB is enough to store data for a person's entire lifetime. This has led to growing research on lifelog management, which manages what people see and hear in everyday life. Many lifelog management systems have been proposed, including systems based on the relational data model, on ontologies, and on file systems, but each has advantages and disadvantages: those based on the relational data model provide good query processing performance but do not support complex queries properly; those based on ontologies handle more complex queries but their performance is unsatisfactory; and those based on file systems support only keyword queries. Moreover, these systems lack support for lifelog group management and do not provide a convenient user interface for modifying and adding tags (metadata) to lifelogs for effective lifelog search. To address these problems, we propose a lifelog management system based on the relational data model. The proposed system models lifelogs using the relational data model and transforms queries on lifelogs into SQL statements, which yields good query processing performance. To overcome the weakness in complex queries, it also supports a simplified relationship query that finds a lifelog based on other lifelogs directly related to it. In addition, the system supports the management of lifelog groups by providing ways to create, edit, search, play, and share them. Finally, it is equipped with a tagging tool that helps the user modify and add tags conveniently through the selection of various tags. This paper describes the design and implementation of the proposed system and its various applications.
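
The idea of modeling lifelogs relationally and answering a relationship query with SQL can be sketched as follows. The schema (`lifelog`, `tag`, `relation`), the sample rows, and the `related_by_tag` helper are all assumptions made for illustration; the abstract does not specify the paper's actual tables or query language.

```python
import sqlite3

# Hypothetical schema: lifelogs, their tags, and directed links between them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lifelog (id INTEGER PRIMARY KEY, kind TEXT, title TEXT);
    CREATE TABLE tag (lifelog_id INTEGER REFERENCES lifelog(id), name TEXT);
    CREATE TABLE relation (src_id INTEGER REFERENCES lifelog(id),
                           dst_id INTEGER REFERENCES lifelog(id));
""")
conn.executemany("INSERT INTO lifelog VALUES (?, ?, ?)", [
    (1, "photo", "Beach sunset"),
    (2, "email", "Trip itinerary"),
    (3, "photo", "Office party"),
])
conn.executemany("INSERT INTO tag VALUES (?, ?)", [
    (1, "jeju"), (2, "jeju"), (3, "work"),
])
# The itinerary email (2) is directly related to the beach photo (1).
conn.executemany("INSERT INTO relation VALUES (?, ?)", [(2, 1)])


def related_by_tag(conn, kind, tag_name):
    """Find lifelogs of `kind` directly related to a lifelog tagged `tag_name`."""
    sql = """
        SELECT DISTINCT l.title
        FROM lifelog AS l
        JOIN relation AS r ON r.dst_id = l.id
        JOIN tag AS t ON t.lifelog_id = r.src_id
        WHERE l.kind = ? AND t.name = ?
        ORDER BY l.title
    """
    return [row[0] for row in conn.execute(sql, (kind, tag_name))]


photos = related_by_tag(conn, "photo", "jeju")
none_found = related_by_tag(conn, "photo", "work")
```

Restricting the query to lifelogs that are directly related keeps it to a fixed number of joins, which is one plausible reading of how a simplified relationship query retains relational performance.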