• Title/Summary/Keyword: State language


Performance Comparison and Error Analysis of Korean Bio-medical Named Entity Recognition (한국어 생의학 개체명 인식 성능 비교와 오류 분석)

  • Jae-Hong Lee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.4
    • /
    • pp.701-708
    • /
    • 2024
  • The advent of transformer architectures in deep learning has been a major breakthrough in natural language processing research. Named entity recognition is a branch of natural language processing and an important research area for tasks such as information retrieval. It is also important in the biomedical field, but the lack of Korean biomedical corpora for training has limited the development of AI-based Korean clinical research. In this study, we built a new biomedical corpus for Korean biomedical named entity recognition and selected language models pre-trained on large Korean corpora for transfer learning. We compared the selected models by F1-score and by per-tag recognition rate, and analyzed their errors. KlueRoBERTa showed relatively good recognition performance. The error analysis of the tagging process shows that recognition of Disease is excellent, while that of Body and Treatment is relatively low. This is due to over-segmentation and under-segmentation, which fail to properly categorize entity names based on context; a more precise morphological analyzer and a richer lexicon will be needed to compensate for the incorrect tagging.
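
The per-tag comparison described in this abstract can be illustrated with a minimal sketch of entity-level, exact-match scoring. The function name `per_tag_f1`, the span representation `(start, end, tag)`, and the toy data are assumptions made for illustration; only the tag names Disease, Body, and Treatment come from the abstract, and the paper's actual evaluation procedure may differ.

```python
from collections import Counter

def per_tag_f1(gold, pred):
    """Return {tag: (precision, recall, f1)} using exact span matching.

    gold, pred: sets of (start, end, tag) tuples (a hypothetical format).
    """
    tp, n_gold, n_pred = Counter(), Counter(), Counter()
    for span in gold:
        n_gold[span[2]] += 1
    for span in pred:
        n_pred[span[2]] += 1
        if span in gold:  # correct only if boundaries and tag both match
            tp[span[2]] += 1
    scores = {}
    for tag in set(n_gold) | set(n_pred):
        p = tp[tag] / n_pred[tag] if n_pred[tag] else 0.0
        r = tp[tag] / n_gold[tag] if n_gold[tag] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[tag] = (p, r, f1)
    return scores

# Hypothetical toy data: the Treatment entity is split in two
# (over-segmentation) and the Body entity has a wrong right boundary.
gold = {(0, 2, "Disease"), (5, 6, "Body"), (8, 10, "Treatment")}
pred = {(0, 2, "Disease"), (5, 7, "Body"),
        (8, 9, "Treatment"), (9, 10, "Treatment")}
scores = per_tag_f1(gold, pred)
for tag, (p, r, f1) in sorted(scores.items()):
    print(f"{tag}: P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

Under exact matching, a segmentation error counts against both precision and recall for its tag, which is one way boundary mistakes can depress the scores of some tags while leaving others untouched.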

Part-of-speech Tagging for Hindi Corpus in Poor Resource Scenario

  • Modi, Deepa;Nain, Neeta;Nehra, Maninder
    • Journal of Multimedia Information System
    • /
    • v.5 no.3
    • /
    • pp.147-154
    • /
    • 2018
  • Natural language processing (NLP) is an emerging research area that studies how machines can be used to interpret and manipulate text written in natural languages. Different tasks can be performed on natural languages by analyzing them through annotation tasks such as parsing, chunking, part-of-speech tagging, and lexical analysis. These annotation tasks depend on the morphological structure of the particular language. The focus of this work is part-of-speech (POS) tagging for Hindi. POS tagging, also known as grammatical tagging, is the process of assigning a grammatical category to each word of a given text. These categories can be noun, verb, time, date, number, etc. Hindi is the most widely used language of India and its official language. It is also among the five most spoken languages in the world. A diverse range of POS taggers is available for English and other languages, but these taggers cannot be applied to Hindi, because Hindi is one of the most morphologically rich languages and its morphological structure differs significantly from theirs. This work therefore presents a POS tagger for Hindi based on a hybrid approach that combines probability-based and rule-based methods. Known words are tagged with a unigram probability model, whereas unknown words are tagged using various lexical and contextual features. Finite state automata are constructed to represent the different rules, and regular expressions are then used to implement them. A tagset of 29 standard part-of-speech tags is also prepared for this task. The tagset includes two unique tags, a date tag and a time tag, which support all possible formats. Regular expressions implement all pattern-based tags such as time, date, number, and special symbols. The aim of the approach is to increase the correctness of automatic Hindi POS tagging while limiting the need for a large hand-annotated corpus: the probability-based model increases automatic tagging, and the rule-based model limits the need for an already trained corpus. The approach is based on a very small labeled training set (around 9,000 words) and yields a best precision of 96.54% and an average precision of 95.08%, with a best accuracy of 91.39% and an average accuracy of 88.15%.
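
The hybrid scheme in this abstract (regular expressions for pattern-based tags, a unigram model for known words) can be sketched roughly as follows. Everything concrete here is an assumption for illustration: the regexes, the tag names, the romanized toy corpus, and the helper names `train_unigram` and `tag_tokens` are not taken from the paper. Unknown words simply fall back to a default tag, whereas the paper uses lexical and contextual features for them.

```python
import re
from collections import Counter, defaultdict

# Pattern-based tags, tried in order. Simplified, assumed regexes;
# the paper's rules cover many more date and time formats.
PATTERNS = [
    ("DATE", re.compile(r"\d{1,2}[-/]\d{1,2}[-/]\d{2,4}")),
    ("TIME", re.compile(r"\d{1,2}:\d{2}(:\d{2})?")),
    ("NUM", re.compile(r"\d+([.,]\d+)*")),
]

def train_unigram(tagged_sentences):
    """Map each known word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def tag_tokens(tokens, model, default="NOUN"):
    """Tag tokens: regex rules first, then unigram lookup, then a default."""
    tagged = []
    for tok in tokens:
        for name, pattern in PATTERNS:
            if pattern.fullmatch(tok):
                tagged.append((tok, name))
                break
        else:  # no pattern matched this token
            tagged.append((tok, model.get(tok, default)))
    return tagged

# Hypothetical romanized toy corpus, far smaller than the ~9,000 words
# mentioned in the abstract.
corpus = [[("raam", "NOUN"), ("ghar", "NOUN"), ("gaya", "VERB")],
          [("vah", "PRON"), ("ghar", "NOUN"), ("aaya", "VERB")]]
model = train_unigram(corpus)
result = tag_tokens(["vah", "12/05/2018", "10:30", "gaya", "25", "kitaab"], model)
print(result)
```

Checking the patterns before the unigram lookup keeps open-ended classes such as dates and numbers out of the training vocabulary, which fits the abstract's goal of limiting the amount of labeled data required.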

Development of Intelligent Learning Tool based on Human eyeball Movement Analysis for Improving Foreign Language Competence (외국어 능력 향상을 위한 사용자 안구운동 분석 기반의 지능형 학습도구 개발)

  • Shin, Jihye;Jang, Young-Min;Kim, Sangwook;Mallipeddi, Rammohan;Bae, Jungok;Choi, Sungmook;Lee, Minho
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.11
    • /
    • pp.153-161
    • /
    • 2013
  • Recently, there has been a tremendous increase in the availability of educational materials for foreign language learning. As part of this trend, the amount of electronically mediated material has also increased. However, conventional educational content developed using computer technology has typically provided one-way information, which is not the most helpful for users. Making such content convenient for users requires additional off-line analysis to diagnose each individual user's learning. To improve the user's comprehension of texts written in a foreign language, we propose an intelligent learning tool based on the analysis of the user's eyeball movements, which is able to diagnose and improve foreign language reading ability by providing necessary supplementary aid just when it is needed. To determine the user's learning state, we correlate their eye movements with findings from research in cognitive psychology and neurophysiology. Based on this, the learning tool can distinguish whether or not users know the words they encounter while reading foreign language sentences. If the learning tool judges a word to be unknown, it immediately provides the student with the meaning of the word by extracting it from an on-line dictionary. The proposed model provides a tool that empowers independent learning and makes access to the meanings of unknown words automatic. In this way, it can enhance a user's reading achievement as well as satisfaction with text comprehension in a foreign language.

Intermediary Systems for Bibliographic Information Retrieval

  • Yoo, Ja Kyung
    • Journal of the Korean Society for information Management
    • /
    • v.2 no.2
    • /
    • pp.38-70
    • /
    • 1985
  • The purpose of this paper is to provide a review of the literature on the role of end-user intermediary systems in information retrieval. The paper starts with an introduction pointing out the problems involved in conventional retrieval systems. The next section covers the major developments in the field of intermediary systems, including natural language processing, automatic query formulation, relevance feedback, and automatic query refinement. The paper concludes with a general overview of the current state of the art and its future implications for information retrieval.


Technical Trend and View of Neural Networks for Factory Automation (공장 자동화에 적용되는 Neural Networks의 기술동향 및 전망)

  • Lee, Jin-Seop;Ha, Jae-Hun
    • Proceedings of the KIEE Conference
    • /
    • 1991.07a
    • /
    • pp.892-895
    • /
    • 1991
  • This study discusses neural networks, a leading-edge artificial intelligence software technology, against the background of a rapidly internationalizing information society. The paper describes modeling the structure of human brain cells in order to realize neural networks, with regard to command languages and parallel-processing computer implementations. It shows their usefulness for saving time and making correct judgments in business services, gives an overview of some currently popular neural network architectures, and describes the current state of neural network technology.


Computing Coarser Observation Functions Using Control-Compatible States of Supervisor

  • Cho, Hangju
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • 1993.10a
    • /
    • pp.318-323
    • /
    • 1993
  • The paper discusses the problem of computing coarser observation functions in supervisory control of discrete event systems. It is shown that when a supervisor that realizes a given language L has certain properties, L-realizability of a coarser observation function is equivalent to control-compatibility of the states in some subsets of the state space of the supervisor. This characterization is then used to devise an iterative procedure of computing coarser L-realizable observation functions, where supervisor reduction and L-realizability verification of an observation function are performed at each iteration.


Application of Object-oriented Language to Power Systems (객체지향기법의 전력계통시스템에의 적용)

  • Park, J.H.;Kim, J.N.;Shin, J.H.;Lee, J.Y.;Baek, Y.S.
    • Proceedings of the KIEE Conference
    • /
    • 1999.07c
    • /
    • pp.1218-1220
    • /
    • 1999
  • In this paper, we developed an object-oriented analysis method for electric power systems. It was applied to fault diagnosis, power system stability analysis, and service restoration in an emergency state. Object-oriented programming (OOP) is a more flexible method than procedural programming. We proposed a flexible modeling method for power system analysis.


Development on ATM Protocol Verificator (ATM 프로토콜 검정기 개발)

  • Min, J.H.;Lee, B.H.
    • Electronics and Telecommunications Trends
    • /
    • v.13 no.6 s.54
    • /
    • pp.94-107
    • /
    • 1998
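  • The main subject of this research and development is a verifier, one of the formal-method support tools for SDL (Specification Description Language), that checks the dynamic properties of the behavioral part of a specification. For the model verifier, the properties to be verified (deadlock, livelock, reachability, and liveness) are expressed in modal calculus over an intermediate model generated for the protocol, the I/O FSM (Input/Output Finite State Machine), and the application of the corresponding algorithms to the I/O FSM and the analysis functions were implemented in C++. In addition, an environment and an integration module are implemented that integrate the verifier with the SDL editor functions and related tools so that users can work with it easily and conveniently.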
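
Two of the properties named in this abstract, reachability and deadlock, can be illustrated on a small state machine with plain graph search. This is only a sketch under stated assumptions: it does not use modal calculus, it is written in Python rather than the C++ of the original tool, and the transition table, state names, and the functions `reachable` and `deadlocks` are hypothetical.

```python
from collections import deque

def reachable(transitions, start):
    """Return the set of states reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for _label, nxt in transitions.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def deadlocks(transitions, start, final=frozenset()):
    """Reachable states with no outgoing transition that are not final."""
    return {s for s in reachable(transitions, start)
            if not transitions.get(s) and s not in final}

# Hypothetical I/O FSM: "?x" marks an input event, "!x" an output event.
fsm = {
    "idle": [("?setup", "connecting")],
    "connecting": [("!connect", "active"), ("!reject", "stuck")],
    "active": [("?release", "idle")],
    "stuck": [],                    # no way out: a deadlock state
    "orphan": [("?reset", "idle")], # never entered from "idle"
}
print(sorted(reachable(fsm, "idle")))  # states reachable from "idle"
print(deadlocks(fsm, "idle"))          # reachable dead-end states
```

Livelock and liveness need more than a single traversal, for example cycle detection over states that never make progress, so they are left out of this sketch.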

A Pilot Study on the Standard Model for the Classification of Database (데이터베이스 분류 표준화를 위한 기초연구)

  • 고영만
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.7 no.1
    • /
    • pp.193-230
    • /
    • 1994
  • The systematic classification of databases is currently a much-debated issue in the telecommunication industry. Nevertheless, no attempt to build a systematic model is to be found at present. The purpose of this study is to gain a general overview of the subject and to prepare a draft for the development of a standard model. For the study of database classification, databases were classified from nine points of view: manufacturer, subject, processed form (level), (re)presented form, language, completion state and updating cycle, retrieval method, communication media, and use.


Object-oriented modeling based on the BCSM in PSTN/IP networks (PSTN/IP 통합망에서 BCSM에 기반한 객체 지향 모델링)

  • 이종혁
    • Proceedings of the Korea Society for Simulation Conference
    • /
    • 1999.10a
    • /
    • pp.18-23
    • /
    • 1999
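  • In this paper, we model the basic call processing of the integrated PSTN/IP network, a network for data-centric rather than voice-centric communication, from an object-oriented perspective. To meet the rapid growth of data communication, the integrated PSTN/IP network aims to implement switches, previously built from hardware, on general-purpose computers; its development must therefore be based on modeling from a software perspective rather than a hardware perspective. To this end, the State Model used in conventional hardware switch modeling was remodeled using UML (Unified Modeling Language) notation, the standard for modeling in object-oriented software development.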
