• Title/Summary/Keyword: Sequence Tagging

Discriminative Training of Sequence Taggers via Local Feature Matching

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.14 no.3
    • /
    • pp.209-215
    • /
    • 2014
  • Sequence tagging is the task of predicting frame-wise labels for a given input sequence and has important applications in diverse domains. Recent probabilistic sequence models such as conditional random fields (CRFs) have achieved great success in a variety of situations. Conventional training methods such as maximum likelihood (ML) learning match global features between the empirical and model distributions rather than local features, even though it is the mismatch in local features that translates directly into frame-wise prediction errors. In this paper, we introduce a novel discriminative CRF learning algorithm that minimizes local feature mismatches. Unlike the overall data fitting that results from global feature matching in ML learning, our approach reduces the total error over all frames in a sequence. We also provide an efficient gradient-based learning method via gradient forward-backward recursion, which has the same computational complexity as ML learning. On several real-world sequence tagging problems, we empirically demonstrate that the proposed learning algorithm achieves significantly more accurate predictions than standard estimators.
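
The frame-wise quantities mentioned in this abstract can be illustrated with the standard forward-backward recursion for a linear-chain CRF. The sketch below is not the paper's learning algorithm: it only computes the per-frame posterior marginals p(y_t = k | x) that local (frame-wise) criteria are built on. The function name `frame_marginals` and the inputs `unary` and `trans` (log-potentials) are assumptions made for illustration.

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def frame_marginals(unary, trans):
    """Per-frame label marginals of a linear-chain CRF.

    unary: (T, K) array of log-potentials for each frame and label (assumed input).
    trans: (K, K) array of log-potentials for moving from label i to label j.
    Returns a (T, K) array whose row t is p(y_t = . | x).
    """
    T, K = unary.shape
    alpha = np.zeros((T, K))  # forward log-messages
    beta = np.zeros((T, K))   # backward log-messages (last row stays 0)

    alpha[0] = unary[0]
    for t in range(1, T):
        # alpha[t, j] = unary[t, j] + log sum_i exp(alpha[t-1, i] + trans[i, j])
        alpha[t] = unary[t] + logsumexp(alpha[t - 1][:, None] + trans, axis=0)

    for t in range(T - 2, -1, -1):
        # beta[t, i] = log sum_j exp(trans[i, j] + unary[t+1, j] + beta[t+1, j])
        beta[t] = logsumexp(trans + (unary[t + 1] + beta[t + 1])[None, :], axis=1)

    log_z = logsumexp(alpha[-1], axis=0)  # log partition function
    return np.exp(alpha + beta - log_z)
```

Both passes cost O(T K^2), which is the complexity class the abstract refers to when it compares the proposed method with ML learning.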

Sequence-to-sequence based Morphological Analysis and Part-Of-Speech Tagging for Korean Language with Convolutional Features (Sequence-to-sequence 기반 한국어 형태소 분석 및 품사 태깅)

  • Li, Jianri;Lee, EuiHyeon;Lee, Jong-Hyeok
    • Journal of KIISE
    • /
    • v.44 no.1
    • /
    • pp.57-62
    • /
    • 2017
  • Traditional Korean morphological analysis and POS tagging methods usually consist of two steps: (1) generate hypotheses of all possible combinations of morphemes for the given input, and (2) perform POS tagging and search for the optimal result. These methods require additional resource dictionaries, and errors in the first step can propagate to the second step. In this paper, we tried to solve this problem in an end-to-end fashion using a sequence-to-sequence model with convolutional features. Experimental results on the Sejong corpus show that our approach achieved 97.15% F1-score at the morpheme level, and 95.33% and 60.62% precision at the word and sentence levels, respectively; the second reported result is 96.91% F1-score at the morpheme level, with 95.40% and 60.62% precision at the word and sentence levels, respectively.

Epitope Tagging with a Peptide Derived from the preS2 Region of Hepatitis B Virus Surface Antigen

  • Kang, Hyun-Ah;Yi, Gwan-Su;Yu, Myeong-Hee
    • BMB Reports
    • /
    • v.28 no.4
    • /
    • pp.353-358
    • /
    • 1995
  • Epitope tagging is the process of fusing a set of amino acid residues that are recognized as an antigenic determinant to a protein of interest. Tagging a protein with an epitope facilitates various immunochemical analyses of the tagged protein with a specific monoclonal antibody. The monoclonal antibody H8 has subtype specificity for an epitope derived from the preS2 region of hepatitis B virus surface antigen. Previous studies on serial deletions of the preS2 region indicated that the preS2 epitope was located in amino acid residues 130~142. To test whether the amino acid sequence in this interval is sufficient to confer on proteins the antigenicity recognizable by the antibody H8, the set of amino acid residues in the interval was tagged to the amino terminal of ${\beta}$-galactosidase and to the carboxyl terminal of the truncated $p56^{lck}$ fragment. The tagged ${\beta}$-galactosidase, expressed in Escherichia coli, maintained the enzymatic activity and was immunoprecipitated efficiently with H8. The tagged $p56^{lck}$ fragment, synthesized in an in vitro translation system, was also immunoprecipitated specifically with H8. These results demonstrate that the amino acid sequence of the preS2 region can be used efficiently for the epitope tagging approach.

Syllable-based POS Tagging without Korean Morphological Analysis (형태소 분석기 사용을 배제한 음절 단위의 한국어 품사 태깅)

  • Shim, Kwang-Seob
    • Korean Journal of Cognitive Science
    • /
    • v.22 no.3
    • /
    • pp.327-345
    • /
    • 2011
  • In this paper, a new approach to Korean POS (part-of-speech) tagging is proposed. In previous work, a Korean POS tagger was regarded as a post-processor of a morphological analyzer, and as such the tagger was used to determine the most likely morpheme/POS sequence from the morphological analysis results. In the proposed approach, however, the POS tagger generates the most likely sequence of morpheme and POS pairs directly from the given sentences. A POS-tagged corpus of 398,632 eojeols and test data of 33,467 eojeols are used for training and evaluation, respectively. The proposed approach achieves a POS tagging accuracy of 96.31%.

The Variation of Tagging Contrast-to-Noise Ratio (CNR) of SPAMM Image by Modulation of Tagline Spacing

  • Kang, Won-Suk;Park, Byoung-Wook;Choe, Kyu-Ok;Lee, Sang-Ho;Hong, Soonil;Jung, Haijo;Kim, Hee-Joung
    • Proceedings of the Korean Society of Medical Physics Conference
    • /
    • 2002.09a
    • /
    • pp.360-362
    • /
    • 2002
  • Myocardial tagging techniques such as spatial modulation of magnetization (SPAMM) allow the study of myocardial motion with high accuracy. The tagging contrast of such tagged images can affect the accuracy with which tag intersections are estimated when analyzing myocardial motion. Tagging contrast can in turn be affected by tagline spacing. The aim of this study was to investigate experimentally the relationship between the tagline spacing of SPAMM images and the tagging contrast-to-noise ratio (CNR). One healthy volunteer underwent electrocardiographically triggered MR imaging with a SPAMM-based tagging pulse sequence on a 1.5T MR scanner (Gyroscan Intera, Philips Medical Systems, Netherlands). Horizontally modulated stripe patterns were imposed with tagline spacings ranging from 3.6 mm to 9.6 mm. Images of the left ventricle (LV) wall were acquired at the mid-ventricle level during the cardiac cycle with FE-EPI (TR/TE/FA = 5.8/2.2/10). Tagging CNR for each image was calculated with software developed in our group. During contraction, tagging CNR decreased more rapidly with short tagline spacing than with long tagline spacing. In the same heart phase, CNR increased with tagline spacing. In particular, at the fully contracted heart phase, CNR increased more rapidly as a function of tagline spacing than in the other heart phases.

Syllable-based Korean POS Tagging Based on Combining a Pre-analyzed Dictionary with Machine Learning (기분석사전과 기계학습 방법을 결합한 음절 단위 한국어 품사 태깅)

  • Lee, Chung-Hee;Lim, Joon-Ho;Lim, Soojong;Kim, Hyun-Ki
    • Journal of KIISE
    • /
    • v.43 no.3
    • /
    • pp.362-369
    • /
    • 2016
  • This study is directed toward the design of a hybrid algorithm for syllable-based Korean POS tagging. Previous syllable-based works on Korean POS tagging have relied on a sequence labeling method and mostly used only a machine learning method. We present a new algorithm integrating a machine learning method and a pre-analyzed dictionary. We used a Sejong tagged corpus for training and evaluation. While the machine learning engine achieved eojeol precision of 0.964, the proposed hybrid engine achieved eojeol precision of 0.990. In a Quiz domain test, the machine learning engine and the proposed hybrid engine obtained 0.961 and 0.972, respectively. This result indicates our method to be effective for Korean POS tagging.

KRDD: Korean Rice Ds-tagging Lines Database for Rice (Oryza sativa L. Dongjin)

  • Kim, Chang-Kug;Lee, Myung-Chul;Ahn, Byung-Ohg;Yun, Doh-Won;Yoon, Ung-Han;Suh, Seok-Cheol;Eun, Moo-Young;Hahn, Jang-Ho
    • Genomics & Informatics
    • /
    • v.6 no.2
    • /
    • pp.64-67
    • /
    • 2008
  • The Korean Rice Ds-tagging lines Database (KRDD) is designed to provide information about Ac/Ds insertion lines and activation tagging lines in japonica rice. The database provides information on 18,158 Ds lines, including the ID, description, photo image, sequence information, and gene characteristics. The KRDD is presented through a web-based graphical view, and anonymous users can query and browse the data using the search function. Its web pages have four major menus: (i) a BLAST Search menu for mutant lines (BLAST against rice Ds-tagging mutant lines); (ii) a primer design tool to identify genotypes of Ds insertion lines; (iii) a Phenotype menu for Ds lines, searchable by identification name and phenotype characteristics; and (iv) a Management menu for Ds lines.

The Variation of Tagging Contrast-to-Noise Ratio (CNR) of SPAMM Image by Modulation of Tagline Spacing (Tagline 간격의 조절을 통한 SPAMM 영상에서의 Tagging 대조도 대 잡음비의 변화)

  • 강원석;최병욱;최규옥;이상호;홍순일;정해조;김희중
    • Progress in Medical Physics
    • /
    • v.13 no.4
    • /
    • pp.224-228
    • /
    • 2002
  • Myocardial tagging techniques such as spatial modulation of magnetization (SPAMM) allow the study of myocardial motion with high accuracy. However, the accuracy with which tag intersections are estimated can be affected by tagline spacing. The aim of this study was to investigate, in an in vivo study, the relationship between the tagline spacing of SPAMM images and the tagging contrast-to-noise ratio (CNR). Two healthy volunteers underwent electrocardiographically triggered MR imaging with a SPAMM-based tagging pulse sequence on a 1.5T MR scanner. Horizontally modulated stripe patterns were imposed with tagline spacings ranging from 3.6 to 9.6 mm. Images of the left ventricle (LV) wall were acquired at the mid-ventricle level during the cardiac cycle with FE-EPI (TR/TE = 5.8/2.2 msec, FA = $10^{\circ}$). Tagging CNR for each image was calculated with software developed in our group. During contraction, tagging CNR decreased more rapidly with narrow tagline spacing than with wide tagline spacing. In the same heart phase, CNR increased with tagline spacing. In particular, at the fully contracted heart phase, CNR increased more rapidly as a function of tagline spacing than in the other heart phases. The results indicate that optimizing tagline spacing provides better tagging CNR, allowing myocardial motion to be analyzed more accurately.
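
The abstract does not say how the group's in-house software defines tagging CNR. As a rough illustration only, the sketch below uses the generic contrast-to-noise definition: the absolute difference between the mean signal of untagged and tagged myocardium, divided by the standard deviation of a background noise region. The function name `tagging_cnr` and the three pixel-sample inputs are hypothetical.

```python
import numpy as np

def tagging_cnr(untagged, tagged, noise):
    """Generic contrast-to-noise ratio between two tissue regions.

    untagged, tagged: pixel intensities sampled between and on the taglines
                      (hypothetical regions of interest).
    noise: pixel intensities from a signal-free background region.
    Assumption: CNR = |mean(untagged) - mean(tagged)| / std(noise).
    """
    contrast = abs(np.mean(untagged) - np.mean(tagged))
    sigma = np.std(noise, ddof=1)  # sample standard deviation of the noise
    return contrast / sigma

# Example with made-up intensities (not data from the study).
cnr = tagging_cnr([100, 102, 98], [40, 42, 38], [-2, 0, 2])
```

With this definition, the trend reported in the abstract would correspond to `cnr` rising as tagline spacing widens within a given heart phase.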

Korean Morphological Analysis and Part-Of-Speech Tagging with LSTM-CRF based on BERT (BERT기반 LSTM-CRF 모델을 이용한 한국어 형태소 분석 및 품사 태깅)

  • Park, Cheoneum;Lee, Changki;Kim, Hyunki
    • Annual Conference on Human and Language Technology
    • /
    • 2019.10a
    • /
    • pp.34-36
    • /
    • 2019
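  • Previous deep-learning approaches to morphological analysis and part-of-speech (POS) tagging have explored a variety of models, such as combining a feed-forward neural network with a CRF or using a sequence-to-sequence model. In this paper, to perform Korean morphological analysis and POS tagging, we propose a syllable-level LSTM-CRF model based on BERT, which has recently brought large performance gains on natural language processing tasks. BERT is a language model pre-trained with a bidirectional transformer encoder; in this paper we use KorBERT, which was pre-trained on a large Korean corpus in eojeol units. In experiments, the proposed model achieved an F1 of 98.74% on the Sejong corpus, outperforming previous work on Korean morphological analysis and POS tagging.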

A Pipeline Model for Korean Morphological Analysis and Part-of-Speech Tagging Using Sequence-to-Sequence and BERT-LSTM (Sequence-to-Sequence 와 BERT-LSTM을 활용한 한국어 형태소 분석 및 품사 태깅 파이프라인 모델)

  • Youn, Jun Young;Lee, Jae Sung
    • Annual Conference on Human and Language Technology
    • /
    • 2020.10a
    • /
    • pp.414-417
    • /
    • 2020
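  • Recent research on Korean morphological analysis and POS tagging has mainly performed morpheme segmentation and POS tagging on the surface form first, and then restored the base forms of the morphemes and their POS tags in post-processing using additional language resources. In this study, we divide morphological analysis and POS tagging into two stages: we first restore the morpheme base forms using a sequence-to-sequence model, and then perform morpheme segmentation and POS tagging using BERT, which has recently shown strong performance in many areas of natural language processing. We connect the two stages as a pipeline, and the proposed pipeline model for morphological analysis and POS tagging achieved a syllable accuracy of 98.39%, a morpheme accuracy of 98.27%, and an eojeol accuracy of 96.31%.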