• Title/Summary/Keyword: Entity


Encoding Dictionary Feature for Deep Learning-based Named Entity Recognition

  • Ronran, Chirawan;Unankard, Sayan;Lee, Seungwoo
    • International Journal of Contents / v.17 no.4 / pp.1-15 / 2021
  • Named entity recognition (NER) is a crucial task for NLP, which aims to extract information from texts. To build NER systems, deep learning (DL) models are trained with dictionary features by mapping each word in the dataset to dictionary features and generating a unique index. However, this technique might generate noisy labels, which pose significant challenges for the NER task. In this paper, we proposed DL-dictionary features and evaluated them on two datasets: the OntoNotes 5.0 dataset and our new infectious disease outbreak dataset named GFID. We concatenated (1) a Bidirectional Long Short-Term Memory (BiLSTM) character representation and (2) a pre-trained embedding with (3) our proposed features, named the Convolutional Neural Network (CNN), BiLSTM, and self-attention dictionaries, respectively. The combined features (1-3) were fed through a BiLSTM - Conditional Random Field (CRF) model to predict named entity classes as outputs. We compared these outputs with predictions obtained from the BiLSTM character representation, the pre-trained embedding, and dictionary features from previous research, which used exact matching and partial matching dictionary techniques. The findings showed that the model employing our dictionary features outperformed other models that used existing dictionary features. We also computed the F1 score on the GFID dataset to assess how this technique applies to extracting medical or healthcare information.
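The exact-matching and partial-matching dictionary features mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy gazetteer, the BIO-style feature labels, and the function names `exact_match_features`, `partial_match_features`, and `to_indices` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy gazetteer (assumption): lowercased entity phrase -> entity type.
GAZETTEER: Dict[str, str] = {
    "world health organization": "ORG",
    "ebola": "DISEASE",
    "sierra leone": "GPE",
}


def exact_match_features(tokens: List[str], gazetteer: Dict[str, str],
                         max_len: int = 4) -> List[str]:
    """Tag tokens with BIO dictionary features using the longest exact phrase match."""
    feats = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            if phrase in gazetteer:
                label = gazetteer[phrase]
                feats[i] = "B-" + label
                for j in range(i + 1, i + n):
                    feats[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return feats


def partial_match_features(tokens: List[str], gazetteer: Dict[str, str]) -> List[str]:
    """Tag a token with an entity type if it occurs as a word in any dictionary entry."""
    word_to_type: Dict[str, str] = {}
    for phrase, label in gazetteer.items():
        for word in phrase.split():
            word_to_type.setdefault(word, label)
    return [word_to_type.get(t.lower(), "O") for t in tokens]


def to_indices(feats: List[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct feature string to a unique integer index."""
    vocab: Dict[str, int] = {}
    return [vocab.setdefault(f, len(vocab)) for f in feats], vocab


tokens = ["The", "World", "Health", "Organization", "reported",
          "Ebola", "in", "Sierra", "Leone"]
print(exact_match_features(tokens, GAZETTEER))
# Partial matching tags "World" in "World Bank" as ORG, a noisy label.
print(partial_match_features(["World", "Bank"], GAZETTEER))
```

In this sketch, partial matching fires on any shared word, which is one way such features could introduce the noisy labels the abstract refers to.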

Multitask Transformer Model-based Fintech Customer Service Chatbot NLU System with DECO-LGG SSP-based Data (DECO-LGG 반자동 증강 학습데이터 활용 멀티태스크 트랜스포머 모델 기반 핀테크 CS 챗봇 NLU 시스템)

  • Yoo, Gwang-Hoon;Hwang, Chang-Hoe;Yoon, Jeong-Woo;Nam, Jee-Sun
    • Annual Conference on Human and Language Technology / 2021.10a / pp.461-466 / 2021
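  • In this study, we build on a semi-automatic language data augmentation approach (Semi-automatic Symbolic Propagation: SSP) based on the DECO (Dictionnaire Electronique du COreen) Korean electronic dictionary and LGG (Local-Grammar Graph) to effectively generate annotated training data for the NLU (Natural Language Understanding) component of a fintech CS (Customer Service) chatbot. On this basis, we implemented a fintech CS chatbot NLU system using the DIET (Dual Intent and Entity Transformer) architecture provided by the open-source RASA framework. Following the 32 topic types, 38 core events, and 10 discourse-element configurations of the fintech domain identified from real data, the DECO-LGG data generation module effectively produces high-quality annotated training data for question and complaint speech acts. Training DIET, an end-to-end multitask transformer model that jointly handles intent classification and named entity recognition for slot filling, on this data yielded F1-scores of 0.931 (Intent) / 0.865 (Slot/Entity) for DIET-only and 0.951 (Intent) / 0.901 (Slot/Entity) for DIET+KoBERT. These results demonstrate the effectiveness of DECO-LGG-based SSP-generated data as training data, as well as the superior performance of the KoBERT-based DIET model.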


Applicability of the Single Rate Presumption for Non-Market Economies within the Framework of the WTO Anti-Dumping Agreement (WTO 반덤핑협정 상 비시장경제 규율에 대한 고찰: 미국의 단일률 적용 관행을 중심으로)

  • Kyoung-Hwa Kim
    • Korea Trade Review / v.46 no.4 / pp.113-130 / 2021
  • This study aims to analyze the WTO-inconsistent aspects of the single rate presumption of the United States in establishing and imposing anti-dumping duties for non-market economy exporters. By examining the drafting history in the GATT/WTO negotiations and the practice of the single rate presumption for non-market economies by the United States from a comparative perspective, it critically addresses the inherent lack of pertinent disciplines under the framework of the WTO Anti-Dumping Agreement in establishing dumping margins for exporters of non-market economies. The WTO Dispute Settlement Body leaves open the possibility of allowing the investigating authority to consider multiple exporters and the exporting country as a single entity. However, the study argues that it is difficult in practice for the investigating authority to make a single-entity decision in a WTO-consistent manner. The study also finds an incompatibility in the notion between establishing dumping margins for 'individual' exporters and 'non-market economies.' A proper discipline for non-market economies under the multilateral anti-dumping norm needs to be reconsidered in the era of persistent trade conflicts between the United States and China.

Entity Linking For Tweets Using User Model and Real-time News Stream (유저 모델과 실시간 뉴스 스트림을 사용한 트윗 개체 링킹)

  • Jeong, Soyoon;Park, Youngmin;Kang, Sangwoo;Seo, Jungyun
    • Korean Journal of Cognitive Science / v.26 no.4 / pp.435-452 / 2015
  • Recent research on Entity Linking (EL) has attempted to disambiguate entities by using a knowledge base to handle semantic relatedness and up-to-date information. However, EL for tweets using a knowledge base is still unsatisfactory, mainly because tweets consist mostly of short, noisy contexts and concern real-time issues. In this paper, we propose an approach to building an EL system that links ambiguous entities to the corresponding entries in a given knowledge base by exploring news articles and the user history. Using news articles, the system can overcome the problem of Wikipedia coverage (i.e., not handling real-time issues). In addition, given that users usually post tweets related to their particular interests, referring to the user history lets the system work robustly and effectively with a small amount of tweet data. We created a dataset of Korean tweets containing ambiguous entities, randomly selected from tweets extracted over a seven-day period, and evaluated the system on this dataset. As the evaluation measure, we use accuracy (the number of correct answers given by the system divided by the size of the dataset). The experimental results show that our system achieves an accuracy of 67.7% and outperforms EL methods that exclusively use a knowledge base.
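The accuracy measure defined in the abstract (correct answers divided by dataset size) can be written as a short sketch. The function name `accuracy` and the knowledge-base entry identifiers in the usage lines are hypothetical and are not taken from the paper.

```python
from typing import List


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Fraction of mentions whose predicted knowledge-base entry equals the gold entry."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("the dataset must not be empty")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical knowledge-base entry IDs, for illustration only.
gold_links = ["KB:apple_inc", "KB:paris_france", "KB:jaguar_animal", "KB:amazon_company"]
system_links = ["KB:apple_inc", "KB:paris_france", "KB:jaguar_cars", "KB:amazon_company"]
print(accuracy(system_links, gold_links))  # 3 of 4 links are correct
```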

A Quantitative Trust Model based on Empirical Outcome Distributions and Satisfaction Degree (경험적 확률분포와 만족도에 기반한 정량적 신뢰 모델)

  • Kim, Hak-Joon;Sohn, Bong-Ki;Lee, Seung-Joo
    • The KIPS Transactions:PartB / v.13B no.7 s.110 / pp.633-642 / 2006
  • In the Internet environment, many interactions take place among users who are unknown to one another, and trust information about others is rarely available. Due to this lack of trust information, entities have to take some risks in transactions with others. From this perspective, it is crucial for entities to be able to accumulate and manage trust information on other entities in order to reduce the risks and uncertainty in their transactions. This paper is concerned with a quantitative computational trust model that takes into account multiple evaluation criteria and uses recommendations from others to obtain the trust for an entity. In the proposed trust model, the trust for an entity is defined as the expectation that the entity will yield satisfactory outcomes in the given situation. Once an interaction has been made with an entity, it is assumed that outcomes are observed with respect to the evaluation criteria. When trust information is needed, the satisfaction degree, which is the probability of generating satisfactory outcomes for each evaluation criterion, is computed based on the empirical outcome distributions and the entity's preference degrees on the outcomes. Then, the satisfaction degrees for the evaluation criteria are aggregated into a trust value. At that time, the reputation information is also incorporated into the trust value. Through experiments in e-commerce, this paper also shows that the model could help entities effectively choose other entities for transactions.
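The computation outlined in the abstract can be sketched as follows. The abstract does not give formulas, so everything concrete here is an assumption: the satisfaction degree is taken as the preference-weighted sum over the empirical outcome distribution, criteria are aggregated by a weighted average, and reputation is blended in with a hypothetical mixing parameter `alpha`.

```python
from collections import Counter
from typing import Dict, List


def empirical_distribution(outcomes: List[str]) -> Dict[str, float]:
    """Relative frequency of each observed outcome."""
    if not outcomes:
        raise ValueError("at least one observed outcome is required")
    counts = Counter(outcomes)
    return {o: c / len(outcomes) for o, c in counts.items()}


def satisfaction_degree(outcomes: List[str], preference: Dict[str, float]) -> float:
    """Assumed form: sum over outcomes of P_empirical(outcome) * preference degree in [0, 1]."""
    dist = empirical_distribution(outcomes)
    return sum(p * preference.get(o, 0.0) for o, p in dist.items())


def trust_value(satisfactions: Dict[str, float], weights: Dict[str, float],
                reputation: float, alpha: float = 0.7) -> float:
    """Assumed aggregation: weighted average over criteria, then a convex blend with reputation."""
    total_weight = sum(weights[c] for c in satisfactions)
    direct = sum(weights[c] * s for c, s in satisfactions.items()) / total_weight
    return alpha * direct + (1 - alpha) * reputation


# Hypothetical e-commerce history with two evaluation criteria.
sat = {
    "delivery": satisfaction_degree(
        ["on_time", "on_time", "late", "on_time"], {"on_time": 1.0, "late": 0.2}),
    "quality": satisfaction_degree(["good", "bad"], {"good": 1.0, "bad": 0.0}),
}
print(trust_value(sat, {"delivery": 1.0, "quality": 1.0}, reputation=0.5))
```

A different aggregation operator or reputation weighting would change the numbers but not the overall flow from observed outcomes to satisfaction degrees to a single trust value.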

Exploration of the Path Model among Goal Orientation, Self-efficacy, Achievement Need, Entity Theory of Intelligence, Learning Strategy, and Self-handicapping Tendency in Chemistry Education (화학교육의 목표지향성, 자기효능감, 성취욕구, 지능신념, 자기핸디캡경향 및 학습전략 간의 경로모형 탐색)

  • Ko, Young Chun
    • Journal of the Korean Chemical Society / v.57 no.1 / pp.147-158 / 2013
  • This study searches for an optimal model of the causal relationships between motivation to learn and motivation strategy in chemistry education. The participants were 487 students from G and I high schools in Gwangju, all of whom answered the questionnaire. Model I is a hypothesized path model in which goal orientation mediates between 'self-efficacy, achievement need, and entity theory of intelligence' and the 'learning strategy and self-handicapping tendency' of motivation strategy, in order to explore the variables affecting motivation strategy. Model II is a hypothesized path model in which 'self-efficacy, achievement need, and entity theory' mediate between goal orientation and 'learning strategy and self-handicapping tendency', again to explore the variables affecting motivation strategy. Based on these models, structural equation modeling techniques are used to evaluate the path model among goal orientation (learning, performance approach, and performance approach goal orientation), self-efficacy, achievement need, entity theory of intelligence, self-handicapping tendency, and learning strategy in chemistry education. As a result, Model II is considered. The goodness-of-fit indexes of this model and its related modified models are identified and analyzed in phases, and the model is finalized by correcting it a fifth time to improve the goodness-of-fit indexes. In this optimal model II-5 (Fig. 3) on causal relationships of the motivations to learn and learning strategy (p

A Comparative Research on End-to-End Clinical Entity and Relation Extraction using Deep Neural Networks: Pipeline vs. Joint Models (심층 신경망을 활용한 진료 기록 문헌에서의 종단형 개체명 및 관계 추출 비교 연구 - 파이프라인 모델과 결합 모델을 중심으로 -)

  • Sung-Pil Choi
    • Journal of the Korean Society for Library and Information Science / v.57 no.1 / pp.93-114 / 2023
  • Information extraction can facilitate the intensive analysis of documents by providing semantic triples, which consist of named entities and their relations recognized in the texts. However, most research so far has treated named entity recognition and relation extraction as separate, individual studies, and as a result the performance of entire information extraction systems has not been evaluated properly. This paper introduces two models of end-to-end information extraction that can extract various entity names in clinical records and their relationships in the form of semantic triples, namely pipeline and joint models, and compares their performance in depth. The pipeline model consists of an entity recognition sub-system based on bidirectional GRU-CRFs and a relation extraction module using a multiple encoding scheme, whereas the joint model was implemented with a single bidirectional GRU-CRFs equipped with a multi-head labeling method. In the experiments using i2b2/VA 2010, the performance of the pipeline model was 5.5% (F-measure) higher. In addition, through a comparative experiment with existing state-of-the-art systems using large-scale neural language models and manually constructed features, the objective performance level of the end-to-end models implemented in this paper could be identified properly.