• Title/Summary/Keyword: 기계번역 사후교정 (machine translation post-editing)

Search Result 6

Verification of the Domain Specialized Automatic Post Editing Model (도메인 특화 기계번역 사후교정 모델 검증 연구)

  • Moon, Hyeonseok;Park, Chanjun;Seo, Jaehyeong;Eo, Sugyeong;Lim, Heuiseok
    • Annual Conference on Human and Language Technology / 2021.10a / pp.3-8 / 2021
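  • Although machine translation has advanced considerably alongside artificial intelligence technology, machine-translated text still contains many errors that humans must correct. Automatic post-editing (APE) research emerged to reduce the demand for professionals who correct the errors produced by translation models, and it is actively pursued today, centered on WMT. Recent APE research has been conducted mainly from a domain-specialization perspective and has achieved meaningful results in many domains. However, these studies focus only on how much they improve the quality of the existing translations and do not report how they compare with domain-specialized translation models, so it remains unclear whether post-editing is effective for domain specialization. In this study, we therefore compare the performance of domain-specialized translation models and domain-specialized post-editing models to verify the practical performance obtainable through post-editing in domain specialization. We experimentally confirm that post-editing yields only marginal performance compared with domain-specialized translation models, and by analyzing the experimental results we propose directions for future domain-specialized post-editing research.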


Recent Automatic Post Editing Research (최신 기계번역 사후 교정 연구)

  • Moon, Hyeonseok;Park, Chanjun;Eo, Sugyeong;Seo, Jaehyung;Lim, Heuiseok
    • Journal of Digital Convergence / v.19 no.7 / pp.199-208 / 2021
  • Automatic Post Editing (APE) is the study of automatically correcting errors in machine-translated sentences. The goal of the APE task is to build error-correction models that improve translation quality regardless of the translation system. These models are trained on the source sentence, the machine translation, and the post-edit, which is a manual correction by a human translator. Recent APE research in particular adopts multilingual pretrained language models before training on APE data. This study covers the multilingual pretrained language models adopted in the latest APE research and the specific way each APE study applies them. Furthermore, based on current research trends, we propose future research directions that utilize a translation model or the mBART model.

Automatic Post Editing Research (기계번역 사후교정(Automatic Post Editing) 연구)

  • Park, Chan-Jun;Lim, Heui-Seok
    • Journal of the Korea Convergence Society / v.11 no.5 / pp.1-8 / 2020
  • Machine translation refers to a system in which a computer translates a source sentence into a target sentence. Machine translation has various subfields. APE (Automatic Post Editing) is a subfield of machine translation that produces better translations by editing the output of machine translation systems. In other words, it is the process of proofreading by correcting the errors in the translations that a machine translation system generates. Rather than changing the machine translation model, this research field improves translation quality by correcting the system's output sentences. APE has been a WMT Shared Task since 2015, and performance is evaluated with TER (Translation Error Rate). As a result, various studies on APE models have been published recently, and this paper covers the latest research trends in the field of APE.

The Verification of the Transfer Learning-based Automatic Post Editing Model (전이학습 기반 기계번역 사후교정 모델 검증)

  • Moon, Hyeonseok;Park, Chanjun;Eo, Sugyeong;Seo, Jaehyung;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.12 no.10 / pp.27-35 / 2021
  • Automatic post editing is a research field that aims to automatically correct errors in machine translation results. This research has mainly focused on high-resource language pairs, such as English-German. Recent APE studies mainly adopt transfer learning, utilizing pre-trained language models or translation models generated through self-supervised learning methodologies. While translation-based APE models show superior performance in recent research, that research is conducted on high-resource languages, so the same perspective cannot be directly applied to low-resource languages. In this work, we apply two transfer learning strategies to Korean-English APE and show that transfer learning with a translation model can significantly improve APE performance.

Construction of an Artificial Training Corpus for The Quality Estimation Task based on HTER Distribution Equalization (번역 품질 예측을 위한 HTER 분포 평준화 기반 인조 번역 품질 말뭉치 구축 방법)

  • Park, Junsu;Lee, WonKee;Shin, Jaehun;Han, H. Jeung;Lee, Jong-hyeok
    • Annual Conference on Human and Language Technology / 2019.10a / pp.460-464 / 2019
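  • Translation quality estimation is the process of predicting the quality of translations generated by a machine translation system without referring to reference translations, and it is an important line of research that serves as translation error detection for post-editing. This paper treats sentence-level quality estimation as a classification problem over HTER intervals and proposes using an artificial post-editing corpus to alleviate the performance limits caused by the imbalanced HTER distribution of translation quality corpora. The results show that a training corpus whose HTER distribution has been adjusted to be uniform is more effective for quality estimation than one that has not.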


Methodology of Automatic Editing for Academic Writing Using Bidirectional RNN and Academic Dictionary (양방향 RNN과 학술용어사전을 이용한 영문학술문서 교정 방법론)

  • Roh, Younghoon;Chang, Tai-Woo;Won, Jongwun
    • The Journal of Society for e-Business Studies / v.27 no.2 / pp.175-192 / 2022
  • Artificial intelligence-based natural language processing technology is playing an important role in helping users write English-language documents. For academic documents in particular, English proofreading services should reflect academic characteristics by using formal style and technical terms. However, these services usually do not, because they are based on general English sentences. In addition, since existing studies mainly aim to improve grammatical completeness, fluency improvement is limited. This study proposes an automatic academic English editing methodology that conveys the clear meaning of sentences based on the use of technical terms. The proposed methodology consists of two phases: misspelling correction and fluency improvement. In the first phase, appropriate corrective words are provided according to the input typo and its context. In the second phase, the fluency of the sentence is improved with an automatic post-editing model based on a bidirectional recurrent neural network that learns from pairs of original and edited sentences. Experiments were performed with actual English editing data, and the superiority of the proposed methodology was verified.