• Title/Abstract/Keywords: Translation research

Search results: 836 (processing time: 0.028 s)

Simultaneous neural machine translation with a reinforced attention mechanism

  • Lee, YoHan;Shin, JongHun;Kim, YoungKil
    • ETRI Journal / Vol. 43, No. 5 / pp. 775-786 / 2021
  • To translate in real time, a simultaneous translation system should determine when to stop reading source tokens and generate target tokens corresponding to a partial source sentence read up to that point. However, conventional attention-based neural machine translation (NMT) models cannot produce translations with adequate latency in online scenarios because they wait until a source sentence is completed to compute alignment between the source and target tokens. To address this issue, we propose a reinforcement learning (RL)-based attention mechanism, the reinforced attention mechanism, which allows a neural translation model to jointly train the stopping criterion and a partial translation model. The proposed attention mechanism comprises two modules, one to ensure translation quality and the other to address latency. Unlike previous RL-based simultaneous translation systems, which learn the stopping criterion from a fixed NMT model, the modules can be trained jointly with a novel reward function. In our experiments, the proposed model achieves better translation quality than previous models at comparable latency.
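
The read/write decision described in this abstract can be illustrated with a minimal sketch. It does not reproduce the paper's reinforced attention mechanism: the stopping criterion here is a fixed wait-k rule substituted as an assumption, the "partial translation model" is a toy that upper-cases the next source token, and the function name `simultaneous_translate` is hypothetical.

```python
from typing import List, Tuple


def simultaneous_translate(source_tokens: List[str], k: int = 2) -> Tuple[List[str], List[str]]:
    """Interleave READ and WRITE actions under a fixed wait-k rule.

    Assumption: a learned stopping criterion would replace the `lag < k` test;
    the toy "translation" simply upper-cases the next source token.
    """
    if k < 1:
        raise ValueError("k must be at least 1 so that a token is read before it is written")
    read = 0                  # number of source tokens consumed so far
    output: List[str] = []    # target tokens emitted so far
    actions: List[str] = []   # trace of READ/WRITE decisions
    n = len(source_tokens)
    while len(output) < n:
        lag = read - len(output)
        if read < n and lag < k:
            # Not enough source context yet: keep reading.
            read += 1
            actions.append("READ")
        else:
            # Stop reading and emit one target token from the partial source.
            output.append(source_tokens[len(output)].upper())
            actions.append("WRITE")
    return output, actions


if __name__ == "__main__":
    out, trace = simultaneous_translate(["a", "b", "c"], k=2)
    print(out)
    print(trace)
```

A smaller `k` lowers latency at the cost of less source context per emitted token, which is the quality/latency trade-off the abstract refers to.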

English-Korean speech translation corpus (EnKoST-C): Construction procedure and evaluation results

  • Jeong-Uk Bang;Joon-Gyu Maeng;Jun Park;Seung Yun;Sang-Hun Kim
    • ETRI Journal / Vol. 45, No. 1 / pp. 18-27 / 2023
  • We present an English-Korean speech translation corpus, named EnKoST-C. End-to-end model training for speech translation tasks often suffers from a lack of parallel data, such as speech data in the source language and equivalent text data in the target language. Most available public speech translation corpora were developed for European languages, and there is currently no public corpus for English-Korean end-to-end speech translation. Thus, we created EnKoST-C, centered on TED Talks. In this process, we enhanced the sentence alignment approach using subtitle time information and bilingual sentence embedding information. As a result, we built a 559-h English-Korean speech translation corpus. The proposed sentence alignment approach achieved an F-measure of 0.96. We also show the baseline performance of an English-Korean speech translation model trained with EnKoST-C. EnKoST-C is freely available on a Korean government open data hub site.
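
For readers unfamiliar with the F-measure reported for sentence alignment, the following is a minimal sketch of the standard computation over aligned sentence pairs. The abstract does not say how the paper scores alignments, so the pair representation, the function name `f_measure`, and the example data are illustrative assumptions.

```python
from typing import Iterable, Tuple

Pair = Tuple[int, int]  # (source sentence index, target sentence index); hypothetical representation


def f_measure(predicted: Iterable[Pair], gold: Iterable[Pair], beta: float = 1.0) -> float:
    """Standard F-beta score of predicted alignment pairs against gold pairs."""
    predicted_set, gold_set = set(predicted), set(gold)
    correct = len(predicted_set & gold_set)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted_set)  # share of predicted pairs that are right
    recall = correct / len(gold_set)          # share of gold pairs that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    gold = [(0, 0), (1, 1), (2, 2), (3, 3)]
    predicted = [(0, 0), (1, 1), (2, 3)]
    print(round(f_measure(predicted, gold), 4))
```

With `beta = 1.0` this is the harmonic mean of precision and recall.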

Classification-Based Approach for Hybridizing Statistical and Rule-Based Machine Translation

  • Park, Eun-Jin;Kwon, Oh-Woog;Kim, Kangil;Kim, Young-Kil
    • ETRI Journal / Vol. 37, No. 3 / pp. 541-550 / 2015
  • In this paper, we propose a classification-based approach for hybridizing statistical machine translation and rule-based machine translation. Both the training dataset used in the learning of our proposed classifier and our feature extraction method affect the hybridization quality. To create one such training dataset, a previous approach used auto-evaluation metrics to determine, by comparison, which of a set of component machine translation (MT) systems gave the more accurate translation. Once this had been determined, the most accurate translation was labelled to indicate the MT system from which it came. In this previous approach, when the metric evaluation scores were low, there was a high level of uncertainty as to which of the component MT systems was actually producing the better translation. To reduce such uncertainty or error in classification, we propose an alternative approach to such labeling; that is, a cut-off method. In our experiments, using the aforementioned cut-off method in our proposed classifier, we achieved a translation accuracy of 81.5% - a 5.0% improvement over existing methods.
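
The labeling step described in this abstract can be sketched as follows. The abstract does not define the cut-off rule, so this sketch assumes one plausible reading: a sentence is labelled with its best-scoring MT system only when that score reaches a threshold, and is otherwise left unlabelled. The function name `label_with_cutoff`, the system names, and the scores are all hypothetical.

```python
from typing import Dict, List, Optional


def label_with_cutoff(scores: List[Dict[str, float]], cutoff: float) -> List[Optional[str]]:
    """Label each sentence with its best-scoring MT system, or None if too uncertain.

    Assumption: "cut-off" means discarding examples whose best automatic
    metric score is below `cutoff`; the paper may define it differently.
    """
    labels: List[Optional[str]] = []
    for per_system in scores:
        # Comparative step: pick the system whose output scored highest.
        best_system = max(per_system, key=per_system.get)
        if per_system[best_system] < cutoff:
            labels.append(None)  # low scores: unreliable label, leave out of training
        else:
            labels.append(best_system)
    return labels


if __name__ == "__main__":
    metric_scores = [
        {"smt": 0.62, "rbmt": 0.48},
        {"smt": 0.12, "rbmt": 0.15},
        {"smt": 0.30, "rbmt": 0.55},
    ]
    print(label_with_cutoff(metric_scores, cutoff=0.3))
```

Sentences labelled `None` would be excluded from the classifier's training set under this reading.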

A Study on the Entry Path of Institute of the Translation of Korean Classics: Focusing on Translation Personnel from 2013 to 2017

  • 권경순
    • 동양고전연구 / No. 71 / pp. 259-304 / 2018
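  • This paper surveys and analyzes the entry paths and translation workloads of the translators who took part in the Institute's translation projects over the past five years. To this end, entry paths were classified as 'staff (hired employees)', 'research course graduates', 'qualification examination passers', and 'external experts', and the number of translators and the volume of translated manuscript were surveyed by year. The average manuscript volume per translator was also surveyed. The analysis shows that, across all of the Institute's translation projects, the share of personnel ranks in the order external experts - research course graduates - qualification examination passers - staff, whereas the share of manuscript volume ranks research course graduates - external experts - qualification examination passers - staff. The status of each of the Institute's sub-projects was also surveyed and analyzed, and the characteristics and problems of each project area are discussed. In the translation of historical documents, research course graduates account for the largest share of personnel, but qualification examination passers account for the largest share of manuscript volume; in the translation of literary collections and of special classics, external experts account for a considerable share. From the analysis of the average manuscript volume per translator, the paper further concludes that the Institute's translation projects are being carried out inefficiently. To address these problems, three measures are proposed: 'a plan for utilizing translators based on the project plan', 'improvement of the qualification examination for collation (교점) translation committee members', and 'training of translators in specialized fields and expansion of entry paths'. Each of these measures still needs its validity and effectiveness verified through careful analysis and research, but it is hoped that this paper will help, however modestly, the Institute to establish more systematic and better-planned project plans and translator utilization plans for carrying out its translation projects.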

Translation initiation mediated by nuclear cap-binding protein complex

  • Ryu, Incheol;Kim, Yoon Ki
    • BMB Reports / Vol. 50, No. 4 / pp. 186-193 / 2017
  • In mammals, cap-dependent translation of mRNAs is initiated by two distinct mechanisms: cap-binding complex (CBC; a heterodimer of CBP80 and CBP20)-dependent translation (CT) and eIF4E-dependent translation (ET). Both translation initiation mechanisms share common features in driving cap-dependent translation; nevertheless, they can be distinguished from each other based on their molecular features and biological roles. CT is largely associated with mRNA surveillance such as nonsense-mediated mRNA decay (NMD), whereas ET is predominantly involved in the bulk of protein synthesis. However, several recent studies have demonstrated that CT and ET have similar roles in protein synthesis and mRNA surveillance. In a subset of mRNAs, CT preferentially drives the cap-dependent translation, as ET does, and ET is responsible for mRNA surveillance, as CT does. In this review, we summarize and compare the molecular features of CT and ET with a focus on the emerging roles of CT in translation.

Customizing an English-Korean Machine Translation System for Patent Translation

  • Choi, Sung-Kwon;Kim, Young-Gil
    • 한국언어정보학회:학술대회논문집 / 2007 Regular Conference of 한국언어정보학회 / pp. 105-114 / 2007
  • This paper addresses a method for customizing an English-to-Korean machine translation system from the general domain to the patent domain. The customizing method consists of the following steps: 1) linguistically studying the characteristics of patent documents, 2) extracting unknown words from large patent documents and constructing a large bilingual terminology, 3) extracting and constructing the patent-specific translation patterns, 4) customizing the translation engine modules of the existing general MT system according to the linguistic study of the characteristics of patent documents, and 5) evaluating the accuracy of the translation modules and the translation quality. This research was performed under the auspices of the MIC (Ministry of Information and Communication) of the Korean government during 2005-2006. The translation accuracy of the customized English-Korean patent translation system is 82.43% on average across 5 patent fields (machinery, electronics, chemistry, medicine and computer) according to the evaluation of 7 professional human translators. In 2006, the patent MT system started an on-line patent MT service in IPAC (International Patent Assistance Center) under MOCIE (Ministry of Commerce, Industry and Energy) in Korea. In 2007, KIPO (Korean Intellectual Property Office) plans to launch an English-Korean patent MT service.


A Study of Translation Conformity on Korean Version of a Balance Evaluation Systems Test

  • 전용진;김경모
    • 한국전문물리치료학회지 / Vol. 25, No. 1 / pp. 53-61 / 2018
  • Background: The process of language translation, adaptation, and cross-cultural validation of tools for use in multiple countries requires the adoption of well-established, comprehensive, and rigorous methodological approaches. Back translation, which is the most recommended method, permits the detection of errors in the translation and the identification of words or phrases that cannot be accurately or literally translated. Objectives: The aim of this study was to verify the content validity of a Korean version of a Balance Evaluation Systems test (BESTest) by using a back-translation method. Methods: This research was conducted in six steps: 1) translation of the BESTest into Korean, 2) evaluation of the translation conformity of the Korean-translated BESTest, 3) evaluation of the degree of translation comprehension, 4) back translation of the Korean BESTest, 5) evaluation of the technical and conceptual equivalence, and 6) completion of the Korean version of the BESTest by the translation verification committee. Results: In this study, the Korean version of the BESTest achieved a rating of more than 3 (moderate) for translation comprehension, and the technical equivalence and conceptual equivalence of the back translation were evaluated as 3 (moderate) or more. Conclusion: The Korean version of the BESTest has proven content validity and is an appropriate tool to measure balance function.

Content Validity of a Korean-Translated Version of a Fullerton Advanced Balance Scale: A Pilot Study

  • Kim, Gyoung-mo
    • 한국전문물리치료학회지 / Vol. 22, No. 4 / pp. 51-61 / 2015
  • The purposes of this study were to translate the Fullerton Advanced Balance (FAB) scale into Korean and to verify the content validity by utilizing a back-translation method with a view to assessing balance function and the risk of falling in a clinical research setting. This research was conducted in six steps. First, three Korean physical therapists translated the FAB scale into Korean. Second, two bilingual professors of physical therapy and a physical therapist evaluated the translation conformity of the Korean-translated FAB scale. In the third and fourth steps, twelve physical therapists evaluated the degree of translation comprehension, and a translator back-translated the Korean FAB scale into the original language. Fifth, a bilingual professor of physical therapy and two native speakers evaluated the technical and conceptual equivalence between the original and translated versions. In this process, inappropriately translated items were revised using recommended substitute words or sentences, and all items received three points or more on the rating scale for translation comprehension and for the technical and conceptual equivalence of the back-translation. In the sixth and last step, the translation verification committee completed the final Korean version. The above process indicated that the content validity of the Korean-translated FAB scale was established by means of systematic translation methods, and it can therefore be used to assess balance function and the risk of falls in a clinical research setting.

Linguistic Modeling for Multilingual Machine Translation based on Common Transfer

  • 최승권;김영길
    • 한국언어정보학회지:언어와정보 / Vol. 18, No. 1 / pp. 77-97 / 2014
  • Multilingual machine translation is machine translation among more than two languages. Common transfer is a form of transfer in which the transfer rules can be reused among similar languages according to linguistic typology. Multilingual machine translation based on common transfer is therefore multilingual machine translation that can share transfer rules among languages with similar linguistic typology. This paper describes the linguistic modeling, currently under development, for multilingual machine translation based on common transfer. This linguistic modeling consists of linguistic devices such as 1) a multilingual common Part-of-Speech set, 2) a multilingual common transfer format, 3) multilingual common transfer chunking, and 4) multilingual common transfer rules based on linguistic typology. The validity of this linguistic modeling for multilingual machine translation is shown in a simulation. The multilingual machine translation system based on common transfer, covering Korean, English, Chinese, Spanish, and French, will be developed by 2018.


A Research on Test Suites for Machine Translation Systems

  • 이민행;지광신;정소우
    • 한국언어정보학회지:언어와정보 / Vol. 2, No. 2 / pp. 185-220 / 1998
  • The purpose of this research is to propose sets of basic guidelines for the construction of English and Korean test suites to objectively evaluate the performance of machine translation systems. To this end, we constructed 650 English test sentences and 650 Korean test sentences, and developed statistical methods and tools for the comparative evaluation of English-Korean machine translation systems. We also evaluated the existing commercial English-Korean machine translation systems. The importance of this research lies in promoting awareness, within the natural language community, of the importance of and need for testing machine translation systems. This research will also contribute to the development of evaluation methods and techniques for appropriate test suites for Korean information processing systems. The results of this research can be used by the natural language community to test the performance and development of their information processing systems or machine translation systems.
