• Title/Summary/Keyword: Word translation

Environment for Translation Domain Adaptation and Continuous Improvement of English-Korean Machine Translation System

  • Kim, Sung-Dong;Kim, Namyun
    • International Journal of Internet, Broadcasting and Communication / v.12 no.2 / pp.127-136 / 2020
  • This paper presents an environment for a rule-based English-Korean machine translation system that supports translation domain adaptation and continuous improvement of translation quality. Both purposes require a corpus from which the information needed for translation can be acquired. The environment consists of a corpus construction part and a translation knowledge extraction part. The corpus construction part crawls news articles from newspaper sites. The extraction part builds translation knowledge such as newly created words, compound words, collocation information, and distributional word representations. For translation domain adaptation, a corpus for the domain should be built and the translation knowledge constructed from that corpus. For continuous improvement, the corpus needs to be continuously expanded and the translation knowledge enhanced from the expanded corpus. The proposed web-based environment is expected to facilitate the tasks of domain adaptation and translation system improvement.

An Analysis of Korean inflected Word for Machine Translation (한국어의 기계번역을 위한 용언 구조의 해석)

  • Han, H.R.;Lee, J.K.
    • Proceedings of the KIEE Conference / 1988.07a / pp.612-615 / 1988
  • This paper proposes a method for analyzing Korean inflected words in a machine translation system. We define processing rules that are useful for analyzing irregular conjugation, present a parsing algorithm for nouns and specified verbs, and use the algorithm to reduce the size of the dictionary.

Explaining the Translation Error Factors of Machine Translation Services Using Self-Attention Visualization (Self-Attention 시각화를 사용한 기계번역 서비스의 번역 오류 요인 설명)

  • Zhang, Chenglong;Ahn, Hyunchul
    • Journal of Information Technology Services / v.21 no.2 / pp.85-95 / 2022
  • This study analyzed the translation error factors of machine translation services such as Naver Papago and Google Translate through Self-Attention path visualization. Self-Attention is a key mechanism of the Transformer and BERT NLP models and has recently been widely used in machine translation. We propose a method to explain the translation error factors of machine translation algorithms by comparing the Self-Attention paths of ST (source text) and ST' (a transformed ST whose meaning is unchanged but whose translation output is more accurate). This method provides explainability for analyzing the internal process of a machine translation algorithm, which is otherwise invisible like a black box. In our experiment, it was possible to explore the factors that caused translation errors by analyzing the differences in the attention paths of key words. The study used the XLM-RoBERTa multilingual NLP model provided by exBERT for Self-Attention visualization and applied it to two examples, one of Korean-Chinese and one of Korean-English translation.
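
As background for the abstract above, self-attention weights are commonly computed as a row-wise softmax over scaled query-key dot products. The following is a minimal sketch in plain Python, assuming toy two-dimensional token vectors and using the same vectors as queries and keys (no learned projections). The tokens, vectors, and function names are illustrative assumptions; this is not the XLM-RoBERTa/exBERT pipeline used in the study.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Scaled dot-product attention weights: one softmax row per query."""
    d = len(keys[0])
    rows = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        rows.append(softmax(scores))
    return rows

# Toy example: tokens and vectors are made up for illustration only.
tokens = ["I", "like", "tea"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = attention_weights(vectors, vectors)

for tok, row in zip(tokens, weights):
    print(tok, [round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token; comparing such rows for two inputs is the kind of inspection an attention-path visualization supports.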

Japanese-Korean Machine Translation System Using Connection Forms of Neighboring Words (인접 단어들의 접속정보를 이용한 일한 기계번역 시스템)

  • Kim, Jung-In
    • Journal of Korea Multimedia Society / v.7 no.7 / pp.998-1008 / 2004
  • There are many syntactic similarities between Japanese and Korean. Using these similarities, a Japanese-Korean translation system can be built without most of the syntactic and semantic analysis. To improve translation accuracy, we have been developing a Japanese-Korean translation system based on these similarities for several years. However, the system still has some problems, such as the translation of inflected words and the processing of words with multiple possible translations. In this paper, we propose a new method of Japanese-Korean translation that uses the relations between two neighboring words. To solve these problems, we investigated the connection rules of auxiliary verbs and their priority, and we designed a translation table that consists of entry tables and connection forms tables. When a word has only one translation, it can be translated by direct matching using only the entry table; otherwise, the connection value is evaluated using the connection forms tables and the best translation word is selected.

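The two-table lookup described in the abstract above can be sketched roughly as follows. This is an illustrative reading only: the table contents, the placeholder words (`src_a`, `tgt_b1`, and so on), the numeric connection values, and the function names are all hypothetical, and the paper's actual table format and scoring are not given in the abstract.

```python
# Hypothetical entry table: source word -> candidate translation words.
ENTRY_TABLE = {
    "src_a": ["tgt_a"],             # exactly one candidate: direct matching
    "src_b": ["tgt_b1", "tgt_b2"],  # several candidates: needs connection forms
}

# Hypothetical connection-forms table:
# (candidate translation, neighboring source word) -> connection value.
CONNECTION_FORMS = {
    ("tgt_b1", "src_a"): 0.2,
    ("tgt_b2", "src_a"): 0.9,
}

def translate_word(word, neighbor):
    """Pick a translation for `word`, using its neighbor only when needed."""
    candidates = ENTRY_TABLE[word]
    if len(candidates) == 1:
        return candidates[0]
    # Choose the candidate with the highest connection value (0.0 if unknown).
    return max(candidates, key=lambda c: CONNECTION_FORMS.get((c, neighbor), 0.0))

def translate(words):
    """Translate word by word, passing the following word as the neighbor."""
    result = []
    for i, word in enumerate(words):
        neighbor = words[i + 1] if i + 1 < len(words) else None
        result.append(translate_word(word, neighbor))
    return result

print(translate(["src_b", "src_a"]))  # ['tgt_b2', 'tgt_a']
```

Keeping the single-candidate case on the entry table alone means the connection forms table is consulted only for ambiguous words.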

Ontology Construction and Its Application to Disambiguate Word Senses (온톨로지 구축 및 단어 의미 중의성 해소에의 활용)

  • Kang, Sin-Jae
    • The KIPS Transactions:PartB / v.11B no.4 / pp.491-500 / 2004
  • This paper presents an ontology construction method using various computational language resources, and an ontology-based word sense disambiguation method. In order to acquire a reasonably practical ontology the Kadokawa thesaurus is extended by inserting additional semantic relations into its hierarchy, which are classified as case relations and other semantic relations. To apply the ontology to disambiguate word senses, we apply the previously-secured dictionary information to select the correct senses of some ambiguous words with high precision, and then use the ontology to disambiguate the remaining ambiguous words. The mutual information between concepts in the ontology was calculated before using the ontology as knowledge for disambiguating word senses. If mutual information is regarded as a weight between ontology concepts, the ontology can be treated as a graph with weighted edges, and then we locate the weighted path from one concept to the other concept. In our practical machine translation system, our word sense disambiguation method achieved a 9% improvement over methods which do not use ontology for Korean translation.
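
One possible reading of the weighted-graph idea in the abstract above is sketched below: pointwise mutual information (PMI) is computed from co-occurrence counts, used as an edge weight between concepts, and a path between two concepts is found with Dijkstra's algorithm. The concept names, the counts, the choice of PMI with base-2 logarithm, and the conversion of weight to cost as `1 / weight` are all assumptions made for illustration; the abstract does not specify how the paper computes or searches the weighted path.

```python
import heapq
import math

def pmi(joint, count_x, count_y, total):
    """Pointwise mutual information (base 2) from raw counts."""
    return math.log2((joint / total) / ((count_x / total) * (count_y / total)))

def cheapest_path(graph, start, goal):
    """Dijkstra search where a stronger edge weight means a lower cost."""
    queue = [(0.0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + 1.0 / weight, nxt, path + [nxt]))
    return math.inf, []

# Hypothetical concept counts and co-occurrence counts (not from the paper).
counts = {"food": 40, "fruit": 20, "apple": 10}
joint_counts = {("food", "fruit"): 16, ("fruit", "apple"): 8, ("food", "apple"): 5}
total = 100

# Build an undirected graph, keeping only positively associated concept pairs.
graph = {}
for (a, b), joint in joint_counts.items():
    weight = pmi(joint, counts[a], counts[b], total)
    if weight > 0:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight

cost, path = cheapest_path(graph, "food", "apple")
print(path, round(cost, 3))
```

With these toy counts the indirect route through `fruit` is cheaper than the weak direct edge, which illustrates how edge weights can steer the choice of path between concepts.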

Translators: Traitors or Traders?

  • Kim, Chin-W.
    • Lingua Humanitatis / v.6 / pp.7-31 / 2004
  • This paper argues that (1) word-for-word literary translation is not possible; all it does is achieve what Chukovsky characterized as 'imprecise precision' (1984:47), (2) contrary to Nida (1969) and others, translation does not just mean translating meaning, and (3) therefore, a translator must negotiate an uneasy but inevitable compromise between accuracy and elegance. To make the translated passage as pleasing, moving, and cathartic as the original passage, as far as that is possible, a great deal of literary skill is required on the part of the translator. The iniquity of translators is not so much infidelity as infertility to produce an offspring worthy of an heir to the original writer. Translators are not traitors; they are traders, or literary merchants, who trade one form of linguistic unit for another, often meaning for form, or sense for sound, but sometimes form for meaning. A translator then is not a man of treason but is a tradesman.

Korean-English Non-Autoregressive Neural Machine Translation using Word Alignment (단어 정렬을 이용한 한국어-영어 비자기회귀 신경망 기계 번역)

  • Jung, Young-Jun;Lee, Chang-Ki
    • Annual Conference on Human and Language Technology / 2021.10a / pp.629-632 / 2021
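  • Machine translation is a technology that automatically translates text in one natural language into another, and recent research has focused mainly on neural machine translation (NMT) models. NMT generally uses autoregressive models, which perform well in machine translation but decode slowly because decoding cannot be parallelized. Non-autoregressive models generate words independently and allow parallel computation, so they decode considerably faster than autoregressive models, but they can suffer from the multimodality problem. In this paper, we propose a non-autoregressive neural machine translation model that uses word alignment and apply it to Korean-English machine translation, showing that word alignment information helps improve translation performance between languages with different word orders and mitigates the multimodality problem.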

Choosing preferable labels for the Japanese translation of the Human Phenotype Ontology

  • Ninomiya, Kota;Takatsuki, Terue;Kushida, Tatsuya;Yamamoto, Yasunori;Ogishima, Soichi
    • Genomics & Informatics / v.18 no.2 / pp.23.1-23.6 / 2020
  • The Human Phenotype Ontology (HPO) is the de facto standard ontology to describe human phenotypes in detail, and it is actively used, particularly in the field of rare disease diagnoses. For clinicians who are not fluent in English, the HPO has been translated into many languages, and there have been four initiatives to develop Japanese translations. At the Biomedical Linked Annotation Hackathon 6 (BLAH6), a rule-based approach was attempted to determine the preferable Japanese translation for each HPO term among the candidates developed by the four approaches. The relationship between the HPO and Mammalian Phenotype translations was also investigated, with the eventual goal of harmonizing the two translations to facilitate phenotype-based comparisons of species in Japanese through cross-species phenotype matching. In order to deal with the increase in the number of HPO terms and the need for manual curation, it would be useful to have a dictionary containing word-by-word correspondences and fixed translation phrases for English word order. These considerations seem applicable to HPO localization into other languages.

Sentence Translation and Vocabulary Retention in an EFL Reading Class

  • Kim, Boram
    • English Language & Literature Teaching / v.18 no.2 / pp.67-84 / 2012
  • The present study investigated the effect of sentence translation as a production task on short-term and long-term retention of foreign vocabulary. Eighty-seven beginning-level EFL university students enrolled in a reading class participated in the study. The study compared the performance of three groups on vocabulary recall: (1) a control group, (2) a translation group, and (3) a copy group. During the treatment sessions, the translation group translated L1 sentences into English, while the copy group simply copied given English sentences containing each target word. Results of the immediate test were collected each week from week 2 to week 5 and analyzed by one-way ANOVA. The results revealed that, for short-term vocabulary retention, participants in the rote-copy condition outperformed those in the translation group. Four weeks later, a delayed test was administered to measure long-term vocabulary retention. In contrast to the short-term results, a two-way repeated measures ANOVA showed that long-term vocabulary retention in the translation group was significantly greater than in the copy group. The findings suggest that although sentence translation is rather challenging for low-level learners, it may facilitate long-term retention of new vocabulary given the more elaborate and deeper processing the task entails.
