• Title/Summary/Keyword: Word translation

Ranking Translation Word Selection Using a Bilingual Dictionary and WordNet

  • Kim, Kweon-Yang;Park, Se-Young
    • Journal of the Korean Institute of Intelligent Systems / v.16 no.1 / pp.124-129 / 2006
  • This paper presents a method of ranking translation word selection for Korean verbs based on lexical knowledge contained in a bilingual Korean-English dictionary and WordNet, both of which are easily obtainable knowledge resources. We focus on deciding which translation of the target word is the most appropriate by measuring semantic relatedness, through the 45 extended relations, between the possible translations of the target word and indicative clue words that serve as predicate arguments in the source language text. To reduce the weight given to possibly unwanted senses, we rank the possible word senses for each translation word by measuring the semantic similarity between the translation word and its near synonyms. We report an average accuracy of 51% on ten ambiguous Korean verbs. The evaluation suggests that our approach outperforms the default baseline and previous work.

Effects of Name Agreement and Word Frequency on the English-Korean Word Translation Task (영어-한국어 단어번역과제에서 이름-일치도와 단어빈도의 효과)

  • Koo, Min-Mo;Nam, Ki-Chun
    • MALSORI / no.61 / pp.31-48 / 2007
  • This study investigated the roles of name agreement and word frequency in the English-Korean word translation task. Using low-frequency homonyms with low name agreement as stimuli, Experiment 1 revealed that the name agreement of the materials is a determinant that can modulate the time taken to translate English words into their Korean equivalents. In contrast, Experiment 2, which used low-frequency homonyms with high name agreement as stimuli, showed that name agreement does not play a decisive role in the translation task. In Experiment 3, we confirmed that the frequency effects observed in the previous two experiments indeed arise during lexical access. Our findings suggest that the word frequency of the materials has a strong influence on English-Korean word translation times, and that homonyms are represented independently of each other at the lexeme level.

Multilingual Word Translation Service based on Word Semantic Analysis (어휘의미분석 기반 다국어 어휘대역 서비스)

  • Ryu, Pum-Mo
    • Journal of Digital Contents Society / v.19 no.1 / pp.75-83 / 2018
  • Multicultural family members have difficulty educating their children because of language differences. To address this, smart translation services are needed that let them access everyday vocabulary easily and quickly. However, current automatic translation technology is developed mainly for dominant languages such as English, Chinese, and Japanese. It is also limited in translating special-purpose terms, such as those in school documents and instructions from public institutions. In this study, we propose a real-time automatic word translation service for multicultural family members who understand beginner-level Korean. The service automatically analyzes the semantics of each word in a Korean sentence and provides a word-by-word translation. This study includes semantic analysis research for the Korean language, the construction of multilingual translation knowledge, and a fusion study of language education. We evaluated the word translation service with migrant women from Vietnam and Japan and obtained meaningful evaluation results.

Retrieval Model Based on Word Translation Probabilities and the Degree of Association of Query Concept (어휘 번역확률과 질의개념연관도를 반영한 검색 모델)

  • Kim, Jun-Gil;Lee, Kyung-Soon
    • The KIPS Transactions:PartB / v.19B no.3 / pp.183-188 / 2012
  • One of the major challenges for retrieval performance in information retrieval is the word mismatch between users' queries and documents. To solve the word mismatch problem, we propose a retrieval model based on the degree of association of the query concept and on word translation probabilities in a translation-based model. The word translation probabilities are calculated from pairs consisting of a sentence and its succeeding sentence. To validate the proposed method, we ran experiments on the TREC AP test collection. The experimental results show that the proposed model achieved a significant improvement over the language model and outperformed the translation-based language model.
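
As a rough illustration of the translation-based language model that this abstract builds on, the sketch below scores a query against a document by letting each document word "translate" into the query word. It shows only the generic form, P(q|d) = sum over w of t(q|w) P(w|d), with Jelinek-Mercer smoothing; it does not reproduce the authors' query-concept association or their way of estimating probabilities from sentence pairs. The function name, the smoothing weight `lam`, and the toy probabilities are all assumptions made for the example.

```python
import math
from collections import Counter


def translation_lm_score(query, doc, trans_prob, collection, lam=0.2):
    """Log-likelihood of `query` given `doc` under a translation-based language model.

    trans_prob maps (query_word, doc_word) -> t(query_word | doc_word).
    All names and values here are illustrative, not taken from the paper.
    """
    doc_counts = Counter(doc)
    coll_counts = Counter(collection)
    score = 0.0
    for q in query:
        # Translation step: P_t(q | d) = sum_w t(q | w) * P_ml(w | d)
        p_trans = sum(
            trans_prob.get((q, w), 0.0) * count / len(doc)
            for w, count in doc_counts.items()
        )
        # Jelinek-Mercer smoothing with the collection language model.
        p_coll = coll_counts[q] / len(collection)
        p = (1 - lam) * p_trans + lam * p_coll
        score += math.log(p) if p > 0 else float("-inf")
    return score


# Toy data (hypothetical): the query word "car" never appears in either document.
doc_a = ["automobile", "engine", "repair"]
doc_b = ["fruit", "market", "price"]
collection = doc_a + doc_b
toy_trans = {("car", "automobile"): 0.5, ("car", "engine"): 0.1}

score_a = translation_lm_score(["car"], doc_a, toy_trans, collection)
score_b = translation_lm_score(["car"], doc_b, toy_trans, collection)
```

With exact word matching both documents would score the same for "car"; with the translation probabilities, `doc_a` is ranked above `doc_b`, which is the word mismatch effect the abstract describes.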

Translation Disambiguation Based on 'Word-to-Sense and Sense-to-Word' Relationship (`단어-의미 의미-단어` 관계에 기반한 번역어 선택)

  • Lee Hyun-Ah
    • The KIPS Transactions:PartB / v.13B no.1 s.104 / pp.71-76 / 2006
  • To obtain a correctly translated sentence in a machine translation system, we must select target words that not only reflect the appropriate meaning in the source sentence but also form a fluent sentence in the target language. This paper points out that a source language word has various senses and that each sense can be mapped to multiple target words, and proposes a new translation disambiguation method based on this 'word-to-sense and sense-to-word' relationship. In the proposed method, target words are chosen through disambiguation of the source word sense followed by selection of a target word. Most translation disambiguation methods are based on a 'word-to-word' relationship, meaning that they translate a source word directly into a target word, so they require complicated knowledge sources that directly link source words to target words, such as bilingual aligned corpora, which are hard to obtain. By combining two sub-problems for each language, knowledge for translation disambiguation can be automatically extracted from knowledge sources for each language that are easy to obtain. In addition, the disambiguation results satisfy both fidelity and intelligibility, because the selected target words have the correct meaning and generate naturally composed target sentences.
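
The two-step idea in this abstract (first resolve the source word's sense, then pick a target word for that sense) can be sketched as below. This is a minimal toy under stated assumptions, not the author's method: the clue-word overlap used for sense selection, the co-occurrence counts used for target selection, and all data tables are hypothetical stand-ins for the monolingual knowledge sources the abstract mentions. Korean words are romanized ("bae" can mean ship or pear).

```python
def choose_sense(source_word, source_context, sense_clues):
    """Step 1 (word-to-sense): pick the sense whose clue words overlap most
    with the source-language context."""
    senses = sense_clues[source_word]
    context = set(source_context)
    return max(senses, key=lambda sense: len(senses[sense] & context))


def choose_target(sense, target_context, sense_to_targets, cooccurrence):
    """Step 2 (sense-to-word): among the target words for the sense, pick the one
    that co-occurs most often with the target-language context words."""
    return max(
        sense_to_targets[sense],
        key=lambda t: sum(cooccurrence.get((t, c), 0) for c in target_context),
    )


# Hypothetical source-language knowledge: clue words per sense of "bae".
SENSE_CLUES = {
    "bae": {
        "ship": {"hanggu", "tada"},    # port, to ride
        "pear": {"gwail", "meokda"},   # fruit, to eat
    }
}
# Hypothetical sense-to-word mapping and target-language co-occurrence counts.
SENSE_TO_TARGETS = {"ship": ["ship", "boat", "vessel"], "pear": ["pear"]}
COOCCURRENCE = {("boat", "row"): 5, ("ship", "row"): 1, ("ship", "sail"): 4}

sense = choose_sense("bae", ["hanggu", "tada"], SENSE_CLUES)
target = choose_target(sense, ["row"], SENSE_TO_TARGETS, COOCCURRENCE)
```

Here the source context selects the "ship" sense, and the target context word "row" then favors "boat" over "ship" and "vessel". Each step consults only one language's resources, which is why no bilingual aligned corpus is needed in this setup.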

An Evaluation of Translation Quality by Homograph Disambiguation in Korean-X Neural Machine Translation Systems (한-X 신경기계번역시스템에서 동형이의어 분별에 따른 변역질 평가)

  • Nguyen, Quang-Phuoc;Shin, Joon-Choul;Ock, Cheol-Young
    • Annual Conference on Human and Language Technology / 2018.10a / pp.504-509 / 2018
  • Neural machine translation (NMT) has recently achieved state-of-the-art performance. However, it has been reported to fail at word sense disambiguation (WSD) for several popular language pairs. In this paper, we explore the extent to which NMT systems are able to disambiguate Korean homographs. Homographs, words with different meanings but the same written form, cause word choice problems for NMT systems. Consistent with the findings for popular language pairs, we find that NMT systems fail to translate Korean homographs correctly. We provide a Korean word sense disambiguation tool, UTagger, for improving NMT translation quality. We conducted translation experiments using Korean-English and Korean-Vietnamese language pairs. The experimental results show that UTagger can significantly improve the translation quality of NMT in terms of the BLEU, TER, and DLRATIO evaluation metrics.

Target Word Selection for English-Korean Machine Translation System using Multiple Knowledge (다양한 지식을 사용한 영한 기계번역에서의 대역어 선택)

  • Lee, Ki-Young;Kim, Han-Woo
    • Journal of the Korea Society of Computer and Information / v.11 no.5 s.43 / pp.75-86 / 2006
  • Target word selection is one of the most important and difficult tasks in English-Korean machine translation, and it affects the translation accuracy of machine translation systems. In this paper, we present a new approach to selecting the Korean target word for an English noun with translation ambiguities, using multiple knowledge sources such as verb frame patterns, sense vectors based on collocations, statistical Korean local context information, and co-occurring POS information. Verb frame patterns constructed from a dictionary and a corpus play an important role in resolving the sparseness problem of collocation data. Sense vectors are sets of collocation data for the case where an English word with target selection ambiguity is translated into a specific Korean target word. Statistical Korean local context information is N-gram information generated from a Korean corpus. The co-occurring POS information is a statistically significant POS clue that appears with the ambiguous word. The experiment showed promising results for diverse sentences from web documents.

A Statistical Model for Choosing the Best Translation of Prepositions (통계 정보를 이용한 전치사 최적 번역어 결정 모델)

  • 심광섭
    • Language and Information / v.8 no.1 / pp.101-116 / 2004
  • This paper proposes a statistical model for the translation of prepositions in English-Korean machine translation. In the proposed model, statistical information acquired from unlabeled Korean corpora is used to choose the best translation from several possible translations. Such information includes functional word-verb co-occurrence information, functional word-verb distance information, and noun-postposition co-occurrence information. The model was evaluated with 443 sentences, each of which has a prepositional phrase, and we attained 71.3% accuracy.

A Design of Japanese Analyzer for Japanese to Korean Translation System (일반 번역시스탬을 위한 일본어 해석기 설계)

  • 강석훈;최병욱
    • Journal of the Korean Institute of Telematics and Electronics B / v.32B no.1 / pp.136-146 / 1995
  • In this paper, a Japanese morphological analyzer for a Japanese-to-Korean machine translation system is designed. The analyzer reconstructs the Japanese input sentence into word phrases that include grammatical and dictionary information. We propose an algorithm that separates morphemes and then connects them by reference to the corresponding Korean word phrases. We also define a connector to control Japanese word phrases; it is used to control the start and end points of a word phrase in a Japanese sentence, which is written without spaces. The proposed analyzer uses an analysis dictionary to perform more efficient analysis than the existing analyzer and decreases the number of dictionary searches. Because the proposed analyzer processes each word phrase in consideration of the corresponding Korean word phrase, it can generate more accurate Korean expressions than the existing one, which places great importance on generating the entire sentence structure.

Utilizing Local Bilingual Embeddings on Korean-English Law Data (한국어-영어 법률 말뭉치의 로컬 이중 언어 임베딩)

  • Choi, Soon-Young;Matteson, Andrew Stuart;Lim, Heui-Seok
    • Journal of the Korea Convergence Society / v.9 no.10 / pp.45-53 / 2018
  • Recently, studies about bilingual word embedding have been gaining much attention. However, bilingual word embedding with Korean is not actively pursued due to the difficulty in obtaining a sizable, high quality corpus. Local embeddings that can be applied to specific domains are relatively rare. Additionally, multi-word vocabulary is problematic due to the lack of one-to-one word-level correspondence in translation pairs. In this paper, we crawl 868,163 paragraphs from a Korean-English law corpus and propose three mapping strategies for word embedding. These strategies address the aforementioned issues including multi-word translation and improve translation pair quality on paragraph-aligned data. We demonstrate a twofold increase in translation pair quality compared to the global bilingual word embedding baseline.