• Title/Summary/Keyword: Language Translation


Quality, not Quantity? : Effect of parallel corpus quantity and quality on Neural Machine Translation (양보다 질? : 병렬 말뭉치의 양과 질이 인공신경망 기계번역에 미치는 효과)

  • Park, Chanjun;Lee, Yeonsu;Lee, Chanhee;Lim, Heuiseok
    • Annual Conference on Human and Language Technology / 2020.10a / pp.363-368 / 2020
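  • In the global era, machine translation research is being conducted worldwide to break down language barriers. With the advent of deep learning, performance has improved markedly over earlier rule-based and statistical methods, and research in the area is active. The most important factor in building a neural machine translation model is the quantity and quality of the parallel corpus. This paper collects a large-scale Korean-English corpus and applies parallel corpus filtering so that both the quantity and the quality of the data are satisfied. On IWSLT 16 and IWSLT 17, which serve as objective test sets for Korean-English machine translation, the resulting model achieves the best performance among existing Korean-English machine translation studies.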

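The abstract above does not say which filtering rules were applied, so the following is only a minimal sketch of what parallel corpus filtering can look like. The function name `filter_parallel_corpus`, the thresholds, and the specific rules (length bounds, length ratio, untranslated copies, exact duplicates) are all assumptions for illustration, not the authors' method.

```python
def filter_parallel_corpus(pairs, max_ratio=3.0, min_len=1, max_len=100):
    """Keep (source, target) pairs that pass simple heuristic checks.

    All rules and thresholds here are illustrative assumptions.
    Lengths are counted in whitespace-separated tokens.
    """
    seen = set()
    kept = []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        n_src, n_tgt = len(src.split()), len(tgt.split())
        # Rule 1: drop empty or overly long sentences.
        if not (min_len <= n_src <= max_len and min_len <= n_tgt <= max_len):
            continue
        # Rule 2: drop pairs whose lengths differ too much (likely misaligned).
        if max(n_src, n_tgt) / max(1, min(n_src, n_tgt)) > max_ratio:
            continue
        # Rule 3: drop pairs where the target is just a copy of the source.
        if src == tgt:
            continue
        # Rule 4: drop exact duplicate pairs.
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        kept.append((src, tgt))
    return kept
```

Called on a list of `(korean, english)` string pairs, the function returns the pairs that survive every rule, in their original order.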

Study on Korean-Korean Sign language Translation Technology for Avatar Sign language Service (아바타 수어 서비스를 위한 한국어-한국수어 변환 기술 연구)

  • Choi, Ji Hoon;Lee, Han-kyu;AHN, ChungHyun
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.459-460 / 2020
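  • Korean Sign Language was recognized as an official language of the Republic of Korea, on an equal footing with Korean, through the Korean Sign Language Act enacted in February 2016, but it is still not widely used because of insufficient social awareness and the cost of services. Moreover, because Deaf people have difficulty understanding much of the Korean-language information they encounter in daily life, discrimination in access to information continues to be raised as a problem. Avatar-based sign language services have emerged as an alternative, but the limits of natural language processing technology for Korean-to-Korean Sign Language translation confine them to template-based services such as weather forecasts, and insufficient technology for expressing non-manual signals has kept them from reaching commercialization. This paper proposes a method for transcribing parallel corpus data and a design for a deep learning-based system that converts Korean into Korean Sign Language.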


LyriKOR: English to Korean Song Translation with Syllabic Alignment (LyriKOR: 음절을 맞춘 영한 노래 가사 번역 모델)

  • Hyejin Jo;Eunbeen Hong;Jimin Oh;Junghwan Park;Byungjun Lee
    • Annual Conference on Human and Language Technology / 2023.10a / pp.510-516 / 2023
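  • As globalization proceeds, more people enjoy music from diverse cultures, and it has become important to give overseas fans the access they need to understand and sing along with foreign songs. To this end, this paper presents LyriKOR, an English-Korean translation model specialized for song lyric data. LyriKOR aims not only to translate English songs into Korean while conveying their meaning, but also to make the translation fit the melody and rhythm of the original song to some degree, so that it can be sung directly in Korean. The paper introduces a new method that trains a syllable-aligned translation model on limited data through two stages, translation and syllable adjustment. The model code is publicly available.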

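Syllable alignment needs some way to count syllables on the Korean side. The sketch below shows one simple option that relies only on the Unicode layout of Hangul: every precomposed Hangul syllable block in the range U+AC00 to U+D7A3 is counted as one syllable. The function name `count_hangul_syllables` is hypothetical, and the listing does not describe how LyriKOR itself counts or adjusts syllables.

```python
def count_hangul_syllables(text):
    """Count precomposed Hangul syllable blocks (U+AC00..U+D7A3) in text.

    Illustrative assumption: one syllable block equals one sung syllable.
    Non-Hangul characters (Latin letters, digits, spaces) are ignored.
    """
    return sum(1 for ch in text if "\uac00" <= ch <= "\ud7a3")
```

A translation candidate could then be compared against a target syllable count for the corresponding line of the original song.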

Neural Machine Translation with Dictionary Information (사전 정보를 활용한 신경망 기계 번역)

  • Hyun-Kyun Jeon;Ji-Yoon Kim;Seung-Ho Choi;Bongsu Kim
    • Annual Conference on Human and Language Technology / 2023.10a / pp.86-90 / 2023
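  • Generative language models have recently drawn attention, as have the tasks related to them. One area of language generation that has been studied extensively is translation. Within translation, neural machine translation (NMT) based on artificial neural networks is now the main research focus and shows excellent performance. However, translation from Korean, an agglutinative language, into languages that belong to a different typological class still does not read smoothly. To overcome this problem, this paper proposes a method that improves translation quality by using a Korean-English dictionary. In addition, regarding the output, a CoT dataset will be built with a small language model (sLLM), and performance will be evaluated after fine-tuning on that dataset.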


An Artificial Intelligence Approach for Word Semantic Similarity Measure of Hindi Language

  • Younas, Farah;Nadir, Jumana;Usman, Muhammad;Khan, Muhammad Attique;Khan, Sajid Ali;Kadry, Seifedine;Nam, Yunyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.2049-2068 / 2021
  • AI combined with NLP techniques has promoted the use of virtual assistants and has made people rely on them for many diverse uses. Conversational agents are the most promising technique for assisting computer users through their operation. An important challenge in developing conversational agents globally is transferring the groundbreaking expertise obtained in English to other languages, and AI is making this transfer possible. There is a dire need to develop systems that understand secular languages. One such difficult language is Hindi, which is the fourth most spoken language in the world. Semantic similarity is an important part of natural language processing, with applications such as ontology learning and information extraction, and it is needed for developing conversational agents. Most research is concentrated on English and other European languages. This paper presents a corpus-based word semantic similarity measure for Hindi. An experiment involving the translation of the English benchmark dataset to Hindi is performed, investigating the incorporation of the corpus, with human and machine similarity ratings. A significant correlation between human intuition and the algorithm ratings was calculated to analyze the accuracy of the proposed similarity measures. The method can be adapted to various applications of word semantic similarity or to a module for any other language.

Teacher's corrective feedback: Focus on initiations to self-repair (학습자의 오류에 대한 교사의 오류 수정: 학습자 자기 교정 유도를 중심으로)

  • Kim, Young-Eun
    • English Language & Literature Teaching / v.13 no.1 / pp.111-131 / 2007
  • This study explores the types of teacher corrective feedback in error treatment sequences in a Korean EFL classroom setting. Corrective feedback moves are coded as explicit correction, recast, or initiations to self-repair, and the frequency and distribution of each type are examined. Special focus is given to feedback types that elicit learner self-repair (clarification request, metalinguistic feedback, elicitation, and repetition of error), because initiations to self-repair are believed to facilitate language learning more than other strategies. The results of the study are as follows. First, there was an overwhelming tendency for the teacher to use recasts, whereas initiations to self-repair were used less often (52.4% vs. 29.5%). Second, the teacher tended to select feedback types according to error type, namely recasts after phonological, lexical, and translation errors and initiations to self-repair after grammatical errors, though the differences were not significant. Finally, the teacher's beliefs and the students' expectations about corrective feedback were each compared with the corrective feedback actually given, and some mismatches were found. Though both the teacher and the students acknowledged the importance and necessity of self-repair, self-repair was not put into practice to the same degree. Therefore, this study suggests that more initiations to self-repair be used for effective language learning.


Execution of a functional Logic language using the Dataflow Graph Representation (데이터플로우 그래프 표현 방식을 이용한 함수 논리 언어의 실행)

  • Kim, Yong-Jun;Cheon, Suh-Hyun
    • The Transactions of the Korea Information Processing Society / v.5 no.9 / pp.2435-2446 / 1998
  • In this paper, we describe a dataflow model for efficient execution of a functional logic language and a method for translating a functional logic language into a dataflow graph. To exploit parallelism and intelligent backtracking, we use a model in which each clause and function is represented as an independent dataflow graph. Nodes denote the basic actions to be performed when the clause or function is executed. The dataflow mechanism allows an operation to be executed as soon as all its operands are available. Since operations can never be executed earlier than that, a dataflow model is an excellent base for increasing execution speed. We reduced delay time through concurrent execution of dependency analysis and subgoals.

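The firing rule stated in the abstract above (an operation runs as soon as all its operands are available) can be sketched in a few lines. This is a generic, sequential illustration of dataflow scheduling, not the paper's execution model for functional logic programs; the function `run_dataflow` and the graph encoding are assumptions made for the example.

```python
from collections import deque

def run_dataflow(nodes, inputs):
    """Evaluate a dataflow graph using the 'fire when operands are ready' rule.

    nodes:  dict mapping node name -> (function, list of operand names)
    inputs: dict mapping input name -> value
    Returns (values, firing_order). The encoding is a hypothetical one.
    """
    values = dict(inputs)
    # Number of operands each node is still waiting for.
    missing = {name: sum(1 for d in deps if d not in values)
               for name, (_, deps) in nodes.items()}
    # Reverse edges: which nodes consume the result of a given name.
    consumers = {}
    for name, (_, deps) in nodes.items():
        for d in deps:
            consumers.setdefault(d, []).append(name)
    # Nodes whose operands are all present can fire immediately.
    ready = deque(name for name, m in missing.items() if m == 0)
    order = []
    while ready:
        name = ready.popleft()
        fn, deps = nodes[name]
        values[name] = fn(*(values[d] for d in deps))
        order.append(name)
        # Producing a value may make consumer nodes ready to fire.
        for consumer in consumers.get(name, []):
            missing[consumer] -= 1
            if missing[consumer] == 0:
                ready.append(consumer)
    return values, order

# Example graph for (a + b) * (a - b): "add" and "sub" are independent,
# so a parallel machine could fire them at the same time.
graph = {
    "add": (lambda x, y: x + y, ["a", "b"]),
    "sub": (lambda x, y: x - y, ["a", "b"]),
    "mul": (lambda x, y: x * y, ["add", "sub"]),
}
values, order = run_dataflow(graph, {"a": 5, "b": 3})
```

Here the ready queue is drained one node at a time; every node sitting in the queue at the same moment is a candidate for concurrent execution.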

Design of Visual Object-Oriented Database Query Language and Implementation of the Query Processor (시각적 객체지향 데이터베이스 질의어의 설계 및 질의처리기의 구현)

  • Lee, Suk-Kyoon;Nah, Yun-Mook;Suh, Yong-Moo
    • Asia pacific journal of information systems / v.11 no.2 / pp.121-139 / 2001
  • The recently proposed VOQL* query language is a visual language for object-oriented databases. It is based on Venn diagrams and graphs, so that the underlying schema structure is naturally implied in query expressions. In VOQL*, the structural relationships among the objects used in a query expression are represented graphically; as a result, the language has a formal semantics that can be defined inductively and is also easy to use. In this paper, we propose a revised VOQL* and introduce its query processor, InQs (Intelligent Querying System). While retaining the merit of VOQL* that structural relationships among objects are represented visually, the revised VOQL* has the additional merit that users can formulate a query interactively using various forms supplied by InQs. As a query processor, InQs provides an environment in which users express queries in the revised VOQL*, and the system then automatically translates them into ODMG OQL. The translation algorithm of InQs is simpler and more intuitive than the algorithms used in QUIVER and other systems, since it reflects the formal semantics of VOQL*, which is defined inductively.


The Korean language version of Stroke Impact Scale 3.0: Cross-cultural adaptation and translation

  • Lee, Hae-jung;Song, Ju-min
    • Journal of the Korean Society of Physical Medicine / v.10 no.3 / pp.47-55 / 2015
  • PURPOSE: Stroke is one of the most common disabling conditions, yet measures of patients' functioning level are still lacking. The aim of the study was to develop a Korean language version of the Stroke Impact Scale 3.0 (SIS 3.0). METHODS: The Korean version of SIS 3.0 was developed in idiomatic modern Korean using a standard protocol of multiple forward and backward translations and expert review to achieve equivalence with the original English version. Interviews with clinicians currently managing patients with stroke were also conducted for language evaluation. A reliability test was performed on a pre-final version to make the final adaptation. To assess the reliability of the translated questionnaire, the intraclass correlation coefficient (ICC) was calculated for each domain of the scale. RESULTS: Thirty subjects (16 male, 14 female) aged 20 to 75 years participated in reviewing the translated questionnaire. Reliability was found to be good for strength (ICC=0.74), ADL (ICC=0.81), mobility (ICC=0.90), hand function (ICC=0.80), social participation (ICC=0.79), and communication (ICC=0.77), and for the total score (ICC=0.76). However, the domains of memory and thinking (ICC=0.66) and emotion (ICC=0.27) showed poor reliability. CONCLUSION: This study indicates that the Korean version of SIS 3.0 was successfully developed. Future studies are needed to establish the validity of the Korean version of SIS 3.0.

Exploiting Implicit Parallelism for Single Loops in Java Programming Language (JAVA 프로그래밍 언어에서 단일루프구조의 무시적 병렬성 검출)

  • Kwon, Oh-Jin
    • Journal of Information Management / v.29 no.3 / pp.1-26 / 1998
  • Loops are fundamental to exploiting parallelism because they account for a large portion of the execution time when a sequential Java program runs on a parallel machine. This paper proposes a method for exploiting implicit parallelism in existing Java programs with a single-loop structure by analyzing data dependence. It also proposes a parallel code generation method using a restructuring compiler and a method for translating Java source programs into multithreaded statements, which are supported at the level of the Java programming language. The performance of programs translated into thread statements is tested using the loop trip count and the thread count as parameters. The restructuring compiler allows users to reduce overhead and exploit parallelism efficiently in Java programming.

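The idea in the abstract above, splitting a single loop with no loop-carried dependence across threads with trip count and thread count as parameters, can be sketched as follows. The sketch uses Python rather than Java to stay consistent with the other examples in this listing, and it is written by hand rather than generated by a restructuring compiler; the function name `parallel_single_loop` is an assumption. Note also that CPython threads do not speed up CPU-bound loop bodies, so this shows only the structure of the transformation.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_single_loop(body, trip_count, thread_count):
    """Run body(i) for i in range(trip_count) across thread_count threads.

    Assumes the loop has no loop-carried data dependence: iteration i
    reads nothing written by another iteration and writes only results[i].
    """
    if trip_count == 0:
        return []
    results = [None] * trip_count
    # Ceiling division: size of the contiguous block given to each task.
    chunk = -(-trip_count // thread_count)

    def run_chunk(start):
        # Each task executes a contiguous slice of the iteration space.
        for i in range(start, min(start + chunk, trip_count)):
            results[i] = body(i)

    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # list(...) forces completion and re-raises any exception from a task.
        list(pool.map(run_chunk, range(0, trip_count, chunk)))
    return results

# Sequential form being parallelized: for i in range(10): out[i] = i * i
squares = parallel_single_loop(lambda i: i * i, trip_count=10, thread_count=3)
```

A loop such as `a[i] = a[i - 1] + 1` would not qualify, because each iteration depends on the previous one; detecting that case is the role of the data dependence analysis mentioned in the abstract.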