• Title/Summary/Keyword: Source language


English-Korean speech translation corpus (EnKoST-C): Construction procedure and evaluation results

  • Jeong-Uk Bang;Joon-Gyu Maeng;Jun Park;Seung Yun;Sang-Hun Kim
    • ETRI Journal / v.45 no.1 / pp.18-27 / 2023
  • We present an English-Korean speech translation corpus named EnKoST-C. End-to-end model training for speech translation often suffers from a lack of parallel data, that is, speech in the source language paired with equivalent text in the target language. Most publicly available speech translation corpora were developed for European languages, and there is currently no public corpus for English-Korean end-to-end speech translation. We therefore created EnKoST-C, centered on TED Talks. In the process, we improved the sentence alignment approach by using subtitle time information and bilingual sentence embedding information. The result is a 559-hour English-Korean speech translation corpus. The proposed sentence alignment approach achieved an F-measure of 0.96. We also report the baseline performance of an English-Korean speech translation model trained on EnKoST-C. EnKoST-C is freely available on a Korean government open data hub site.
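
The F-measure quoted in this abstract combines precision and recall. A minimal sketch of how such a score could be computed for sentence alignment follows. It assumes that an alignment is a set of (source sentence index, target sentence index) pairs and that a predicted pair counts as correct only if it appears in the reference; the abstract does not state how the authors scored their alignments, and the function name is hypothetical.

```python
def f_measure(predicted_pairs, reference_pairs):
    """Balanced F-measure of predicted alignment pairs against a reference.

    Assumption for illustration: each alignment is an iterable of
    (source_index, target_index) pairs, matched exactly.
    """
    predicted, reference = set(predicted_pairs), set(reference_pairs)
    correct = len(predicted & reference)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)   # share of predictions that are right
    recall = correct / len(reference)      # share of the reference that was found
    return 2 * precision * recall / (precision + recall)


# Toy data, not taken from the corpus.
score = f_measure([(0, 0), (1, 1), (2, 3)], [(0, 0), (1, 1), (2, 2), (3, 3)])
```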

Implementation of a function translator converting vulnerable functions for preventing buffer overflow attacks (버퍼 오버플로우 공격 방지를 위한 취약 함수 변환기 구현)

  • Kim, Ik Su;Cho, Yong Yun
    • Journal of Korea Society of Digital Industry and Information Management / v.6 no.1 / pp.105-114 / 2010
  • The C language is frequently used to develop application and system programs. However, programs written in C are vulnerable to buffer overflow attacks. To prevent buffer overflows, programmers have to check the boundaries of buffers when they develop programs, but vulnerable programs still frequently result from poor programming habits and programmer mistakes. Existing approaches to preventing buffer overflow attacks only warn programmers about vulnerabilities and do not remove them in advance, so the programs still contain vulnerabilities. In this paper, we propose a function translator that prevents the creation of programs with buffer overflow vulnerabilities. To keep vulnerable source code from being built into a binary, the proposed translator searches for vulnerable functions that cause buffer overflows and converts them into secure functions. Accordingly, it can prevent programmers who lack security knowledge from developing vulnerable programs.
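
The search-and-convert step described in this abstract can be pictured as a small source-to-source rewrite. The sketch below is only an illustration: the abstract does not say which functions the translator handles or what it converts them to, so the two rules here (`strcpy` to `snprintf`, `gets` to `fgets`) and the assumption that the destination is a fixed-size array visible to `sizeof` are hypothetical choices. A real translator would parse the C source rather than rely on regular expressions, which also match inside comments and string literals.

```python
import re

# Hypothetical rewrite rules, not taken from the paper. Each maps an
# unbounded C call to a bounded one and assumes the destination is a
# fixed-size array, so that sizeof() yields its capacity.
RULES = [
    # strcpy(dst, src) becomes snprintf(dst, sizeof(dst), "%s", src)
    (re.compile(r'\bstrcpy\s*\(\s*(\w+)\s*,\s*([^()]+?)\s*\)'),
     r'snprintf(\1, sizeof(\1), "%s", \2)'),
    # gets(buf) becomes fgets(buf, sizeof(buf), stdin)
    (re.compile(r'\bgets\s*\(\s*(\w+)\s*\)'),
     r'fgets(\1, sizeof(\1), stdin)'),
]


def translate(c_source: str) -> str:
    """Apply every rewrite rule to the given C source text."""
    for pattern, replacement in RULES:
        c_source = pattern.sub(replacement, c_source)
    return c_source


print(translate('strcpy(name, argv[1]);'))
# prints: snprintf(name, sizeof(name), "%s", argv[1]);
```

Note that `fgets` keeps the trailing newline that `gets` discards, so a rewrite like this changes behavior slightly and calling code may need adjusting.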

Version Management Model for Distributed Object Oriented Software Development Environment Based on Web (웹 기반의 분산 객체 지향 소프트웨어 개발 환경을 위한 버전 관리 모델)

  • 김수용;최동운
    • Journal of the Korea Computer Industry Society / v.2 no.8 / pp.1099-1108 / 2001
  • In this paper, we propose an efficient method for managing the various design objects in distributed software development environments. These design objects are created in distributed software development environments based on the Unified Modeling Language, along with versions of source code. The research focuses on a version control technique, based on our proposed version rules, that consistently controls versions in web-based distributed software development and enables developers to independently manage distributed objects from the software development platform. Based on this version control technique, we designed the web-based, rule-controlled version management model proposed here.


Bilingual Voice Conversion Using Frequency Warping on Formant Space (포만트 공간에서의 주파수 변환을 이용한 이중 언어 음성 변환 연구)

  • Chae, Yi-Geun;Yun, Young-Sun;Jung, Jin Man;Eun, Seongbae
    • Phonetics and Speech Sciences / v.6 no.4 / pp.133-139 / 2014
  • This paper describes several approaches to transforming one speaker's individuality into another's using frequency warping between bilingual formant frequencies in different language environments. The proposed methods are simple and intuitive voice conversion algorithms that do not use training data between different languages. The approaches find the warping function from the source speaker's frequencies to the target speaker's frequencies in formant space. The formant space comprises four representative monophthongs for each language. The warping functions can be represented by piecewise linear equations or an inverse matrix. The features used are pure frequency components, including magnitudes, phases, and line spectral frequencies (LSF). The experiments show that the LSF-based voice conversion methods perform better than the other methods.
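
A piecewise-linear warping function of the kind mentioned in this abstract can be sketched as linear interpolation between paired anchor frequencies. The example below is a simplified illustration, not the authors' method: it assumes one list of formant frequencies per speaker, anchors the map at 0 Hz and at a hypothetical 8000 Hz Nyquist frequency, and uses made-up formant values. It also assumes the formant values within each list are distinct.

```python
from bisect import bisect_right


def make_warp(source_formants, target_formants, nyquist=8000.0):
    """Build a piecewise-linear map from source to target frequency (Hz).

    Anchors are the paired formant frequencies plus the fixed end points
    0 Hz and the Nyquist frequency (an assumption made for this sketch).
    """
    xs = [0.0] + sorted(source_formants) + [nyquist]
    ys = [0.0] + sorted(target_formants) + [nyquist]

    def warp(freq):
        # Locate the segment containing freq, then interpolate linearly.
        i = min(max(bisect_right(xs, freq) - 1, 0), len(xs) - 2)
        t = (freq - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return warp


# Hypothetical formant values for two speakers.
warp = make_warp([500.0, 1500.0, 2500.0], [600.0, 1800.0, 2700.0])
print(warp(1000.0))  # halfway between the first two anchors: prints 1200.0
```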

A Study on the Educational Program Development for Automated Pattern Drafting -Making Blouse in Ninth Grade- (제도법의 자동화를 위한 교육용 프로그램 개발에 관한 연구 (제 1보) -중3 가사 블라우스 만들기-)

  • 김여숙
    • Journal of Korean Home Economics Education Association / v.4 no.1 / pp.1-15 / 1992
  • The aim of this research is to develop PC-based courseware that drafts clothing patterns. The pattern-making procedures are formulated numerically. The features of the program are as follows: 1. Menus and instructions are displayed in Korean. 2. Easy step-by-step instructions explain how to draw the basic pattern and the design pattern. 3. A low-cost personal computer and a general-purpose printer are used. The source program was written in C and compiled with Turbo C. A Bezier spline is used to draw the pattern curves, and Korean characters are drawn graphically so that they can be displayed on the same screen as the pattern. A low-cost IBM Personal Computer or compatible with a Hercules Graphics Card is required to run the program.
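
To illustrate how a Bezier curve turns a few control points into a smooth pattern line, the sketch below evaluates a curve with de Casteljau's algorithm and samples it into a polyline for drawing. It is written in Python for brevity, although the original program was in C, and the function names and control points are hypothetical. The abstract does not say how the authors evaluated their curves.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] with de Casteljau's algorithm."""
    points = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbours until one point remains.
    while len(points) > 1:
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


def bezier_polyline(control_points, segments=16):
    """Sample the curve at evenly spaced parameters for line drawing."""
    return [bezier_point(control_points, i / segments)
            for i in range(segments + 1)]


# Hypothetical control points for a cubic curve, such as an armhole or neckline.
curve = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(bezier_point(curve, 0.5))  # prints (0.5, 0.75)
```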


University Grammar of English in Korea (대학에서의 영문법 교육)

  • 박승윤
    • Korean Journal of English Language and Linguistics / v.2 no.4 / pp.537-553 / 2002
  • This paper discusses various problems related to the teaching of English grammar at Korean universities. We first discuss whether English grammar should be taught at universities and, if so, what kind of English grammar needs to be taught. We propose that the English grammar we teach to Korean undergraduate students be eclectic, in the sense that the traditional grammar established by Jespersen and others be the major source of instruction, supplemented, if necessary, by school grammar and by linguistically oriented grammars such as generative grammar or cognitive grammar. We then discuss the content of the English grammar that should be included in the curriculum: (i) present perfect vs. past, (ii) will vs. be going to, (iii) must vs. have to, (iv) may vs. can, (v) infinitives vs. gerunds, (vi) conative constructions, and (vii) the passive.


A Study on the Markup Scheme for Building the Corpora of Korean Culinary Manuscripts (한글 필사본 음식조리서 말뭉치 구축을 위한 마크업 방안 연구)

  • An, Ui-Jeong;Park, Jin-Yang;Nam, Gil-Im
    • Language and Information / v.12 no.2 / pp.95-114 / 2008
  • This study aims to establish a markup system for 17th-19th century culinary manuscripts. To this end, Section 2 reviews theoretical considerations in encoding large-scale historical corpora. Section 3 identifies and analyzes the thematic and structural characteristics of our source text. Section 4 proposes a markup scheme based on the XML standard for bibliographical and structural markup of the corpus as well as for grammatical annotation. We show that an XML-based markup system is highly desirable because it is powerful, flexible in its expressiveness, and scalable. The markup scheme we suggest is a modified and extended version of TEI-P5 that accommodates the textual and linguistic characteristics of premodern Korean culinary manuscripts.


Silent Verbs in Northern Mandarin: A Silence Neither Gaps Nor Emptiness Can Fill

  • Kim, Ji-Yung
    • Language and Information / v.11 no.2 / pp.87-103 / 2007
  • This paper reanalyzes examples with missing verbs. Northern Mandarin rejects argument nominal phrases after a silent verb, as well as silent verbs inside islands. These restrictions suggest a grammatical process which silences verbs. I propose that these restrictions are the result of VP-topicalization followed by ellipsis. This analysis accounts for the island sensitivity of these constructions: since VP-topicalization feeds ellipsis, constructions with elided VPs are not derivable from configurations where movement is impossible. Also, to avoid topicalization along with the VP, the argument must move out of VP; the subsequent topicalization of the VP containing the argument's trace would then give rise to a configuration where that trace c-commands the moved-out DP. Adjuncts do not pose a problem because they are located outside of that smallest VP-shell. The data presented here are accommodated by neither of Tang's (2001) proposals for silent verbs (gapping and empty verbs). Instead, they provide support for a third source for silent verbs, VP-ellipsis via topicalization.


Semantic Prosody and Meaning Equivalence: Is Korean pin konggan Equivalent to ‘Empty Space’ or ‘Blank Space’? (의미운률과 의미 등가성: ‘빈 공간’은 ‘empty space’인가 ‘blank space’인가?)

  • 조의연
    • Korean Journal of English Language and Linguistics / v.3 no.4 / pp.589-609 / 2003
  • The purpose of this paper is to show that lexical equivalence in translation can be achieved when it is based on the semantic prosodies of lexical items. The paper examines the semantic prosodies of two seemingly synonymous English adjectives, ‘empty’ and ‘blank’, on the basis of the corpus in Cobuild English Collocations on CD-ROM and proposes that they differ in terms of spatial dimensions. Thus, when the Korean equivalent pin, derived from the verb pita, is translated into English, the syntagmatic phraseological environments of the Korean adjective must be taken into account to attain equivalence between the source and target languages. The relevant Korean corpus data were taken from the 21st Century Sejong Plan (2002). Of the 12 examples of pin konggan, five appear to be equivalent to ‘blank’ and seven to ‘empty.’ This five-to-seven split in usage indicates that the equivalence problem concerning the lexical item pin is not a trivial matter in translation.


Skyline Algorithm for Finite Analysis Programs Written in C Language (C언어의 유한요소해석 프로그램을 위한 Skyline Algorithm)

  • 이재영
    • Computational Structural Engineering / v.2 no.2 / pp.85-92 / 1989
  • This paper presents a modified skyline algorithm suitable for the C language. The modified algorithm improves the computational efficiency and the structure of the program. A substantial reduction in execution time is achieved by simplifying the assembly and decomposition of the stiffness matrix. A source program is also provided for use in future development of finite element software.
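
For readers unfamiliar with the term, skyline (profile) storage keeps each column of a symmetric stiffness matrix only from its first nonzero entry down to the diagonal. The sketch below shows the standard storage scheme and element lookup, not the modification proposed in the paper, and uses Python rather than C for brevity. The function names and the sample matrix are hypothetical.

```python
def to_skyline(matrix):
    """Pack the upper triangle of a symmetric matrix column by column.

    Each column is stored from its first nonzero row down to the diagonal.
    Returns (values, col_start, first_row), where col_start[j] is the index
    in values of the first stored entry of column j.
    """
    n = len(matrix)
    values, col_start, first_row = [], [], []
    for j in range(n):
        # First nonzero row in column j; fall back to the diagonal itself.
        top = next((i for i in range(j + 1) if matrix[i][j] != 0), j)
        first_row.append(top)
        col_start.append(len(values))
        values.extend(matrix[i][j] for i in range(top, j + 1))
    return values, col_start, first_row


def get(skyline, i, j):
    """Read entry (i, j) of the symmetric matrix from skyline storage."""
    values, col_start, first_row = skyline
    if i > j:
        i, j = j, i          # use symmetry to stay in the upper triangle
    if i < first_row[j]:
        return 0             # entry lies above the skyline
    return values[col_start[j] + (i - first_row[j])]


# Small made-up stiffness matrix: 6 stored values instead of 16.
K = [[4, 1, 0, 0],
     [1, 5, 2, 0],
     [0, 2, 6, 0],
     [0, 0, 0, 7]]
sky = to_skyline(K)
```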
