• Title/Summary/Keyword: Phonetic codes


Computer Codes for Korean Sounds: K-SAMPA

  • Kim, Jong-mi
    • The Journal of the Acoustical Society of Korea, v.20 no.4E, pp.3-16, 2001
  • An ASCII encoding of Korean has been developed as an extension of the Speech Assessment Methods Phonetic Alphabet (SAMPA) for phonetic transcription. SAMPA is a machine-readable phonetic alphabet used for multilingual computing. It has been under development since 1987 and has been extended to more than twenty languages. The motivation for creating Korean SAMPA (K-SAMPA) is to label Korean speech for a multilingual corpus, or to transcribe the native-language (L1) interfered pronunciation of a second-language learner for bilingual education. Korean SAMPA represents each Korean allophone with a particular SAMPA symbol. Sounds that closely resemble it are represented by the same symbol, regardless of the language they are uttered in. Each of its symbols represents a speech sound that is spectrally and temporally so distinct as to be perceptually different when the components are heard in isolation. Each type of sound has a separate IPA-like designation. Korean SAMPA is superior to other transcription systems with similar objectives. It describes the cross-linguistic sound quality of Korean better than the official Romanization system, proclaimed by the Korean government in July 2000, because it uses an internationally shared phonetic alphabet. It is also phonetically more accurate than the official Romanization in that it dispenses with orthographic adjustments. It is also more convenient for computing than the International Phonetic Alphabet (IPA) because it consists only of symbols found on a standard keyboard. This paper demonstrates how Korean SAMPA can express allophonic details and prosodic features by adopting the transcription conventions of the extended SAMPA (X-SAMPA) and the prosodic SAMPA (SAMPROSA).

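The idea of a keyboard-only phonetic alphabet that maps onto IPA can be sketched as a table lookup with longest-match scanning. The table below is a small illustrative subset of general SAMPA and X-SAMPA symbols, not the K-SAMPA inventory defined in the paper, and the function name sampa_to_ipa is hypothetical.

```python
# Illustrative subset of SAMPA / X-SAMPA symbols (assumption: this is NOT
# the K-SAMPA table from the paper, only a few widely used symbols).
SAMPA_TO_IPA = {
    "_h": "ʰ",  # X-SAMPA aspiration diacritic
    "N": "ŋ",   # velar nasal
    "S": "ʃ",   # voiceless postalveolar fricative
    "@": "ə",   # schwa
    "E": "ɛ",   # open-mid front unrounded vowel
    "O": "ɔ",   # open-mid back rounded vowel
}


def sampa_to_ipa(text):
    """Convert an ASCII SAMPA string to IPA, trying longer symbols first."""
    keys = sorted(SAMPA_TO_IPA, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(SAMPA_TO_IPA[key])
                i += len(key)
                break
        else:
            # Symbols outside the table (e.g. "p", "a") are identical in IPA here.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Longest-match scanning matters because multi-character symbols such as "_h" must be consumed as a unit before single characters are considered.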

Retrieving English Words with a Spoken Word Transliteration (입말 표기를 이용한 영어 단어 검색)

  • Kim Ji-Seoung;Kim Kwang-Hyun;Lee Joon-Ho
    • Journal of the Korean Society for Library and Information Science, v.39 no.3, pp.93-103, 2005
  • Users searching an Internet English dictionary sometimes do not know the correct spelling of the word they have in mind, but remember only its pronunciation. To help these users, we propose a method for retrieving English words effectively with a spoken word transliteration, that is, a Korean transliteration of the pronunciation of an English word. We develop KONIX codes and transform both the spoken word transliteration and the English words into them. We then calculate the phonetic similarity between KONIX codes using edit distance and 2-gram methods. Experimental results show that the proposed method is very effective for retrieving English words with a spoken word transliteration.
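The two similarity measures named in the abstract can be sketched as follows. The abstract does not describe the KONIX coding itself or how the two scores are combined, so the functions below operate on arbitrary code strings, and the use of the Dice coefficient for the 2-gram score is an assumption.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two code strings (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def bigrams(s):
    """All overlapping 2-grams of a string."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def bigram_similarity(a, b):
    """Dice coefficient over 2-gram multisets (assumed scoring choice)."""
    ca, cb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ca.values()) + sum(cb.values())
    if total == 0:
        # Strings shorter than two characters have no 2-grams.
        return 1.0 if a == b else 0.0
    overlap = sum((ca & cb).values())
    return 2 * overlap / total
```

A retrieval step could rank dictionary entries by ascending edit distance or descending 2-gram similarity against the code of the query.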

Eligibility of the affinity between alphabet codes and pronunciation drills

  • Kim, Hyoung-Youb
    • Lingua Humanitatis, v.8, pp.331-367, 2006
  • This paper investigates the close relationship between writing systems and pronunciation. In pursuing this subject, I found that the same topic has been a major academic concern in Korea. There have been some remarks on the English alphabet and pronunciation. Nevertheless, the relation between alphabet codes and pronunciation tokens has not been considered the main key to mastering English pronunciation correctly and completely. I argue that it is necessary to comprehend this connection, because doing so reveals the significant role of alphabetic structure in understanding the gist of pronunciation exercises. The paper is divided into four parts, each presenting material to affirm that the writing system should be the inevitable equivalent of the sound system, and vice versa. In the first section I show that the development of the way English words are pronounced is closely related to the endeavors of scholars. While surveying the alphabetic structure of the age, many scholars found that spellings were recorded without any common denominator. Thus, they sought not only to lay the bedrock for a standard written form of words but also to associate the letters of the alphabet with phonetic features. Secondly, I mention the negative aspect of 'only spelling based English pronunciation education' for the educational goal of 'Phonics methodology.' In this part I suggest that phonemic properties are essential from a phonetic perspective: phonemic awareness. Thirdly, I refer to the standardization of the spelling system of English. As the language's realm of application extends to various professional areas such as the commercial, scientific, and cultural spheres, it is natural to assume that its usage will change according to the area of the world. Fourthly, I introduce the first English-Korean grammar book with a section titled 'the introduction to English pronunciation.' In that chapter the author explained the sound features of English based on the rules of the 'Scientific Alphabet' of the U.S.A. In this transcription system all the symbols were based on the forms of the English alphabet instead of the separate phonetic signs of the IPA.


Secure Blocking + Secure Matching = Secure Record Linkage

  • Karakasidis, Alexandros;Verykios, Vassilios S.
    • Journal of Computing Science and Engineering, v.5 no.3, pp.223-235, 2011
  • Performing approximate data matching has always been an intriguing problem for both industry and academia. The task becomes even more challenging when a requirement for data privacy arises. In this paper, we propose a novel technique to address the problem of efficient privacy-preserving approximate record linkage. The secure framework we propose consists of two basic components. First, we utilize a secure blocking component based on phonetic algorithms, statistically enhanced to improve security. Second, we use a secure matching component in which the actual approximate matching is performed using a novel private version of the Levenshtein distance algorithm. Our goal is to combine the speed of private blocking with the increased accuracy of approximate secure matching.
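Blocking with a phonetic algorithm can be illustrated with a minimal sketch. The abstract does not name the phonetic algorithm used, so American Soundex is assumed here, and the sketch shows plain blocking only: it includes none of the statistical enhancements or privacy protections the paper proposes. The function names are hypothetical.

```python
from collections import defaultdict

# Soundex digit classes for consonants; vowels, H, W and Y have no code.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """American Soundex code: first letter plus three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
        # H and W do not separate letters with the same code; vowels do.
        if c not in "HW":
            prev = code
    return (result + "000")[:4]


def block_by_soundex(names):
    """Group names into blocks so that only same-block pairs are compared."""
    blocks = defaultdict(list)
    for name in names:
        blocks[soundex(name)].append(name)
    return dict(blocks)
```

Only records that fall into the same block would then be passed to the more expensive approximate matching step, which is where the speed gain comes from.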

Matlab Implementation of Real-time Speech Analysis Tool (실시간 음성분석도구의 MatLab 구현)

  • Bak Il-suh;Kim Dae-hyun;Jo Cheol-woo
    • MALSORI, no.44, pp.93-104, 2002
  • Many speech analysis tools are available. Among them, real-time analysis tools are very useful for interactive experiments. A real-time speech analysis tool was implemented using Matlab. Matlab is a very widely used general-purpose signal processing tool. In general, its computational speed is lower than that of code written in conventional programming languages. In particular, real-time analysis including signal input and result output was not possible in the past. However, owing to the improved computing power of PCs and the inclusion of real-time I/O toolboxes in Matlab, real-time analysis is now possible to some extent using Matlab alone. In this experiment, we implemented a real-time speech analysis tool using Matlab. Pitch and spectral information are computed in real time. The results show that such real-time applications can be implemented easily using Matlab.

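The per-frame pitch computation mentioned in the abstract can be sketched with a simple autocorrelation estimator. The abstract does not say which pitch algorithm the tool uses, and this sketch is in Python rather than Matlab, so the method, the function name estimate_pitch, and the default search range of 50 to 500 Hz are all assumptions.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Returns 0.0 when no positive correlation peak is found (e.g. silence).
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]                  # remove DC offset
    min_lag = max(1, int(sample_rate / fmax))      # shortest period searched
    max_lag = min(int(sample_rate / fmin), n - 1)  # longest period searched
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


# Usage: a 50 ms frame of a synthetic 200 Hz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * t / rate) for t in range(400)]
pitch = estimate_pitch(tone, rate)
```

In a real-time setting the same function would be called on each incoming audio frame; the frame length bounds the lowest pitch that can be detected.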

A Study on Data Sharing Codes Definition of Chinese in CAI Application Programs (CAI 응용프로그램 작성시 자료공유를 위한 한자 코드 체계 정의에 관한 연구)

  • Kho, Dae-Ghon
    • Journal of The Korean Association of Information Education, v.2 no.2, pp.162-173, 1998
  • Writing a CAI program containing Chinese characters requires a common Chinese character code so that information can be shared for educational purposes. A Chinese character code set needs to allow mixed use of both vowel order and stroke order, to represent Chinese characters in their simplified Chinese as well as Japanese forms, and to have a conversion process for data exchange among different sets of Chinese codes. Waste of code area is expected when vowel order is used, because heteronyms are recognized as different characters. Using stroke order, however, facilitates data recovery by preventing duplicate code generation, though it does not comply with the phonetic rule. We claim that the first- and second-level Chinese code areas need to be expanded as much as academic and industrial circles have demanded. We also assert that Unicode can be a temporary measure for an educational code system because of its interoperability, expandability, and expressivity of character sets.


Reduction and Frequency Analyses of Vowels and Consonants in the Buckeye Speech Corpus

  • Yang, Byung-Gon
    • Phonetics and Speech Sciences, v.4 no.3, pp.75-83, 2012
  • This study had three aims: first, to examine the degree of deviation between dictionary-prescribed symbols and the actual speech of American English speakers; second, to measure the frequency of vowel and consonant production by American English speakers; and third, to investigate gender differences in the segmental sounds in a speech corpus. The Buckeye Speech Corpus was recorded from forty American male and female subjects, one hour per subject. The vowels and consonants in both the phonemic and phonetic transcriptions were extracted from the original files of the corpus, and their frequencies were obtained using scripts written in R, a free software environment. The results were as follows. Firstly, the American English speakers produced a reduced number of vowels and consonants in daily conversation. The reduction rate from the dictionary transcriptions to the actual transcriptions was around 38.2%. Secondly, the American English speakers used more front high and back low vowels, while stops, fricatives, and nasals accounted for three-fourths of the consonants. This indicates that the segmental inventory has a nonlinear frequency distribution in the speech corpus. Thirdly, the two gender groups produced vowels and consonants similarly, even though there were a few noticeable differences in their speech. From these results we propose that English teachers consider pronunciation education that reflects actual speech sounds, and that linguists find a way to establish unmarked segmentals from speech corpora.
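The frequency counts and the reduction rate can be illustrated with a short sketch, written in Python rather than the R used in the study. It assumes transcriptions are space-separated segment labels and that the reduction rate is the relative decrease in the total number of segments; the abstract does not give the exact formula, and the function names are hypothetical.

```python
from collections import Counter


def segment_frequencies(transcriptions):
    """Count how often each segment label occurs across transcriptions."""
    counts = Counter()
    for line in transcriptions:
        counts.update(line.split())
    return counts


def reduction_rate(dictionary_forms, actual_forms):
    """Percentage decrease in segment count from dictionary to actual forms.

    Assumed definition: 100 * (expected - produced) / expected.
    """
    expected = sum(segment_frequencies(dictionary_forms).values())
    produced = sum(segment_frequencies(actual_forms).values())
    if expected == 0:
        return 0.0
    return 100.0 * (expected - produced) / expected
```

For example, a word listed with eight segments in the dictionary but produced with six would give a reduction rate of 25% under this definition.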

The design and implementation of automatic translation system for hangul's romanization (국어 로마자 표기 자동 변환 시스템 설계 및 구현)

  • 김홍섭
    • KSCI Review, v.2 no.1, pp.45-54, 1995
  • This study designs a way to automatically convert a Korean word, sentence, or document, entered as a character string, into phonetic letters by applying phonological principles expressed as algorithms, even if the user does not know the basic principles of the Korean romanization rules. To do so, it assigns rarely used ASCII codes to the Bandaljum(ˇ) and builds fonts for a Korean-English character mode. The system is designed to convert text into machine codes by referring to the corresponding characters in the table of the Korean romanization rules, which is the standard currently proposed by the government. The program is made more convenient for users through handling of special-case words, a color screen, pull-down and pop-up menus, and mouse support. It fits on a single 5.25" (2HD) diskette and is written in the C programming language to implement various fonts, font expansion and condensation, and alternative printing.

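A table-driven romanizer needs each Hangul syllable split into its letters before any rule table can be consulted. The sketch below shows only that first step, using the arithmetic layout of precomposed Hangul syllables in Unicode. It is written in Python rather than C, uses Unicode rather than the 1995 system's own character codes, and does not reproduce the paper's romanization table or phonological rules; the function name is hypothetical.

```python
# Jamo in the order used by the Unicode Hangul Syllables block (U+AC00..U+D7A3).
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 medial vowels
TAILS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
         "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
         "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]   # 28 finals, "" = none

HANGUL_BASE = 0xAC00
HANGUL_LAST = 0xD7A3


def decompose_syllable(ch):
    """Split one precomposed Hangul syllable into (initial, vowel, final)."""
    code = ord(ch)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError("not a precomposed Hangul syllable: %r" % ch)
    index = code - HANGUL_BASE
    lead, rest = divmod(index, len(VOWELS) * len(TAILS))  # 588 syllables per initial
    vowel, tail = divmod(rest, len(TAILS))
    return LEADS[lead], VOWELS[vowel], TAILS[tail]
```

A romanizer would then look up each letter in its rule table and apply context-dependent adjustments across syllable boundaries, which is where the phonological rules mentioned in the abstract come in.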