• Title/Summary/Keyword: 방언 분류 (dialect classification)

Search Result 5, Processing Time 0.017 seconds

Performance Comparison of Korean Dialect Classification Models Based on Acoustic Features

  • Kim, Young Kook;Kim, Myung Ho
    • Journal of the Korea Society of Computer and Information / v.26 no.10 / pp.37-43 / 2021
  • Using the acoustic features of speech, important social and linguistic information about the speaker can be obtained, and one of the key features is the dialect. A speaker's use of a dialect is a major barrier to interaction with a computer. Dialects can be distinguished at various levels such as phonemes, syllables, words, phrases, and sentences, but it is difficult to distinguish dialects by identifying them one by one. Therefore, in this paper, we propose a lightweight Korean dialect classification model using only MFCC among the features of speech data. We study the optimal method to utilize MFCC features through Korean conversational voice data, and compare the classification performance of five Korean dialects in Gyeonggi/Seoul, Gangwon, Chungcheong, Jeolla, and Gyeongsang in eight machine learning and deep learning classification models. The performance of most classification models was improved by normalizing the MFCC, and the accuracy was improved by 1.07% and F1-score by 2.04% compared to the best performance of the classification model before normalizing the MFCC.
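
The abstract above does not say how the MFCC features were normalized. As a minimal sketch of one common choice, the code below applies per-coefficient mean and variance normalization over time. The function name `normalize_mfcc`, the `eps` parameter, and the toy input are illustrative assumptions, not the authors' implementation.

```python
from statistics import fmean, pstdev


def normalize_mfcc(mfcc, eps=1e-8):
    """Z-score each MFCC coefficient across frames.

    mfcc: list of rows, one row per coefficient, each row a list of
    per-frame values. Assumed layout; the paper does not specify one.
    """
    normalized = []
    for row in mfcc:
        mean = fmean(row)
        std = pstdev(row)  # population standard deviation over frames
        # eps guards against division by zero for a constant coefficient
        normalized.append([(v - mean) / (std + eps) for v in row])
    return normalized


# Toy stand-in for a real MFCC matrix (2 coefficients x 4 frames).
toy_mfcc = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 10.0, 30.0, 30.0],
]
normalized = normalize_mfcc(toy_mfcc)
```

After normalization each coefficient has approximately zero mean and unit variance, so coefficients with large raw ranges no longer dominate a downstream classifier.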

Dialect classification based on the speed and the pause of speech utterances (발화 속도와 휴지 구간 길이를 사용한 방언 분류)

  • Jonghwan Na;Bowon Lee
    • Phonetics and Speech Sciences / v.15 no.2 / pp.43-51 / 2023
  • In this paper, we propose an approach to dialect classification based on the speed and pauses of speech utterances as well as the age and gender of the speakers. Dialect classification is an important technique for speech analysis; for example, an accurate dialect classification model can potentially improve the performance of speaker or speech recognition. In previous studies, deep learning on Mel-Frequency Cepstral Coefficient (MFCC) features has been the dominant approach. We focus on the acoustic differences between regions and classify dialects using features derived from those differences. Specifically, we extract underexplored additional features, namely the speed and pauses of speech utterances, along with metadata including the age and gender of the speakers. Experimental results show that the proposed approach achieves higher accuracy than the method using only MFCC features, especially with the speech rate feature. Incorporating all the proposed features improved accuracy from 91.02% for the previous MFCC-only method to 97.02%.
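
The abstract does not define how speech rate and pause length are computed. The sketch below shows one plausible way to derive such features from per-frame energies and a known syllable count. The energy threshold, the minimum pause length, and all names (`pause_features`, `frame_sec`, `n_syllables`) are assumptions for illustration only.

```python
def pause_features(frame_energies, frame_sec, n_syllables,
                   energy_threshold, min_pause_sec=0.2):
    """Estimate speech rate and pause statistics for one utterance.

    A pause is assumed to be a run of consecutive frames whose energy is
    below energy_threshold and which lasts at least min_pause_sec.
    """
    min_frames = max(1, round(min_pause_sec / frame_sec))
    pauses = []
    run = 0
    # The infinite sentinel closes a low-energy run at the end of the signal.
    for energy in list(frame_energies) + [float("inf")]:
        if energy < energy_threshold:
            run += 1
        else:
            if run >= min_frames:
                pauses.append(run * frame_sec)
            run = 0
    total_sec = len(frame_energies) * frame_sec
    pause_total = sum(pauses)
    speaking_sec = total_sec - pause_total
    return {
        "speech_rate": n_syllables / total_sec,  # syllables per second overall
        "articulation_rate": n_syllables / speaking_sec if speaking_sec > 0 else 0.0,
        "pause_count": len(pauses),
        "pause_total_sec": pause_total,
    }


# Toy utterance: ten 100 ms frames, one 300 ms pause, one short dip.
energies = [0.9, 0.8, 0.01, 0.02, 0.01, 0.7, 0.9, 0.02, 0.8, 0.9]
features = pause_features(energies, frame_sec=0.1, n_syllables=12,
                          energy_threshold=0.1)
```

The single low-energy frame near the end is shorter than the minimum pause length, so only the three-frame run counts as a pause. The resulting values could be appended to an MFCC-based feature vector together with speaker metadata.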

Dialektengrenzen in Deutschland und das Alter der hochdeutschen Lautverschiebung (독어방언분류와 <고지독어 음운추이>의 발생시기)

  • Song Wan-Yong
    • Koreanische Zeitschrift für Deutsche Sprachwissenschaft / v.7 / pp.61-83 / 2003
  • The two most important theories about the age of the High German consonant shift (hochdeutsche Lautverschiebung, LV) are the following: 1. a very early date (probably completed before Christ), 2. a relatively late date (generally the 6th-10th century AD). The two datings differ by more than half a millennium. These discrepancies go back to different dating methods: the shift is dated either solely on the basis of written evidence or on the basis of somewhat speculative ethnological considerations. The latter approach does allow an early date for the LV, but there is no secure evidence for it. Therefore only linguistic material can be a secure basis for dating the LV. This still holds even though spelling as a rule does not reflect changes in the spoken language promptly. The oldest attestations for the most important dialects are the following: the Alemannic runic inscription of Stetten (ca. 680, probably the earliest attestation of the High German LV of all), the manuscript of the Langobardic Edictus Rothari (around 700), the Bavarian Salzburger Verbrüderungsbuch (784), and the Middle Franconian glosses of the Echternacher Evangeliar (around 700).

A gazetteer of three Japanese plant taxonomists (G. Koidzumi, J. Ohwi, and S. Kitamura) of Kyoto University in Korea during 1930s (1930년대 교토대학의 한반도 채집과 지명 정리: G. Koidzumi, J. Ohwi, S. Kitamura)

  • Chang, Kae-Sun;Park, Soo-Kyung;Kim, Hui;Chang, Chin-Sung
    • Korean Journal of Plant Taxonomy / v.43 no.4 / pp.319-331 / 2013
  • Records found on labels of specimens deposited at Kyoto University (KYO) and references about three Japanese taxonomists, Koidzumi, Gen'ichi (1883-1953), Ohwi, Jisaburo (1905-1977), and Kitamura, Siro (1906-2002), were assembled to produce the collectors' itineraries from 1930 to 1935 in Korea. The quality of the label data varies: most labels give only the collector's name and country of collection, the locality data are often only textual, and the Chinese and Japanese names, as well as the ethnic dialects common to the region, vary widely. It is estimated that approximately 2,000 specimens collected from Korea by the three taxonomists are currently held within the collections of the Kyoto University herbarium (KYO). Koidzumi, who was a professor at Kyoto University, traversed different northern parts of the country, such as Jeju-do Island, Mt. Keumkang-san, and Hamkyongbuk-do, during summer (July to August) in 1932, 1933, and 1935. In 1930 and 1932, Ohwi spent three months in the unexplored mountains in the northern parts, such as Hamkyeongnam-do, Hamkyeongbuk-do, and Gangwon-do. In contrast, for two months in the middle of 1935 he visited Jeju-do and Mt. Jirisan and travelled through the southern parts. Unlike the two previous botanists, Kitamura made his major collections in Korea twice in one major area in the northern part, and in Jeju-do and Mt. Keumgang-san, in 1930, 1932, and 1935.

Species Identification and Monitoring of Labeling Compliance for Commercial Pufferfish Products Sold in Korean On-line Markets (국내 온라인 유통 복어 제품의 종판별 및 표시사항 모니터링 연구)

  • Ji Young Lee;Kun Hee Kim;Tae Sun Kang
    • Journal of Food Hygiene and Safety / v.38 no.6 / pp.464-475 / 2023
  • In this study, based on an analysis of two DNA barcode markers (cytochrome c oxidase subunit I and cytochrome b genes), we performed species identification and monitored labeling compliance for 50 commercial pufferfish products sold in on-line markets in Korea. Using these barcode sequences as queries for species identification and phylogenetic analysis, we screened the GenBank database. A total of seven pufferfish species (Takifugu chinensis, T. pseudommus, T. xanthopterus, T. alboplumbeus, T. porphyreus, T. vermicularis, and Lagocephalus cheesemanii) were identified, and we detected 35 products (70%) that were non-compliant with the corresponding label information. Moreover, the labels on 12 commercial products contained only the general common name (i.e., pufferfish) and not the scientific or Korean names for the 21 edible pufferfish species. Furthermore, the proportion of mislabeled highly processed products (n = 9, 81.8%) was higher than that of simply processed products (n = 26, 66.7%). With respect to the country of origin, the percentage of mislabeled Chinese products (n = 8, 80%) was higher than that of Korean products (n = 26, 66.7%). In addition, the market and dialect names of different pufferfish species were labeled only as Jolbok or Milbok, whereas two non-edible pufferfish species (T. vermicularis and T. pseudommus) were used in six commercial pufferfish products described as Jolbok and Gumbok on their labels, which could be attributable to the complex classification system used for pufferfish. These monitoring results highlight the necessity of developing genetic methods that can be used to identify the 21 edible pufferfish species, as well as the need for regulatory monitoring of commercial pufferfish products.