• Title/Summary/Keyword: vowel addition

Vehicle License Plate Text Recognition Algorithm Using Object Detection and Handwritten Hangul Recognition Algorithm (객체 검출과 한글 손글씨 인식 알고리즘을 이용한 차량 번호판 문자 추출 알고리즘)

  • Na, Min Won;Choi, Ha Na;Park, Yun Young
    • Journal of Information Technology Services / v.20 no.6 / pp.97-105 / 2021
  • Recently, with the development of information technology, unmanned systems are being introduced in many industrial fields, and one of the most important factors for introducing unmanned systems in the automobile field is vehicle license plate recognition (VLPR). Existing VLPR algorithms use image processing tailored to a specific type of license plate to segment the individual character regions within the plate and then recognize each character. However, as the number of Korean vehicle license plates increases and the law is amended, old-style and new-style plates coexist, and different plate types are used for each type of vehicle. Therefore, the VLPR system must be updated every time the format changes, which incurs costs. In this paper, we use an object detection algorithm to detect characters regardless of the license plate format, and apply a handwritten Hangul recognition (HHR) algorithm to enhance the recognition accuracy of a single Hangul character, which is called a Hangul unit. Since a Hangul unit is recognized by combining an initial consonant, a medial vowel, and a final consonant, it is possible to recognize Hangul units other than the 40 used on Korean vehicle license plates.
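
The combination of initial consonant, medial vowel, and final consonant into one Hangul unit can be illustrated with the standard Unicode composition rule for precomposed syllables. This is a minimal sketch and not the paper's recognizer: the function name `compose` is hypothetical, and it assumes the three components have already been predicted as compatibility jamo.

```python
# Jamo orderings used by Unicode for precomposed Hangul syllables (U+AC00..U+D7A3).
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"    # 21 medial vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 finals plus "none"


def compose(initial: str, medial: str, final: str = "") -> str:
    """Combine three jamo into one precomposed Hangul syllable.

    Hypothetical helper for illustration; raises ValueError on unknown jamo.
    """
    i = INITIALS.index(initial)
    m = MEDIALS.index(medial)
    f = FINALS.index(final)
    # Unicode rule: code point = 0xAC00 + (initial * 21 + medial) * 28 + final
    return chr(0xAC00 + (i * 21 + m) * 28 + f)


print(compose("ㅎ", "ㅓ"))        # a syllable without a final consonant
print(compose("ㅎ", "ㅏ", "ㄴ"))  # a syllable with a final consonant
```

Because the three components are predicted separately, any of the 19 × 21 × 28 = 11,172 syllables can be formed, which is consistent with the abstract's point that recognition is not limited to the 40 plate characters.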

Possibility of Motor Speech Improvement in People With Spinocerebellar Ataxia via Intensive Speech Treatment (집중치료를 통한 소뇌운동실조증 환자의 말운동개선 가능성)

  • Park, Youngmi
    • The Journal of the Korea Contents Association / v.18 no.11 / pp.634-642 / 2018
  • People with spinocerebellar ataxia (SCA), a hereditary and progressive neurogenic disorder, suffer from ataxic dysarthria due to cerebellar dystrophy. This study was designed to examine whether intensive motor speech treatment yields improvement in progressive ataxic dysarthria and, if so, to investigate the magnitude of the therapeutic effect. SPEAK OUT!® was provided to a 55-year-old female diagnosed with SCA to improve motor speech functions. The magnitude of the therapeutic effect was large for changes in MPT and vocal intensity across speech tasks. A small effect size was found for changes in fundamental frequency; however, a large therapeutic effect was observed for changes in frequency range. In addition, improvement in vocal quality based on jitter, shimmer, and HNR was observed with a large therapeutic effect size, and the vowel space expanded, particularly due to F1. Lastly, VHI scores decreased. The intensive motor speech treatment SPEAK OUT!® was effective enough to yield improvement in vocal intensity, frequency range, and vocal quality, expanding the vowel space and lowering VHI scores. Based on the results of this case study, further efficacy evaluation of SPEAK OUT!® for improving progressive ataxic dysarthria in people with SCA is required.

The Acquisition and Description of Voiceless Stops of Spanish and English

  • Marie Fellbaum
    • Proceedings of the KSPS conference / 1996.10a / pp.274-274 / 1996
  • This paper presents preliminary results from work in progress on a paired study of the acquisition of voiceless stops by Spanish speakers learning English and American English speakers learning Spanish. The hypothesis was that, according to Eckman's Markedness Differential Hypothesis, the American speakers would have no difficulty suppressing aspiration to produce Spanish unaspirated stops, whereas the Spanish speakers would have difficulty acquiring the aspiration necessary for English voiceless stops. The null hypothesis was proved. All subjects were given the same set of disyllabic real words of English and Spanish in carrier phrases. The tokens analyzed in this report are limited to word-initial voiceless stops followed by a low back vowel in stressed syllables. Tokens were randomized and then arranged in a list with each word appearing three separate times. Aspiration was measured from the burst to the onset of voicing (voice onset time, VOT). First language (L1) and second language (L2) tokens were compared for each speaker and between the two groups of speakers. Results indicate that the Spanish speakers, as a group, were able to reach the accepted target-language VOT of English, but the English speakers were not able to reach the accepted range for Spanish, in spite of statistically significant changes (p<.001) by speakers in both groups of learners. A closer analysis of the speech samples revealed wide variability within the speech of native speakers of English. This variability is due not only to the wide range of VOT in English (120 ms for English labials, for example); individual speakers also showed different patterns. These results are revealing for the demands placed on experimental designs and for the number of speakers and tokens required for an adequate description of different languages.
In addition, a simple report of means will not distinguish the speakers and the respective language learning situation; measurements must also include the RANGE of acceptability of VOT for phonetic segments. This has immediate consequences for the learning and teaching of foreign languages involving aspirated stops. In addition, the labelling of spoken language in speech technology is shown to be inadequate without a fuller mathematical description.
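
The point that a mean alone does not distinguish speakers can be made concrete with a small sketch. The function names and the token values are hypothetical illustrations, not data from the study; VOT is taken, as in the abstract, as the interval from the burst to the onset of voicing.

```python
from statistics import mean


def vot_ms(burst_s: float, voicing_onset_s: float) -> float:
    """Voice onset time in milliseconds, from two time stamps in seconds."""
    return (voicing_onset_s - burst_s) * 1000.0


def summarize(vots_ms: list[float]) -> dict[str, float]:
    """Report the mean together with the range, not the mean alone."""
    lo, hi = min(vots_ms), max(vots_ms)
    return {"mean": mean(vots_ms), "min": lo, "max": hi, "range": hi - lo}


# Hypothetical tokens: two speakers with the same mean VOT but different spread.
speaker_a = [58.0, 60.0, 62.0]
speaker_b = [20.0, 60.0, 100.0]
print(summarize(speaker_a))
print(summarize(speaker_b))
```

Both hypothetical speakers average 60 ms, yet only the range shows that the second speaker's tokens spread far more widely.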

Design of Hangeul Smartphone Keypad (한글 스마트폰 글자판 설계)

  • Lee, Junghwa
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.10 / pp.2359-2366 / 2015
  • With the development of many smartphone applications, the importance of the keypad used on smartphones has been increasing. In this paper, we design a Hangeul smartphone keypad for typing Hangeul characters more efficiently by considering the characteristics of Hangeul, building on existing research on smartphone keypads. When placing letters on the keypad, the proposed design minimizes travel distance by using character frequencies and the frequency with which vowels and consonants occur together. In addition, we define an assessment model for evaluating keypad performance and verify the efficiency of the proposed keypad. According to the experimental results, the proposed keypad is more efficient than other keypads.
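
One way to picture a frequency-weighted travel-distance criterion is the sketch below. It is an assumption for illustration only: the abstract does not give the paper's assessment model, and the function name, key coordinates, and pair counts here are all hypothetical.

```python
import math

Layout = dict[str, tuple[float, float]]          # letter -> key centre (x, y)
PairCounts = dict[tuple[str, str], float]        # (from, to) -> how often the pair is typed


def expected_travel(layout: Layout, pair_counts: PairCounts) -> float:
    """Average finger travel per letter transition, weighted by pair frequency."""
    total = sum(pair_counts.values())
    weighted = sum(
        count * math.dist(layout[a], layout[b])
        for (a, b), count in pair_counts.items()
    )
    return weighted / total


# Hypothetical counts: "ㄱ" followed by "ㅏ" is typed three times as often as "ㄴ" followed by "ㅏ".
pairs = {("ㄱ", "ㅏ"): 3, ("ㄴ", "ㅏ"): 1}
layout_a = {"ㄱ": (0, 0), "ㅏ": (1, 0), "ㄴ": (2, 0)}   # vowel placed between the consonants
layout_b = {"ㄱ": (0, 0), "ㄴ": (1, 0), "ㅏ": (2, 0)}   # vowel placed at the edge
print(expected_travel(layout_a, pairs))
print(expected_travel(layout_b, pairs))
```

Under these made-up counts the first layout scores lower, which is the sense in which placing frequently paired vowels and consonants close together reduces travel.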

Pronunciation Variation Patterns of Loanwords Produced by Korean and Grapheme-to-Phoneme Conversion Using Syllable-based Segmentation and Phonological Knowledge (한국인 화자의 외래어 발음 변이 양상과 음절 기반 외래어 자소-음소 변환)

  • Ryu, Hyuksu;Na, Minsu;Chung, Minhwa
    • Phonetics and Speech Sciences / v.7 no.3 / pp.139-149 / 2015
  • This paper aims to analyze pronunciation variations of loanwords produced by Korean speakers and to improve the performance of loanword pronunciation modeling in Korean by using syllable-based segmentation and phonological knowledge. The loanword text corpus used for our experiment consists of 14.5k words extracted from frequently used words in the set-top box, music, and point-of-interest (POI) domains. First, pronunciations of loanwords in Korean are obtained by manual transcription and used as target pronunciations. The target pronunciations are compared with the standard pronunciations using confusion matrices to analyze pronunciation variation patterns of loanwords. Based on the confusion matrices, three salient pronunciation variations of loanwords are identified, including tensification of the fricative [s] and derounding of the rounded vowels [ɥi] and [wɛ]. In addition, a syllable-based segmentation method that considers phonological knowledge is proposed for loanword pronunciation modeling. Performance of the baseline and the proposed method is measured using phone error rate (PER), word error rate (WER), and F-score at various context spans. Experimental results show that the proposed method outperforms the baseline. We also observe that performance degrades when training and test sets come from different domains, which implies that loanword pronunciations are influenced by data domains. It is noteworthy that pronunciation modeling for loanwords is enhanced by reflecting phonological knowledge. The loanword pronunciation modeling in Korean proposed in this paper can be used for automatic speech recognition in application interfaces such as navigation systems and set-top boxes, and for computer-assisted pronunciation training for Korean learners of English.
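
Phone error rate is commonly computed as the edit distance between predicted and reference phone sequences divided by the number of reference phones. The sketch below follows that common definition; whether the paper scores PER in exactly this way is an assumption, and the example transcription is illustrative rather than taken from its corpus.

```python
def edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Levenshtein distance: substitutions, insertions, and deletions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # delete a reference phone
                cur[j - 1] + 1,          # insert a hypothesis phone
                prev[j - 1] + (r != h),  # match or substitute
            ))
        prev = cur
    return prev[-1]


def phone_error_rate(refs: list[list[str]], hyps: list[list[str]]) -> float:
    """Total edit errors over all words divided by total reference phones."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / sum(len(r) for r in refs)


# Illustrative word: the model outputs plain [s] where the target has tense [s*].
target = [["s*", "ʌ", "b", "i", "s"]]
predicted = [["s", "ʌ", "b", "i", "s"]]
print(phone_error_rate(target, predicted))
```

A single tensification mismatch in a five-phone word gives one error out of five reference phones.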

Acoustic-phonetic characteristics of fricatives distortion in functional articulation disorders (기능적 조음음운장애아동의 치조 마찰음 왜곡의 음향음성학적 특성)

  • Yang, Minkyo;Choi, Yaelin;Kim, Eun Yeon;Yoo, Hyun Ji
    • Phonetics and Speech Sciences / v.10 no.4 / pp.127-134 / 2018
  • This study aims to explain the difficulties children with articulation and phonological disorders have in producing alveolar fricative sounds. It performs a comparative analysis of how ordinary children produce alveolar fricatives using five acoustic variables, and thereby identifies objective differences from children with articulation and phonological disorders. To this end, the study compared 10 children with articulation and phonological disorders and 10 ordinary children according to the phonation type of the alveolar fricative (/s/ and /s*/), the type of vowel (/i/, /ε/, /u/, /o/, /ɯ/, /ʌ/, /ɑ/), and the syllable structure (CV, VCV), using acoustic variables including a central moment, skewness, kurtosis, center of gravity, and variance. The results indicate that children with articulation and phonological disorders, compared with ordinary children, have difficulty concentrating quick, momentary, and forceful friction when articulating alveolar fricatives, which require strong energy and are accompanied by tension. Furthermore, the values for alveolar fricatives produced by children with articulation and phonological disorders were spread evenly around the average, which means that the overall range of standard deviation values for children with functional phonological disorders is wider than that of ordinary children. In future studies, comparing and analyzing mispronunciations involving omission, substitution, and addition across various target groups could help children with functional phonological disorders.
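
The spectral variables named in the abstract (center of gravity, variance, skewness, kurtosis) are moments of a spectrum treated as a distribution over frequency. The following is a minimal sketch under assumptions the abstract does not settle: power weighting, excess kurtosis (normal distribution gives 0), and a hypothetical function name.

```python
def spectral_moments(freqs_hz: list[float], power: list[float]):
    """Return (center of gravity, variance, skewness, excess kurtosis).

    The spectrum is normalized so that the power values sum to 1 and then
    treated as a probability distribution over frequency.
    """
    total = sum(power)
    weights = [p / total for p in power]
    cog = sum(f * w for f, w in zip(freqs_hz, weights))

    def central(k: int) -> float:
        # k-th central moment around the center of gravity
        return sum(((f - cog) ** k) * w for f, w in zip(freqs_hz, weights))

    variance = central(2)
    skewness = central(3) / variance ** 1.5
    kurtosis = central(4) / variance ** 2 - 3.0
    return cog, variance, skewness, kurtosis


# Toy symmetric spectrum with three bins (not measured data).
print(spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0]))
```

A spectrum whose energy is spread more evenly across frequency yields a larger variance, which is the kind of difference between groups that the abstract describes.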

Laryngeal Cancer Screening using Cepstral Parameters (켑스트럼 파라미터를 이용한 후두암 검진)

  • 이원범;전경명;권순복;전계록;김수미;김형순;양병곤;조철우;왕수건
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.14 no.2 / pp.110-116 / 2003
  • Background and Objectives : Laryngeal cancer discrimination using voice signals is a non-invasive method that can carry out the examination rapidly and simply without causing discomfort to patients. If appropriate analysis parameters and classifiers are developed, this method can be used effectively in various applications including telemedicine. This study examines voice analysis parameters used for laryngeal disease discrimination to help discriminate laryngeal diseases by voice signal analysis. The study also estimates the laryngeal cancer discrimination performance of the Gaussian mixture model (GMM) classifier based on statistical modelling of voice analysis parameters. Materials and Methods : The Multi-Dimensional Voice Program (MDVP) parameters, which have been widely used for the analysis of laryngeal cancer voice, sometimes fail to analyze the voice of a laryngeal cancer patient whose cycle is seriously damaged. Accordingly, it is necessary to develop a new method that enables highly reliable analysis of voice signals that cannot be analyzed by the MDVP. To conduct the laryngeal cancer discrimination experiments, the authors used three types of voices collected at the Department of Otorhinolaryngology, Pusan National University Hospital: 50 voices of normal males, 50 voices of males with benign laryngeal diseases, and 105 voices of males with laryngeal cancer. In addition, the experiment included 11 voices of males with laryngeal cancer that cannot be analyzed by the MDVP. Only the monosyllabic vowel /a/ was used as voice data. Since there were only 11 voices of laryngeal cancer patients that cannot be analyzed by the MDVP, those voices were used only for discrimination. This study examined the linear predictive cepstral coefficients (LPCC) and the mel-frequency cepstral coefficients (MFCC), the two major cepstrum analysis methods in the area of acoustic recognition.
Results : The results showed that the mel frequency scaling process was effective in acoustic recognition but not useful for laryngeal cancer discrimination. Accordingly, the linear frequency cepstral coefficients (LFCC), which exclude the mel frequency scaling from the MFCC, were introduced. The LFCC showed better discrimination performance than the MFCC in predicting laryngeal cancer. Conclusion : In conclusion, the parameters applied in this study could accurately discriminate even terminal laryngeal cancer in which periodicity is disturbed. It is also thought that future studies on various classification algorithms and on parameters representing the pathophysiology of the vocal cords will make it possible to discriminate benign laryngeal diseases as well as laryngeal cancer.
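
The difference between MFCC and LFCC lies in how the filterbank is spaced along the frequency axis. The sketch below uses one widely used mel formula, mel = 2595 · log10(1 + f / 700); the paper's actual filter count, frequency range, and mel variant are not given in the abstract, so the values and function names here are assumptions.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """One common mel-scale formula (other variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def center_frequencies(n_filters: int, f_max_hz: float, use_mel: bool) -> list[float]:
    """Filter centre frequencies spaced evenly on the mel scale (MFCC-style)
    or evenly in Hz (LFCC-style), between 0 and f_max_hz."""
    if use_mel:
        step = hz_to_mel(f_max_hz) / (n_filters + 1)
        return [mel_to_hz(step * i) for i in range(1, n_filters + 1)]
    step = f_max_hz / (n_filters + 1)
    return [step * i for i in range(1, n_filters + 1)]


# Hypothetical configuration: 3 filters up to 4000 Hz.
print(center_frequencies(3, 4000.0, use_mel=True))
print(center_frequencies(3, 4000.0, use_mel=False))
```

Mel spacing concentrates filters at low frequencies and widens them at high frequencies, whereas linear spacing gives every band equal resolution, which is the scaling step the LFCC omits.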

Effects of Voice Therapy Using Gliding and Humming in Dysphonic Patients With Glottal Gap (활창과 허밍을 이용한 음성치료가 성문틈 환자의 음성 개선에 미치는 효과)

  • Jung, Dae-Yong;Shim, Mi-Ran;Hwang, Yeon-Shin;Kim, Geun-Jeon;Sun, Dong-Il
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.32 no.2 / pp.81-86 / 2021
  • Background and Objectives: Voice therapies for treating glottal gap have been reported previously. However, these therapies were limited because many techniques focused on only one of breathing, resonance, and phonation. In addition, patients often have difficulty visiting the hospital frequently. 'Gliding and humming' is a vocal training technique that readjusts overall vocal patterns, including breathing, resonance, and phonation. This technique can be easily applied during short-term sessions. The purpose of this study is to evaluate the efficiency of voice therapy with 'gliding and humming' for patients with glottic gap during short-term treatment sessions. Materials and Method: Twenty-three patients with glottal gap were selected. Of these, 14 patients had sulcus vocalis and 12 patients had muscle tension dysphonia (MTD). Voice therapy was performed for 1.9 sessions on average. GRBAS, jitter, shimmer, noise-to-harmonic ratio, semitone range, closed quotient_vowel, and maximum phonation time were compared before and after therapy. In addition, changes in glottal gap and MTD severity were evaluated. Results: Statistically significant improvement was observed. MTD improvement was observed only among the patients whose glottal gap improved. The sulcus vocalis group also showed statistically significant improvement. Conclusion: 'Gliding and humming' was effective for patients with glottic gap and sulcus vocalis. Also, among patients who have both glottic gap and MTD, the data suggest that voice therapy for glottic gap also improves MTD.

Comparative Study on Acoustic Characteristics of Vocal Fold Paralysis and Benign Mucosal Disorders of Vocal Fold (성대마비와 양성 성대점막질환의 음향학적 특성비교)

  • Kong, Il-Seung;Cho, Young-Ju;Lee, Myung-Hee;Kim, Jong-Seung;Yang, Yun-Su;Hong, Ki-Hwan
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.18 no.2 / pp.122-128 / 2007
  • This study aims to analyze, from the perspective of acoustic phonetics, the voices of patients with voice disorders including vocal fold paralysis, vocal fold cyst, and vocal nodule/polyp. It also intends to collect supporting acoustic data to aid speech treatment and the standardization of vocal disorders. Subjects and Methods: The subjects of this study were 64 adult patients who underwent indirect laryngoscopy and laryngostroboscopy and were diagnosed with vocal fold paralysis, vocal fold cyst, or vocal nodule/polyp. The experimental group consisted of 20 patients diagnosed with vocal fold paralysis; 21 patients diagnosed with vocal fold cyst, with an average age of 42.0 (±10.03); and 23 patients diagnosed with vocal nodule/polyp, with an average age of 40.9 (±13.75). The patients were asked to sit in a comfortable position with the microphone 10 cm from the mouth and then to phonate the vowel /e/ for the maximum phonation time with natural tone and vocal volume, and the sound was recorded directly on a computer. During recording, the sampling rate was set to 44,100 Hz, and a 1-second stable section of the waveform of the vowel /e/ produced by each patient, excluding the initial and final portions, was analyzed. Results: First, there was no statistically significant difference in jitter and shimmer between vocal fold paralysis and vocal fold cyst, while there was a highly statistically significant difference in them between vocal fold paralysis and vocal nodule/polyp. Second, looking at the mean values of NNE, HNR, and SNR, which are associated with noise ratio, the disease showing the most abnormal characteristics was vocal fold paralysis, followed by cyst and then nodule/polyp. For NNE, there was a statistically significant difference between vocal nodule/polyp and cyst or paralysis.
In other words, it was found that the NNE of vocal nodule/polyp was weaker than that of cyst or paralysis. HNR and SNR showed similar characteristics: there was a statistically significant difference between vocal fold paralysis and vocal fold cyst or nodule/polyp, and the HNR and SNR values of vocal fold paralysis were lower than those of vocal fold cyst or nodule/polyp. Conclusion: For vocal fold paralysis, the abnormal values of acoustic parameters associated with frequency, amplitude, and noise ratio were statistically significantly higher than those of vocal fold cyst and nodule/polyp. This finding suggests that the voices of patients with vocal fold paralysis are the most severely impaired, owing to less stable vocal fold movement, asymmetry, and incomplete glottic closure. In addition, there was no statistically significant difference in the acoustic parameters of tremor among vocal fold paralysis, vocal fold cyst, and vocal nodule/polyp. Further studies need to identify reasonable acoustic parameters for various vocal disorders and to clarify the correlation between acoustics-based objective tools and subjective evaluations.
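
Jitter and shimmer quantify cycle-to-cycle instability in the period and in the amplitude of a sustained vowel. The sketch below uses the common "local" definition (mean absolute difference between consecutive cycles, relative to the mean, in percent). The analysis software and exact variants used in the study are not stated in the abstract, so this definition and the function name are assumptions, and the values are toy numbers.

```python
def local_perturbation_percent(values: list[float]) -> float:
    """Mean absolute difference between consecutive cycles, as a percentage
    of the mean value.

    Pass glottal cycle durations to get local jitter, or per-cycle peak
    amplitudes to get local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Toy cycle durations in milliseconds (not measured data).
steady = [10.0, 10.0, 10.0, 10.0]
unstable = [10.0, 11.0, 10.0, 11.0]
print(local_perturbation_percent(steady))
print(local_perturbation_percent(unstable))
```

A perfectly periodic voice gives 0%, and larger values indicate the kind of frequency or amplitude irregularity that the study compares across diagnoses.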

Characteristics of respiration and phonation depending on smoking or non smoking by practical musicology students and general male students (실용음악전공학생과 일반남학생의 흡연여부에 따른 호흡과 발성 특성 비교)

  • Kim, Eunhye;Choi, Hong-Shik;Lim, Seong-Eun;Choi, Yaelin
    • Phonetics and Speech Sciences / v.6 no.3 / pp.49-56 / 2014
  • This research compared the features of respiration and phonation between practical musicology students and general male students according to their smoking status. The participants were 15 male practical musicology students attending ○○ University and 16 general students at ○○○ University. The participants, both non-smokers and smokers with a 5-year smoking history, had no history of voice disease and had normal cognitive function. The results indicated, first, that there was no notable difference in respiratory function (FVC, FEV1, FEV1/FVC) regardless of major or smoking status. For MPT, there was no significant difference by major, but by smoking status the smoker group had a significantly shorter MPT than the non-smoker group (p<.01). Second, the participants' major made no significant difference in F0, jitter, shimmer, or NHR in the vowel prolongation task. However, the smoker group showed significantly higher jitter and shimmer than the non-smoker group (p<.05), while F0 and NHR showed no difference. For VRP, the maximum frequency and frequency range of the practical group were significantly higher than those of the normal group (p<.001). Moreover, although the difference in minimum frequency was not statistically significant, the practical group tended toward a higher minimum frequency than the normal group (p=.051). In conclusion, even though there is no difference in respiratory function between the smoker and non-smoker groups, the MPT of the smoker group is shorter than that of the non-smoker group. In addition, the smoker group showed higher jitter and shimmer than the non-smoker group. MPT is related to the valve action of the vocal folds on the air that passes through the glottis.
Thus, it is interpreted that the smoker group has lower voice quality and poorer valve action of the vocal folds. Also, the practical group has a higher maximum frequency and a wider frequency range than the normal group. This research can serve as basic data on the vocal characteristics of students in voice-related majors.