• Title/Summary/Keyword: speech features

A Design and Implementation of Online Exhibition Application for Disabled Artists

  • Seung Gyeom Kim;Ha Ram Kang;Tae Hun Kim;Jun Hyeok Lee;Won Joo Lee
    • Journal of the Korea Society of Computer and Information / v.29 no.8 / pp.77-84 / 2024
  • In this paper, we design and implement an online exhibition application, based on the Android platform, that can showcase the artistic works of disabled artists. The application considers user convenience for disabled artists, in particular providing STT (speech-to-text) and TTS (text-to-speech) features for visually and hearing impaired individuals. Additionally, for the exhibition of works by disabled artists, the application implements disability certification during registration using disability certificates and registration numbers, ensuring that only authenticated disabled artists can exhibit their works. The database storing the personal information of disabled artists and information about art pieces is implemented using MySQL. The server module uses a REST API to transmit data in JSON format. To address the large data size of art-piece information, it is stored in Firebase Storage, eliminating data size limitations on the server. This application can alleviate issues such as the lack of exhibition space for disabled artists and the lack of communication with the general public.
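
The data path described above (artwork metadata sent over a REST API as JSON, with large media files kept in Firebase Storage) can be sketched as follows. The record fields and the storage URL are hypothetical illustrations, not taken from the paper:

```python
import json

# Hypothetical artwork record as the server module might transmit it;
# all field names and values are illustrative, not the paper's schema.
artwork = {
    "artist_id": 17,
    "title": "Morning Light",
    "certified": True,  # set after disability-certificate verification at registration
    # Large image data lives in Firebase Storage; only the URL travels in the JSON body.
    "image_url": "https://firebasestorage.googleapis.com/v0/b/app/o/art17.jpg",
}

payload = json.dumps(artwork, ensure_ascii=False)  # JSON body for the REST response
print(payload)
```

A client would fetch the small JSON record from the server and then download the image directly from storage, which is what removes the data-size limitation on the server side.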

Continuous Speech Recognition based on Parametric Trajectory Segmental HMM (모수적 궤적 기반의 분절 HMM을 이용한 연속 음성 인식)

  • 윤영선;오영환
    • The Journal of the Acoustical Society of Korea / v.19 no.3 / pp.35-44 / 2000
  • In this paper, we propose a new trajectory model for characterizing segmental features and their interaction within the general framework of hidden Markov models. Each segment, a sequence of vectors, is represented by a trajectory of observed sequences. This trajectory is obtained by applying a new design matrix that includes transitional information on contiguous frames, and is characterized as a polynomial regression function. To apply the trajectory to the segmental HMM, the frame features are replaced with the trajectory of a given segment. We also propose the likelihood of a given segment and the estimation of trajectory parameters. The observation probability of a given segment is represented as the relation between the segment likelihood and the estimation error of the trajectory. The estimation error of a trajectory is treated as the weight of the likelihood of a given segment in a state; this weight represents the probability of how well the corresponding trajectory characterizes the segment. The proposed model can be regarded as a generalization of both the conventional HMM and the parametric trajectory model. Experimental results are reported on the TIMIT corpus, and performance is shown to improve significantly over that of the conventional HMM.
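
The core trajectory idea can be illustrated with a minimal sketch: fit a polynomial regression trajectory to a segment's frame features and use the fitting residual as the estimation error that weights the segment likelihood. The feature values, the polynomial order, and the residual-based weighting below are illustrative assumptions, not the paper's actual design matrix:

```python
import numpy as np

# Toy segment: T frames of a single acoustic feature (e.g., one cepstral coefficient).
segment = np.array([1.0, 1.8, 2.9, 3.1, 2.8, 1.9])
t = np.linspace(0.0, 1.0, len(segment))   # normalized time within the segment

coeffs = np.polyfit(t, segment, deg=2)    # trajectory parameters via polynomial regression
trajectory = np.polyval(coeffs, t)        # fitted trajectory replaces the frame features
residual = float(np.mean((segment - trajectory) ** 2))  # estimation error of the trajectory

# A smaller residual means the trajectory characterizes the segment well,
# so the segment's likelihood in a state would be weighted more heavily.
print(coeffs, residual)
```

In the actual model the design matrix also carries transitional information from contiguous frames, and the regression is over full feature vectors rather than a single coefficient.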

Classification of nasal places of articulation based on the spectra of adjacent vowels (모음 스펙트럼에 기반한 전후 비자음 조음위치 판별)

  • Jihyeon Yun;Cheoljae Seong
    • Phonetics and Speech Sciences / v.15 no.1 / pp.25-34 / 2023
  • This study examined the utility of the acoustic features of vowels as cues for the place of articulation of Korean nasal consonants. In the acoustic analysis, spectral and temporal parameters were measured at the 25%, 50%, and 75% time points in the vowels neighboring nasal consonants in samples extracted from a spontaneous Korean speech corpus. Using these measurements, linear discriminant analyses were performed and classification accuracies for the nasal place of articulation were estimated. The analyses were applied separately for vowels following and preceding a nasal consonant to compare the effects of progressive and regressive coarticulation in terms of place of articulation. The classification accuracies ranged between approximately 50% and 60%, implying that acoustic measurements of vowel intervals alone are not sufficient to predict or classify the place of articulation of adjacent nasal consonants. However, given that these results were obtained for measurements at the temporal midpoint of vowels, where they are expected to be the least influenced by coarticulation, the present results also suggest the potential of utilizing acoustic measurements of vowels to improve the recognition accuracy of nasal place. Moreover, the classification accuracy for nasal place was higher for vowels preceding the nasal sounds, suggesting the possibility of higher anticipatory coarticulation reflecting the nasal place.
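
A minimal stand-in for the linear discriminant analysis described above, using fabricated F1/F2 measurements at the vowel midpoint for two nasal-place contexts (the study uses more spectral and temporal parameters, three time points, and all Korean nasal places):

```python
import numpy as np

# Fabricated vowel-interval measurements (F1, F2 in Hz) adjacent to two nasal places.
X = np.array([[300., 2200.], [320., 2150.], [310., 2250.],   # /n/ context (alveolar)
              [400., 1100.], [420., 1050.], [390., 1150.]])  # /m/ context (bilabial)
y = np.array([0, 0, 0, 1, 1, 1])

classes = np.unique(y)
means = np.array([X[y == c].mean(axis=0) for c in classes])
# Pooled within-class covariance, shared by all classes under LDA.
pooled = sum(np.cov(X[y == c].T, bias=False) * (np.sum(y == c) - 1) for c in classes)
pooled /= (len(y) - len(classes))
inv = np.linalg.inv(pooled)

def predict(x):
    # Gaussian discriminant with equal priors: maximize the linear score
    # x' S^-1 mu_k - 0.5 mu_k' S^-1 mu_k over classes k.
    scores = [x @ inv @ m - 0.5 * m @ inv @ m for m in means]
    return classes[int(np.argmax(scores))]

accuracy = np.mean([predict(x) == t for x, t in zip(X, y)])
print(accuracy)
```

With real midpoint measurements the separation is far weaker, which is why the reported classification accuracies sit around 50%-60% rather than near ceiling.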

A comparative study of coarticulation features between children with and without reading disability (읽기장애아동과 일반아동의 동시조음 특성 비교)

  • Sungsook Park;Cheoljae Seong
    • Phonetics and Speech Sciences / v.16 no.2 / pp.99-109 / 2024
  • Coarticulation arises from the continuous movement of the articulators, within limited time and space, across neighboring segments and their various overlaps. This study investigated differences in the coarticulation characteristics of children with reading disabilities and nondisabled children in CVC and VCV syllables consisting of stops, affricates, and the vowels (a, i, u). The subjects were 13 children with reading disabilities and nondisabled children in the 2nd to 6th grades of elementary school. Two second-formant (F2) values were measured: one at the point where the vowel began, and the other at the midpoint of the vowel's stable section. Regression analysis was performed on the F2 onset and the F2 of the following vowel to obtain the locus equation (LE). A three-way ANOVA was conducted on the slope of the LE according to group (reading disabilities vs. nondisabled), place of articulation, and phonation type. In CVC syllables, dyslexic children showed a flatter slope than nondisabled children. With respect to place of articulation, velar and bilabial sounds showed steeper LE slopes than alveolar and palatal sounds. For VCV syllables there were no main effects of group or phonation type, and the significant differences in place of articulation also differed from the results for the CVC syllables. This study confirmed that dyslexic children show a different pattern of coarticulation slope depending on syllable structure. We also found that the higher pause rate of the dyslexic children had a stronger effect on coarticulation in VCV structures.
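
The locus-equation computation is a simple linear regression: F2 at the consonant release (F2 onset) is regressed on F2 at the vowel midpoint, and the slope indexes the degree of coarticulation. The formant values below are fabricated for illustration:

```python
import numpy as np

# Fabricated F2 values (Hz) for three vowel contexts after the same stop.
f2_mid   = np.array([2300., 1700., 900.])   # vowel midpoints (/i/, /a/, /u/)
f2_onset = np.array([1900., 1600., 1200.])  # F2 at consonant release

slope, intercept = np.polyfit(f2_mid, f2_onset, deg=1)  # the locus equation
print(slope, intercept)
```

A slope near 1 means the onset tracks the vowel closely (strong coarticulation); a flatter slope, as reported for the dyslexic group in CVC syllables, means the vowel exerts less influence on the consonant transition.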

Research on Classification of Human Emotions Using EEG Signal (뇌파신호를 이용한 감정분류 연구)

  • Zubair, Muhammad;Kim, Jinsul;Yoon, Changwoo
    • Journal of Digital Contents Society / v.19 no.4 / pp.821-827 / 2018
  • Affective computing has gained increasing interest in recent years with the development of potential applications in human-computer interaction (HCI) and healthcare. Although considerable research has been done on human emotion recognition, physiological signals have received less attention than speech and facial expressions. In this paper, electroencephalogram (EEG) signals from different brain regions were investigated using modified wavelet energy features. To minimize redundancy and maximize relevancy among features, the mRMR algorithm was applied. EEG recordings from the publicly available DEAP database were used to classify four classes of emotion with a multi-class support vector machine. The proposed approach shows significant performance gains compared to existing algorithms.
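
Wavelet energy features of the kind used above can be sketched with a one-level Haar transform implemented directly; a real pipeline would use a deeper decomposition (and then mRMR selection over channels and bands), so the single level and the fake signal here are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
eeg = rng.standard_normal(128)                  # fake one-channel EEG segment

# One-level Haar wavelet transform: pairwise sums and differences, normalized.
approx = (eeg[0::2] + eeg[1::2]) / np.sqrt(2)   # low-pass (approximation) band
detail = (eeg[0::2] - eeg[1::2]) / np.sqrt(2)   # high-pass (detail) band

# Band energies serve as the wavelet energy features for the SVM.
features = np.array([np.sum(approx ** 2), np.sum(detail ** 2)])
print(features)
```

Because the Haar transform is orthonormal, the two band energies sum exactly to the signal energy, which makes them a well-behaved partition of the signal across frequency bands.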

Sequence-to-sequence based Morphological Analysis and Part-Of-Speech Tagging for Korean Language with Convolutional Features (Sequence-to-sequence 기반 한국어 형태소 분석 및 품사 태깅)

  • Li, Jianri;Lee, EuiHyeon;Lee, Jong-Hyeok
    • Journal of KIISE / v.44 no.1 / pp.57-62 / 2017
  • Traditional Korean morphological analysis and POS tagging methods usually consist of two steps: 1) generate hypotheses of all possible combinations of morphemes for the given input, and 2) perform POS tagging to search for the optimal result. The first step requires additional resource dictionaries, and its errors can propagate to the second step. In this paper, we tried to solve this problem in an end-to-end fashion using a sequence-to-sequence model with convolutional features. Experimental results on the Sejong corpus show that our approach achieved a 97.15% F1-score at the morpheme level, with 95.33% and 60.62% precision at the word and sentence levels, respectively, and a 96.91% F1-score at the morpheme level, with 95.40% and 60.62% precision at the word and sentence levels, respectively.
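
The end-to-end framing treats the whole task as string transduction: the source is the raw character sequence and the target is the morpheme/tag sequence, so no separate hypothesis-generation step or dictionary is needed. The example pair below shows a plausible Sejong-style output format for illustration only; it is not taken from the paper's data:

```python
# Source: raw Korean sentence; target: morpheme/tag sequence in a
# Sejong-style format (illustrative, not the paper's exact notation).
src = "나는 학교에 간다"
tgt = "나/NP + 는/JX 학교/NNG + 에/JKB 가/VV + ㄴ다/EF"

# A seq2seq model is trained on (src, tgt) pairs; convolutional features
# over the source characters feed the encoder.
pairs = [(src, tgt)]
print(pairs[0][1])
```

Note that the target is longer than the source and contains symbols (like the restored stem 가) not present verbatim in the input, which is exactly why a generative seq2seq model fits this task better than per-character tagging.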

Prediction of Prosodic Break Using Syntactic Relations and Prosodic Features (구문 관계와 운율 특성을 이용한 한국어 운율구 경계 예측)

  • Jung, Young-Im;Cho, Sun-Ho;Yoon, Ae-Sun;Kwon, Hyuk-Chul
    • Korean Journal of Cognitive Science / v.19 no.1 / pp.89-105 / 2008
  • In this paper, we suggest a rule-based system for predicting natural prosodic phrase breaks from Korean texts. For the implementation of the rule-based system, (1) sentence constituents are sub-categorized according to their syntactic functions, (2) syntactic phrases are recognized using the dependency relations among sub-categorized constituents, and (3) rules for predicting prosodic phrase breaks are created. In addition, (4) the length of syntactic phrases and sentences, the position of syntactic phrases in a sentence, and sense information of contextual words are considered to determine variable prosodic phrase breaks. Based on these rules and features, we obtained an accuracy of over 90% in predicting the positions of major breaks and no breaks, which correlate highly with the syntactic structure of the sentence. As for the overall accuracy in predicting all prosodic phrase breaks, the suggested system shows a Break_Correct of 87.18% and a Juncture_Correct of 89.27%, higher than those of other models.
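
A rule-based predictor of this shape can be sketched as a function from syntactic features to a break label. The relation names, the length threshold, and the break labels below are hypothetical stand-ins; the paper's actual rules operate over sub-categorized constituents and dependency relations:

```python
# Hypothetical rule table for prosodic-break prediction; the feature
# names and thresholds are illustrative, not the paper's rules.
def predict_break(relation: str, phrase_len: int) -> str:
    """Return 'major', 'minor', or 'none' for the boundary after a phrase."""
    if relation == "clausal":        # clause boundaries attract major breaks
        return "major"
    if relation == "adnominal":      # a modifier binds tightly to its head noun
        return "none"
    if phrase_len >= 4:              # long phrases tend to be followed by a break
        return "minor"
    return "none"

print(predict_break("clausal", 2))
```

The extra features in step (4), such as phrase position and word sense, would enter as additional arguments refining the choice between `minor` and `none` at ambiguous junctures.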

AN ACOUSTIC ANALYSIS ON THE PRONUNCIATION OF KOREAN VOWELS IN PATIENTS WITH CLASS III MALOCCLUSION (III급 부정교합 환자의 한국어 모음 발음에 관한 음향학적 분석)

  • Kim, Young-Ho;Yoo, Hyun-Ji;Kim, Whi-Young;Hong, Jong-Rak
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons / v.35 no.4 / pp.221-228 / 2009
  • The purpose of this study was to investigate the characteristics of the pronunciation of Korean vowels in patients with class III malocclusion. Eleven adult male patients with class III malocclusion (mean age 22.3 years) and four adult males with normal occlusion (mean age 26.5 years) were selected for the analysis of eight Korean monophthongs /ㅣ, ㅔ, ㅐ, ㅏ, ㅓ, ㅗ, ㅡ, ㅜ/. The values and relationships of F1, F2, and F3 were derived from the stable section of the target vowel in each sentence, and an analysis using formant plots and the distance and area of vowel triangles was conducted to find the features of the two groups' vowel distributions. The pronunciation of male patients with class III malocclusion showed high F1 values in the low vowels, high F2 values in the back vowels, and a remarkably low position of /ㅏ/. The vowel triangles of male patients with class III malocclusion were vertically wider and horizontally narrower than those of males with normal occlusion. These characteristics could reflect structural features of class III malocclusion such as the prognathic mandible, a low tongue position, and advancement of the back of the tongue.
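
The vowel-triangle area comparison reduces to elementary geometry: take the corner vowels in (F1, F2) space and apply the shoelace formula. The formant values below are illustrative, not the paper's measurements:

```python
# Vowel-triangle area from corner vowels in F1-F2 space via the shoelace formula.
def triangle_area(pts):
    (x1, y1), (x2, y2), (x3, y3) = pts
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

corners = [(300.0, 2200.0),   # /i/: (F1, F2) in Hz (fabricated values)
           (800.0, 1300.0),   # /a/
           (350.0, 800.0)]    # /u/
area = triangle_area(corners)
print(area)  # in Hz^2
```

A vertically wider (larger F1 range) and horizontally narrower (smaller F2 range) triangle, as found for the class III group, shifts this area and distorts the triangle's shape even when the area itself changes little.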

Utilizing Korean Ending Boundary Tones for Accurately Recognizing Emotions in Utterances (발화 내 감정의 정밀한 인식을 위한 한국어 문미억양의 활용)

  • Jang In-Chang;Lee Tae-Seung;Park Mikyoung;Kim Tae-Soo;Jang Dong-Sik
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.6C / pp.505-511 / 2005
  • Autonomic machines interacting with humans should be able to perceive states of emotion and attitude through implicit messages in order to obtain voluntary cooperation from their clients. Voice is the easiest and most natural way to exchange human messages. Automatic systems capable of understanding states of emotion and attitude have utilized features based on the pitch and energy of uttered sentences. The performance of existing emotion recognition systems can be further improved with the support of linguistic knowledge that specific tonal sections in a sentence are related to states of emotion and attitude. In this paper, we attempt to improve emotion recognition rates by adopting such linguistic knowledge about Korean ending boundary tones in an automatic system implemented using pitch-related features and multilayer perceptrons. Experiments on a Korean emotional speech database confirm an improvement of 4%.
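
Exploiting the ending boundary tone amounts to adding pitch features computed over the sentence-final region to the utterance-level features fed to the classifier. The fake F0 track and the 20% "ending" window below are illustrative assumptions, not the paper's exact feature set:

```python
import numpy as np

# Fake F0 track (Hz) for one utterance; a rising final contour.
pitch = np.array([180., 185., 190., 200., 220., 240., 230., 210., 250., 270.])

n_end = max(1, len(pitch) // 5)      # final 20% of frames as the boundary-tone region
ending = pitch[-n_end:]

features = np.array([
    pitch.mean(), pitch.std(),       # whole-utterance pitch statistics
    ending.mean(),                   # boundary-tone pitch level
    ending[-1] - ending[0],          # boundary-tone rise/fall (Hz)
])
print(features)  # input vector for the multilayer perceptron
```

The last two components are what the linguistic knowledge contributes: a rising versus falling ending tone carries attitude/emotion information that whole-utterance statistics average away.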

Clinical features and risk factors for missed stroke team activation in cases of acute ischemic stroke in the emergency department

  • Byun, Young-Hoon;Hong, Sung-Youp;Woo, Seon-Hee;Kim, Hyun-Jeong;Jeong, Si-Kyoung
    • Journal of The Korean Society of Emergency Medicine / v.29 no.5 / pp.437-448 / 2018
  • Objective: Acute ischemic stroke (AIS) requires time-dependent reperfusion therapy, and early recognition of AIS is important to patient outcomes. This study was conducted to identify the clinical features and risk factors of AIS patients who are missed during the early stages of diagnosis. Methods: We retrospectively reviewed AIS patients admitted to a hospital through the emergency department. AIS patients were defined as ischemic stroke patients who visited the emergency department within 6 hours of symptom onset. Patients were classified into two groups: an activation group (A group), in which patients were identified as having AIS and the stroke team was activated, and a non-activation group (NA group), for whom the stroke team was not activated. Results: The stroke team was activated for 213 of 262 AIS patients (81.3%) and not activated for the remaining 49 (18.7%). The NA group was younger and had lower initial National Institutes of Health Stroke Scale scores, a lower incidence of previous hypertension, and a greater incidence of cerebellar and cardio-embolic infarcts than the A group. The chief complaints in the A group were traditional stroke symptoms, side weakness (61.0%) and speech disturbance (17.8%), whereas the NA group presented with non-traditional symptoms, dizziness (32.7%) and decreased levels of consciousness (22.4%). Independent factors associated with missed stroke team activation were nystagmus, nausea/vomiting, dizziness, gait disturbance, and general weakness. Conclusion: A high index of suspicion for AIS is required to identify patients with these findings. Education on focused neurological examinations and the development of clinical decision tools that differentiate stroke from non-stroke are needed.