• Title/Summary/Keyword: Speech Recording

Design and Implementation of Home Network Information Appliance Remote Control System Using Voice XML Technology (VoiceXML기술을 이용한 홈네트워크 정보기기 원격 제어 시스템의 설계 및 구현)

  • 이진구;정문상
    • Proceedings of the IEEK Conference / 2002.06a / pp.133-136 / 2002
  • VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. The aim of this work is to control information home appliances using VoiceXML. With VoiceXML, the system can offer users a more convenient interface, and because it is XML-based, it can interoperate with other systems. Based on a study of VoiceXML and OSGi, this paper designs and implements a control architecture for information home appliances.

Design of Emulator using DSP Chip (DSP 칩을 이용한 에뮬레이터 설계)

  • Lee, Dae-Young;Lee, Jae-Hak;Kim, Jin-Min;Kim, Hyoun-Ho;Bae, Hyeon-Deok
    • Proceedings of the KIEE Conference / 1993.07a / pp.453-455 / 1993
  • In this research, a digital signal processing PC board built around TI's TMS320C25 is implemented. The board can perform the following functions: spectrum analysis of speech and repetitive signals, digital filter emulation by convolution, and generation of sinusoidal, rectangular, and other waveforms. In this system, communication between the PC and the DSP board, program downloading to the DSP board, and recording and graphical display of the data acquired and processed on the DSP board are handled by the PC. A parallel interface and buffer memory are used for communication. Data acquisition and computation are carried out on the DSP board. The resulting data are transmitted to the PC and output through a DAC.
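
The "digital filter emulation by convolution" mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' TMS320C25 code: the function name `fir_filter`, the 4-tap moving-average coefficients, and the step input are all assumptions chosen for illustration.

```python
def fir_filter(signal, taps):
    """Filter `signal` with FIR coefficients `taps` by direct convolution.

    Output sample n is sum_k taps[k] * signal[n - k]; samples before the
    start of the signal are treated as zero.
    """
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


# Hypothetical example: a 4-tap moving average applied to a step input.
moving_average = [0.25, 0.25, 0.25, 0.25]
step = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
smoothed = fir_filter(step, moving_average)
print(smoothed)
```

The output ramps up over four samples instead of jumping, which is the low-pass behaviour expected from a moving average.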

Therapeutic Singing on Speech Production Parameters in Head and Neck Cancer Patients: Case Studies (치료적 노래부르기를 통한 두경부암 환자의 말산출 기능 향상 사례)

  • Kim, Ju Hee;Kim, Soo Ji
    • 재활복지 / v.22 no.3 / pp.189-208 / 2018
  • This case study investigated the changes in speech intelligibility of patients with head and neck cancers who participated in a therapeutic singing-based intervention. Three patients received a total of twelve 30-minute individual sessions. The intervention consisted of three steps: movements for relaxing the breathing muscles, vocalization for increasing the range of articulatory movements, and therapeutic singing. To examine the changes in speech intelligibility, voice quality parameters, diadochokinesis (DDK), and the quadrangle vowel space area (VSA) were measured at pre- and posttest. Recordings of each patient reading a written paragraph, transcribed by blinded assessors, were also analyzed. The results showed that all of the patients had positive changes in voice quality, in the rate of repetitive syllable production measured by DDK, and in the articulatory working space measured by VSA. Along with these measured changes, the increases in positive mood and rehabilitation motivation reported by the patients suggest that the therapeutic singing-based intervention could induce meaningful changes in speech intelligibility in patients with head and neck cancers. Given that this study was conducted with a small sample, suggestions for further investigation of the effects of the intervention are also presented.
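
As an illustration of how a quadrangle vowel space area can be computed, the sketch below applies the shoelace formula to the (F1, F2) coordinates of four corner vowels. The vowel set, the formant values, and the function name `vowel_space_area` are hypothetical and are not taken from the study.

```python
def vowel_space_area(corners):
    """Area of the polygon whose vertices are (F1, F2) pairs in Hz.

    Uses the shoelace formula; the vertices must be listed in order
    around the perimeter of the vowel quadrilateral.
    """
    total = 0.0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical mean formants (F1, F2) in Hz, ordered around the perimeter.
corner_vowels = [
    (300, 2300),  # /i/
    (700, 1800),  # /ae/
    (750, 1200),  # /a/
    (350, 900),   # /u/
]
print(vowel_space_area(corner_vowels))  # area in Hz^2
```

A larger area after the intervention would correspond to a wider articulatory working space, which is how the abstract interprets the VSA measure.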

Tube phonation in water for patients with hyperfunctional voice disorders: The effect of tube diameter and water immersion depth on bubble height and maximum phonation time (과기능적 음성장애 환자의 물저항발성: 튜브 직경과 물 깊이가 물거품 높이 및 최대발성지속시간에 미치는 영향)

  • Min Gyeong Kim;Seong Hee Choi;Jong-In Youn
    • Phonetics and Speech Sciences / v.15 no.2 / pp.31-40 / 2023
  • Tube phonation in water is a semi-occluded vocal tract (SOVT) exercise widely used for voice training, in which the patient phonates to produce bubbles while keeping the tube submerged in water. This study aims to investigate the effect of tube diameter and water depth on bubble height and maximum phonation time (MPT) for patients with hyperfunctional voice disorders. Seventeen patients with hyperfunctional voice disorders were asked to bubble with a sustained /u/ using tubes of different inner diameters (5, 7, and 10 mm) at different water depths (4, 7, and 10 cm). A water resistance phonation biofeedback system using a water height sensor was used to record bubble height and MPT. The bubble height changed significantly with the tube diameter, while MPT changed significantly with both the tube diameter and the water depth. Although the wider tube produced significantly lower bubble height for a given depth, relatively consistent bubble height was maintained. For a given tube diameter, the bubble height did not differ significantly with water depth. In addition, MPT decreased significantly with water depth, and a wider tube led to significantly shorter MPT. A water level-driven water resistance biofeedback system provided useful information on bubble characteristics and vocal fold vibration depending on tube diameter and water depth. It can be useful for monitoring breath support during water resistance phonation for patients with hyperfunctional voice disorders.

An Analysis of Short and Long Syllables of Sino-Korean Words Produced by College Students with Kyungsang Dialect (경상방언 대학생들이 발음한 국어 한자어 장단음 분석)

  • Yang, Byunggon
    • Phonetics and Speech Sciences / v.7 no.4 / pp.131-138 / 2015
  • The initial syllables of a pair of Sino-Korean words are generally differentiated in meaning by either short or long duration. They are realized differently depending on the dialect and generation of the speakers. Recent research has reported that the temporal distinction has gradually faded away. The aim of this study is to examine whether college students who speak the Kyungsang dialect make the distinction temporally, using mixed effects models. Thirty students participated in the recording of five pairs of Korean words in clear or casual speaking styles. The author then measured the durations of the initial syllables of the words and made a descriptive analysis of the data, followed by mixed effects models with gender, length, and style as fixed effects and subject and syllable as random effects, and tested their effects on the initial syllable durations. First, the results showed that college students who speak the Kyungsang dialect did not produce the long and short syllables distinctively; there was no statistically significant difference between them. Second, there was a significant difference in the duration of the initial syllables between male and female students. Third, there was also a significant difference in the duration of the initial syllables produced in the clear and casual styles. The author concluded that college students who speak the Kyungsang dialect do not produce long and short Sino-Korean syllables distinctively, and that any statistical analysis of the temporal aspect should be made carefully, considering both fixed and random effects. Further studies would be desirable to examine the production and perception of the initial syllables by speakers of various dialects, generations, and age groups.
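
One way to write down the model described in this abstract is sketched below. It assumes random intercepts only and no interaction terms, neither of which the abstract confirms, and all symbols are introduced here for illustration.

$$
d_{n} = \beta_0 + \beta_1\,\mathrm{gender}_{n} + \beta_2\,\mathrm{length}_{n} + \beta_3\,\mathrm{style}_{n} + u_{s(n)} + w_{y(n)} + \varepsilon_{n},
\qquad u_{s} \sim \mathcal{N}(0, \sigma_u^2),\quad w_{y} \sim \mathcal{N}(0, \sigma_w^2),\quad \varepsilon_{n} \sim \mathcal{N}(0, \sigma^2)
$$

Here $d_n$ is the duration of the initial syllable in observation $n$, $s(n)$ is the subject who produced it, and $y(n)$ is the syllable. Under this reading, the reported absence of a long/short distinction corresponds to $\beta_2$ not differing significantly from zero.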

The Correlation of Voice Characteristics and Depression Index Analysis in Accordance with Menstrual Cycle (월경주기에 따른 우울지수 정도와 음성특성과의 상관관계 분석)

  • Kim, YuMi;Jang, Seoung-Jin;Kim, Eunyeon;Choi, Yaelin
    • Phonetics and Speech Sciences / v.6 no.3 / pp.41-48 / 2014
  • This study investigated the differences in the emotional parameters BDI, VHI, STAI-X-I, and STAI-X-II across the menstrual cycle in women, and the relation between changes in the depression index and voice characteristics (jitter, shimmer, CPP, HNR, $pF0{\cdot}F1{\cdot}F2{\cdot}F3$, sF0, sF4, sB1, $H1_{c/u}$, $A1_u$, $A3_c$, $H1A3_{c/u}$, $H1A1_u$). Twenty-three women ($30{\pm}4.4$ years old) living in Seoul and Gyeonggi Province participated in this study by answering the questionnaires and recording their voices. For the voice recording, the participants sustained the vowel /a/ for 5 seconds in a natural condition. Voice data were analyzed using Matlab and Praat. A t-test and a correlation analysis were conducted using SPSS. The results are as follows. First, the BDI was significantly higher in group I (luteal phase versus the menstrual period) and group II (follicular phase versus the menstrual period) than in group III (luteal phase versus follicular phase) (p<.05). Second, shimmer, CPP, and pF0 showed a statistically high correlation with the BDI in group I (luteal phase versus the menstrual period). Voice parameters may be useful as a supplement in evaluating emotional change across the phases of the menstrual cycle.
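
For readers unfamiliar with two of the voice parameters listed above, the following sketch computes local jitter and local shimmer under the common definition: the mean absolute difference between consecutive glottal periods (or peak amplitudes) divided by the mean period (or amplitude). The input values and the helper name `local_perturbation` are hypothetical. The study itself used Matlab and Praat, whose exact settings are not given in the abstract.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values over the mean value.

    Applied to glottal periods this gives local jitter; applied to
    cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical measurements from four consecutive cycles of a sustained /a/.
periods_s = [0.0050, 0.0051, 0.0049, 0.0050]   # glottal periods in seconds
peak_amplitudes = [0.80, 0.82, 0.79, 0.81]     # arbitrary linear units

jitter_local = local_perturbation(periods_s)
shimmer_local = local_perturbation(peak_amplitudes)
print(f"jitter: {jitter_local:.2%}, shimmer: {shimmer_local:.2%}")
```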

Contents Development of IrobiQ on School Violence Prevention Program for Young Children (지능형 로봇 아이로비큐(IrobiQ)를 활용한 학교폭력 예방 프로그램 개발)

  • Hyun, Eunja;Lee, Hawon;Yeon, Hyemin
    • The Journal of the Korea Contents Association / v.13 no.9 / pp.455-466 / 2013
  • The purpose of this study was to develop a school violence prevention program, "Modujikimi", for young children, to be embedded in IrobiQ, a teacher-assistive robot. The themes of the program were basic character education, bullying prevention education, and sexual violence prevention education. The activity types included large-group, small-group, and individual activities, free-choice activities, and parent education, and they included poems, fairy tales, music, art, and story sharing. The multimodal functions of the robot were employed: on-screen images, TTS (Text To Speech), touch, sound recognition, and a recording system. The robot content was demonstrated to thirty early childhood educators, whose acceptance of the content was measured using questionnaires. The content was also applied to children in a daycare center. The majority responded positively regarding acceptability. The results of this study suggest that further research is needed to improve the two-way interactivity of the teacher-assistive robot.

An Implementation of Travel Information Service Using VoiceXML and GPS (VoiceXML과 GPS를 이용한 여행정보 서비스의 구현)

  • Oh, Jae-Gyu;Kim, Sun-Hyung
    • Journal of the Korea Academia-Industrial cooperation Society / v.8 no.6 / pp.1443-1448 / 2007
  • In this paper, we implement a travel information service based on a distributed computing environment that supports web (internet) and speech interfaces at the same time and makes use of location information, using voice- and web-browser-based VoiceXML together with GPS, in order to overcome the limitations of traditional web-based travel information services. Because the IVR (Interactive Voice Response) systems of traditional call centers operate according to pre-installed scenarios, service takes a long time, and speech must be re-recorded for the revised scenarios whenever the response contents change. In contrast, the proposed VoiceXML- and GPS-based travel information service system is easy to reconfigure, because individual conversation scenarios are written as files (documents) and then uploaded to the server. It can also provide a variety of useful travel information under environmentally restricted conditions such as remote regions, because our prototype finds the user's current location from GPS information and provides travel information services based on it.

Acoustic analysis of wet voice among patients with swallowing disorders (삼킴장애 환자의 wet voice 관련 음향학적 분석)

  • Kang, Young Ae;Koo, Bon Seok;Kwon, In Sun;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.10 no.4 / pp.147-154 / 2018
  • Wet voice quality (WVQ) is a characteristic that appears after swallowing. Although the concept is accepted by many clinicians worldwide, it is nevertheless ambiguous. In this study, we investigated WVQ in patients with swallowing disorders using acoustic analysis. A total of 106 patients diagnosed with penetration-aspiration by a videofluoroscopic swallowing study (VFSS) were recruited. A voice recording of the vowel /a/ was made before and after the VFSS, and an acoustic analysis was then performed using PRAAT. Voices recorded after the VFSS were judged perceptually and divided into two groups: the Wet group (48 patients) and the Non-wet group (58 patients). At the post-VFSS stage, the two groups displayed significant differences in many acoustic parameters, including F0_SD, Jitter, RAP, Shimmer, APQ, HNR, NHR, FUF, DVB, and CPP. Logistic regression identified Jitter and NHR as the parameters affecting the judgment of wetness. At the pre-VFSS stage, the two groups differed significantly in many acoustic parameters, including Intensity, Jitter, RAP, Shimmer, NHR, FUF, DVB, and CPP. Both pre- and post-VFSS, the mean values of all significant parameters, except Intensity, HNR, and CPP, were higher in the Wet group. Between the pre- and post-VFSS stages, the two groups displayed interactions in many parameters (Intensity, F0_SD, Jitter, RAP, Shimmer, APQ, HNR, NHR, FUF, DVB, and CPP). In particular, Intensity increased in both groups after the VFSS, although the increase in the Non-wet group was greater. Based on these results, it was conjectured that the WVQ after swallowing resulted from the secretion effect of the mucous membrane due to the dry laryngeal characteristic of elderly patients, rather than from aspiration leaving food on the vocal cords.

Characteristics of Vowel Formants, Voice Intensity, and Fundamental Frequency of Female with Amyotrophic Lateral Sclerosis using Spectrograms (스펙트로그램을 이용한 근위축성측삭경화증 여성 화자의 모음 포먼트, 음성강도, 기본주파수의 변화)

  • Byeon, Haewon
    • Journal of the Korea Convergence Society / v.10 no.9 / pp.193-198 / 2019
  • This study analyzed the changes in vowel formants, voice intensity, and fundamental frequency over 11 months using acoustic spectrogram analysis of women diagnosed with amyotrophic lateral sclerosis (ALS). The test items were the vowels /a, i, u/ and the diphthong words /h + ja + da/, /h + wi + da/, and /h +ɰi+ da/. Speech data were collected through a word reading task presented on a monitor using the 'Alvin' program, and the recording environment was set to a sampling rate of 11,000 Hz, giving a Nyquist frequency of 5,500 Hz. The recordings were analyzed using spectrograms to measure vowel formants, voice intensity, and fundamental frequency. The analysis showed that fundamental frequency and intensity decreased as ALS progressed, and that the change appeared as a decrease in the formant slope of the diphthongs rather than as a change in the vowel formants. This result suggests that the vowel distortion of ALS due to disease progression is due to the decrease of tongue and jaw co morbidity.
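
The two recording settings quoted in this abstract are linked by the sampling theorem: the Nyquist frequency is half the sampling rate. The short sketch below checks that relation and shows where a component above the Nyquist frequency would alias. The function names and the 7,000 Hz example are illustrative assumptions, not values from the study.

```python
def nyquist_frequency(sampling_rate_hz):
    """Highest frequency representable without aliasing at this rate."""
    return sampling_rate_hz / 2.0


def aliased_frequency(frequency_hz, sampling_rate_hz):
    """Apparent frequency of a pure tone after sampling.

    The tone folds to its distance from the nearest integer multiple
    of the sampling rate.
    """
    nearest_multiple = round(frequency_hz / sampling_rate_hz) * sampling_rate_hz
    return abs(frequency_hz - nearest_multiple)


sampling_rate = 11_000  # Hz, as stated in the abstract
print(nyquist_frequency(sampling_rate))        # 5500.0
print(aliased_frequency(3_000, sampling_rate)) # below Nyquist: unchanged
print(aliased_frequency(7_000, sampling_rate)) # above Nyquist: folds down
```

A 5,500 Hz band is wide enough to cover the first three vowel formants of typical adult speech, which is consistent with the formant analysis the abstract describes.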