• Title/Summary/Keyword: Voice function

Search results: 434 items, processing time 0.026 s

A Computer Access System for the Physically Disabled Using Eye-Tracking and Speech Recognition

  • 곽성은;김이삭;심드보라;이승환;황성수
    • 한국HCI학회논문지
    • /
    • Vol. 12, No. 4
    • /
    • pp.5-15
    • /
    • 2017
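  • Alternative computer access devices are one way to meet the desire of people with physical disabilities to participate in social activities, and the need for them is growing with the development of information and communication technology. Most of these devices provide access to the computer through the foot, the head, or similar body parts, but given the characteristics of people with physical disabilities, controlling a mouse with the foot or head is not easy and is limited in speed and accuracy. This paper proposes an 'alternative computer access system for the physically disabled' that compensates for the limitations of existing alternative access devices. The proposed system uses eye-tracking technology to move the mouse with the user's gaze alone, allows mouse clicks through an external button that is relatively easy to press, and enables fast and easy text input through speech recognition. It also provides detailed functions such as right-click, double-click, drag, an on-screen keyboard, internet functions, and scrolling, so that users can perform most of the tasks a computer offers.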

Long-Term Experiences of the Provox Voice Prosthesis at SNUH

  • 정영호;박준범;원태빈;이승신;모지훈;박석원;성명훈;김광현
    • 대한기관식도과학회지
    • /
    • Vol. 5, No. 2
    • /
    • pp.130-137
    • /
    • 1999
  • Background and Objectives : Provox, a recently developed tracheoesophageal prosthesis, has been widely used for voice rehabilitation after total laryngectomy because of its low resistance and ease of speech. However, long-term use of Provox resulted in many complications and led to its discontinuation as a primary method of vocal rehabilitation. The aim of this study is to report Provox-related problems and the long-term results of the Provox voice prosthesis. Materials and Methods : Medical records of patients who had undergone total laryngectomy with Provox insertion at Seoul National University Hospital between January 1993 and December 1998 were reviewed retrospectively. Results : 36 patients used 79 Provox voice prostheses during the observed period. The most common complication causing prosthesis change or removal was leakage and/or aspiration, followed by granulation formation, crusting and/or obstruction, and non-function. The median in situ lifetime of Provox was 274 days and the 1-year in situ rate was 31.0%. Among the 36 patients, 17 had undergone tracheoesophageal shunt closure by the last follow-up visit, 10 had complications but got along without further treatment, and 1 changed to a Blom-Singer voice prosthesis. Only 8 patients experienced no complications, 5 of whom had undergone several Provox changes. Conclusion : Long-term use of Provox resulted in discontinuation of its use due to complications in many cases. A better voice prosthesis with a lower complication rate and a longer in situ lifetime is needed.


Performance Analysis of a Packet Voice Multiplexer Using the Overload Control Strategy by Bit Dropping

  • 우준석;은종관
    • 한국통신학회논문지
    • /
    • Vol. 18, No. 1
    • /
    • pp.110-122
    • /
    • 1993
  • When voice is transmitted over a packet network, the packets generated by individual calls are multiplexed by a statistical multiplexer, which requires overload control. Overload control schemes are closely tied to how the voice traffic is coded, and they have been studied extensively. CCITT recently drafted Recommendation G.764 on a packetized voice protocol, which notes that bit dropping provides very good overload control when embedded coding is used. Accordingly, this paper analyzes the performance of a packet multiplexer that uses bit-dropping overload control when voice is coded with embedded ADPCM and transmitted according to CCITT Recommendation G.764. The analysis first requires a mathematical model that accurately captures the strong correlation among the interarrival times of packets in the superposed packet arrival process at the multiplexer. This paper models the arrival process with a Markov-modulated Poisson process (MMPP), which represents it more accurately than a Poisson process or a birth-and-death process. The performance analysis is therefore similar to the analysis of an MMPP/G/1 queueing system, except that the service time distribution depends on the system state, a case not analyzed in earlier papers. The analysis yields the Z-transform of the distribution of the number of packets waiting for service in the queue, from which the mean and standard deviation of the queue length and waiting time at an arbitrary time are obtained. The results confirm that overload control by bit dropping allows the statistical multiplexer to accommodate far more calls than without overload control, without greatly degrading voice quality. The mean and standard deviation of the queue length and waiting time just after a packet departure were also found to be very close to those at an arbitrary time. Because analysis at an arbitrary time is very complicated in most cases, as it is in this paper, this fact can greatly simplify performance analysis.

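The Markov-modulated Poisson process mentioned in the abstract above can be illustrated with a small simulation. The following Python sketch generates arrival times from a two-state MMPP; it is not the paper's analytical MMPP/G/1 model, and the function name `simulate_mmpp`, the two-state restriction, and all rate values are assumptions chosen for illustration.

```python
import random


def simulate_mmpp(rates, switch_rates, horizon, seed=0):
    """Return sorted arrival times of a two-state MMPP on [0, horizon).

    rates[i]        -- Poisson arrival rate while the chain is in state i (> 0)
    switch_rates[i] -- rate of leaving state i (> 0)
    All values here are hypothetical; nothing is taken from the paper.
    """
    rng = random.Random(seed)
    t, state, arrivals = 0.0, 0, []
    while t < horizon:
        # The sojourn time in the current state is exponentially distributed.
        end = min(t + rng.expovariate(switch_rates[state]), horizon)
        # During the sojourn, arrivals form a Poisson process with the
        # state's rate. Restarting the exponential clock at a state change
        # is valid because the exponential distribution is memoryless.
        s = t
        while True:
            s += rng.expovariate(rates[state])
            if s >= end:
                break
            arrivals.append(s)
        t = end
        state = 1 - state  # switch to the other state
    return arrivals


if __name__ == "__main__":
    times = simulate_mmpp(rates=(50.0, 5.0), switch_rates=(1.0, 0.5), horizon=100.0)
    print(len(times), "arrivals generated")
```

Because the arrival rate changes with the hidden state, successive interarrival times are correlated, which is the property a plain Poisson process cannot represent.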

Differentiation of Adductor-Type Spasmodic Dysphonia from Muscle Tension Dysphonia Using Spectrogram

  • 노승호;김소연;조재경;이상혁;진성민
    • 대한후두음성언어의학회지
    • /
    • Vol. 28, No. 2
    • /
    • pp.100-105
    • /
    • 2017
  • Background and Objectives : Adductor-type spasmodic dysphonia (ADSD) is a neurogenic disorder and a focal laryngeal dystonia, while muscle tension dysphonia (MTD) is caused by a functional voice disorder. Both ADSD and MTD may be associated with excessive supraglottic contraction and compensation, resulting in a strained voice quality with spastic voice breaks. The aim of this study was to determine the utility of spectrogram analysis in differentiating ADSD from MTD. Materials and Methods : From 2015 through 2017, 17 patients with ADSD and 20 with MTD who underwent acoustic recording and phonatory function studies were enrolled. Jitter (frequency perturbation) and shimmer (amplitude perturbation) were obtained using MDVP (Multi-dimensional Voice Program), and the GRBAS scale was used for perceptual evaluation. Two speech therapists evaluated wide-band (11,250 Hz) spectrograms in a blind test, rating four spectral findings on a 4-point scale (0-3): abrupt voice breaks, irregular wide-spaced vertical striations, well-defined formants, and high-frequency spectral noise. Results : Jitter, shimmer, and GRBAS did not differ between the two groups and showed no significant correlation (p>0.05). Abrupt voice breaks and irregular wide-spaced vertical striations were significantly higher in ADSD than in MTD, with strong correlation (p<0.01). High-frequency spectral noise was higher in MTD than in ADSD, with strong correlation (p<0.01). Well-defined formants did not differ between the two groups. Conclusion : Wide-band spectrograms provide visual perceptual information that can differentiate ADSD from MTD. Spectrogram analysis is a useful diagnostic tool for differentiating ADSD from MTD where perceptual analysis and clinical evaluation alone are insufficient.

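The abstract above reports jitter (frequency perturbation) and shimmer (amplitude perturbation) obtained with MDVP. As a rough illustration of what such measures capture, the following Python sketch computes a simple cycle-to-cycle perturbation percentage from a list of glottal cycle periods or peak amplitudes. The function name `local_perturbation` and the sample values are assumptions made for illustration; MDVP's own algorithms are not reproduced here.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, as a percentage of the mean.

    Pass cycle periods (seconds) for a jitter-like measure, or per-cycle
    peak amplitudes for a shimmer-like measure.
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    # Absolute difference between each pair of consecutive cycles.
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


if __name__ == "__main__":
    periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]  # hypothetical cycle periods (s)
    amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]          # hypothetical peak amplitudes
    print("jitter-like (%):", local_perturbation(periods))
    print("shimmer-like (%):", local_perturbation(amplitudes))
```

A perfectly periodic signal gives 0%, and larger values indicate more irregular vibration.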

Implementation of the Voice Conversion in the Text-to-speech System

  • 황철규;김형순
    • 한국음향학회:학술대회논문집
    • /
    • Proceedings of the 1999 Acoustical Society of Korea Conference, Vol. 18, No. 2
    • /
    • pp.33-36
    • /
    • 1999
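  • To overcome the monotonous synthetic speech of conventional text-to-speech (TTS) synthesis, which is tied to a predetermined speaker, this paper implements a voice conversion function that can express the timbre of an arbitrary speaker. The implemented method models the speaker's acoustic space with a Gaussian mixture model (GMM), enabling voice conversion based on a continuous probability distribution. Using the joint density function of the feature vectors of the source and target speakers, a conversion function was derived that minimizes the squared error between the target speaker's acoustic feature vectors and the converted vectors, and the spectral envelope was converted by vector mapping with this function. For prosody conversion, the speech signal was modeled with a sinusoidal model, and the analyzed prosodic information (pitch, duration) was converted taking average values into account. For performance evaluation, a VQ mapping method was also implemented, and the two methods were compared using normalized cepstral distance. For synthesis, an ABS-OLA-based sinusoidal modeling method was adopted, which produced natural-sounding synthetic speech.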

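The abstract above describes a conversion function derived from a joint-density GMM that minimizes the squared error against the target speaker's vectors. One common form of such a function, given here as a sketch and not necessarily the exact form used by the authors, is the conditional expectation of the target vector $\mathbf{y}$ given the source vector $\mathbf{x}$. The symbols $M$ (number of mixture components), $\alpha_i$ (mixture weights), $\boldsymbol{\mu}_i$ (means), and $\boldsymbol{\Sigma}_i$ (covariance blocks) are notation assumed for illustration:

$$
F(\mathbf{x}) = \sum_{i=1}^{M} p_i(\mathbf{x}) \left[ \boldsymbol{\mu}_i^{y} + \boldsymbol{\Sigma}_i^{yx} \left( \boldsymbol{\Sigma}_i^{xx} \right)^{-1} \left( \mathbf{x} - \boldsymbol{\mu}_i^{x} \right) \right],
\qquad
p_i(\mathbf{x}) = \frac{\alpha_i \, \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_i^{x}, \boldsymbol{\Sigma}_i^{xx}\right)}{\sum_{j=1}^{M} \alpha_j \, \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_j^{x}, \boldsymbol{\Sigma}_j^{xx}\right)}
$$

Here $p_i(\mathbf{x})$ is the posterior probability that $\mathbf{x}$ belongs to component $i$, so each source vector is mapped by a soft, probability-weighted blend of per-component linear regressions, in contrast to the hard codebook assignment of VQ mapping.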

A Study on Annoyance of Interior Noise on Town-Bus

  • Park, Hyung Woo;Kim, Sung Han
    • International Journal of Internet, Broadcasting and Communication
    • /
    • Vol. 9, No. 2
    • /
    • pp.42-47
    • /
    • 2017
  • Cities are growing in size and their functions are becoming more complex, and many people live in them. Urban life is increasingly close to and linked with neighboring people in many respects. In particular, people are exposed to one another while using public transportation, even if they are not aware of it. Seoul is the most crowded place in Korea, and village buses serve its narrow streets. People who use these buses seek ride comfort, good air quality, and low noise inside the vehicle. In this paper, we determine the degree of annoyance caused by noise inside the town bus on a dB scale, and we examine the effect of annoyance in such situations. Interior noise did not differ greatly between new and old vehicles. However, annoyance was confirmed to differ according to the skill of the bus driver.

$H_{\infty}$ Control on the Optical Image Stabilizer Mechanism in Mobile Phone Cameras

  • 이치범
    • 한국생산제조학회지
    • /
    • Vol. 23, No. 3
    • /
    • pp.266-272
    • /
    • 2014
  • This study proposes a closed-loop shaping control method with $H_{\infty}$ optimization for optical image stabilization (OIS) in mobile phone cameras. The image stabilizer is composed of a horizontal stage constrained by ball bearings and actuated by the magnetic force from voice coil motors. The displacement of the stage is measured by Hall effect sensors. From the OIS frequency response experiment, the transfer function models of the stage and Hall effect sensor were identified. The weight functions were determined considering the tracking performance, noise attenuation, and stability with considerable margins. The $H_{\infty}$ optimal controller was executed using closed-loop shaping and limiting the controller order, which should be less than 6 for real-time implementation. The control algorithm was verified experimentally and proved to operate as designed.

Glottal Area and Voice Onset Time

  • Kim, Dae-Won
    • 대한음성학회지:말소리
    • /
    • No. 15_18
    • /
    • pp.19-34
    • /
    • 1989
  • There is general agreement that voice onset time (VOT) is functionally related to the glottal opening at the moment of the oral release of a stop. However, systematic investigations of tempo and the place of articulation as factors affecting the glottal opening and VOT have been relatively neglected. Various instrumental techniques were used to verify the claim with BrEng and Korean speakers under controlled experimental conditions, tempo being one of them. It was found that voiceless aspiration (i.e. VOT) is not simply a function of the glottal area at the moment of the oral release of a stop, as it is normally defined in the existing literature. Within a given place of articulation and across tempo, VOT was generally insignificantly related to the glottal area. It is inferred that the glottal adduction onset time for the following vowel is actively controlled by the speaker to meet aerodynamic requirements in relation to class (i.e. aspirated and unaspirated) and tempo. Some possible underlying physiological mechanisms for various phonetic aspects of intervocalic stops, associated with the glottal area and VOT, are discussed.


Clinical Evaluation of 3 Patients with Paradoxical Vocal Cord Movement

  • 최선명;임길채;한광우;남순열
    • 대한기관식도과학회지
    • /
    • Vol. 9, No. 1
    • /
    • pp.83-86
    • /
    • 2003
  • Background and Objectives : Paradoxical vocal cord movement is a series of paroxysmal adductions of the anterior two-thirds of the vocal cords during respiration or phonation. The choking, stridor, and wheezing in this condition occur primarily on inhalation rather than on exhalation. The two pathognomonic diagnostic criteria that need to be assessed during an acute presentation are laryngoscopy with direct visualization of paradoxical adduction of the vocal cords and pulmonary function testing. Materials and Methods : A retrospective review was conducted of 3 patients who were referred to an otolaryngologist from the pulmonology department and whose diagnosis was confirmed by typical laryngoscopic findings of paradoxical adduction of the vocal cords. Results : The patients had been misdiagnosed with exercise-induced asthma and were unresponsive to corticosteroids and bronchodilators. Improvement was achieved only after the diagnosis of paradoxical vocal cord movement. Biofeedback therapy, voice therapy, and treatment for reflux laryngitis improved symptoms. Conclusion : The etiology of paradoxical vocal cord movement is unknown. It may be functional or emotional. The functional factors that have been proposed are neurologic deficit and gastroesophageal reflux. Management of this condition consists of psychological counselling, voice therapy, and antireflux medication.


The Effect of Noise on the Normal and Pathological Voice

  • 홍기환;양윤수;김현기
    • 음성과학
    • /
    • Vol. 9, No. 4
    • /
    • pp.27-38
    • /
    • 2002
  • The purpose of this article is to present the acoustic parameters (VOT, jitter, shimmer, vF0, vAm, NHR, SPI, VTI, DVB, DSH) for consonants (/pipi/, /$p^{h}ip^{h}i$/, /p'ip'i/) and sustained vowels (/a/, /e/, /i/) produced by normal subjects and dysphonia patients at two levels of vocal effort (normal, high), with the high level induced by the Lombard effect using 60 dB white noise. The Lombard effect refers to the increase in vocal effort in a noisy situation. At normal vocal effort, the acoustic parameter values of patients are in general greater than those of normal subjects. In a noisy situation, a significant decrease in acoustic values is seen in normal subjects compared with dysphonia patients. The clinical implication of this finding is that vocal quality in dysphonia is not compensated by vocal effort as well as it is in normal subjects, because of the inefficiency caused by abnormal vocal fold appearance and function. With this result, we can counsel patients that voice quality cannot be improved as much as they expect.
