• Title/Abstract/Keywords: Speech Tone


Robust Lip Extraction and Tracking of the Mouth Region

  • Min, Duk-Soo;Kim, Jin-Young;Park, Seung-Ho;Kim, Ki-Jung
    • The Institute of Electronics Engineers of Korea (대한전자공학회): Conference Proceedings / 2000 ITC-CSCC -2 / pp.927-930 / 2000
  • Visual features of the lip area play an important role in visual speech information, so the lip area must be located correctly as the region of interest (ROI). In this paper, we propose a robust and fast method for locating the mouth corners, and we define a region of interest at the mouth during speech. The method uses only horizontal and vertical image operators in the mouth area. The search is performed by fitting an ROI template to the image under illumination control. Most lip extraction algorithms depend on image luminosity; we instead use a binary image obtained with a variable threshold that adapts to the illumination condition. To control for these variations, the gray-tone image is converted to a binary image using a threshold obtained through Multiple Linear Regression Analysis (MLRA) over divided 2D spatial regions. The region of interest at the mouth area is thus extracted automatically and is robust to illumination.


The Study on Method of Speech Audiometry and the Korean Word Lists Part 1: Normal Hearing Group (어음청력검사방법과 한국어음청력검사표에 관한 연구)

  • 김기령;김영명;최생이;강대령;심윤주
    • Korean Bronchoesophagological Society (대한기관식도과학회): Conference Proceedings / Program and Abstracts of the 11th Annual Meeting, 1977 / pp.6.1-6 / 1977
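  • Hearing tests in current use include pure-tone audiometry, speech audiometry, self-recording audiometry, impedance audiometry, and various other objective methods; of these, pure-tone audiometry and speech audiometry are, as is well known, the most widely used basic tests. In Korea, however, despite some 20 years of steady research on speech audiometry, there is still no unified speech audiometry word list. As a result, speech reception thresholds and discrimination scores cannot be compared accurately across studies, and because systematic studies of test methods and test materials have also been scarce, it remains unknown which test method and which test material are best suited to speech audiometry for Koreans. With a view to unifying both the standard for Korean speech audiometry word lists and the test procedure, the authors used the five word lists that are in use in Korea and have been submitted to the Korean Audiological Society, compared and analyzed the results obtained from normal-hearing subjects, and reached the findings below. A. Test materials and methods. a. Word lists: 1. Baek Jun-gi (백준기) word list (hereafter "Baek"); 2. So Jin-myeong (소진명) word list (hereafter "So"); 3. Shin Gyu-sik (신규식) word list (hereafter "Shin"); 4. Lee Jong-dam (이종담) word list (hereafter "Lee"); 5. Ham Tae-yeong (함태영) word list (hereafter "Ham"). b. Test methods: 1. ascending method; 2. descending method. c. Speech sources tested: 1. female live voice; 2. male recorded voice; 3. female recorded voice. B. Conclusions. 1. In every test, the ascending method yielded lower values than the descending method and thus gave better results. 2. Speech reception thresholds and discrimination scores were lower with the female live voice than with the male recorded voice, but higher than with the female recorded voice. 3. Speech reception thresholds were lowest in the order "Ham", "Lee", "So", "Shin", "Baek". 4. Discrimination scores followed the order "Shin", "Lee", "So", "Ham", "Baek". 5. In view of these findings, the authors urge that appropriate measures, such as setting reference values for speech audiometry and completing a standard speech audiometry word list, be taken in Korea as soon as possible.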


Electromyographic evidence for a gestural-overlap analysis of vowel devoicing in Korean

  • Jun, Sun-A;Beckman, M.;Niimi, Seiji;Tiede, Mark
    • Speech Sciences (음성과학) / Vol. 1 / pp.153-200 / 1997
  • In languages such as Japanese, it is very common to observe that short peripheral vowels are completely voiceless when surrounded by voiceless consonants. This phenomenon has also been reported in Montreal French, Shanghai Chinese, Greek, and Korean. Traditionally it has been described as a phonological rule that either categorically deletes the vowel or changes the [+voice] feature of the vowel to [-voice]. This analysis was supported by the observation of Sawashima (1971) and Hirose (1971) that there are two distinct EMG patterns for voiced and devoiced vowels in Japanese. Close examination of the phonetic evidence based on acoustic data, however, shows that these phonological characterizations are not tenable (Jun & Beckman 1993, 1994). In this paper, we examined the vowel devoicing phenomenon in Korean using data from EMG, fiberscopic, and acoustic recordings of 100 sentences produced by one Korean speaker. The results show that there is variability in the 'degree of devoicing' in both acoustic and EMG signals, and in the patterns of glottal closing and opening across different devoiced tokens. There seems to be no categorical difference between devoiced and voiced tokens, for either EMG activity events or glottal patterns. All of these observations support the notion that vowel devoicing in Korean cannot be described as the result of the application of a phonological rule. Rather, devoicing seems to be a highly variable 'phonetic' process, a more or less subtle variation in the specification of such phonetic metrics as degree and timing of glottal opening, or of associated subglottal pressure or intra-oral airflow associated with concurrent tone and stricture specifications. Some of the token-pair comparisons are amenable to an explanation in terms of gestural overlap and undershoot. However, the effect of gestural timing on vocal fold state seems to be a highly nonlinear function of the interaction among specifications for the relative timing of glottal adduction and abduction gestures, of the amplitudes of the overlapped gestures, of aerodynamic conditions created by concurrent oral tonal gestures, and so on. In summary, to understand devoicing, it will be necessary to examine its effect on the phonetic representation of events in many parts of the vocal tract, and at many stages of the speech chain between the motor intent and the acoustic signal that reaches the hearer's ear.


Design of Wideband Speech Coder Using the MLT Residual Signal (MLT 여기신호를 이용한 광대역 음성 부호화기 설계)

  • 오연선;신재현;이인성
    • The Journal of the Acoustical Society of Korea (한국음향학회지) / Vol. 24, No. 5 / pp.248-254 / 2005
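  • This paper proposes the structure of a split-band wideband speech coder and a new high-band structure for improving speech quality. With band splitting, wideband speech is divided into low-band (0 to 4 kHz) and high-band (4 to 8 kHz) speech, which are coded independently of each other using G.729E and an MLT (Modulated Lapped Transform) excitation model, respectively. In the high band, which is coded at a low rate of 4 kbps, voiced and unvoiced speech are distinguished so that the MLT excitation model can be used efficiently, and for voiced speech an MLT peak-picking method based on the low-band pitch period is applied. That is, the MLT-transformed excitation signal appears as a periodic signal with periodic peaks, and these peak values are extracted, quantized, and transmitted. For unvoiced speech, bits are allocated differently according to the energy value, and an MLT vector quantization method weighted by the linear-prediction spectral response is applied. The performance of the proposed 15.8 kbps wideband speech coder was evaluated with a preference test as a subjective quality assessment.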
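The peak-picking step described for voiced high-band frames can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' coder: the function names, the plain list of transform coefficients, the 8 kHz low-band sampling rate, and the rule that converts a pitch period into a peak spacing in coefficient bins are all hypothetical choices made for the example.

```python
def harmonic_spacing_bins(pitch_period, n_coeffs, sample_rate=8000.0, band_hz=4000.0):
    """Estimate the spacing between harmonic peaks, in coefficient bins.

    Assumption: the pitch period is given in samples at `sample_rate`, and
    `n_coeffs` transform coefficients evenly cover a band `band_hz` wide.
    """
    f0 = sample_rate / pitch_period      # fundamental frequency in Hz
    bin_width = band_hz / n_coeffs       # Hz covered by one coefficient
    return max(1, round(f0 / bin_width))


def pick_periodic_peaks(coeffs, spacing):
    """Keep the largest-magnitude coefficient in each window of `spacing` bins.

    Returns a list of (index, value) pairs, one per window. Only these
    values would then be quantized and transmitted.
    """
    if spacing < 1:
        raise ValueError("spacing must be at least 1 bin")
    peaks = []
    for start in range(0, len(coeffs), spacing):
        window = coeffs[start:start + spacing]
        offset = max(range(len(window)), key=lambda i: abs(window[i]))
        peaks.append((start + offset, window[offset]))
    return peaks
```

A real coder would also have to handle pitch estimation errors and peak positions that drift between windows; the fixed-window search here ignores both.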

Performance of analysis and extraction of speech feature using characteristics of basilar membrane (기저막 특성을 이용한 새로운 음성 특징 추출 및 성능 분석)

  • 이철희;신유식;정성환;김종교
    • The Institute of Electronics Engineers of Korea (대한전자공학회): Conference Proceedings / Proceedings of the 13th Joint Conference on Signal Processing, 2000 / pp.153-156 / 2000
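  • Among the various approaches to improving speech recognition rates, this paper presents one method for extracting speech feature parameters. We present MFCC (mel frequency cepstrum coefficients), which are based on auditory characteristics, and GFCC (gammatone filter frequency cepstrum coefficients) as a way of improving performance, and we analyze their performance by carrying out speech recognition. Instead of the triangular filter generally used as the critical-band filter in MFCC, feature parameters were extracted with a gammatone band-pass filter that models the characteristics of the basilar membrane of the auditory system. When recognition rates were analyzed with the DTW algorithm, extraction with the gammatone band-pass filter improved performance by about 2 to 3% over extraction with the triangular band filter.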
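The abstract evaluates the features with DTW (dynamic time warping) but gives no implementation details. A minimal sketch of textbook DTW template matching is shown below; the function names, the Euclidean frame distance, the unconstrained warping path, and the nearest-template decision rule are assumptions for illustration, and the feature frames could equally be MFCC or GFCC vectors.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature frames.

    Each frame is a list of floats of equal length. The local cost is the
    Euclidean distance; allowed steps are match, insertion, and deletion.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = local + min(cost[i - 1][j],      # deletion
                                     cost[i][j - 1],      # insertion
                                     cost[i - 1][j - 1])  # match
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the reference template closest to `query` under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))
```

Comparing two feature types in this setup means running the same `recognize` loop twice, once per feature extractor, and comparing the resulting recognition rates.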


An Experimental Phonetic Study on the Rhythm of Daegu and Standard Korean --Focusing on Duration-- (표준어와 대구말의 리듬에 관한 실험음성학적 연구 -길이를 중심으로-)

  • 조운일
    • The Korean Society of Phonetic Sciences and Speech Technology (대한음성학회): Conference Proceedings / Proceedings of the February 1996 Conference / pp.179-182 / 1996
  • This thesis compares the durational aspect of the Daegu dialect with that of standard Korean. One of the aims of the earlier study on the rhythm of standard Korean was to compare it with dialects, and this thesis is the first attempt to do that. The thesis proceeds as follows. After the Introduction, Chapter 2 surveys the earlier study, Chapter 3 deals with the materials, method, and results of the experiment, and Chapter 4 analyzes and interprets those results. In the Conclusion, the most prominent fact is that the results of the experiment do not match the expectations of Daegu dialect speakers. The Daegu dialect is generally considered a 'tone language.' Because Daegu dialect speakers are sensitive to pitch, they believe that they say the syllables between pitch-stressed syllables quickly, whereas standard Korean speakers say those syllables relatively slowly. In this experiment, which deals only with duration and ignores pitch, that assumption is shown to be false.


Categorical Perception in intonation

  • Lee, Ho-Young
    • The Korean Society of Phonetic Sciences and Speech Technology (대한음성학회): Conference Proceedings / Proceedings of the November 2002 Conference / pp.86-89 / 2002
  • According to Pierrehumbert (1980), two level tones, H and L, are sufficient to represent the intonation of intonational languages. But in Korean, high fall and low fall boundary tones, both of which must be represented as HL% in intonational phonology as in Jun (1993, 1999), are distinct not only acoustically but also functionally. The same is true of high level and mid level boundary tones, which must be represented as H% in intonational phonology. In this paper, I conducted two identification tests to provide crucial evidence that H and L are not sufficient in intonational phonology. The results of the identification tests show that categorical perception occurs between high level and low level as well as between high fall and low fall. Based on this fact and the results of the acoustic analyses in Lee (1999, 2000), I strongly propose adopting one more level tone, M, to represent Korean boundary tones.


A Development of CDMA based Robot Remote Controller (CDMA 음성 통신 및 데이터 통신을 이용한 로봇 원격제어기 개발)

  • 김우식;윤수정;김응석
    • The Korean Institute of Electrical Engineers (대한전기학회): Conference Proceedings / Proceedings of the 36th Summer Conference, 2005, Part D / pp.2762-2764 / 2005
  • In this paper, we study the design of a robot controller that uses voice and data communication over a CDMA (Code Division Multiple Access) mobile communication network. We design the robot remote controller using three methods over the CDMA network: speech recognition during a telephone call, DTMF (Dual-Tone Multi-Frequency) signaling, and SMS (Short Message Service) transmission and reception. We investigate the validity and effectiveness of the proposed remote controller when applied to a mobile robot.
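The abstract does not say how the DTMF commands are decoded. One common way to detect a DTMF key in software is the Goertzel algorithm, sketched below as an illustration rather than as the authors' implementation. The row and column frequencies are the standard DTMF values; the function names, the 8 kHz sampling rate, and the simple strongest-tone decision rule (with no noise or twist checks) are assumptions.

```python
import math

# Standard DTMF row (low) and column (high) frequencies in Hz.
LOW_FREQS = (697, 770, 852, 941)
HIGH_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, sample_rate):
    """Signal power at a single frequency, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples, sample_rate=8000):
    """Pick the strongest row tone and column tone and map them to a key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, LOW_FREQS[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, HIGH_FREQS[i], sample_rate))
    return KEYPAD[row][col]
```

In a remote controller of this kind, each detected key could then be mapped to a robot command such as forward, stop, or turn; that mapping is application-specific and is not given in the abstract.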


An Experimental Phonetic Study on the Rhythm of Daegu and Standard Korean --Focusing on Duration-- (대구말과 표준말 리듬의 실험음성학적 비교연구 --길이(duration)를 중심으로--)

  • 조운일
    • Journal of the Korean Society of Phonetic Sciences and Speech Technology: Malsori (대한음성학회지:말소리) / No. 27-28 / pp.89-109 / 1994
  • This thesis compares the durational aspect of the Daegu dialect with that of standard Korean. One of the aims of the earlier study on the rhythm of standard Korean was to compare it with dialects, and this thesis is the first attempt to do that. The thesis proceeds as follows. After the Introduction, Chapter 2 surveys the earlier study. Chapter 3 deals with the materials, method, and results of the experiment. Chapter 4 analyzes and interprets those results. In the Conclusion, the most prominent fact is that the results of the experiment do not match the expectations of Daegu dialect speakers. The Daegu dialect is generally considered a 'tone language.' Because Daegu dialect speakers are sensitive to pitch, they believe that they say the syllables between pitch-stressed syllables quickly, whereas standard Korean speakers say those syllables relatively slowly. In this experiment, which deals only with duration and ignores pitch, that assumption is shown to be false.


Korean English Learners' Prosodic Disambiguation in English Relative Clause Attachment (한국인 영어 학습자의 영어 관계절 모호성 해소의 운율적 전략)

  • 전윤실;신지영;김기호
    • The Korean Society of Phonetic Sciences and Speech Technology (대한음성학회): Conference Proceedings / Proceedings of the 2006 Spring Conference / pp.67-70 / 2006
  • Prosody can be used to resolve the syntactic ambiguity of a sentence. The English relative clause construction with a complex NP (the sequence N1, N2, and RC) is syntactically ambiguous: the clause can be interpreted as modifying N1 (high attachment) or N2 (low attachment). Speakers and listeners can disambiguate such sentences on the basis of prosody. In this paper, we investigate Korean English learners' production of the prosodic structure of the English relative clause construction. The production experiment shows that beginning learners frequently use phrasing, while advanced learners depend on both phrasing and accent. One characteristic of the Korean English learners' intonation is that the Korean accentual phrase tone pattern LHa is transferred to their production.
