• Title/Summary/Keyword: Speech signals


Discriminative Feature Vector Selection for Emotion Classification Based on Speech (음성신호기반의 감정분석을 위한 특징벡터 선택)

  • Choi, Ha-Na;Byun, Sung-Woo;Lee, Seok-Pil
    • The Transactions of The Korean Institute of Electrical Engineers / v.64 no.9 / pp.1363-1368 / 2015
  • Advances in computing technology have made computers smaller, and many wearable devices have appeared. As a result, the ability of computers to recognize human emotion has become important, and research on analyzing emotional states is increasing. The human voice carries a great deal of information about emotion. This paper proposes a discriminative feature vector selection method for emotion classification based on speech. To this end, we extract feature vectors such as pitch, MFCC, LPC, and LPCC from voice signals divided into four emotion classes (happy, normal, sad, and angry) and compare the separability of the extracted feature vectors using the Bhattacharyya distance. On this basis, the more effective feature vectors are recommended for emotion classification.
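
The separability comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes each class is modeled as a univariate Gaussian over a single scalar feature, which is a simplification and not necessarily the setup used in the paper. The function name and the pitch values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def bhattacharyya_distance(a, b):
    """Bhattacharyya distance between two 1-D samples, each modeled as a Gaussian."""
    mu_a, mu_b = fmean(a), fmean(b)
    var_a, var_b = pvariance(a, mu_a), pvariance(b, mu_b)
    # Term driven by the separation of the class means.
    mean_term = 0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
    # Term driven by the mismatch of the class variances.
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


# Hypothetical per-utterance mean pitch values (Hz), for illustration only.
pitch_angry = [250.0, 262.0, 241.0, 255.0, 270.0]
pitch_sad = [180.0, 175.0, 190.0, 185.0, 172.0]
print(bhattacharyya_distance(pitch_angry, pitch_sad))
```

A larger distance suggests that the feature separates the two classes better, so candidate features could be ranked by their pairwise distances.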

Devising an Objective Nasal Vibration Test for Nasal Resonatory Disorders

  • Choi, Hong-Shik;Park, Yong-Jae;Kim, Kwang-Moon
    • Speech Sciences / v.7 no.1 / pp.39-52 / 2000
  • The present study investigates the clinical applicability of a new device that objectively measures nasal resonating vibration via a piezoelectric vibratory sensor, using 10 normal volunteers, 10 patients with definite hypernasality, and 10 nasal polyposis patients. For the assessment of hypernasality, the ratio of 'ng' to 'a' and that of the 'mama' to 'papa' passages were used. For the evaluation of hyponasality, the ratio of nasal vibration after to before induced cul-de-sac resonation was calculated. In the control group, the ratio of ng/a and mama/papa passages was larger than 8, while in the hypernasality group, the ratio was markedly lower. The vibratory signals of 'a' and 'ng' increased markedly in the control group and the hypernasality group after inducing cul-de-sac resonation, while in the hyponasality group, the change was minimal.
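
The ratios described in this abstract could be computed along the following lines. The abstract does not say how vibration magnitude is quantified, so the RMS amplitude used here is an assumption. The function names and the synthetic signals are hypothetical and are not data from the study.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sampled vibration signal."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def vibration_ratio(numerator_segment, denominator_segment):
    """Ratio of RMS vibration between two recorded segments, e.g. 'ng' over 'a'."""
    return rms(numerator_segment) / rms(denominator_segment)


# Synthetic stand-ins for sensor recordings (assumed, for illustration only).
ng = [0.8 * math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
a = [0.1 * math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
print(vibration_ratio(ng, a))
```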


Robust Voice Activity Detection Using the Spectral Peaks of Vowel Sounds

  • Yoo, In-Chul;Yook, Dong-Suk
    • ETRI Journal / v.31 no.4 / pp.451-453 / 2009
  • This letter proposes the use of vowel sound detection for voice activity detection. Vowels have distinctive spectral peaks, which are likely to remain higher than their surroundings even after severe corruption. Therefore, by developing a method of detecting the spectral peaks of vowel sounds in corrupted signals, voice activity can be detected even in low signal-to-noise ratio (SNR) conditions. Experimental results indicate that the proposed algorithm performs reliably under various noise and low SNR conditions. This method is suitable for mobile environments where the characteristics of noise may not be known in advance.
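
The idea that spectral peaks stay above their surroundings under noise can be sketched as follows. This is a deliberately simplified illustration, not the algorithm from the letter: it counts local maxima of a frame's magnitude spectrum that exceed a multiple of the median magnitude. The function names, the thresholds `ratio` and `min_peaks`, and the synthetic frames are all assumptions.

```python
import cmath
import math
import random


def magnitude_spectrum(frame):
    """Magnitude of a direct DFT for bins 0 .. N/2 - 1 (fine for short frames)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def count_prominent_peaks(spectrum, ratio=5.0):
    """Count local maxima that exceed `ratio` times the median magnitude."""
    floor = sorted(spectrum)[len(spectrum) // 2]
    return sum(
        1
        for k in range(1, len(spectrum) - 1)
        if spectrum[k] > spectrum[k - 1]
        and spectrum[k] >= spectrum[k + 1]
        and spectrum[k] > ratio * floor
    )


def looks_like_vowel(frame, min_peaks=2, ratio=5.0):
    """Flag a frame as vowel-like if it has enough prominent spectral peaks."""
    return count_prominent_peaks(magnitude_spectrum(frame), ratio) >= min_peaks


# Synthetic frames: three harmonics plus noise versus noise alone.
rng = random.Random(0)
n = 128
noise = [rng.gauss(0.0, 0.3) for _ in range(n)]
vowel_like = [
    sum(math.sin(2 * math.pi * h * 8 * t / n) for h in (1, 2, 3)) + noise[t]
    for t in range(n)
]
print(looks_like_vowel(vowel_like), looks_like_vowel(noise))
```

Using the median as the reference level keeps the threshold tied to the noise floor of the frame itself, so no prior noise model is needed in this sketch.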

Channel Expansion Technology in MPEG Audio (MPEG 오디오의 채널 확장 기술)

  • Pang, Hee-Suk
    • Journal of Broadcast Engineering / v.16 no.5 / pp.714-721 / 2011
  • MPEG audio uses the masking effect, high frequency component synthesis based on spectral band replication, and channel expansion based on parametric stereo for efficient compression of audio signals. In this paper, we present an overview of the state-of-the-art channel expansion technology in MPEG audio. We also present technical overviews and application examples to broadcasting services for HE-AAC v.2, MPEG Surround, spatial audio object coding (SAOC), and unified speech and audio coding (USAC) which are MPEG audio codecs based on the channel expansion technology.

A Study Using Acoustic Measurement and Perceptual Judgment to Identify Prosodic Characteristics of English as Spoken by Koreans (음향 측정과 지각 판단에 의한 한국인 영어의 운율 연구)

  • Koo, Hee-San
    • Speech Sciences / v.2 / pp.95-108 / 1997
  • The purpose of this experimental study was to investigate prosodic characteristics of English as spoken by Koreans. Test materials were four English words, a sentence, and a paragraph. Six female Korean speakers and five native English speakers participated in acoustic and perceptual experiments. Pitch and duration of word syllables were measured from signals and spectrograms made by the Signalyze 3.04 software program for Power Mac 7200. In the perceptual experiment, accent position, intonation patterns, rhythm patterns, and phrasing were evaluated by the five native English speakers. Preliminary results from this limited study show that the prosodic characteristics of Koreans include the following: (1) pitch on the first part of a word and sentence is lower than that of English speakers, but the pitch on the last part is the opposite; (2) word prosody is quite similar to that of an English speaker, but sentence prosody is quite different; (3) the weakest point of sentence prosody spoken by Koreans is in the rhythmic pattern.


A GPD-BASED DISCRIMINATIVE TRAINING ALGORITHM FOR PREDICTIVE NEURAL NETWORK MODELS

  • Na, Kyung-Min;Rheem, Jae-Yeol;Ann, Sou-Guil
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.997-1002 / 1994
  • Predictive neural network models are powerful speech recognition models based on nonlinear pattern prediction. These models can effectively normalize the temporal and spatial variability of speech signals, but they suffer from poor discrimination between acoustically similar words. In this paper, we propose a discriminative training algorithm for predictive neural network models based on the generalized probabilistic descent (GPD) algorithm and the minimum classification error formulation (MCEF). Evaluation of the training algorithm on ten Korean digits shows its effectiveness, with a 40% reduction in recognition error.
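
The minimum classification error objective that GPD training descends on can be sketched as below. This follows the commonly cited form of the MCE loss (a soft maximum over rival scores passed through a sigmoid). The abstract does not give the paper's exact discriminant functions, so the per-word scores are assumed here to be negative prediction errors, and the parameters `eta` and `gamma` are illustrative.

```python
import math


def misclassification_measure(scores, correct, eta=2.0):
    """d = -g_correct + (1/eta) * log(mean over rivals of exp(eta * g_j))."""
    rivals = [g for j, g in enumerate(scores) if j != correct]
    m = max(rivals)  # factor out the maximum for numerical stability
    soft_max = m + math.log(
        sum(math.exp(eta * (g - m)) for g in rivals) / len(rivals)
    ) / eta
    return -scores[correct] + soft_max


def mce_loss(scores, correct, eta=2.0, gamma=1.0):
    """Smooth 0-1 loss: sigmoid of the misclassification measure."""
    d = misclassification_measure(scores, correct, eta)
    return 1.0 / (1.0 + math.exp(-gamma * d))


# Assumed scores: negative prediction errors of three word models for one utterance.
scores = [-1.0, -4.0, -5.0]
print(mce_loss(scores, correct=0), mce_loss(scores, correct=1))
```

The loss is below 0.5 when the correct model scores best and above 0.5 when a rival wins. Because it is differentiable, a gradient step on the model parameters can reduce it, which is the role GPD plays.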


Design and Manufacture of a Device for the Recognition of Long Vowels (장모음 인식장치 설계 제작)

  • 구용회
    • Journal of the Korean Institute of Telematics and Electronics T / v.35T no.3 / pp.9-14 / 1998
  • Speech recognition of long vowels is carried out by electric circuits. A level compressor transforms the voice waveform into serial pulses, which carry the information needed to distinguish the vowels. The pulses are sampled by a register that picks up the series of serial signals within one pitch period of a vowel as a unit. Timing control pulses, such as the sampling pulses, are generated from the peak pulses in the speech wave. The parallel data in the register are assigned a phonetic symbol by a decision-making circuit that applies IF-THEN rules.


The Effects of the Speaking Rate on the Duration of Syllable before Boundary (발화속도가 경계앞 음절 길이에 미치는 영향)

  • Lee, Soon-Hyang;Koo, Hee-San
    • Speech Sciences / v.1 / pp.103-111 / 1997
  • The purpose of this study was to investigate the effect of speaking rate on the duration of the syllable before a boundary. The materials were four types of syllable-boundary sequences (Go-'Ga' Boundary-Gu) in a paragraph. The duration of the 'Ga' syllable before four levels of boundary was measured, and all measurements were taken from signals and spectrograms made by Signalyze™ 3.04 for Power Mac 7200. Subjects were six female speakers who read the materials five times each at fast, normal, and slow speeds. The results show that (1) the slower the speaking rate, the longer the duration of the syllable before a boundary; (2) the duration rank of the syllable before each boundary does not correspond to the level of the boundary, e.g., at fast speed, = < #, + < $; at normal speed, +, #, = < $; at slow speed, + < =, #, $; and (3) the syllable before a sentence boundary is less influenced than the syllable before other boundaries.


A Stable Pitch Determination via Dyadic Wavelet Transform (DyWT) (Dyadic Wavelet Transform 방식의 Pitch 주기결정)

  • Kim Namhoon;Yoon Gibum;Ko Hanseok
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.197-200 / 2000
  • This paper presents a time-based Pitch Determination Algorithm (PDA) for reliable estimation of the pitch period (PP) in speech signals. The proposed method uses the Dyadic Wavelet Transform (DyWT) to detect Glottal Closure Instants (GCIs) and uses this information to determine the pitch period. It also uses the periodicity property of the DyWT to detect unsteady GCIs. To evaluate its performance, the proposed method is compared with other DyWT-based PDAs. Its effectiveness is tested on real speech signals containing transitions between voiced and unvoiced intervals, where the energy of the voiced signal is unsteady. The results show that the proposed method performs well in estimating both the unsteady GCI positions and the steady parts.
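
The final step described here, turning detected GCIs into a pitch period, can be sketched as follows. The DyWT-based GCI detection itself is not reproduced: the GCI sample positions are assumed to be given. The function names, sample positions, and sampling rate are hypothetical.

```python
from statistics import median


def pitch_periods(gci_samples, sample_rate):
    """Pitch periods in seconds from consecutive glottal closure instants."""
    return [(b - a) / sample_rate for a, b in zip(gci_samples, gci_samples[1:])]


def estimate_f0(gci_samples, sample_rate):
    """Fundamental frequency in Hz from the median GCI-to-GCI interval."""
    return 1.0 / median(pitch_periods(gci_samples, sample_rate))


# Assumed GCI positions (sample indices) at an assumed 8 kHz sampling rate.
gci = [100, 180, 260, 341, 420]
print(estimate_f0(gci, 8000))
```

Taking the median interval makes the estimate less sensitive to a single misplaced GCI than a plain average would be.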


A study on the competitive learning algorithm for robust vector quantization to transmit speech signal (벡터 양자화를 위한 학습 알고리즘을 이용한 음성 전송 기술에 관한 연구)

  • Hong, Kang-You;Park, Sang-Hui
    • Proceedings of the KIEE Conference / 1999.07g / pp.3150-3152 / 1999
  • The efficient representation and encoding of signals with limited resources, e.g., finite storage capacity and restricted transmission bandwidth, is a fundamental problem in technical information processing systems. Under realistic circumstances, the encoding and communication of messages typically have to deal with different sources of noise and disturbance. In this paper, I propose a unifying approach to data compression by robust vector quantization, which explicitly deals with channel noise and random elimination of prototypes. The resulting algorithm is able to limit the detrimental effect of noise in a very general communication scenario. Based on this robust vector quantization, I report an experiment on speech coding.
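
For orientation, a plain winner-take-all competitive learning step for a vector quantizer is sketched below. This is the basic technique only: the robust variant proposed in the paper, which accounts for channel noise and prototype elimination, is not reproduced. The function names, learning rate, and codebook values are assumptions.

```python
def nearest(codebook, x):
    """Index of the codeword closest to x in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], x)),
    )


def competitive_update(codebook, x, lr=0.1):
    """Move only the winning codeword a fraction `lr` of the way toward x."""
    w = nearest(codebook, x)
    codebook[w] = [c + lr * (v - c) for c, v in zip(codebook[w], x)]
    return w


# Assumed two-entry codebook and one training vector, for illustration only.
codebook = [[0.0, 0.0], [10.0, 10.0]]
winner = competitive_update(codebook, [1.0, 1.0], lr=0.5)
print(winner, codebook)
```

After training, a vector is transmitted as the index returned by `nearest`, and the receiver looks up the corresponding codeword.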
