• Title/Summary/Keyword: Voiced

A Study on Extraction of Pitch and TSIUVC in Continuous Speech (연속음성신호에서 피치와 TSIUVC 추출에 관한 연구)

  • Lee See-Woo
    • Journal of Internet Computing and Services / v.6 no.4 / pp.85-92 / 2005
  • In this paper, I propose a new method for extracting pitch pulses and TSIUVC in continuous speech. The TSIUVC search and extraction method is based on the zero-crossing rate and on an individual pitch pulse extraction method using an FIR-STREAK filter. As a result, the extraction rate of individual pitch pulses was 96% for male voice and 85% for female voice. The TSIUVC extraction rates are 94.9% under 88% for male voice and 94.9% under 84.8% for female voice. This method can be applied to a new Voiced/Silence/TSIUVC speech coding scheme, to speech analysis, and to speech synthesis.
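The abstract above bases the TSIUVC search on a zero-crossing rate. As a rough illustration only (the paper's actual procedure and its FIR-STREAK filter are not reproduced here), a per-frame zero-crossing rate could be sketched in Python as follows. The function names, the frame length, and the hop size are assumptions chosen for this example.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_zcr(signal, frame_len, hop):
    """ZCR of each full frame of frame_len samples, advancing by hop samples."""
    return [
        zero_crossing_rate(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]
```

Frames with a high rate tend to be noise-like (unvoiced), while voiced frames usually show a low rate; any decision threshold would have to be tuned on real data.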

A Study on the Segmentation of Speech Signal into Phonemic Units (음성 신호의 음소 단위 구분화에 관한 연구)

  • Lee, Yeui-Cheon;Lee, Gang-Sung;Kim, Soon-Hyon
    • The Journal of the Acoustical Society of Korea / v.10 no.4 / pp.5-11 / 1991
  • This paper proposes a method for segmenting a speech signal into phonemic units. The proposed segmentation system is speaker-independent and operates without any prior information about the speech signal. In the segmentation process, we first divide the input speech signal into pure voiced regions and non-pure-voiced regions. We then apply a second algorithm that segments each region into detailed phonemic units using the voiced detection parameters, i.e., the time variation of the 0th LPC cepstrum coefficient and the ZCR. The speech used to validate the proposed segmentation algorithm consists of a vocabulary of isolated words and continuous words. In the experiments, the successful segmentation rate for the 507 phonemic units in the total vocabulary was 91.7%.
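The abstract above segments speech using the time variation of frame-level parameters. The following Python sketch shows one generic way such a time variation could be turned into boundary candidates; it is not the paper's algorithm. The function name, the peak-picking rule, and the `threshold` value are all assumptions made for illustration, and `track` stands for any per-frame parameter (for example a ZCR or a cepstrum coefficient computed elsewhere).

```python
def boundary_candidates(track, threshold):
    """Return frame indices where the frame-to-frame change of a parameter
    track is a local peak larger than `threshold` (a hypothetical tuning value)."""
    delta = [abs(b - a) for a, b in zip(track, track[1:])]
    boundaries = []
    for i, d in enumerate(delta):
        left = delta[i - 1] if i > 0 else 0.0
        right = delta[i + 1] if i + 1 < len(delta) else 0.0
        if d > threshold and d >= left and d >= right:
            boundaries.append(i + 1)  # the boundary lies just before frame i + 1
    return boundaries
```

For a track that stays flat, jumps, and stays flat again, the function reports the frame at which each jump lands.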

How Different are Vowel Epentheses in Learner Speech and Loanword Phonology?

  • Park, Mi-Sun;Kim, Jong-Mi
    • Speech Sciences / v.15 no.2 / pp.33-51 / 2008
  • The difference between learner speech and loanword phonology is investigated in terms of Korean learners' speech and their loanword adaptation of English words with a post-vocalic word-final stop. When we compared the speech of 12 Korean learners at the mid-intermediate level with that of eight English speakers, the learner speech did not reflect the loanword phonology of vowel insertion after a voiced word-final stop (e.g., rib$[\dotplus]$, bad$[\dotplus]$, gag$[\dotplus]$ vs. tip[=], cat[=], book[=]) but, instead, the target phonology of vowel lengthening before a voiced word-final stop (e.g., rib[r.I:b], CAD$[k{\ae}:d]$, bag$[b{\ae}:g]$ vs. rip[rI.p], cat$[k{\ae}t]$, back$[b{\ae}k]$). A longitudinal study of learner speech before and after instruction showed some development toward the acquisition of target phonology. The results indicate that learner speech departs from loanword phonology and approaches target speech at a faster rate than a direct ratio. Thus, native phonology predicts loanword phonology but lends little support to learner speech. Our results also indicate that loanword phonology is constant, while learner speech changes toward the acquisition of target phonology.

Improving LD-CELP using frame classification and modified synthesis filter (프레임 분류와 합성필터의 변형을 이용한 적은 지연을 갖는 음성 부호화기의 성능)

  • 임은희;이주호;김형명
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.6 / pp.1430-1437 / 1996
  • A low-delay code-excited linear predictive (LD-CELP) speech coder at bit rates under 8 kbps is considered. We try to improve the performance of the speech coder with a frame-type-dependent modification of the synthesis filter. We first classify frames into three groups: voiced, unvoiced, and onset. For voiced and unvoiced frames, the spectral envelope of the synthesis filter is adapted to the phonetic characteristics. For onset frames, i.e., transition frames from unvoiced to voiced, a synthesis filter interpolated with the bias filter is used. The proposed vocoder produced clearer sound at a similar delay level compared with existing LD-CELP vocoders.
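The three-way frame classification described above can be illustrated with a small Python sketch. It assumes that a per-frame voiced/unvoiced decision is already available from some other detector and that the "onset" frame is the first voiced frame following an unvoiced one; both points are assumptions for this example, since the abstract does not state how the classification is computed.

```python
def classify_frames(voiced_flags):
    """Map per-frame voicing decisions to 'voiced', 'unvoiced', or 'onset'.

    Assumption: a voiced frame directly preceded by an unvoiced frame
    is treated as the unvoiced-to-voiced transition ('onset').
    """
    labels = []
    previous_voiced = None  # no previous frame at the start
    for is_voiced in voiced_flags:
        if is_voiced and previous_voiced is not None and not previous_voiced:
            labels.append("onset")
        elif is_voiced:
            labels.append("voiced")
        else:
            labels.append("unvoiced")
        previous_voiced = is_voiced
    return labels
```

A coder could then select a different synthesis-filter treatment per label, as the abstract describes.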

A Study on APC-MPC in 8kbps of Convergence System (융복합 시스템의 8kbps에 있어서 APC-MPC에 관한 연구)

  • Lee, See-Woo
    • Journal of Digital Convergence / v.13 no.7 / pp.177-182 / 2015
  • In MPC (Multi-Pulse Coding) using voiced and unvoiced excitation sources, the speech waveform can be distorted. This is caused by normalization of the synthesized voiced speech waveform during restoration. To solve this problem, this paper presents APC-MPC, which applies amplitude-position compensation to the multi-pulses in each pitch interval in order to reduce distortion of the synthesized waveform. I also implemented APC-MPC in a coding system and evaluated its SNRseg under the 8 kbps coding condition of a convergence system. As a result, the SNRseg of APC-MPC was 13.9 dB for female voice and 14.3 dB for male voice. I expect this method to be applicable to cellular phones and smartphones that use a low-bit-rate excitation source.
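The abstract above reports quality as SNRseg (segmental SNR). A common definition averages the per-frame SNR in dB over the frames of the signal; the Python sketch below follows that definition. The frame length, the use of non-overlapping frames, and the decision to skip frames with zero signal or zero error energy are assumptions of this example, not details taken from the paper.

```python
import math


def segmental_snr(reference, decoded, frame_len):
    """Mean of per-frame SNRs (dB) over non-overlapping frames."""
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal_energy == 0 or noise_energy == 0:
            continue  # assumption: skip frames whose SNR is undefined
        snrs.append(10 * math.log10(signal_energy / noise_energy))
    if not snrs:
        raise ValueError("no frame with a finite SNR")
    return sum(snrs) / len(snrs)
```

Because each frame contributes equally, quiet segments weigh as much as loud ones, which is the usual reason for preferring SNRseg over a single global SNR for speech.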

Initial-syllable lengthening of an utterance-internal phrase in Korean

  • Yun, Ilsung
    • Phonetics and Speech Sciences / v.6 no.2 / pp.141-151 / 2014
  • This study reports anti-hierarchical initial-syllable lengthening of an utterance-internal phrase in Korean. That is, the phrase-initial syllable (e.g., /a/ of "apa-do" or /ma/ of "mapa-do") starting with a voiced phoneme (i.e., vowels or voiced consonants) manifests itself as significantly longer when it is preceded by another phrase without a pause than when it leads an utterance or follows a pause utterance-internally. The phenomenon was examined with regard to two other factors: (1) tempo and (2) tenseness of the consonant (/p, $p^{\prime}$, $p^h$/) following the target syllable /a/. First, the effect of tempo on initial lengthening was not significant. Apart from the statistical significance, however, a tendency was observed, i.e., the slower the tempo is, the greater the lengthening. By contrast, the faster the tempo is, the higher the ratio (%) of lengthening. Second, contrary to our expectations, initial-syllable lengthening was even greater before tense stops /$p^{\prime}$, $p^h$/ than before lax stop /p/ regardless of tempo, and it was remarkable when it comes to the ratio (%), which means that initial lengthening is free of the pre-consonantal vowel shortening effect. Final-syllable lengthening is a pre-boundary marker, while the initial-syllable lengthening is regarded as a post-boundary marker of a phrase.
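The abstract above distinguishes the amount of lengthening from the ratio (%) of lengthening, which can move in opposite directions across tempos. The short Python example below works through that arithmetic. The durations are invented purely for illustration and are not data from the study; the function name is likewise hypothetical.

```python
def lengthening(baseline_ms, lengthened_ms):
    """Return (absolute lengthening in ms, lengthening ratio in %)."""
    difference = lengthened_ms - baseline_ms
    return difference, 100.0 * difference / baseline_ms


# Hypothetical syllable durations, NOT measurements from the paper.
slow = lengthening(200, 230)  # slow tempo: larger absolute lengthening
fast = lengthening(100, 120)  # fast tempo: larger lengthening ratio
```

With these made-up numbers the slow tempo gains 30 ms (15%) and the fast tempo gains 20 ms (20%), so the absolute lengthening is greater at the slow tempo while the ratio is higher at the fast tempo.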

Phonological Contrast between Korean and Thai in Terms of Language Universality (보편성에 따른 한국어와 태국어의 음운대조)

  • Kim, Seon-Jung
    • Cross-Cultural Studies / v.35 / pp.293-314 / 2014
  • This paper aims to contrast the phonology of Korean and Thai in terms of language universality. Regarding consonants, both languages have 21 consonants, the typologically most plausible number, and thus display high universality in the number of consonants. However, Thai shows higher universality with regard to their substance, i.e., it differs from Korean in the structure of plosives and fricatives. Korean and Thai are similar in that the plosives of both languages show a three-way contrast. However, the Thai plosives consist of plain voiced, plain voiceless, and aspirated voiceless sounds, which have higher universality than the Korean plosives, which are plain voiceless, tense voiceless, and aspirated voiceless. In the case of vowels, both Korean with its 10 vowels and Thai with its 9 vowels show lower universality in the total number of vowels. However, all of those vowels belong to the list of most plausible vowels, which makes their universality higher in substance. With respect to syllable structure, Korean with its CVC type shows a moderately complex structure, while Thai with its CCVC type has a complex structure. The coda may consist of only one consonant in each language, but the onset is composed of one consonant in Korean and two consonants in Thai. This contrastive study of the phonological similarities and differences between Korean and Thai will not only help in understanding the two languages but also provide useful information for increasing the efficacy of Korean language education for Thai learners of Korean, whose number is rapidly increasing.

Nonlinear Speech Production Modeling using Nonlinear Autoregressive Exogenous based on Support Vector Machine (서포트 벡터 머신 기반 비선형 외인성 자귀회귀를 이용한 비선형 조음 모델링)

  • Jang, Seung-Jin;Kim, Hyo-Min;Park, Young-Choel;Choi, Hong-Shik;Yoon, Young Ro
    • Proceedings of the Korea Information Processing Society Conference / 2007.11a / pp.113-116 / 2007
  • In this paper, our proposed Nonlinear Autoregressive Exogenous (NARX) model based on Least Square-Support Vector Regression (LS-SVR) is introduced and tested for producing natural sounds. This nonlinear synthesizer reproduces voiced sounds perfectly and also preserves naturalness features such as jitter and shimmer, whereas LPC does not preserve them. However, the results for some phonations are quite different from the original sounds. This is assumed to be because a single-band model cannot control and decompose the high-frequency components. Therefore, a multi-band model with a wavelet filterbank is adopted in place of the single-band model. As a result, the multi-band model yields improved stability. Finally, nonlinear speech modeling using NARX based on LS-SVR can successfully reconstruct synthesized sounds that are nearly identical to the original voiced sounds.
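The abstract above uses jitter and shimmer as markers of naturalness. As a minimal sketch, the widely used "local" variants can be computed as the mean absolute difference between consecutive cycles divided by the mean value, expressed in percent. The function name is hypothetical, and the inputs (per-cycle pitch periods for jitter, per-cycle peak amplitudes for shimmer) are assumed to have been measured beforehand; the paper's own measurement procedure is not given in the abstract.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive values,
    divided by the mean value, in percent.

    Pass pitch periods for local jitter, peak amplitudes for local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    mean_abs_diff = sum(
        abs(b - a) for a, b in zip(values, values[1:])
    ) / (len(values) - 1)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value
```

A perfectly periodic synthetic vowel gives 0% for both measures, which is one reason strictly periodic excitation can sound unnatural.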

A Study on ACFBD-MPC in 8kbps (8kbps에 있어서 ACFBD-MPC에 관한 연구)

  • Lee, See-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.7 / pp.49-53 / 2016
  • Recently, the use of signal compression methods to improve the efficiency of wireless networks has increased. In particular, MPC systems have used pitch extraction and voiced and unvoiced excitation sources to reduce the bit rate. In general, an MPC system using voiced and unvoiced excitation sources distorts the synthesized speech waveform when a frame contains both voiced and unvoiced consonants. This is caused by normalization of the synthesized speech waveform in the process of restoring the multi-pulses of the representation segment. This paper presents ACFBD-MPC (Amplitude Compensation Frequency Band Division-Multi Pulse Coding), which applies amplitude compensation to the multi-pulses in each pitch interval and at a specific frequency to reduce the distortion of the synthesized speech waveform. The experiments were performed with 16 sentences of male and female voices. The voice signal was A/D converted at 10 kHz with 12 bits. In addition, the ACFBD-MPC system was implemented, and its SNR was estimated under the 8 kbps coding condition. As a result, the SNR of ACFBD-MPC was 13.6 dB for the female voice and 14.2 dB for the male voice. ACFBD-MPC improved the male and female voices by 1 dB and 0.9 dB, respectively, compared with traditional MPC. This method is expected to be used for cellular telephones and smartphones that use a low-bit-rate excitation source.
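To put the figures above in perspective, the raw bit rate of the digitized input can be compared with the 8 kbps coding condition. The Python lines below do this arithmetic, assuming a single-channel signal and uncompressed PCM storage of the 10 kHz, 12-bit samples; these assumptions and the names used are for illustration only.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate in bit/s of single-channel uncompressed PCM."""
    return sample_rate_hz * bits_per_sample


raw_bps = pcm_bit_rate(10_000, 12)    # 10 kHz sampling, 12 bits per sample
compression_ratio = raw_bps / 8_000   # relative to the 8 kbps coder
```

Under these assumptions the input is 120 kbps, so coding at 8 kbps corresponds to a 15:1 reduction.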

A Study of Energy Parameter without Windowing Influence in Speech Signal (윈도우의 영향이 제거된 에너지 파라미터에 관한 연구)

  • 조태수;신동성;배명진
    • Proceedings of the IEEK Conference / 2001.06d / pp.277-280 / 2001
  • Preprocessing is a very important step in speech signal processing. It influences the compression rate in speech coding and the recognition rate in speech recognition, among others. In this paper, we propose a method that minimizes the influence of windowing by using the pitch period and start points. The proposed method is applicable to voiced detection and word labeling.
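One way to see why the pitch period and start point matter for an energy parameter is to compare a fixed-length frame with a frame that spans exactly one pitch period. The Python sketch below is an illustration of that general idea only, not the method of the paper; the function names and the toy signal are assumptions, and the pitch period and start point are taken as known.

```python
def mean_energy(samples):
    """Average squared amplitude of a segment."""
    return sum(x * x for x in samples) / len(samples)


def pitch_synchronous_energy(signal, start, period):
    """Mean energy over exactly one pitch period beginning at `start`."""
    segment = signal[start:start + period]
    if len(segment) < period:
        raise ValueError("segment shorter than one pitch period")
    return mean_energy(segment)


# Toy periodic signal with a pitch period of 4 samples (hypothetical data).
toy_signal = [3, 0, 0, 0] * 5
```

On this toy signal a 6-sample fixed frame yields a different energy depending on where it starts, whereas the one-period measurement is the same at every start position.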
