• Title/Summary/Keyword: Speech Code

Complexity-Reduction Algorithm of Speech Coder (QCELP) for CDMA Digital Cellular System (CDMA 디지틀 셀룰라용 음성 부호화기 (QCELP) 의 복잡도 감소 알고리즘)

  • 이인성
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.3 / pp.126-132 / 1996
  • In this paper, a complexity-reduction method for the QCELP speech coder (IS-96) that causes no performance degradation is proposed for the vocoder of the CDMA digital cellular system. The energy terms in the pitch parameter search and codebook search routines, which require heavy computation, are calculated recursively by exploiting the overlapped structure of the code vectors in the adaptive codebook and the excitation codebook. Additional complexity reduction in the codebook search routine can be achieved by using a simplified form of the energy-term calculation when the initial codebook value is zero. At lower transmission rates such as 4, 2, and 1 kbps, the complexity reduction obtained from recursive calculation of the energy term increases.
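
As a rough illustration of the recursive energy calculation described in this abstract, the sketch below updates the energy of each overlapped code vector from the previous one instead of recomputing a full sum. It assumes a shift-by-one overlapped codebook stored as one long sequence and works on unfiltered vectors; the actual QCELP search operates on filtered code vectors, so this is a simplification, and the function names are hypothetical.

```python
def direct_energies(seq, length):
    """Energy of every overlapped code vector, recomputed from scratch."""
    count = len(seq) - length + 1
    return [sum(x * x for x in seq[k:k + length]) for k in range(count)]


def recursive_energies(seq, length):
    """Same energies, using one add and one subtract per code vector.

    Vector k is seq[k : k + length]. Moving from vector k-1 to vector k
    drops seq[k-1] and gains seq[k+length-1], so only those two squared
    terms change.
    """
    count = len(seq) - length + 1
    if count <= 0:
        return []
    energy = sum(x * x for x in seq[:length])  # full sum only once
    energies = [energy]
    for k in range(1, count):
        energy += seq[k + length - 1] ** 2 - seq[k - 1] ** 2
        energies.append(energy)
    return energies


# Assumed toy codebook sequence, not taken from IS-96.
codebook = [1, -2, 0, 3, 1, 0, -1]
print(recursive_energies(codebook, 3))
```

The direct version costs about `length` multiplications per vector, while the recursive version costs two, which is the kind of saving the abstract attributes to the overlapped structure.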

Enhanced Spectral Envelope Coding Scheme Using Inter-frame Correlation for G.729.1 (G.729.1 코더에서 프레임 간의 상호상관 관계를 이용한 개선된 스펙트럼 포락 코딩 방법)

  • Cho, Keun-Seok;Sung, Jong-Mo;Hahn, Min-Soo;Kim, Young-Il;Jeong, Sang-Bae
    • Phonetics and Speech Sciences / v.1 no.4 / pp.97-103 / 2009
  • This paper describes a new algorithm for encoding spectral envelope in the time domain alias cancellation (TDAC) part of G.729.1. The spectral envelope and modified discrete cosine transform (MDCT) coefficients of the weighted code-excited linear predictive (CELP) coding error in lower-band and the higher-band input signal are encoded in the TDAC part. In order to reduce allocation bits for spectral envelope coding, a new algorithm using sub-band correlation between adjacent frames is proposed. In addition, to improve the quality of decoded signals, two bit allocation strategies using reduced bits from the proposed algorithm are proposed. The performance of the proposed algorithm is evaluated in terms of objective quality and bit reduction rates. Experimental results show that the proposed algorithm increases the quality of sounds significantly.

Time-Domain Quantization and Interpolation of Pitch Cycle Waveform

  • Kim, Moo-Young
    • The Journal of the Acoustical Society of Korea / v.27 no.1E / pp.11-16 / 2008
  • In this paper, a pitch cycle waveform (PCW) is extracted, quantized, and interpolated in the time domain to synthesize high-quality speech at low bit rates. A pre-alignment technique is proposed for accurate and efficient PCW extraction; it predicts the current PCW position from the previous PCW position, assuming that pitch periods evolve slowly. Since the pitch period differs from frame to frame, the original PCW is converted into a fixed-dimension PCW using the dimension-conversion method and subsequently quantized by code-excited linear predictive (CELP) coding. The excitation signal for the linear predictive coding (LPC) synthesis filter is generated using time-domain interpolation and interlinking of the quantized PCWs. The coder operates at 4.2 kbit/s and 3.2 kbit/s depending on the pitch period. An informal listening test demonstrates the effectiveness of the proposed coding scheme.

Variable Threshold Detection with Weighted BPSK/PCM Speech Signal Transmitted over Gaussian Channels (가우시안 채널에 있어 가중치를 부여한 BPSK/PCM 음성신호의 비트거물 한계치 변화에 의한 신호재생)

  • 안승춘;서정욱;이문호
    • Journal of the Korean Institute of Telematics and Electronics / v.24 no.5 / pp.733-739 / 1987
  • In this paper, variable threshold detection with weighted pulse code modulation (PCM) encoded signals transmitted over Gaussian channels is investigated. Each bit in the μ-law PCM word is weighted according to its significance in the transmitter. If the output falls into the erasure zone, the regenerated sample is replaced by interpolation or prediction. The overall system signal-to-noise ratio for BPSK/PCM speech signals using this technique has been derived. When the input signal level was -17 dB, the gains in overall signal-to-noise ratio compared to weighted PCM and variable threshold detection were 5 dB and 3 dB, respectively. A computer simulation was performed with computer-generated signals. The simulation was in reasonable agreement with our theoretical prediction.
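
The erasure-zone idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's detector: the per-bit thresholds, the helper names, and the simple neighbor-averaging concealment are all hypothetical, and the μ-law expansion of the PCM word into a sample value is left out.

```python
def detect_bit(received, threshold):
    """Return 1 or 0, or None when the value lies in the erasure zone."""
    if received >= threshold:
        return 1
    if received <= -threshold:
        return 0
    return None  # |received| < threshold: too unreliable to decide


def detect_word(received_bits, thresholds):
    """Decode one PCM word (MSB first); None if any bit is erased.

    `thresholds` holds one erasure threshold per bit position, so more
    significant bits can be given a wider erasure zone (an assumption).
    """
    value = 0
    for received, threshold in zip(received_bits, thresholds):
        bit = detect_bit(received, threshold)
        if bit is None:
            return None
        value = (value << 1) | bit
    return value


def conceal(samples):
    """Replace erased samples (None) by interpolating between neighbors."""
    out = list(samples)
    for i in range(len(out)):
        if out[i] is None:
            prev = out[i - 1] if i > 0 else None  # already concealed
            nxt = next((s for s in samples[i + 1:] if s is not None), None)
            if prev is not None and nxt is not None:
                out[i] = (prev + nxt) / 2
            elif prev is not None:
                out[i] = prev
            elif nxt is not None:
                out[i] = nxt
            else:
                out[i] = 0
    return out
```

For example, `detect_word([0.9, -0.8, 0.05], [0.3, 0.2, 0.1])` flags the word as erased because the last soft value falls inside its erasure zone, and `conceal` then fills the gap from the neighboring samples.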

ON A REDUCTION OF PITCH SEARCHING TIME BY PREPROCESSING IN THE CELP VOCODER

  • Kim, Daesik;Bae, Myungjin;Kim, Jongjae;Byun, Kyungjin;Han, Kichun;Yoo, Hahyoung
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.904-911 / 1994
  • Code Excited Linear Prediction (CELP) speech coders exhibit good performance at data rates below 4.8 kbps. The major drawback of CELP-type coders is their heavy computational load. In this paper, we propose a new pitch search method that preserves the quality of the CELP vocoder while reducing complexity. The basic idea is to apply a preprocessing step that exploits the autocorrelation property of the speech waveform before the pitch search. With the proposed method, we obtain approximately 77% complexity reduction in the pitch search.
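
One plausible reading of autocorrelation-based preprocessing is to rank lags by autocorrelation first and hand only a few candidates to the expensive closed-loop pitch search. The sketch below shows that pre-selection step only; the abstract does not spell out the paper's actual procedure, so the lag range (20 to 147), the number of candidates kept, and the function names are assumptions.

```python
import math


def autocorrelation(x, lag):
    """Unnormalized autocorrelation of frame x at the given lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))


def candidate_lags(x, min_lag=20, max_lag=147, keep=3):
    """Return the `keep` lags with the largest autocorrelation.

    A full CELP pitch search would then be run only over these lags
    rather than over every lag in [min_lag, max_lag].
    """
    lags = range(min_lag, min(max_lag, len(x) - 1) + 1)
    ranked = sorted(lags, key=lambda lag: autocorrelation(x, lag), reverse=True)
    return ranked[:keep]


# Synthetic frame with a 40-sample period (assumed test signal).
frame = [math.sin(2 * math.pi * n / 40) for n in range(320)]
print(candidate_lags(frame))
```

Keeping 3 of 128 lags is an arbitrary choice for the example; the trade-off between the number of candidates and speech quality would have to be tuned.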

A study on the predictability of acoustic power distribution of English speech for English academic achievement in a Science Academy (과학영재학교 재학생 영어발화 주파수 대역별 음향 에너지 분포의 영어 성취도 예측성 연구)

  • Park, Soon;Ahn, Hyunkee
    • Phonetics and Speech Sciences / v.14 no.3 / pp.41-49 / 2022
  • The average acoustic distribution of American English speakers was statistically compared with the English-speaking patterns of gifted students in a Science Academy in Korea. By analyzing speech recordings whose duration is much longer than in previous studies, this research identified the degree of acoustic proximity between the two groups and the predictability of the English academic achievement of gifted high school students. Long-term spectral acoustic power distribution vectors were obtained for 2,048 center frequencies in the range of 20 Hz to 20,000 Hz by applying a long-term average speech spectrum (LTASS) MATLAB code. Three more variables were statistically compared to discover additional indices that can predict future English academic achievement: the receptive vocabulary size test, the cumulative vocabulary scores of English formative assessment, and the English Speaking Proficiency Test scores. Linear regression and correlational analyses between the four variables showed that the receptive vocabulary size test and the low-frequency vocabulary formative assessments, which require both lexical and domain-specific science background knowledge, are relatively more significant variables than basic suprasegmental-level English fluency in predicting gifted students' academic achievement.

The Smart Learning System for English Language Using Hangeul (한글을 이용한 스마트 영어 학습 시스템)

  • Kwon, Seung-tag;Kim, Yong-seok
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.6 / pp.1157-1163 / 2015
  • In this paper, we developed a Web App that runs on mobile devices. We also designed and developed an electronic dictionary in which English words and sentences are expressed by their English pronunciation written in Hangeul. A database of English words, Hangeul codes with pictures, vocabulary definitions, speech sound files, and many sentences was created for this system. We developed the English learning system using HTML5 and the m-Bizmaker software tools.

New Postprocessing Methods for Rejecting Out-of-Vocabulary Words

  • Song, Myung-Gyu
    • The Journal of the Acoustical Society of Korea / v.16 no.3E / pp.19-23 / 1997
  • The goal of postprocessing in automatic speech recognition is to improve recognition performance by utterance verification at the output of the recognition stage. It focuses on the effective rejection of out-of-vocabulary words based on the confidence score of the hypothesized candidate word. We present two methods for computing confidence scores. Both methods are based on the distance between each observation vector and the representative code vector, which is defined as the most likely code vector at each state. While the first method employs simple time normalization, the second uses a normalization technique based on the concept of the on-line garbage model [1]. In a speaker-independent isolated-word recognition experiment with discrete-density HMMs, the second method outperforms both the first and the conventional likelihood-ratio scoring method [2].
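
The first scoring method in this abstract (distance to each state's representative code vector with simple time normalization) might look roughly like the sketch below. The Euclidean distance, the dictionary of representative vectors, and the rejection threshold are assumptions made for illustration; the paper's exact distance measure and threshold are not given in the abstract.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def time_normalized_distance(observations, state_path, representative):
    """Average distance between each observation and the representative
    code vector of the state it was aligned to.

    `representative` maps a state id to that state's most likely code
    vector; `state_path` is the decoded state for each frame.
    """
    if not observations or len(observations) != len(state_path):
        raise ValueError("need one state per observation")
    total = sum(
        euclidean(obs, representative[state])
        for obs, state in zip(observations, state_path)
    )
    return total / len(observations)  # normalize by utterance length


def accept(observations, state_path, representative, max_distance):
    """Accept the hypothesized word if the normalized distance is small;
    otherwise treat the utterance as out-of-vocabulary."""
    score = time_normalized_distance(observations, state_path, representative)
    return score <= max_distance
```

Dividing by the number of frames keeps long and short utterances comparable against a single threshold, which is the point of the time normalization.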

Objective Measure for Estimating Subjective Voice Quality in Wireless Communication (CDMA 이동통신 시스템에서의 주관적 음질을 추정하기 위한 객관적 척도)

  • 백금란
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.297-302 / 1998
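  • In this paper, we study an objective measure for estimating the subjective quality of speech that has been degraded in various ways while passing through a CDMA (Code Division Multiple Access) channel. Specifically, we carried out the MOS (Mean Opinion Score) test, the most widely used subjective quality assessment method, on speech signals that had passed through a CDMA channel, and simulated objective-measure algorithms that can estimate the MOS test results. As a result, we modified PSQM (Perceptual Speech Quality Measure) to suit the CDMA channel environment and obtained a high-performance objective speech quality assessment method.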

A Development of CDMA-based Robot Remote Controller (CDMA 기반 로봇 원격제어기 개발)

  • Kim, Woo-Sik;Kim, Eung-Seok
    • Proceedings of the KIEE Conference / 2006.10c / pp.345-347 / 2006
  • In this paper, we study the design of a robot controller that uses voice and data communication over a CDMA (Code Division Multiple Access) mobile communication network. We design the robot remote controller using three methods over the CDMA mobile communication network: speech recognition of telephone calls, DTMF (Dual-Tone Multi-Frequency) signaling, and SMS (Short Message Service) transmission and reception. We investigate the validity and effectiveness of the proposed remote controller when applied to a mobile robot.
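
For the DTMF control path mentioned in this abstract, a key press can be recovered from audio samples with the Goertzel algorithm, a standard way to measure power at a few known frequencies. The sketch below is a generic illustration and not the controller from the paper: the 8 kHz sample rate, the 205-sample block, and all function names are assumptions, and it omits the silence and twist checks a real decoder would need.

```python
import math

ROW_FREQS = (697, 770, 852, 941)    # standard DTMF low group (Hz)
COL_FREQS = (1209, 1336, 1477)      # standard DTMF high group (Hz)
KEYPAD = ("123", "456", "789", "*0#")


def goertzel_power(samples, freq, sample_rate):
    """Signal power at `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def decode_key(samples, sample_rate=8000):
    """Pick the strongest row and column tone and map them to a key."""
    row = max(range(len(ROW_FREQS)),
              key=lambda i: goertzel_power(samples, ROW_FREQS[i], sample_rate))
    col = max(range(len(COL_FREQS)),
              key=lambda i: goertzel_power(samples, COL_FREQS[i], sample_rate))
    return KEYPAD[row][col]


def make_tone(key, n=205, sample_rate=8000):
    """Synthesize a clean DTMF tone for `key` (test helper)."""
    for r, keys in enumerate(KEYPAD):
        if key in keys:
            f1, f2 = ROW_FREQS[r], COL_FREQS[keys.index(key)]
            return [math.sin(2 * math.pi * f1 * t / sample_rate)
                    + math.sin(2 * math.pi * f2 * t / sample_rate)
                    for t in range(n)]
    raise ValueError("unknown key: " + key)
```

A decoded key such as `decode_key(make_tone("5"))` could then be mapped to a robot command; that mapping is application-specific and is not described in the abstract.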
