• Title/Summary/Keyword: Speech Code


Real-time Implementation of Speech and Channel Coder on a DSP Chip for Radio Communication System (무선통신 적용을 위한 단일 DSP칩상의 음성/채널 부호화기 실시간 구현)

  • Kim Jae-Won;Sohn Dong-Chul
    • Journal of the Korea Institute of Information and Communication Engineering / v.9 no.6 / pp.1195-1201 / 2005
  • This paper describes the procedures and results of a real-time implementation of the G.729 speech coder and a channel coder, comprising a convolutional codec, Viterbi decoder, and interleaver, on a fixed-point DSP chip for radio communication systems. We describe the implementation method, which is based on integer simulation results, and report the results in terms of quality and the complexity required for real-time operation. The required computational load was 24 MIPS for the speech coder and 9 MIPS for the channel coder, and the executable code size was 12K words and 4K words, respectively. Functional evaluation was performed in two steps: a bit-exact comparison with fixed-point C code, followed by tests with actual speech samples and error test vectors. Unlike previous work that implemented the coders separately, we implemented both the speech and channel coders on a single DSP chip with 160 MIPS of computational capability and 64K words of on-chip memory. This approach improves on conventional methods in system complexity and implementation cost for radio communication systems.

State Encoding of Hidden Markov Linear Prediction Models

  • Krishnamurthy, Vikram;Poor, H.Vincent
    • Journal of Communications and Networks / v.1 no.3 / pp.153-157 / 1999
  • In this paper, we derive finite-dimensional non-linear filters for optimally reconstructing speech signals in Switched Prediction vocoders, Code Excited Linear Prediction (CELP), and Differential Pulse Code Modulation (DPCM). Our filter is an extension of the Hidden Markov filter.


Adaptive Encoding of Fixed Codebook in CELP Coders

  • Kim, Hong-Kook
    • The Journal of the Acoustical Society of Korea / v.16 no.3E / pp.44-49 / 1997
  • In this paper, we propose an adaptive encoding method for the fixed codebook in CELP coders and implement an adaptive fixed code excited linear prediction (AF-CELP) speech coder. AF-CELP exploits the fact that the fixed codebook contribution to the speech signal is periodic, like the adaptive codebook (or pitch filter) contribution. By modeling the fixed codebook with the pitch lag and the gain from the adaptive codebook, AF-CELP can be implemented at low bit rates and with low complexity. Listening tests show that a 6.4 kbit/s AF-CELP has quality comparable to the 8 kbit/s CS-ACELP in background noise conditions.


A Half Rate Speech Coder using Trellis Excitation (Trellis excitation을 이용한 half rate 음성부호화기)

  • 강상원;이형수;김영수;정진욱
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.2 / pp.88-94 / 1996
  • In this paper, we present a half rate speech coder using trellis excitation. The coder combines a code-excited linear prediction (CELP) system with a trellis quantization method that uses codebook expansion, and it produces higher speech quality than a typical CELP coder at the same transmission rate. A subjective comparison with 3- to 8-bit $\mu$-law PCM indicates that the half rate coder provides speech quality between that of 5-bit and 6-bit $\mu$-law PCM.
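    The $\mu$-law PCM reference used in this comparison can be illustrated with a minimal sketch. It assumes the textbook companding curve with $\mu = 255$ and a uniform mid-rise quantizer; the function names are hypothetical and nothing here is taken from the paper's test setup.

```python
import math

def mu_law_compress(x, mu=255.0):
    """Map a sample x in [-1, 1] onto [-1, 1] with the mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def mu_law_pcm(x, bits, mu=255.0):
    """Compress, quantize uniformly to 2**bits levels, then expand."""
    levels = 2 ** bits
    step = 2.0 / levels
    y = mu_law_compress(x, mu)
    # Mid-rise uniform quantizer on [-1, 1]; clamp the top edge.
    idx = min(levels - 1, int((y + 1.0) / step))
    y_hat = -1.0 + (idx + 0.5) * step
    return mu_law_expand(y_hat, mu)

def mean_abs_error(bits):
    """Average reconstruction error over a grid of samples in [-1, 1]."""
    xs = [i / 100.0 for i in range(-100, 101)]
    return sum(abs(x - mu_law_pcm(x, bits)) for x in xs) / len(xs)
```

    Comparing `mean_abs_error(bits)` across word lengths shows the reconstruction error shrinking as more bits are used, which is the scale against which the half rate coder is placed.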


Synthesis of Expressive Talking Heads from Speech with Recurrent Neural Network (RNN을 이용한 Expressive Talking Head from Speech의 합성)

  • Sakurai, Ryuhei;Shimba, Taiki;Yamazoe, Hirotake;Lee, Joo-Ho
    • The Journal of Korea Robotics Society / v.13 no.1 / pp.16-25 / 2018
  • A talking head (TH) is a facial animation of an utterance generated from text and voice input. In this paper, we propose a method for generating a TH with facial expression and intonation from speech input only. The problem of generating a TH from speech can be regarded as a regression problem from the acoustic feature sequence to the facial code sequence, a low-dimensional vector representation that can efficiently encode and decode a face image. This regression was modeled by a bidirectional RNN and trained on the SAVEE database, a database of frontal utterance face animations. The proposed method is able to generate a TH with facial expression and intonation by using acoustic features such as MFCC, dynamic elements of MFCC, energy, and F0. According to the experiments, the configuration using BLSTM layers for the first and second layers of the bidirectional RNN predicted the face code best. For the evaluation, a questionnaire survey was conducted with 62 people who watched TH animations generated by the proposed method and by the previous method. As a result, 77% of the respondents answered that the TH generated by the proposed method matched the speech well.

Real-time Implementation of CS-ACELP Speech Coder for IMT-2000 Test-bed (IMT-2000 Test-bed 상에서 CS-ACELP 음성부호화기 실시간 구현)

  • 김형중;최송인;김재원;윤병식
    • Journal of the Korea Institute of Information and Communication Engineering / v.2 no.3 / pp.335-341 / 1998
  • In this paper, we present a real-time implementation of the CS-ACELP (Conjugate Structure Algebraic Code Excited Linear Prediction) speech coder. ITU-T has standardized the CS-ACELP algorithm as G.729. A real-time implementation of the CS-ACELP speech coder algorithm is achieved using a 16-bit fixed-point DSP chip. To implement the algorithm on the fixed-point DSP chip, an integer simulation of the CS-ACELP algorithm is used. Furthermore, the input/output and communication functions included in the CS-ACELP speech coder are described. We develop the CS-ACELP speech coder on a DSP evaluation board and evaluate it in the IMT-2000 Test-bed.


Implementation of a Single-chip Speech Recognizer Using the TMS320C2000 DSPs (TMS320C2000계열 DSP를 이용한 단일칩 음성인식기 구현)

  • Chung, Ik-Joo
    • Speech Sciences / v.14 no.4 / pp.157-167 / 2007
  • In this paper, we implemented a single-chip speech recognizer using the TMS320C2000 DSPs. For this implementation, we developed a very small speaker-dependent recognition engine based on dynamic time warping, which is especially suited to embedded systems where system resources are severely limited. We carried out several optimizations, including speed optimization by programming time-critical functions in assembly language, code size optimization, and effective memory allocation. On the TMS320F2801 DSP, which has 12 Kbytes of SRAM and 32 Kbytes of flash ROM, the recognizer can recognize 10 commands. On the TMS320F2808 DSP, which has 36 Kbytes of SRAM and 128 Kbytes of flash ROM, it can additionally output the speech sound corresponding to the recognition result. The speech sounds for the response, which are captured when the user trains commands, are encoded using ADPCM and saved in flash ROM. The single-chip recognizer needs few parts other than the DSP itself and an op amp for amplifying the microphone output and for anti-aliasing. Therefore, this recognizer may play a role similar to that of dedicated speech recognition chips.
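    The dynamic time warping matching on which such a speaker-dependent engine is based can be sketched as follows. This is an illustrative sketch only: it assumes scalar feature sequences and an absolute-difference local cost, whereas a real recognizer would compare frames of feature vectors. The function names and template data are hypothetical and do not come from the paper.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # step both
    return cost[n][m]

def recognize(templates, utterance):
    """Return the label of the trained template closest to the utterance."""
    return min(templates, key=lambda label: dtw_distance(templates[label], utterance))

# Hypothetical templates recorded during training, one per command.
templates = {"on": [1, 2, 3], "off": [3, 2, 1]}
best = recognize(templates, [1, 1, 2, 3, 3])
```

    Because the warping path may repeat frames, an utterance spoken more slowly than its template can still match it with zero accumulated cost, which is why the approach suits small speaker-dependent vocabularies.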


On Speech Digitization and Bandwidth Compression Techniques[II]-Vocoding (음성신호의 디지탈화와 대역폭축소의 방법에 관하여[II]-Vocoding)

  • 은종관
    • Journal of the Korean Institute of Telematics and Electronics / v.15 no.6 / pp.1-7 / 1978
  • This paper deals with speech digitization and bandwidth compression techniques, particularly two predictive coding methods, namely adaptive differential pulse code modulation (ADPCM) and adaptive delta modulation (ADM). The principle of a typical adaptive quantizer that is used in ADPCM is explained and discussed. Also, three companding methods (instantaneous, syllabic, and hybrid companding) that are used in ADM are explained in detail, and their performances are compared. In addition, the performances of ADPCM and ADM as speech coders are compared, and the merits of each coder are discussed.


On Speech Digitization and Bandwidth Compression Techniques[I]-ADPCM and ADM (음성신호의 디지탈화와 대역폭축소의 방법에 관하여[I]-ADPCM과 ADM)

  • 은종관
    • Journal of the Korean Institute of Telematics and Electronics / v.15 no.3 / pp.1-6 / 1978
  • This paper deals with speech digitization and bandwidth compression techniques, particularly two predictive coding methods, namely adaptive differential pulse code modulation (ADPCM) and adaptive delta modulation (ADM). The principle of a typical adaptive quantizer that is used in ADPCM is explained, and two analysis methods for the adaptive predictor coefficients, block and sequential analyses, are discussed. Also, three companding methods (instantaneous, syllabic, and hybrid companding) that are used in ADM are explained in detail, and their performances are compared. In addition, the performances of ADPCM and ADM as speech coders are compared, and the merits of each coder are discussed.


Real-time Implementation of the 2.4 kbps EHSX Speech Coder Using a TMS320C6701$^{TM}$ DSP Core (TMS320C6701$^{TM}$을 이용한 2.4kbps EHSX 음성 부호화기의 실시간 구현)

  • 양용호;이인성;권오주
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.7C / pp.962-970 / 2004
  • This paper presents an efficient implementation of the 2.4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder on a TMS320C6701$^{TM}$ floating-point digital signal processor. The EHSX speech codec models the excitation signal with harmonic coding or CELP (Code Excited Linear Prediction), respectively, according to whether the frame is voiced or unvoiced speech. In this paper, we present optimization methods that reduce the complexity for real-time implementation. The complexity of the filtering in the CELP algorithm, which accounts for the main part of the EHSX algorithm complexity, can be reduced by converting the program from floating-point variables to fixed-point variables. We also present efficient optimization methods, including code allocation that takes the DSP architecture into account and a low-complexity algorithm for the harmonic/pitch search in the encoder. Finally, we obtained a speech quality of MOS 3.28 in a speech quality test using PESQ (Perceptual Evaluation of Speech Quality), ITU-T Recommendation P.862, and achieved the goal of real-time operation of the EHSX codec.