• Title/Summary/Keyword: Speech codec

A Method of Adaptive ISF Split Vector Quantization Using Normalized Codebook (정규화 코드북을 이용한 분할 벡터 구조의 ISF 적응적 양자화 기법)

  • Piao, Zhigang;Lim, Jong-Ha;Hong, Gi-Bong;Lee, In-Sung
    • The Journal of the Acoustical Society of Korea / v.30 no.5 / pp.265-272 / 2011
  • Most ISF (or LSF) based real-time speech codecs use split vector quantization (SVQ) to reduce quantizer complexity and codebook memory size. The drawback is that the correlation between code vectors cannot be exploited once the vector is split. This paper presents a new adaptive ISF vector quantization method that compensates for this drawback of SVQ-structured quantizers in wideband speech codecs. In each frame, the proposed method exploits the correlation between the split vectors by adaptively changing the codebook distribution according to the ordering property of the ISFs. The algorithm is evaluated in AMR-WB and shows an improvement of about 1.5 bits per frame.
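
The basic SVQ structure that this abstract starts from can be sketched as follows. This is a minimal illustration only: the codebooks, the split layout, and the function names are hypothetical toy values, not those of AMR-WB or of the paper's adaptive method.

```python
# Minimal split vector quantization (SVQ) sketch.
# CODEBOOKS and SPLIT_SIZES are hypothetical toy values, not AMR-WB tables.

def nearest_index(codebook, sub):
    """Return the index of the code vector closest to `sub` (squared error)."""
    def sq_err(code):
        return sum((c - x) ** 2 for c, x in zip(code, sub))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))

def svq_encode(vector, codebooks, split_sizes):
    """Quantize each sub-vector independently with its own codebook."""
    indices, start = [], 0
    for codebook, size in zip(codebooks, split_sizes):
        indices.append(nearest_index(codebook, vector[start:start + size]))
        start += size
    return indices

def svq_decode(indices, codebooks):
    """Concatenate the selected code vectors back into one vector."""
    out = []
    for idx, codebook in zip(indices, codebooks):
        out.extend(codebook[idx])
    return out

SPLIT_SIZES = [2, 2]
CODEBOOKS = [
    [[0.1, 0.2], [0.2, 0.4], [0.3, 0.5]],   # codebook for the first split
    [[0.5, 0.7], [0.6, 0.8], [0.7, 0.9]],   # codebook for the second split
]

isf = [0.21, 0.38, 0.62, 0.79]               # toy, already ordered input
idx = svq_encode(isf, CODEBOOKS, SPLIT_SIZES)
rec = svq_decode(idx, CODEBOOKS)
```

Each split is searched on its own, so the search cost grows with the sum of the codebook sizes rather than their product. The price is the one the abstract names: the second search does not use anything learned from the first.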

Real-time Implementation of Speech and Channel Coder on a DSP Chip for Radio Communication System (무선통신 적용을 위한 단일 DSP칩상의 음성/채널 부호화기 실시간 구현)

  • Kim Jae-Won;Sohn Dong-Chul
    • Journal of the Korea Institute of Information and Communication Engineering / v.9 no.6 / pp.1195-1201 / 2005
  • This paper deals with the procedures and results for real-time implementation of a G.729 speech coder and a channel coder, including a convolutional coder, a Viterbi decoder, and an interleaver, on a fixed-point DSP chip for radio communication systems. We describe the method for real-time implementation based on integer simulation results, and report the implementation results in terms of quality and the complexity required for real-time operation. The speech and channel coders required 24 MIPS and 9 MIPS of computational load and 12 K words and 4 K words of execution code, respectively. The functional evaluation was performed in two steps: first, a bit-exact comparison with fixed-point C code; second, tests with actual speech samples and error test vectors. Unlike other work, in which the coders are implemented individually, we implemented both the speech and channel coders on a single DSP chip with 160 MIPS of computational capability and 64 K words of on-chip memory. These results improve on conventional methods in terms of system complexity and implementation cost for radio communication systems.

Real-time Implementation of the AMR Speech Coder Using OakDSPCore® (OakDSPCore®를 이용한 적응형 다중 비트 (AMR) 음성 부호화기의 실시간 구현)

  • 이남일;손창용;이동원;강상원
    • The Journal of the Acoustical Society of Korea / v.20 no.6 / pp.34-39 / 2001
  • The adaptive multi-rate (AMR) speech coder was adopted as a standard for W-CDMA by 3GPP and ETSI. The AMR coder is based on the CELP algorithm and operates at rates ranging from 12.2 kbps down to 4.75 kbps; it is a source-controlled codec that adapts to the channel error conditions and the traffic loading. In this paper, we implement the DSP S/W of the AMR coder using OakDSPCore. The implementation is based on the CSD17C00A chip developed by C&S Technology, and it is tested for bit-exact implementation using the test vectors for the AMR speech codec provided by ETSI. The DSP S/W requires 20.6 MIPS for the encoder and 2.7 MIPS for the decoder. The memory required by the AMR coder was 21.97 kwords, 6.64 kwords, and 15.1 kwords for the code, data sections, and data ROM, respectively. In addition, an actual sound input/output test using a microphone and speaker demonstrates proper real-time operation without distortion or delay.

Fast Implementation Algorithms for EVRC (EVRC의 고속 구현 알고리듬)

  • 정성교;최용수;김남건;윤대희
    • The Journal of the Acoustical Society of Korea / v.20 no.1 / pp.43-49 / 2001
  • EVRC (Enhanced Variable Rate Codec) has been adopted as a standard coder for the CDMA digital cellular system in North America and Korea, and is known to provide good call quality at 8 kbps. In this paper, fast implementation algorithms for the EVRC encoder are proposed. The proposed algorithms are based on an efficient pitch detection scheme and a fast fixed codebook search algorithm. In the codebook search, computational complexity is reduced to 70% of that of the original EVRC by limiting the number of pulse position combinations and by using a truncated impulse response. The proposed algorithms make it possible to implement EVRC with much less computation. In addition, informal subjective tests confirmed that the speech quality of the proposed method was indistinguishable from that of the original EVRC.

Implementation of a 4-Channel ADPCM CODEC Using a DSP (DSP를 사용한 4채널용 ADPCM CODEC의 실시간 구현에 관한 연구)

  • Lee, Ui-Taek;Lee, Gang-Seok;Lee, Sang-Uk
    • Journal of the Korean Institute of Telematics and Electronics / v.22 no.5 / pp.29-38 / 1985
  • In this paper, we design and implement in real time a simple, efficient, and flexible ADPCM codec using a high-speed digital processor, the NEC 7720. The ADPCM system uses an instantaneous adaptive quantizer and a first-order fixed predictor. Software for the NEC 7720 was developed, and once the program was optimized the NEC 7720 was found capable of performing the entire ADPCM algorithm for 4 channels in real time. Computer simulation was carried out to investigate the computational accuracy of the NEC 7720 and to determine the necessary parameters for the ADPCM codec. Real telephone speech, RC-shaped Gaussian noise, and a 1004 Hz tone signal were used in the simulation, and the parameters were optimized based on the computed SNR and informal listening tests. The developed software was tested in real-time operation using a hardware emulator for the NEC 7720. Encoding took at most 23.25 μs per sample and 113.5 μs for 4 channels, including all the necessary I/O operations. Decoding took 24.75 μs per sample and 119.5 μs for 4 channels.
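
The structure named in this abstract, a first-order fixed predictor with an instantaneously adapted quantizer, can be sketched as below. This is a toy illustration and not the paper's NEC 7720 code: the 3-bit code layout, the predictor coefficient, the Jayant-style step multipliers, and the 8 kHz sampling rate are all assumptions chosen for the example.

```python
import math

# Toy ADPCM: first-order fixed predictor + instantaneously adapted quantizer.
# Every constant below is an illustrative assumption, not a value from the paper.
PRED_COEF = 0.85                        # fixed predictor coefficient (assumed)
MULTIPLIERS = [0.9, 0.9, 1.25, 1.75]    # step multiplier per code magnitude (assumed)
STEP_MIN, STEP_MAX = 1.0, 512.0         # step-size limits (assumed)

class AdpcmState:
    def __init__(self):
        self.prev = 0.0   # previous reconstructed sample
        self.step = 4.0   # current quantizer step size

def _reconstruct(state, code):
    """Rebuild one sample from a 3-bit code (1 sign bit, 2 magnitude bits).

    Encoder and decoder both call this, so their states stay identical."""
    sign = -1.0 if code & 0b100 else 1.0
    mag = code & 0b011
    diff_q = sign * (mag + 0.5) * state.step
    sample = PRED_COEF * state.prev + diff_q
    state.prev = sample
    # Instantaneous adaptation: the step is rescaled after every sample,
    # driven only by the code just transmitted.
    state.step = min(STEP_MAX, max(STEP_MIN, state.step * MULTIPLIERS[mag]))
    return sample

def encode_sample(state, x):
    diff = x - PRED_COEF * state.prev           # prediction error
    mag = min(3, int(abs(diff) / state.step))   # 2-bit magnitude
    code = mag | (0b100 if diff < 0 else 0)
    _reconstruct(state, code)                   # track what the decoder will see
    return code

def decode_sample(state, code):
    return _reconstruct(state, code)

# 1004 Hz test tone, assuming 8 kHz telephone-band sampling.
signal = [100.0 * math.sin(2 * math.pi * 1004 * n / 8000) for n in range(400)]
enc, dec = AdpcmState(), AdpcmState()
codes = [encode_sample(enc, x) for x in signal]
decoded = [decode_sample(dec, c) for c in codes]
```

Because the step size depends only on transmitted codes, the decoder needs no side information to follow the encoder's adaptation.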

Signal Enhancement of a Variable Rate Vocoder with a Hybrid domain SNR Estimator

  • Park, Hyung Woo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.962-977 / 2019
  • The human voice is a convenient means of transferring information between people, between people and machines, and between machines. With the development of information and communication technology, voice can be transmitted farther than before. To communicate, the voice is converted to another form, transmitted, and then converted back to sound. In this communication process, a vocoder is the method of converting and re-converting voice and sound. The CELP (Code-Excited Linear Prediction) type vocoder, one of the voice codecs, has been adopted as a standard codec because it provides high-quality sound even at a relatively low transmission rate. EVRC (Enhanced Variable Rate CODEC) and QCELP (Qualcomm Code-Excited Linear Prediction), both variable bit rate vocoders, are used for mobile phones in the 3G environment. In the real-time implementation of a vocoder, reduced sound quality is a typical problem. To improve the sound quality, it is important to know the magnitude and shape of the noise. Existing sound quality improvement methods detect or use voice activity, or apply statistical methods to large amounts of data. However, they have the disadvantage that noise cannot be detected when the signal is continuous or when the noise changes greatly. This paper focuses on finding a better way to limit the loss of sound quality in low bit rate transmission environments. Based on simulation results, this study proposes a preprocessor application that estimates the SNR (Signal to Noise Ratio) using the spectral SNR estimation method. The SNR estimation method adopts IMBE (Improved Multi-Band Excitation) rather than using the SNR of a continuous speech signal. Finally, this application improves the quality of the vocoder by enhancing sound quality adaptively.

Low-Delay LSF FEC Technique Robust in Lossy VoIP Environment (VoIP 손실 환경에 강인한 저지연 LSF FEC 기법)

  • Yang, Hae-Yong;Lee, Kyung-Hoon;Hwang, In-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.6 / pp.687-695 / 2002
  • Media-specific FEC techniques, proposed to cope with speech packet loss in VoIP, improve speech quality at the expense of an additional one-frame delay. In this paper, we propose a new media-specific FEC, the LSF FEC technique, which improves speech quality with a much shorter additional delay. In the proposed technique, the LSF parameters of the future frame are used to recover a lost packet. To evaluate the performance of the proposed technique, we use the ITU-T G.723.1 and G.729 codecs, apply the Gilbert packet loss model, and estimate the MOS at each packet loss rate using the PESQ speech quality estimation algorithm. The proposed technique shortens the delay by 6.5 ms to 27 ms compared with existing media-specific FEC techniques. Simulation results comparing reconstructed speech quality show that this technique improves the MOS by more than 0.1 in a practical lossy environment with a 5% packet loss rate.
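
The Gilbert packet loss model used in this evaluation is commonly described as a two-state Markov chain, and a loss pattern at a target rate can be generated as in the sketch below. The function names, the burst parameter `p_bg = 0.5`, and the seed are assumptions for illustration; the abstract does not state the paper's model parameters.

```python
import random

def gilbert_loss_pattern(n_packets, p_gb, p_bg, seed=0):
    """Return a list of booleans, True meaning the packet is lost.

    Two-state Markov chain: Good (packet received) and Bad (packet lost).
    p_gb is P(Good -> Bad), p_bg is P(Bad -> Good)."""
    rng = random.Random(seed)
    lost, bad = [], False
    for _ in range(n_packets):
        if bad:
            bad = rng.random() >= p_bg   # stay Bad unless we move back to Good
        else:
            bad = rng.random() < p_gb    # move to Bad with probability p_gb
        lost.append(bad)
    return lost

def stationary_loss_rate(p_gb, p_bg):
    """Long-run fraction of lost packets for the two-state chain."""
    return p_gb / (p_gb + p_bg)

# Pick p_gb so that the long-run loss rate is 5%, with an assumed p_bg = 0.5
# (mean loss-burst length 1 / p_bg = 2 packets).
target, p_bg = 0.05, 0.5
p_gb = target * p_bg / (1.0 - target)
pattern = gilbert_loss_pattern(100_000, p_gb, p_bg)
empirical = sum(pattern) / len(pattern)
```

Unlike independent random loss, this model produces bursts of consecutive losses, which is the case where recovery from a neighboring frame's parameters is hardest.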

Research on Open Source Encoding Technology for MPEG Unified Speech and Audio Coding (MPEG 통합 음성/오디오 코덱을 위한 오픈 소스 부호화 기술에 관한 연구)

  • Song, Jeongook;Lee, Joonil;Kang, Hong-Goo
    • Journal of the Institute of Electronics and Information Engineers / v.50 no.1 / pp.86-96 / 2013
  • Unified Speech and Audio Coding (USAC) is the speech/audio codec with the best quality, approved as a Final Draft International Standard (FDIS) at an MPEG meeting in 2011. Since MPEG conventionally standardizes only the decoder, it is not easy to study encoder technologies. Furthermore, the Reference Model (RM) shows extremely poor performance. To solve these problems, the open source project (JAME) proposes methods that improve the performance of the main encoder technologies in USAC. In particular, this paper introduces the following encoder modules: the signal classifier for selective operation between the two coders, the psychoacoustic model in the frequency domain, and the window transition technology. Finally, the results of the verification test for FDIS and the performance of the Common Encoder are appended.

Improving SVM with Second-Order Conditional MAP for Speech/Music Classification (음성/음악 분류 향상을 위한 2차 조건 사후 최대 확률기법 기반 SVM)

  • Lim, Chung-Soo;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.5 / pp.102-108 / 2011
  • Support vector machines are well known for their outstanding performance in pattern recognition. One example of their application is music/speech classification for a standardized codec such as the 3GPP2 selectable mode vocoder. In this paper, we propose a novel scheme that improves the speech/music classification of support vector machines based on the second-order conditional maximum a posteriori (MAP) criterion. While conventional support vector machine optimization techniques apply during the training phase, the proposed technique can be adopted in the classification phase. The proposed approach can therefore be developed and employed in parallel with conventional optimizations, resulting in a synergistic boost in classification performance. According to experimental results, the proposed algorithm shows its compatibility and potential for improving the performance of support vector machines.

Fine-tuning SVM for Enhancing Speech/Music Classification (SVM의 미세조정을 통한 음성/음악 분류 성능향상)

  • Lim, Chung-Soo;Song, Ji-Hyun;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.2 / pp.141-148 / 2011
  • Support vector machines have been extensively studied and used in the pattern recognition area for years. One interesting application of this technique is music/speech classification for a standardized codec such as the 3GPP2 selectable mode vocoder. In this paper, we propose a novel approach that improves the speech/music classification of support vector machines. While conventional support vector machine optimization techniques apply during the training phase, the proposed technique can be adopted in the classification phase. The proposed approach can therefore be developed and employed in parallel with conventional optimizations, resulting in a synergistic boost in classification performance. We first analyze the impact of the kernel width parameter on the classifications made by support vector machines. From this analysis, we observe that the outputs of support vector machines can be fine-tuned with the kernel width parameter. To make the most of this capability, we identify a strong correlation among neighboring input frames, and use this correlation information as a guide for adjusting the kernel width parameter. According to the experimental results, the proposed algorithm is found to have potential for improving the performance of support vector machines.
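
The observation that the kernel width changes the classifier output at classification time can be illustrated with a small sketch. It assumes an RBF kernel parameterized by a width `sigma`, with hypothetical one-dimensional support vectors and coefficients; it shows only the effect of the width on the decision value, not the paper's rule for adjusting it from neighboring frames.

```python
import math

def rbf_kernel(x, z, width):
    """Gaussian RBF kernel exp(-||x - z||^2 / (2 * width^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def svm_decision(x, support_vectors, coeffs, bias, width):
    """SVM decision value; coeffs[i] is alpha_i * y_i for support vector i."""
    return sum(c * rbf_kernel(x, sv, width)
               for c, sv in zip(coeffs, support_vectors)) + bias

# Hypothetical trained model: one support vector per class.
SUPPORT_VECTORS = [(0.0,), (2.0,)]
COEFFS = [1.0, -1.0]        # positive class first, negative class second
BIAS = 0.0

frame = (0.5,)              # toy feature vector for one input frame
narrow = svm_decision(frame, SUPPORT_VECTORS, COEFFS, BIAS, width=0.5)
wide = svm_decision(frame, SUPPORT_VECTORS, COEFFS, BIAS, width=2.0)
```

With the same support vectors and coefficients, the narrow kernel gives this frame a larger positive decision value than the wide one, so the width acts as a tuning knob on already-trained outputs.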