• Title/Summary/Keyword: Speech codec

Analysis and Implementation of Speech/Music Classification for 3GPP2 SMV Codec Based on Support Vector Machine (SMV코덱의 음성/음악 분류 성능 향상을 위한 Support Vector Machine의 적용)

  • Kim, Sang-Kyun;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.6
    • /
    • pp.142-147
    • /
    • 2008
  • In this paper, we propose a novel approach to improve the performance of speech/music classification for the selectable mode vocoder (SMV) of 3GPP2 using the support vector machine (SVM). The SVM constructs an optimal hyperplane that separates the classes without error while maximizing the distance between the hyperplane and the closest vectors. We first present an effective analysis of the features and the classification method adopted in the conventional SMV. Feature vectors applied to the SVM are then selected from relevant parameters of the SMV for efficient speech/music classification. The performance of the proposed algorithm is evaluated under various conditions and yields better results than the conventional scheme of the SMV.
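
The maximum-margin idea in this abstract can be illustrated with a minimal sketch. It trains a linear soft-margin SVM by subgradient descent on the regularised hinge loss, which is a swapped-in textbook solver and not necessarily the training method used in the paper. The two-dimensional feature values, the labels (+1 for speech, -1 for music), and all hyperparameters are hypothetical and are not taken from the SMV.

```python
# Toy, linearly separable data: two hypothetical per-frame features.
SAMPLES = [(0.2, 0.9), (0.3, 0.8), (0.1, 0.7),   # labelled speech (+1)
           (0.8, 0.2), (0.9, 0.3), (0.7, 0.1)]   # labelled music  (-1)
LABELS = [1, 1, 1, -1, -1, -1]


def score(w, b, x):
    """Signed distance-like score w.x + b of point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.001, lr=0.05, epochs=500):
    """Subgradient descent on lam/2 * |w|^2 + hinge loss (assumed settings)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * score(w, b, x)
            # The regulariser shrinks w, which favours a wide margin.
            w = [wi - lr * lam * wi for wi in w]
            if margin < 1.0:
                # The point lies inside the margin: move the hyperplane.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def classify(w, b, x):
    """Return +1 (speech) or -1 (music) for feature vector x."""
    return 1 if score(w, b, x) >= 0.0 else -1


if __name__ == "__main__":
    w, b = train_linear_svm(SAMPLES, LABELS)
    print(w, b, classify(w, b, (0.25, 0.85)))
```

Only the points closest to the boundary trigger updates once training settles, which mirrors the role of the "closest vectors" mentioned in the abstract.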

Enhancement of Speech/Music Classification for 3GPP2 SMV Codec Employing Discriminative Weight Training (변별적 가중치 학습을 이용한 3GPP2 SVM의 실시간 음성/음악 분류 성능 향상)

  • Kang, Sang-Ick;Chang, Joon-Hyuk;Lee, Seong-Ro
    • The Journal of the Acoustical Society of Korea
    • /
    • v.27 no.6
    • /
    • pp.319-324
    • /
    • 2008
  • In this paper, we propose a novel approach to improve the performance of speech/music classification for the selectable mode vocoder (SMV) of 3GPP2 using discriminative weight training based on the minimum classification error (MCE) algorithm. We first present an effective analysis of the features and the classification method adopted in the conventional SMV. The proposed speech/music decision rule is then expressed as the geometric mean of optimally weighted features selected from the SMV. The performance of the proposed algorithm is evaluated under various conditions and yields better results than the conventional scheme of the SMV.
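
A decision rule built on a weighted geometric mean can be sketched as follows. The sketch only shows how such a rule is evaluated once the weights are known. The MCE training that produces the weights is not shown, and the feature values, the weights, the threshold, and the convention that a larger value means "music" are all assumptions made for illustration.

```python
import math


def weighted_geometric_mean(features, weights):
    """Return prod(f_i ** w_i), evaluated in the log domain for stability.

    Features must be strictly positive; weights are assumed to sum to 1.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    if any(f <= 0.0 for f in features):
        raise ValueError("features must be strictly positive")
    return math.exp(sum(w * math.log(f) for f, w in zip(features, weights)))


def is_music(features, weights, threshold=1.0):
    """Hypothetical rule: decide 'music' when the mean exceeds the threshold."""
    return weighted_geometric_mean(features, weights) > threshold


if __name__ == "__main__":
    # Two hypothetical features with equal weights.
    print(weighted_geometric_mean([4.0, 1.0], [0.5, 0.5]))
    print(is_music([0.25, 1.0], [0.5, 0.5]))
```

Working in the log domain turns the weighted product into a weighted sum, which is also the form in which weights are usually adjusted during gradient-based training.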

A Transcoding Algorithm between EVRC and G.729A (EVRC와 G.729A 간의 상호부호화)

  • Kwon Goo-Rak;Ko Sung-Jea
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.3 s.309
    • /
    • pp.54-60
    • /
    • 2006
  • This paper presents an effective algorithm for transcoding between the Enhanced Variable Rate Codec (EVRC) and G.729A. The simplest way to communicate between heterogeneous speech networks is the cascade connection of two different codecs, called tandem coding. However, tandem coding not only imposes a high computational load but also introduces a long delay. These problems can be solved by using a transcoding algorithm. The proposed algorithm consists of LSP (Line Spectral Pair) conversion, pitch delay conversion, and a delay reduction algorithm. Experimental results show that the proposed algorithm achieves lower computational complexity, shorter algorithmic delay, and similar speech quality compared with the tandem algorithm.

A Performance Analysis of the Speech Coders for Digital Mobile Radio (디지털 이동통신을 위한 음성 부호기의 성능 분석)

  • 정영모;이상욱
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.4
    • /
    • pp.491-501
    • /
    • 1990
  • Recently, four speech coding techniques, namely SBC-APCM (sub-band coding adaptive PCM), RPE-LPC (regular pulse excitation linear predictive codec), MPE-LTP (multi-pulse excited long-term prediction), and CELP (code-excited linear prediction), have been proposed for digital mobile radio applications. However, a performance comparison of these coders in the Rayleigh fading environment has not yet been made. In this paper, the performance of the four speech coders is investigated in random bit error and burst error environments. For the channel coding of SBC-APCM, RPE-LPC, and MPE-LTP, the sensitivity of the output bit stream is measured and bit-selective forward error correction is provided according to the measured bit sensitivity. In an attempt to improve the performance of CELP, an optimum quantizer is applied to the scalar quantities transmitted in CELP; however, the improvement over the conventional approach is found to be negligible. For the channel coding of CELP, among the Reed-Solomon code, the Golay code, and the convolutional code, the convolutional code of rate 1/2 shows the best performance. Finally, from the simulation results, it is concluded that CELP is the best candidate for digital mobile radio, followed by MPE-LTP, SBC-APCM, and RPE-LPC.

Performance Improvement of the QCELP using an Efficient LSF Coding (효율적인 LSF 양자화기를 이용한 QCELP 성능개선)

  • Kim, Hae-Jin;Kang, Sang-Won
    • The Journal of the Acoustical Society of Korea
    • /
    • v.16 no.1
    • /
    • pp.10-15
    • /
    • 1997
  • In this paper, an efficient LSF quantizer, named improved PSVQ (IPSVQ), is proposed for use in the 8 kbps QCELP speech coder. By using a 27-bit IPSVQ instead of the 40-bit DPCM quantizer per frame, we save 13 bits/frame and allocate those bits to the codebook gain and pitch gain parameters, thereby improving the overall performance of the QCELP codec. The enhanced QCELP shows a performance improvement of 0.9 dB in SNR and 0.4 dB in SEGSNR. Informal listening tests also confirm the improvement in speech quality.

RSA - QoS: A Resource Loss Aware Scheduling Algorithm for Enhancing the Quality of Service in Mobile Networks

  • Ramkumar, Krishnamoorthy;Newton, Pitchai Calduwel
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.12
    • /
    • pp.5917-5935
    • /
    • 2018
  • The Adaptive Multi-Rate codec is one of the codecs used for making voice calls. It helps to connect people who are scattered across various geographical areas, and it adjusts its bit rate according to the user's channel conditions. It plays a vital role in providing improved speech quality for voice connections in Long Term Evolution (LTE). Some constraints need to be addressed to provide this service profitably. Quality of Service (QoS) is the dominant mechanism that determines the quality of speech in communication. On several occasions, a number of users try to access the same channel simultaneously while staying in a particular region for a long period of time. This is the multi-user channel sharing problem, which very often leads to resource loss. The main aim of this paper is to develop a novel RSA - QoS scheduling algorithm that reduces the Resource Loss Ratio and thereby increases throughput. The simulation results show that RSA - QoS lets more users access the resources and outperforms the existing algorithms in terms of resource loss and throughput. Ultimately, it enhances the QoS in mobile networks.

Speech Packet Transmission Using the AMR-WB Coder with FEC (FEC기능을 추가한 AMR-WB 음성 부호화기를 이용한 음성 패킷 전송)

  • 황정준;이인성
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.40 no.11
    • /
    • pp.63-71
    • /
    • 2003
  • This paper proposes a packet loss recovery method for real-time communication over the Internet. The major cause of speech quality degradation in IP networks is packet loss. To reduce the effects of packet loss, Forward Error Correction (FEC), which adds redundant information to voice packets, can be used. We use the Adaptive Multi-Rate Wideband (AMR-WB) codec, which was recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third-generation WCDMA mobile communication system and has also been standardized by ITU-T for wideband speech services. We recover single lost packets by using the FEC method and conceal consecutive errors. The proposed scheme is evaluated in the Gilbert Internet channel model. High audio quality is maintained at packet loss rates of up to 30%.
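
The interplay between a bursty loss channel and single-packet recovery can be sketched as follows. This is a minimal simulation under stated assumptions: a two-state Gilbert model in which every packet sent in the bad state is lost, and a hypothetical piggyback FEC in which each packet carries a redundant copy of the previous one, so a lost packet is recovered only if the next packet arrives. The abstract does not specify the FEC format or the channel parameters, so the function names and transition probabilities here are illustrative only.

```python
import random


def gilbert_losses(n, p_good_to_bad, p_bad_to_good, seed=0):
    """Simulate n packets through a two-state Gilbert channel.

    Returns a list of booleans; True means the packet was lost.
    """
    rng = random.Random(seed)
    lost = []
    bad = False  # start in the good (no-loss) state
    for _ in range(n):
        if bad:
            # Stay in the bad state unless the channel recovers.
            bad = rng.random() >= p_bad_to_good
        else:
            bad = rng.random() < p_good_to_bad
        lost.append(bad)
    return lost


def residual_losses(lost):
    """Mark packets still missing after piggyback FEC (assumed scheme).

    Packet i is recovered when packet i + 1 arrived, because packet i + 1
    is assumed to carry a redundant copy of packet i.
    """
    residual = []
    for i, is_lost in enumerate(lost):
        next_arrived = i + 1 < len(lost) and not lost[i + 1]
        residual.append(is_lost and not next_arrived)
    return residual


if __name__ == "__main__":
    pattern = gilbert_losses(100000, 0.1, 0.5, seed=1)
    after_fec = residual_losses(pattern)
    print(sum(pattern) / len(pattern), sum(after_fec) / len(after_fec))
```

In this model the long-run loss rate is p_good_to_bad / (p_good_to_bad + p_bad_to_good). Isolated losses are repaired by the redundancy, while the packets inside a burst remain missing and would have to be handled by concealment.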

MPEG-D USAC: Unified Speech and Audio Coding Technology (MPEG-D USAC: 통합 음성 오디오 부호화 기술)

  • Lee, Tae-Jin;Kang, Kyeong-Ok;Kim, Whan-Woo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.7
    • /
    • pp.589-598
    • /
    • 2009
  • As mobile devices become multi-functional and converge into a single platform, there is a strong need for a codec that can provide consistent quality for both speech and music content. MPEG-D USAC standardization activities started at the 82nd MPEG meeting with a CfP, and WD3 was approved at the 88th MPEG meeting. MPEG-D USAC is a converged technology of AMR-WB+ and HE-AAC V2. Specifically, USAC utilizes three core codecs (AAC, ACELP, and TCX) for low-frequency regions, SBR for high-frequency regions, and the MPEG Surround tool for stereo information. USAC can provide consistent sound quality for both speech and music content and can be applied to various applications such as multimedia download to mobile devices, digital radio, mobile TV, and audio books.

A study on a fast algorithm for the LSP coefficient quantization of G.723.1 speech codec (G.723.1 음성 부호화기의 LSE 계수 양자화를 위한 고속화 알고리즘 연구)

  • Son Chang-yong;Sung Ho-sang;Kang Sang-won;Sung Yu-na
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.153-156
    • /
    • 2000
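  • In this paper, we propose an algorithm that speeds up the line spectral frequency (LSF) coefficient quantization of the G.723.1 coder, which is used to compress speech or audio signals at low bit rates in multimedia services. The proposed fast search method exploits the ordering property of the LSF coefficients to narrow the codebook search range, which greatly reduces the computational load. When the proposed fast search method was applied to G.723.1, which has a predictive split VQ (PSVQ) structure, the average codebook search range in the search for the optimal codevector decreased by $20.1\%$ with no degradation in spectral distortion (SD) performance and no additional memory. As a result, the numbers of additions, subtractions, multiplies, and comparisons decreased by $19.1\%$, $20.1\%$, $19.4\%$, and $12.2\%$, respectively.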
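
How an ordering constraint can shrink a codebook search is illustrated by the sketch below. It is a simplified illustration and does not reproduce the paper's algorithm: it uses a plain (non-predictive) split VQ, assumes the codebook is sorted by the first element of each codevector, and skips every codevector whose first element would not exceed the last quantized LSF of the preceding sub-vector. The codebook values and function names are hypothetical.

```python
import bisect


def nearest(codebook, target, start=0):
    """Full search over codebook[start:]; returns (index, squared error).

    Returns index -1 if the range to search is empty.
    """
    best_index, best_error = -1, float("inf")
    for i in range(start, len(codebook)):
        error = sum((c - t) ** 2 for c, t in zip(codebook[i], target))
        if error < best_error:
            best_index, best_error = i, error
    return best_index, best_error


def ordered_search(codebook, target, prev_last):
    """Search only codevectors that keep the LSFs in increasing order.

    Assumes the codebook is sorted by the first element of each codevector.
    Returns (best index, number of codevectors actually searched).
    """
    first_elements = [c[0] for c in codebook]
    # Entries before 'start' would place an LSF at or below prev_last.
    start = bisect.bisect_right(first_elements, prev_last)
    best_index, _ = nearest(codebook, target, start)
    return best_index, len(codebook) - start


if __name__ == "__main__":
    toy_codebook = [(0.10, 0.20), (0.30, 0.45), (0.50, 0.65), (0.70, 0.85)]
    print(ordered_search(toy_codebook, (0.52, 0.60), prev_last=0.35))
```

In a predictive structure such as PSVQ the codebook stores prediction residuals, so the ordering check would have to be applied after the predicted value is added back. That step is omitted here.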

Real-time implementation of the G.728 speech codec using the Vincent6 DSP core (Vincent6 DSP코어를 이용한 G.728 음성 부호화기의 실시간 구현)

  • 성호상
    • Proceedings of the IEEK Conference
    • /
    • 2000.09a
    • /
    • pp.131-135
    • /
    • 2000
  • In this paper, the ITU-T G.728 speech coder is implemented in real time on the Vincent6 core [1], a high-performance fixed-point DSP (Digital Signal Processor) core. G.728 is an ITU-T standard speech coder with a bit rate of 16 kb/s; its input is a PCM signal sampled at 8 kHz and quantized to 16 bits per sample. G.728 is also known as LD-CELP (Low Delay Code Excited Linear Prediction) and has an algorithmic delay of 0.625 ms. The Vincent6 DSP core has a VLIW (Very-Long Instruction Word) architecture and can therefore execute multiple instructions at once. To exploit this, the code was first written in fixed-point arithmetic based on G.728 Annex G and then implemented in Vincent6 assembly code. The final implementation produces bit-exact results for the ITU-T test vectors, requires 34 MCPS (Million Cycles Per Second), and uses about 9 KByte of data memory and about 57 KByte of program memory.
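
The figures quoted in this abstract fit together arithmetically, as the short sketch below shows. The sample rate, bit rate, PCM word length, and MCPS value come from the abstract. The 5-sample vector length is taken from the G.728 standard and is not stated in the abstract, and the function name is hypothetical.

```python
def codec_figures(sample_rate_hz, bit_rate_bps, pcm_bits, vector_samples, mcps):
    """Derive per-sample and per-vector figures from headline codec numbers."""
    bits_per_sample = bit_rate_bps / sample_rate_hz
    return {
        # Coded bits spent on each input sample.
        "bits_per_sample": bits_per_sample,
        # Compression relative to the uncompressed PCM input.
        "compression_ratio": pcm_bits / bits_per_sample,
        # Coded bits per excitation vector.
        "bits_per_vector": bits_per_sample * vector_samples,
        # Duration of one vector, which sets the algorithmic delay.
        "vector_duration_ms": 1000.0 * vector_samples / sample_rate_hz,
        # Processor cycles available for each vector at the reported load.
        "cycles_per_vector": mcps * 1e6 * vector_samples / sample_rate_hz,
    }


# 8 kHz input, 16 kb/s output, 16-bit PCM, 5-sample vectors (from the
# G.728 standard, an assumption relative to the abstract), 34 MCPS.
g728 = codec_figures(8000, 16000, 16, 5, 34)

if __name__ == "__main__":
    for name, value in g728.items():
        print(name, value)
```

The 0.625 ms algorithmic delay corresponds to one 5-sample vector at 8 kHz, and the 16 kb/s rate corresponds to 10 bits per vector.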
