• Title/Summary/Keyword: Speech Code

On Predictive Coding of Speech Signals (음성신호의 예측부호화에 관하여)

  • 은종관
    • The Magazine of the IEIE
    • /
    • v.12 no.5
    • /
    • pp.23-35
    • /
    • 1985
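  • This paper describes predictive coding methods used in digital speech communication. It focuses on adaptive differential pulse code modulation (ADPCM) and adaptive delta modulation (ADM), which are widely used at transmission rates of 16 to 48 kbit/s. It also describes variable-rate ADPCM and ADM, and discusses the behavior of these systems over noisy channels, methods for improving their performance, and the problems that arise in transcoding to and from PCM. Following its recent standardization by the CCITT, ADPCM is expected to be widely used alongside PCM, while ADM is expected to be widely used in special-purpose communications because the system is simple and robust to channel errors.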
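
As a rough illustration of the ADM idea in the abstract above, the following is a minimal sketch of one-bit adaptive delta modulation. The adaptation rule (multiply the step by `k` when two consecutive bits agree, divide by `k` when they differ) and all parameter values are assumptions chosen for illustration; they are not taken from the paper.

```python
def _next_step(step, bit, prev_bit, k, step_min, step_max):
    """Grow the step on repeated bits (slope overload), shrink it on alternation."""
    if prev_bit is None:
        return step
    step = step * k if bit == prev_bit else step / k
    return min(max(step, step_min), step_max)


def adm_encode(samples, step_min=0.01, step_max=1.0, k=1.5):
    """Encode samples into one bit each: 1 if the input is at or above the estimate."""
    bits, estimate, step, prev_bit = [], 0.0, step_min, None
    for x in samples:
        bit = 1 if x >= estimate else 0
        step = _next_step(step, bit, prev_bit, k, step_min, step_max)
        estimate += step if bit else -step  # mirror what the decoder will do
        bits.append(bit)
        prev_bit = bit
    return bits


def adm_decode(bits, step_min=0.01, step_max=1.0, k=1.5):
    """Rebuild the staircase approximation from the bit stream alone."""
    out, estimate, step, prev_bit = [], 0.0, step_min, None
    for bit in bits:
        step = _next_step(step, bit, prev_bit, k, step_min, step_max)
        estimate += step if bit else -step
        out.append(estimate)
        prev_bit = bit
    return out
```

Because the decoder derives the step size from the received bits only, no side information is transmitted, which is one reason such a scheme stays simple.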

On a Reduction of Codebook Searching Time by using RPE Searching Technique in the CELP Vocoder (RPE 검색을 이용한 CELP 보코더의 불규칙 코드북 검색)

  • 김대식
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1995.06a
    • /
    • pp.141-145
    • /
    • 1995
  • Code excited linear prediction (CELP) speech coders exhibit good performance at data rates as low as 4800 bps. The major drawback of CELP-type coders is their large computational requirement. In this paper, we propose a new codebook search method that preserves the quality of the CELP vocoder with reduced complexity. The basic idea is to restrict the search range of the random codebook by using the search technique of regular pulse excitation (RPE). Applying the proposed method to the CELP vocoder reduces the complexity of the codebook search by approximately 48%.
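
To make the cost of the codebook search concrete, here is a minimal sketch of a generic analysis-by-synthesis search that picks the codevector maximizing the squared correlation with the target divided by the filtered codevector's energy. The `candidates` argument is a hypothetical stand-in for a preselected index subset; how the paper derives that subset from regular pulse excitation is not described in the abstract and is not modeled here.

```python
def convolve_truncated(codevector, h):
    """Filter a codevector with impulse response h, truncated to the frame length."""
    n = len(codevector)
    y = [0.0] * n
    for i, c in enumerate(codevector):
        for j, hj in enumerate(h):
            if i + j < n:
                y[i + j] += c * hj
    return y


def search_codebook(target, codebook, h, candidates=None):
    """Return (index, gain) of the best codevector among the candidate indices."""
    indices = range(len(codebook)) if candidates is None else candidates
    best_index, best_gain, best_score = None, 0.0, -1.0
    for i in indices:
        y = convolve_truncated(codebook[i], h)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero codevector cannot match anything
        corr = sum(t * v for t, v in zip(target, y))
        score = corr * corr / energy  # error reduction achieved by this entry
        if score > best_score:
            best_index, best_gain, best_score = i, corr / energy, score
    return best_index, best_gain
```

Each candidate costs one filtering pass and two inner products, so the total work scales with the number of indices visited; restricting the search range reduces the cost in proportion.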

A Comparative Performance Study of Speech Coders for Three-Way Conferencing in Digital Mobile Communication Networks (이동통신망에서 삼자회의를 위한 음성 부호화기의 성능에 관한 연구)

  • Lee, Mi-Suk;Lee, Yun-Geun;Kim, Gi-Cheol;Lee, Hwang-Su;Jo, Wi-Deok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.14 no.1E
    • /
    • pp.30-38
    • /
    • 1995
  • In this paper, we evaluate the performance of vocoders for three-way conferencing using the signal summation technique in a digital mobile communication network. The signal summation technique yields a natural mode of three-way conferencing, in which the mixed voice signal from two speakers is transmitted to a third person, though no speech coding technique has yet proved useful for mixed voice signals. We set up Qualcomm code excited linear prediction (QCELP), vector sum excited linear prediction (VSELP), and regular pulse excitation-long term prediction (RPE-LTP) vocoders to provide three-way conferencing using the signal summation technique. In addition, because conventional speech quality measures are not applicable to vocoders handling mixed voice signals, we propose two subjective quality measures: the sentence discrimination (SD) test and the modified degraded mean opinion score (MDMOS) test. The experimental results show that the output speech quality of the VSELP vocoder is superior to that of the other two.
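
The signal summation step described above can be sketched as sample-wise addition of the two speakers' decoded signals before re-encoding for the third party. The 16-bit PCM format and the saturation to the valid range are assumptions made for this illustration; the abstract does not specify them.

```python
PCM16_MIN, PCM16_MAX = -32768, 32767


def mix_pcm16(speaker_a, speaker_b):
    """Sum two 16-bit PCM sample sequences, saturating instead of wrapping around."""
    length = max(len(speaker_a), len(speaker_b))
    mixed = []
    for i in range(length):
        a = speaker_a[i] if i < len(speaker_a) else 0  # pad the shorter signal
        b = speaker_b[i] if i < len(speaker_b) else 0
        mixed.append(max(PCM16_MIN, min(PCM16_MAX, a + b)))
    return mixed
```

The mixed signal contains two overlapping voices, which is why a vocoder built around a single-speaker model may degrade on it.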

Design of Channel Coding Combined with 2.4kbps EHSX Coder (2.4kbps EHSX 음성부호화기와 결합된 채널코딩 방법)

  • Lee, Chang-Hwan;Kim, Young-Joon;Lee, In-Sung
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.9
    • /
    • pp.88-96
    • /
    • 2010
  • We propose an efficient channel coding method combined with a 2.4 kbps speech coder. The channel coder has a code rate of 1/2, and the rate-1/2 convolutional code is obtained by puncturing a rate-1/3 convolutional code. The punctured convolutional code is used for variable rate allocation: the puncturing pattern is chosen according to the importance of the source encoder's output bits, which is analyzed by evaluating the bit error sensitivity of the speech parameter bits. The performance of the proposed coder is analyzed and simulated over Rayleigh fading and AWGN channels. Experimental results with the 2.4 kbps EHSX coder show that the variable-rate channel coding method yields better subjective speech quality than the non-variable channel coding method.
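
The puncturing step can be sketched as follows: a rate-1/3 encoder emits three bits per input bit, and a periodic mask drops one of them so that two remain, giving rate 1/2. The generator polynomials (5, 7, 7 in octal, constraint length 3) and the mask are assumptions chosen for illustration, not the ones used in the paper.

```python
GENERATORS = (0b101, 0b111, 0b111)  # assumed rate-1/3 mother code
PUNCTURE = ((1, 1, 0), (1, 0, 1))   # assumed mask: keep 2 of 3 outputs per input bit


def encode_rate_third(bits):
    """Return one (c0, c1, c2) triple per input bit."""
    state, triples = 0, []
    for b in bits:
        state = ((state << 1) | b) & 0b111  # newest bit in the least significant position
        triples.append(tuple(bin(state & g).count("1") & 1 for g in GENERATORS))
    return triples


def puncture(triples, pattern=PUNCTURE):
    """Drop the masked-out coded bits; the pattern repeats over the input."""
    kept = []
    for t, triple in enumerate(triples):
        mask = pattern[t % len(pattern)]
        kept.extend(bit for bit, keep in zip(triple, mask) if keep)
    return kept
```

For example, `puncture(encode_rate_third([1, 0, 1, 1]))` yields eight coded bits for four input bits. A variable-rate scheme would select a mask that deletes fewer bits for the error-sensitive speech parameters and more for the rest.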

A Study on the Automatic Lexical Acquisition for Multi-linguistic Speech Recognition (다국어 음성 인식을 위한 자동 어휘모델의 생성에 대한 연구)

  • 지원우;윤춘덕;김우성;김석동
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.6
    • /
    • pp.434-442
    • /
    • 2003
  • Software internationalization, the process of making software easier to localize for specific languages, has deep implications when applied to speech technology, where the goal of the task lies in the very essence of the particular language. A great deal of work and fine-tuning has gone into language processing software based on ASCII or a single language, say English, which makes porting to other languages difficult. The inherent identity of a language manifests itself in its lexicon, where its character set, phoneme set, and pronunciation rules are revealed. We propose a decomposition of the lexicon building process into four discrete and sequential steps. As preprocessing for building a lexical model, we convert text from the language-specific character code to Unicode. The four steps are then: (1) transliterating code points from Unicode, (2) phonetically standardizing rules, (3) implementing grapheme-to-phoneme rules, and (4) implementing phonological processes.
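
The preprocessing and the first step can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: EUC-KR as the language-specific code, Korean as the language, and Unicode character names as a stand-in for a transliteration table. Steps 2 to 4 are not modeled, and the function names are hypothetical.

```python
import unicodedata


def to_unicode(raw, encoding="euc_kr"):
    """Preprocessing: convert language-specific bytes to a Unicode string."""
    return raw.decode(encoding)


def transliterate(text):
    """Step 1: map code points to language-neutral symbols.

    Hangul syllables are decomposed into jamo (NFD), and each jamo is
    labeled by the last word of its Unicode character name.
    """
    symbols = []
    for ch in unicodedata.normalize("NFD", text):
        name = unicodedata.name(ch, "")
        if name.startswith("HANGUL "):
            symbols.append(name.split()[-1].lower())
        elif not ch.isspace():
            symbols.append(ch)  # pass other characters through unchanged
    return symbols
```

With this sketch, `transliterate("한")` returns `['hieuh', 'a', 'nieun']`, one symbol per initial consonant, vowel, and final consonant.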

Performance Evaluation of Reverse Link for Speech and Data Traffic in CDMA-Based IMT-2000 System (CDMA 방식의 IMT-2000 시스템에서 음성 및 데이터 트래픽에 대한 역방향링크의 성능 평가)

  • Lee, Hyun;Kang, Bob-Joo;You, Young-Gap;Cho, Kyoung-Rok
    • The Journal of Korean Institute of Electromagnetic Engineering and Science
    • /
    • v.11 no.4
    • /
    • pp.657-665
    • /
    • 2000
  • In this study, the bit error rate (BER) performance for speech and data traffic is evaluated through reverse link simulations of a CDMA-based IMT-2000 system. The reverse link simulations are carried out for the indoor, pedestrian, and vehicular environments specified by ITU-R. In these simulations, fast power control at a rate of 1.6 kHz is applied. The amplitude and phase of the fading signal are estimated using a 5-tap FIR filter, and soft-decision Viterbi and Reed-Solomon (RS) decoding are applied. The simulation results provide the optimum ratio of pilot power to traffic power, the BER performance as a function of the number of fingers, and a performance comparison between the convolutional code and the concatenated code at a BER of $10^{-6}$ in a 5 MHz system.

A method of wall absorption treatment for enhancing the speech intelligibility at a directional microphone array in a room (실내 공간 내 지향성 마이크 어레이에서의 음성 명료도 개선을 위한 벽면 흡음 처리 방법)

  • Ko, Byeong-Yun;Ih, Jeong-Guon;Cho, Wan-Ho
    • The Journal of the Acoustical Society of Korea
    • /
    • v.40 no.6
    • /
    • pp.649-659
    • /
    • 2021
  • Wall absorption treatment effectively reduces reverberation, but it requires a large area in a live room, and the absorption on each wall affects speech intelligibility differently. In this study, we try to find the most effective wall for absorption treatment, in terms of speech intelligibility, when a beamforming microphone array is used. The absorption importance factor is defined using the number of collisions of reflected sounds on each wall. It allows estimating how much the speech signal will be enhanced by the absorption treatment. A cuboid room with a volume of 107 m³ and a reverberation time of 1.1 s is selected for the simulation. When a Helmholtz-type absorber is applied to the wall with the largest importance factor, the modified clarity at 500 Hz and 1 kHz improves by 5.1 dB and 4.8 dB, respectively, and the speech transmission index improves by 0.06. The difference in results between the proposed method and a commercial simulation code is less than one just-noticeable difference (JND). Absorption treatment on the wall with the largest importance factor yields a greater improvement than treatment on the wall with the largest area, and the difference exceeds one JND.

A Real-time Implementation of G.729.1 Codec on an ARM Processor for the Improvement of VoWiFi Voice Quality (VoWiFi 음질 향상을 위한 G.729.1 광대역 코덱의 ARM 프로세서에의 실시간 구현)

  • Park, Nam-In;Kang, Jin-Ah;Kim, Hong-Kook
    • 한국HCI학회:학술대회논문집
    • /
    • 2008.02a
    • /
    • pp.230-235
    • /
    • 2008
  • This paper addresses issues associated with the real-time implementation of a wideband speech codec, ITU-T G.729.1, on an ARM processor in order to improve the voice quality of a VoWiFi service. The real-time implementation involves optimizing the C source code of G.729.1 and replacing several parts of the codec algorithm with faster ones. The performance of the implementation is measured by the CPU time spent on G.729.1 on the ARM926EJ processor used in a VoWiFi phone. The experiments show that the G.729.1 codec runs in real time with better voice quality than the G.729 codec conventionally used in VoIP or VoWiFi phones.

Clustering In Tied Mixture HMM Using Homogeneous Centroid Neural Network (Homogeneous Centroid Neural Network에 의한 Tied Mixture HMM의 군집화)

  • Park Dong-Chul;Kim Woo-Sung
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.31 no.9C
    • /
    • pp.853-858
    • /
    • 2006
  • The tied mixture hidden Markov model (TMHMM) is an important approach to reducing the number of free parameters in speech recognition. However, this model suffers from a degradation in recognition accuracy due to clustering errors in its Gaussian probability density functions (GPDFs). This paper proposes a clustering algorithm, called the homogeneous centroid neural network (HCNN), to cluster acoustic feature vectors in the TMHMM. The HCNN uses a heterogeneous distance measure to allocate more code vectors to the heterogeneous areas where the probability densities of different states overlap. When applied to Korean isolated digit recognition, the HCNN reduces the error rate by 9.39% relative to CNN clustering and by 14.63% relative to traditional K-means clustering.

Improving LD-CELP using frame classification and modified synthesis filter (프레임 분류와 합성필터의 변형을 이용한 적은 지연을 갖는 음성 부호화기의 성능)

  • 임은희;이주호;김형명
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.6
    • /
    • pp.1430-1437
    • /
    • 1996
  • A low-delay code excited linear predictive (LD-CELP) speech coder at bit rates under 8 kbps is considered. We try to improve the performance of the speech coder by modifying the synthesis filter according to the frame type. We first classify frames into three groups: voiced, unvoiced, and onset. For voiced and unvoiced frames, the spectral envelope of the synthesis filter is adapted to the phonetic characteristics. For onset frames, which mark the transition from unvoiced to voiced, a synthesis filter interpolated with the bias filter is used. The proposed vocoder produces clearer sound than existing LD-CELP vocoders at a similar delay.
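
The three-way frame classification mentioned above can be sketched with two classic features, short-time energy and zero-crossing rate. The features, the thresholds, and the rule that a voiced frame following a non-voiced one counts as an onset are assumptions for illustration; the abstract does not state how the paper classifies frames.

```python
def frame_features(frame):
    """Return (mean energy, zero-crossing rate) of one frame of samples."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return energy, crossings / (len(frame) - 1)


def classify_frames(frames, energy_thresh=0.01, zcr_thresh=0.3):
    """Label each frame as 'voiced', 'unvoiced', or 'onset'."""
    labels, prev_voiced = [], False
    for frame in frames:
        energy, zcr = frame_features(frame)
        voiced = energy > energy_thresh and zcr < zcr_thresh
        if voiced and not prev_voiced:
            labels.append("onset")  # first voiced frame after a non-voiced one
        elif voiced:
            labels.append("voiced")
        else:
            labels.append("unvoiced")
        prev_voiced = voiced
    return labels
```

A coder could then select a differently adapted synthesis filter for each label.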