Search | Korea Science

Audio /Speech Codec Using Variable Delay MDCT/IMDCT (가변 지연 MDCT/IMDCT를 이용한 오디오/음성 코덱)

Sangkil Lee;In-Sung Lee
- The Journal of Korea Institute of Information, Electronics, and Communication Technology, v.16 no.2, pp.69-76, 2023
A high-quality audio/speech codec based on the MDCT/IMDCT can perfectly reconstruct the current frame through overlap-add with the previous frame. This overlap-add introduces an algorithmic delay equal to the frame length. In this paper, we propose an MDCT/IMDCT process that reduces the algorithmic delay by using a variable phase shift. A low-delay audio/speech codec is then obtained by applying the low-delay MDCT/IMDCT algorithm to the ITU-T standard codec G.729.1. The algorithmic delay of the MDCT/IMDCT process can be reduced from 20 ms to 1.25 ms. The decoded output of the audio/speech codec with the low-delay MDCT/IMDCT is evaluated with PESQ, an objective quality measure. Despite the reduction in transmission delay, no difference in sound quality from the conventional method was found.
https://doi.org/10.17661/jkiiect.2023.16.2.69
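
To make the overlap-add dependency in the abstract above concrete, here is a minimal sketch of the conventional MDCT/IMDCT round trip. It is not the variable-delay method proposed in the paper. The sine window, the hop size `N`, and all function names are assumptions chosen for illustration. Each hop of `N` samples is only fully reconstructed once the next overlapping block has been decoded, which is where the one-frame algorithmic delay comes from.

```python
import math

def sine_window(N):
    # Length-2N sine window; satisfies w[n]^2 + w[n+N]^2 = 1 (Princen-Bradley).
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, window):
    # 2N windowed input samples -> N transform coefficients.
    N = len(block) // 2
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]

def imdct(coeffs, window):
    # N coefficients -> 2N windowed, time-aliased output samples.
    N = len(coeffs)
    return [
        2.0 / N * window[n]
        * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
              for k in range(N))
        for n in range(2 * N)
    ]

def mdct_roundtrip(signal, N):
    # Blocks of 2N samples advance by N; overlap-add cancels the aliasing.
    window = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        y = imdct(mdct(signal[start:start + 2 * N], window), window)
        for n in range(2 * N):
            out[start + n] += y[n]
    return out

signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(64)]
restored = mdct_roundtrip(signal, 8)
# Samples covered by two overlapping blocks (indices 8..55) are recovered;
# the first and last 8 samples stay aliased because they lack a partner block.
```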

Audio Data Transmission Based on The Wavelet Transform for ZigBee Applications (ZigBee 응용을 위한 웨이블릿변환 기반 오디오 데이터 전송)

Chen, Zhenxing;Choi, Eun Chang;Huh, Jae Doo;Kang, Seog Geun
- IEMEK Journal of Embedded Systems and Applications, v.2 no.1, pp.31-42, 2007
A transform coding scheme for the transmission of audio data in ZigBee-based wireless personal area networks (WPANs) is presented in this paper. The wavelet transform is used to encode the features of audio data, which are concentrated mainly in the low-frequency region. It is confirmed that the presented scheme recovers the original audio signals accurately while transmitting binary data compressed to 37.5% of the data generated without the coding scheme. In particular, the mean-squared error between the recovered and original audio data approaches $10^{-4}$ when the signal-to-noise power ratio is sufficiently high. Hence, the presented wavelet-based coding scheme can potentially be applied to high-quality audio data transmission services in small-scale ZigBee-based sensor networks. This result may serve as a basis for updating the technical specifications of ZigBee and developing its applications in WPANs.

Performance Evaluation of Frame Erasure Concealment Algorithms in VoIP Coders (VoIP 코더들의 프레임손실은닉 알고리즘 성능평가)

Han, Seung-Ho;Moon, Kwang;Han, Min-Soo
- Proceedings of the KSPS conference, 2004.05a, pp.235-238, 2004
Frame erasures degrade speech quality in wireless communication networks and packet networks. The degradation becomes worse when consecutive frame erasures occur. Speech coders have a frame erasure concealment (FEC) mechanism to compensate for frame erasures, so it is meaningful to evaluate how FEC mechanisms perform under the frame erasures that occur in communication networks. In this paper, various frame erasure patterns are designed, and the FEC algorithms of speech coders are evaluated and analyzed with the Perceptual Evaluation of Speech Quality (PESQ). The performance is found to vary with frame erasure type, frame erasure rate, and utterance length.

Audio Coder Using an Adaptive Wavelet packet Decomposition and Psychoacoustic (적응 웨이블릿 패킷을 이용한 오디오 부호화기와 심리음향 모델링)

김준성
- Proceedings of the Acoustical Society of Korea Conference, 1998.06c, pp.245-248, 1998
In this paper, a new variable wavelet packet decomposition audio coder, based on the time-varying characteristics of audio signals, is proposed, together with a technique for incorporating psychoacoustic models into an adaptive wavelet packet scheme. The proposed filterbank addresses two shortcomings: the polyphase filterbank cannot properly represent the critical bands, and the QMF-tree filterbank is highly complex to implement. The filterbank uses a varying number of subbands, from 4 to 26, and a Daubechies wavelet of order 6. The codec yields excellent quality at total bit rates of about 128 kbps for monophonic CD-quality signals with a sampling frequency of 44.1 kHz, and reduces the complexity of the encoding and decoding processes by 19% across various bit rates and sources.

A Study on the MDCT Design for MPEG-2 Audio (MPEG-2 오디오를 위한 MDCT 설계에 관한 연구)

김정태;구대성;이강현
- Proceedings of the IEEK Conference, 2000.11c, pp.97-100, 2000
Compression is the most important technology in the multimedia society, and audio files spread rapidly over the Internet. MP3 (MPEG-1 Layer 3) offers CD-level sound quality at 128 kbps, but its quality drops abruptly below 64 kbps. MPEG-2 AAC (Advanced Audio Coding), on the other hand, is not compatible with MPEG-1, but it achieves a compression ratio 1.4 times higher than MP3 and supports up to 7.1 channels and a 96 kHz sampling rate. In this paper, we design an optimized MDCT (Modified Discrete Cosine Transform) that reduces the heavy computational load and increases the processing speed of the MPEG-2 AAC encoder.

Determination of the Speaker Position and Evaluation of the Audio System of the Passenger Car (자동차 스피커의 위치선정 및 오디오 성능평가 방법)

이장명;권오상
- Transactions of the Korean Society of Automotive Engineers, v.4 no.4, pp.1-8, 1996
The sound quality of a car audio system is affected by several factors, such as the dimensions of the room, the boundary conditions of the walls, and the location of the speakers. Among these factors, this study focuses on the speaker location and seeks the best location under the assumption that a flat response is better. To verify the suggested location, a subjective test with 10 people is conducted. The developed method is also used to evaluate the performance of an audio system with fixed speaker positions.

Design and Implementation of the low power and high quality audio encoder/decoder for voice synthesis (음성 합성용 저전력 고음질 부호기/복호기 설계 및 구현)

Park, Nho-Kyung;Park, Sang-Bong;Heo, Jeong-Hwa
- The Journal of the Institute of Internet, Broadcasting and Communication, v.13 no.6, pp.55-61, 2013
In this paper, we describe the design and implementation of an audio encoder/decoder for voice synthesis. It encodes the difference between successive samples instead of the original sample values and achieves a compression ratio of 4. The function is verified on an FPGA, and the performance is measured on a chip fabricated in a standard $0.35\,\mu$m CMOS process. The system clock is 16.384 MHz. The measured THD+N ranges from -40 dB to -80 dB depending on frequency, and the power consumption is about 80 mW. The design is suited to mobile applications that require high audio quality and low power consumption.
https://doi.org/10.7236/JIIBC.2013.13.6.55
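
The difference-based encoding described in the abstract above can be illustrated with a small closed-loop DPCM sketch. This is not the paper's circuit. The 4-bit code width, the fixed step size of 256, and the function names are assumptions, chosen so that 16-bit samples would map to 4-bit codes (a ratio of 4).

```python
import math

def dpcm_encode(samples, step=256, bits=4):
    # Quantize the difference between each sample and the decoder's
    # running reconstruction, clipped to a signed `bits`-bit code.
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    codes, predicted = [], 0
    for s in samples:
        code = max(lo, min(hi, round((s - predicted) / step)))
        codes.append(code)
        predicted += code * step  # mirror the decoder state
    return codes

def dpcm_decode(codes, step=256):
    # Accumulate the quantized differences to rebuild the waveform.
    out, value = [], 0
    for code in codes:
        value += code * step
        out.append(value)
    return out

# A slowly varying test tone (hypothetical input, 16-bit range).
tone = [int(3000 * math.sin(2 * math.pi * i / 64)) for i in range(128)]
codes = dpcm_encode(tone)
decoded = dpcm_decode(codes)
```

Because the encoder tracks the decoder's reconstruction, quantization error does not accumulate: as long as no code is clipped, each decoded sample stays within half a step of the input.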

Sound Quality Enhancement in MPEG Surround by Using ILD Distortion (ILD DISTORTION을 이용한 MPEG SURROUND의 음질 개선)

Chon, Sang-Bae;Choi, In-Yong;Sung, Koeng-Mo
- Proceedings of the IEEK Conference, 2006.06a, pp.241-242, 2006
MPEG Surround is an audio coding technology that represents a multi-channel audio signal with downmixed audio signal(s) and very low bitrate side information based on Binaural Cue Coding. The side information consists of Inter-Channel Level Difference (ICLD), Inter-Channel Correlation, and payloads. The first two parameters correspond to the well-known spatial parameters in psychoacoustics, Inter-aural Level Difference (ILD) and Inter-Aural Cross Correlation (IACC). Although ICLD is intended to provide a perceptually equivalent ILD to the listener, the ILD of the original multi-channel audio signal differs from that of the MPEG Surround encoded signal. The difference between the two ILD values is defined as ILD Distortion (ILDD). This paper shows how ILDD can be applied to enhance sound quality in MPEG Surround and by how much ILDD is reduced.

Dynamic Redundant Audio Transmission for Packet Loss Recovery in VoIP Systems (인터넷 전화에서 손실 패킷 복원을 위한 동적인 부가 정보 전송 기법)

권철홍;김무중
- The Journal of the Acoustical Society of Korea, v.21 no.4, pp.349-360, 2002
In the ITU H.323 teleconferencing system, the RTP/RTCP protocols are used to transfer real-time multimedia streams. Both sender and receiver experience packet loss and jitter resulting from network congestion on the Internet. Audio quality over the Internet depends on the number of lost packets and on the jitter between successive packets. The goal of our study is to improve speech quality over the Internet by monitoring the packet loss characteristics of the network and adopting a buffer control management mechanism at the receiver. We suggest a dynamic redundant audio transmission mechanism that examines the packet loss rate and uses the feedback information provided through RTCP.
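
As a rough illustration of the idea of loss-driven redundancy in the abstract above, the sketch below lets each packet carry copies of preceding frames and picks the number of copies from a reported loss rate. It is not the authors' mechanism. The thresholds in `choose_redundancy`, the packet layout, and all names are hypothetical, and the loss rate is simply assumed to arrive from receiver feedback such as RTCP reports.

```python
def choose_redundancy(loss_rate):
    # Hypothetical thresholds, for illustration only.
    if loss_rate < 0.02:
        return 0
    if loss_rate < 0.10:
        return 1
    return 2

def packetize(frames, depth):
    # Packet i carries frame i plus up to `depth` preceding frames.
    return [
        {"seq": i,
         "frames": {j: frames[j] for j in range(max(0, i - depth), i + 1)}}
        for i in range(len(frames))
    ]

def receive(packets, lost, total):
    # Rebuild the frame sequence from the packets that arrived;
    # frames that no surviving packet carries come back as None.
    recovered = {}
    for packet in packets:
        if packet["seq"] in lost:
            continue
        for seq, frame in packet["frames"].items():
            recovered.setdefault(seq, frame)
    return [recovered.get(i) for i in range(total)]

frames = [f"frame{i}" for i in range(6)]
depth = choose_redundancy(0.05)      # moderate loss: one redundant copy
packets = packetize(frames, depth)
single_loss = receive(packets, {2}, len(frames))
burst_loss = receive(packets, {2, 3}, len(frames))
```

With one redundant copy, an isolated loss is repaired from the following packet, while a burst of two losses leaves the first lost frame unrecoverable. A deeper redundancy level trades extra bandwidth for burst tolerance.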

Improvement of the TCX Module in AMR-WB+ Codec Using Pyramid VQ (Pyramid VQ를 이용한 AMR-WB+ 코덱 내 TCX 모듈의 성능 개선)

Park, Sang-Kuk;Park, Jung-Eun;Baik, Seung-Kweon;Seo, Jung-Il;Kang, Sang-Won
- The Journal of the Acoustical Society of Korea, v.26 no.3, pp.109-114, 2007
In this paper, we propose a pyramid VQ to quantize the transform coefficients of the TCX module in order to improve the audio quality of the AMR-WB+ codec. The proposed pyramid VQ is compared with the $RE_8$ lattice VQ used in the AMR-WB+ standard codec, showing improvements of 4% and 5.7% in Mean Squared Error (MSE) and 3.3% and 4.7% in Perceptual Evaluation of Audio Quality (PEAQ) for the 8-dimensional and 16-dimensional pyramid VQ, respectively.
https://doi.org/10.7776/ASK.2007.26.3.109
