• Title/Summary/Keyword: Audio Compression

An efficient fixed-point implementation of the IMDCT for audio compression (오디오 압축을 위한 IMDCT의 최적 DSP 근사구현 기법 연구)

  • Jeong, J.H.;Chang, T.G.;Son, Y.K.;Lee, J.W.
    • Proceedings of the KIEE Conference / 2001.07d / pp.2513-2515 / 2001
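  • This paper analyzes the effect of the errors that arise when the DCT is implemented with fixed-point arithmetic through finite-bit approximation. Fixed-point arithmetic requires finite-bit approximation, and the limited numeric representation range introduces errors in the process; in particular, implementations of algorithms with a recursive computation structure, such as the DCT, suffer a sharp drop in performance. The paper analyzes the errors that occur when recursive formulas are implemented with finite-bit approximation and derives an analytical expression for them.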

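
The sensitivity of recursive computations to finite word length can be illustrated with a small experiment. The sketch below is not the paper's derivation: it assumes a textbook cosine recurrence, c[n+1] = 2cos(θ)·c[n] − c[n−1], and a hypothetical `quantize` helper that rounds to a fixed number of fractional bits. The function names and the chosen angle are illustrative only.

```python
import math


def quantize(value, frac_bits):
    """Round value to the nearest multiple of 2**-frac_bits (a fixed-point grid)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def recursive_cosines(theta, count, frac_bits):
    """Generate cos(n*theta) for n = 0..count with the recurrence
    c[n+1] = 2*cos(theta)*c[n] - c[n-1], quantizing the coefficient
    and every intermediate result to frac_bits fractional bits."""
    coeff = quantize(2.0 * math.cos(theta), frac_bits)
    values = [1.0, quantize(math.cos(theta), frac_bits)]
    for _ in range(count - 1):
        values.append(quantize(coeff * values[-1] - values[-2], frac_bits))
    return values


def max_error(theta, count, frac_bits):
    """Largest absolute deviation from the floating-point cosine."""
    approx = recursive_cosines(theta, count, frac_bits)
    return max(abs(a - math.cos(n * theta)) for n, a in enumerate(approx))


if __name__ == "__main__":
    theta = math.pi / 128  # assumed angle, chosen only for illustration
    for bits in (8, 16, 24):
        print(bits, max_error(theta, 64, bits))
```

Because each output feeds the next step, a rounding error in the coefficient or in an early term is carried forward, so the shorter word lengths should show a visibly larger maximum error than the longer ones.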

DNN-based Audio Compression Model Optimization Utilizing Entropy Model (엔트로피 모델을 활용한 심층 신경망 기반 오디오 압축 모델 최적화)

  • Lim, Hyungseob;Kang, Hong-Goo;Jang, Inseon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.54-57 / 2022
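  • This paper proposes an entropy-model-based quantization method to improve the bit-rate efficiency of a deep-neural-network-based progressive multi-layer audio codec. Research on audio compression and reconstruction systems that use deep neural networks to replace commercial audio codecs built on traditional signal processing theory has recently been active. However, these systems have not yet reached the performance of existing commercial codecs, and for end-to-end audio compression models in particular, improving the quantization structure of the encoder is essential to obtaining high quality from a small amount of information. In this study, the vector quantizer of the progressive multi-layer audio codec, one of the previously proposed end-to-end audio compression models, is replaced with an entropy-model-based quantizer, and the validity of introducing such a quantizer is verified by showing that the bit rate can be adjusted in various ways by exploiting the rate-distortion trade-off.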

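
The rate-distortion trade-off mentioned in this abstract can be sketched in a few lines. This is a simplified stand-in, not the proposed model: a uniform scalar quantizer replaces the learned quantizer, the empirical symbol histogram replaces the learned entropy model, and the names `step` and `lam` (a Lagrange weight) are assumptions introduced for illustration.

```python
import math
from collections import Counter


def quantize(samples, step):
    """Uniform scalar quantization: map each sample to an integer index."""
    return [round(x / step) for x in samples]


def empirical_rate_bits(symbols):
    """Average bits per symbol an ideal entropy coder would need
    if it used the empirical distribution of the symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def distortion_mse(samples, symbols, step):
    """Mean squared error between the samples and their reconstructions."""
    return sum((x - q * step) ** 2 for x, q in zip(samples, symbols)) / len(samples)


def rate_distortion(samples, step):
    """Return (rate in bits/symbol, MSE distortion) for a given step size."""
    symbols = quantize(samples, step)
    return empirical_rate_bits(symbols), distortion_mse(samples, symbols, step)


def rd_cost(samples, step, lam):
    """Lagrangian cost D + lam * R; a larger lam favors a lower rate."""
    rate, dist = rate_distortion(samples, step)
    return dist + lam * rate


if __name__ == "__main__":
    data = [0.1, 0.4, -0.3, 1.2, 0.9, -1.1, 0.05, 0.6]
    for step in (0.25, 1.0):
        print(step, rate_distortion(data, step))
```

A coarser step lowers the estimated rate and raises the distortion; weighting the two terms differently is one way a codec of this kind could be steered toward different bit rates.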

Additive Data Insertion into MP3 Bitstream Using linbits Characteristics (Linbits 특성을 이용하여 MP3 비트스트림에 부가적인 정보를 삽입하는 방법에 관한 연구)

  • 김도형;양승진;정재호
    • The Journal of the Acoustical Society of Korea / v.22 no.7 / pp.612-621 / 2003
  • As the use of MP3 audio compression has increased, demand has grown for inserting additional data, such as copyright information or information about the music content, and related research has been actively pursued. When additional data is inserted into an MP3 bitstream, modifying the bitstream structure should neither distort the music quality nor change the file size. To satisfy these conditions, we insert the additional data into the bitstream by modifying some bits of the linbits of quantized integer coefficients with large values, taking the characteristics of linbits and their distributions into account. A subjective sound quality evaluation using an MOS test confirmed that a quality of MOS 4.6 can be achieved at a data insertion rate of 60 bytes/sec. With the proposed method, additional data such as copyright information or information about the media itself can be inserted effectively, enabling applications such as audio database management.

A Research on Quality Improvement of Software-based Video Teleconferencing on the Tactical Communication Networks Less Than 1Mbps (1Mbps 이하 전술통신망에서의 소프트웨어 방식 화상회의 품질향상 연구)

  • Kim, Gwon-Hee
    • The Journal of Korean Institute of Communications and Information Sciences / v.37 no.1C / pp.63-75 / 2012
  • This paper studies how to operate software-based video teleconferencing on tactical communication networks below 1 Mbps. Tactical communication networks have limited bandwidth and suffer frequent data loss and transmission delays because the networks are unstable. In addition, the bandwidth available for video teleconferencing is even smaller because the Army Tactical Command Information System (ATCIS) has priority in using the bandwidth. The paper analyzes these restrictions, presents methods to improve the quality of software video teleconferencing on tactical communication networks, and reports experiments with them. The first measures applied are retransmitting lost packets and reducing the image size to cut data traffic. Providing enough bandwidth for every user would be the ideal solution for video teleconferencing. On tactical communication networks with limited bandwidth, however, video teleconferencing can still be improved by optimizing the compression rate of the image data, the number of image frames, the audio codec, and the use of audio compensation data.

Effects of Dynamic Compression to Listening Monitor on Vocal Recording (보컬 녹음에서 모니터에 적용된 컴프레서가 가창에 미치는 영향)

  • Kim, Si-On;Park, Jae-Rock
    • Journal of Korea Entertainment Industry Association / v.13 no.2 / pp.93-100 / 2019
  • Dynamic compressors are essential equipment in vocal recording for modern pop music. They are applied not only to the mix intended for listening but also to the monitor through which singers hear their own voice along with the accompaniment while recording. This is an experimental study of how a dynamic compressor applied to the monitor environment affects a singer's vocal performance. Ten singers took part in a blind test of how their singing was affected when the vocals heard through the monitor were compressed at ratios of 1:1, 2:1 and 4:1. The results show that the higher the compression ratio applied to the monitor, the louder the singing and the brighter the tone, but the less accurate the pitch in the louder passages of the song. In interviews after the blind test, singers generally preferred to hear compressed sound on the monitor. Because the music used in the experiment was a ballad with a wide dynamic range, the results cannot be generalized to all kinds of music recording, but they provide important implications for monitoring in recording sessions. We also hope that this empirical study of the effect of the monitor environment on the singing voice will encourage cognitive science approaches to recording technology.
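
For readers unfamiliar with compression ratios, the sketch below shows what 1:1, 2:1 and 4:1 mean for a signal level above the threshold. It assumes a simple hard-knee static curve and a hypothetical threshold of -20 dB; the abstract does not state the threshold, knee, or time constants used in the experiment.

```python
def compressed_level_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static hard-knee compressor curve.

    Levels at or below the threshold pass unchanged. Above it, every
    `ratio` dB of input increase yields 1 dB of output increase.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


if __name__ == "__main__":
    # A loud passage peaking at -8 dB, i.e. 12 dB above the assumed threshold.
    for ratio in (1.0, 2.0, 4.0):
        print(f"{ratio:g}:1 ->", compressed_level_db(-8.0, ratio=ratio), "dB")
```

At 1:1 the level is untouched, while at 4:1 the 12 dB overshoot shrinks to 3 dB, so loud passages sound much closer in level to quiet ones on the monitor.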

The Vocabulary Recognition Optimize using Acoustic and Lexical Search (음향학적 및 언어적 탐색을 이용한 어휘 인식 최적화)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of Korea Multimedia Society / v.13 no.4 / pp.496-503 / 2010
  • Speech recognition systems have typically been developed as standalone systems, and when they run on a mobile terminal they show low recognition rates because of limited memory size and audio compression. This study proposes a system that improves vocabulary recognition performance by separating acoustic search from lexical search. Acoustic search is carried out on the mobile terminal, and lexical search is carried out on the server. Feature vectors are extracted from the speech signal and phonemes are recognized using a GMM; the recognized phoneme list is transmitted to the server, which performs lexical search with a Lexical Tree Search algorithm. In the performance evaluation, the system showed a vocabulary-dependent recognition rate of 98.01%, a vocabulary-independent recognition rate of 97.71%, and a recognition speed of 1.58 seconds.

Adaptive Buffer Management Method for Quality of Service of Internet Telephony (인터넷폰의 QoS를 위한 적응적인 버퍼관리 방식)

  • 류태욱;이정훈;강성호;엄기환
    • Journal of the Korea Institute of Information and Communication Engineering / v.6 no.3 / pp.386-392 / 2002
  • Internet telephony is an application that transmits voice data for conversation, so it must provide high sound quality. However, while audio packets travel through the network they are affected by delay variation and jitter, which can result in poor sound quality if the receiving end does not have an appropriate jitter buffer to absorb these network effects. This paper introduces a buffer management algorithm that can be used to provide better sound quality for Internet phone terminals. The algorithm actively responds both to the compression algorithms used by the terminals and to the received data in order to improve sound quality. To verify the effectiveness of the proposed algorithm, we ran experiments in various network settings. The results show that the proposed algorithm outperforms the conventional buffer management algorithm.
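
The abstract does not describe the algorithm itself, so the following is only a generic illustration of adaptive jitter buffering. It assumes a widely used exponentially weighted estimator of the mean network delay and its variation, and sets the playout delay to the mean plus a safety margin. The class name and the parameters `alpha` and `k` are hypothetical.

```python
class AdaptivePlayoutEstimator:
    """Tracks network delay and its variation to size a jitter buffer."""

    def __init__(self, alpha=0.998, k=4.0):
        self.alpha = alpha      # smoothing weight given to past estimates
        self.k = k              # safety margin, in units of delay variation
        self.mean = None        # smoothed one-way delay (ms)
        self.variation = 0.0    # smoothed absolute deviation (ms)

    def update(self, delay_ms):
        """Feed the delay of a newly received packet and return the
        playout delay (ms) suggested for the packets that follow."""
        if self.mean is None:
            self.mean = delay_ms
        else:
            deviation = abs(delay_ms - self.mean)
            self.variation = (self.alpha * self.variation
                              + (1.0 - self.alpha) * deviation)
            self.mean = self.alpha * self.mean + (1.0 - self.alpha) * delay_ms
        return self.mean + self.k * self.variation


if __name__ == "__main__":
    estimator = AdaptivePlayoutEstimator(alpha=0.9)
    for delay in (100, 102, 98, 140, 101, 99):
        print(delay, round(estimator.update(delay), 2))
```

When delays are steady the buffer stays small and adds little latency; a burst of jitter raises the variation estimate and the buffer grows to avoid late packets. A codec-aware scheme such as the one the abstract describes would additionally take the frame size or packetization of the codec in use into account.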

Estimation of Lifetime Data Storage Capacity for Human Senses (인간 감각 정보를 위한 평생 기억용량 평가)

  • You, Young-Gap;Song, Young-Jun;Kim, Dong-Woo
    • The Journal of the Korea Contents Association / v.9 no.1 / pp.23-29 / 2009
  • This paper presents a capacity estimate for a storage system that accumulates all data sensed during the lifetime of an individual human being. The calculation assumes modern data compression and data collection schemes based on wearable or implanted devices in a ubiquitous computing environment. More than 76% of the storage area is found to be used for video data at common TV image quality. The remaining storage area holds data from the other senses, including hearing, taste, smell and touch, in addition to indexing information. A total storage area of around 600 terabytes is needed to cover 100 years of human life, including the fetal period.
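
The headline figures can be put in perspective with simple arithmetic. The sketch below uses only the numbers quoted in the abstract (about 600 terabytes, 100 years, at least 76% video) and assumes decimal terabytes (10^12 bytes) and 365.25-day years; it does not reproduce the paper's per-sense calculation.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def lifetime_budget(total_tb=600.0, years=100.0, video_share=0.76):
    """Split the total capacity and convert it to an average data rate."""
    total_bytes = total_tb * 1e12   # assumes 1 TB = 10**12 bytes
    seconds = years * SECONDS_PER_YEAR
    return {
        "video_tb": total_tb * video_share,
        "other_tb": total_tb * (1.0 - video_share),
        "avg_bytes_per_second": total_bytes / seconds,
        "avg_megabits_per_second": total_bytes * 8 / seconds / 1e6,
    }


if __name__ == "__main__":
    for name, value in lifetime_budget().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the video share is at least 456 terabytes, and the whole archive corresponds to an average of roughly 1.5 Mbit/s sustained over the 100 years.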

Multimedia Data Security of Video Conferencing System (영상회의 시스템에서의 멀티미디어 데이터 보안)

  • 이원호;한군희
    • Proceedings of the Korea Contents Association Conference / 2003.05a / pp.231-236 / 2003
  • Video conferencing systems are being used in a variety of ways on the Internet. Research in this area, covering audio and video compression techniques, synchronization of multimedia data, and the Mbone for IP multicast that supports video conferencing, has been active, and diverse multimedia services such as video are being delivered over the Internet as line speeds increase. In the open, distributed Internet environment, the security of the image and voice data exchanged in a video conference has emerged as a serious problem. This paper therefore presents a security method for video conferencing that is suited to the nature of the multimedia data.


Adaptive Watermarking Using Wavelet Transform & Spread Spectrum Method (확산스펙트럼 방식과 웨이브렛 변환을 이용한 적응적인 워터마킹)

  • 김현환;김두영
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.2 / pp.389-395 / 2000
  • Digital watermarking is a research area that aims at hiding secret information in digital multimedia content such as images, audio, and video. In this paper, we propose a new watermarking method that embeds visually recognizable symbols into digital images using the wavelet transform, a spread spectrum method, and multilevel threshold values that take the wavelet coefficients into account. The watermark can be extracted by subtracting the wavelet coefficients of the original image from those of the watermarked image. The experimental results show that the proposed algorithm is superior to other similar watermarking algorithms. We evaluated the watermarking algorithm under JPEG lossy compression, resizing, LSB (Least Significant Bit) masking, and filtering.

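
The extraction step described in this abstract, subtracting the original wavelet coefficients from the watermarked ones, can be sketched on a one-dimensional signal. This is a deliberately reduced illustration: it assumes a single-level unnormalized Haar transform and a plain additive mark with a hypothetical `strength` parameter, and it omits the spread spectrum coding and multilevel thresholds the paper uses.

```python
def haar_forward(signal):
    """One-level Haar transform of an even-length sequence:
    pairwise averages (approximation) and half-differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward exactly."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([a + d, a - d])
    return signal


def embed(signal, bits, strength=2.0):
    """Add +strength (bit 1) or -strength (bit 0) to the first detail
    coefficients, then return to the signal domain."""
    approx, detail = haar_forward(signal)
    marked = [d + (strength if b else -strength) for d, b in zip(detail, bits)]
    marked += detail[len(bits):]
    return haar_inverse(approx, marked)


def extract(original, watermarked, n_bits):
    """Non-blind extraction: subtract the original detail coefficients
    from the watermarked ones and read the sign of the difference."""
    _, d_orig = haar_forward(original)
    _, d_mark = haar_forward(watermarked)
    return [1 if d_mark[i] - d_orig[i] > 0 else 0 for i in range(n_bits)]


if __name__ == "__main__":
    host = [10, 12, 9, 7, 20, 18, 5, 5]
    payload = [1, 0, 1, 1]
    print(extract(host, embed(host, payload), len(payload)))
```

Because extraction needs the original signal, this is a non-blind scheme; robustness to attacks such as lossy compression depends on the embedding strength and on which coefficients carry the mark.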