• Title/Summary/Keyword: model quantization

Development of an MPEG-4 AAC encoder of low implementation complexity (낮은 연산 부담을 갖는 MPEG-4 AAC 인코더 개발에 관한 연구)

  • 김병일;김동환;장태규;장흥엽
    • Proceedings of the IEEK Conference / 2003.07e / pp.2467-2470 / 2003
  • This paper presents a new structure for an MPEG-4 AAC encoder. The proposed encoder directly shapes the quantization noise distribution according to the energy distribution curve and then adjusts the offset level of the noise distribution to meet the given bit rate. This direct noise shaping and bit-rate matching scheme significantly alleviates the processing burden that conventional encoders incur from a precise psychoacoustic model and an iteration-intensive quantizer. The encoder algorithm is implemented on an ARM processor with fixed-point arithmetic operations. The audio quality of the implemented system is observed to be comparable to that of commercially available encoders, while the implementation complexity is drastically reduced compared with conventional encoder systems.

A New Rate Control algorithm for Transcoder Based-on Bit-rate Reduction Characteristics of Requantization (재양자화 특성을 이용한 비트율 변환기의 전송률 제어 기법)

  • 서광덕;이상희;권순각;유국열;김재균
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1997.11a / pp.181-186 / 1997
  • Transcoding is a key technique for further reducing the bit-rate of previously compressed video. Transcoding performance is evaluated by two factors: accuracy with respect to the target bit-rate and implementation complexity. In this paper, we propose a new rate control algorithm that offers very accurate bit-rate control with much lower computational complexity. For the accuracy problem, we empirically observe the relationship between the quantization step size and the generated bits in the requantization process and find that it can be characterized by a new piecewise linear model. For the complexity problem, we reduce the role of feedback rate control. Simulation results show that the proposed method achieves better accuracy than the conventional rate control algorithm at the same picture quality.

On a robust text-dependent speaker identification over telephone channels (전화음성에 강인한 문장종속 화자인식에 관한 연구)

  • Jung, Eu-Sang;Choi, Hong-Sub
    • Speech Sciences / v.2 / pp.57-66 / 1997
  • This paper studies the effect of CMS (Cepstral Mean Subtraction), a method that compensates for some of the speech distortion caused by telephone channels, on the performance of a text-dependent speaker identification system. The system is based on VQ (Vector Quantization) and HMM (Hidden Markov Model) methods and uses LPC-Cepstrum and Mel-Cepstrum as the feature vectors extracted from speech data transmitted through telephone channels. This allows us to compare the correct recognition rates of the speaker identification system using LPC-Cepstrum and Mel-Cepstrum. The experimental results show that the Mel-Cepstrum parameter is superior to the LPC-Cepstrum and that recognition performance improves by about 10% when the telephone channel is compensated for using CMS.

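The CMS step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that a stationary channel appears as a roughly constant offset on every cepstral frame, so subtracting the per-utterance mean removes it. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List


def cepstral_mean_subtraction(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-utterance mean of each cepstral coefficient."""
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient over all frames of the utterance
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Toy utterance (hypothetical): 3 frames of 2 cepstral coefficients
clean = [[1.0, -2.0], [3.0, 0.0], [2.0, 2.0]]
# A constant offset stands in for the channel distortion (assumption)
channel_offset = [0.5, -1.5]
distorted = [[c + o for c, o in zip(frame, channel_offset)] for frame in clean]

normalized_clean = cepstral_mean_subtraction(clean)
normalized_distorted = cepstral_mean_subtraction(distorted)
```

In this toy case the normalized clean and distorted utterances coincide, which is the property CMS relies on. Real channels are only approximately constant over an utterance.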

Video Rate Control Using Activity Based Rate Prediction

  • Park, Hyung-Shin;Jung, You-Young;Kim, Young-Ro;Ko, Sung-Jea
    • Proceedings of the IEEK Conference / 2000.07a / pp.454-457 / 2000
  • In this paper, an efficient rate control algorithm based on rate prediction is proposed for maintaining a smooth buffer variation and a small buffer size. The proposed method adjusts the quantization scaling factor by using the predicted bit-rate to meet the target bit budget exactly. Experimental results show that the proposed prediction-based rate control scheme can regulate the bit-rate across scene changes more effectively and achieve better PSNR performance than existing rate control mechanisms such as the MPEG-2 Test Model 5 (TM5) and the Adaptive Scene Analysis (ASA).

A Study on the Implementation of Low Power DCT Architecture for MPEG-4 AVC (저전력 DCT를 이용한 MPEG-4 AVC 압축에 관한 연구)

  • Kim, Dong-Hoon;Seo, Sang-Jin;Park, Sang-Bong;Jin, Hyun-Joon;Park, Nho-Kyung
    • Proceedings of the KIEE Conference / 2007.10a / pp.371-372 / 2007
  • In this paper we present performance and implementation comparisons of a high-performance two-dimensional forward and inverse Discrete Cosine Transform (2D DCT/IDCT) algorithm and a low-power algorithm for $8{\times}8$ 2D DCT and quantization based on partial sums, together with the corresponding hardware architecture for FPGA in MPEG-4. The architecture used in both the low-power 2D DCT and the 2D IDCT is based on the conventional row-column decomposition method. Using a fast algorithm and the distributed arithmetic (DA) technique to implement the DCT/IDCT reduces hardware complexity. The design was made using Mentor Graphics tools for design entry and implementation. Mentor Graphics ModelSim SE6.1f was used for Verilog HDL entry, behavioral simulation, and synthesis. The 2D DCT/IDCT consumes only 50% of the operating power.

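The row-column decomposition named in the abstract above can be sketched in software as follows: a 1D DCT is applied to every row, then to every column of the result. This is only a floating-point illustration using a direct orthonormal DCT-II. It does not model the fast algorithm, the distributed arithmetic, or the hardware architecture described in the paper, and the function names are hypothetical.

```python
import math
from typing import List


def dct_1d(x: List[float]) -> List[float]:
    """Direct orthonormal DCT-II of a 1D sequence (O(N^2), for illustration)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d_row_column(block: List[List[float]]) -> List[List[float]]:
    """2D DCT by row-column decomposition: 1D DCT on rows, then on columns."""
    rows = [dct_1d(row) for row in block]
    # zip(*rows) iterates over the columns of the row-transformed block
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed as [vertical freq][horizontal freq]
    return [list(r) for r in zip(*cols)]


# A flat 8x8 block: all energy should end up in the DC coefficient
block = [[1.0] * 8 for _ in range(8)]
coeffs = dct_2d_row_column(block)
```

Because the 2D transform is separable, the two passes give the same result as a direct 2D DCT while reusing a single 1D transform unit.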

Visual Modeling and Content-based Processing for Video Data Storage and Delivery

  • Hwang Jae-Jeong;Cho Sang-Gyu
    • Journal of information and communication convergence engineering / v.3 no.1 / pp.56-61 / 2005
  • In this paper, we present a video rate control scheme for storage and delivery in which the time-varying viewing interests are controlled by human gaze. To track the gaze, the pupil's movement is detected using a three-step process: detecting the face region, the eye region, and the pupil point. To control bit rates, the quantization parameter (QP) is changed by considering the static parameters, the video object priority derived from the pupil tracking, the target PSNR, and the weighted distortion value of the coder. As a result, we obtained a human-interfaced visual model and a corresponding region-of-interest rate control system.

Robust Speech Recognition by Utilizing Class Histogram Equalization (클래스 히스토그램 등화 기법에 의한 강인한 음성 인식)

  • Suh, Yung-Joo;Kim, Hor-Rin;Lee, Yun-Keun
    • MALSORI / no.60 / pp.145-164 / 2006
  • This paper proposes class histogram equalization (CHEQ) to compensate noisy acoustic features for robust speech recognition. CHEQ aims to compensate for the acoustic mismatch between training and test speech recognition environments as well as to reduce the limitations of the conventional histogram equalization (HEQ). In contrast to HEQ, CHEQ adopts multiple class-specific distribution functions for training and test environments and equalizes the features by using their class-specific training and test distributions. Depending on the class-information extraction method, CHEQ is further classified into two forms: hard-CHEQ, based on vector quantization, and soft-CHEQ, using the Gaussian mixture model. Experiments on the Aurora 2 database confirmed the effectiveness of CHEQ by producing a relative word error reduction of 61.17% over the baseline mel-cepstral features and of 19.62% over the conventional HEQ.

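As background for the abstract above, conventional HEQ can be sketched as mapping each test feature value through the empirical test CDF and then through the inverse empirical training CDF. The sketch below covers only this single-distribution baseline on one feature dimension. It does not implement CHEQ, which per the abstract would use class-specific distributions, and the function name and toy values are hypothetical.

```python
from bisect import bisect_right
from typing import List


def histogram_equalize(test_values: List[float], train_values: List[float]) -> List[float]:
    """Map test values onto the training distribution by rank-based CDF matching."""
    sorted_test = sorted(test_values)
    sorted_train = sorted(train_values)
    n_test, n_train = len(sorted_test), len(sorted_train)
    equalized = []
    for x in test_values:
        # Rank of x in the test data; rank / n_test is its empirical CDF value
        rank = bisect_right(sorted_test, x)
        # Inverse empirical CDF of the training data, using an integer ceiling
        idx = (rank * n_train + n_test - 1) // n_test - 1
        equalized.append(sorted_train[idx])
    return equalized


# Toy data (hypothetical): test features are a shifted, scaled copy of training
train = [0.0, 1.0, 2.0, 3.0, 4.0]
test = [18.0, 10.0, 14.0, 12.0, 16.0]
equalized = histogram_equalize(test, train)
```

The mapping is monotonic, so the ordering of the test values is preserved while their distribution is pulled onto the training one.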

Trend of Edge Machine Learning as-a-Service (서비스형 엣지 머신러닝 기술 동향)

  • Na, J.C.;Jeon, S.H.
    • Electronics and Telecommunications Trends / v.37 no.5 / pp.44-53 / 2022
  • The Internet of Things (IoT) is growing exponentially, with the number of IoT devices multiplying annually. Accordingly, the paradigm is changing from cloud computing to edge computing and even tiny edge computing because of their low latency and cost reduction. Machine learning is also shifting its role from the cloud to the edge or tiny edge in line with this paradigm shift. However, the fragmented and resource-constrained nature of IoT devices has limited the development of artificial intelligence applications. Edge MLaaS (Machine Learning as-a-Service) has been studied as a way to adopt machine learning in products easily and quickly and to overcome device limitations. This paper briefly summarizes what Edge MLaaS is and which research elements it requires.

Sparsity Increases Uncertainty Estimation in Deep Ensemble

  • Dorjsembe, Uyanga;Lee, Ju Hong;Choi, Bumghi;Song, Jae Won
    • Annual Conference of KIPS / 2021.05a / pp.373-376 / 2021
  • Deep neural networks have achieved almost human-level results in various tasks and have become popular in the broad artificial intelligence domains. Uncertainty estimation is an on-demand task caused by the black-box point estimation behavior of deep learning. The deep ensemble provides increased accuracy and estimated uncertainty; however, its size grows linearly, which makes the deep ensemble infeasible for memory-intensive tasks. To address this problem, we used model pruning and quantization with a deep ensemble and analyzed the effect in the context of uncertainty metrics. We empirically showed that the ensemble members' disagreement increases with pruning, which makes models sparser by zeroing irrelevant parameters. Increased disagreement implies increased uncertainty, which helps in making more robust predictions. Accordingly, an energy-efficient compressed deep ensemble is appropriate for memory-intensive and uncertainty-aware tasks.
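
Two notions from the abstract above, pruning by zeroing parameters and disagreement among ensemble members, can be illustrated with a minimal sketch. It assumes magnitude-based pruning and measures disagreement as the variance of predicted class probabilities across members. Neither choice is confirmed by the abstract, and all names and numbers are hypothetical.

```python
from typing import List


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero the `sparsity` fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Indices ordered from smallest to largest absolute value
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


def ensemble_disagreement(member_probs: List[List[float]]) -> float:
    """Mean over classes of the variance of class probabilities across members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    total = 0.0
    for c in range(n_classes):
        vals = [p[c] for p in member_probs]
        mean = sum(vals) / n_members
        total += sum((v - mean) ** 2 for v in vals) / n_members
    return total / n_classes


# Hypothetical weights of one layer, pruned to 50% sparsity
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, 0.5)

# Hypothetical two-member ensembles predicting two classes
agreeing = ensemble_disagreement([[0.7, 0.3], [0.7, 0.3]])
disagreeing = ensemble_disagreement([[0.9, 0.1], [0.5, 0.5]])
```

The sketch only shows how each quantity could be computed. It does not reproduce the paper's empirical finding that disagreement rises with pruning.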

A Model Compression for Super Resolution Multi Scale Residual Networks based on a Layer-wise Quantization (계층별 양자화 기반 초해상화 다중 스케일 잔차 네트워크 압축)

  • Hwang, Jiwon;Bae, Sung-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.540-543 / 2020
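  • Existing deep-learning super-resolution methods achieve good performance as models become deeper, but they are also becoming increasingly complex and require considerable time in practical use. To address this, we quantize the weights of the deep-learning model to reduce inference time. A super-resolution model consists of three parts (feature extraction, non-linear mapping, and reconstruction) and is characterized by many skip-connections between layers. Consequently, the effect of quantization on the final performance degradation differs from layer to layer; taking this into account, we use reinforcement learning to find the optimal bit-width for each layer and minimize the performance drop. In this paper we use MSRN, which contains many skip-connections, and the results show that the feature extraction and reconstruction parts, as well as layers at particular positions within a block, always receive high bit-widths. This is the first paper to reduce the model size of a deep-learning super-resolution method using mixed-bit quantization, which had previously been limited to image classification, and the proposed method is expected to be applicable to constrained environments such as mobile devices.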

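The idea of layer-wise, mixed-bit weight quantization in the abstract above can be sketched as follows. The sketch assumes symmetric uniform quantization and uses hand-picked bit-widths. The paper searches for the bit-widths with reinforcement learning, which is not shown here, and the layer names, weights, and bit-widths below are hypothetical.

```python
from typing import Dict, List


def quantize_uniform(weights: List[float], bits: int) -> List[float]:
    """Symmetric uniform quantization of weights to `bits` bits (bits >= 2)."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1  # e.g. 127 positive levels for 8 bits
    scale = max_abs / levels      # quantization step size
    # Round to the nearest level, then map back to the real-valued range
    return [round(w / scale) * scale for w in weights]


def max_error(a: List[float], b: List[float]) -> float:
    """Largest absolute difference between two weight lists."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical weights for the three parts of a super-resolution model
layers: Dict[str, List[float]] = {
    "feature_extraction": [0.51, -0.22, 0.08, -0.75],
    "nonlinear_mapping": [0.10, -0.33, 0.27, 0.05],
    "reconstruction": [-0.64, 0.12, 0.40, -0.09],
}
# Hand-picked bit-widths (assumption): more bits for the first and last parts
bit_widths = {"feature_extraction": 8, "nonlinear_mapping": 4, "reconstruction": 8}

quantized = {name: quantize_uniform(w, bit_widths[name]) for name, w in layers.items()}
errors = {name: max_error(layers[name], quantized[name]) for name in layers}
```

The per-layer rounding error is bounded by half the step size, so layers assigned fewer bits tolerate a larger error in exchange for a smaller model.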