• Title/Summary/Keyword: Quantization parameter


Compression of DNN Integer Weight using Video Encoder (비디오 인코더를 통한 딥러닝 모델의 정수 가중치 압축)

  • Kim, Seunghwan;Ryu, Eun-Seok
    • Journal of Broadcast Engineering / v.26 no.6 / pp.778-789 / 2021
  • Recently, various lightweight methods for running Convolutional Neural Network (CNN) models on mobile devices have emerged. Weight quantization, which lowers the bit precision of weights, allows a model to run with integer arithmetic in mobile environments where GPU acceleration is unavailable. It has already been applied to various models to reduce computational complexity and model size with a small loss of accuracy. Considering the storage size of the device and the limited network environment, in addition to memory size and computing speed, this paper proposes compressing the quantized integer weights using a video codec. To verify the performance of the proposed method, experiments were conducted on VGG16, ResNet50, and ResNet18 models trained on the ImageNet and Places365 datasets. As a result, an accuracy loss of less than 2% and high compression efficiency were achieved across the models. In addition, a comparison with similar compression methods verified that the compression efficiency was more than doubled.
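
The weight quantization step described in this abstract can be illustrated with a minimal sketch. The abstract does not state which quantization scheme the paper uses, so the symmetric per-tensor 8-bit scheme below, the function names, and the sample weights are all assumptions made for illustration; the video-codec compression stage is not shown.

```python
def quantize_weights(weights, num_bits=8):
    """Symmetric uniform quantization: floats -> signed integers plus one scale.

    Assumption: a single scale per tensor; the paper's actual scheme is not
    given in the abstract.
    """
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the signed range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize_weights(q, scale):
    """Map integer levels back to approximate float weights."""
    return [v * scale for v in q]


# Illustrative weights (made up for this example).
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_weights(weights)
restored = dequantize_weights(q, scale)
print(q, scale)
print(restored)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the source of the small accuracy loss that the abstract mentions.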

Analysis of Deep learning Quantization Technology for Micro-sized IoT devices (초소형 IoT 장치에 구현 가능한 딥러닝 양자화 기술 분석)

  • YoungMin KIM;KyungHyun Han;Seong Oun Hwang
    • Journal of Internet of Things and Convergence / v.9 no.1 / pp.9-17 / 2023
  • Deep learning, which requires a large amount of computation, is difficult to implement on micro-sized IoT devices or mobile devices. Recently, lightweight deep learning technologies have been introduced so that deep learning can be implemented even on small devices by reducing the amount of computation of the model. Quantization is a lightweight technique that efficiently reduces the memory footprint and size of a model by expressing continuously distributed parameter values as discrete values with a fixed number of bits. However, the discrete representation reduces the accuracy of the model. In this paper, we introduce various quantization techniques that compensate for this accuracy loss. We selected APoT and EWGS from the existing quantization techniques and comparatively analyzed them through experiments. The selected techniques were trained and tested with the CIFAR-10 or CIFAR-100 datasets on the ResNet model. Through analysis of the experimental results, we identified problems with these techniques and presented directions for future research.

A Time-Domain Parameter Extraction Method for Speech Recognition using the Local Peak-to-Peak Interval Information (국소 극대-극소점 간의 간격정보를 이용한 시간영역에서의 음성인식을 위한 파라미터 추출 방법)

  • 임재열;김형일;안수길
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.2 / pp.28-34 / 1994
  • In this paper, a new time-domain parameter extraction method for speech recognition is proposed. The suggested method is based on the fact that the local peak-to-peak interval, i.e., the interval between maxima and minima of the speech waveform, is closely related to the frequency components of the speech signal. The parameterization is achieved by a kind of filter bank technique in the time domain. To test the proposed parameter extraction method, an isolated word recognizer based on Vector Quantization and a Hidden Markov Model was constructed. As test material, 22 words spoken by ten males were used, and a recognition rate of 92.9% was obtained. This result leads to the conclusion that the new parameter extraction method can be used in speech recognition systems. Since the proposed method operates in the time domain, real-time parameter extraction can be implemented on a personal computer equipped only with an A/D converter, without any DSP board.


A Study on Isolated Word Recognition using Improved Multisection Vector Quantization Recognition System (개선된 MSVQ 인식 시스템을 이용한 단독어 인식에 관한 연구)

  • An, Tae-Ok;Kim, Nam-Joong;Song, Chul;Kim, Soon-Hyeob
    • The Journal of Korean Institute of Communications and Information Sciences / v.16 no.2 / pp.196-205 / 1991
  • This paper studies speaker-independent isolated word recognition and proposes an improved MSVQ (multisection vector quantization) recognition system that improves on the classical MSVQ recognition system. The difference is that the test pattern has one more section than the reference pattern in the recognition system. 146 DDD area names are selected as the recognition vocabulary. 12th-order LPC cepstral coefficients are used as the feature parameters, and when the codebook is generated, MINSUM and MINMAX are used to find the centroid. The experimental results show that this method is better than VQ (vector quantization) recognition methods, DTW (dynamic time warping) pattern matching methods, and classical MSVQ methods in terms of recognition rate and recognition time.


Adaptive Watermarking Using Successive Subband Quantization and Perceptual Model Based on Multiwavelet Transform Domain (멀티웨이브릿 변환 영역 기반의 연속 부대역 양자화 및 지각 모델을 이용한 적응 워터마킹)

  • 권기룡;이준재
    • Journal of Korea Multimedia Society / v.6 no.7 / pp.1149-1158 / 2003
  • This paper proposes a content-adaptive watermark embedding algorithm using a stochastic image model in the multiwavelet transform domain. A watermark is embedded into the perceptually significant coefficients (PSCs) of each subband using the multiwavelet transform. The PSCs in the high-frequency subbands are selected by successive subband quantization (SSQ), that is, by setting the threshold to one half of the largest coefficient in each subband. The perceptual model is applied with a stochastic approach based on the noise visibility function (NVF), which captures local image properties for watermark embedding. This model uses the characteristics of a stationary generalized Gaussian model because the watermark has noise-like properties. The watermark estimation uses the shape parameter and variance of each subband region, from which content-adaptive criteria are derived according to edge, texture, and flat regions. Experimental results show that the proposed watermark embedding method based on the multiwavelet transform achieves excellent invisibility and robustness.


Design of the Vector-Scalar Quantizer of LSP Parameters for Wideband Speech Coder (광대역 음성부호화기를 위한 백터-스칼라 LSP 파라미터 양자화기 설계)

  • 신재현;이인성;지덕구;윤병식;최송인
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.4 / pp.286-291 / 2003
  • In this paper, we design an LSP (Line Spectral Pairs) parameter quantizer with a cascaded structure of a vector quantizer and a scalar quantizer for a wideband speech coder. We chose 16th-order LP coefficients. These coefficients are then transformed into LSP parameters, which have excellent quantization properties and an easy stability check for the synthesis filter. In the first stage of quantization, the input LSP parameters are split-vector-quantized using two 8th-order codebooks. In the second stage, the components of the residual vector are individually quantized by the scalar quantizer, utilizing the ordering property of LSP parameters. The designed adaptive VQ-SQ quantizer, using 35 bits/frame, achieves wideband transparency, that is, an average spectral distortion (SD) below 1.6 dB with fewer than 4% of frames having an SD above 3 dB. The simulation results show that the designed quantizer saves 2-3 bits/frame compared with the typical vector-scalar quantizer.
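
The two-stage structure in this abstract (split vector quantization followed by scalar quantization of the residual) can be sketched as follows. This is a toy illustration under stated assumptions: the 4-dimensional input, the two tiny codebooks, the step size, and all function names are hypothetical, and the sketch uses neither the 16th-order LSP vectors, the ordering property, nor the bit allocation of the paper.

```python
def nearest_codeword(vector, codebook):
    """Return (index, codeword) of the entry with the smallest squared error."""
    best = min(
        range(len(codebook)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vector, codebook[i])),
    )
    return best, codebook[best]


def vq_sq_encode(lsp, codebook_low, codebook_high, step):
    """Stage 1: split VQ on each half. Stage 2: uniform SQ of the residual."""
    half = len(lsp) // 2
    i_low, c_low = nearest_codeword(lsp[:half], codebook_low)
    i_high, c_high = nearest_codeword(lsp[half:], codebook_high)
    coarse = list(c_low) + list(c_high)
    # Each residual component is quantized individually. A real coder would
    # also clamp the indices to a fixed number of levels (omitted here).
    residual_idx = [round((x - c) / step) for x, c in zip(lsp, coarse)]
    return (i_low, i_high), residual_idx


def vq_sq_decode(indices, residual_idx, codebook_low, codebook_high, step):
    """Rebuild the vector from the codeword indices and residual levels."""
    coarse = list(codebook_low[indices[0]]) + list(codebook_high[indices[1]])
    return [c + r * step for c, r in zip(coarse, residual_idx)]


# Hypothetical toy data, not taken from the paper.
codebook_low = [(0.10, 0.25), (0.20, 0.40)]
codebook_high = [(0.50, 0.80), (0.60, 0.90)]
lsp = [0.11, 0.27, 0.52, 0.78]
step = 0.01

indices, residual_idx = vq_sq_encode(lsp, codebook_low, codebook_high, step)
decoded = vq_sq_decode(indices, residual_idx, codebook_low, codebook_high, step)
print(indices, residual_idx)
print(decoded)
```

The vector quantizer captures the coarse shape of the parameter vector with a few bits, and the scalar quantizer then refines each component, so the final error per component is bounded by half the scalar step.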

An Adaptive Rate Control Using Piecewise Linear Approximation Model (부분 선형 근사 모델을 이용한 적응적 비트율 제어)

  • 조창형;정제창;최병욱
    • Journal of Broadcast Engineering / v.2 no.2 / pp.194-205 / 1997
  • In video compression standards such as MPEG and H.263, rate control is one of the key components for good coding performance. This paper presents a simple adaptive rate control scheme using a piecewise linear approximation model. While the conventional buffer control approach adjusts the quantization parameter linearly according to the buffer fullness, the proposed approach uses a piecewise linear approximation model derived from the logarithmic relation between the quantization parameter and the bitrate in data compression. In addition, a forward analyzer operating in the spatial domain is used to improve image quality. Simulation results demonstrate that the proposed method performs better than the conventional one and reduces the per-frame fluctuation of the PSNR while keeping the quality of the reconstructed frames relatively stable.


Quantization Parameter Selection Method For H.264-based Multi-view Video Coding (H.264 기반 다시점 비디오 부호화를 위한 양자화 계수 결정 방법)

  • Park, Pil-Kyu;Ho, Yo-Sung
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.6C / pp.579-584 / 2007
  • Recently, various prediction structures have been proposed to exploit inter-view correlation among multi-view video sequences. In this paper, we propose a QP (quantization parameter) selection method for the B frame inserted in the first frames of each GOP (group of pictures), where we change the QP for the B frame adaptively to achieve uniform picture quality and overall coding gain. Each B frame is coded with reference to two frames in its adjacent views. We calculate the QP for the B frame based on the correlation between the two reference frames, calculated using their rate-distortion costs. By applying the proposed method to the MVC reference prediction structure, we have improved the coding gain by 0.09 to 0.16 dB.

PSNR-based Initial QP Determination for Low Bit Rate Video Coding

  • Park, Sang-Hyun
    • Journal of information and communication convergence engineering / v.10 no.3 / pp.315-320 / 2012
  • In H.264/AVC, the first frame of a group of pictures (GOP) is encoded in intra mode, which generates a large number of bits. The number of bits for the I-frame affects the quality of the following frames of the GOP, since they are encoded using the bits remaining from those allocated to the GOP. In addition, the first frame is used for the inter-mode encoding of the following frames. Thus, the initial quantization parameter (QP) affects the following frames as well as the first frame. In this paper, an adaptive peak signal-to-noise ratio (PSNR)-based initial QP determination algorithm is presented. In the proposed algorithm, a novel linear model is established based on the observed relation between the initial QPs and the PSNRs of frames. Using the linear model and the PSNR results of the encoded GOPs, the proposed algorithm accurately estimates the optimal initial QP, which maximizes the PSNR of the current GOP. Experimental results show that the proposed algorithm predicts the optimal initial QP accurately and thus achieves better PSNR performance than the existing algorithm.
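
The idea of a linear model relating QP to PSNR can be illustrated with a short sketch. The abstract does not give the model's form or its update rule, so the code below is only an assumption: it fits an ordinary least-squares line to (QP, PSNR) pairs and inverts it to pick a QP for a target PSNR. The function names and the sample measurements are made up for illustration; the only fixed fact used is that H.264 QP values for 8-bit video range from 0 to 51.

```python
def fit_line(qps, psnrs):
    """Least-squares fit of psnr ~= slope * qp + intercept."""
    n = len(qps)
    mean_q = sum(qps) / n
    mean_p = sum(psnrs) / n
    sxx = sum((q - mean_q) ** 2 for q in qps)
    sxy = sum((q - mean_q) * (p - mean_p) for q, p in zip(qps, psnrs))
    slope = sxy / sxx
    intercept = mean_p - slope * mean_q
    return slope, intercept


def qp_for_target_psnr(slope, intercept, target_psnr, qp_min=0, qp_max=51):
    """Invert the fitted line and clamp to the valid H.264 QP range."""
    qp = round((target_psnr - intercept) / slope)
    return max(qp_min, min(qp_max, qp))


# Illustrative measurements from previously encoded GOPs (made-up values).
qps = [24, 28, 32, 36]
psnrs = [40.0, 38.0, 36.0, 34.0]

slope, intercept = fit_line(qps, psnrs)
print(slope, intercept)
print(qp_for_target_psnr(slope, intercept, 37.0))
```

A negative slope reflects the usual trade-off that a larger QP means coarser quantization and therefore lower PSNR.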

Speaker Identification Based on Vowel Classification and Vector Quantization (모음 인식과 벡터 양자화를 이용한 화자 인식)

  • Lim, Chang-Heon;Lee, Hwang-Soo;Un, Chong-Kwan
    • The Journal of the Acoustical Society of Korea / v.8 no.4 / pp.65-73 / 1989
  • In this paper, we propose a text-independent speaker identification algorithm based on VQ (vector quantization) and vowel classification, and its performance is studied and compared with that of a conventional speaker identification algorithm using VQ. The proposed speaker identification algorithm is composed of three processes: vowel segmentation, vowel recognition, and average distortion calculation. The vowel segmentation is performed automatically using RMS energy, BTR (Back-to-Total cavity volume Ratio), and SFBR (Signed Front-to-Back maximum area Ratio) extracted from the input speech signal. If the input speech signal is noisy, particularly when the SNR is around 20 dB, the proposed speaker identification algorithm performs better than the reference speaker identification algorithm when the vowel segmentation is done correctly. The same result is obtained when a noisy telephone speech signal is used as the input.
