• Title/Summary/Keyword: model quantization


NEST-C: A deep learning compiler framework for heterogeneous computing systems with artificial intelligence accelerators

  • Jeman Park;Misun Yu;Jinse Kwon;Junmo Park;Jemin Lee;Yongin Kwon
    • ETRI Journal / v.46 no.5 / pp.851-864 / 2024
  • Deep learning (DL) has significantly advanced artificial intelligence (AI); however, frameworks such as PyTorch, ONNX, and TensorFlow are optimized for general-purpose GPUs, leading to inefficiencies on specialized accelerators such as neural processing units (NPUs) and processing-in-memory (PIM) devices. These accelerators are designed to optimize both throughput and energy efficiency but they require more tailored optimizations. To address these limitations, we propose the NEST compiler (NEST-C), a novel DL framework that improves the deployment and performance of models across various AI accelerators. NEST-C leverages profiling-based quantization, dynamic graph partitioning, and multi-level intermediate representation (IR) integration for efficient execution on diverse hardware platforms. Our results show that NEST-C significantly enhances computational efficiency and adaptability across various AI accelerators, achieving higher throughput, lower latency, improved resource utilization, and greater model portability. These benefits contribute to more efficient DL model deployment in modern AI applications.

A Simple Transcoding Method for H.264 Coding System (H.264 부호화시스템에서 간단한 비트열 변환 기법)

  • Yang, Young-Hyun;Kwon, Soon-Kak
    • Journal of Korea Multimedia Society / v.9 no.7 / pp.818-826 / 2006
  • In this paper, we investigate the relationship between bitrate and quantization parameter needed for a transcoding method that converts an H.264 bitstream of a particular bitrate to another bitrate. We also propose a new method for transcoding the bitrate between H.264 coded video bitstreams. The proposed transcoding method updates the model parameters from the previous picture or slice by using the approximated relationship between bitrate and quantization step size, finds the target quantization step size, and then generates the target bitrate by simple coding processing immediately after requantization. Therefore, the proposed method does not need complex bitrate control and converts to the target bitrate with a simple implementation. Simulations show that the proposed method transcodes exactly to an assigned target bitrate for four test sequences with different characteristics.


Face Recognition using Vector Quantizer in Eigenspace (아이겐공간에서 벡터 양자기를 이용한 얼굴인식)

  • 임동철;이행세;최태영
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.5 / pp.185-192 / 2004
  • This paper presents face recognition using vector quantization in the eigenspace of the faces. The existing eigenface method is not sufficient for representing the variations of faces. To make up for this defect, the proposed method clusters feature vectors by vector quantization in the eigenspace of the faces. In the training stage, the face images are transformed into points in the eigenspace by the eigenfaces (eigenvectors), and the set of points for each person is represented by the centroids of a vector quantizer. In the recognition stage, the vector quantizer finds the centroid with the minimum quantization error between the feature vector of the input image and the centroids in the database. The experiments are performed on 600 faces from the Faces94 database. The existing eigenface method has a minimum of 64 misrecognitions, and the proposed method has a minimum of 20 misrecognitions when 4 codevectors are used. In conclusion, the proposed method is an effective method that improves the recognition rate by overcoming the variation of faces.
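
The recognition stage described in this abstract can be sketched as a nearest-codevector search. This is a minimal illustration, not the authors' implementation: it assumes the feature vectors have already been projected into the eigenspace and that each person's codebook has already been trained (for example by k-means). The names `quantization_error`, `recognize`, and the toy codebooks are hypothetical.

```python
import math

def quantization_error(feature, codebook):
    """Smallest Euclidean distance from the feature to any codevector."""
    return min(math.dist(feature, codevector) for codevector in codebook)

def recognize(feature, codebooks):
    """Return the label whose codebook yields the minimum quantization error."""
    return min(codebooks, key=lambda label: quantization_error(feature, codebooks[label]))

# Toy 2-D "eigenspace" codebooks with two codevectors per person (made-up values).
codebooks = {
    "person_a": [(0.0, 0.0), (1.0, 0.0)],
    "person_b": [(5.0, 5.0), (6.0, 5.0)],
}

label = recognize((0.9, 0.2), codebooks)
```

Using several codevectors per person, rather than a single mean, is what lets the codebook cover different variations of the same face.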

Fixed-point Implementation for Downlink Traffic Channel of IEEE 802.16e OFDMA TDD System (IEEE 802.16e OFDMA TDD 시스템 하향링크 트래픽 채널의 Fixed-point 구현 방법론)

  • Kim Kyoo-Hyun;Sun Tae-Hyung;Wang Yu-Peng;Chang Kyung-Hi;Park Hyung-Il;Eo Ik-Soo
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.6A / pp.593-602 / 2006
  • This paper proposes a methodology for deciding a suitable bit size that minimizes hardware complexity and performance degradation relative to the floating-point design in the fixed-point implementation of the downlink traffic channel of the IEEE 802.16e OFDMA TDD system. One of the major issues to consider when implementing a fixed-point design is selecting saturation or quantization properly, based on knowledge of the signal distribution from its pdf or histogram. In addition, exhaustive computer simulations must be run by trial and error for various bit sizes to obtain an appropriate bit size while minimizing performance degradation. We carry out computer simulations to decide the optimized bit size of the downlink traffic channel under AWGN and the ITU-R M.1225 Veh-A channel model.
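
The trade-off between saturation and quantization mentioned in this abstract can be illustrated with a small sketch. It assumes a signed two's-complement format in which `int_bits` includes the sign bit; the function names and the bit-size split are hypothetical and are not taken from the paper.

```python
def to_fixed_point(x, int_bits, frac_bits):
    """Quantize x to signed fixed point and return the result as a float.

    int_bits  -- integer bits, including the sign bit (assumption)
    frac_bits -- fractional bits; the quantization step is 2**-frac_bits
    """
    scale = 1 << frac_bits
    total_bits = int_bits + frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    q = round(x * scale)            # quantization: round to the nearest step
    q = max(q_min, min(q_max, q))   # saturation: clamp to the representable range
    return q / scale

def max_abs_error(samples, int_bits, frac_bits):
    """Worst-case error over the samples for one candidate bit size."""
    return max(abs(s - to_fixed_point(s, int_bits, frac_bits)) for s in samples)

# Sweep candidate fractional bit sizes for a few sample values.
samples = [0.3, -0.7, 1.2, -1.9]
errors = {frac_bits: max_abs_error(samples, 2, frac_bits) for frac_bits in (2, 4, 6, 8)}
```

More fractional bits shrink the rounding error, while more integer bits reduce how often saturation occurs, so a histogram of the signal helps decide how to split a fixed total bit size.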

A Study on the Equivalent Model Transformation of the Discrete Linear Systems (이산 선형 시스템의 등가 모델 변환에 관한 연구)

  • 임승우;김정화;정찬수
    • Proceedings of the Korean Institute of Communication Sciences Conference / 1991.10a / pp.215-219 / 1991
  • This paper presents an equivalent model transformation that reduces the restrictions of digitization in discrete linear systems. In this algorithm, weights are applied to the controllability and observability Gramians, the nonsingular (regular) coordinate transformation matrix T is obtained, and the state-space coefficients of the weighted model are then computed. The study shows that a frequency response with low quantization error is obtained, depending on the order of the weighting function. The results show that the frequency response of the proposed algorithm is better than that of the balanced realization in systems with smaller bit sizes.

Adaptive Model-Based Quantization Parameter Decision for Video Rate Control (비디오 비트율 제어를 위한 적응적 모델 기반의 양자화 변수 결정 방법)

  • Kim, Seon-Ki;Ho, Yo-Sung
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.4C / pp.411-417 / 2007
  • Rate control is an essential component in video coding for providing better quality under given coding constraints, such as channel capacity and frame rate. In general, source data in video coding cannot be described by a single distribution, which can cause an exhaustive approximation problem. This lowers coding efficiency in weak channel environments, such as mobile communications. In this paper, we design a new quantization parameter decision model based on the rate-distortion function of a generalized Gaussian distribution. To adaptively express various source data distributions, we decide the shape parameter by observing the ratio of samples that have a small value. For the experiments, the proposed algorithm is implemented in an H.264/AVC video codec, and its performance is compared with that of the MPEG-2 TM5 and H.263 TMN8 rate control algorithms. As the simulation results show, the proposed algorithm provides better quality than the previous algorithms and generates a number of bits close to the target bits.

HMM-based Speech Recognition using FSVQ and Fuzzy Concept (FSVQ와 퍼지 개념을 이용한 HMM에 기초를 둔 음성 인식)

  • 안태옥
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.6 / pp.90-97 / 2003
  • This paper proposes a speech recognition method based on an HMM (Hidden Markov Model) using FSVQ (First Section Vector Quantization) and a fuzzy concept. In the proposed method, we generate a codebook for the first section and then obtain multiple observation sequences, in descending order of probability values, based on a fuzzy rule from the codebook of the first section. These first-section observation sequences are then used for training, and in recognition, the word with the highest first-section probability is selected as the recognized word by the same concept. Train station names are selected as the target recognition vocabulary, and LPC cepstrum coefficients are used as the feature parameters. In addition to the speech recognition experiments with the proposed method, we run experiments with other methods under the same conditions and data. The experimental results show that the proposed method based on an HMM using FSVQ and a fuzzy concept is superior to the others in recognition rate.

Physics-based Algorithm Implementation for Characterization of Gate-dielectric Engineered MOSFETs including Quantization Effects

  • Mangla, Tina;Sehgal, Amit;Saxena, Manoj;Haldar, Subhasis;Gupta, Mridula;Gupta, R.S.
    • JSTS:Journal of Semiconductor Technology and Science / v.5 no.3 / pp.159-167 / 2005
  • Quantization effects (QEs), which manifest when the device dimensions are comparable to the de Broglie wavelength, are becoming common physical phenomena in the present micro-/nanometer technology era. While most novel devices take advantage of QEs to achieve fast switching speed, miniature size, and extremely small power consumption, mainstream CMOS devices (with the exception of EEPROMs) generally suffer in performance from these effects. In this paper, an analytical model is developed that accounts for the QEs and poly-depletion effects (PDEs) at the silicon (Si)/dielectric interface and describes the capacitance-voltage (C-V) and current-voltage (I-V) characteristics of MOS devices with thin oxides. It is also applicable to multi-layer gate-stack structures, since a general procedure is used for calculating the quantum inversion charge density. Using this inversion charge density, device characteristics are obtained. Solutions for C-V can also be obtained quickly, without the computational burden of solving over a physical grid. We conclude with a comparison of the results obtained with our model against those obtained by self-consistent solution of the Schrödinger and Poisson equations and against simulations reported previously in the literature. Good agreement was observed between them.

Abnormal sonar signal detection using recurrent neural network and vector quantization (순환신경망과 벡터 양자화를 이용한 비정상 소나 신호 탐지)

  • Kibae Lee;Guhn Hyeok Ko;Chong Hyun Lee
    • The Journal of the Acoustical Society of Korea / v.42 no.6 / pp.500-510 / 2023
  • Passive sonar signals mainly contain both normal and abnormal signals. The abnormal signals mixed with normal signals are primarily detected using an AutoEncoder (AE) that learns only normal signals. However, existing AEs may perform inaccurate detection by reconstructing distorted normal signals from mixed signals. To address these limitations, we propose an abnormal signal detection model based on a Recurrent Neural Network (RNN) and vector quantization. The proposed model generates a codebook representing the learned latent vectors and detects abnormal signals more accurately through the proposed search process of code vectors. In experiments using publicly available underwater acoustic data, the AE and Variational AutoEncoder (VAE) using the proposed method showed at least a 2.4% improvement in detection performance and at least a 9.2% improvement in extraction performance for abnormal signals over the existing models.

Towards Low Complexity Model for Audio Event Detection

  • Saleem, Muhammad;Shah, Syed Muhammad Shehram;Saba, Erum;Pirzada, Nasrullah;Ahmed, Masood
    • International Journal of Computer Science & Network Security / v.22 no.9 / pp.175-182 / 2022
  • In our daily life, we come across different types of information, for example in the form of multimedia and text. We all need different types of information for our common routines, such as watching or reading the news, listening to the radio, and watching different types of videos. However, we sometimes run into problems when a certain type of information is required. For example, someone listening to the radio wants to hear jazz, but all the radio channels play pop music mixed with advertisements. The listener gets stuck with pop music and gives up searching for jazz. This problem can be solved with an automatic audio classification system. Deep Learning (DL) models could make human life easier through audio classification, but such models are expensive and difficult to deploy on edge devices such as the Nano BLE Sense and Raspberry Pi, because they require large computational power, such as a graphics processing unit (GPU). To address this problem, we propose a low-complexity DL model for Audio Event Detection (AED). We extracted Mel-spectrograms of dimension 128×431×1 from the audio signals and applied normalization. Three data augmentation methods were applied: frequency masking, time masking, and mixup. In addition, we designed a Convolutional Neural Network (CNN) with spatial dropout, batch normalization, and separable 2D convolutions, inspired by VGGnet [1]. We also reduced the model size by applying float16 quantization to the trained model. Experiments were conducted on the updated dataset provided by the Detection and Classification of Acoustic Events and Scenes (DCASE) 2020 challenge. We confirm that our model achieved a val_loss of 0.33 and an accuracy of 90.34% with a model size of 132.50 KB.
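
The storage effect of float16 quantization mentioned in this abstract can be shown with a small standard-library sketch. It does not reproduce the authors' toolchain, which the abstract does not name; the helper names and weight values are hypothetical. It only shows that storing weights in IEEE 754 half precision halves the bytes per weight relative to float32, at the cost of a small rounding error.

```python
import struct

def quantize_to_float16(weights):
    """Pack weights as little-endian IEEE 754 half-precision values (2 bytes each).

    struct raises OverflowError for magnitudes beyond the float16 range.
    """
    return struct.pack(f"<{len(weights)}e", *weights)

def dequantize_from_float16(blob):
    """Unpack half-precision bytes back into Python floats."""
    count = len(blob) // 2
    return list(struct.unpack(f"<{count}e", blob))

# Made-up weights standing in for a trained layer.
weights = [0.5, -1.25, 0.1, 3.0]

fp32_blob = struct.pack(f"<{len(weights)}f", *weights)  # 4 bytes per weight
fp16_blob = quantize_to_float16(weights)                # 2 bytes per weight
restored = dequantize_from_float16(fp16_blob)

# Largest rounding error introduced by the float16 round trip.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Values such as 0.5 and 3.0 are exactly representable in float16, while 0.1 is not, so the round trip introduces a small error for it.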