• Title/Summary/Keyword: memory compression

Search Result 256, Processing Time 0.031 seconds

Compression of CNN Using Low-Rank Approximation and CP Decomposition Methods (저계수 행렬 근사 및 CP 분해 기법을 이용한 CNN 압축)

  • Moon, HyeonCheol;Moon, Gihwa;Kim, Jae-Gon
    • Journal of Broadcast Engineering
    • /
    • v.26 no.2
    • /
    • pp.125-131
    • /
    • 2021
  • In recent years, Convolutional Neural Networks (CNNs) have achieved outstanding performance in the fields of computer vision such as image classification, object detection, visual quality enhancement, etc. However, as huge amount of computation and memory are required in CNN models, there is a limitation in the application of CNN to low-power environments such as mobile or IoT devices. Therefore, the need for neural network compression to reduce the model size while keeping the task performance as much as possible has been emerging. In this paper, we propose a method to compress CNN models by combining matrix decomposition methods of LR (Low-Rank) approximation and CP (Canonical Polyadic) decomposition. Unlike conventional methods that apply one matrix decomposition method to CNN models, we selectively apply two decomposition methods depending on the layer types of CNN to enhance the compression performance. To evaluate the performance of the proposed method, we use the models for image classification such as VGG-16, RestNet50 and MobileNetV2 models. The experimental results show that the proposed method gives improved classification performance at the same range of 1.5 to 12.1 times compression ratio than the existing method that applies only the LR approximation.

Making Cache-Conscious CCMR-trees for Main Memory Indexing (주기억 데이타베이스 인덱싱을 위한 CCMR-트리)

  • 윤석우;김경창
    • Journal of KIISE:Databases
    • /
    • v.30 no.6
    • /
    • pp.651-665
    • /
    • 2003
  • To reduce cache misses emerges as the most important issue in today's situation of main memory databases, in which CPU speeds have been increasing at 60% per year, and memory speeds at 10% per year. Recent researches have demonstrated that cache-conscious index structure such as the CR-tree outperforms the R-tree variants. Its search performance can be poor than the original R-tree, however, since it uses a lossy compression scheme. In this paper, we propose alternatively a cache-conscious version of the R-tree, which we call MR-tree. The MR-tree propagates node splits upward only if one of the internal nodes on the insertion path has empty room. Thus, the internal nodes of the MR-tree are almost 100% full. In case there is no empty room on the insertion path, a newly-created leaf simply becomes a child of the split leaf. The height of the MR-tree increases according to the sequence of inserting objects. Thus, the HeightBalance algorithm is executed when unbalanced heights of child nodes are detected. Additionally, we also propose the CCMR-tree in order to build a more cache-conscious MR-tree. Our experimental and analytical study shows that the two-dimensional MR-tree performs search up to 2.4times faster than the ordinary R-tree while maintaining slightly better update performance and using similar memory space.

Design of a Holter Monitoring System with Flash Memory Card (플레쉬 메모리 카드를 이용한 홀터 심전계의 설계)

  • 송근국;이경중
    • Journal of Biomedical Engineering Research
    • /
    • v.19 no.3
    • /
    • pp.251-260
    • /
    • 1998
  • The Holter monitoring system is a widely used noninvasive diagnostic tool for ambulatory patient who may be at risk from latent life-threatening cardiac abnormalities. In this paper, we design a high performance intelligent holter monitoring system which is characterized by the small-sized and the low-power consumption. The system hardware consists of one-chip microcontroller(68HC11E9), ECG preprocessing circuit, and flash memory card. ECG preprocessing circuit is made of ECG preamplifier with gain of 250, 500 and 1000, the bandpass filter with bandwidth of 0.05-100Hz, the auto-balancing circuit and the saturation-calibrating circuit to eliminate baseline wandering, ECG signal sampled at 240 samples/sec is converted to the digital signal. We use a linear recursive filter and preprocessing algorithm to detect the ECG parameters which are QRS complex, and Q-R-T points, ST-level, HR, QT interval. The long-term acquired ECG signals and diagnostic parameters are compressed by the MFan(Modified Fan) and the delta modulation method. To easily interface with the PC based analyzer program which is operated in DOS and Windows, the compressed data, that are compatible to FFS(flash file system) format, are stored at the flash memory card with SBF(symmetric block format).

  • PDF

Architecture Design for MPEG-2 AAC Filter bank Decoder using Recursive Structure (Recursive 구조를 이용한 MPEG-2 AAC 복호화기의 필터뱅크 구현)

  • 박세기;강명수;오신범;이채욱
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.6C
    • /
    • pp.865-873
    • /
    • 2004
  • MPEG-2 Advanced Audio Coding(AAC) is widely used in the multi-channel audio compression standards. And it combines hi인-resolution filter bank prediction techniques, and Huffman coding algorithm to achieve the broadcast-quality audio level at very low data rates. The forward and inverse modified discrete transforms which are operated in the encoder and the decoder of the filter bank need many computations. In this paper, we propose suitable recursive structure at IMDCT processing for MPEG-2 AAC real-time decoder. We confirm the memory, the computation speed and complexity of the proposed structure.

Compression-Based Volume Rendering on Distributed Memory Parallel Computers (분산 메모리 구조를 갖는 병렬 컴퓨터 상에서의 압축 기반 볼륨 렌더링)

  • Koo, Gee-Bum;Park, Sang-Hun;Song, Dong-Sub;Ihm, In-Sung
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.6 no.5
    • /
    • pp.457-467
    • /
    • 2000
  • 본 논문에서는 분산 메모리 구조를 갖는 병렬 컴퓨터 상에서 방대한 크기를 갖는 볼륨 데이터의 효과적인 가시화를 위한 병렬 광선 투사법을 제안한다. 데이터의 압축을 기반으로 하는 본 기법은 다른 프로세서의 메모리로부터 데이터를 읽기보다는 자신의 지역 메모리에 존재하는 압축된 데이터를 빠르게 복원함으로써 병렬 렌더링 성능을 향상시키는 것을 목표로 한다. 본 기법은 객체-순서와 영상-순서 탐색 알고리즘 모두의 정점을 이용하여 성능을 향상시켰다. 즉, 블록 단위의 최대-최소 팔진트리의 탐색과 각 픽셀의 불투명도 값을 동적으로 유지하는 실시간 사진트리를 응용함으로써 객체-공간과 영상-공간 각각의 응집성을 이용하였다. 본 논문에서 제안하는 압축 기반 병렬 볼륨 렌더링 방법은 렌더링 수행 중 발생하는 프로세서간의 통신을 최소화하도록 구현되었는데, 이러한 특징은 프로세서 사이의 상당히 높은 데이터 통신 비용을 감수하여야 하는 PC 및 워크스테이션의 클러스터와 같은 더욱 실용적인 분산 환경에서 매우 유용하다. 본 논문에서는 Cray T3E 병렬 컴퓨터 상에서 Visible Man 데이터를 이용하여 실험을 수행하였다.

  • PDF

Implementation of 4-channel Embedded DVR Based on Linux (리눅스 기반 4채널 임베디드 DVR 구현)

  • 이흥규;정갑천;최종현;박성모
    • Proceedings of the IEEK Conference
    • /
    • 2003.07c
    • /
    • pp.2677-2680
    • /
    • 2003
  • This paper describes the implementation of a 4 channel embedded DVR system. It receives analog video from CCD cameras and converts to 640${\times}$480 CCIR-656 digital video by 30 frames/sec. These digital images are compressed to the wevelet transformed image using hardware codec which is capable of 350:1 real-time compression and decompression. The DVR is working on linux and it implemented on an embedded system which is based on StrongARM processor. For the interface between processor system module and image processing module, GPIO and memory control module are used, device drivers are developed. Linux kernel source is customized. This paper provides techniques of embedded system development and embedded linux porting.

  • PDF

Practical (Second) Preimage Attacks on the TCS_SHA-3 Family of Cryptographic Hash Functions

  • Sekar, Gautham;Bhattacharya, Soumyadeep
    • Journal of Information Processing Systems
    • /
    • v.12 no.2
    • /
    • pp.310-321
    • /
    • 2016
  • TCS_SHA-3 is a family of four cryptographic hash functions that are covered by a United States patent (US 2009/0262925). The digest sizes are 224, 256, 384 and 512 bits. The hash functions use bijective functions in place of the standard compression functions. In this paper we describe first and second preimage attacks on the full hash functions. The second preimage attack requires negligible time and the first preimage attack requires $O(2^{36})$ time. In addition to these attacks, we also present a negligible time second preimage attack on a strengthened variant of the TCS_SHA-3. All the attacks have negligible memory requirements. To the best of our knowledge, there is no prior cryptanalysis of any member of the TCS_SHA-3 family in the literature.

Modified harmony search and its application to cost minimization of RC columns

  • Medeiros, Guilherme F.;Kripka, Moacir
    • Advances in Computational Design
    • /
    • v.2 no.1
    • /
    • pp.1-13
    • /
    • 2017
  • This paper presents a variant of the Harmony Search Algorithm (HS) and its application to discrete optimization. The main proposed modifications regarding original HS are related to stopping criterion and reinitialization of population, called Harmony Memory. In order to investigate the efficiency of the algorithm, it was applied for obtaining optimal sections of reinforced concrete columns subjected to uniaxial flexural compression. To minimize the cost of the section, the amount and diameters of the reinforcement bars and the dimensions of the columns cross sections were considered as design variables. The obtained results were compared to those generated by other optimization methods. Since, to the examples, Harmony Search reached the same results achieved by Simulated Annealing, some additional analysis are presented in order to compare these methods regarding success rate and number of iterations to reach the optimum.

A design of Discrete Wavelet Transform Encoder for Image Signal Processing (영상신호 처리를 위한 이산 웨이브렛 변환용 부호화기 설계)

  • 김윤홍;김정화양원일이강현
    • Proceedings of the IEEK Conference
    • /
    • 1998.10a
    • /
    • pp.1101-1104
    • /
    • 1998
  • The modern multimedia applications which are video processor, video conference or video phone and so forth require real time processing. Because of a large amount of image data, those require high compression performance. In this paper, the prosposed image processing encoder was designed by using wavelet transform encoding. The proposed filter block can process imae data on the high speed because of composing individual function blocks by parallel and compute both highpass and lowpass coefficient in the same clock cycle. When image data is decomposed into multiresolution, the proposed scheme needs external memory and controller to save intermediate results and it can operate within 33MHz.

  • PDF

Design of Virtual Memory Compression System on Mobile Device (모바일 환경에서의 가상 메모리 압축 시스템 설계)

  • 정진우;장승주
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2002.04a
    • /
    • pp.46-48
    • /
    • 2002
  • 일반적으로 메모리 관리에서 가장 큰 문제점은 느린 보조기억 장치의 속도와 빠를 주기억장치의 속도 차이에서 나타나는 성능 저하라고 한 수 있다. 요구 페이징 기법에서 페이지 폴트가 일어나면 페이지 교체 정책에 의해 필요 없는 페이지들을 스왑 디바이스로 이동을 시킨다. 이때 느린 보조기억장치의 접근 속도로 인한 응답시간의 지연은 전체적인 시스템 성능의 저하를 초래한다. 이때 스왑 디바이스로의 접근 횟수와 페이지의 크기를 줄일 수 있다면 페이지 아웃되는 응답시간을 높일 수 있을 것이다. 따라서 본 논문에서는 가상 메모리 압축 시스템을 설계하여 스왑 아웃되는 시간을 줄이며 압축된 페이지들을 사용함으로써 메모리 공간을 절약하여 시스템의 전체적인 성능 향상을 위한 모바일 시스템을 설계한다.

  • PDF