• Title/Summary/Keyword: Interleaved Memory

Search Result 8, Processing Time 0.023 seconds

Reliability Analysis of Interleaved Memory with a Scrubbing Technique (인터리빙 구조를 갖는 메모리의 스크러빙 기법 적용에 따른 신뢰도 해석)

  • Ryu, Sang-Moon
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.20 no.4
    • /
    • pp.443-448
    • /
    • 2014
  • Soft errors in memory devices that caused by radiation are the main threat from a reliability point of view. This threat can be commonly overcome with the combination of SEC (Single-Error Correction) codes and scrubbing technique. The interleaving architecture can give memory devices the ability of tolerating these soft errors, especially against multiple-bit soft errors. And the interleaving distance plays a key role in building the tolerance against multiple-bit soft errors. This paper proposes a reliability model of an interleaved memory device which suffers from multiple-bit soft errors and are protected by a combination of SEC code and scrubbing. The proposed model shows how the interleaving distance works to improve the reliability and can be used to make a decision in determining optimal scrubbing technique to meet the demands in reliability.

A memory management scheme for parallel viterbi algorithm with multiple add-compare-select modules (다중의 Add-compare-select 모듈을 갖는 병렬 비터비 알고리즘의 메모리 관리 방법)

  • 지현순;박동선;송상섭
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.21 no.8
    • /
    • pp.2077-2089
    • /
    • 1996
  • In this paper, a memory organization and its control method are proposed for the implementation of parallel Virterbi decoders. The design is mainly focused on lowering the hardware complexity of a parallel Viterbi decoder which is to reduce the decoding speed. The memories requeired in a Viterbi decoder are the SMM(State Metric Memory) and the TBM(Traceback Memory);the SMM for storing the path metrics of states and the TBM for storing the survial path information. A general parallel Viterbi decoder for high datarate usually consists of multiple ACS (Add-Compare-Select) units and their corresponding memeory modules.for parallel ACS units, SMMs and TBMs are partitioned into smaller independent pairs of memory modules which are separately interleaved to provide the maximum processing speed. In this design SMMs are controlled with addrss generators which can simultaneously compute addresses of the new path metrics. A bit shuffle technique is employed to provide a parallel access to the TBMs to store the survivor path informations from multiple ACS modules.

  • PDF

Cycle-accurate NPU Simulator and Performance Evaluation According to Data Access Strategies (Cycle-accurate NPU 시뮬레이터 및 데이터 접근 방식에 따른 NPU 성능평가)

  • Kwon, Guyun;Park, Sangwoo;Suh, Taeweon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.17 no.4
    • /
    • pp.217-228
    • /
    • 2022
  • Currently, there are increasing demands for applying deep neural networks (DNNs) in the embedded domain such as classification and object detection. The DNN processing in embedded domain often requires custom hardware such as NPU for acceleration due to the constraints in power, performance, and area. Processing DNN models requires a large amount of data, and its seamless transfer to NPU is crucial for performance. In this paper, we developed a cycle-accurate NPU simulator to evaluate diverse NPU microarchitectures. In addition, we propose a novel technique for reducing the number of memory accesses when processing convolutional layers in convolutional neural networks (CNNs) on the NPU. The main idea is to reuse data with memory interleaving, which recycles the overlapping data between previous and current input windows. Data memory interleaving makes it possible to quickly read consecutive data in unaligned locations. We implemented the proposed technique to the cycle-accurate NPU simulator and measured the performance with LeNet-5, VGGNet-16, and ResNet-50. The experiment shows up to 2.08x speedup in processing one convolutional layer, compared to the baseline.

Efficient Interleaving Schemes of Volume Holographic memory

  • Lee, Byoung-Ho;Han, Seung-Hoon;Kim, Min-Seung;Yang, Byung-Choon
    • Journal of the Optical Society of Korea
    • /
    • v.6 no.4
    • /
    • pp.172-179
    • /
    • 2002
  • Like the conventional digital storage systems, volume holographic memory can be deteriorated by burst errors due to its high-density storage characteristics. These burst errors are used byoptical defects such as scratches, dust particles, etc. and are two-dimensional in a data page. To deal with these errors, we introduce some concepts for describing them and propose efficient two- dimensional interleaving schemes. The schemes are two-dimensional lattices of an error-correction code word and have equilateral triangular and square structures. Using these structures, we can minimize the number of code words that are interleaved and improve the efficiency of the system. For large size burst errors, the efficient interleaving structure is an equilateral triangular lattice. However, for some small size burst errors, it is reduced to a square lattice.

A Design of Programmable Fragment Shader with Reduction of Memory Transfer Time (메모리 전송 효율을 개선한 programmable Fragment 쉐이더 설계)

  • Park, Tae-Ryoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.12
    • /
    • pp.2675-2680
    • /
    • 2010
  • Computation steps for 3D graphic processing consist of two stages - fixed operation stage and programming required stage. Using this characteristic of 3D pipeline, a hybrid structure between graphics hardware designed by fixed structure and programmable hardware based on instructions, can handle graphic processing more efficiently. In this paper, fragment Shader is designed under this hybrid structure. It also supports OpenGL ES 2.0. Interior interface is optimized to reduce the delay of entire pipeline, which may be occurred by data I/O between the fixed hardware and the Shader. Interior register group of the Shader is designed by an interleaved structure to improve the register space and processing speed.

SPA-Resistant Signed Left-to-Right Receding Method (단순전력분석에 안전한 Signed Left-to-Right 리코딩 방법)

  • Han, Dong-Guk;Kim, Tae-Hyun;Kim, Ho-Won;Lim, Jong-In;Kim, Sung-Kyoung
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.17 no.1
    • /
    • pp.127-132
    • /
    • 2007
  • This paper proposed receding methods for a radix-${\gamma}$ representation of the secret scalar which are resistant to SPA. Unlike existing receding method, these receding methods are left-to-right so they can be interleaved with a left-to-right scalar multiplication, removing the need to store both the scalar and its receding. Hence, these left-to-right methods are suitable for implementing on memory limited devices such as smart cards and sensor nodes

A 2.0-GS/s 5-b Current Mode ADC-Based Receiver with Embedded Channel Equalizer (채널 등화기를 내장한 2.0GS/s 5비트 전류 모드 ADC 기반 수신기)

  • Moon, Jong-Ho;Jung, Woo-Chul;Kim, Jin-Tae;Kwon, Kee-Won;Jun, Young-Hyun;Chun, Jung-Hoon
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.49 no.12
    • /
    • pp.184-193
    • /
    • 2012
  • In this paper, a 5-bit 2-GS/s 2-way time interleaved pipeline ADC for high-speed serial link receiver is demonstrated. Implemented as a current-mode amplifier, the stage ADC simultaneously processes the tracking and residue amplification to achieve higher sampling rate. In addition, each stage incorporates a built-in 1-tap FIR equalizer, reducing inter-symbol-interference (ISI)without an extra digital post-processing. The ADC is designed in a 110nm CMOS technology. It comsumes 91mW from a 1.2-V supply. The area excluding the memory block is $0.58{\times}0.42mm^2$. Simulation results show that when equalizer is enabled, the ADC achieves SNDR of 25.2dB and ENOB of 3.9bits at 2.0GS/s sample rate for a Nyquist input signal. When the equalizer is disengaged, SNDR is 26.0dB for 20MHz-1.0GHz input signal, and the ENOB of 4.0bits.

Effective Scheduling Algorithm using Queue Separation and Packet Segmentation for Jumbo Packets (큐 분리 및 패킷 분할을 이용한 효율적인 점보패킷 스케쥴링 방법)

  • 윤빈영;고남석;김환우
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.9A
    • /
    • pp.663-668
    • /
    • 2003
  • With the advent of high speed networking technology, computers connected to the high-speed networks tend to consume more of their CPU cycles to process data. So one of the solutions to improve the performance of the computers is to reduce the CPU cycles for processing the data. As the consumption of the CPU cycles is increased in proportion to the number of the packets per second to be processed, reducing the number of the packets per second by increasing the length of the packet is one of the solutions. In order to meet this requirement, two types of jumbo packets such as jumbograms and jumbo frames have already been standardized or being discussed. In case that the jumbograms and general packets are interleaved and scheduled together in a router, the jumbogrms may deteriorate the QoS of the general packets due to the transfer delay. They also frequently exhaust the memory with storing the huge length of the packets. This produces the congestion state easily in the router that results in the loss of the packets. In this paper, we analyze the problems in processing the jumbo packets and suggest a noble solution to overcome the problems.