• Title/Summary/Keyword: In-Memory Computing

Search Result 759, Processing Time 0.024 seconds

Memory BIST Circuit Generator System Design Based on Fault Model (고장 모델 기반 메모리 BIST 회로 생성 시스템 설계)

  • Lee Jeong-Min;Shim Eun-Sung;Chang Hoon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.2 s.332
    • /
    • pp.49-56
    • /
    • 2005
  • In this paper, we propose a memory BIST Circuit Creation System which creates BIST circuit based on user defined fault model and generates the optimized march test algorithm. Traditional tools have some limit that regenerates BIST circuit after changing the memory type or test algorithm. However, this proposed creation system can automatically generate memory BIST circuit which is suitable in the various memory type and apply algorithm which is required by user. And it gets more efficient through optimizing algorithms for fault models which is selected randomly according to proposed nile. In addition, it support various address width and data and consider interface of IEEE 1149.1 circuit.

Parallel Computing Environment for R with on Supercomputer Systems (빅데이터 분석을 위한 슈퍼컴퓨터 환경에서 R의 병렬처리)

  • Lee, Sang Yeol;Won, Joong Ho
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.39 no.4
    • /
    • pp.19-31
    • /
    • 2014
  • We study parallel processing techniques for the R programming language of high performance computing technology. In this study, we used massively parallel computing system which has 25,408 cpu cores. We conducted a performance evaluation of a distributed memory system using MPI and of a the shared memory system using OpenMP. Our findings are summarized as follows. First, For some particular algorithms, parallel processing is about 150 times faster than serial processing in R. Second, the distributed memory system gets faster as the number of nodes increases while shared memory system is limited in the improvement of performance, due to the limit of the number of cpus in a single system.

Development of the Efficient Compressed Data Management System for Embedded DBMS (모바일 DBMS를 위한 효율적인 압축 데이터 관리 시스템의 개발)

  • Shin, Young-Jae;Hwang, Jin-Ho;Kim, Hak-Soo;Lee, Seung-Mi;Son, Jin-Hyun
    • The KIPS Transactions:PartD
    • /
    • v.15D no.5
    • /
    • pp.589-598
    • /
    • 2008
  • Recently, Mobile Computing Devices are used generally. And Information which is processed by Mobile computing devices is increasing. Because information is digitalizing. So Mobile computing Devices demand an Embedded DBMS for efficient management of information. Moreover Mobile computing Devices demand an efficient storage management in NAND-type flash memory because the NAND-type flash memory is using generally in Mobile computing devices and the NAND-type flash memory is more expensive than the magnetic disks. So that in this paper, we present an efficient Compressed Data Management System for the embedded DBMS that is used in flash memory. This proposed system improve the space utilization and extend a lifetime of a flash memory because it decreases the size of data.

Analysis of the GPGPU Performance for Various Combinations of Workloads Executed Concurrently (동시에 실행되는 워크로드 조합에 따른 GPGPU 성능 분석)

  • Kim, Dongwhan;Eom, Hyeonsang
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.3
    • /
    • pp.165-170
    • /
    • 2017
  • Many studies have utilized GPGPU (General-Purpose Graphic Processing Unit) and its high computing power to compute complex tasks. The characteristics of GPGPU programs necessitate the operations of memory copy between the host and device. A high latency period can affect the performance of the program. Thus, it is required to significantly improve the performance of GPGPU programs by optimizations. By executing multiple GPGPU programs simultaneously, the latency hiding effect of memory copy is achieved by overlapping the memory copy and computing operations in GPGPU. This paper presents the results of analyzing the latency hiding effect for memory copy operations. Furthermore, we propose a performance anticipation model and an algorithm for the limitations of using pinned memory, and show that the use of the proposed algorithm results in a 41% performance increase.

Scratchpad Memory Architectures and Allocation Algorithms for Hard Real-Time Multicore Processors

  • Liu, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • v.9 no.2
    • /
    • pp.51-72
    • /
    • 2015
  • Time predictability is crucial in hard real-time and safety-critical systems. Cache memories, while useful for improving the average-case memory performance, are not time predictable, especially when they are shared in multicore processors. To achieve time predictability while minimizing the impact on performance, this paper explores several time-predictable scratch-pad memory (SPM) based architectures for multicore processors. To support these architectures, we propose the dynamic memory objects allocation based partition, the static allocation based partition, and the static allocation based priority L2 SPM strategy to retain the characteristic of time predictability while attempting to maximize the performance and energy efficiency. The SPM based multicore architectural design and the related allocation methods thus form a comprehensive solution to hard real-time multicore based computing. Our experimental results indicate the strengths and weaknesses of each proposed architecture and the allocation method, which offers interesting on-chip memory design options to enable multicore platforms for hard real-time systems.

Bandwidth-aware Memory Placement on Hybrid Memories targeting High Performance Computing Systems

  • Lee, Jongmin
    • Journal of the Korea Society of Computer and Information
    • /
    • v.24 no.8
    • /
    • pp.1-8
    • /
    • 2019
  • Modern computers provide tremendous computing capability and a large memory system. Hybrid memories consist of next generation memory devices and are adopted in high performance systems. However, the increased complexity of the microprocessor makes it difficult to operate the system effectively. In this paper, we propose a simple data migration method called Bandwidth-aware Data Migration (BDM) to efficiently use memory systems for high performance processors with hybrid memory. BDM monitors the status of applications running on the system using hardware performance monitoring tools and migrates the appropriate pages of selected applications to High Bandwidth Memory (HBM). BDM selects applications whose bandwidth usages are high and also evenly distributed among the threads. Experimental results show that BDM improves execution time by an average of 20% over baseline execution.

Computationally Efficient Instance Memory Monitoring Scheme for a Security-Enhanced Cloud Platform (클라우드 보안성 강화를 위한 연산 효율적인 인스턴스 메모리 모니터링 기술)

  • Choi, Sang-Hoon;Park, Ki-Woong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.27 no.4
    • /
    • pp.775-783
    • /
    • 2017
  • As interest in cloud computing grows, the number of users using cloud computing services is increasing. However, cloud computing technology has been steadily challenged by security concerns. Therefore, various security breaches are springing up to enhance the system security for cloud services users. In particular, research on detection of malicious VM (Virtual Machine) is actively underway through the introspecting virtual machines on the cloud platform. However, memory analysis technology is not used as a monitoring tool in the environments where multiple virtual machines are run on a single server platform due to obstructive monitoring overhead. As a remedy to the challenging issue, we proposes a computationally efficient instance memory introspection scheme to minimize the overhead that occurs in memory dump and monitor it through a partial memory monitoring based on the well-defined kernel memory map library.

Design of In-Memory Computing Adder Using Low-Power 8+T SRAM (저 전력 8+T SRAM을 이용한 인 메모리 컴퓨팅 가산기 설계)

  • Chang-Ki Hong;Jeong-Beom Kim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.2
    • /
    • pp.291-298
    • /
    • 2023
  • SRAM-based in-memory computing is one of the technologies to solve the bottleneck of von Neumann architecture. In order to achieve SRAM-based in-memory computing, it is essential to design efficient SRAM bit-cell. In this paper, we propose a low-power differential sensing 8+T SRAM bit-cell which reduces power consumption and improves circuit performance. The proposed 8+T SRAM bit-cell is applied to ripple carry adder which performs SRAM read and bitwise operations simultaneously and executes each logic operation in parallel. Compared to the previous work, the designed 8+T SRAM-based ripple carry adder is reduced power consumption by 11.53%, but increased propagation delay time by 6.36%. Also, this adder is reduced power-delay-product (PDP) by 5.90% and increased energy-delay- product (EDP) by 0.08%. The proposed circuit was designed using TSMC 65nm CMOS process, and its feasibility was verified through SPECTRE simulation.

Analysis on Memory Characteristics of Graphics Processing Units for Designing Memory System of General-Purpose Computing on Graphics Processing Units (범용 그래픽 처리 장치의 메모리 설계를 위한 그래픽 처리 장치의 메모리 특성 분석)

  • Choi, Hongjun;Kim, Cheolhong
    • Smart Media Journal
    • /
    • v.3 no.1
    • /
    • pp.33-38
    • /
    • 2014
  • Even though the performance of microprocessor is improved continuously, the performance improvement of computing system becomes hard to increase, in order to some drawbacks including increased power consumption. To solve the problem, general-purpose computing on graphics processing units(GPGPUs), which execute general-purpose applications by using specialized parallel-processing device representing graphics processing units(GPUs), have been focused. However, the characteristics of applications related with graphics is substantially different from the characteristics of general-purpose applications. Therefore, GPUs cannot exploit the outstanding computational resources sufficiently due to various constraints, when they execute general-purpose applications. When designing GPUs for GPGPU, memory system is important to effectively exploit the GPUs since typically general-purpose applications requires more memory accesses than graphics applications. Especially, external memory access requiring long latency impose a big overhead on the performance of GPUs. Therefore, the GPU performance must be improved if hierarchical memory architecture which can reduce the number of external memory access is applied. For this reason, we will investigate the analysis of GPU performance according to hierarchical cache architectures in executing various benchmarks.

Built-In Redundancy Analysis Algorithm for Embedded Memory Built-In Self Repair with 2-D Redundancy (내장 메모리 자가 복구를 위한 여분의 메모리 분석 알고리즘)

  • Shim, Eun-Sung;Chang, Hoon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.2
    • /
    • pp.113-120
    • /
    • 2007
  • With the advance of VLSI technology, the capacity and density of memories is rapidly growing. In this paper we proposed reallocation algorithm. All faulty cell of embedded memory is reallocated into the row and column spare memory. This work implements reallocation algorithm and BISR to verify its design.