• Title/Abstract/Keywords: Computing-In-Memory

Search results: 764 items (processing time: 0.029 s)

Development of an Instant On System Using Storage Class Memory

  • 문영제;도인환;박정수;노삼혁
    • 한국정보과학회논문지:컴퓨팅의 실제 및 레터 / Vol. 16, No. 2 / pp. 207-211 / 2010
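  • Storage class memory (SCM) is a next-generation memory technology that combines non-volatility with byte-addressable random access, and how it will be put to use is drawing considerable attention. Introducing SCM into existing systems can greatly improve execution speed and safety, and can also provide new features that were impossible in conventional systems. This study focuses on the potential of SCM for innovative uses and, as part of that effort, proposes SOONN, which uses SCM as main memory so that a powered-off system returns to its previous state as soon as power is applied. In this paper, we demonstrate the feasibility of SOONN by developing a prototype system in a real embedded system environment.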

Deep Learning Based Security Model for Cloud based Task Scheduling

  • Devi, Karuppiah;Paulraj, D.;Muthusenthil, Balasubramanian
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 9 / pp. 3663-3679 / 2020
  • Scheduling plays a dynamic role in cloud computing, both in generating the resources for each task and in distributing them efficiently. The principal goal of scheduling is to limit resource starvation and to guarantee fairness among the parties using the resources. Because the demand for resources fluctuates dynamically, arranging resources in advance is a challenging task. Many task-scheduling approaches have been used in cloud computing environments. Security in the cloud computing environment is one of the core issues in distributed computing. We have designed a deep learning-based security model for scheduling tasks in cloud computing and implemented it using the CloudSim 3.0 simulator, which is written in Java. The results are verified from different perspectives, such as response time with and without security factors, makespan, cost, CPU utilization, I/O utilization, memory utilization, and execution time, and are compared with the Round Robin (RR) and Weighted Round Robin (WRR) algorithms.

Traffic-based reinforcement learning with neural network algorithm in fog computing environment

  • Jung, Tae-Won;Lee, Jong-Yong;Jung, Kye-Dong
    • International Journal of Internet, Broadcasting and Communication / Vol. 12, No. 1 / pp. 144-150 / 2020
  • Reinforcement learning is a technology that can present successful and creative solutions in many areas. In this work, reinforcement learning is used to deploy containers from cloud servers to fog servers, learning to maximize the reward obtained from reduced traffic. The aim is to predict traffic in the network and to optimize a traffic-based fog computing network environment for cloud, fog, and clients. The reinforcement learning system collects network traffic data from the fog server and IoT devices. The reinforcement learning neural network, which takes the collected traffic data as input, can be built from Long Short-Term Memory (LSTM) networks in network environments that support fog computing, so that it learns time series data and predicts optimized traffic. The paper describes the input and output values of the traffic-based reinforcement learning LSTM neural network, the composition of its nodes, the activation and error functions of the hidden layer, the method for handling overfitting, and the optimization algorithm.

Trend of Intel Nonvolatile Memory Technology

  • 이용섭;우영주;정성인
    • 전자통신동향분석 / Vol. 35, No. 3 / pp. 55-65 / 2020
  • With the development of nonvolatile memory technology, Intel has released the Optane datacenter persistent memory module (DCPMM), which is deployed in the dual in-line memory module (DIMM) form factor. The results of research and experiments on Optane DCPMMs differ significantly from the results anticipated by previous emulation-based studies. The DCPMM can be used in two modes: memory mode (similar to volatile dynamic random access memory, DRAM) and app direct mode (similar to file storage). It buffers data at 256-byte granularity, four times the 64-byte CPU (central processing unit) cache line. However, these properties are not easy to use correctly, and using them incorrectly may degrade performance. Optane shares characteristics of both DRAM and storage devices. To take advantage of the performance characteristics of this device, operating systems and applications require new approaches, and this change in computing environments will require a significant amount of future research.
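
The mismatch between the 256-byte buffer granularity and the 64-byte cache line can be illustrated with a small calculation. This is a minimal sketch, not a model of the actual device: the function name `media_bytes_touched` is hypothetical, and it assumes that any write touching a 256-byte block costs one full block access and that writes to the same block within a batch are perfectly combined.

```python
CACHE_LINE = 64    # bytes; CPU cache line size stated in the abstract
MEDIA_BLOCK = 256  # bytes; buffer granularity stated in the abstract


def media_bytes_touched(offsets, size=CACHE_LINE):
    """Bytes of 256-byte blocks touched by writes of `size` bytes at `offsets`.

    Assumption (illustrative only): each distinct block is counted once,
    i.e. writes to the same block in this batch are combined.
    """
    blocks = set()
    for off in offsets:
        first = off // MEDIA_BLOCK
        last = (off + size - 1) // MEDIA_BLOCK
        blocks.update(range(first, last + 1))
    return len(blocks) * MEDIA_BLOCK


# Four adjacent cache lines fill exactly one 256-byte block.
sequential = [i * CACHE_LINE for i in range(4)]
# Four cache lines spread 4 KiB apart land in four different blocks.
scattered = [i * 4096 for i in range(4)]

useful = 4 * CACHE_LINE
print(media_bytes_touched(sequential) / useful)  # amplification, sequential
print(media_bytes_touched(scattered) / useful)   # amplification, scattered
```

Under these assumptions, the scattered pattern touches four times as many bytes as it actually writes, which is one way the granularity mismatch could show up as the performance degradation the abstract mentions.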

Analysis of Faults of Large Power System by Memory-Limited Computer

  • 박영문
    • 전기의세계 / Vol. 21, No. 4 / pp. 39-44 / 1972
  • This paper describes a new approach for minimizing working memory space without losing too much computing time in the analysis of power system faults. The approach decomposes a large power system into several small groups of subsystems, forms individual bus impedance matrices, stores them in auxiliary memory, and later assembles them into the original total system algorithmically. The approach also uses techniques for diagonalizing primitive impedances and expanding the system bus impedance matrices by adding a fault bus. This scheme ensures remarkable savings of working storage and continuous computation of fault currents and voltages as the fault location varies.

Hardware Platforms for Flash Memory/NVRAM Software Development

  • Nam, Eyee-Hyun;Choi, Ki-Seok;Choi, Jin-Yong;Min, Hang-Jun;Min, Sang-Lyul
    • Journal of Computing Science and Engineering / Vol. 3, No. 3 / pp. 181-194 / 2009
  • Flash memory is increasingly being used in a wide range of storage applications because of its low power consumption, low access latency, small form factor, and high shock resistance. However, the current platforms for flash memory software development do not meet the ever-increasing requirements of flash memory applications. This paper presents three different hardware platforms for flash memory/NVRAM (non-volatile RAM) software development that overcome the limitations of the current platforms. The three platforms target different types of host system and provide various features that facilitate the development and verification of flash memory/NVRAM software. In this paper, we also demonstrate the usefulness of the three platforms by implementing three different types of storage system (one for each platform) based on them.

Bit Flip Reduction Schemes to Improve PCM Lifetime: A Survey

  • Han, Miseon;Han, Youngsun
    • IEIE Transactions on Smart Processing and Computing / Vol. 5, No. 5 / pp. 337-345 / 2016
  • Recently, as the number of cores in computer systems has increased, the need for larger memory capacity has also increased. Unfortunately, dynamic random access memory (DRAM), popularly used as main memory for decades, now faces a scalability limitation. Phase change memory (PCM) is considered one of the strong alternatives to DRAM due to its advantages, such as high scalability, non-volatility, low idle power, and so on. However, since PCM suffers from short write endurance, direct use of PCM in main memory incurs a significant problem due to its short lifetime. To solve the lifetime limitation, many studies have focused on reducing the number of bit flips per write request. In this paper, we describe the PCM operating principles in detail and explore various bit flip reduction schemes. Also, we compare their performance in terms of bit reduction rate and lifetime improvement.
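
As an illustration of what a bit flip reduction scheme can look like, the sketch below follows the idea of Flip-N-Write, a well-known scheme of this kind. The abstract does not say which schemes the survey covers, so this choice is an assumption. The function names and the 8-bit word width are illustrative. The idea is to store either the data or its bitwise inverse, whichever changes fewer cells, plus a one-bit flag recording which form was stored.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def flip_n_write(old_word, old_flag, new_data, width=8):
    """Choose the cheaper encoding for new_data (illustrative sketch).

    old_word: bits currently stored in the cells
    old_flag: flip flag currently stored (0 = plain, 1 = inverted)
    Returns (stored_word, flag, bit_flips), counting the flag cell too.
    """
    mask = (1 << width) - 1
    inverted = ~new_data & mask
    plain_cost = hamming(old_word, new_data) + (1 if old_flag != 0 else 0)
    inv_cost = hamming(old_word, inverted) + (1 if old_flag != 1 else 0)
    if inv_cost < plain_cost:
        return inverted, 1, inv_cost
    return new_data, 0, plain_cost


def read_word(stored_word, flag, width=8):
    """Recover the logical data from the stored word and its flag."""
    mask = (1 << width) - 1
    return (~stored_word & mask) if flag else stored_word


# Writing 0b11111110 over 0b00000000 would flip 7 cells if stored as-is.
stored, flag, flips = flip_n_write(0b00000000, 0, 0b11111110)
print(bin(stored), flag, flips)
print(bin(read_word(stored, flag)))
```

Because the costs of the two encodings always sum to `width + 1`, the cheaper one never flips more than about half of the cells, at the price of one extra flag bit per word.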

A Technique for Improving the Performance of Cache Memories

  • Cho, Doosan
    • International Journal of Internet, Broadcasting and Communication / Vol. 13, No. 3 / pp. 104-108 / 2021
  • To improve performance in IoT and edge computing systems, memory is usually organized hierarchically. Access speed decreases with distance from the CPU, in the order of registers, cache memory, main memory, and storage. Energy consumption likewise increases with distance from the CPU. Therefore, it is important to develop a technique that places frequently used data in the upper levels of the memory hierarchy as much as possible to improve performance and energy consumption. However, the technique must address the cache performance degradation caused by the lack of spatial locality that occurs when the data access stride is large. This study proposes a technique that selectively places data with a large access stride in a software-controlled cache. With the proposed technique, spatial locality can be improved by reducing the data access interval, and consequently cache performance can be improved.
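
The effect of access stride on spatial locality can be illustrated by counting how many cache lines a sequence of accesses touches. This is a minimal sketch and not the technique proposed in the paper: the function name `lines_touched`, the 64-byte line size, and the 4-byte element size are assumptions chosen for illustration.

```python
LINE_BYTES = 64  # assumed cache line size


def lines_touched(n_accesses, stride_bytes, elem_bytes=4, line_bytes=LINE_BYTES):
    """Count distinct cache lines touched by n_accesses elements
    read at a fixed byte stride starting from address 0."""
    lines = set()
    for i in range(n_accesses):
        addr = i * stride_bytes
        lines.add(addr // line_bytes)
        lines.add((addr + elem_bytes - 1) // line_bytes)
    return len(lines)


# 64 elements read with a 256-byte stride: every access hits a new line.
strided = lines_touched(64, stride_bytes=256)
# The same 64 elements packed contiguously, as they might be after being
# copied into a software-controlled buffer: 16 elements share each line.
packed = lines_touched(64, stride_bytes=4)
print(strided, packed)
```

With these numbers the strided pattern touches 64 lines and the packed layout touches 4, which shows why shortening the access interval improves spatial locality.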

A Study on Improvement of Low-power Memory Architecture in IoT/edge Computing

  • 조두산
    • 한국산업융합학회 논문집 / Vol. 24, No. 1 / pp. 69-77 / 2021
  • Low-cost design methodologies are widely used for IoT devices. In such networked devices, memory is composed of flash memory, SRAM, DRAM, and so on, and because a large amount of data is processed, memory design is an important factor in system performance. Therefore, each device selects optimized design factors such as function, performance, and cost according to market demand. The design options for a memory architecture in low-cost IoT devices are very limited, given the configuration of SRAM, flash memory, and DRAM. To process as much data as possible in the same space, an architecture that supports parallel processing units is usually provided. Such a parallel architecture provides high performance at low cost. However, it needs precise software techniques for mapping instructions and data onto the parallel architecture. This paper proposes an instruction/data mapping method to support optimized parallel processing performance. The proposed method optimizes system performance by actively using hardware and software parallelism.

Algorithmic GPGPU Memory Optimization

  • Jang, Byunghyun;Choi, Minsu;Kim, Kyung Ki
    • JSTS:Journal of Semiconductor Technology and Science / Vol. 14, No. 4 / pp. 391-406 / 2014
  • The performance of General-Purpose computation on Graphics Processing Units (GPGPU) is heavily dependent on memory access behavior. This sensitivity is due to a combination of the underlying Massively Parallel Processing (MPP) execution model present on GPUs and the lack of architectural support to handle irregular memory access patterns. Application performance can be significantly improved by applying memory-access-pattern-aware optimizations that exploit knowledge of the characteristics of each access pattern. In this paper, we present an algorithmic methodology to semi-automatically find the best mapping of the memory accesses present in serial loop nests to underlying data-parallel architectures, based on a comprehensive static memory access pattern analysis. To that end, we present a simple yet powerful mathematical model that captures all memory access pattern information present in serial data-parallel loop nests. We then show how this model is used in practice to select the most appropriate memory space for data and to search for an appropriate thread mapping and work group size from a large design space. To evaluate the effectiveness of our methodology, we report on execution speedup using selected benchmark kernels that cover a wide range of memory access patterns commonly found in GPGPU workloads. Our experimental results are reported using the industry standard heterogeneous programming language, OpenCL, targeting the NVIDIA GT200 architecture.