• Title/Summary/Keyword: in-memory computing

Search Result 766, Processing Time 0.024 seconds

Development of an Instant On System Using Storage Class Memory (스토리지 클래스 메모리를 활용한 즉각 구동 시스템의 개발)

  • Moon, Young-Je;Doh, In-Hwan;Park, Jung-Soo;Noh, Sam-H.
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.16 no.2
    • /
    • pp.207-211
    • /
    • 2010
  • Storage Class Memory (SCM) has both characteristics of non-volatility and random byte addressability. The advent of SCM can bring about novel and innovative features that are not possible in con ventional computing systems. This paper suggests a new system design that turns on/off a system instantly. To do this, we replace the main memory with SCM to retain the volatile system states as the system is turned off. We implement our prototype in an embedded environment and measure its system on/off time.

Deep Learning Based Security Model for Cloud based Task Scheduling

  • Devi, Karuppiah;Paulraj, D.;Muthusenthil, Balasubramanian
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.9
    • /
    • pp.3663-3679
    • /
    • 2020
  • Scheduling plays a dynamic role in cloud computing in generating as well as in efficient distribution of the resources of each task. The principle goal of scheduling is to limit resource starvation and to guarantee fairness among the parties using the resources. The demand for resources fluctuates dynamically hence the prearranging of resources is a challenging task. Many task-scheduling approaches have been used in the cloud-computing environment. Security in cloud computing environment is one of the core issue in distributed computing. We have designed a deep learning-based security model for scheduling tasks in cloud computing and it has been implemented using CloudSim 3.0 simulator written in Java and verification of the results from different perspectives, such as response time with and without security factors, makespan, cost, CPU utilization, I/O utilization, Memory utilization, and execution time is compared with Round Robin (RR) and Waited Round Robin (WRR) algorithms.

Traffic-based reinforcement learning with neural network algorithm in fog computing environment

  • Jung, Tae-Won;Lee, Jong-Yong;Jung, Kye-Dong
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.12 no.1
    • /
    • pp.144-150
    • /
    • 2020
  • Reinforcement learning is a technology that can present successful and creative solutions in many areas. This reinforcement learning technology was used to deploy containers from cloud servers to fog servers to help them learn the maximization of rewards due to reduced traffic. Leveraging reinforcement learning is aimed at predicting traffic in the network and optimizing traffic-based fog computing network environment for cloud, fog and clients. The reinforcement learning system collects network traffic data from the fog server and IoT. Reinforcement learning neural networks, which use collected traffic data as input values, can consist of Long Short-Term Memory (LSTM) neural networks in network environments that support fog computing, to learn time series data and to predict optimized traffic. Description of the input and output values of the traffic-based reinforcement learning LSTM neural network, the composition of the node, the activation function and error function of the hidden layer, the overfitting method, and the optimization algorithm.

Trend of Intel Nonvolatile Memory Technology (인텔 비휘발성 메모리 기술 동향)

  • Lee, Y.S.;Woo, Y.J.;Jung, S.I.
    • Electronics and Telecommunications Trends
    • /
    • v.35 no.3
    • /
    • pp.55-65
    • /
    • 2020
  • With the development of nonvolatile memory technology, Intel has released the Optane datacenter persistent memory module (DCPMM) that can be deployed in the dual in-line memory module. The results of research and experiments on Optane DCPMMs are significantly different from the anticipated results in previous studies through emulation. The DCPMM can be used in two different modes, namely, memory mode (similar to volatile DRAM: Dynamic Random Access Memory) and app direct mode (similar to file storage). It has buffers in 256-byte granularity; this is four times the CPU (Central Processing Unit) cache line (i.e., 64 bytes). However, these properties are not easy to use correctly, and the incorrect use of these properties may result in performance degradation. Optane has the same characteristics of DRAM and storage devices. To take advantage of the performance characteristics of this device, operating systems and applications require new approaches. However, this change in computing environments will require a significant number of researches in the future.

Analysis of Faults of Large Power System by Memory-Limited Computer (소형전자계산기에 의한 대전력계통의 고장해석)

  • Young Moon Park
    • 전기의세계
    • /
    • v.21 no.4
    • /
    • pp.39-44
    • /
    • 1972
  • This paper describes a new approach for minimizing working memory spaces without loosing too much amount of computing time in the analysis of power system faults. This approach requires the decomposition of alrge power system into several small groups of subsystems, forms individual bus impedance matrics, store them in the auxiliary memory, later assembles them to the original total system by algorithms. And also the approach uses techniques for diagonalizing primitive impedances and expanding the system bus impedance matrices by adding a fault bus. These scheme ensures a remarkable savings of working storage and continous computations of fault currents and voltages with the voried fault locations.

  • PDF

Hardware Platforms for Flash Memory/NVRAM Software Development

  • Nam, Eyee-Hyun;Choi, Ki-Seok;Choi, Jin-Yong;Min, Hang-Jun;Min, Sang-Lyul
    • Journal of Computing Science and Engineering
    • /
    • v.3 no.3
    • /
    • pp.181-194
    • /
    • 2009
  • Flash memory is increasingly being used in a wide range of storage applications because of its low power consumption, low access latency, small form factor, and high shock resistance. However, the current platforms for flash memory software development do not meet the ever-increasing requirements of flash memory applications. This paper presents three different hardware platforms for flash memory/NVRAM (non-volatile RAM) software development that overcome the limitations of the current platforms. The three platforms target different types of host system and provide various features that facilitate the development and verification of flash memory/NVRAM software. In this paper, we also demonstrate the usefulness of the three platforms by implementing three different types of storage system (one for each platform) based on them.

Bit Flip Reduction Schemes to Improve PCM Lifetime: A Survey

  • Han, Miseon;Han, Youngsun
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.5 no.5
    • /
    • pp.337-345
    • /
    • 2016
  • Recently, as the number of cores in computer systems has increased, the need for larger memory capacity has also increased. Unfortunately, dynamic random access memory (DRAM), popularly used as main memory for decades, now faces a scalability limitation. Phase change memory (PCM) is considered one of the strong alternatives to DRAM due to its advantages, such as high scalability, non-volatility, low idle power, and so on. However, since PCM suffers from short write endurance, direct use of PCM in main memory incurs a significant problem due to its short lifetime. To solve the lifetime limitation, many studies have focused on reducing the number of bit flips per write request. In this paper, we describe the PCM operating principles in detail and explore various bit flip reduction schemes. Also, we compare their performance in terms of bit reduction rate and lifetime improvement.

A Technique for Improving the Performance of Cache Memories

  • Cho, Doosan
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.3
    • /
    • pp.104-108
    • /
    • 2021
  • In order to improve performance in IoT, edge computing system, a memory is usually configured in a hierarchical structure. Based on the distance from CPU, the access speed slows down in the order of registers, cache memory, main memory, and storage. Similar to the change in performance, energy consumption also increases as the distance from the CPU increases. Therefore, it is important to develop a technique that places frequently used data to the upper memory as much as possible to improve performance and energy consumption. However, the technique should solve the problem of cache performance degradation caused by lack of spatial locality that occurs when the data access stride is large. This study proposes a technique to selectively place data with large data access stride to a software-controlled cache. By using the proposed technique, data spatial locality can be improved by reducing the data access interval, and consequently, the cache performance can be improved.

A Study on Improvement of Low-power Memory Architecture in IoT/edge Computing (IoT/에지 컴퓨팅에서 저전력 메모리 아키텍처의 개선 연구)

  • Cho, Doosan
    • Journal of the Korean Society of Industry Convergence
    • /
    • v.24 no.1
    • /
    • pp.69-77
    • /
    • 2021
  • The widely used low-cost design methodology for IoT devices is very popular. In such a networked device, memory is composed of flash memory, SRAM, DRAM, etc., and because it processes a large amount of data, memory design is an important factor for system performance. Therefore, each device selects optimized design factors such as function, performance and cost according to market demand. The design of a memory architecture available for low-cost IoT devices is very limited with the configuration of SRAM, flash memory, and DRAM. In order to process as much data as possible in the same space, an architecture that supports parallel processing units is usually provided. Such parallel architecture is a design method that provides high performance at low cost. However, it needs precise software techniques for instruction and data mapping on the parallel architecture. This paper proposes an instruction/data mapping method to support optimized parallel processing performance. The proposed method optimizes system performance by actively using hardware and software parallelism.

Algorithmic GPGPU Memory Optimization

  • Jang, Byunghyun;Choi, Minsu;Kim, Kyung Ki
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.14 no.4
    • /
    • pp.391-406
    • /
    • 2014
  • The performance of General-Purpose computation on Graphics Processing Units (GPGPU) is heavily dependent on the memory access behavior. This sensitivity is due to a combination of the underlying Massively Parallel Processing (MPP) execution model present on GPUs and the lack of architectural support to handle irregular memory access patterns. Application performance can be significantly improved by applying memory-access-pattern-aware optimizations that can exploit knowledge of the characteristics of each access pattern. In this paper, we present an algorithmic methodology to semi-automatically find the best mapping of memory accesses present in serial loop nest to underlying data-parallel architectures based on a comprehensive static memory access pattern analysis. To that end we present a simple, yet powerful, mathematical model that captures all memory access pattern information present in serial data-parallel loop nests. We then show how this model is used in practice to select the most appropriate memory space for data and to search for an appropriate thread mapping and work group size from a large design space. To evaluate the effectiveness of our methodology, we report on execution speedup using selected benchmark kernels that cover a wide range of memory access patterns commonly found in GPGPU workloads. Our experimental results are reported using the industry standard heterogeneous programming language, OpenCL, targeting the NVIDIA GT200 architecture.