• Title/Summary/Keyword: In-Memory Computing

Energy Consumption Evaluation for Two-Level Cache with Non-Volatile Memory Targeting Mobile Processors

  • Matsuno, Shota;Togawa, Masashi;Yanagisawa, Masao;Kimura, Shinji;Sugibayashi, Tadahiko;Togawa, Nozomu
    • IEIE Transactions on Smart Processing and Computing / v.2 no.4 / pp.226-239 / 2013
  • Many systems include several on-chip memories, cache memory among them. Conventional cache memory is built from SRAM, but as SRAM leakage power increases, static energy accounts for a growing share of the total energy of the memory architecture. Spin-Transfer Torque RAM (STT-RAM), a type of Non-Volatile Memory (NVM), has several advantages over SRAM, such as high density, low leakage power, and non-volatility, but it consumes considerably more energy on writes. This study evaluated a wide range of energy consumption figures for a two-level cache that partially uses NVM on a mobile processor. The experimental evaluations confirmed that partial use of NVM in the two-level cache significantly reduces energy consumption.


Performance Optimization of Numerical Ocean Modeling on Cloud Systems (클라우드 시스템에서 해양수치모델 성능 최적화)

  • JUNG, KWANGWOOG;CHO, YANG-KI;TAK, YONG-JIN
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.27 no.3 / pp.127-143 / 2022
  • Recently, there have been many attempts to run numerical ocean models in cloud computing environments. A cloud computing environment can be an effective way to run numerical ocean models that require large-scale resources, or to quickly prepare a modeling environment for global or large-scale grids. Many commercial and private cloud computing systems provide technologies such as virtualization, high-performance CPUs and instances, Ethernet-based high-performance networking, and remote direct memory access for High Performance Computing (HPC). These features facilitate ocean modeling experiments on commercial cloud computing systems, and many scientists and engineers expect cloud computing to become mainstream in the near future. Analyzing the performance and features of commercial cloud services for numerical modeling is essential for selecting appropriate systems, as it helps minimize execution time and the amount of resources used. Cache memory has a large effect on numerical ocean models, which read and write data as multidimensional arrays, and network speed is important because large amounts of data move during communication. In this study, the performance of the Regional Ocean Modeling System (ROMS), the High Performance Linpack (HPL) benchmark, and the STREAM memory benchmark was evaluated and compared on commercial cloud systems to provide information for moving other ocean models to cloud computing. Using actual performance data and configuration settings obtained from virtualization-based commercial clouds, we evaluated the efficiency of computing resources for various model grid sizes. We found that cache hierarchy and capacity are crucial to the performance of ROMS runs that use a large amount of memory. Memory latency is also important to performance. Increasing the number of cores to reduce running time is more effective with large grid sizes than with small ones. Our results can serve as a reference for building a cloud computing system that minimizes time and cost for numerical ocean modeling.

Evaluation of GPU Computing Capacity for All-in-view GNSS SDR Implementation

  • Yun Sub, Choi;Hung Seok, Seo;Young Baek, Kim
    • Journal of Positioning, Navigation, and Timing / v.12 no.1 / pp.75-81 / 2023
  • In this study, we design an optimized Graphics Processing Unit (GPU)-based GNSS signal processing technique, with the goal of implementing a GNSS Software Defined Receiver (SDR) that can operate in real-time all-in-view mode in a multi-constellation, multi-frequency signal environment. In the proposed structure, the correlators of the existing GNSS SDR are processed by the GPU. We designed a memory structure and processing method that minimize memory access bottlenecks and optimize the distribution of GPU memory resources. The designed GNSS SDR can select and process only the GNSS or satellite signals requested by the user. Parameters such as the number of quantization bits, the sampling rate, and the number of signal tracking arms can also be selected. The computing capability of the designed GPU-based GNSS SDR was evaluated, and it was confirmed that up to 2400 channels can be processed in real time. The GPU-based GNSS SDR therefore has sufficient performance to operate in real-time all-in-view mode. In future studies, it will be used for more diverse GNSS signal processing and applied to multipath effect analysis using more tracking arms.

Efficient Accessing and Searching in a Sequence of Numbers

  • Seo, Jungjoo;Han, Myoungji;Park, Kunsoo
    • Journal of Computing Science and Engineering / v.9 no.1 / pp.1-8 / 2015
  • Accessing and searching in a sequence of numbers are fundamental operations in computing that arise in a wide range of applications. One such application is the cryptanalytic time-memory tradeoff, which targets a one-way function. A rainbow table, a common method for the time-memory tradeoff, contains elements from the input domain of a hash function, normally stored as sorted integers. In this paper, we present a practical indexing method for a monotonically increasing static sequence of numbers in which access and search queries can be answered efficiently in terms of both time and space. For a sequence of $n$ numbers from a universe $U=\{0,\ldots,m-1\}$, our data structure requires $n \lg(m/n) + O(n)$ bits, with constant average running time for both access and search queries. We also analyze the time and space complexities of the data structure, supported by experiments with rainbow tables.
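
The abstract does not describe the data structure itself. As a rough illustration of how a bound of the form $n \lg(m/n) + O(n)$ bits can arise, the Python sketch below uses the high-bits/low-bits split known from Elias-Fano coding: each number keeps about lg(m/n) low bits explicitly, and the high bits are stored only as bucket boundaries. This is an assumption chosen for illustration, not the authors' method, and the names `SplitIndex`, `access`, and `search` are hypothetical. Python lists are not bit-packed, and `access` uses a binary search over bucket boundaries where a real implementation would use a constant-time select structure.

```python
from bisect import bisect_right
from math import floor, log2


class SplitIndex:
    """Illustrative index over a strictly increasing sequence from {0, ..., m-1}."""

    def __init__(self, values, m):
        n = len(values)
        if n == 0:
            raise ValueError("sequence must be non-empty")
        # Keep roughly lg(m/n) low bits per element, as in Elias-Fano coding.
        self.low_bits = max(0, floor(log2(m / n)))
        self.mask = (1 << self.low_bits) - 1
        self.lows = [v & self.mask for v in values]
        # bucket_start[h] = index of the first element whose high part is >= h.
        buckets = ((m - 1) >> self.low_bits) + 1
        self.bucket_start = [0] * (buckets + 1)
        for v in values:
            self.bucket_start[(v >> self.low_bits) + 1] += 1
        for h in range(1, buckets + 1):
            self.bucket_start[h] += self.bucket_start[h - 1]

    def access(self, i):
        """Return the i-th number of the sequence (0-based)."""
        if not 0 <= i < len(self.lows):
            raise IndexError(i)
        # Last bucket whose start is <= i holds element i.
        high = bisect_right(self.bucket_start, i) - 1
        return (high << self.low_bits) | self.lows[i]

    def search(self, x):
        """Return the position of x in the sequence, or -1 if absent."""
        high = x >> self.low_bits
        if x < 0 or high + 1 >= len(self.bucket_start):
            return -1
        target = x & self.mask
        # Only the (on average small) bucket sharing x's high bits is scanned.
        for i in range(self.bucket_start[high], self.bucket_start[high + 1]):
            if self.lows[i] == target:
                return i
        return -1


index = SplitIndex([3, 7, 8, 20, 41, 42, 63], m=64)
fifth = index.access(4)
position = index.search(42)
missing = index.search(9)
```

With n elements spread over about n buckets, each bucket holds a constant number of elements on average, which is why a search that scans a single bucket can run in constant average time.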

An Efficient Recovery Method for Mobile Main Memory Database System (모바일 메인메모리 데이터베이스 시스템을 위한 효율적인 복구 기법)

  • Cho, Sung-Je
    • Journal of Information Technology Services / v.7 no.2 / pp.181-195 / 2008
  • The rapid growth of mobile communication technology has expanded mobile internet services, and mobile real-time transactions in particular carry considerable weight in the mobile field. Demand is increasing for mobile applications that process transactions in mobile computing environments. During transmission over wireless networks, a base station failure inevitably causes loss of the data in the base station buffer, and this loss must be compensated for communication to continue. Existing methods for handling base station failure are inadequate because they all incur too much overhead and resolve only link failures. In this paper, we study an efficient recovery system for a mobile DBMS. We propose the SLL (Segment Log List), which enables the mobile host to compensate for data loss efficiently when a base station fails. In SLL, a base station delivers output information about data cells to a mobile host. When a base station fails, the mobile host can retransmit just the next data cells. We also show the efficiency of the new method.

Finite Element Stress Analysis of Coil Springs using a Multi-level Substructuring Method II : Validation and Analysis (다단계 부분구조법을 이용한 코일스프링의 유한요소 응력해석 II : 검증 및 해석)

  • Kim, Jin-Young;Huh, Hoon
    • Transactions of the Korean Society of Automotive Engineers / v.8 no.3 / pp.151-162 / 2000
  • This study is concerned with computerized multi-level substructuring methods and the stress analysis of coil springs. The purpose of substructuring methods is to reduce computing time and computer memory requirements through multi-level reduction of the degrees of freedom in large problems modeled with three-dimensional continuum finite elements. In this paper, the spring super element that was developed is tested on tension, torsion, and bending of a cylindrical bar to verify its accuracy and efficiency for the multi-level substructuring method. The algorithm is then applied to the finite element analysis of coil springs. The results demonstrate the validity of the multi-level substructuring method and its efficiency in computing time and memory, giving good computational results in coil spring analysis.


Eigen-sensitivity Analysis of Augmented System State Matrix (전력계통의 확대상태행렬 고유치감도 해석)

  • Shim, Kwan-Shik;Nam, Hae-Kon;Kim, Yong-Gu
    • Proceedings of the KIEE Conference / 1996.07b / pp.749-753 / 1996
  • This paper presents a new method for first- and second-order eigen-sensitivity analysis of a system matrix in augmented form. Eigen-sensitivity analysis provides invaluable information for power system planning and operation. However, conventional eigen-sensitivity analysis methods, which need all the eigenvalues and eigenvectors, cannot be applied to large-scale power systems because of the large computer memory and computing time they require. In the proposed method, all sensitivity computations for a mode are carried out using the augmented system matrix and the mode's own eigenvalue and right and left eigenvectors. In other words, sensitivity analysis for a mode does not need information about the other eigenvalues and eigenvectors, and sparsity techniques can be fully utilized. Computations can thus be done very efficiently, with moderate computer memory and computing time, even for large power systems. The proposed algorithm is tested on a one-machine infinite-bus system.

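
For orientation, the standard first-order result for an ordinary state matrix $A(p)$ shows why a single mode can be treated on its own. The abstract does not give the augmented-form expressions, so the symbols below ($A$, $p$, $\lambda_i$, $\phi_i$, $\psi_i$) are assumptions for illustration and the formula is the textbook one, not the paper's. For a simple eigenvalue $\lambda_i$ with right eigenvector $\phi_i$ ($A\phi_i = \lambda_i\phi_i$) and left eigenvector $\psi_i$ ($\psi_i^{T}A = \lambda_i\psi_i^{T}$):

$$
\frac{\partial \lambda_i}{\partial p} = \frac{\psi_i^{T}\,\dfrac{\partial A}{\partial p}\,\phi_i}{\psi_i^{T}\phi_i}
$$

Only the eigenvalue and eigenvectors of mode $i$ appear on the right-hand side, so no other modes are needed for the first-order sensitivity.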

A Dynamic Power Management System for Multiple Client in Cloud Computing Environment (클라우드 환경에서 다중 클라이언트를 위한 동적 전원관리 시스템)

  • Cha, Seung-Min;Lee, Bong-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.2 / pp.213-221 / 2012
  • In this paper, a dynamic power management system is proposed to reduce energy consumption for multiple clients in cloud computing environments. The proposed system monitors keyboard and mouse input from the user, available memory, and CPU usage in the virtual machine. If the system detects no keyboard or mouse input for a certain amount of time and both available memory and CPU usage reach predefined threshold values, the manager in the virtual machine orders the client to shut down the client machine, which results in significant power savings. The developed system was applied to a real university computer lab, and its performance was evaluated.
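
The shutdown rule in this abstract can be sketched as a single predicate. The Python below is a minimal illustration under stated assumptions: the abstract does not say in which direction each threshold is crossed, so the sketch assumes "idle" means available memory at or above a minimum and CPU usage at or below a maximum. The names `Thresholds` and `should_shut_down`, the units, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical limits; the paper's actual values are not given."""
    idle_seconds: float             # time without keyboard or mouse input
    min_available_memory_mb: float  # assumed: plenty of free memory means idle
    max_cpu_percent: float          # assumed: low CPU usage means idle


def should_shut_down(idle_seconds, available_memory_mb, cpu_percent, limits):
    """Return True only when all three idle conditions hold together."""
    return (
        idle_seconds >= limits.idle_seconds
        and available_memory_mb >= limits.min_available_memory_mb
        and cpu_percent <= limits.max_cpu_percent
    )


limits = Thresholds(idle_seconds=600, min_available_memory_mb=2048, max_cpu_percent=5)
idle_client = should_shut_down(900, 3000, 2.0, limits)
busy_client = should_shut_down(900, 3000, 40.0, limits)
```

Requiring all three conditions avoids shutting down a machine that has no user input but is still running a long job.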

A Novel Memory Hierarchy for Flash Memory Based Storage Systems

  • Yim, Keno-Soo
    • JSTS:Journal of Semiconductor Technology and Science / v.5 no.4 / pp.262-269 / 2005
  • Semiconductor scientists and engineers ideally want non-volatile memory devices that are both faster and cheaper. In practice, no single device satisfies this because a faster device is expensive and a cheaper one is slow. In this paper, we therefore use heterogeneous non-volatile memories and construct an efficient hierarchy from them. First, a small RAM device (e.g., MRAM, FRAM, or PRAM) is used as a write buffer for flash memory devices. Because the buffer is faster and has no erase operation, writes complete quickly in the buffer, which shortens write latency. Also, if a write targets data already stored in the buffer, it is processed directly in the buffer, saving one write operation to flash storage. Second, we use several types of flash memory (e.g., SLC and MLC flash) to reduce the overall storage cost. Specifically, write requests are classified into two types, hot and cold, where hot data is likely to be modified in the near future. Only hot data is stored in the faster SLC flash, while cold data is kept in slower MLC flash or NOR flash. The evaluation results show that the proposed hierarchy improves the access time of flash memory storage in a cost-effective manner, thanks to the locality of memory accesses.
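
The two ideas in this abstract, a write buffer that absorbs repeated writes and hot/cold placement across flash tiers, can be sketched in a few lines of Python. This is an illustration only: the abstract does not say how hot data is detected or how the buffer is evicted, so the sketch assumes a simple write-count threshold and oldest-first eviction. The class `HybridStore` and its parameters are hypothetical.

```python
from collections import OrderedDict


class HybridStore:
    """Toy model: RAM write buffer in front of SLC (hot) and MLC (cold) flash."""

    def __init__(self, buffer_capacity, hot_threshold=2):
        self.buffer_capacity = buffer_capacity
        self.hot_threshold = hot_threshold  # assumed hot/cold rule: write count
        self.buffer = OrderedDict()         # address -> data, oldest entry first
        self.write_counts = {}
        self.slc = {}
        self.mlc = {}
        self.flash_writes = 0

    def write(self, addr, data):
        self.write_counts[addr] = self.write_counts.get(addr, 0) + 1
        if addr in self.buffer:
            # Overwrite is absorbed by the buffer; no flash write is issued.
            self.buffer[addr] = data
            self.buffer.move_to_end(addr)
            return
        if len(self.buffer) >= self.buffer_capacity:
            self._evict_oldest()
        self.buffer[addr] = data

    def _evict_oldest(self):
        addr, data = self.buffer.popitem(last=False)
        hot = self.write_counts[addr] >= self.hot_threshold
        target, other = (self.slc, self.mlc) if hot else (self.mlc, self.slc)
        other.pop(addr, None)  # keep a single copy across the flash tiers
        target[addr] = data
        self.flash_writes += 1

    def read(self, addr):
        for tier in (self.buffer, self.slc, self.mlc):
            if addr in tier:
                return tier[addr]
        raise KeyError(addr)


store = HybridStore(buffer_capacity=2, hot_threshold=2)
store.write("a", 1)
store.write("a", 2)  # absorbed in the buffer
store.write("b", 3)
store.write("c", 4)  # evicts "a", which was written twice, to SLC
store.write("d", 5)  # evicts "b", which was written once, to MLC
```

In this run, five write requests produce two flash writes, and the twice-written address lands in the faster tier.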

Efficient Labeling Scheme for Query Processing over XML Fragment Stream in Wireless Computing (무선 환경에서 XML 조각 스트림 질의 처리를 위한 효율적인 레이블링 기법)

  • Ko, Hye-Kyeong
    • The KIPS Transactions:PartD / v.17D no.5 / pp.353-358 / 2010
  • Unlike in traditional databases, queries on XML streams are constrained by real-time processing and memory usage. In this paper, a robust labeling scheme is proposed that quickly identifies structural relationships between XML fragments. The proposed labeling scheme enables effective query processing by removing many redundant operations and minimizing the number of fragments processed. In experiments, the proposed labeling scheme processes queries efficiently and optimizes memory usage.