• Title/Summary/Keyword: in-memory computing


Designing a low-power L1 cache system using aggressive data of frequent reference patterns

  • Jung, Bo-Sung;Lee, Jung-Hoon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.7
    • /
    • pp.9-16
    • /
    • 2022
  • Today, with the advent of the 4th industrial revolution, IoT (Internet of Things) systems are advancing rapidly. As a result, various applications demanding high performance and large capacity are emerging, and computing systems running them need low-power, high-performance memory. In this paper, we propose an effective structure for the L1 cache memory, which consumes the most energy in the computing system. The proposed cache system is composed of two parts, the L1 main cache and the buffer cache. The main cache has 2 banks, and each bank is organized as a 2-way set-associative cache. On an L1 cache hit, the data is copied into the buffer cache according to the proposed algorithm. According to simulation, the proposed L1 cache system improves the energy-delay product by about 65% compared to the existing 4-way set-associative cache memory.
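
The sketch below illustrates the cache organization the abstract describes: a 2-bank main cache, each bank 2-way set associative, backed by a small buffer cache that receives a copy of the data on a main-cache hit. The set count, buffer size, line size, and copy-on-every-hit policy are illustrative assumptions, since the abstract does not give the exact parameters or copy algorithm.

```python
# Minimal sketch of the two-level L1 organization described in the abstract:
# a 2-bank main cache, each bank 2-way set associative, plus a small buffer
# cache that receives copies of data on main-cache hits. The copy policy and
# sizes below are illustrative assumptions, not the paper's exact algorithm.
from collections import OrderedDict

class SetAssociativeBank:
    def __init__(self, num_sets, ways):
        self.ways = ways
        self.sets = [OrderedDict() for _ in range(num_sets)]  # LRU order per set

    def access(self, tag, set_index):
        lines = self.sets[set_index]
        if tag in lines:
            lines.move_to_end(tag)          # refresh LRU position
            return True                     # hit
        if len(lines) >= self.ways:
            lines.popitem(last=False)       # evict least recently used line
        lines[tag] = None
        return False                        # miss (line now filled)

class BufferedL1Cache:
    def __init__(self, num_sets=64, ways=2, buffer_entries=8, line_size=32):
        self.banks = [SetAssociativeBank(num_sets, ways) for _ in range(2)]
        self.buffer = OrderedDict()         # small fully associative buffer cache
        self.buffer_entries = buffer_entries
        self.num_sets = num_sets
        self.line_size = line_size

    def access(self, address):
        block = address // self.line_size
        bank = block % 2                    # interleave blocks across the 2 banks
        set_index = (block // 2) % self.num_sets
        tag = block // (2 * self.num_sets)
        if block in self.buffer:            # cheapest hit: buffer cache
            self.buffer.move_to_end(block)
            return "buffer hit"
        if self.banks[bank].access(tag, set_index):
            # On a main-cache hit, copy the block into the buffer cache so a
            # frequently referenced block can be served from the buffer next time.
            self.buffer[block] = None
            if len(self.buffer) > self.buffer_entries:
                self.buffer.popitem(last=False)
            return "main hit"
        return "miss"

cache = BufferedL1Cache()
print([cache.access(a) for a in (0, 0, 64, 0, 4096, 0)])
```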

Analysis of Large Power System by Small Digital Computer (소형 digital computer를 이용한 대전력계통의 해석)

  • 박영문;정재길
    • 전기의세계
    • /
    • v.23 no.1
    • /
    • pp.61-68
    • /
    • 1974
  • This paper develops algorithms and a computer program for load flow solution and fault analysis of a large power system on a small digital computer. The conventional methods for load flow solution and fault analysis of a large power system require too much computer memory and computing time. Therefore, this paper describes methods for reducing the memory requirement and computing time as follows. (1) Load flow solution: store each primitive impedance of the lines along with a list of the bus numbers of both terminals of the lines, and store only the nonzero elements of the bus admittance matrix. (2) Fault analysis: partition the large power system into several groups of subsystems, form the individual bus impedance matrices, store them, and assemble only the required portions of them into the original total system by the algorithm.
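
As an illustration of storage scheme (1), the snippet below stores each line's primitive series impedance with its terminal bus numbers and keeps only the nonzero elements of the bus admittance matrix in a dictionary-based sparse structure. The line data are made-up example values, not from the paper, and shunt admittances are ignored.

```python
# Illustrative sketch of storage scheme (1) from the abstract: keep each line's
# primitive impedance together with its terminal bus numbers, and build the bus
# admittance matrix keeping only nonzero elements (here a dict of dicts).
# The line data below are made-up example values; shunts are ignored.

lines = [
    # (from_bus, to_bus, series impedance z)
    (1, 2, 0.02 + 0.06j),
    (1, 3, 0.08 + 0.24j),
    (2, 3, 0.06 + 0.18j),
]

def build_sparse_ybus(lines):
    ybus = {}                                   # {row: {col: admittance}}, nonzeros only
    def add(i, j, value):
        row = ybus.setdefault(i, {})
        row[j] = row.get(j, 0) + value
    for i, j, z in lines:
        y = 1 / z                               # primitive admittance of the line
        add(i, i, y); add(j, j, y)              # diagonal (self) terms
        add(i, j, -y); add(j, i, -y)            # off-diagonal (mutual) terms
    return ybus

ybus = build_sparse_ybus(lines)
for i, row in sorted(ybus.items()):
    for j, y in sorted(row.items()):
        print(f"Y[{i},{j}] = {y:.3f}")
```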


Unstructured Pressure Based Method for All Speed Flows (전 속도영역 유동을 위한 비정렬격자 압력기반해법)

  • Choi, Hyung-Il;Lee, Do-Hyung;Maeng, Joo-Sung
    • Transactions of the Korean Society of Mechanical Engineers B
    • /
    • v.26 no.11
    • /
    • pp.1521-1530
    • /
    • 2002
  • This article proposes a pressure-based method for predicting flows at all speeds. The compressible SIMPLE algorithm is extended to an unstructured grid framework. Convection terms are discretized using a second-order scheme with a deferred correction approach. The diffusion term discretization is based on a structured grid analogy that can easily be adapted to a hybrid unstructured grid solver. The method also uses a node-centered scheme with an edge-based data structure for memory and computing-time efficiency on arbitrary grid types. Both incompressible and compressible benchmark problems are solved using the above methodology. The demonstration is further extended to a slip flow problem that has a low Reynolds number but exhibits compressibility effects. It is shown that the proposed method can improve efficiency in memory usage and computing time without losing any accuracy.
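
The fragment below sketches the node-centered, edge-based data structure the abstract mentions: each edge stores its two end-node indices plus a face-area weight, and a single loop over edges scatters a face flux to both nodes. The placeholder central-average flux only stands in for the paper's SIMPLE-based discretization.

```python
# Minimal sketch of a node-centred, edge-based data structure: each edge stores
# its two end-node indices and an associated face area, and one loop over edges
# scatters the face flux to both nodes. The "flux" here is just a placeholder
# central average, not the paper's pressure-based discretization.
import numpy as np

num_nodes = 4
phi = np.array([1.0, 2.0, 3.0, 4.0])                 # nodal values of a scalar field
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0]])   # end nodes of each edge
face_area = np.array([0.5, 0.5, 0.5, 0.5])           # face area associated with each edge

residual = np.zeros(num_nodes)
for (n1, n2), area in zip(edges, face_area):
    flux = 0.5 * (phi[n1] + phi[n2]) * area          # placeholder face flux
    residual[n1] -= flux                             # flux leaves node n1
    residual[n2] += flux                             # flux enters node n2

print(residual)
```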

Evaluation of Network Reliability Using Most Probable States

  • Oh, Dae-Ho;Park, Dong-Ho;Lee, Seung-Min
    • Proceedings of the Korean Reliability Society Conference
    • /
    • 2001.06a
    • /
    • pp.463-469
    • /
    • 2001
  • An algorithm is presented for generating the most probable system states in decreasing order of probability from the probability of each unit. The proposed algorithm is compared with existing methods with respect to memory requirement and execution time. Our method is simpler and, judging from the computing experiments, requires less memory than the previously known methods while taking comparable execution time for an acceptable level of the criterion.
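
A minimal sketch of the general idea, assuming each unit's working probability is at least 0.5: start from the all-units-working state and expand states lazily with a max-heap so they are emitted in decreasing order of probability. This only illustrates most-probable-state enumeration, not the paper's specific algorithm or its memory optimizations.

```python
# Hedged sketch of most-probable-states enumeration: best-first expansion from
# the all-working state using a max-heap, so states come out in decreasing
# order of probability. Assumes each unit's failure probability is <= 0.5;
# this is an illustration, not the algorithm proposed in the paper.
import heapq

def most_probable_states(p_work, limit):
    """Yield up to `limit` states (tuple of 0/1 per unit, 1 = working),
    in decreasing order of state probability."""
    n = len(p_work)
    p_fail = [1 - p for p in p_work]
    all_up = tuple([1] * n)
    prob_all_up = 1.0
    for p in p_work:
        prob_all_up *= p
    heap = [(-prob_all_up, all_up)]             # max-heap via negated probability
    seen = {all_up}
    while heap and limit > 0:
        neg_prob, state = heapq.heappop(heap)
        yield -neg_prob, state
        limit -= 1
        for i in range(n):
            if state[i] == 1:                   # fail one more unit
                child = state[:i] + (0,) + state[i + 1:]
                if child not in seen:
                    seen.add(child)
                    child_prob = (-neg_prob) * p_fail[i] / p_work[i]
                    heapq.heappush(heap, (-child_prob, child))

for prob, state in most_probable_states([0.95, 0.9, 0.8], limit=4):
    print(state, round(prob, 4))
```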


Computational Complexity Comparison of Second-Order Volterra Filtering Algorithms

  • Im, Sungin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.16 no.2E
    • /
    • pp.38-46
    • /
    • 1997
  • The objective of this paper is to compare the computational complexity of five algorithms for computing time-domain second-order Volterra filter outputs, in terms of the number of real multiplication and addition operations required for implementation. The study shows that if the filter memory length is greater than or equal to 16, the fast algorithm using the overlap-save method and the frequency-domain symmetry properties of the quadratic coefficients is the most efficient among the algorithms investigated in this paper. When the filter memory length is less than 16, the algorithm using the time-domain symmetry properties is better than any other algorithm.
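
For reference, the snippet below computes one output sample of a direct time-domain second-order Volterra filter while exploiting the quadratic-kernel symmetry h2[m1][m2] = h2[m2][m1], so only the m2 >= m1 terms are evaluated. The kernel values and the rough multiplication count are illustrative; this is not one of the five specific algorithms compared in the paper.

```python
# Sketch of a direct time-domain second-order Volterra filter that exploits the
# quadratic kernel symmetry h2[m1][m2] == h2[m2][m1], so only m2 >= m1 terms
# are computed. Kernel values are random placeholders for illustration.
import numpy as np

def volterra2_output(x, h1, h2, n):
    """Output sample y[n] of a second-order Volterra filter with memory N."""
    N = len(h1)
    y = 0.0
    for m1 in range(N):
        y += h1[m1] * x[n - m1]                     # linear part
        for m2 in range(m1, N):                     # symmetric quadratic part
            term = h2[m1, m2] * x[n - m1] * x[n - m2]
            y += term if m1 == m2 else 2.0 * term   # off-diagonal terms counted twice
    return y

N = 4                                               # filter memory length
rng = np.random.default_rng(0)
x = rng.standard_normal(32)
h1 = rng.standard_normal(N)
h2 = rng.standard_normal((N, N))
h2 = 0.5 * (h2 + h2.T)                              # enforce kernel symmetry

print("y[8] =", volterra2_output(x, h1, h2, 8))
# With the factor 2 folded into the off-diagonal kernel values, the symmetric
# form needs roughly N*(N+1) quadratic multiplications plus N linear ones.
print("approx. multiplications per sample:", N + N * (N + 1))
```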


Meshfree/GFEM in hardware-efficiency prospective

  • Tian, Rong
    • Interaction and multiscale mechanics
    • /
    • v.6 no.2
    • /
    • pp.197-210
    • /
    • 2013
  • A fundamental trend of processor architectures evolving towards exaflops is rapidly increasing floating-point performance (so-called "free" flops) accompanied by much more slowly increasing memory and network bandwidth. In order to fully exploit the "free" flops, a numerical algorithm for PDEs should request more flops per byte, i.e., increase its arithmetic intensity. A meshfree/GFEM approximation can be such a class of algorithm. It is shown, for a GFEM without extra dof, that this kind of approximation takes advantage of the high performance of manycore GPUs through its high accuracy of approximation; the "expensive" method turns out, conversely, to be hardware-efficient on the emerging manycore architecture.
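
The back-of-the-envelope calculation below illustrates the flops-per-byte argument: with floating-point throughput nearly "free" and bandwidth scarce, a kernel that performs more arithmetic per byte moved sits closer to the machine's compute limit. All operation counts, byte counts, and machine figures are illustrative assumptions, not measurements from the paper.

```python
# Back-of-the-envelope arithmetic-intensity comparison. The flop and byte
# counts per evaluation point, and the machine's peak flops and bandwidth,
# are illustrative assumptions only.

def arithmetic_intensity(flops_per_point, bytes_per_point):
    return flops_per_point / bytes_per_point

# Low-order FEM-like kernel: few flops per byte of nodal data streamed in.
low_order = arithmetic_intensity(flops_per_point=40, bytes_per_point=160)
# Meshfree/GFEM-like kernel: richer shape-function evaluation per point,
# so many more flops for the same bytes loaded.
meshfree = arithmetic_intensity(flops_per_point=2000, bytes_per_point=160)

peak_flops = 1.0e12          # 1 Tflop/s of "free" floating-point throughput
bandwidth = 1.0e11           # 100 GB/s of memory bandwidth
balance = peak_flops / bandwidth     # flops the machine can do per byte moved

for name, ai in (("low-order", low_order), ("meshfree/GFEM", meshfree)):
    bound = "memory-bound" if ai < balance else "compute-bound"
    print(f"{name}: {ai:.2f} flop/byte -> {bound} on a {balance:.0f} flop/byte machine")
```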

Analysis of Cloud Service Providers

  • Lee, Yo-Seob
    • International Journal of Advanced Culture Technology
    • /
    • v.9 no.3
    • /
    • pp.315-320
    • /
    • 2021
  • Currently, cloud computing is used as a technology that is greatly changing the IT field. For many businesses, numerous cloud services are available in the form of customizable, reliable, and cost-effective web applications. Most cloud service providers offer functions such as IoT, machine learning, AI services, blockchain, AR & VR, mobile services, and containers in addition to basic cloud services that support the scalability of processors, memory, and storage. In this paper, we look at the most widely used cloud service providers and compare the services they provide.

Performance Analysis of Clustering and Non-clustering Methods in Flash Memory Environment (플래시 메모리 환경에서 클러스터링 방법과 비 클러스터링 방법의 성능 분석)

  • Bae, Duck-Ho;Chang, Ji-Woong;Kim, Sang-Wook
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.6
    • /
    • pp.599-603
    • /
    • 2008
  • Flash memory has unique characteristics: the write operation is much more costly than the read operation, and in-place updating is not allowed. In this paper, we analyze how these characteristics of flash memory affect the performance of clustering and non-clustering in record management, and show that non-clustering is more suitable in a flash memory environment, which does not hold in a disk environment. Also, we discuss the problems of the existing non-clustering method and identify design factors to consider for record management methods in a flash memory environment.
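
A toy cost model of the abstract's observation, under illustrative page-level costs: because flash forbids in-place updates and writes are far more expensive than reads, updating a record in a clustered layout forces the whole page to be rewritten out of place, whereas a non-clustered, log-style layout simply appends the changed record. The costs and sizes are assumptions for illustration only.

```python
# Toy cost model: on flash, writes cost far more than reads and in-place update
# is not allowed. Clustered placement rewrites the victim page per update;
# non-clustered (log-style) placement appends only the changed records.
# All cost figures and sizes are illustrative assumptions.

READ_COST_US = 25        # assumed cost to read one page (microseconds)
WRITE_COST_US = 200      # assumed cost to program one page (much more expensive)
PAGE_SIZE = 2048
RECORD_SIZE = 128
RECORDS_PER_PAGE = PAGE_SIZE // RECORD_SIZE

def clustered_update_cost(num_updates):
    # Each update rewrites the victim page to a new location (read + write).
    return num_updates * (READ_COST_US + WRITE_COST_US)

def nonclustered_update_cost(num_updates):
    # Updated records are appended; one page write absorbs several updates.
    pages_written = -(-num_updates // RECORDS_PER_PAGE)   # ceiling division
    return pages_written * WRITE_COST_US

for n in (16, 256):
    print(n, "updates:",
          "clustered", clustered_update_cost(n), "us,",
          "non-clustered", nonclustered_update_cost(n), "us")
```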

Improvement Method and Performance Analysis of Shared Memory in Dual Core Embedded Linux system (듀얼코어 임베디드 리눅스 시스템에서 공유 메모리 성능 개선 방안 및 성능 분석)

  • Jung, Ji-Sung;Kim, Chang-Bong
    • Journal of Internet Computing and Services
    • /
    • v.11 no.4
    • /
    • pp.95-106
    • /
    • 2010
  • Recently, multiple processes communicate with one another, sharing resources and information for cooperation in complicated programming environments. The kernel provides IPC (Inter-Process Communication) so that processes can communicate with each other. Shared memory is a technique by which many processes can access an identical memory area in the Linux environment. In this paper, we propose a method for improving shared memory performance in a dual-core embedded Linux system that consists of different cores and different operating systems. We build an MPC2530F (ARM926F+ARM946E) Linux system and measure the performance on it, attempting a performance enhancement on each CPU for each process that uses shared memory.
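
As a language-level illustration of the shared-memory IPC mechanism the abstract discusses, the sketch below uses Python's multiprocessing.shared_memory so a child process can read data placed in a segment by its parent. The paper itself works at the kernel level on a dual-core embedded board; this is only a minimal user-space analogue.

```python
# Minimal user-space sketch of shared-memory IPC: a parent process creates a
# shared segment, writes into it, and a child process attaches by name and
# reads the same bytes. Requires Python 3.8+.
from multiprocessing import Process, shared_memory

def reader(name):
    shm = shared_memory.SharedMemory(name=name)   # attach to the existing segment
    print("reader saw:", bytes(shm.buf[:5]).decode())
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=16)
    shm.buf[:5] = b"hello"                        # writer fills the shared area
    p = Process(target=reader, args=(shm.name,))
    p.start()
    p.join()
    shm.close()
    shm.unlink()                                  # release the segment
```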

JMP+RAND: Mitigating Memory Sharing-Based Side-Channel Attack by Embedding Random Values in Binaries (JMP+RAND: 바이너리 난수 삽입을 통한 메모리 공유 기반 부채널 공격 방어 기법)

  • Kim, Taehun;Shin, Youngjoo
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.9 no.5
    • /
    • pp.101-106
    • /
    • 2020
  • Ever since computers became available, much effort has been made to achieve information security. Although memory protection defense mechanisms have been studied the most among these efforts, problems with the existing mechanisms were exposed as computer performance improved, and new defense mechanisms became necessary with the advent of side-channel attacks. In this paper, we propose JMP+RAND, which embeds 5 to 8 random bytes per page to defend against memory sharing-based side-channel attacks and to bridge the gap left by existing memory protection defense mechanisms. Unlike existing side-channel defense mechanisms, JMP+RAND uses static binary rewriting, consecutive jmp instructions, and random values to defend against side-channel attacks in advance. We numerically calculated the time a memory sharing-based side-channel attack would take against a binary adopting the JMP+RAND technique and verified that such attacks are infeasible in realistic time. Modern architectures incur very low overhead for JMP+RAND because jmp instructions branch quickly and accurately thanks to branch prediction. Since random values can be embedded only in specific programs using JMP+RAND, it is expected to be highly efficient when used together with memory deduplication techniques, especially in cloud computing environments.
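
The toy generator below shows the byte layout the abstract describes: a short relative jmp (0xEB on x86) hops over 5 to 8 random bytes embedded in a code page, so two otherwise identical pages end up with different contents and cannot be shared or deduplicated. The page contents, insertion offset, and page truncation are illustrative assumptions; the paper's actual mechanism is static binary rewriting of real executables.

```python
# Conceptual sketch of the JMP+RAND byte layout: a 'jmp rel8' (opcode 0xEB)
# skips over 5-8 random filler bytes inserted into the page, making otherwise
# identical pages differ between processes. This only shows the layout; real
# binary rewriting must also fix up code offsets, which is omitted here.
import os

PAGE_SIZE = 4096

def embed_jmp_rand(code_page: bytes, offset: int) -> bytes:
    """Insert 'jmp +n' followed by n random filler bytes at `offset`."""
    n = 5 + os.urandom(1)[0] % 4                 # 5 to 8 random bytes
    patch = bytes([0xEB, n]) + os.urandom(n)     # jmp rel8 skips the filler
    page = code_page[:offset] + patch + code_page[offset:]
    return page[:PAGE_SIZE]                      # keep the page size fixed

page = bytes(PAGE_SIZE)                          # stand-in for a code page
a = embed_jmp_rand(page, offset=128)
b = embed_jmp_rand(page, offset=128)
print("pages identical after embedding:", a == b)   # almost surely False
```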