• Title/Summary/Keyword: in-memory computing

Simulation-based Design Verification for High-performance Computing System

  • Jeong Taikyeong T.
    • Journal of Korea Multimedia Society / v.8 no.12 / pp.1605-1612 / 2005
  • This paper presents the knowledge and experience we gained by using multiprocessor systems for simulation-based design verification of high-performance computing systems. It also describes a case study of a symmetric multiprocessor (SMP) kernel on a 32-CPU CC-NUMA architecture, using an actual architecture. In a CC-NUMA high-performance computer system, a small group of CPUs is clustered into a processing node, or cluster. Using simulation-based design verification tools, we examined the performance of the SMP OS kernel on the CC-NUMA multiprocessor architecture: the kernel accounts for $32\%$ of the total execution time, and remote memory access latency accounts for $43\%$ of the OS time. We present our simulation results for the performance of this multiprocessor high-performance computing system.
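The two percentages in this abstract can be combined under one reading: if the OS kernel takes 32% of total execution time and remote memory access latency takes 43% of that OS time, the remote-access share of total time is their product. This nesting is an assumption about how the figures relate, not something the abstract states, and the variable names are illustrative.

```python
# Assumed reading: the 43% figure is a fraction of OS time,
# and OS time is 32% of total execution time.
os_share_of_total = 0.32
remote_share_of_os = 0.43

# Share of total execution time spent on remote memory access latency.
remote_share_of_total = os_share_of_total * remote_share_of_os

print(f"{remote_share_of_total:.2%}")
```

Under that reading, remote memory access latency is a bit under 14% of total execution time.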

Archival Description and Records from Historically Marginalized Cultures: A View from a Postmodern Window

  • Sinn, Dong-Hee
    • Journal of the Korean Society for Library and Information Science / v.44 no.4 / pp.115-130 / 2010
  • In the archival field, the last decade has witnessed much discussion of archives' broad responsibilities for social memory. That the social role of archives stems from postmodern thinking suggests a paradigm shift from viewing archives as static recorded objects to viewing them as dynamic evidence of human memory. Modern archives and archivists are products of nineteenth-century positivism, which limited their function to archiving written documents within stable organizations. The new thinking on the social role of archives provides a chance to realize that traditional archival practices have preserved only a sliver of organizational memory, ignoring fluid records of human activities and memory. Archival description is the primary means by which users access materials in archives, so it can determine how archival materials will be used (or not used). Traditional archival description works as a representation of archival materials and is directly projected from the hierarchy of organizational documents. This paper argues that archivists will need to redefine archival description to be more sensitive to atypical archival materials from various cultural contexts. It surveys postmodern approaches to archival concepts in relation to descriptive practices. It also examines issues in representing, through archival description, historically marginalized groups that were previously neglected in traditional archival practices.

Memory Hierarchy Optimization in Embedded Systems using On-Chip SRAM (On-Chip SRAM을 이용한 임베디드 시스템 메모리 계층 최적화)

  • Kim, Jung-Won;Kim, Seung-Kyun;Lee, Jae-Jin;Jung, Chang-Hee;Woo, Duk-Kyun
    • Journal of KIISE:Computer Systems and Theory / v.36 no.2 / pp.102-110 / 2009
  • The memory wall is the growing disparity in speed between the CPU and memory outside the CPU chip. An economical solution is a memory hierarchy organized into several levels, such as processor registers, cache, main memory, and disk storage. We introduce, for the first time, a memory hierarchy optimization technique that uses on-chip SRAM in Linux-based embedded systems. The technique uses the virtual memory system to allocate on-chip SRAM to the code and data selected by programmers. Experiments with nine applications indicate runtime improvements of up to 35%, with an average of 14%, and energy consumption reductions of up to 40%, with an average of 15%.

A Two-level Indexing Method in Flash Memory Environment (플래시 메모리 환경을 위한 이단계 인덱싱 방법)

  • Kim, Jong-Dae;Chang, Ji-Woong;Hwang, Kyu-Jeong;Kim, Sang-Wook
    • Journal of KIISE:Computing Practices and Letters / v.14 no.7 / pp.713-717 / 2008
  • As the capacity of flash memory has increased rapidly in recent years, efficient indexing methods have become crucial for quickly searching large volumes of data stored in flash memory. Flash memory has unique characteristics: a write operation is much more costly than a read operation, and in-place updates are not allowed. In this paper, we propose a novel index structure that significantly reduces the number of write operations and thus supports efficient searches, insertions, and deletions. We verify the superiority of our method through extensive experiments.

Implementation of parallel blocked LU decomposition program for utilizing cache memory on GP-GPUs (GP-GPU의 캐시메모리를 활용하기 위한 병렬 블록 LU 분해 프로그램의 구현)

  • Kim, Youngtae;Kim, Doo-Han;Yu, Myoung-Han
    • Journal of Internet Computing and Services / v.14 no.6 / pp.41-47 / 2013
  • GP-GPUs are general-purpose GPUs: multithreaded processors originally designed for graphics processing that are used for numerical computation. Unlike a typical cache, GP-GPUs provide cache memory in the form of shared memory that user programs can access directly. In this research, we implemented a parallel blocked LU decomposition program that utilizes this cache memory in GP-GPUs. The parallel blocked LU decomposition program, written in Nvidia CUDA C, runs 7 to 8 times faster than the non-blocked LU decomposition program in the same GP-GPU computation environment.
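The blocking idea behind this result can be sketched on the CPU. The following is a minimal, sequential Python illustration of a right-looking blocked LU decomposition, not the authors' CUDA C program: it assumes a square matrix with nonzero pivots, does no pivoting, and the names `blocked_lu` and `reconstruct` are hypothetical. On a GPU, each block would be staged in shared memory so that it is reused many times before being written back. That reuse is presumably where the reported speedup comes from.

```python
def blocked_lu(a, block):
    """In-place blocked LU without pivoting (assumes nonzero pivots).

    On return, the strict lower triangle of `a` holds L (unit diagonal
    implied) and the upper triangle, including the diagonal, holds U.
    """
    n = len(a)
    for k in range(0, n, block):
        e = min(k + block, n)  # current block covers rows/cols k..e-1
        # 1. Factor the diagonal block with ordinary (unblocked) LU.
        for j in range(k, e):
            for i in range(j + 1, e):
                a[i][j] /= a[j][j]
                for c in range(j + 1, e):
                    a[i][c] -= a[i][j] * a[j][c]
        # 2. Row panel: solve L11 * U12 = A12 by forward substitution.
        for j in range(k, e):
            for i in range(j + 1, e):
                for c in range(e, n):
                    a[i][c] -= a[i][j] * a[j][c]
        # 3. Column panel: solve L21 * U11 = A21.
        for i in range(e, n):
            for j in range(k, e):
                for p in range(k, j):
                    a[i][j] -= a[i][p] * a[p][j]
                a[i][j] /= a[j][j]
        # 4. Trailing update: A22 -= L21 * U12.
        for i in range(e, n):
            for j in range(k, e):
                for c in range(e, n):
                    a[i][c] -= a[i][j] * a[j][c]
    return a


def reconstruct(lu):
    """Multiply the packed L and U factors back together (for checking)."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for p in range(min(i, j) + 1):
                l_ip = 1.0 if p == i else lu[i][p]
                out[i][j] += l_ip * lu[p][j]
    return out


# Example matrix (strictly diagonally dominant, so no pivoting is needed).
A = [[10.0, 1.0, 2.0, 3.0],
     [1.0, 10.0, 1.0, 2.0],
     [2.0, 1.0, 10.0, 1.0],
     [3.0, 2.0, 1.0, 10.0]]
lu = blocked_lu([row[:] for row in A], block=2)
product = reconstruct(lu)
```

With `block` equal to the matrix size, the loop degenerates to the non-blocked algorithm, which makes the two variants easy to compare on the same input.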

On-Demand Remote Software Code Execution Unit Using On-Chip Flash Memory Cloudification for IoT Environment Acceleration

  • Lee, Dongkyu;Seok, Moon Gi;Park, Daejin
    • Journal of Information Processing Systems / v.17 no.1 / pp.191-202 / 2021
  • In an Internet of Things (IoT)-configured system, each device executes on-chip software. Recent IoT devices must execute complex services quickly, such as analyzing a large amount of data, while maintaining low-power computation. As service complexity increases, the service requires high-performance computing and more embedded memory space. However, the low performance of IoT edge devices and their small memory size can hinder the complex and diverse operations of IoT services. In this paper, we propose a remote on-demand software code execution unit that uses the cloudification of on-chip code memory to accelerate program execution on an IoT edge device with a low-performance processor. We propose a simulation approach that distributes remote code between the server side and the edge side according to the program's computation and communication needs. Our on-demand remote code execution unit simulation platform, which includes an instruction set simulator based on the 16-bit ARM Thumb instruction set architecture, successfully emulates the architectural behavior of on-chip flash memory, enabling embedded devices to accelerate and execute software using remote execution code in the IoT environment.

Flash Memory Wear-Leveling using Regulation Pools (마모 제어 영역을 활용한 플래시 메모리 마모평준화)

  • Park, Jeong-Su;Min, Sang-Lyul
    • Journal of KIISE:Computing Practices and Letters / v.16 no.12 / pp.1204-1208 / 2010
  • In this paper, we propose a flash memory wear-leveling scheme that uses the metadata storage region as a regulation pool. By concentrating program and erase operations on the blocks with lower erase counts in the regulation pool, the proposed scheme achieves even wear-leveling in a simple and efficient way. Experiments with an implementation of the proposed scheme in RS-FTL showed that erase count regulation reduces the erase count deviation by around 40%.
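The core selection rule, directing each program/erase operation to the least-erased block in the pool, can be illustrated with a toy model. This is only a sketch of that one idea under simplifying assumptions: it is not RS-FTL, it does not model how blocks enter or leave the regulation pool, and the `RegulationPool` class and its method names are hypothetical.

```python
import heapq


class RegulationPool:
    """Toy pool that always reuses the block with the lowest erase count."""

    def __init__(self, erase_counts):
        # Min-heap of (erase_count, block_id); ties go to the lower block id.
        self._heap = [(count, block) for block, count in enumerate(erase_counts)]
        heapq.heapify(self._heap)

    def erase_and_program(self):
        """Erase and reprogram the least-worn block; return its id."""
        count, block = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (count + 1, block))
        return block

    def erase_counts(self):
        """Return the current erase counts indexed by block id."""
        counts = [0] * len(self._heap)
        for count, block in self._heap:
            counts[block] = count
        return counts


# Four blocks with uneven wear; seven metadata updates follow.
pool = RegulationPool([5, 0, 3, 9])
for _ in range(7):
    pool.erase_and_program()
counts = pool.erase_counts()
```

In this toy run the seven operations land on the two least-worn blocks, so the three lower counts converge while the most-worn block is left alone.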

Functional Improvement of the Compressed Data Management System for Mobile DBMS (모바일 DBMS를 위한 압축 데이터 관리 시스템의 기능 고도화)

  • Hwang, Jin-Ho;Lee, Jeong-Wha;Kim, Gun-Woo;Shin, Young-Jae;Son, Jin-Hyun
    • The KIPS Transactions:PartD / v.15D no.6 / pp.733-740 / 2008
  • Mobile computing devices have recently become widely used, and the amount of information stored on them is increasing as information is digitized. An embedded DBMS is therefore needed for effective information management. Furthermore, flash memory, which has a limited number of partial write cycles and is more expensive than the conventional magnetic hard disk, is rapidly being deployed as data storage on mobile computing devices. For these reasons, previous research has considered the compressed data management system (CDMS) an effective storage management technique for mobile computing devices. However, research on CDMS is at an early stage and has several problems. In this paper, we present additional storage management methods that solve these problems and improve the effectiveness of the CDMS for embedded DBMS.

A Regular Expression Matching Algorithm Based on High-Efficient Finite Automaton

  • Wang, Jianhua;Cheng, Lianglun;Liu, Jun
    • Journal of Computing Science and Engineering / v.8 no.2 / pp.78-86 / 2014
  • To address the heavy memory access, large storage space, and long matching time of regular expression matching with the extended finite automaton (XFA), this paper presents a new regular expression matching algorithm based on a high-efficiency finite automaton. The basic idea of the new algorithm is to add extra judging instruments at the starting state in order to reduce unnecessary transition paths and eliminate unnecessary state transitions. Consequently, the heavy memory access, large storage space, and long matching time of the XFA matching process can be effectively alleviated. The simulation results show that, compared with XFA on the same regular expression matching process, the proposed scheme reduces memory access by approximately 40%, storage space consumption by about 45%, and matching time by about 12%, without degrading matching quality.

Study of Cache Performance on GPGPU

  • Choi, Kyu Hyun;Kim, Seon Wook
    • IEIE Transactions on Smart Processing and Computing / v.4 no.2 / pp.78-82 / 2015
  • General-purpose graphics processing units (GPGPUs) provide tremendous computational and processing power. Despite its latency-hiding mechanism, a GPU architecture requires high memory bandwidth and low latency between the computational units and the memory system. For this reason, the current GPU architecture has a private L1 cache in each core and a shared L2 cache, which increase performance by reducing memory latency. However, in some cases, this CPU-like cache design is not suitable for GPGPUs. In this paper, we analyze cache performance in detail in relation to GPGPU application characteristics, and suggest technical alternatives for the GPGPU architecture as future work.