• Title/Summary/Keyword: in-memory computing


I/O Translation Layer Technology for High-performance and Compatibility Using New Memory (뉴메모리를 이용한 고성능 및 호환성을 위한 I/O 변환 계층 기술)

  • Song, Hyunsub;Moon, Young Je;Noh, Sam H.
    • Journal of KIISE / v.42 no.4 / pp.427-433 / 2015
  • The rapid advancement of computing technology has created a need for fast data I/O processing and high-performance storage. Next-generation memory technology, which we refer to as new memory, is expected to be used for high-performance storage because it is non-volatile and has latency close to that of DRAM. This research proposes NTL (New memory Translation Layer), a technology for using new memory as storage. With NTL, conventional I/O is served by existing, mature disk-based file systems, which provides compatibility, while new memory I/O is serviced through NTL to take advantage of the byte-addressability of new memory. In this paper, we describe the design of NTL and present experimental measurements showing that our design brings performance benefits.

Distributed In-Memory Caching Method for ML Workload in Kubernetes (쿠버네티스에서 ML 워크로드를 위한 분산 인-메모리 캐싱 방법)

  • Dong-Hyeon Youn;Seokil Song
    • Journal of Platform Technology / v.11 no.4 / pp.71-79 / 2023
  • In this paper, we analyze the characteristics of machine learning workloads and, based on them, propose a distributed in-memory caching technique to improve their performance. The core of a machine learning workload is model training, which is computationally intensive. Running machine learning workloads in a Kubernetes-based cloud environment, where the computing framework and storage are separated, allows resources to be allocated effectively, but delays can occur because I/O must go through network communication. The proposed distributed in-memory caching technique targets workloads running in such an environment. In particular, we propose a new method that precaches the data required by machine learning workloads into the distributed in-memory cache by taking into account Kubeflow Pipelines, a Kubernetes-based machine learning pipeline management tool.


Cloudification of On-Chip Flash Memory for Reconfigurable IoTs using Connected-Instruction Execution (연결기반 명령어 실행을 이용한 재구성 가능한 IoT를 위한 온칩 플래쉬 메모리의 클라우드화)

  • Lee, Dongkyu;Cho, Jeonghun;Park, Daejin
    • IEMEK Journal of Embedded Systems and Applications / v.14 no.2 / pp.103-111 / 2019
  • Large-scale IoT-driven systems consist of connected things that run embedded software on chip. These lightweight embedded things have limited hardware resources, in particular a small on-chip flash memory. In addition, embedded software stored in on-chip flash memory is not easy to update at runtime to add the latest services for IoT-driven applications. It is therefore becoming important to develop lightweight IoT devices that can run a variety of software within limited on-chip flash memory. Remote instruction execution in the cloud over IoT connectivity provides high-performance software execution with an unlimited supply of software instructions in the cloud, together with low-power streaming of instruction execution on IoT edge devices. In this paper, we propose an asymmetric Cloud-IoT structure that uses remote instruction execution to provide high-performance instruction execution in the cloud while keeping things in the lightweight IoT edge environment low-power. We propose a simulation-based approach to determine an efficient partitioning of the software runtime between the cloud and the IoT edge, and we evaluate instruction cloudification by measuring execution time under the proposed structure. A cloud-connected instruction set simulator is newly introduced to emulate the behavior of the processor. Experimental results for cloud-IoT connected software execution using remote instructions show the feasibility of cloudifying on-chip code flash memory. The simulation environment for cloud-connected code execution successfully emulates the architectural operations of on-chip flash memory in the cloud, so that various IoT software services can be accelerated and run at low power. With cloud-connected code execution, program execution time is reduced by 50% and memory space by 24%.

The Architecture of the Flash Memory Storage System using Page Delete Information (페이지 삭제정보를 활용하는 플래시 저장장치의 구조)

  • Jung, Ho-Young;Park, Sung-Min;Kang, Soo-Yong;Cha, Jae-Hyuk
    • Journal of KIISE: Computing Practices and Letters / v.15 no.12 / pp.958-962 / 2009
  • Flash memory, which has recently been replacing hard disks, has physical characteristics different from those of hard disks. Much research at the OS and file system layers has been carried out to improve the performance of flash memory-based storage systems. In this paper, we propose an architecture for flash memory-based storage that uses page invalidation information generated when a file is deleted in an upper layer, and we evaluate the performance of the proposed system. The proposed system effectively increases I/O performance by applying page invalidation information to the block merge and wear-leveling algorithms.

An Adaptive Prefetching Technique for Software Distributed Shared Memory Systems (소프트웨어 분산공유메모리시스템을 위한 적응적 선인출 기법)

  • Lee, Sang-Kwon;Yun, Hee-Chul;Lee, Joon-Won;Maeng, Seung-Ryoul
    • Journal of KIISE: Computer Systems and Theory / v.28 no.9 / pp.461-468 / 2001
  • Although shared virtual memory (SVM) systems promise low-cost solutions for high-performance computing, they suffer from long memory latencies. These latencies are usually caused by repetitive invalidations of shared data. Since shared data are accessed through synchronization and the patterns by which threads synchronize are repetitive, a prefetching scheme based on this repetitiveness can reduce memory latencies. Based on this observation, we propose a prefetching technique that predicts future access behavior by analyzing the access history of each synchronization variable. Our technique was evaluated on an 8-node SVM system using the SPLASH-2 benchmark. The results show that our technique achieves a 34% to 45% reduction in memory access latencies.
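
The idea of predicting accesses from per-synchronization-variable history can be illustrated with a toy model. The sketch below is not the authors' implementation: the class `SyncPrefetcher`, its method names, and the "repeat the faults of the previous critical section" policy are all assumptions chosen for illustration.

```python
from collections import defaultdict


class SyncPrefetcher:
    """Toy predictor (hypothetical): remembers which pages faulted after each lock acquire."""

    def __init__(self):
        # lock id -> pages that faulted during the most recent critical section
        self.history = defaultdict(set)
        self.current_lock = None
        self.current_faults = set()

    def acquire(self, lock_id):
        """Called on lock acquire; returns the pages predicted to be needed."""
        self.current_lock = lock_id
        self.current_faults = set()
        return sorted(self.history[lock_id])

    def fault(self, page):
        """Record an access to an invalidated page inside the current critical section."""
        if self.current_lock is not None:
            self.current_faults.add(page)

    def release(self):
        """Called on lock release; the observed faults become the new history."""
        if self.current_lock is not None:
            self.history[self.current_lock] = self.current_faults
        self.current_lock = None
        self.current_faults = set()


prefetcher = SyncPrefetcher()
first = prefetcher.acquire("lock_a")   # no history yet, nothing to prefetch
prefetcher.fault(3)
prefetcher.fault(7)
prefetcher.release()
second = prefetcher.acquire("lock_a")  # predicts the pages seen last time
prefetcher.release()
print(first, second)
```

In this model the second acquire of `lock_a` returns the pages that faulted during the first critical section, which is the repetitiveness the abstract relies on. A real SVM system would also have to handle mispredictions and the cost of useless prefetches.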


Design and Performance Evaluation of a Flash Compression Layer for NAND-type Flash Memory Systems (NAND형 플래시메모리를 위한 플래시 압축 계층의 설계 및 성능평가)

  • Yim Keun Soo;Bahn Hyokyung;Koh Kern
    • Journal of KIISE: Computer Systems and Theory / v.32 no.4 / pp.177-185 / 2005
  • NAND-type flash memory is becoming increasingly popular as large data storage for mobile computing devices. Since flash memory is an order of magnitude more expensive than magnetic disks, data compression can be used effectively in managing flash memory-based storage systems. However, managing compressed data in NAND-type flash memory is challenging because it supports only page-based I/O. For example, when the size of the compressed data is smaller than the page size, internal fragmentation occurs, which seriously degrades the effectiveness of compression. In this paper, we present an efficient flash compression layer (FCL) for NAND-type flash memory that stores several small compressed pages in one physical page by using a write buffer. Based on a prototype implementation and simulation studies, we show that the proposed scheme provides flash storage capacity of more than 140% of the original size and significantly expands the write bandwidth.
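
A minimal sketch of the packing idea, assuming a very small physical page and using `zlib` as a stand-in compressor: compressed logical pages accumulate in a write buffer, which is flushed as one physical page when the next payload would not fit. The names `PAGE_SIZE`, `pack_pages`, and `unpack_pages` are hypothetical, and the sketch ignores the mapping metadata a real flash compression layer would need.

```python
import zlib

PAGE_SIZE = 512  # assumed physical page size in bytes, kept small for illustration


def pack_pages(logical_pages, page_size=PAGE_SIZE):
    """Pack compressed logical pages (each at most page_size bytes) into physical pages."""
    physical_pages = []
    buffer, used = [], 0  # write buffer for the physical page being filled
    for index, data in enumerate(logical_pages):
        compressed = zlib.compress(data)
        # Fall back to the raw bytes when compression does not help.
        is_compressed = len(compressed) < len(data)
        payload = compressed if is_compressed else data
        if buffer and used + len(payload) > page_size:
            physical_pages.append(buffer)  # flush the full write buffer
            buffer, used = [], 0
        buffer.append((index, is_compressed, payload))
        used += len(payload)
    if buffer:
        physical_pages.append(buffer)
    return physical_pages


def unpack_pages(physical_pages):
    """Restore the logical pages in their original order."""
    restored = {}
    for page in physical_pages:
        for index, is_compressed, payload in page:
            restored[index] = zlib.decompress(payload) if is_compressed else payload
    return [restored[i] for i in sorted(restored)]


# Eight highly compressible logical pages, one byte value repeated per page.
logical_pages = [bytes([i]) * PAGE_SIZE for i in range(8)]
packed = pack_pages(logical_pages)
print(len(logical_pages), "logical pages stored in", len(packed), "physical page(s)")
```

Without the buffer, each compressed page would still occupy a whole physical page, which is the internal fragmentation described above.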

Construction of Product Information Hierarchy for Mobile Clients (모바일 클라이언트를 위한 상품정보 객체계층구조 구성)

  • Ha Sangho
    • Journal of KIISE: Computing Practices and Letters / v.11 no.2 / pp.157-164 / 2005
  • With advances in wireless technologies and mobile computing, m-commerce is being realized on many kinds of mobile devices. Service content for m-commerce is usually written from scratch to match the specific characteristics of users' mobile devices, which requires considerable effort. It is therefore very important to make effective use of the Internet product information already provided for e-commerce. However, bringing that Internet content to mobile devices is far from straightforward because of the limitations of mobile devices, such as limited memory, small displays, and low processing speeds. In this paper, assuming that Internet products are represented in XML, we suggest four methods for constructing the object hierarchy so that the documents can be viewed effectively on mobile devices. We then compare and analyze the methods experimentally in terms of response time and required memory size.

Implementation of Memory Copy Reduction Scheme for Multimedia Service in Embedded Linux Kernel (내장형 리눅스 커널에서 멀티미디어 서비스를 위한 메모리 복사 감소 기법의 구현)

  • Kim, Jeong-Won
    • Journal of Korea Multimedia Society / v.7 no.8 / pp.1058-1065 / 2004
  • Embedded systems are widely used in applications ranging from simple monitors to set-top boxes with a CPU, memory, and hard disk drives. In particular, embedded OSes are ported to mobile or small devices, since these devices commonly transmit multimedia data. In this paper, we propose the Null copy scheme for multimedia services on embedded Linux systems, which reduces the overhead of copying memory from user address space to kernel address space and vice versa. Since embedded systems for networked multimedia services have limited computing power and memory, the Null copy scheme can provide improved QoS. Our image transmission experiments (measuring CPU utilization and deadline miss rates) on an embedded Linux target board equipped with a web camera show that the proposed scheme improves response time and lowers CPU overhead.


Design and Implementation of High Performance Virtual Desktop System Managing Virtual Desktop Image in Main Memory (메인 메모리상에 가상 데스크탑 이미지를 운용하는 고속 가상 데스크탑 시스템 설계 및 구현)

  • Oh, Soo-Cheol;Kim, SeungWoon
    • KIISE Transactions on Computing Practices / v.22 no.8 / pp.363-368 / 2016
  • A storage-based VDI (Virtual Desktop Infrastructure) system suffers from degraded performance when the I/O of the VDI system is concentrated on the storage. Performance drops especially sharply during a boot storm, in which all virtual desktops boot simultaneously. In this paper, we propose a main memory-based virtual desktop system that manages virtual desktop images in main memory to solve this performance degradation problem, including the boot storm. Storing the virtual desktop images in main memory improves the performance of the VDI system, and deduplication technology allows large virtual desktop images to be stored in main memory. An implementation of the proposed VDI system showed a 4-times performance benefit over the storage-based VDI system during a boot storm.
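
To illustrate how deduplication lets many similar desktop images share memory, the sketch below stores each distinct fixed-size block once, keyed by its SHA-256 digest. It is an assumption-laden toy, not the system described in the paper: `DedupStore`, `BLOCK_SIZE`, and the fixed-size chunking are hypothetical choices.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed deduplication block size in bytes


class DedupStore:
    """Toy in-memory image store (hypothetical) that keeps each distinct block once."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes, stored once
        self.images = {}  # image name -> ordered list of block digests

    def put_image(self, name, data):
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # keep only the first copy
            digests.append(digest)
        self.images[name] = digests

    def get_image(self, name):
        return b"".join(self.blocks[d] for d in self.images[name])


# Two desktop images that share a three-block base and differ in one block.
base = b"".join(bytes([i]) * BLOCK_SIZE for i in range(3))
desktop1 = base + bytes([10]) * BLOCK_SIZE
desktop2 = base + bytes([11]) * BLOCK_SIZE

store = DedupStore()
store.put_image("desktop1", desktop1)
store.put_image("desktop2", desktop2)
print(len(store.blocks), "unique blocks stored for 8 logical blocks")
```

Because the two images share their base blocks, only the blocks that differ consume additional memory.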

An Efficient Cache Management Scheme for Load Balancing in Distributed Environments with Different Memory Sizes (상이한 메모리 크기를 가지는 분산 환경에서 부하 분산을 위한 캐시 관리 기법)

  • Choi, Kitae;Yoon, Sangwon;Park, Jaeyeol;Lim, Jongtae;Lee, Seokhee;Bok, Kyoungsoo;Yoo, Jaesoo
    • KIISE Transactions on Computing Practices / v.21 no.8 / pp.543-548 / 2015
  • Recently, the volume of data has been growing dramatically with the growth of social media and digital devices. However, existing disk-based distributed file systems are limited in data processing and data access performance because of I/O processing costs and bottlenecks. To solve this problem, caching techniques are used to manage data in memory. In this paper, we propose a cache management scheme that handles load balancing in a distributed memory environment. In distributed environments with different memory sizes, the proposed scheme distributes the data according to the memory size of each node. If overloaded nodes occur, it redistributes the access time of the cached data. To show the superiority of the proposed scheme, we compare it with an existing distributed cache management scheme through performance evaluation.
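
Distributing data according to memory size can be sketched as a greedy placement that always picks the node whose cache would be proportionally least full. This is only one possible reading of the abstract, not the paper's algorithm; the function `distribute`, the node names, and the sizes are hypothetical, and the redistribution step for overloaded nodes is not modeled.

```python
def distribute(items, memory_sizes):
    """Assign each (key, size) item to the node that stays proportionally least full.

    memory_sizes maps a node name to its cache capacity (same unit as item sizes).
    """
    used = {node: 0 for node in memory_sizes}
    placement = {}
    for key, size in items:
        # Fill ratio each node would have after accepting this item.
        node = min(memory_sizes, key=lambda n: (used[n] + size) / memory_sizes[n])
        placement[key] = node
        used[node] += size
    return placement, used


# Assumed cluster: one node with twice the cache memory of the other two.
memory_sizes = {"node_a": 4, "node_b": 2, "node_c": 2}
items = [(f"block{i}", 1) for i in range(8)]
placement, used = distribute(items, memory_sizes)
print(used)
```

With these inputs the larger node ends up holding twice as many blocks as each smaller node, so every cache is filled to the same fraction of its capacity.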