• Title/Summary/Keyword: Write Performance

Partial Garbage Collection Technique for Improving Write Performance of Log-Structured File Systems (부분 가비지 컬렉션을 이용한 로그 구조 파일시스템의 쓰기 성능 개선)

  • Gwak, Hyunho;Shin, Dongkun
    • Journal of KIISE / v.41 no.12 / pp.1026-1034 / 2014
  • Flash storage devices have recently become popular. Log-structured file systems (LFS) suit flash storage because they provide high write performance by issuing only sequential writes to the flash device. However, an LFS must perform garbage collection (GC) to reclaim obsolete space. A slack space recycling (SSR) technique was recently proposed to reduce the GC overhead. However, because SSR generates random writes, write performance can suffer if the random write performance of the target device is significantly lower than its sequential write performance. This paper proposes a partial garbage collection technique that copies only part of the valid blocks in a victim segment in order to enlarge the contiguous invalid space available to SSR. Experiments show that the partial GC technique significantly improves write performance on an SD card.

Storage I/O Subsystem for Guaranteeing Atomic Write in Database Systems (데이터베이스 시스템의 원자성 쓰기 보장을 위한 스토리지 I/O 서브시스템)

  • Han, Kyuhwa;Shin, Dongkun;Kim, Yongserk
    • Journal of KIISE / v.42 no.2 / pp.169-176 / 2015
  • The atomic write technique is a good solution to the double write buffer problem. It requires modified I/O subsystems (i.e., the file system and I/O scheduler) and a special SSD that guarantees the atomicity of write requests. In this paper, we propose a write-unit-aligned block allocation technique for the EXT4 file system and a request-merge prevention technique for the CFQ scheduler. We also propose an atomic-write-supporting SSD that stores the atomicity information in the spare area of the flash memory page. We evaluate the proposed atomic write scheme in MariaDB using the tpcc-mysql and SysBench benchmarks. The experimental results show that the proposed technique improves performance by 1.4 to 1.5 times compared to the double write buffer technique.

Co-Writing Multiple Files Based on Directory Locality for High Performance of Small File Writes (디렉토리 지역성을 활용한 작은 파일들의 모아 쓰기 기법)

  • Lee, Kyung-Jae;Ahn, Woo-Hyun;Oh, Jae-Won
    • The KIPS Transactions:PartA / v.15A no.5 / pp.275-286 / 2008
  • Fast File System (FFS) exploits large disk bandwidth to improve the write performance of large files, for example by writing multiple blocks of a large file in a single disk I/O. However, the performance of small file writes is limited not by disk bandwidth but by disk access times, which are dominated by disk movements such as seeks and rotation, because FFS writes each small file in a separate disk write. We propose CW-FFS (Co-Writing Fast File System) to improve the write performance of small files by minimizing the disk movements needed to write them. Its key technique, the co-writing scheme, dynamically collects multiple small files named by a given directory and then writes them in a single disk I/O to contiguous disk locations. Co-writing several small files in a single disk I/O reduces the multiple disk movements needed for small file writes to a single movement, thus increasing the overall write performance of write-intensive applications. Furthermore, a file allocation scheme is introduced to prevent the co-writing scheme from harming the disk spatial locality of small files named by a given directory. Measurements of our implementation in OpenBSD 4.0 show that CW-FFS improves the performance of small file writes over FFS by 5 to 35% in the PostMark benchmark.
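
The co-writing idea in the abstract above (collect small files that share a directory, then write them in one I/O to contiguous locations) can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the CW-FFS implementation: the function names `group_by_directory` and `co_write` are hypothetical, files are modeled as in-memory byte strings, and disk block allocation is reduced to offsets inside one concatenated buffer.

```python
from collections import defaultdict
import posixpath


def group_by_directory(pending):
    """Group pending (path, data) writes by parent directory, keeping arrival order."""
    groups = defaultdict(list)
    for path, data in pending:
        groups[posixpath.dirname(path)].append((path, data))
    return dict(groups)


def co_write(pending):
    """Build one contiguous write buffer per directory (hypothetical model).

    Returns a list of (directory, buffer, layout) tuples, where layout maps
    each file path to its (offset, length) inside the buffer. Each tuple
    stands for a single disk I/O in this simplified model.
    """
    batches = []
    for directory, files in group_by_directory(pending).items():
        buffer = bytearray()
        layout = {}
        for path, data in files:
            layout[path] = (len(buffer), len(data))  # file placed right after the previous one
            buffer += data
        batches.append((directory, bytes(buffer), layout))
    return batches


if __name__ == "__main__":
    pending = [("/mail/a", b"aa"), ("/mail/b", b"bbb"), ("/tmp/x", b"x")]
    for directory, buffer, layout in co_write(pending):
        print(directory, len(buffer), layout)
```

In this model, three small-file writes collapse into two I/Os, one per directory, which is the effect the abstract attributes to co-writing.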

Comparison of Traditional Workloads and Deep Learning Workloads in Memory Read and Write Operations

  • Jeongha Lee;Hyokyung Bahn
    • International journal of advanced smart convergence / v.12 no.4 / pp.164-170 / 2023
  • With recent advances in AI (artificial intelligence) and HPC (high-performance computing) technologies, deep learning has proliferated across various domains of the 4th industrial revolution. As the volume of deep learning workloads grows, analyzing their memory reference characteristics becomes important. In this article, we analyze the memory reference traces of deep learning workloads in comparison with traditional workloads, focusing on read and write operations. Our analysis reveals characteristics of deep learning memory references that differ markedly from traditional workloads. First, when comparing instruction and data references, instruction references account for only a small portion in deep learning workloads. Second, when comparing reads and writes, write references account for the majority of memory references, which also differs from traditional workloads. Third, although write references are dominant, they exhibit low reference skewness: the skew factor of write references is small compared to traditional workloads. We expect this analysis to be helpful in efficiently designing memory management systems for deep learning workloads.

Performance and Energy Optimization for Low-Write Performance Non-volatile Main Memory Systems (낮은 쓰기 성능을 갖는 비휘발성 메인 메모리 시스템을 위한 성능 및 에너지 최적화 기법)

  • Jung, Woo-Soon;Lee, Hyung-Gyu
    • IEMEK Journal of Embedded Systems and Applications / v.13 no.5 / pp.245-252 / 2018
  • Non-volatile RAM devices are increasingly viewed as an alternative to DRAM for main memory systems. However, some technologies, including phase-change memory (PCM), still suffer from relatively poor write performance as well as limited endurance. In this paper, we introduce a proactive last-level cache management scheme that efficiently hides the low write performance of non-volatile main memory systems. The proposed method significantly reduces the cache miss penalty by proactively evicting some cache lines when the non-volatile main memory system is idle. Our trace-driven simulation demonstrates a 24% performance improvement on average compared with conventional LRU cache management.

Efficient Small Write Method for DDR-SSD based Software RAID (DDR-SSD를 위한 소프트웨어 RAID의 효과적인 작은 쓰기 처리 기법)

  • Khil, Ki-Jeong;Kwak, Dong-Ho;Kwak, Yun-Sik;Cheong, Seung-Kook;Hwang, Jung-Yeon;Choi, Kil-Seong;Song, Seok-Il
    • Journal of Advanced Navigation Technology / v.14 no.5 / pp.752-759 / 2010
  • In this paper, we propose a differential-logging method to improve the performance of RMW (Read-Modify-Write) operations in DDR-SSD based software RAID. Small writes, which occur frequently in enterprise applications, are a main factor degrading the performance of RAID 5. Once a block is updated in RAID 5, its parity block must also be updated to keep the parity consistent. Therefore, to process a small write request, we need to read the parity block stored on disk, read the old data, perform an XOR operation, and write the updated data and parity blocks. Several methods have been proposed for hard disk based software RAID to solve the small write problem in RAID 5. Our differential-logging method carefully considers the characteristics of DDR-SSD in solving this problem. Through simulations, we show that the proposed method outperforms the existing software RAID in Linux.
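
The read-modify-write sequence described in the abstract above rests on the standard RAID 5 parity identity: the new parity equals the old parity XOR the old data XOR the new data. The sketch below shows only that conventional small-write update, not the paper's differential-logging method; the helper names `xor_blocks`, `small_write_parity`, and `full_parity` are illustrative, and blocks are modeled as short byte strings.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Parity after updating one data block: P_new = P_old xor D_old xor D_new."""
    return xor_blocks(old_parity, xor_blocks(old_data, new_data))


def full_parity(blocks) -> bytes:
    """Parity recomputed from scratch over every data block in the stripe."""
    parity = bytes(len(blocks[0]))  # all-zero block
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity


if __name__ == "__main__":
    stripe = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
    parity = full_parity(stripe)

    # Small write to block 1: two reads (old data, old parity), two writes (new data, new parity).
    new_block = bytes([0xAA, 0xBB, 0xCC, 0xDD])
    parity = small_write_parity(stripe[1], new_block, parity)
    stripe[1] = new_block

    # The incremental update matches a full recomputation.
    assert parity == full_parity(stripe)
```

The incremental form touches only the modified data block and the parity block, which is why a small write costs two reads and two writes instead of reading the whole stripe.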

A Locality-Aware Write Filter Cache for Energy Reduction of STTRAM-Based L1 Data Cache

  • Kong, Joonho
    • JSTS:Journal of Semiconductor Technology and Science / v.16 no.1 / pp.80-90 / 2016
  • Thanks to their superior leakage energy efficiency compared to SRAM cells, STTRAM cells are considered a promising alternative memory element for on-chip caches. However, the main disadvantage of STTRAM cells is their high write energy and latency. In this paper, we propose a low-cost write filter (WF) cache that resides between the load/store queue and the STTRAM-based L1 data cache. To maximize the efficiency of the WF cache, its line allocation and access policies are optimized to reduce the energy consumption of the STTRAM-based L1 data cache. By efficiently filtering write operations to the STTRAM-based L1 data cache, the proposed WF cache reduces the energy consumption of the L1 data cache by up to 43.0% compared to the case without the WF cache. In addition, thanks to the fast hit latency of the WF cache, it slightly improves performance, by 0.2%.

Energy-Performance Efficient 2-Level Data Cache Architecture for Embedded System (내장형 시스템을 위한 에너지-성능 측면에서 효율적인 2-레벨 데이터 캐쉬 구조의 설계)

  • Lee, Jong-Min;Kim, Soon-Tae
    • Journal of KIISE:Computer Systems and Theory / v.37 no.5 / pp.292-303 / 2010
  • On-chip cache memories play an important role in both the performance and the energy consumption of resource-constrained embedded systems by filtering many off-chip memory accesses. We propose a 2-level data cache architecture with a low energy-delay product tailored for embedded systems. The L1 data cache is small and direct-mapped, and employs a write-through policy. In contrast, the L2 data cache is set-associative and adopts a write-back policy. Consequently, the L1 data cache is accessed in one cycle and provides high cache bandwidth, while the L2 data cache is effective in reducing the global miss rate. To reduce the penalty of the high miss rate caused by the small L1 cache and the power consumption of address generation, we propose an ECP (Early Cache hit Predictor) scheme. The ECP predicts whether the L1 cache holds the requested data using both fast address generation and L1 cache hit prediction. To reduce the high energy cost of accessing the L2 data cache due to heavy write-through traffic from the write buffer placed between the two cache levels, we propose a one-way write scheme. In simulation-based experiments using a cycle-accurate simulator and embedded benchmarks, the proposed 2-level data cache architecture shows average improvements of 3.6% in overall system performance and 50% in data cache energy consumption.

An Efficient Snapshot Technique for Shared Storage Systems supporting Large Capacity (대용량 공유 스토리지 시스템을 위한 효율적인 스냅샷 기법)

  • 김영호;강동재;박유현;김창수;김명준
    • Journal of KIISE:Databases / v.31 no.2 / pp.108-121 / 2004
  • In this paper, we propose an enhanced snapshot technique that addresses the performance degradation that occurs when a snapshot is initiated in a storage cluster system. Traditional snapshot techniques have the following limitations when applied to large-capacity storage shared by multiple hosts. As the volume size grows, (1) the performance of write operations deteriorates severely due to the additional disk access needed to verify whether copy-on-write (COW) has been performed; (2) the blocking time of write operations performed during snapshot creation increases excessively; and (3) the performance of write operations deteriorates due to the additional disk I/O for the mapping block caused by the COW verification. We propose an efficient snapshot technique for large-capacity storage shared by multiple hosts in SAN environments. We eliminate the blocking time of write operations caused by freezing while a snapshot is being created. To improve the performance of write operations while a snapshot is taken, we introduce the First Allocation Bit (FAB) and the Snapshot Status Bit (SSB), which reduce the additional accesses to the volume disk for fetching the snapshot mapping block. At snapshot deletion time, the technique also improves performance by deallocating COW data blocks using the SSB of the original mapping entry, without reading the mapping block from the shared disk to obtain the snapshot mapping entry.

Design of an Asynchronous Data Cache with FIFO Buffer for Write Back Mode (Write Back 모드용 FIFO 버퍼 기능을 갖는 비동기식 데이터 캐시)

  • Park, Jong-Min;Kim, Seok-Man;Oh, Myeong-Hoon;Cho, Kyoung-Rok
    • The Journal of the Korea Contents Association / v.10 no.6 / pp.72-79 / 2010
  • In this paper, we propose a data cache architecture with a write buffer for a 32-bit asynchronous embedded processor. The data cache consists of a CAM and data memory. It accelerates the data upload cycle between the processor and the main memory, which improves processor performance. The proposed data cache has 8 KB of cache memory. It uses 4-way set-associative mapping with a line size of 4 words (16 bytes) and a pseudo-LRU replacement algorithm. A dirty register and a write buffer are used to implement the write policy. The designed data cache is synthesized to a gate-level design using a 0.13-μm process. Its average hit rate is 94%, and system performance is improved by 46.53%. The proposed data cache with write buffer is well suited to a 32-bit asynchronous processor.
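
The cache parameters stated in the abstract above (8 KB, 4-way set-associative, 16-byte lines) determine the number of sets and how an address splits into tag, index, and offset. The sketch below works through that arithmetic. It assumes 32-bit byte addresses, which the abstract does not state explicitly, and the function names `cache_geometry` and `split_address` are illustrative rather than taken from the paper.

```python
def cache_geometry(size_bytes, ways, line_bytes, addr_bits=32):
    """Derive set count and address field widths for a set-associative cache.

    Assumes size, ways, and line size are powers of two and that addresses
    are byte addresses of width addr_bits (an assumption, not from the paper).
    """
    lines = size_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1  # log2(line_bytes)
    index_bits = sets.bit_length() - 1         # log2(sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


def split_address(addr, geom):
    """Split a byte address into (tag, set index, byte offset)."""
    offset = addr & ((1 << geom["offset_bits"]) - 1)
    index = (addr >> geom["offset_bits"]) & ((1 << geom["index_bits"]) - 1)
    tag = addr >> (geom["offset_bits"] + geom["index_bits"])
    return tag, index, offset


if __name__ == "__main__":
    geom = cache_geometry(size_bytes=8 * 1024, ways=4, line_bytes=16)
    print(geom)
    print([hex(field) for field in split_address(0x12345678, geom)])
```

Under these assumptions the cache holds 512 lines in 128 sets, so an address uses 4 offset bits, 7 index bits, and 21 tag bits.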