RFJ: A Reliable and Fast Journaling Mechanism

RFJ: A Reliable, High-Performance Data Buffer Journaling Technique (translation of the Korean title)

  • Park, Sejin (박세진) (Department of Computer Engineering, Keimyung University)
  • Received : 2019.04.08
  • Accepted : 2019.07.05
  • Published : 2019.07.31

Abstract

Modern file systems use a journaling mechanism to keep their stored state consistent even after unexpected system crashes or disasters. However, journaling lowers I/O throughput. This performance degradation comes from the ordering mechanism between the data buffer and the metadata buffer and from two-staged buffer writing. In particular, journaling the data buffer and the metadata buffer together incurs significant performance degradation because of the two-staged writing. This reveals a trade-off between I/O performance and system reliability. In this paper, we propose RFJ, a reliable and fast journaling mechanism that addresses this trade-off. RFJ combines an ordering-enforced writeback journaling mode with a selective journaling mechanism: the ordering-enforced writeback journaling mode achieves low I/O latency, and the selective journaling mechanism achieves high reliability. The experimental results show that RFJ is almost 5x faster than the journal mode of the Ext3 file system while providing the same reliability as the journal mode.

Fig. 1. Journaling mechanism. A new write operation is issued at T1, and its data is committed to the journal area at T2. Finally, the data is checkpointed to the file system at T3.
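
The three stages in Fig. 1 can be illustrated with a minimal Python sketch. It is a toy model, not the Ext3 implementation: the class `JournaledDisk` and all of its fields and methods are hypothetical names introduced here, and the recovery step assumes that replaying committed journal records is sufficient to restore consistency.

```python
from dataclasses import dataclass, field


@dataclass
class JournaledDisk:
    """Toy model of the three stages in Fig. 1 (all names are hypothetical)."""
    memory: dict = field(default_factory=dict)    # dirty buffers after T1
    journal: list = field(default_factory=list)   # records committed at T2
    fs_area: dict = field(default_factory=dict)   # blocks checkpointed at T3

    def write(self, block: int, payload: str) -> None:
        # T1: the write only dirties an in-memory buffer.
        self.memory[block] = payload

    def commit(self) -> None:
        # T2: dirty buffers are appended to the journal area.
        for block, payload in self.memory.items():
            self.journal.append((block, payload))
        self.memory.clear()

    def checkpoint(self) -> None:
        # T3: committed records are copied to the file system area.
        for block, payload in self.journal:
            self.fs_area[block] = payload
        self.journal.clear()

    def crash_and_recover(self) -> None:
        # A crash loses memory; recovery replays whatever was committed.
        self.memory.clear()
        self.checkpoint()
```

In this model, a write that crashes before `commit()` is lost, while a write that crashes after `commit()` is recovered into `fs_area`. It also makes the two-staged cost visible: every committed block is written once to the journal and once more to the file system area.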

Fig. 2. I/O throughput of the Filebench fileserver workload under the various journaling modes of the Ext3 file system.

Fig. 3. Ordered mode in the Ext3 file system.

Fig. 4. Journal mode in the Ext3 file system.

Fig. 5. Classification of write operations in a real-world workload.

Fig. 6. Architecture of RFJ.

Fig. 7. Detailed operation. D denotes data and M denotes metadata. When a new write operation is issued, the buffer monitor classifies each buffer (1), and the newly allocated buffers are inserted into the I/O completion checking list (2). The journaling daemon then wakes up and journals the already existing buffers (3). At this point, it cannot journal the metadata buffer because some data buffers have not yet been written to the disk. With nothing left to do, the journaling daemon sleeps. Later, the kernel's buffer flusher flushes the newly allocated buffers (4), and the buffers in the checking list are marked I/O Completed. When the journaling daemon wakes up again, it can journal the metadata (5) because the data buffers now reside in the FS area. After that, the checking list is cleared.
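
Steps (1) to (5) of Fig. 7 can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the authors' kernel code: `Buffer`, `RFJSketch`, and every method name are hypothetical, and the sketch assumes that "already existing buffers" means data buffers that overwrite previously allocated blocks, while newly allocated data buffers bypass the journal and are written in place.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    kind: str               # "D" for data, "M" for metadata
    newly_allocated: bool   # True if the write allocates a new block
    io_completed: bool = False


class RFJSketch:
    """Hypothetical model of steps (1)-(5); not the authors' code."""

    def __init__(self):
        self.pending = []        # dirty buffers not yet journaled or flushed
        self.checking_list = []  # newly allocated data buffers awaiting I/O
        self.journal = []        # names of buffers written to the journal
        self.fs_area = []        # names of buffers written in place

    def issue_write(self, buffers):
        # (1) classify each buffer; (2) track newly allocated data buffers.
        for buf in buffers:
            self.pending.append(buf)
            if buf.kind == "D" and buf.newly_allocated:
                self.checking_list.append(buf)

    def journaling_daemon(self):
        # (3) journal already existing data buffers right away.
        for buf in [b for b in self.pending
                    if b.kind == "D" and not b.newly_allocated]:
            self.journal.append(buf.name)
            self.pending.remove(buf)
        # Metadata must wait until every tracked data buffer is on disk.
        if not all(b.io_completed for b in self.checking_list):
            return False  # nothing more to do; the daemon sleeps
        # (5) the ordering constraint holds, so journal the metadata.
        for buf in [b for b in self.pending if b.kind == "M"]:
            self.journal.append(buf.name)
            self.pending.remove(buf)
        self.checking_list.clear()
        return True

    def buffer_flusher(self):
        # (4) the kernel flusher writes newly allocated buffers in place.
        for buf in self.checking_list:
            if not buf.io_completed:
                self.fs_area.append(buf.name)
                buf.io_completed = True
                self.pending.remove(buf)
```

In this sketch, the first call to `journaling_daemon()` journals only the overwritten data and returns `False`. After `buffer_flusher()` runs, a second call journals the metadata and clears the checking list. The newly allocated data is therefore written once, directly to the FS area, yet it still reaches the disk before the metadata that refers to it.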

Fig. 8. Throughput comparison with the default journaling modes of the Ext3 file system, measured with the Filebench fileserver workload.
