A Study on Optimizing Disk Utilization of Software-Defined Storage

  • Received : 2022.12.16
  • Accepted : 2023.02.28
  • Published : 2023.04.30

Abstract

As digital transformation expands, many companies are adopting public cloud services or building their own data centers. Software-defined storage (SDS) is a key solution for storing data on cloud platforms, and its use is growing worldwide. SDS can virtualize all storage resources so that they appear as a single storage device, and it supports flexible scale-out. However, because objects vary in size, disk usage becomes imbalanced, which can lead to failures. To address this imbalance, this study proposes a method that redistributes objects by optimizing disk weights based on storage state information, and presents experimental results. In the experiment, the maximum disk utilization fell by 10 percentage points, from 89% to 79%. Optimizing disk utilization is expected to help prevent failures and allow more data to be stored evenly, making storage use more efficient.
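The idea of lowering the weight of overfull disks so that objects move toward emptier ones can be illustrated with a minimal sketch. This is not the optimizer used in the study: it assumes a simple proportional heuristic and a toy placement model in which each disk receives data in proportion to its weight multiplied by a hypothetical `skew` factor standing in for variable object sizes. The function names, the `skew` values, and the `max_change` step limit are all assumptions made for illustration.

```python
from typing import List, Tuple


def simulate_utilization(weights: List[float], skew: List[float],
                         mean_fill: float) -> List[float]:
    """Toy placement model (an assumption, not a real storage system).

    Each disk's share of data is proportional to weight * skew, and all
    disks have equal capacity, so the mean utilization stays at mean_fill.
    """
    load = [w * s for w, s in zip(weights, skew)]
    total = sum(load)
    n = len(load)
    return [mean_fill * n * x / total for x in load]


def reweight(weights: List[float], utilization: List[float],
             max_change: float = 0.05) -> List[float]:
    """Nudge each weight toward the mean utilization.

    Disks above the mean get a lower weight, disks below it a higher one.
    max_change caps the relative adjustment per round (hypothetical limit
    to avoid moving too much data at once).
    """
    mean_util = sum(utilization) / len(utilization)
    new_weights = []
    for w, u in zip(weights, utilization):
        ratio = mean_util / u if u > 0 else 1.0 + max_change
        ratio = max(1.0 - max_change, min(1.0 + max_change, ratio))
        new_weights.append(w * ratio)
    return new_weights


def rebalance(skew: List[float], mean_fill: float,
              rounds: int = 10) -> Tuple[List[float], List[float]]:
    """Repeat observe -> reweight for a fixed number of rounds."""
    weights = [1.0] * len(skew)
    for _ in range(rounds):
        util = simulate_utilization(weights, skew, mean_fill)
        weights = reweight(weights, util)
    return weights, simulate_utilization(weights, skew, mean_fill)


if __name__ == "__main__":
    skew = [1.3, 1.0, 0.9, 0.8]  # hypothetical imbalance per disk
    before = simulate_utilization([1.0] * len(skew), skew, 0.7)
    weights, after = rebalance(skew, 0.7)
    print("max utilization before:", round(max(before), 3))
    print("max utilization after: ", round(max(after), 3))
```

In this toy model the mean utilization is unchanged by reweighting; only the spread between the fullest and emptiest disk shrinks, which is the effect the abstract reports as a lower maximum utilization. A real cluster would need measured state information and would incur data movement costs that the sketch ignores.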
