Optimization of Graph Processing based on In-Storage Processing

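  • 송내영 (Department of Computer Science and Engineering, Seoul National University) ;
  • 한혁 (Department of Computer Science, Dongduk Women's University) ;
  • 염헌영 (Department of Computer Science and Engineering, Seoul National University)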
  • Received : 2017.03.24
  • Accepted : 2017.06.13
  • Published : 2017.08.15

Abstract

In recent years, semiconductor-based storage devices such as flash-based solid-state drives (SSDs) have reached high performance. Alongside this, there has been a trend toward using the resources of the device's internal controller, such as its central processing unit (CPU) and memory, according to the needs of the application. This concept is called In-Storage Processing (ISP). A storage device equipped with ISP can take over part of the computation that would otherwise run on the host system, which reduces the load on the host. Moreover, because the data is processed inside the storage device, less data has to be transferred to the host, and the transfer time to the host shrinks accordingly. In this paper, we propose a method for optimizing graph query processing with these ISP functions and show that the optimized method improves performance on the Graph 500 benchmark by up to 20%.
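The data-transfer argument in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' implementation, and the abstract does not say which operation the paper offloads. It simply assumes, for illustration, that the device can filter an edge list against the current breadth-first search (BFS) frontier. `SimulatedSSD`, `isp_neighbors`, `EDGE_BYTES`, and the example graph are all hypothetical names and values.

```python
EDGE_BYTES = 8  # assumed size of one (src, dst) edge record, for illustration only


class SimulatedSSD:
    """Toy stand-in for a storage device that holds a directed edge list."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.bytes_to_host = 0  # running total of bytes sent over the host interface

    def read_all_edges(self):
        # Conventional path: every edge record is shipped to the host.
        self.bytes_to_host += len(self.edges) * EDGE_BYTES
        return list(self.edges)

    def isp_neighbors(self, frontier):
        # Hypothetical ISP path: the device scans its own edges and
        # returns only those whose source vertex is in the frontier.
        hits = [(s, d) for (s, d) in self.edges if s in frontier]
        self.bytes_to_host += len(hits) * EDGE_BYTES
        return hits


def bfs(ssd, root, use_isp):
    """Level-synchronous BFS; returns the set of vertices reachable from root."""
    visited = {root}
    frontier = {root}
    while frontier:
        if use_isp:
            edges = ssd.isp_neighbors(frontier)
        else:
            # The host filters after receiving the full edge list.
            edges = [(s, d) for (s, d) in ssd.read_all_edges() if s in frontier]
        frontier = {d for (_, d) in edges} - visited
        visited |= frontier
    return visited


EDGES = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (5, 6)]

if __name__ == "__main__":
    for use_isp in (False, True):
        ssd = SimulatedSSD(EDGES)
        reached = bfs(ssd, 0, use_isp)
        print("ISP" if use_isp else "host", sorted(reached), ssd.bytes_to_host)
```

Both runs visit the same vertices, but the ISP variant sends only the matching edges to the host. The simulation counts bytes only. It does not model the slower controller CPU inside a real device, which is one reason a measured end-to-end gain can be much smaller than the reduction in transferred bytes.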

Acknowledgement

Supported by: National Research Foundation of Korea
