A New Cache Replacement Policy for Improving Last Level Cache Performance

A Study on a Cache Replacement Technique for Improving Last-Level Cache Performance

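  • Cong Thuan Do (Department of Electronics and Computer Engineering, Chonnam National University) ;
  • Dong Oh Son (Department of Electronics and Computer Engineering, Chonnam National University) ;
  • Jong Myon Kim (School of Electrical Engineering, University of Ulsan) ;
  • Cheol Hong Kim (School of Electronics and Computer Engineering, Chonnam National University)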
  • Received : 2014.05.12
  • Accepted : 2014.09.03
  • Published : 2014.11.15

Abstract

Cache replacement algorithms are designed to reduce cache miss counts. In modern processors, the performance gap between the processor and main memory keeps growing, which makes the cache replacement policy increasingly important. The Least Recently Used (LRU) policy is one of the most common replacement policies in modern processors. However, recent research has shown that the performance gap between LRU and the theoretical optimal replacement algorithm (OPT) is large. Although LRU replacement has repeatedly been shown to be adequate, the OPT/LRU performance gap widens as cache associativity increases. In this study, we observe that cache performance can still be improved by building on the existing LRU mechanism. We propose a method that enhances the LRU replacement algorithm by using the access proportion among the lines in a cache set, measured over the period between two successive replacement actions, to make the final replacement decision. Our experimental results reveal that the proposed method reduces the average miss rate of the baseline 512KB L2 cache by 15 percent compared to conventional LRU. In addition, the performance of a processor using the proposed cache replacement policy improves by 4.7 percent over LRU, on average.

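Cache replacement techniques were developed to reduce cache misses. The performance of the cache replacement technique is important for bridging the speed gap between the microprocessor and main memory. LRU is the common cache replacement technique, and most microprocessors use it. However, according to recent studies, the performance difference between LRU and the optimal replacement (OPT) technique is very large. Although the performance of LRU has been validated by many studies, the gap between LRU and OPT grows as cache associativity increases. This paper proposes a cache replacement technique that improves cache performance by building on the existing LRU technique. The proposed technique selects the replacement victim according to the access rate of the cache blocks and replaces that block. On a 512KB L2 cache, the proposed technique reduces the miss rate by 15% on average compared with conventional LRU, and improves processor performance by 4.7%.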
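
The LRU/OPT gap that the abstract refers to can be illustrated with a minimal sketch. The code below is not the proposed policy (the abstract does not give enough detail to reproduce it). It only counts misses for a single cache set under textbook LRU and under Belady's OPT, which evicts the line whose next reference lies farthest in the future. The function names and the access trace are assumptions chosen for illustration.

```python
from collections import OrderedDict


def lru_misses(trace, ways):
    """Count misses for one cache set with `ways` lines under LRU."""
    lines = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in lines:
            lines.move_to_end(block)  # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[block] = True
    return misses


def opt_misses(trace, ways):
    """Count misses under Belady's OPT (requires knowing the future trace)."""
    lines = set()
    misses = 0
    for i, block in enumerate(trace):
        if block in lines:
            continue
        misses += 1
        if len(lines) == ways:
            def next_use(b):
                # Index of the next reference to b, or infinity if never reused.
                for j in range(i + 1, len(trace)):
                    if trace[j] == b:
                        return j
                return float("inf")
            # Evict the line whose next reference is farthest away.
            lines.remove(max(lines, key=next_use))
        lines.add(block)
    return misses


# Hypothetical trace: a cyclic sweep over 5 blocks through a 4-way set.
trace = [0, 1, 2, 3, 4] * 4
print(lru_misses(trace, 4), opt_misses(trace, 4))  # prints: 20 8
```

On this cyclic trace LRU always evicts the block that is needed next, so every access misses, while OPT keeps most of the working set resident. Real workloads are less extreme, but the example shows why a policy that builds on LRU has room to improve.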

Acknowledgement

Supported by: National Research Foundation of Korea
