Bounding Worst-Case Data Cache Performance by Using Stack Distance

  • Liu, Yu (Department of Electrical and Computer Engineering, Southern Illinois University Carbondale Carbondale)
  • Zhang, Wei (Department of Electrical and Computer Engineering, Southern Illinois University Carbondale Carbondale)
  • Published : 2009.12.31

Abstract

Worst-case execution time (WCET) analysis is critical for hard real-time systems to ensure that different tasks can meet their respective deadlines. While significant progress has been made in WCET analysis of instruction caches, data cache timing analysis, especially for set-associative data caches, remains rather limited. This paper proposes an approach to safely and tightly bound data cache performance by computing the worst-case stack distance of data cache accesses. Our approach applies not only to direct-mapped caches but also to set-associative and even fully-associative caches, without increasing the complexity of the analysis. Moreover, the proposed approach can statically categorize worst-case data cache misses into cold, conflict, and capacity misses, which can give designers useful insight into how to improve worst-case data cache performance. Our evaluation shows that the proposed data cache timing analysis technique can safely and accurately estimate worst-case data cache performance, with an overestimation relative to the observed worst-case data cache misses of within 1% on average.
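
To make the notions of stack distance and of cold, conflict, and capacity misses concrete, the following is a minimal sketch that measures them on a concrete access trace. It is not the static worst-case analysis proposed in the paper. It assumes LRU replacement, a modulo mapping from block number to cache set, and the conventional definition of a capacity miss as one that would also miss in a fully-associative cache of the same size. All function and parameter names are hypothetical.

```python
def touch(stack, block):
    """Move `block` to the top of an LRU stack and return its stack distance.

    The stack distance is the number of distinct blocks accessed since the
    previous access to `block`; it is None on the first access.
    """
    if block in stack:
        idx = stack.index(block)
        distance = len(stack) - 1 - idx
        stack.pop(idx)
    else:
        distance = None
    stack.append(block)  # most recently used block sits at the end
    return distance


def classify_accesses(trace, num_sets, assoc):
    """Label each access in `trace` as hit, cold, conflict, or capacity.

    Assumptions (illustrative only): LRU replacement, and block b maps to
    set b % num_sets.
    """
    capacity = num_sets * assoc          # total number of cache blocks
    global_stack = []                    # models a fully-associative cache
    set_stacks = [[] for _ in range(num_sets)]
    labels = []
    for block in trace:
        global_dist = touch(global_stack, block)
        set_dist = touch(set_stacks[block % num_sets], block)
        if set_dist is None:
            labels.append("cold")        # first reference to this block
        elif set_dist < assoc:
            labels.append("hit")         # still resident in its set
        elif global_dist >= capacity:
            labels.append("capacity")    # would miss even if fully associative
        else:
            labels.append("conflict")    # evicted only because of set mapping
    return labels
```

For example, with a direct-mapped cache of two blocks (`num_sets=2, assoc=1`), the trace `[0, 2, 0, 1, 0]` is labeled cold, cold, conflict, cold, hit: blocks 0 and 2 compete for the same set, so the second access to block 0 misses although its stack distance of 1 would fit in a fully-associative cache of the same size. The same per-set distance test covers direct-mapped, set-associative, and fully-associative configurations by changing only `num_sets` and `assoc`.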
