Efficient Indirect Branch Predictor Based on Data Dependence

효율적인 데이터 종속 기반의 간접 분기 예측기

  • Paik Kyoung-Ho, 백경호 (Division of Information Technology Engineering, Soonchunhyang University)
  • Kim Eun-Sung, 김은성 (Division of Information Technology Engineering, Soonchunhyang University)
  • Published: 2006.07.01

Abstract

Indirect branch instructions are among the most serious obstacles to exploiting instruction-level parallelism (ILP) in modern high-performance processors. Unlike other branches, an indirect branch has a polymorphic target address that changes dynamically, so the target is very difficult to predict accurately. In processors that rely on speculative execution, each misprediction costs many execution cycles, and performance drops significantly. We previously proposed a novel and highly accurate indirect branch prediction scheme, data-dependence-based prediction, which predicts the target by combining the indirect branch instruction with the register contents of a data-dependent instruction that executes well ahead of the branch. The predictor achieves a prediction accuracy of 98.92% with 1K entries and 99.95% with 8K entries. However, like every indirect branch predictor proposed so far, it carries a large hardware overhead for storing the predicted target addresses together with the tags used to alleviate aliasing. This paper therefore proposes a scheme that minimizes the predictor's hardware overhead without sacrificing prediction accuracy. Experimental results show that tag storage can be reduced by about 60% with no performance loss, and by about 80% at a performance loss of only 0.1%. Likewise, storing partial targets reduces target address storage by about 35% with no performance loss, and by about 45% at a performance loss of 1.11%.
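
To make the idea in the abstract concrete, the following is a minimal sketch of a tagged target table that is indexed by the branch address combined with a register value from a data-dependent producer instruction, and that stores only partial tags and partial targets. It is not the authors' design: the XOR hash, the direct-mapped organization, the bit widths, the rule of taking the missing high target bits from the branch PC, and all names (`DataDependencePredictor`, `predict`, `update`, `storage_bits`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    tag: int     # partial tag: only tag_bits bits are stored
    target: int  # partial target: only the low target_bits bits are stored


class DataDependencePredictor:
    """Toy direct-mapped target table (illustrative assumptions throughout)."""

    def __init__(self, entries: int = 1024, tag_bits: int = 8,
                 target_bits: int = 20, addr_bits: int = 32) -> None:
        assert entries & (entries - 1) == 0, "entries must be a power of two"
        self.index_bits = entries.bit_length() - 1
        self.tag_bits = tag_bits
        self.target_bits = target_bits
        self.addr_bits = addr_bits
        self.table: List[Optional[Entry]] = [None] * entries

    def _index_and_tag(self, pc: int, reg_value: int):
        # Hypothetical hash: XOR the word-aligned branch PC with the register
        # value produced by the data-dependent instruction.
        key = ((pc >> 2) ^ reg_value) & ((1 << self.addr_bits) - 1)
        index = key & ((1 << self.index_bits) - 1)
        tag = (key >> self.index_bits) & ((1 << self.tag_bits) - 1)
        return index, tag

    def predict(self, pc: int, reg_value: int) -> Optional[int]:
        index, tag = self._index_and_tag(pc, reg_value)
        entry = self.table[index]
        if entry is None or entry.tag != tag:
            return None  # no prediction available
        # Assumed reconstruction: high bits come from the branch PC,
        # low bits from the stored partial target.
        low_mask = (1 << self.target_bits) - 1
        return (pc & ~low_mask) | entry.target

    def update(self, pc: int, reg_value: int, actual_target: int) -> None:
        index, tag = self._index_and_tag(pc, reg_value)
        low_mask = (1 << self.target_bits) - 1
        self.table[index] = Entry(tag, actual_target & low_mask)

    def storage_bits(self) -> int:
        # Bits spent on tags plus targets across the whole table.
        return len(self.table) * (self.tag_bits + self.target_bits)
```

With these assumed widths, a 1K-entry table needs 1024 × (8 + 20) = 28,672 bits, compared with 1024 × (22 + 32) = 55,296 bits if full 22-bit tags and full 32-bit targets were stored. The trade-off is visible in the sketch: shorter tags admit more aliasing, and a target whose high bits differ from those of the branch PC is reconstructed incorrectly, which is the kind of accuracy loss the abstract quantifies for the real design.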
