Analyzing Access Histories for Detecting First Races in Shared-memory Programs

공유메모리 프로그램의 최초경합 탐지를 위한 접근역사 분석

  • Published: 2004.02.01

Abstract

Detecting races is important for debugging shared-memory parallel programs, because races result in unintended nondeterministic executions. In particular, the first races to occur in an execution must be detected, because they can affect races that occur later. Previous on-the-fly techniques monitor access events during execution to collect candidate events that are likely to participate in first races, and then report races solely by determining the concurrency relationships among those candidates. Races reported in this way, however, are not guaranteed to be first races, because the analysis does not take into account how the races affect one another. This paper presents a new post-mortem technique that analyzes, at each nesting level, the candidate events collected from an execution of a shared-memory program with nested parallelism, in order to report only first races. The technique is efficient because it guarantees that the first races reported by analyzing a nesting level are the races that occur first at that level, so higher nesting levels do not need to be re-analyzed. The proposed technique supports more practical and effective debugging than previous techniques, because it guarantees detection of first races whenever candidate events have been collected from an execution instance of a program with nested parallelism.

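Detecting races, which cause nondeterministic execution results, is important for debugging shared-memory parallel programs. In particular, the first races to occur in a program execution must be detected, because they can affect races that occur later. To detect such first races, existing techniques that gather, during execution, the candidate events likely to participate in first races monitor access events to collect the candidates and report races by checking only the concurrency relationships among them. However, races reported in this way are not guaranteed to be first races, because the relationships by which races affect one another are not considered. This paper proposes a technique that, after program execution, analyzes at each nesting level the candidate events collected during the execution of a parallel program with nested parallelism, and reports only unaffected races. Because the proposed technique guarantees that the first races reported by analyzing up to a given nesting level are unaffected up to that level, it is an efficient first-race detection technique that requires no re-analysis of higher nesting levels. Since the technique can detect first races under nested parallelism as long as the candidate events are collected, it enables more practical and effective debugging than existing techniques.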
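
The distinction the abstract draws, between reporting every race found by a concurrency check and reporting only races that no other race affects, can be illustrated with a small sketch. This is not the paper's algorithm: it ignores nesting levels and candidate collection, it assumes each access carries a vector clock, and it uses a simplified notion of "affects" (one race affects another if one of its events happened before an event of the other). The names `Access`, `races`, `affects`, and `unaffected_races` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Access:
    name: str        # label used when reporting
    var: str         # shared variable that is accessed
    is_write: bool   # True for a write, False for a read
    clock: tuple     # vector clock, one component per thread (assumed representation)


def happened_before(a, b):
    """True if a is ordered before b under the vector-clock partial order."""
    return a.clock != b.clock and all(x <= y for x, y in zip(a.clock, b.clock))


def concurrent(a, b):
    """True if neither access is ordered before the other."""
    return not happened_before(a, b) and not happened_before(b, a)


def races(events):
    """Pairs of concurrent accesses to the same variable, at least one a write."""
    return [(a, b) for a, b in combinations(events, 2)
            if a.var == b.var and (a.is_write or b.is_write) and concurrent(a, b)]


def affects(r1, r2):
    """Simplified assumption: r1 affects r2 if an event of r1 precedes an event of r2."""
    return any(happened_before(e1, e2) for e1 in r1 for e2 in r2)


def unaffected_races(events):
    """Races that no other race affects (a stand-in for 'first races' in this sketch)."""
    found = races(events)
    return [r for r in found
            if not any(affects(other, r) for other in found if other is not r)]


# Two threads: thread 0 writes x and then y; thread 1 reads x and then y.
a = Access("a", "x", True, (1, 0))
c = Access("c", "y", True, (2, 0))
b = Access("b", "x", False, (0, 1))
d = Access("d", "y", False, (0, 2))
events = [a, c, b, d]

print([(p.name, q.name) for p, q in races(events)])             # [('a', 'b'), ('c', 'd')]
print([(p.name, q.name) for p, q in unaffected_races(events)])  # [('a', 'b')]
```

In this example the concurrency check alone reports both races, while the filter keeps only the race on `x`, because access `a` precedes access `c` and so the race on `y` may be an artifact of the earlier one.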
