• Title/Summary/Keyword: nested parallelism

Search Result 30, Processing Time 0.023 seconds

Extended Three Region Partitioning Method of Loops with Irregular Dependences (비규칙 종속성을 가진 루프의 확장된 세지역 분할 방법)

  • Jeong, Sam-Jin
    • Journal of the Korea Convergence Society
    • /
    • v.6 no.3
    • /
    • pp.51-57
    • /
    • 2015
  • This paper proposes an efficient method such as Extended Three Region Partitioning Method for nested loops with irregular dependences for maximizing parallelism. Our approach is based on the Convex Hull theory, and also based on minimum dependence distance tiling, the unique set oriented partitioning, and three region partitioning methods. In the proposed method, we eliminate anti dependences from the nested loop by variable renaming. After variable renaming, we present algorithm to select one or more appropriate lines among given four lines such as LMLH, RMLH, LMLT and RMLT. If only one line is selected, the method divides the iteration space into two parallel regions by the selected line. Otherwise, we present another algorithm to find a serial region. The selected lines divide the iteration space into two parallel regions as large as possible and one or less serial region as small as possible. Our proposed method gives much better speedup and extracts more parallelism than other existing three region partitioning methods.

Detecting the First Race in OpenMP Program with Nested Parallelism (내포 병렬성을 가지는 OpenMP 프로그램의 최초 경합 탐지)

  • Chon, Byoung-Gyu;Woo, Jong-Jung;Jun, Yong-Kee
    • The KIPS Transactions:PartA
    • /
    • v.8A no.3
    • /
    • pp.253-260
    • /
    • 2001
  • It is important to detect races for debugging shared-memoy parallel programs, because the races cause unintended nondeterministic program execution. Previous on-the-fly techniques to detect races can not guarantee the first race detection in nested parallel programs. Detecting the first race is important for debugging parallel programs, since the removal of the first race may make the next occurred races disappear. In this paper, we presents an on-the-fly detection technique to detect all of the first races through the reexecution of the debugged programs. We assume that the debugged parallel program may have one-way nested parallel programs. The number of reexecution is at the least the nesting depth of the program in the worst case. The space complexity is O(VT) and the time complexity to detect race in each access of access history is O(T), where V is number of shared variables and T is the maximum parallelism of the program. This efficiency of our technique in each execution is the same with the previous on-the-fly detection techniques. Therefore, this technique makes debugging parallel programs more effective and practical.

  • PDF

A Program Restructuring framework for Parallel Processing (병렬처리를 위한 프로그램 재구조화)

  • 송월봉
    • Journal of the Korea Computer Industry Society
    • /
    • v.4 no.4
    • /
    • pp.501-508
    • /
    • 2003
  • In this paper A new theory of linear loop transformation called Elimination of Data Dependency(BDD) is presented. The current framework of linear loop transformation cannot identify a significant fraction of parallelism. For this reason, a method to extract the maximum loop parallelism in perfect nested loops is presented. This technique is applicable to general loop nests where the dependence include both distance and directions.

  • PDF

Analyzing Access Histories for Detecting First Races in Shared-memory Programs (공유메모리 프로그램의 최초경합 탐지를 위한 접근역사 분석)

  • 강문혜;김영주;전용기
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.1_2
    • /
    • pp.41-50
    • /
    • 2004
  • Detecting races is important for debugging shared-memory Parallel programs, because races result in unintended nondeterministic executions of the programs. Particularly, the first races to occur in an execution of a program must be detected because they can potentially affect other races that occur later. Previous on-the-fly techniques that detect such first races based on candidate events that are likely to participate in the first races monitor access events in order to collect the candidate events during a program execution, and try to report the races only from determining the concurrency relationships of the candidates. Such races reported in this way. however, are not guaranteed to be first races, because they are not determined by taking into account how they are affected with each other. This paper presents a new post-mortem technique that analyzes, on each nesting level, candidate events collected from an execution of a shared-memory program with nested parallelism in order to report only first races. This technique is efficient, because it guarantees that first races reported by analyzing a nesting level are the races that occur first at the level, and does not require more analyses to the higher nesting levels than the current level. The Proposed technique facilitates more practical and effective debugging than the previous techniques, because it guarantees to detect first races if candidate events are collected from an execution instance of the program with nested parallelism.

(A Design and Implementation of Parallelizing Compiler in Loop Structure) (루프구조의 병렬화 컴파일러 설계 및 구현)

  • 송월봉
    • Journal of the Korea Computer Industry Society
    • /
    • v.3 no.8
    • /
    • pp.981-988
    • /
    • 2002
  • In this paper, a simple parallel compiler of a sequential loop is presented. This is a procedure for the automatic conversion of a sequential loop into a nested parallel DOALL loops at compile time. For this. the source program of Parafrase II parallel compiler is analyzed and a new general method the extracting parallelism in order to parallel processing effectively in nested loop is implemented.

  • PDF

Data Dependency Elimination for Parallelism in nested Loops (중첩루프에서 병렬화를 위한 자료 종속성제거)

  • Song, Wol-Bong;Park, Du-Sun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.5 no.6
    • /
    • pp.1494-1506
    • /
    • 1998
  • 본 논문에서는 루프구조의 효율적인 병렬수행을 위한 병렬성 추출에 대하여 불변과 가변 종속거리에 모두적용할 수 있는 통합된 새로운 기법을 제시한다. 이것은 컴파일시간에 순차 루프를 중첩된 DOALL 루프로의 자동 변환에 대한 절차로서, 중첩 루프의 전체적인 병렬화를 하기 위하여 문장들을 반복적으로 수행시키는 것에 의해서 자료 종속을 효과적으로 제거하는 알고리즘이다. 본 논문에 제시된 방법은 성능평가에서도 매우 뛰어난 방법임을 보였다.

  • PDF

Abstract Visualization for Effective Debugging of Parallel Programs Based on Multi-threading (멀티 스레딩 기반 병렬 프로그램의 효과적인 디버깅을 위한 추상적 시각화)

  • Kim, Young-Joo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.3
    • /
    • pp.549-557
    • /
    • 2016
  • It is important for effective visualization to summarize not only a large amount of debugging information but also the mental models of abstract ideas. This paper presents an abstract visualization tool which provides effective visualization of thread structure and race information for OpenMP programs with critical sections and nested parallelism, using a partial order execution graph which captures logical concurrency among threads. This tool is supported by an on-the-fly trace-filtering technique to reduce space complexity of visualization information, and a graph abstraction technique to reduce visual complexity of nested parallelism and critical sections in the filtered trace. The graph abstraction of partial-order relation and race information is effective for understanding program execution and detecting to eliminate races, because the user can examine control flow of program and locations of races in a structural fashion.

A Post-mortem Detection Tool of First Races to Occur in Shared-Memory Programs with Nested Parallelism (내포병렬성을 가진 공유메모리 프로그램에서 최초경합의 수행후 탐지도구)

  • Kang, Mun-Hye;Sim, Gab-Sig
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.4
    • /
    • pp.17-24
    • /
    • 2014
  • Detecting data races is important for debugging shared-memory programs with nested parallelism, because races result in unintended non-deterministic executions of the program. It is especially important to detect the first occurred data races for effective debugging, because the removal of such races may make other affected races disappear or appear. Previous dynamic detection tools for first race detecting can not guarantee that detected races are unaffected races. Also, the tools does not consider the nesting levels or need support of other techniques. This paper suggests a post-mortem tool which collects candidate accesses during program execution and then detects the first races to occur on the program after execution. This technique is efficient, because it guarantees that first races reported by analyzing a nesting level are the races that occur first at the level, and does not require more analyses to the higher nesting levels than the current level.

An Improving Method of Restructuring Parallel Programs for Data Race Detection

  • Ha, Keum-Sook;Lee, Sung woo;Yoo, Kee-Young
    • Proceedings of the IEEK Conference
    • /
    • 2000.07b
    • /
    • pp.715-718
    • /
    • 2000
  • Although shared memory parallel programs are designed to be deterministic both in their final results and intermediate states, the races that occur when different processes access a common memory location in an order not guaranteed by synchronization could result in unintended non-deterministic executions of the program. So, Detecting races, particularly first data races, is important for debugging explicit shared memory parallel programs. It is possible that all data races reported by other on-the-fly algorithms would disappear once the first races were removed. To detect races parallel programs with nested loops and inter-thread coordination, it must guarantee the order of synchronization operations in an execution instance. In this paper, we propose an improved restructuring method that guarantee ordering execution instance and preserve the semantics of original program. This method requires O(np) time and (s + up) space, where n is the number of total operations, s is the number of synchronization operations and p is the number of parallelism in the execution. Also, this method makes on-the-fly detection of parallel program with nested loops and inter-thread coordination more easily in space and time complexity.

  • PDF

A Implementation of Loop Interchange Parallel Compiler (루프인터체인지 병렬컴파일러 구현)

  • Song, Worl-Bong
    • Journal of the Korea Computer Industry Society
    • /
    • v.8 no.3
    • /
    • pp.167-172
    • /
    • 2007
  • Generally, In a application program the core part for parallel processing is a loop. therefore in this paper, loop interchange parallel compiler is proposed. this is a procedure for the automatic conversion of a loop interchange. According to execution to the outside CDOALL statements of cedar fortran, loop interchange is more effectively method the extracting parallelism in order to parallel processing in iterations. This method will be expected to effectively execution result with mixed into linear conversion and go far toward solving the effectively implementation of the non-unimodular nested loop.

  • PDF