• Title/Abstract/Keywords: parallel programming

Search Result 295

Implementation and Translation of Major OpenMP Directives for Chip Multiprocessor without using OS (단일 칩 다중 프로세서상에서 운영체제를 사용하지 않은 OpenMP 구현 및 주요 디렉티브 변환)

  • Jeun, Woo-Chul;Ha, Soon-Hoi
    • Journal of KIISE:Computer Systems and Theory / v.34 no.4 / pp.145-157 / 2007
  • OpenMP is an attractive parallel programming model for a chip multiprocessor because there is no standard parallel programming method for chip multiprocessors and parallel programs are easy to write in OpenMP. However, chip multiprocessor systems can have various architectures depending on the target applications, so OpenMP must be implemented differently for each system. In this paper, we propose an implementation and an effective translation of the major OpenMP directives for a chip multiprocessor without an OS, improving performance without special hardware and without extending the OpenMP directives. We present experimental results on our target platform, the CT3400.

Implementation and Performance Analysis of High Performance Computing Library for Parallel Processing (병렬처리를 위한 고성능 라이브러리의 구현과 성능 평가)

  • 김영태;이용권
    • Journal of KIISE:Computer Systems and Theory / v.31 no.7 / pp.379-386 / 2004
  • We designed a portable parallel library, HPCL (High Performance Computing Library), with the following objectives: (1) to keep the parallel code close to the original sequential code, which will help with future versions of the sequential code, and (2) to enhance the performance of the parallel code. The library, written in C and Fortran, is an interface between MPI (Message Passing Interface) and parallel programs in Fortran. Performance was measured on PC clusters and an IBM SP4.

The Design and Implementation of the ParaC Language (ParaC 언어의 설계 및 구현)

  • Lee, Kyoung-Seok;Woo, Young-Choon;Kim, Jin-Mee;Chi, Dong-Hae
    • The Transactions of the Korea Information Processing Society / v.4 no.11 / pp.2903-2913 / 1997
  • This paper describes the design and implementation of the ParaC language, which supports parallel programming on shared-memory and distributed-memory parallel machines. ParaC is designed for the effective use of the system resources of scalable parallel systems. This goal is achieved by adding parallel and synchronization constructs for shared address spaces, and remote task constructs for distributed address spaces. The paper also presents the translation method and the implementation of the translator and the run-time library for parallel execution of the extended constructs.


Advanced controller design for AUV based on adaptive dynamic programming

  • Chen, Tim;Khurram, Safiullahand;Zoungrana, Joelli;Pandey, Lallit;Chen, J.C.Y.
    • Advances in Computational Design / v.5 no.3 / pp.233-260 / 2020
  • The main purpose of introducing a model-based controller in the proposed control technique is to provide better and faster learning of the floating dynamics by means of a fuzzy logic controller, and to cancel the effect of the nonlinear terms of the system. An iterative adaptive dynamic programming algorithm is proposed to deal with the optimal trajectory-tracking control problem for an autonomous underwater vehicle (AUV). The optimal tracking control problem is converted into an optimal regulation problem by system transformation. The optimal regulation problem is then solved by the policy iteration adaptive dynamic programming algorithm. Finally, a simulation example is given to show the performance of the iterative adaptive dynamic programming algorithm.
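
The policy iteration step mentioned in this abstract alternates between evaluating the current control policy and improving it. The following is a minimal sketch only, assuming a scalar linear system with a quadratic cost, which is far simpler than the nonlinear AUV model of the paper. The function name, the parameters a, b, q, r and the initial gain are all hypothetical.

```python
def policy_iteration_lqr(a, b, q, r, k=0.0, sweeps=50):
    """Policy iteration for x[t+1] = a*x[t] + b*u[t] with cost sum(q*x^2 + r*u^2).

    The policy is u = -k*x; the initial k must be stabilising (|a - b*k| < 1).
    Returns (k, p), where p*x^2 is the cost-to-go of the last evaluated policy.
    All names and the scalar linear model are illustrative assumptions.
    """
    p = None
    for _ in range(sweeps):
        closed_loop = a - b * k
        if abs(closed_loop) >= 1.0:
            raise ValueError("policy is not stabilising")
        # Policy evaluation: solve p = q + r*k^2 + closed_loop^2 * p for p.
        p = (q + r * k * k) / (1.0 - closed_loop * closed_loop)
        # Policy improvement: minimise q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u.
        k = a * b * p / (r + b * b * p)
    return k, p
```

For this linear-quadratic case the iteration settles on the solution of the discrete-time Riccati equation, which gives a simple way to check the result.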

The Implementation of Fast Object Recognition Using Parallel Processing on CPU and GPU (CPU와 GPU의 병렬 처리를 이용한 고속 물체 인식 알고리즘 구현)

  • Kim, Jun-Chul;Jung, Young-Han;Park, Eun-Soo;Cui, Xue-Nan;Kim, Hak-Il;Huh, Uk-Youl
    • Journal of Institute of Control, Robotics and Systems / v.15 no.5 / pp.488-495 / 2009
  • This paper presents a fast feature extraction method for autonomous mobile robots that uses parallel processing based on OpenMP, SSE (Streaming SIMD Extension) and CUDA programming. In the first step, for the CPU version, the algorithms and code are optimized and then parallelized. The parallel algorithms are debugged to maintain the same level of performance, and the process of extracting key points and obtaining the dominant orientation of each key point is parallelized. After extraction, a parallel descriptor is constructed with SSE instructions. The GPU version is also implemented by parallel processing, using CUDA and based on SIFT. The GPU-Parallel descriptor achieves an acceleration of up to five times compared with the CPU-Parallel descriptor, but it shows lower performance than the CPU version. The CPU version also achieves a speed-up of four and a half times over the original SIFT while maintaining robust performance.

Improving Haskell GC-Tuning Time Using Divide-and-Conquer (분할 정복법을 이용한 Haskell GC 조정 시간 개선)

  • An, Hyungjun;Kim, Hwamok;Liu, Xiao;Kim, Yeoneo;Byun, Sugwoo;Woo, Gyun
    • KIPS Transactions on Computer and Communication Systems / v.6 no.9 / pp.377-384 / 2017
  • Performance improvement of single-core processors has reached its limit because circuit density can no longer be increased due to overheating. Multicore and manycore architectures have therefore emerged as viable approaches, and parallel programming has become more important. Haskell, a purely functional language, is gaining popularity in this situation since it naturally supports parallel programming owing to beneficial features such as implicit parallelism in evaluating expressions and monadic tools supporting parallel constructs. However, the performance of Haskell parallel programs is strongly influenced by the performance of the run-time system, including the garbage collector. Although a memory profiling tool named GC-tune has been suggested, a more systematic way to use this tool is needed. Since GC-tune finds the optimal memory size by executing the target program with all the different possible GC options, GC tuning takes too long. This paper suggests a basic divide-and-conquer method that reduces the number of GC-tune executions by reducing the search area by one-quarter at every search step. When this method is applied to two parallel programs, a maximal independent set program and a K-means program, the memory tuning time is reduced by a factor of 7.78 with 98% accuracy on average.
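
The divide-and-conquer idea in this abstract can be illustrated with a small sketch. It assumes a two-dimensional integer grid of options and keeps only the most promising quadrant at each step; this is one reading of the abstract, not the authors' procedure. The function name and the cost callback (a stand-in for one timed run of the target program) are hypothetical.

```python
def quadrant_search(cost, x_range, y_range):
    """Shrink a 2-D integer search area to its best quadrant until one point is left.

    cost(x, y) is a hypothetical stand-in for one timed run of the target
    program under the option pair (x, y). Returns (best_point, evaluations).
    """
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    evaluations = 0
    while x_lo < x_hi or y_lo < y_hi:
        x_mid = (x_lo + x_hi) // 2
        y_mid = (y_lo + y_hi) // 2
        # Split into up to four quadrants (fewer once one axis has collapsed).
        x_parts = [(x_lo, x_mid), (x_mid + 1, x_hi)] if x_lo < x_hi else [(x_lo, x_hi)]
        y_parts = [(y_lo, y_mid), (y_mid + 1, y_hi)] if y_lo < y_hi else [(y_lo, y_hi)]
        best = None
        for xs in x_parts:
            for ys in y_parts:
                # Probe each quadrant once, at its centre.
                cx, cy = (xs[0] + xs[1]) // 2, (ys[0] + ys[1]) // 2
                c = cost(cx, cy)
                evaluations += 1
                if best is None or c < best[0]:
                    best = (c, xs, ys)
        (x_lo, x_hi), (y_lo, y_hi) = best[1], best[2]
    return (x_lo, y_lo), evaluations
```

On a 16 x 16 grid with a single smooth minimum, this sketch needs 16 probes where an exhaustive sweep needs 256. On a cost surface with several local minima it can pick the wrong quadrant, which is consistent with the abstract reporting accuracy below 100%.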

Hopfield neuron based nonlinear constrained programming to fuzzy structural engineering optimization

  • Shih, C.J.;Chang, C.C.
    • Structural Engineering and Mechanics / v.7 no.5 / pp.485-502 / 1999
  • The use of the continuous Hopfield network model as the basis for solving general crisp and fuzzy constrained optimization problems is presented and examined. The value of the model lies in its transformation into a parallel algorithm, which distributes the work of numerical optimization to several simultaneously computing processors. The method is applied to different structural engineering design problems that demonstrate its usefulness, satisfactory performance, and potential. The computing algorithm is given and discussed so that a designer can program it without difficulty.

A Hybrid Method for Improvement of Evolutionary Computation (진화 연산의 성능 개선을 위한 하이브리드 방법)

  • 정진기;오세영
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.05a / pp.159-165 / 2002
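  • Evolutionary computation involves crossover, mutation, competition, and selection. Among these processes, selection does not produce new individuals, but it acts as the judge that keeps the solutions likely to become optimal and discards the rest. Consequently, however good the generated solutions are, poor selection either fails to find the optimal solution or takes a long time to do so. This paper therefore introduces the concept of local selection into tournament selection, which is stochastic in nature, to improve the escape from local optima toward the global optimum. It also improves the mutation step of Fast Evolutionary Programming and adopts the crossover and mutation operators of the Genetic Algorithm, proposing a hybrid algorithm that uses parallel search to escape local optima and find the global optimum.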
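
The abstract does not specify how local selection is combined with tournament selection. The following is a minimal sketch of one possible reading, in which each individual competes only against a randomly chosen neighbour on a ring rather than against the whole population. The function name, the ring layout and the radius parameter are all hypothetical.

```python
import random

def local_tournament_select(fitness, radius, rng):
    """Return indices of survivors: one winner per position's local tournament.

    fitness: list of fitness values (higher is better), arranged on a ring.
    radius:  hypothetical neighbourhood half-width; the opponent of position i
             is drawn only from positions i-radius .. i+radius.
    rng:     a random.Random instance, passed in for reproducibility.
    """
    n = len(fitness)
    winners = []
    for i in range(n):
        offset = rng.randint(-radius, radius)   # random neighbour on the ring
        j = (i + offset) % n
        # The fitter of the two survives; ties keep the current individual.
        winners.append(i if fitness[i] >= fitness[j] else j)
    return winners
```

Because a strong individual can only displace its neighbours in one step, distant regions of the population keep their own candidates for longer, which is the usual argument for local selection preserving diversity.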


A Dynamic Work Manager for Heterogeneous Cluster Systems (DWM: 이기종 클러스터 시스템의 동적 자원 관리자)

  • Park, Jong-Hyun;Kim, Jun-Seong
    • Journal of the Institute of Electronics Engineers of Korea CI / v.46 no.6 / pp.56-62 / 2009
  • Inexpensive high-performance computer systems combined with high-speed networks and machine-independent communication libraries have made cluster computing a viable option for parallel applications. In a heterogeneous cluster environment, efficient resource management is critically important, since the computing power of each individual computer system is a significant performance factor when executing applications in parallel. This paper presents a dynamic task manager called DWM (dynamic work manager). It enables a heterogeneous cluster system to fully utilize the differing computing power of its individual computer systems. We quantitatively measure the performance of DWM in a heterogeneous cluster environment with several kernel-level benchmark programs, along with their programming complexity. The experiments show that DWM provides competitive performance with a notable reduction in programming effort.
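
The general idea of dynamic work distribution on nodes of unequal speed can be sketched as follows. This is not DWM's design or API. It is a minimal illustration in which threads stand in for cluster nodes and each worker asks a shared queue for the next task only when it is idle. The function name and the worker_delays parameter are hypothetical.

```python
import queue
import threading
import time

def run_dynamic(tasks, worker_delays):
    """Hand out tasks on demand; return how many tasks each worker completed.

    worker_delays: hypothetical per-task time (seconds) of each worker,
                   standing in for nodes of differing computing power.
    """
    work = queue.Queue()
    for t in tasks:
        work.put(t)
    done = [0] * len(worker_delays)

    def worker(idx, delay):
        while True:
            try:
                work.get_nowait()      # ask for the next task only when idle
            except queue.Empty:
                return
            time.sleep(delay)          # simulate processing at this node's speed
            done[idx] += 1             # each thread writes only its own slot

    threads = [threading.Thread(target=worker, args=(i, d))
               for i, d in enumerate(worker_delays)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return done
```

Calling run_dynamic(range(40), [0.001, 0.004]) returns one count per worker. With on-demand hand-out the faster worker tends to complete more tasks, whereas a static even split would leave it idle while the slower one finishes.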