• Title/Summary/Keyword: openMP

Performance Analysis of Embedded Applications (임베디드 응용 프로그램 성능 분석)

  • 김선욱;오재근;한영선;최홍욱;김철우
    • Proceedings of the IEEK Conference / 2003.07d / pp.1355-1358 / 2003
  • This paper presents a performance analysis of the EEMBC embedded application suite, which consists of 5 categories and 34 applications in total. We measured various performance metrics in detail, such as code size, TLP (Thread Level Parallelism) using the OpenMP API, and ILP (Instruction Level Parallelism) on an ARM-modeled SimpleScalar. We show that the embedded applications have characteristics similar to those of integer applications, delivering low ILP and TLP in our environment. They have many small loops, which result in large instruction overhead in TLP and loop control overhead in ILP.

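The small-loop overhead mentioned in the abstract above can be illustrated with a minimal sketch in C. It is not taken from the paper: the function name `sum_array` and the cutoff `PARALLEL_THRESHOLD` are hypothetical. It uses the standard OpenMP `if` clause so that short loops stay serial and do not pay the thread fork/join cost. Compiled without OpenMP support, the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>

/* Hypothetical cutoff: loops shorter than this run serially.
   A real value would have to be measured on the target system. */
#define PARALLEL_THRESHOLD 10000

/* Sum the first n elements of a. The OpenMP `if` clause creates a
   thread team only when n is large enough to amortize its cost. */
long sum_array(const int *a, int n)
{
    assert(n >= 0);
    long total = 0;
    #pragma omp parallel for reduction(+:total) if(n >= PARALLEL_THRESHOLD)
    for (int i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Calling `sum_array` on an 8-element array takes the serial path under this threshold, which is the case the abstract describes as dominated by overhead when parallelized.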

Parallel Programming Models and Examples (병렬 프로그래밍 모델 및 사례 연구)

  • Chung, Y.H.;Park, J.W.
    • Electronics and Telecommunications Trends / v.13 no.4 s.52 / pp.32-42 / 1998
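  • This article reviews the definitions and characteristics of several parallel programming approaches within the actively researched field of parallel processing, and summarizes representative examples. It first examines programming based on data parallelism and the representative language HPF. It then covers programming on shared-memory and distributed shared-memory systems, where the address space is shared, together with OpenMP, which is currently undergoing standardization. Finally, it describes programming on distributed-memory systems, where the address space is not shared, and MPI, the standard message-passing interface.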

Parallelized Matrix Operation for Fast Computations of Antenna Characteristics (안테나 특성 고속 계산을 위한 병렬화 행렬 연산)

  • Cho, Yong-Heui
    • Proceedings of the Korea Contents Association Conference / 2015.05a / pp.61-62 / 2015
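  • We propose a parallel matrix operation method to speed up the analysis of large antennas used in the millimeter-wave band. To parallelize conventional Gaussian elimination, matrix decomposition and an iterative method are used. In addition, to improve the convergence of the iterative method, we present a scheme that constructs the decomposed matrix by partially using the previous matrix solution. The proposed method can be used together with parallel techniques such as OpenMP, MPI, and CUDA.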

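As a rough illustration of why an iterative method parallelizes more easily than Gaussian elimination, the sketch below uses plain Jacobi iteration with an OpenMP loop. This is an assumption made for illustration: the abstract does not name its decomposition or iterative scheme, and the reuse of a previous solution is not reproduced here. The function name `jacobi_solve` and its parameters are hypothetical, and the matrix is assumed to be strictly diagonally dominant so that the iteration converges.

```c
#include <assert.h>
#include <stdlib.h>

/* Solve A x = b by Jacobi iteration (A is n x n, row-major).
   x holds the initial guess on entry and the result on exit.
   Returns the number of sweeps used, or -1 on failure to converge
   within max_iter sweeps (or on allocation failure). */
int jacobi_solve(const double *A, const double *b, double *x,
                 int n, int max_iter, double tol)
{
    assert(n > 0);
    double *x_new = malloc((size_t)n * sizeof *x_new);
    if (x_new == NULL) return -1;

    for (int it = 1; it <= max_iter; it++) {
        double max_diff = 0.0;
        /* Each row reads only the previous iterate x, so rows are
           independent and can be updated in parallel. */
        #pragma omp parallel for reduction(max:max_diff)
        for (int i = 0; i < n; i++) {
            double s = b[i];
            for (int j = 0; j < n; j++) {
                if (j != i) s -= A[i * n + j] * x[j];
            }
            x_new[i] = s / A[i * n + i];
            double d = x_new[i] - x[i];
            if (d < 0.0) d = -d;
            if (d > max_diff) max_diff = d;
        }
        for (int i = 0; i < n; i++) x[i] = x_new[i];
        if (max_diff < tol) { free(x_new); return it; }
    }
    free(x_new);
    return -1;
}
```

In Gaussian elimination each pivot step depends on the previous one, whereas here every row update within a sweep is independent, which is what makes the loop a natural fit for `#pragma omp parallel for`.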

Trends in High-Speed Machine Vision Technology Based on Parallel Image Processing (병렬 영상처리 기반의 고속 머신 비전기술동향)

  • Park, Eun-Su;Choe, Hak-Nam;Kim, Jun-Cheol;Jeong, Eum-Han;Kim, Hak-Il
    • ICROS / v.15 no.3 / pp.31-39 / 2009
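  • This article covers trends in high-speed machine vision technology using parallel image processing. It introduces MIL, HALCON, and IPP, representative high-speed commercial image processing libraries used in machine vision, and examines parallel processing technologies under active research, such as SSE, OpenMP, and CUDA. These parallel processing technologies are applied to real image processing algorithms, and their performance is compared with that of the commercial libraries. The article then looks at research cases in which the introduced technologies were applied to machine vision tasks such as automatic PCB inspection.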

Characteristics of HPC(High-performance Computing)-based Parallel Processing on Electromagnetic Scattering Problems (전자파 산란 문제에서의 고성능 컴퓨팅(HPC) 기반 병렬 처리 특성)

  • Cho, Yong-Heui
    • Proceedings of the Korea Contents Association Conference / 2017.05a / pp.37-38 / 2017
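  • We present the characteristics of high-performance computing (HPC)-based parallel processing used to speed up the calculation of electromagnetic scattering by long metallic wires or spheres. The electromagnetic scattering problem, which consists of scattering matrix generation, Gaussian elimination, scattered-wave computation, and related steps, can be accelerated through parallel processing. We analyze the computational procedure of the scattering problem, classify the computational tasks that are favorable to parallelization, and then apply OpenMP-based parallelization.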


Operating Characteristics on Coupling of Fuel-Cell System with Natural Gas Reformer (휴대전원용 직접알코올 연료전지의 OCV특성 연구)

  • Park, Se-Joon;Choi, Yong-Sung;Lee, Kyung-Sup
    • The Transactions of the Korean Institute of Electrical Engineers P / v.58 no.4 / pp.592-596 / 2009
  • The DAFC (direct alcohol fuel cell) has the same structure and operating principle as the PEMFC (proton exchange membrane fuel cell). However, the DAFC, which uses liquid alcohol instead of hydrogen as fuel, can serve as a portable power source for small electronic devices such as MP3 players, PMPs, and mobile phones, because alcohol is a stable compound that is convenient to carry and store. This paper presents the OCV (open circuit voltage) characteristics for different alcohol species and for different weight ratios of ethanol. The OCV of the methanol fuel cell is slightly higher, by 0.2 V, than that of the ethanol fuel cell, and ethanol at 8 wt.% is rated as the most appropriate fuel for the DAFC.

Implementation of Integrated CPU-GPU for Efficient Uniform Memory Access Method and Verification System (CPU-GPU간 긴밀성을 위한 효율적인 공유메모리 접근 방법과 검증 시스템 구현)

  • Park, Hyun-moon;Kwon, Jinsan;Hwang, Tae-ho;Kim, Dong-Sun
    • IEMEK Journal of Embedded Systems and Applications / v.11 no.2 / pp.57-65 / 2016
  • In this paper, we propose a system for efficient use of shared memory between the CPU and GPU. The system, called Fusion Architecture, assures consistency of the shared memory and minimizes the cache misses that frequently occur on systems based on Heterogeneous System Architecture or Unified Virtual Memory. It also maximizes performance for memory-intensive jobs through efficient allocation of GPU cores. To compare architectures across various scenarios, we introduce the Fusion Architecture Analyzer, which compares OpenMP, OpenCL, CUDA, and the proposed architecture in terms of memory overhead and processing time. The results show that the proposed Fusion Architecture runs benchmarks 55% faster and reduces memory overhead by 220% on average.

A Study of Performance Advanced Technique of the OFP on Multi-Core (멀티 코어 기반의 OFP 성능 향상 기법 연구)

  • Jang, Hyun-Seok;Won, Hyeon-Kwon;Kim, In-Gyu;Ha, Seok-Wun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.05a / pp.270-273 / 2012
  • This paper presents the design of Operational Flight Programs (OFPs) for a multi-core Mission Computer (MC), aimed at optimizing OFP performance on that platform. Programs assigned as tasks in a multi-core environment can be scheduled by designing them with OpenMP, a standard for parallel programming. The paper also describes the performance differences between a Multi-Core Program (MCP) built with this technique and a Single-Core Program (SCP). The proposed design technique is applied to the Integrated Up-Front Control OFP (IUFC OFP) on a multi-core General Processor Module. The IUFC OFP displays flight data (Navigation, Communication, Identification Friend or Foe) to the pilot and lets the pilot control it.


GPU-ACCELERATED SPECKLE MASKING RECONSTRUCTION ALGORITHM FOR HIGH-RESOLUTION SOLAR IMAGES

  • Zheng, Yanfang;Li, Xuebao;Tian, Huifeng;Zhang, Qiliang;Su, Chong;Shi, Lingyi;Zhou, Ta
    • Journal of The Korean Astronomical Society / v.51 no.3 / pp.65-71 / 2018
  • The near real-time speckle masking reconstruction technique has been developed to accelerate the processing of solar images and achieve high resolution for ground-based solar telescopes. However, the reconstruction of solar subimages in such a speckle reconstruction is very time-consuming. We design and implement a new parallel speckle masking reconstruction algorithm based on the Compute Unified Device Architecture (CUDA) on General Purpose Graphics Processing Units (GPGPU). Tests are performed to validate the correctness of our program on an NVIDIA GPGPU. Details of several parallel reconstruction steps are presented, and the parallel implementation of the various modules shows a significant speed increase compared to the previous serial implementations. In addition, we present a comparison of runtimes across the serial programs, the OpenMP-based method, and the new parallel method. The new parallel method shows a clear advantage for large-scale data processing, and a speedup of around 9 to 10 is achieved in reconstructing one solar subimage of $256 \times 256$ pixels. Overall, the speedup of the new parallel method exceeds that of the OpenMP-based method. We conclude that the new parallel method would be of value and would contribute to real-time reconstruction of an entire solar image.