• Title/Summary/Keyword: parallel computation processing (병렬 연산 처리)

Performance Enhancement of Parallel Prime Sieving Computation with Hybrid Programming and Pipeline Scheduling (하이브리드 프로그래밍과 파이프라인 작업을 통한 병렬 소수 연산 성능 향상)

  • Ryu, Seung-yo;Kim, Dongseung
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2015.04a
    • /
    • pp.114-117
    • /
    • 2015
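  • This paper parallelizes the Sieve of Eratosthenes, an algorithm for extracting prime numbers, with the aim of improving both execution time and energy consumption. A hybrid parallel programming model is applied to make efficient use of the shared memory of multicore processors, and a pipelined scheduling scheme is introduced to tune load balancing more finely. Experiments confirm that computation is faster than with the previous method and that energy use is also lower.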
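As a rough illustration of the workload partitioning that a parallel sieve relies on, the sketch below splits the range into segments that can each be sieved independently once the small "base" primes are known. It is not the authors' implementation: the function names, the segment size, and the serial loop over segments are assumptions made for illustration, and a hybrid version would hand segments to processes or threads instead.

```python
import math


def base_primes(limit):
    """Plain Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out multiples of p, starting at p*p.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytearray(count)
    return [i for i, flag in enumerate(is_prime) if flag]


def sieve_segment(lo, hi, primes):
    """Primes in [lo, hi), given every prime <= sqrt(hi - 1)."""
    flags = bytearray([1]) * (hi - lo)
    for p in primes:
        if p * p >= hi:
            break
        # First multiple of p inside the segment, but never below p*p.
        start = max(p * p, ((lo + p - 1) // p) * p)
        for multiple in range(start, hi, p):
            flags[multiple - lo] = 0
    return [lo + i for i, flag in enumerate(flags) if flag]


def segmented_primes(n, segment_size=1024):
    """All primes <= n. Each segment is an independent unit of work."""
    primes = base_primes(math.isqrt(n))
    result = []
    for lo in range(2, n + 1, segment_size):
        hi = min(lo + segment_size, n + 1)
        result.extend(sieve_segment(lo, hi, primes))
    return result
```

Because `sieve_segment` only reads the shared list of base primes and writes its own flags, segments can be distributed without synchronization; how they are scheduled (statically or through a pipeline, as in the paper) is a separate design choice.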

Performance Analysis of a Sonar Signal Processing System using TMS320C40 (TMS320C40을 이용한 소나 신호처리시스템의 성능분석)

  • 박광철;문병표;전창호;박성주;이동호
    • Proceedings of the IEEK Conference
    • /
    • 1998.06a
    • /
    • pp.643-646
    • /
    • 1998
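  • Implementing a high-speed signal processor that demands massive computation, such as a sonar system, makes parallel processing with commercial DSP chips indispensable. This paper introduces a parallel signal processing system built on TI's TMS320C40. It presents a basic model of the signal processing unit of a sonar system using the TMS320C40, derives a mathematical model of the computation by analyzing the FFT implementation source provided by TI, and uses that model to analyze the performance of the proposed system.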

A New H.264/AVC CAVLC Parallel Decoding Circuit (새로운 H.264/AVC CAVLC 고속 병렬 복호화 회로)

  • Yeo, Dong-Hoon;Shin, Hyun-Chul
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.11
    • /
    • pp.35-43
    • /
    • 2008
  • A new, effective parallel decoding method has been developed for context-based adaptive variable length codes. In this paper, several new design ideas are devised for scalable parallel processing, smaller area, and lower power. First, simplified logical operations are used instead of memory look-ups for fast, low-power operation. Second, the codes are grouped by length for efficient logical operation. Third, up to M bits of input are analyzed simultaneously. For comparison, we designed a logical-operation-based parallel decoder for M=8 and a decoder based on a typical conventional method. High-speed parallel decoding is possible with our method. For similar decoding rates (1.57 codes/cycle for M=8), our new approach uses 46% less area than the typical conventional method.

A Parallel Spreadsheet-based Monte Carlo Algorithm for Financial Derivatives Pricing (파생 상품의 가치 평가를 위한 몬테카를로 알고리즘에 기반한 병렬 스프레드시트)

  • Lee, Jae-Geun;Kim, Jin-Suk
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.11a
    • /
    • pp.1006-1008
    • /
    • 2005
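  • Computation with complex formulas has recently been increasing in computational finance, where Monte Carlo simulation is one of the representative calculation methods. However, Monte Carlo simulation performs many repeated operations and therefore takes a long time to compute. To address this problem, this paper processes the Monte Carlo simulation and the spreadsheet in parallel. Experiments show that the computation time for derivatives decreases as the number of compute nodes in the parallel spreadsheet increases.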
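The reason Monte Carlo pricing parallelizes well can be sketched as follows: the simulated paths are independent, so they can be split into chunks whose partial sums are combined at the end. This is only an illustrative sketch, not the paper's spreadsheet system. The choice of a European call under geometric Brownian motion, the function names, and the per-chunk seeding are all assumptions, and the chunks are evaluated serially here where a real system would send each one to a compute node.

```python
import math
import random


def simulate_chunk(seed, n_paths, s0, strike, rate, sigma, maturity):
    """Sum of discounted call payoffs over n_paths simulated terminal prices."""
    rng = random.Random(seed)  # one generator per chunk (illustrative seeding)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    discount = math.exp(-rate * maturity)
    total = 0.0
    for _ in range(n_paths):
        terminal = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += discount * max(terminal - strike, 0.0)
    return total


def price_call(n_chunks, paths_per_chunk, s0, strike, rate, sigma, maturity):
    """Average the chunk sums; each chunk could run on its own node."""
    chunk_sums = [
        simulate_chunk(seed, paths_per_chunk, s0, strike, rate, sigma, maturity)
        for seed in range(n_chunks)
    ]
    return sum(chunk_sums) / (n_chunks * paths_per_chunk)
```

Only one number per chunk has to be communicated back, which is why adding nodes tends to shorten the run time until communication or scheduling overhead dominates.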

Analog Parallel Processing Algorithm of CNN-UM for Interframe Change Detection (프레임간의 영상 변화 검출을 위한 CNN-UM의 아날로그 병렬연산처리 알고리즘)

  • 김형석;김선철;손홍락;박영수;한승조
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.40 no.1
    • /
    • pp.1-9
    • /
    • 2003
  • A CNN-UM algorithm that performs analog parallel subtraction of images has been developed, and its application to moving-target detection has been studied. The CNN-UM is a state-of-the-art computing architecture with high potential for analog parallel processing, and it is a strong candidate for next-generation computing systems that must meet real-time image processing requirements. One weakness of the CNN-UM is that its analog parallel processing capability is not fully utilized for inter-frame processing. If two consecutive image frames are superimposed with opposite signs on identical capacitors for a short time, analog subtraction between them is achieved. The principle of this temporal inter-frame processing algorithm is described and analyzed mathematically. The practical usefulness of the proposed algorithm is also verified by applying it to moving-target detection.
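For readers more familiar with digital processing, the inter-frame subtraction described above corresponds to per-pixel frame differencing. The sketch below is a digital stand-in only and says nothing about the analog circuit: the function name, the list-of-lists frame representation, and the threshold are assumptions for illustration.

```python
def change_mask(prev_frame, next_frame, threshold):
    """Return a 0/1 mask marking pixels that changed by more than threshold.

    Every pixel is handled independently, which is the property an analog
    cell array exploits to subtract all pixels at once.
    """
    return [
        [1 if abs(b - a) > threshold else 0 for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_frame, next_frame)
    ]
```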

Performance Enhancement of Parallel Prime Sieving with Hybrid Programming and Pipeline Scheduling (혼합형 병렬처리 및 파이프라이닝을 활용한 소수 연산 알고리즘)

  • Ryu, Seung-yo;Kim, Dongseung
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.4 no.10
    • /
    • pp.337-342
    • /
    • 2015
  • We develop a new parallelization method for the Sieve of Eratosthenes algorithm that improves both computation speed and energy efficiency. Pipeline scheduling is included for better load balancing after proper workload partitioning. The programs run on multicore CPUs under a hybrid parallel programming model that uses both message passing and multithreading. Experiments on both small-scale clusters and a PC with a mobile processor show significant improvements in execution time and energy consumption.

A Performance Comparison between Coarray and MPI for Parallel Wave Propagation Modeling and Reverse-time Migration (코어레이와 MPI를 이용한 병렬 파동 전파 모델링과 거꿀 참반사 보정 성능 비교)

  • Ryu, Donghyun;Kim, Ahreum;Ha, Wansoo
    • Geophysics and Geophysical Exploration
    • /
    • v.19 no.3
    • /
    • pp.131-135
    • /
    • 2016
  • Coarray is a parallel processing technique introduced in the Fortran 2008 standard. Coarray can implement parallel processing using simple syntax. In this research, we examined the applicability of Coarray to seismic parallel processing by comparing the performance of seismic data processing programs using Coarray and MPI. We compared calculation time using seismic wave propagation modeling, and one-to-one communication time using a domain decomposition technique. We also compared the performance of parallel reverse-time migration programs using Coarray and MPI. Test results show that the computing speed of the Coarray method is similar to that of MPI. On the other hand, MPI has superior communication speed to that of Coarray.

Hardware Implementation of Minimized Serial-Divider for Image Frame-Unit Processing in Mobile Phone Camera (Mobile Phone Camera의 이미지 프레임 단위 처리를 위한 소형화된 Serial-Divider의 하드웨어 구현)

  • Kim, Kyung-Rin;Lee, Sung-Jin;Kim, Hyun-Soo;Kim, Kang-Joo;Kang, Bong-Soon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2007.10a
    • /
    • pp.119-122
    • /
    • 2007
  • In this paper, we propose a hardware design method for the division operation used in frame-unit image processing in a mobile phone camera. Generally, there are two types of data processing: parallel and serial. The parallel type can process in real time, but it requires significant hardware because of its many comparators and buffer memories. The serial type is smaller than the parallel type because it uses only one comparator, but it cannot process in real time. To use hardware resources efficiently, we employ the serial divider, since frame-unit operations in image processing do not need real-time processing. At the same bit width and operating frequency, the hardware size of the serial divider is approximately 13 percent of that of the parallel divider.
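The trade-off described above (one comparator reused over many cycles instead of many comparators working at once) can be sketched with bit-serial restoring division. The abstract does not state which division algorithm the hardware uses, so the shift-and-subtract scheme, the function name, and the `width` parameter below are assumptions for illustration; each loop iteration stands for one clock cycle.

```python
def serial_divide(dividend, divisor, width):
    """Bit-serial restoring division of an unsigned width-bit dividend.

    One compare-and-subtract happens per iteration, so a single comparator
    is reused for `width` cycles to produce the quotient and remainder.
    """
    if divisor <= 0:
        raise ZeroDivisionError("divisor must be a positive integer")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend does not fit in the given width")
    quotient = 0
    remainder = 0
    for bit in range(width - 1, -1, -1):
        # Shift the next dividend bit (most significant first) into the remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        if remainder >= divisor:  # the single comparison of this cycle
            remainder -= divisor
            quotient |= 1
    return quotient, remainder
```

A parallel divider would unroll these iterations into separate comparator stages, which is faster per result but costs roughly one comparator per quotient bit.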

2D Convolution Method Suitable for Hardware Architecture (하드웨어 구조에 적합한 2차원 회선처리 기법)

  • Jung, Yun-Hye;Park, Yong-Jin;Park, Jin-Hong;Byun, Hye-Ran;Han, Tack-Don
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2010.06b
    • /
    • pp.380-383
    • /
    • 2010
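  • Various techniques are used to process 2D images effectively across applications, and 2D filtering is among the most widely used. In 2D filtering, convolution is a method that uses one-dimensional linear filters in the horizontal and vertical directions. Because a kernel must move across a large image while computing, 2D convolution requires a very large number of operations and many memory accesses. However, convolution is a local operation that considers the input pixel together with its neighboring pixel values, so it can be processed in parallel. This paper therefore proposes a hardware-based convolution method that improves execution time by reducing memory accesses and performing the operations in parallel.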
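The idea of building a 2D convolution from horizontal and vertical 1D filters can be sketched in software as a two-pass separable convolution. This is not the hardware method the paper proposes: the function names, the zero padding at the borders, and the odd-length kernels are assumptions made for illustration.

```python
def convolve_1d(signal, kernel):
    """Centered 1D convolution with zero padding; kernel length must be odd."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0
        for k, weight in enumerate(kernel):
            j = i - (k - radius)  # index of the sample paired with this tap
            if 0 <= j < n:
                acc += weight * signal[j]
        out.append(acc)
    return out


def convolve_separable(image, row_kernel, col_kernel):
    """Apply row_kernel along every row, then col_kernel along every column."""
    # Horizontal pass: rows are independent of each other.
    rows = [convolve_1d(row, row_kernel) for row in image]
    # Vertical pass: columns are independent of each other.
    cols = [convolve_1d(list(col), col_kernel) for col in zip(*rows)]
    # Transpose back to row-major order.
    return [list(row) for row in zip(*cols)]
```

For a kernel of size k by k, the two passes use about 2k multiplications per pixel instead of k squared, and the independence of rows and columns is what makes a parallel implementation straightforward.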

Parallel Architecture Design of H.264/AVC CAVLC for UD Video Realtime Processing (UD(Ultra Definition) 동영상 실시간 처리를 위한 H.264/AVC CAVLC 병렬 아키텍처 설계)

  • Ko, Byung Soo;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.5
    • /
    • pp.112-120
    • /
    • 2013
  • In this paper, we propose a high-performance H.264/AVC CAVLC encoder for UD video real-time processing. Statistical values are obtained in one cycle through parallel arithmetic and logical operations, using a non-zero bit stream that represents zero or non-zero coefficients. To encode one codeword per cycle, we remove the recursive operation in level encoding through parallel comparison of the coefficient and escape value. In order to implement a high-speed circuit, the proposed CAVLC encoder is designed as a two-stage {statistical scan, codeword encoding} pipeline. To reduce the encoding table, the arithmetic unit is used to encode non-coefficient and to calculate the codeword. The proposed architecture was simulated with a 0.13 µm standard cell library. The gate count is 33.4K gates. The architecture can support Ultra Definition video ($3840 \times 2160$) at 100 frames per second when running at 100 MHz.