• 제목/요약/키워드: Parallel Implementation

검색결과 878건 처리시간 0.026초

픽셀-병렬 영상처리에 있어서 포맷 컨버터 설계에 관한 연구 (A Study on the Design of Format Converter for Pixel-Parallel Image Processing)

  • 김현기;김현호;하기종;최영규;류기환;이천희
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 2001년도 하계종합학술대회 논문집(2)
    • /
    • pp.269-272
    • /
    • 2001
  • In this paper we proposed the format converter design and implementation for real time image processing. This design method is based on realized the large processor-per-pixel array by integrated circuit technology in which this two types of integrated structure is can be classify associative parallel processor and parallel process with DRAM cell. Layout pitch of one-bit-wide logic is identical memory cell pitch to array high density PEs in integrate structure. This format converter design has control path implementation efficiently, and can be utilized the high technology without complicated controller hardware. Sequence of array instruction are generated by host computer before process start, and instructions are saved on unit controller. Host computer is executed the pixel-parallel operation starting at saved instructions after processing start

  • PDF

다중 컴퓨터 망에서 신경회로망 설계를 위한 고속병렬처리 시스템의 구현 (An Implementation of High-Speed Parallel Processing System for Neural Network Design by Using the Multicomputer Network)

  • 김진호;최흥문
    • 전자공학회논문지B
    • /
    • 제30B권5호
    • /
    • pp.120-128
    • /
    • 1993
  • In this paper, an implementation of high-speed parallel processing system for neural network design on the multicomputer network is presented. Linear speedup expandability is increased by reducing the synchronization penalty and the communication overhead. Also, we presented the parallel processing models and their performance evaluation models for each of the parallization methods of the neural network. The results of the experiments for the character recognition of the neural network bases on the proposed system show that the proposed approach has the higher linear speedup expandability than the other systems. The proposed parallel processing models and the performance evaluation models could be used effectively for the design and the performance estimation of the neural network on the multicomputer network.

  • PDF

A NOVEL PARALLEL METHOD FOR SPECKLE MASKING RECONSTRUCTION USING THE OPENMP

  • LI, XUEBAO;ZHENG, YANFANG
    • 천문학회지
    • /
    • 제49권4호
    • /
    • pp.157-162
    • /
    • 2016
  • High resolution reconstruction technology is developed to help enhance the spatial resolution of observational images for ground-based solar telescopes, such as speckle masking. Near real-time reconstruction performance is achieved on a high performance cluster using the Message Passing Interface (MPI). However, much time is spent in reconstructing solar subimages in such a speckle reconstruction. We design and implement a novel parallel method for speckle masking reconstruction of solar subimage on a shared memory machine using the OpenMP. Real tests are performed to verify the correctness of our codes. We present the details of several parallel reconstruction steps. The parallel implementation between various modules shows a great speed increase as compared to single thread serial implementation, and a speedup of about 2.5 is achieved in one subimage reconstruction. The timing result for reconstructing one subimage with 256×256 pixels shows a clear advantage with greater number of threads. This novel parallel method can be valuable in real-time reconstruction of solar images, especially after porting to a high performance cluster.

오디세우스 대용량 검색 엔진을 위한 병렬 웹 크롤러의 구현 (Implementation of a Parallel Web Crawler for the Odysseus Large-Scale Search Engine)

  • 신은정;김이른;허준석;황규영
    • 한국정보과학회논문지:컴퓨팅의 실제 및 레터
    • /
    • 제14권6호
    • /
    • pp.567-581
    • /
    • 2008
  • 웹의 크기가 폭발적으로 증가함에 따라 인터넷에서 정보를 얻는 수단으로서 검색 엔진의 중요성이 부각되고 있다. 검색 엔진은 사용자에게 최신의 정보를 검색 결과로서 제공하기 위해 웹 페이지를 주기적으로 수집하고 이를 데이타베이스에 저장한다. 웹 크롤러는 이러한 목적으로 웹 페이지를 수집하는 프로그램이다. 대부분의 검색 엔진은 제한된 시간 내에 많은 수의 웹 페이지를 수집하기 위해 다수의 머신을 사용하는 병렬 웹 크롤러를 이용한다. 그러나, 병렬 웹 크롤러의 아키텍처와 세부 구현 방법이 잘 알려져 있지 않기 때문에 실제로 병렬 웹 크롤러를 구현하는 데에 어려움이 많다. 본 논문에서는 병렬 웹 크롤러(parallel web crawler)의 아키텍처와 세부 구현 방법을 제시한다. 병렬 웹 크롤러는 다수의 머신에서 웹 페이지를 병렬적으로 수집하기 위해 조정자(coordinator) 대리자(agent) 구조의 2-티어(tier) 모델을 사용한다. 조정자/대리자 모델은 각 머신에서 웹 페이지를 수집하기 위한 다수의 대리자들과 이 대리자들을 관리하기 위한 하나의 조정자로 구성된다. 병렬 웹 크롤러는 웹 페이지를 수집하기 위한 크롤링(crawling) 모듈, 수집한 웹 페이지를 데이타베이스 로딩 포맷으로 변환하기 위한 컨버팅(converting) 모듈, 수집된 웹 페이지의 중요도를 계산하기 위한 랭킹(ranking) 모듈로 구성된다. 본 논문에서는 병렬 웹 크롤러의 각 모듈들을 설명하고, 세부 구현 방법을 설명한다. 마지막으로, 실험을 통해 병렬 웹 크롤러의 성능을 평가하였다. 실험 결과, 제안된 병렬, 웹 크롤러가 수집해야할 웹 페이지 개수와 머신 개수에 따라 확장 가능함을 보였다.

A PARALLEL IMPLEMENTATION OF A RELAXED HSS PRECONDITIONER FOR SADDLE POINT PROBLEMS FROM THE NAVIER-STOKES EQUATIONS

  • JANG, HO-JONG;YOUN, KIHANG
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • 제22권3호
    • /
    • pp.155-162
    • /
    • 2018
  • We describe a parallel implementation of a relaxed Hermitian and skew-Hermitian splitting preconditioner for the numerical solution of saddle point problems arising from the steady incompressible Navier-Stokes equations. The equations are linearized by the Picard iteration and discretized with the finite element and finite difference schemes on two-dimensional and three-dimensional domains. We report strong scalability results for up to 32 cores.

EFFICIENT PARALLEL GAUSSIAN NORMAL BASES MULTIPLIERS OVER FINITE FIELDS

  • Kim, Young-Tae
    • 호남수학학술지
    • /
    • 제29권3호
    • /
    • pp.415-425
    • /
    • 2007
  • The normal basis has the advantage that the result of squaring an element is simply the right cyclic shift of its coordinates in hardware implementation over finite fields. In particular, the optimal normal basis is the most efficient to hardware implementation over finite fields. In this paper, we propose an efficient parallel architecture which transforms the Gaussian normal basis multiplication in GF($2^m$) into the type-I optimal normal basis multiplication in GF($2^{mk}$), which is based on the palindromic representation of polynomials.

Connection Machine CM-2상에서 신경망군(群)의 병렬 구현 (Parallel Implementation of A Neural Network Ensemble on the Connection Machine CM-2)

  • 김대진
    • 전자공학회논문지C
    • /
    • 제34C권1호
    • /
    • pp.28-41
    • /
    • 1997
  • This paper describes a parallel implementation of a neurla network ensemble developed for object recognition on the connection machine CM-2. The implementation ensures that multiple networks are implemented simultaneously starting from different initial weights and all training samples are applied to each network by one sample per a copy of each network. When compared with a sequential implementation, this accelerates the computation speed by O(N.m.n) where N, m, and n are the network, respectively. The speedup in the computation time and the convergence characteristics of sthe modified backpropagation learning precedure were evaluated by two-dimensional object recognition problem.

  • PDF

용접공정 유한요소 해석의 병렬 처리 적용 (Application for parallel computation for finite element analysis of welding processes)

  • 임세영;김주완;최강혁
    • 대한용접접합학회:학술대회논문집
    • /
    • 대한용접접합학회 2004년도 춘계 학술발표대회 개요집
    • /
    • pp.273-275
    • /
    • 2004
  • A parallel multi-frontal solver is developed for finite element analysis of an arc-welding process, which entails phase evolution, heat transfer, and deformations of structure. We verify the code via comparison to a commercial code,SYSWELD. Attention is focused on the implementation of the parallel solver using MPI library, on the speedup by parallel computation, and on the effectiveness of the solver in welding application

  • PDF

Application of a Parallel Asynchronous Algorithm to Some Grid Problems on Workstation Clusters

  • Park, Pil-Seong
    • Ocean and Polar Research
    • /
    • 제23권2호
    • /
    • pp.173-179
    • /
    • 2001
  • Parallel supercomputing is now a must for oceanographic numerical modelers. Most of today's parallel numerical schemes use synchronous algorithms, where some processors that have finished their tasks earlier than others must wait at synchronization points for correct computation. Hence, the load balancing is a crucial factor, however, it is, in general, difficult to achieve on heterogeneous workstation clusters. We devise an asynchronous algorithm that reduces the idle times of faster processors, and discuss application of the algorithm to some grid problems and implementation on a workstation cluster using Message Passing Interface (MPI).

  • PDF