• Title/Abstract/Keywords: Parallel Implementation

A Study on Hybrid Image Coder Using a Reconfigurable Multiprocessor System (Study II: Parallel Algorithm Implementation)

  • Choi, Sang-Hoon;Lee, Kwang-Kee;Kim, In;Lee, Yong-Kyun;Park, Kyu-Tae
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.10 / pp.13-26 / 1993
  • Motion picture algorithms are realized on the multiprocessor system presented in Study I. To process the algorithms as efficiently as possible, pipelining and geometrical parallel processing methods are employed, and the processing time, communication load and efficiency of each algorithm are compared. The performance of the implemented system is compared and analysed with reference to the MPEG coding algorithm. Theoretical calculations and experimental results both show that geometrical partitioning is the more suitable parallel processing method for moving picture coding: it allows easy algorithm modification and expansion, and its overall efficiency is higher than that of pipelining.

Application of the Hamiltonian circuit Latin square to a Parallel Routing Algorithm on Generalized Recursive Circulant Networks

  • Choi, Dongmin;Chung, Ilyong
    • Journal of Korea Multimedia Society / v.18 no.9 / pp.1083-1090 / 2015
  • A generalized recursive circulant network (GR) is widely used in the design and implementation of local area networks and parallel processing architectures. In this paper, we investigate the routing of a message on this network, which is key to its performance. We would like to transmit the maximum number of packets from a source node to a destination node simultaneously along paths on this network, where the ith packet traverses the ith path. For all packets to arrive at the destination node securely, the ith path must be node-disjoint from all other paths. To construct these paths, we employ the Hamiltonian Circuit Latin Square (HCLS), a special class of (n x n) matrices, and present an O(n^2) parallel routing algorithm on generalized recursive circulant networks.

A Study of Parallel Implementations of the Chimera Method using Unsteady Euler Equations

  • Cho K. W.;Kwon J. H.;Lee S.S
    • Journal of computational fluids engineering / v.4 no.3 / pp.52-62 / 1999
  • The development of a parallelized aerodynamic simulation process involving moving bodies is presented. The implementation of this process is demonstrated using a fully systemized Chimera methodology for steady and unsteady problems. This methodology consists of Chimera hole-cutting, a new cut-paste algorithm for optimal mesh interface generation, and a two-step search method for donor cell identification. It is fully automated and requires minimal user input. All procedures of the Chimera technique are parallelized on the Cray T3E using the MPI library. Two- and three-dimensional examples are chosen to demonstrate the effectiveness and parallel performance of this procedure.

Numerical Simulation of Natural Convection in Annuli with Internal Fins

  • Ha, Man-Yeong;Kim, Joo-Goo
    • Journal of Mechanical Science and Technology / v.18 no.4 / pp.718-730 / 2004
  • The solution for natural convection in internally finned horizontal annuli is obtained by numerical simulation of the time-dependent, two-dimensional governing equations. The fins in the annuli influence the flow pattern, temperature distribution and heat transfer rate. Variations of the fin configuration suppress or accelerate the free convective effects compared to those of smooth tubes. The effects of fin configuration, number of fins and ratio of annulus gap width to inner cylinder radius on the fluid flow and heat transfer in annuli are demonstrated by the distribution of the velocity vectors, isotherms and streamlines. The governing equations are solved with a parallel implementation, adopted to reduce the computation cost. The parallelization is performed with the domain decomposition technique and message passing between sub-domains on the basis of the MPI library. The results of the parallel computation are consistent with those of the sequential program. Moreover, the speed-up ratio increases linearly with the number of processors.
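The domain decomposition idea in this abstract can be illustrated with a minimal sketch. It is not the authors' solver: it assumes a 1-D Jacobi relaxation with fixed end values in place of the 2-D convection equations, and it simulates the message passing between sub-domains with plain list assignments instead of MPI calls. The function names `jacobi_sequential` and `jacobi_decomposed` are hypothetical.

```python
def jacobi_sequential(u, steps):
    """Reference run: Jacobi relaxation on the whole 1-D grid (end values fixed)."""
    u = list(u)
    for _ in range(steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u


def jacobi_decomposed(u, steps, parts):
    """Same relaxation, split into `parts` sub-domains with one ghost cell per side.

    Assumes parts <= len(u) - 2, so every sub-domain owns at least one point.
    The ghost-cell copies stand in for the messages an MPI code would exchange.
    """
    n = len(u)
    interior = n - 2
    # bounds[p] is the first global index owned by sub-domain p
    bounds = [1 + (interior * p) // parts for p in range(parts + 1)]
    # each sub-domain stores its own points plus a ghost (or boundary) cell on each side
    subs = [list(u[bounds[p] - 1: bounds[p + 1] + 1]) for p in range(parts)]
    for _ in range(steps):
        # local update: every sub-domain works only on its own data
        updated = []
        for s in subs:
            new = s[:]
            for i in range(1, len(s) - 1):
                new[i] = 0.5 * (s[i - 1] + s[i + 1])
            updated.append(new)
        subs = updated
        # halo exchange: neighbours swap their edge values into each other's ghost cells
        for p in range(parts - 1):
            subs[p][-1] = subs[p + 1][1]
            subs[p + 1][0] = subs[p][-2]
    # gather the owned points back into one global array
    result = [u[0]]
    for s in subs:
        result.extend(s[1:-1])
    result.append(u[-1])
    return result


if __name__ == "__main__":
    initial = [1.0] + [0.0] * 11
    print(jacobi_sequential(initial, 50))
    print(jacobi_decomposed(initial, 50, 3))
```

Because each sub-domain only needs its neighbours' edge values once per iteration, the decomposed run reproduces the sequential one, which mirrors the consistency check reported in the abstract.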

On The Parallel Implementation of a Static/Explicit FEM Program for Sheet Metal Forming

  • ;;G.P.Nikishikov
    • Proceedings of the Korean Society of Precision Engineering Conference / 1995.10a / pp.625-628 / 1995
  • A static/explicit finite element code for sheet forming (ITAS3D) is parallelized on an IBM SP 6000 multi-processor computer. A computing-load-balanced domain decomposition method and a direct solution method for each subdomain (and interface) equation are developed. The system of equations for each subdomain is constructed by condensation and solved on each processor. Approximate operation counts are calculated to set up the nonlinear equation system for balancing the compute load on each subdomain. Square cup tests with several numbers of elements are used to demonstrate the performance of this parallel implementation. The procedure is shown to be efficient for a moderate number of processors, especially for a large number of elements.

Multi-Layer Printed Wiring Board with Built-In Soldering Heater and 3D Implementation of Dynamically Reconfigurable Highly Parallel Processors

  • Fujika, Yoshichika;Lee, Doo-Yong
    • Proceedings of the Institute of Control, Robotics and Systems Conference / 2001.10a / pp.104.2-104 / 2001
  • In intelligent integrated systems, the delay time must be reduced using highly parallel processors, while also achieving high throughput. In this paper, we propose a new concept for building 3D highly parallel processors using multi-layer printed wiring boards with a built-in soldering heater (BISH-PWB). The proposed BISH is realized with a long and narrow copper wiring pattern on the internal layer in the terminal pattern area. Based on the linearity of the copper resistance vs. temperature, we can measure the BISH temperature and its calorific value from the heater voltage and current measurements. If we provide the BISH temperature control systems for each BISH, selective multi-point soldering can be realized with same ...

Performance Evaluation of Access Channel Slot Acquisition in Cellular DS/CDMA Reverse Link

  • Kang, Bub-Joo;Han, Young-Nam
    • ETRI Journal / v.20 no.1 / pp.16-27 / 1998
  • In this paper, we consider the acquisition performance of an IS-95 reverse link access channel slot as a function of system design parameters such as the postdetection integration length and the number of access channel message block repetitions. The uncertainty region of the reverse link spreading codes is very small compared to that of the forward link, since the uncertainty region of the reverse link is determined by the cell radius. Thus, a parallel acquisition technique in the reverse link is more efficient than a serial acquisition technique in terms of implementation and acquisition time. The parallel acquisition is achieved by a bank of N parallel I/Q noncoherent correlators, which are analyzed for band-limited noise and the Rayleigh fast fading channel. The detection probability is derived for multiple correct code-phase offsets and multipath fading. The probability of no message error is derived when rake combining, access channel message block combining, and Viterbi decoding are applied. Numerical results give the acquisition performance for system design parameters such as the postdetection integration length and the number of access channel message block repetitions in the case of a random access by a mobile station.
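The bank of parallel noncoherent correlators described above can be sketched in a few lines. This is only an illustration under strong assumptions: a length-31 m-sequence stands in for the IS-95 spreading codes, the channel is noiseless with a single unknown carrier phase, and the helper names (`m_sequence`, `received`, `correlator_bank`, `acquire`) are hypothetical rather than taken from the paper.

```python
import cmath


def m_sequence(length=31):
    """Length-31 maximal-length sequence (recurrence a[n] = a[n-3] XOR a[n-5]), mapped to +1/-1."""
    bits = [1, 0, 0, 0, 0]
    while len(bits) < length:
        bits.append(bits[-3] ^ bits[-5])
    return [1 - 2 * b for b in bits[:length]]


def received(code, offset, carrier_phase):
    """Noise-free received chips: the code delayed by `offset`, rotated by an unknown phase."""
    n = len(code)
    rotation = cmath.exp(1j * carrier_phase)
    return [code[(i - offset) % n] * rotation for i in range(n)]


def correlator_bank(rx, code, branches):
    """One correlator per candidate code phase; each branch is independent of the others.

    Taking |.|^2 of the complex sum combines the I and Q parts, so the decision
    does not depend on the carrier phase (noncoherent detection).
    """
    n = len(code)
    energies = []
    for k in range(branches):
        z = sum(rx[i] * code[(i - k) % n] for i in range(n))
        energies.append(abs(z) ** 2)
    return energies


def acquire(rx, code, branches):
    """Pick the code phase whose branch reports the largest energy."""
    energies = correlator_bank(rx, code, branches)
    return energies.index(max(energies))


if __name__ == "__main__":
    code = m_sequence()
    rx = received(code, offset=5, carrier_phase=1.0)
    print(acquire(rx, code, branches=len(code)))
```

Since all branches evaluate their code phases at the same time, the search finishes in one integration period instead of one period per candidate phase, which is the time advantage over serial acquisition that the abstract refers to.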

Discrete Cosine Transform Algorithms for the VLSI Parallel Implementation

  • Cho, Nam-Ik;Lee, Sang-Uk
    • Journal of the Korean Institute of Telematics and Electronics / v.25 no.7 / pp.851-858 / 1988
  • In this paper, we propose two different VLSI architectures for the parallel computation of the DCT (discrete cosine transform). First, it is shown that the DCT algorithm can be implemented on the existing systolic architecture for the DFT (discrete Fourier transform) with some modification. Second, a new prime factor DCT algorithm based on the prime factor DFT algorithm is proposed, and it is shown that the proposed algorithm can be implemented in parallel on the systolic architecture for the prime factor DFT. However, the proposed algorithm is only applicable to data lengths that can be decomposed into relatively prime and odd numbers. It is also found that the proposed systolic architecture requires fewer multipliers than structures that implement FDCT (fast DCT) algorithms directly.
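The first claim, that a DCT can be obtained from DFT hardware with a small modification, rests on a standard identity that can be checked numerically. The sketch below is an assumption-laden illustration, not the paper's architecture: it uses the unnormalized DCT-II, a naive O(N^2) DFT in place of a systolic array, and the textbook mirror-extension trick (2N-point DFT followed by a phase rotation). All function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive DFT, standing in for whatever DFT engine is available."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def dct2_direct(x):
    """Unnormalized DCT-II straight from its definition."""
    n = len(x)
    return [sum(x[m] * math.cos(math.pi * (2 * m + 1) * k / (2 * n)) for m in range(n))
            for k in range(n)]


def dct2_via_dft(x):
    """DCT-II computed with a DFT: mirror the input, transform, then rotate the phase."""
    n = len(x)
    mirrored = list(x) + list(reversed(x))   # length 2N, even-symmetric extension
    spectrum = dft(mirrored)
    # the twiddle exp(-j*pi*k/(2N)) turns each DFT bin into twice the DCT coefficient
    return [0.5 * (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
            for k in range(n)]


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(dct2_direct(samples))
    print(dct2_via_dft(samples))
```

The only extra work beyond the DFT is the input reordering and one complex multiplication per output, which suggests why an existing DFT structure can be reused with modest changes.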

Parallel Reduced-Order Square-Root Unscented Kalman Filter for State Estimation of Sensorless Permanent-Magnet Synchronous Motor

  • Moon, Cheol;Kwon, Young-Ahn
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.6 / pp.1019-1025 / 2016
  • This paper proposes a parallel reduced-order square-root unscented Kalman filter for state estimation of a sensorless permanent-magnet synchronous motor. The unscented Kalman filter arose from the linearization error between a real system and the classical Kalman model. The unscented transformation yields a more accurate Kalman model, but its complexity is the main drawback. This paper investigates the design and implementation of the proposed filter with the Potter and Carlson square-root forms. The proposed parallel reduced-order square-root unscented Kalman filter reduces memory and code size and improves numerical computation, while its performance is not significantly different from that of the unscented Kalman filter. Experiments are performed to verify the proposed filter.
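For readers unfamiliar with the Potter square-root form mentioned above, here is a minimal sketch of its measurement update. It assumes a linear scalar measurement z = h·x + noise with variance r and a plain (not unscented, not reduced-order) filter, so it shows only the square-root idea, not the paper's algorithm. The name `potter_update` and the example numbers are hypothetical.

```python
import math


def potter_update(x, S, h, r, z):
    """Potter square-root measurement update for one scalar measurement.

    x : state estimate (list of n floats)
    S : square root of the covariance, P = S * S^T (n x n nested lists)
    h : measurement row vector, r : measurement noise variance, z : measured value
    Returns the updated state and the updated square root; P itself is never formed.
    """
    n = len(x)
    # phi = S^T h
    phi = [sum(S[i][j] * h[i] for i in range(n)) for j in range(n)]
    a = 1.0 / (sum(p * p for p in phi) + r)          # inverse innovation variance
    gamma = a / (1.0 + math.sqrt(a * r))
    s_phi = [sum(S[i][j] * phi[j] for j in range(n)) for i in range(n)]
    gain = [a * v for v in s_phi]                    # Kalman gain K = a * S * phi
    innovation = z - sum(h[i] * x[i] for i in range(n))
    x_new = [x[i] + gain[i] * innovation for i in range(n)]
    # rank-one correction of the square root: S+ = S - gamma * (S phi) phi^T
    S_new = [[S[i][j] - gamma * s_phi[i] * phi[j] for j in range(n)] for i in range(n)]
    return x_new, S_new


if __name__ == "__main__":
    state = [0.0, 0.0]
    root = [[2.0, 0.0], [1.0, 1.0]]      # P = [[4, 2], [2, 2]]
    new_state, new_root = potter_update(state, root, h=[1.0, 0.0], r=1.0, z=1.0)
    print(new_state)
    print(new_root)
```

Propagating S instead of P keeps the implied covariance symmetric and positive semidefinite by construction, which is the usual numerical motivation for square-root filters.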

Parallel Hybrid Particle-Continuum (DSMC-NS) Flow Simulations Using 3-D Unstructured Mesh

  • Wu J.S.;Lian Y.Y.;Cheng G.;Chen Y.S.
    • Proceedings of the Korean Society of Computational Fluids Engineering Conference / 2006.05a / pp.27-34 / 2006
  • In this paper, a recently proposed parallel hybrid particle-continuum (DSMC-NS) scheme employing a 3D unstructured grid for solving steady-state gas flows involving continuum and rarefied regions is described [1]. The replacement of the density-based NS solver with a pressure-based one, which greatly enhances the capability of the proposed hybrid scheme, and several practical lessons learned from its development and verification are highlighted. Finally, we present simulation results for a realistic RCS nozzle plume, which is considered very challenging for either a continuum or a particle solver alone, to demonstrate the capability of the proposed hybrid DSMC-NS method.
