• Title/Summary/Keyword: Parallel computation

A Numerical Study for the Three-Dimensional Fluid Flow Past Tube Banks and Comparison with PIV Experimental Data

  • Ha, Man-Yeong;Kim, Seung-Hyeon;Kim, Kyung-Chun;Son, Young-Chul
    • Journal of Mechanical Science and Technology
    • /
    • v.18 no.12
    • /
    • pp.2236-2249
    • /
    • 2004
  • The three-dimensional fluid flow past tube banks arranged in equilateral-triangular form at $Re_{max}=4,000$ is analyzed using a large eddy simulation technique. The governing equations for mass and momentum conservation are discretized using the finite volume method. Parallel computational techniques using MPI (Message Passing Interface) are implemented in the present computer code. In the present parallel computation, the computation time decreases in linear proportion to the number of CPUs used. We obtained the time-averaged streamwise and cross-streamwise velocities and turbulent intensities. The present numerical results are compared with the PIV experimental data and generally agree well with them.

A Performance-Oriented Intra-Prediction Hardware Design for H.264/AVC

  • Jin, Xianzhe;Ryoo, Kwangki
    • Journal of information and communication convergence engineering
    • /
    • v.11 no.1
    • /
    • pp.50-55
    • /
    • 2013
  • In this paper, we propose a parallel intra-operation unit and a memory architecture for improving the performance of intra-prediction, which utilizes spatial correlation in an image to predict the blocks and contains 17 prediction modes in total. The design is targeted at portable devices that use H.264/AVC decoders. To boost the performance of the proposed design, we adopt a parallel intra-operation unit that can predict 16 neighboring pixels at the same time. In the best case, it can compute one luma $16{\times}16$ block within 16 cycles. For one luma $4{\times}4$ block, only one cycle is needed to finish the computation. Compared with previous designs, the average cycle reduction rate is 78.01%, and the gate count is slightly reduced. The design is synthesized with the MagnaChip $0.18{\mu}m$ library and can run at 125 MHz.
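
The best-case cycle counts in this abstract are consistent with the stated throughput of 16 predicted pixels per cycle. As a rough check, assuming no pipeline stalls or setup overhead (an assumption made here, not a claim from the abstract):

$$\frac{16 \times 16\ \text{pixels}}{16\ \text{pixels/cycle}} = 16\ \text{cycles}, \qquad \frac{4 \times 4\ \text{pixels}}{16\ \text{pixels/cycle}} = 1\ \text{cycle}.$$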

A PARALLEL HYBRID METHOD FOR EQUILIBRIUM PROBLEMS, VARIATIONAL INEQUALITIES AND NONEXPANSIVE MAPPINGS IN HILBERT SPACE

  • Hieu, Dang Van
    • Journal of the Korean Mathematical Society
    • /
    • v.52 no.2
    • /
    • pp.373-388
    • /
    • 2015
  • In this paper, a novel parallel hybrid iterative method is proposed for finding a common element of the set of solutions of a system of equilibrium problems, the set of solutions of variational inequalities for inverse strongly monotone mappings, and the set of fixed points of a finite family of nonexpansive mappings in Hilbert space. A strong convergence theorem is proved for the sequence generated by the scheme. Finally, a parallel iterative algorithm for two finite families of variational inequalities and nonexpansive mappings is established.

Parallel Computing For Computational Geometry (컴퓨터 기하학을 위한 병렬계산)

  • O, Seung-Jun
    • Electronics and Telecommunications Trends
    • /
    • v.4 no.1
    • /
    • pp.93-117
    • /
    • 1989
  • Computational geometry is concerned with the design and analysis of algorithms that solve geometric problems. Such problems arise in a large number of application areas, including pattern recognition, image processing, computer graphics, VLSI design, and statistics, all of which involve inherently geometric problems for which efficient algorithms have to be developed. Several parallel algorithms, based on various parallel computation models, have been proposed for solving geometric problems. We review the current status of parallel algorithms in computational geometry.

Application of a Parallel Asynchronous Algorithm to Some Grid Problems on Workstation Clusters

  • Park, Pil-Seong
    • Ocean and Polar Research
    • /
    • v.23 no.2
    • /
    • pp.173-179
    • /
    • 2001
  • Parallel supercomputing is now a must for oceanographic numerical modelers. Most of today's parallel numerical schemes use synchronous algorithms, in which processors that finish their tasks earlier than others must wait at synchronization points to ensure correct computation. Hence, load balancing is a crucial factor; however, it is generally difficult to achieve on heterogeneous workstation clusters. We devise an asynchronous algorithm that reduces the idle time of faster processors, and discuss its application to some grid problems and its implementation on a workstation cluster using the Message Passing Interface (MPI).

Joint Structural Importance of two Components

  • Abouammoh, A.M.;Sarhan, Ammar
    • International Journal of Reliability and Applications
    • /
    • v.3 no.4
    • /
    • pp.173-184
    • /
    • 2002
  • This paper introduces the joint structural importance of two components in a coherent system. Some relationships between joint structural importance and marginal structural importance are presented. It is shown that, in some special structures, the sign of the joint structural importance can be determined in advance without computation. The joint structural importance of two components in some series-parallel and parallel-series systems is established. Some practical examples are presented to elucidate some of the derived results.

A Pipelined Parallel Optimized Design for Convolution-based Non-Cascaded Architecture of JPEG2000 DWT (JPEG2000 이산웨이블릿변환의 컨볼루션기반 non-cascaded 아키텍처를 위한 pipelined parallel 최적화 설계)

  • Lee, Seung-Kwon;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.7
    • /
    • pp.29-38
    • /
    • 2009
  • In this paper, a high-performance pipelined computing design of parallel multiplier-temporal buffer-parallel accumulator is presented for the convolution-based non-cascaded architecture, aiming at real-time Discrete Wavelet Transform (DWT) processing. The convolved multiplications of the DWT can be reduced to as little as 1/4 by utilizing the symmetry of the filter coefficients and the up/down sampling, and they can be computed 3-5 times faster by LUT-based DA multiplication, in which multiple filter coefficients are processed in parallel to form the product terms with an image datum. Further, the computed product terms can be reused by storing them in the temporal buffer, which saves 50% of the computation as well as of the dynamic power. The convolved product terms of image data and filter coefficients are realigned and stored in the temporal buffer for accumulated addition. The buffer then manages this parallel aligned storage for high-speed sequential retrieval by the parallel accumulators. The convolved computation is pipelined through the parallel multiplier, temporal buffer, and parallel accumulator, and the parallelization of the temporal buffer and accumulator is optimized with respect to the performance of the parallel DA multiplier to improve pipelining performance. The proposed architecture is back-end designed with a $0.18{\mu}m$ library, which verifies a 30 fps throughput for SVGA ($800{\times}600$) images at 90 MHz.
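
The multiplication saving from coefficient symmetry and down-sampling can be illustrated in software. The sketch below is not the paper's hardware design: it assumes an odd-length symmetric filter (the 5/3 analysis low-pass taps are used purely as an example), ignores boundary extension, and leaves out LUT-based DA multiplication entirely. The function names are hypothetical. It compares filtering every position and then discarding half the outputs against computing only the retained outputs with mirrored samples added before each multiplication.

```python
# Illustrative 5/3 analysis low-pass taps (symmetric, odd length).
# Using this particular filter is an assumption made for the example.
H53 = [-1 / 8, 2 / 8, 6 / 8, 2 / 8, -1 / 8]


def direct_then_downsample(x, h):
    """Filter every valid position, then keep every second output."""
    taps = len(h)
    mults = 0
    full = []
    for n in range(len(x) - taps + 1):
        acc = 0.0
        for k in range(taps):
            acc += h[k] * x[n + k]
            mults += 1
        full.append(acc)
    return full[::2], mults


def folded_downsampled(x, h):
    """Compute only the retained outputs, folding symmetric taps.

    Because h[k] == h[taps - 1 - k], the two mirrored samples are added
    first and multiplied once. Assumes len(h) is odd.
    """
    taps = len(h)
    half = taps // 2
    mults = 0
    out = []
    for n in range(0, len(x) - taps + 1, 2):
        acc = h[half] * x[n + half]  # centre tap has no mirror partner
        mults += 1
        for k in range(half):
            acc += h[k] * (x[n + k] + x[n + taps - 1 - k])
            mults += 1
        out.append(acc)
    return out, mults
```

For a 20-sample input, the direct version performs 80 multiplications and the folded version 24, a ratio of 0.3. For an odd filter of length L the ratio is (L + 1) / (4L), which approaches 1/4 as the filter gets longer.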

Computational strategies for improving efficiency in rigid-plastic finite element analysis (강소성 유한요소해석의 안정화와 고능률화에 관한 연구)

  • ;;Yoshihiro, Tomita
    • Transactions of the Korean Society of Mechanical Engineers
    • /
    • v.13 no.3
    • /
    • pp.317-322
    • /
    • 1989
  • Effective computational strategies are proposed for evaluating the stiffness matrices of the rigid-plastic finite element method, which is widely used in the simulation of metal forming processes. The stiffness matrices are expressed as the sum of stiffness matrices evaluated by reduced integration and Liu's stabilization matrices, which control the occurrence of zero-energy modes due to excessive reduced integration. The proposed method has been applied to the solution of fundamental 3-dimensional problems. The results show that the deformed mesh configuration was remarkably stabilized and that computation was about 3 times as fast as in conventional 3-dimensional analyses. Furthermore, computation speed increases by a factor of 60 when parallel computation is introduced. This speedup tends to increase as the total number of degrees of freedom increases. As a result, this rigid-plastic finite element method enables us to analyze real 3-dimensional forming processes with practically acceptable computation time.

Applying Tabu Search to Minimize Mean Tardiness in the Parallel Machine Scheduling (동일한 병렬기계 일정계획에서 평균지연시간의 최소화를 위한 Tabu Search 방법)

  • 전태웅;강맹규
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.18 no.35
    • /
    • pp.107-114
    • /
    • 1995
  • This paper proposes a Tabu Search algorithm to minimize mean tardiness in the identical parallel machine scheduling problem. The algorithm reduces the computation time by employing a restricted neighborhood and produces an efficient solution to this problem.
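
The abstract does not say how the neighborhood is restricted, so the following is only a minimal sketch of the general idea, under assumptions chosen for illustration: the start solution is an earliest-due-date assignment to the least-loaded machine, the neighborhood contains only re-insertions of currently tardy jobs, and a job stays tabu for a fixed number of iterations unless moving it improves on the best solution found. All function names and parameters are hypothetical.

```python
from collections import deque


def mean_tardiness(schedule, proc, due):
    """Average of max(0, completion - due) over all jobs."""
    total = 0
    for machine in schedule:
        t = 0
        for j in machine:
            t += proc[j]
            total += max(0, t - due[j])
    return total / len(proc)


def edd_schedule(proc, due, machines):
    """Start solution: earliest due date first, onto the least-loaded machine."""
    schedule = [[] for _ in range(machines)]
    loads = [0] * machines
    for j in sorted(range(len(proc)), key=lambda j: due[j]):
        k = loads.index(min(loads))
        schedule[k].append(j)
        loads[k] += proc[j]
    return schedule


def tardy_jobs(schedule, proc, due):
    """Jobs that finish after their due date in the given schedule."""
    late = []
    for machine in schedule:
        t = 0
        for j in machine:
            t += proc[j]
            if t > due[j]:
                late.append(j)
    return late


def tabu_search(proc, due, machines, iters=100, tenure=3):
    current = edd_schedule(proc, due, machines)
    best = [m[:] for m in current]
    best_cost = mean_tardiness(best, proc, due)
    tabu = deque(maxlen=tenure)  # recently moved jobs
    for _ in range(iters):
        move, move_cost = None, float("inf")
        # Restricted neighborhood (assumption): only tardy jobs are moved.
        for j in tardy_jobs(current, proc, due):
            src = next(k for k, m in enumerate(current) if j in m)
            for dst in range(machines):
                for pos in range(len(current[dst]) + 1):
                    cand = [m[:] for m in current]
                    cand[src].remove(j)
                    if pos > len(cand[dst]):
                        continue
                    cand[dst].insert(pos, j)
                    if cand == current:
                        continue  # re-inserting in place is not a move
                    cost = mean_tardiness(cand, proc, due)
                    # Aspiration: a tabu move is allowed only if it beats the best.
                    if j in tabu and cost >= best_cost:
                        continue
                    if cost < move_cost:
                        move, move_cost = (cand, j), cost
        if move is None:
            break
        current, moved = move  # accept the best move even if it is worse
        tabu.append(moved)
        if move_cost < best_cost:
            best, best_cost = [m[:] for m in current], move_cost
    return best, best_cost
```

A call such as `tabu_search([4, 3, 2, 6, 5], [4, 5, 6, 7, 8], machines=2)` returns a schedule (one job list per machine) together with its mean tardiness, which is never worse than that of the starting schedule.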

Molecular Docking System using Parallel GPU (병렬 GPU를 이용한 분자 도킹 시스템)

  • Park, Sung-Jun
    • The Journal of the Korea Contents Association
    • /
    • v.8 no.12
    • /
    • pp.441-448
    • /
    • 2008
  • A molecular docking system needs a large amount of computation and requires supercomputing power. Because such experiments take a long time, they are conducted in a distributed or grid environment. Recently, research on using parallel GPUs, which offer far higher performance than CPUs, for scientific computing has been very active. CUDA is an open technique that makes parallel GPU programming possible. This study proposes a molecular docking system using CUDA. It also proposes an algorithm that parallelizes the energy-minimization computation. To verify the approach, this study compares the time required for molecular docking on a general CPU with the time and performance of the proposed parallel GPU-based molecular docking.