• Title/Summary/Keyword: Parallel Decomposition


An Adaptive Decomposition Technique for Multidisciplinary Design Optimization (다분야통합최적설계를 위한 적응분해기법)

  • Park, Hyeong Uk;Choe, Dong Hun;An, Byeong Ho
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.31 no.5 / pp.18-24 / 2003
  • The design cycle associated with large engineering systems requires an initial decomposition of the complex system into design processes that are coupled through the transfer of output data. Some of these design processes may be grouped into iterative subcycles. Previous studies predefined the number of design processes in each group, but these group sizes should be determined optimally to balance the computing time of each group. This paper proposes an adaptive decomposition method, which determines the group sizes and the order of processes simultaneously to raise design efficiency by expanding the chromosome of the genetic algorithm. Finally, two sample cases are presented to show the effects of optimizing the sequence of processes with the adaptive decomposition method.
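
The idea of a chromosome that carries both the process order and the group sizes can be illustrated with a minimal sketch. The encoding used here (a permutation `order` plus a list `sizes`), the function names, and the imbalance measure are assumptions made for illustration; the abstract does not specify the paper's actual representation or fitness function.

```python
from typing import List, Sequence


def decode_chromosome(order: Sequence[int], sizes: Sequence[int]) -> List[List[int]]:
    """Split an ordered list of process ids into consecutive groups.

    Hypothetical encoding: `order` is a permutation of process ids and
    `sizes` holds the number of processes assigned to each group.
    """
    if any(s <= 0 for s in sizes):
        raise ValueError("every group must contain at least one process")
    if sum(sizes) != len(order):
        raise ValueError("group sizes must sum to the number of processes")
    groups, start = [], 0
    for size in sizes:
        groups.append(list(order[start:start + size]))
        start += size
    return groups


def load_imbalance(groups: List[List[int]], cost: Sequence[float]) -> float:
    """Gap between the most and least expensive group (assumed fitness term)."""
    totals = [sum(cost[p] for p in group) for group in groups]
    return max(totals) - min(totals)


# Five processes with assumed computing costs, split into two groups.
groups = decode_chromosome(order=[3, 0, 2, 1, 4], sizes=[2, 3])
imbalance = load_imbalance(groups, cost=[1, 2, 3, 4, 0])
```

A genetic algorithm built on such an encoding would mutate and recombine both parts of the chromosome, so that ordering and group sizes are searched together rather than fixing the sizes in advance.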

Design and Analysis of MPEG-2 MP@HL Decoder in Multi-Processor Environments

  • Yoo, Seung-Hwan;Lee, Hyun-Seung;Lee, Sang-Jo;Park, Rae-Hong;Kim, Do-Hyung
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.211-216 / 2009
  • As demand for high-definition television (HDTV) increases, real-time decoding of high-definition (HD) video becomes an important issue. The data size of HD video is so large that real-time processing is difficult to implement, especially in software. To implement a fast Moving Picture Experts Group-2 (MPEG-2) decoder for HDTV, we compose five scenarios that use parallel processing techniques such as data decomposition, task decomposition, and pipelining. Assuming a multi-digital-signal-processor environment, we analyze each scenario in three aspects: decoding speed, L1 memory size, and bandwidth. By comparing the scenarios, we determine the most suitable cases for different situations. We simulate the scenarios in dual-core and dual-central-processing-unit environments using OpenMP and analyze the simulation results.


SOME INVARIANT SUBSPACES FOR BOUNDED LINEAR OPERATORS

  • Yoo, Jong-Kwang
    • Journal of the Chungcheong Mathematical Society / v.24 no.1 / pp.19-34 / 2011
  • A bounded linear operator $T$ on a complex Banach space $X$ is said to have property (I) provided that $T$ has Bishop's property ($\beta$) and there exists an integer $p > 0$ such that $X_T(F) = E_T(F) = \bigcap_{\lambda \in \mathbb{C} \setminus F} (T-\lambda)^p X$ for all closed sets $F \subseteq \mathbb{C}$, where $X_T(F)$ denotes the analytic spectral subspace and $E_T(F)$ denotes the algebraic spectral subspace of $T$. Easy examples are provided by normal operators and hyponormal operators in Hilbert spaces, and more generally, generalized scalar operators and subscalar operators in Banach spaces. In this paper, we prove that if $T$ has property (I), then the quasi-nilpotent part $H_0(T)$ of $T$ is given by $$\operatorname{Ker} T^p = \{x \in X : r_T(x) = 0\} = \bigcap_{\lambda \neq 0} (T-\lambda)^p X$$ for all sufficiently large integers $p$, where $r_T(x) = \limsup_{n \to \infty} \|T^n x\|^{1/n}$. We also prove that if $T$ has property (I) and the spectrum $\sigma(T)$ is finite, then $T$ is algebraic. Finally, we prove that if $T \in L(X)$ has property (I) and has decomposition property ($\delta$), then $T$ has a non-trivial invariant closed linear subspace.

Accuracy Analysis of Parallel Method based on Non-overlapping Domain Decomposition Method (비중첩 영역 분할기법 기반 병렬해석의 정확도 분석)

  • Tak, Moonho;Song, Yooseob;Jeon, Hye-Kwan;Park, Taehyo
    • Journal of the Computational Structural Engineering Institute of Korea / v.26 no.4 / pp.301-308 / 2013
  • In this paper, an accuracy analysis of a parallel method based on a non-overlapping domain decomposition method is carried out. In this approach, proposed by Tak et al. (2013), the decomposed subdomains do not overlap each other, and the connection between adjacent subdomains is determined via a simple connective finite element named the interfacial element. This approach has two main advantages. The first is that a direct method such as Gaussian elimination is available even in a singular problem, because the singular stiffness matrix of a floating domain can be converted to an invertible matrix by assembling the interfacial element. The second is that computational time and storage can be reduced in comparison with the traditional finite element tearing and interconnecting (FETI) method. The accuracy of the proposed method, on the other hand, tends to decrease at cross points where more than three subdomains are interconnected. Thus, in this paper, an accuracy analysis of the non-overlapping domain decomposition method is carried out for various numbers of subdomains interconnected at a cross point. The cause of the accuracy degradation is also analyzed, and countermeasures are discussed.
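
The first advantage can be illustrated with a small one-dimensional sketch. It is not the interfacial element of Tak et al.; it uses a plain linear spring of assumed stiffness `c` as a stand-in connector, and the bar stiffness `k` and the three-degree-of-freedom layout are likewise assumptions. A free-free (floating) bar has a singular stiffness matrix, while the system obtained by assembling the connector between the floating bar and a supported neighbour has a nonzero determinant, so a direct solver can be applied.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


k = 2.0  # axial stiffness of each bar subdomain (assumed value)
c = 5.0  # stiffness of the stand-in connecting spring (assumed value)

# Floating subdomain: a bar with both ends free has a rigid-body mode.
k_floating = [[k, -k],
              [-k, k]]

# Assembled system with unknowns (u1, u2, u3): u1 is the free end of a bar
# whose other end is fixed, u2 and u3 belong to the floating bar, and the
# spring connects u1 to u2.
k_coupled = [[k + c, -c, 0.0],
             [-c, c + k, -k],
             [0.0, -k, k]]

det_floating = det2(k_floating)  # zero: singular on its own
det_coupled = det3(k_coupled)    # equals c * k**2: invertible once coupled
```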

PERFORMANCE ANALYSIS OF THE PARALLEL CUPID CODE IN DISTRIBUTED MEMORY SYSTEM BASED ETHERNET AND INFINIBAND NETWORK (이더넷과 인피니밴드 네트워크 기반의 분산 메모리 시스템에서 병렬성능 분석)

  • Jeon, B.J.;Choi, H.G.
    • Journal of computational fluids engineering / v.19 no.2 / pp.24-29 / 2014
  • In this study, the parallel performance of the CUPID code has been investigated on both Ethernet and InfiniBand network systems to examine the effects of cache memory and network speed. The bi-conjugate gradient solver of the CUPID code has been parallelized using the domain decomposition method and the message passing interface (MPI). It is shown that the parallel performance of the Ethernet system is worse than that of the InfiniBand system because of its slow network speed and small cache memory. It is also found that the parallel performance of each system deteriorates for a small problem because of communication overhead, but the InfiniBand system still performs better than the Ethernet system owing to its much faster network. For a large problem, the parallel performance depends less on the network system.

A Study on Parallel Processing System for Automatic Segmentation of Moving Object in Image Sequences

  • Lee, Hyung;Park, Jong-Won
    • Proceedings of the IEEK Conference / 2000.07a / pp.429-432 / 2000
  • The new MPEG-4 video coding standard enables content-based functionalities. To support the philosophy of the MPEG-4 visual standard, each frame of a video sequence should be represented in terms of video object planes (VOPs). In other words, the video objects to be encoded in still pictures or video sequences should be prepared before the encoding process starts. This requires a prior decomposition of sequences into VOPs so that each VOP represents a moving object. Because automatic segmentation is time-consuming, a parallel processing system is required to perform it in real time. This paper addresses a parallel processing system for automatic segmentation that separates moving objects from the background in image sequences. The proposed parallel processing system comprises processing elements (PEs) and a multi-access memory system (MAMS). The multi-access memory system is a memory controller that performs parallel memory access in several modes: horizontal, vertical, and block access. To realize these modes, the multi-access memory system consists of a memory module selection module, data routing modules, and an address calculation and routing module. The proposed system is simulated and evaluated with the CADENCE Verilog-XL hardware simulation package.


Numerical Simulation of Natural Convection in Annuli with Internal Fins

  • Ha, Man-Yeong;Kim, Joo-Goo
    • Journal of Mechanical Science and Technology / v.18 no.4 / pp.718-730 / 2004
  • The solution for natural convection in internally finned horizontal annuli is obtained by numerical simulation of the time-dependent, two-dimensional governing equations. The fins in the annuli influence the flow pattern, temperature distribution, and heat transfer rate. Variations of the fin configuration suppress or accelerate the free convective effects compared to those of smooth tubes. The effects of fin configuration, number of fins, and ratio of annulus gap width to inner cylinder radius on the fluid flow and heat transfer in annuli are demonstrated by the distributions of velocity vectors, isotherms, and streamlines. The governing equations are solved efficiently by using a parallel implementation, which is adopted to reduce the computational cost. The parallelization is performed with the domain decomposition technique and message passing between sub-domains on the basis of the MPI library. The results from the parallel computation are consistent with those of the sequential program. Moreover, the speed-up ratio increases linearly with the number of processors.
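
For reference, the speed-up ratio and parallel efficiency mentioned in such studies are commonly defined as $S_p = T_1 / T_p$ and $E_p = S_p / p$. The sketch below computes both; the wall-clock timings are hypothetical values chosen to show ideal linear scaling and are not results from the paper.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speed-up ratio S_p = T_1 / T_p."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, n_procs: int) -> float:
    """Parallel efficiency E_p = S_p / p."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical wall-clock times in seconds, keyed by processor count.
timings = {1: 120.0, 2: 60.0, 4: 30.0, 8: 15.0}
ratios = {p: speedup(timings[1], t) for p, t in timings.items()}
```

Linear speed-up corresponds to `ratios[p] == p`, that is, an efficiency of 1 for every processor count.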

On the Parallel Implementation of a Static/Explicit FEM Program for Sheet Metal Forming (판금형 해석을 위한 정적/외연적 유한요소 프로그램의 병령화에 관한 연구)

  • ;;G.P.Nikishikov
    • Proceedings of the Korean Society of Precision Engineering Conference / 1995.10a / pp.625-628 / 1995
  • A static/explicit finite element code for sheet forming (ITAS3D) is parallelized on an IBM SP 6000 multi-processor computer. A computing-load-balanced domain decomposition method and a direct solution method for each subdomain (and interface) equation are developed. The system of equations for each subdomain is constructed by condensation and solved on each processor. Approximate operation counts are calculated to set up the nonlinear equation system for balancing the computing load on each subdomain. Square cup tests with several numbers of elements are used to demonstrate the performance of this parallel implementation. The procedure is shown to be efficient for a moderate number of processors, especially for a large number of elements.


High Performance Hybrid Direct-Iterative Solution Method for Large Scale Structural Analysis Problems

  • Kim, Min-Ki;Kim, Seung-Jo
    • International Journal of Aeronautical and Space Sciences / v.9 no.2 / pp.79-86 / 2008
  • A high-performance direct-iterative hybrid linear solver for large-scale finite element problems is developed. Direct solution methods are robust but difficult to parallelize, whereas iterative solution methods are the opposite. Combining the two is therefore desirable to obtain both high parallel efficiency and numerical robustness for large-scale structural analysis problems. The hybrid method presented in this paper is based on FETI-DP (the Finite Element Tearing and Interconnecting-Dual Primal method), which has good parallel scalability and efficiency. It is suitable for second- and fourth-order elliptic finite element problems, including structural analysis problems. We use a hybrid of these two categories of solution methods, combining a multifrontal solver with the FETI-DP based iterative solver. The hybrid solver is implemented in our general structural analysis code, IPSAP.

Identification and Compensation of Robot Kinematic Parameters for Positioning Accuracy Improvement

  • Kim, Doo-Hyeong;Guk, Geum-Hwan
    • 한국기계연구소 소보 / s.19 / pp.81-92 / 1989
  • This paper presents a simple method for identifying the actual kinematic parameters of a robot with parallel joints. It is known that the Denavit-Hartenberg coordinate system is not useful for nearly parallel joints. In this paper, the coordinate frames are reassigned so that the kinematic relationship between nearly parallel joints is modeled by four parameters. The proposed identification method uses a straight ruler about 1 m long. The robot hand is placed at prescribed points on the ruler by using a teaching pendant, and a corresponding error function is defined. The kinematic parameters that make the error function zero are obtained by an iterative least-squares method based on the singular value decomposition. In the compensation of joint angles, only the position is considered, because typical robot applications do not require precise orientation control.
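
A single linear step of a least-squares solve based on the singular value decomposition can be sketched as follows. This is a generic illustration using NumPy, not the paper's identification procedure: the function name `svd_least_squares`, the cutoff `rcond`, and the small test system are assumptions. Discarding very small singular values keeps the solve stable when the matrix is nearly rank deficient, which is the usual reason for preferring the SVD in this setting.

```python
import numpy as np


def svd_least_squares(a: np.ndarray, b: np.ndarray, rcond: float = 1e-12) -> np.ndarray:
    """Minimum-norm least-squares solution of a @ x ~ b via the SVD."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    # Invert only singular values above a relative cutoff; treat the rest as zero.
    keep = s > rcond * s.max()
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return vt.T @ (s_inv * (u.T @ b))


# Assumed example: fit y = x0 + x1 * t to three points lying on y = 1 + 2 t.
a = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 3.0, 5.0])
x = svd_least_squares(a, b)
```

In an iterative scheme, `a` would be the Jacobian of the error function with respect to the parameters and `b` the negated residual, with the solution applied as a parameter update at each iteration.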
