• Title/Summary/Keyword: Parallel Decomposition

Acceleration of Anisotropic Elastic Reverse-time Migration with GPUs (GPU를 이용한 이방성 탄성 거꿀 참반사 보정의 계산가속)

  • Choi, Hyungwook;Seol, Soon Jee;Byun, Joongmoo
    • Geophysics and Geophysical Exploration
    • /
    • v.18 no.2
    • /
    • pp.74-84
    • /
    • 2015
  • To yield physically meaningful images through elastic reverse-time migration, wavefield separation, which extracts P- and S-waves from the vector wavefields reconstructed with the elastic wave equation, is a prerequisite. Extending elastic reverse-time migration to anisotropic media requires not only an anisotropic modelling algorithm but also anisotropic wavefield separation. Anisotropic wavefield separation uses pseudo-derivative filters determined by the vertical velocities and anisotropic parameters of the elastic medium, and thus differs from the Helmholtz decomposition conventionally used for isotropic wavefield separation. Because applying these pseudo-derivative filters is computationally expensive, we developed an efficient anisotropic wavefield separation algorithm that runs in parallel on GPUs (Graphics Processing Units). We also developed a highly efficient anisotropic elastic reverse-time migration algorithm that uses MPI (Message-Passing Interface) and incorporates the GPU-based separation algorithm. To verify the efficiency and validity of the migration algorithm, a VTI elastic model based on Marmousi-II was built, and a synthetic multicomponent seismic data set was created from it. Using GPUs and MPI dramatically increased the computational speed of migration, and adopting anisotropic wavefield separation improved the accuracy of the image.
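
For contrast with the anisotropic filters, the isotropic Helmholtz decomposition named in this abstract is commonly written as follows. This is a standard textbook form, not an equation taken from the paper, and the symbols $\mathbf{u}$ (displacement wavefield), $\phi$ (scalar potential), and $\boldsymbol{\psi}$ (vector potential) are notation assumed here:

$$\mathbf{u} = \nabla\phi + \nabla\times\boldsymbol{\psi}, \qquad P = \nabla\cdot\mathbf{u}, \qquad \mathbf{S} = \nabla\times\mathbf{u}$$

The divergence keeps only the curl-free (P-wave) part and the curl keeps only the divergence-free (S-wave) part. In anisotropic media the polarizations are no longer purely parallel or perpendicular to the propagation direction, which is why the abstract replaces these plain derivatives with pseudo-derivative filters.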

In-situ Fourier Transform Infrared Spectroscopic Study during Thermolysis of Trimethylaluminum and its Adduct (Trimethylaluminum (TMA), $NH_3$ 및 TMA :$NH_3$Adduct의 열분해 반응에 대한 in-situ FTIR 분광학적 연구)

  • Hyang Sook Kim;Seong Han Kim;Jin Soo Hwang;Joong Gill Choi;Paul Joe Chong
    • Journal of the Korean Chemical Society
    • /
    • v.37 no.12
    • /
    • pp.995-1002
    • /
    • 1993
  • The thermal decomposition of trimethylaluminum (TMA) with ammonia has been investigated by in-situ Fourier transform infrared (FTIR) spectroscopy. The spectroscopic reaction cell, which permits internal heating up to 1100$^{\circ}C$, consists of a stainless-steel hexagonal-port chamber with two NaCl windows installed in parallel. The stoichiometric reaction between TMA and $NH_3$ is found to be complete immediately after mixing. FTIR spectra observed over the temperature range 25∼1100$^{\circ}C$ show that TMA and the TMA : $NH_3$ adduct decompose around 500$^{\circ}C$, with methane as the predominant product. The IR bands of gaseous TMA, $NH_3$ and the TMA : $NH_3$ adduct are assigned on the basis of published data. Furthermore, the decomposition of TMA can be described as a first-order reaction. Kinetic data on the decomposition of TMA and the TMA : $NH_3$ adduct are also discussed.
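
As a reminder of what "first-order" implies for the TMA decomposition, the generic rate law is sketched below. The rate constant $k$ and the initial concentration $[\mathrm{TMA}]_0$ are symbols assumed here; the abstract gives no numerical values for them.

$$-\frac{d[\mathrm{TMA}]}{dt} = k\,[\mathrm{TMA}] \quad\Longrightarrow\quad [\mathrm{TMA}](t) = [\mathrm{TMA}]_0\,e^{-kt}, \qquad t_{1/2} = \frac{\ln 2}{k}$$

Under this law a plot of $\ln[\mathrm{TMA}]$ against time is a straight line with slope $-k$, and the half-life does not depend on the starting concentration.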

High Throughput Parallel Design of 2-D $8{\times}8$ Integer Transforms for H.264/AVC (H.264/AVC 를 위한 높은 처리량의 2-D $8{\times}8$ integer transforms 병렬 구조 설계)

  • Sharma, Meeturani;Tiwari, Honey;Cho, Yong-Beom
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.49 no.8
    • /
    • pp.27-34
    • /
    • 2012
  • This paper presents a high-throughput implementation of the two-dimensional (2-D) $8{\times}8$ forward and inverse integer DCT for H.264. The forward and inverse transforms are expressed using simple shift and addition operations. Matrix decomposition and matrix operations such as the Kronecker product and direct sum are used to reduce the computational complexity. The proposed design uses only integer computations and needs no transpose memory, which also reduces resource consumption. The maximum operating frequency of the proposed pipelined architecture is 1.184 GHz, which yields a throughput of 25.27 Gpixels/sec at a hardware cost of 44864 gates. Its high throughput and low hardware cost make the proposed design useful for real-time H.264/AVC high-definition processing.
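
The role of the Kronecker product in a separable 2-D transform can be illustrated with a small sketch. It checks that the usual row-column form `C X C^T` equals a single matrix `C ⊗ C` applied to the row-major flattened block. The 4×4 integer matrix `C` and the block `X` are stand-ins chosen for brevity (assumptions, not the paper's $8{\times}8$ matrix or its hardware decomposition), and all helper names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def kron(a, b):
    """Kronecker product: entry ((i, j), (k, l)) equals a[i][k] * b[j][l]."""
    return [[a[i][k] * b[j][l]
             for k in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for j in range(len(b))]

def flatten(a):
    """Row-major flattening of a matrix."""
    return [v for row in a for v in row]

# Stand-in 4x4 integer transform matrix (assumption, not the paper's 8x8 matrix).
C = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

# Arbitrary integer input block (assumption).
X = [[5, 11, 8, 10],
     [9, 8, 4, 12],
     [1, 10, 11, 4],
     [19, 6, 15, 7]]

# Row-column form: transform along one dimension, then the other.
Y_direct = matmul(matmul(C, X), transpose(C))

# Kronecker form: one (n*n) x (n*n) matrix applied to the flattened block.
K = kron(C, C)
x = flatten(X)
Y_kron = [sum(K[r][c] * x[c] for c in range(len(x))) for r in range(len(K))]

print(flatten(Y_direct) == Y_kron)
```

Because every entry is an integer, the two results agree exactly. Writing the 2-D transform as one Kronecker-structured operator is what lets a design factor it into stages without an intermediate transpose step, which is the kind of saving the abstract attributes to avoiding transpose memory.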

A Network-Distributed Design Optimization Approach for Aerodynamic Design of a 3-D Wing (3차원 날개 공력설계를 위한 네트워크 분산 설계최적화)

  • Joh, Chang-Yeol;Lee, Sang-Kyung
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.32 no.10
    • /
    • pp.12-19
    • /
    • 2004
  • An aerodynamic design optimization system for a three-dimensional wing was developed as part of a future MDO framework. The system comprises four modules: geometry design, grid generation, flow solver, and optimizer. All modules are based on commercial software and were programmed for automated batch-mode execution using built-in scripting and journaling. The modules were integrated into the system through programming in Visual Basic. A distributed computational environment based on network communication was established to save computational time, especially for the time-consuming aerodynamic analyses. The distributed aerodynamic computations were performed in conjunction with the response surface method, a global optimization algorithm, instead of the usual parallel computation based on domain decomposition. Applied to a drag minimization problem, the design system considerably improved the efficiency of the design process, and the final design showed a reasonable reduction in drag.

Effects of Ni Addition on the Microstructures and Magnetic Properties of Fe70-xPd30Nix High-Temperature Ferromagnetic Shape Memory Alloys

  • Lin, Chien-Feng;Yang, Jin-Bin
    • Journal of Magnetics
    • /
    • v.17 no.2
    • /
    • pp.86-95
    • /
    • 2012
  • This study investigated the effects of adding a third alloying element, Ni, to create $Fe_{70-x}Pd_{30}Ni_x$ (x = 2, 4, 6, 8 at.% Ni) ferromagnetic shape memory alloys (FSMAs). The Ni replaced a portion of the Fe. The alloys were homogenized through hot and cold forging to a ~38% reduction in thickness, then solution-treated (ST) with annealing recrystallization at $1100^{\circ}C$ for 8 h and quenched in ice brine, and finally aged at $500^{\circ}C$ for 100 h. Investigation of the microstructures and magnetostriction indicated that a greater Ni content in the $Fe_{70-x}Pd_{30}Ni_x$ alloys reduced the saturation magnetostriction at room temperature (RT). It was also observed that annealing recrystallization was more difficult to induce. However, with greater Ni addition (x = 6, 8 at.% Ni), the decomposition of the $L1_0+L1_m$ twin phase into stoichiometric $L1_0+L1_m+{\alpha}_{bct}$ structures was suppressed after the $500^{\circ}C$/100 h aging treatment. As a result, the $Fe_{70-x}Pd_{30}Ni_x$ (x = 6, 8 at.% Ni) alloys maintained a high magnetostriction and magnetostrictive susceptibility (${\Delta}{\lambda}{_\parallel}{^s}/{\Delta}H$) after aging at $500^{\circ}C$ for 100 h. This magnetic property makes the $Fe_{70-x}Pd_{30}Ni_x$ (x = 6, 8 at.% Ni) alloys suitable for application in high-temperature (T > $500^{\circ}C$) and high-frequency environments.

Microstructural Characterizations on $(Li_{1/2}Pr_{1/2})TiO_3$ Ceramics ($(Li_{1/2}Pr_{1/2})TiO_3$ 세라믹의 미세구조 평가)

  • Lee, Hwack-Joo;Ryu, Hyun;Park, Hyun-Min;Cho, Yang-Koo;Nahm, Sahn
    • Applied Microscopy
    • /
    • v.32 no.3
    • /
    • pp.257-263
    • /
    • 2002
  • Microstructural investigations of $(Li_{1/2}Pr_{1/2})TiO_3$ (LPT) complex perovskite compounds were carried out using X-ray diffractometry and transmission electron microscopy. LPT exhibits not only ordering of A-site cation deficiencies but also antiphase and in-phase tilting of the oxygen octahedra and antiparallel shifts of the cations. Both antiphase boundaries and ferroelastic domains are present in the microstructure, and spinodal decomposition is also found. The measured dielectric properties were $\varepsilon_r = 84.6$, $Q \times f_o = 776$ GHz, and $\tau_f = -233.66$ ppm/$^{\circ}C$.

A Study on Interaction Modes among Populations in Cooperative Coevolutionary Algorithm for Supply Chain Network Design (공급사슬 네트워크 설계를 위한 협력적 공진화 알고리즘에서 집단들간 상호작용방식에 관한 연구)

  • Han, Yongho
    • Korean Management Science Review
    • /
    • v.31 no.3
    • /
    • pp.113-130
    • /
    • 2014
  • The cooperative coevolutionary algorithm (CCEA) has proven to be a very powerful means of solving optimization problems through problem decomposition. A CCEA uses several populations, each of which aims to find a partial solution for one component of the problem. The populations evolve separately and interact only when individuals are evaluated. Interactions produce complete solutions by combining partial solutions, or collaborators, from each of the populations, and various interaction modes are possible. The goal of this research is to develop a CCEA for a supply chain network design (SCND) problem and to identify which interaction mode gives the best performance for this problem. We present a general design principle of the CCEA for the SCND problem, which requires several co-evolving populations. We classify these populations into two groups and the collaborator selection schemes into two types, random-based and best fitness-based. Combining the two groups of populations with the two types of collaborator selection schemes gives four possible interaction modes. We also consider two modes of updating populations, sequential and parallel. By combining the four interaction modes with the two update modes, we investigate seven possible solution algorithms. Experiments for each of these algorithms are conducted on a few test problems. The results show that applying the best fitness-based collaborator to both groups of populations, combined with the sequential update mode, outperforms the other modes on all the test problems.
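
The interaction and update modes described in this abstract can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the paper's SCND algorithm: it uses two populations of scalar values, a hypothetical coupled objective `cost`, and hypothetical names (`evolve`, `pick`, `ccea`). The `selection` argument switches between best fitness-based and random-based collaborators, and `update` switches between sequential and parallel population updates.

```python
import random

def cost(x, y):
    # Hypothetical coupled toy objective (not an SCND model); lower is better.
    return (x - 3.0) ** 2 + (y + 1.0) ** 2 + 0.5 * (x + y) ** 2

def evolve(pop, evaluate, rng, sigma=0.3):
    """One generation: keep the better half, refill with Gaussian-mutated copies.

    Assumes an even population size so the size stays constant.
    """
    parents = sorted(pop, key=evaluate)[: len(pop) // 2]
    children = [p + rng.gauss(0.0, sigma) for p in parents]
    return parents + children  # element 0 is the best under `evaluate`

def pick(pop, best, selection, rng):
    """Collaborator selection: best fitness-based or random-based."""
    return best if selection == "best" else rng.choice(pop)

def ccea(generations=40, size=10, selection="best", update="sequential", seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-5.0, 5.0) for _ in range(size)]
    ys = [rng.uniform(-5.0, 5.0) for _ in range(size)]
    best_x, best_y = xs[0], ys[0]
    history = [cost(best_x, best_y)]
    for _ in range(generations):
        collab_y = pick(ys, best_y, selection, rng)
        if update == "parallel":
            # Both populations see collaborators from the previous generation.
            collab_x = pick(xs, best_x, selection, rng)
        # Individuals interact only here, when they are evaluated.
        xs = evolve(xs, lambda x: cost(x, collab_y), rng)
        best_x = xs[0]
        if update == "sequential":
            # The second population sees the freshly updated first population.
            collab_x = pick(xs, best_x, selection, rng)
        ys = evolve(ys, lambda y: cost(collab_x, y), rng)
        best_y = ys[0]
        history.append(cost(best_x, best_y))
    return best_x, best_y, history

best_x, best_y, history = ccea()
print(history[0], history[-1])
```

With best fitness-based collaborators and sequential updates, each population keeps its previous best individual, so the recorded cost never increases from one generation to the next. The random-based and parallel variants carry no such guarantee in this sketch, which gives some intuition for why the choice of interaction mode can matter.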

Health monitoring of a new hysteretic damper subjected to earthquakes on a shaking table

  • Romo, L.;Benavent-Climent, A.;Morillas, L.;Escolano, D.;Gallego, A.
    • Earthquakes and Structures
    • /
    • v.8 no.3
    • /
    • pp.485-509
    • /
    • 2015
  • This paper presents the experimental results obtained by applying frequency-domain structural health monitoring techniques to assess the damage suffered by a special type of damper called the Web Plastifying Damper (WPD). The WPD is a hysteretic energy dissipator recently developed for the passive control of structures subjected to earthquakes. It consists of several I-section steel segments connected in parallel. The energy is dissipated through plastic deformations of the web of the I-sections, which constitute the dissipative parts of the damper. WPDs were subjected to successive histories of dynamically imposed cyclic deformations of increasing magnitude on the shaking table of the University of Granada. To assess the damage to the web of the I-section steel segments after each loading history, a new damage index called the Area Index of Damage (AID) was obtained from simple vibration tests. The vibration signals were acquired by means of piezoelectric sensors attached to the I-sections, and non-parametric statistical methods were applied to calculate AID in terms of changes in frequency response functions. The damage index AID was correlated with another, energy-based damage index, ID, which past research has shown to accurately characterize the level of mechanical damage. The ID is rooted in the decomposition of the load-displacement curve experienced by the damper into the so-called skeleton and Bauschinger parts. ID accurately predicts the level of damage and the proximity to failure of the damper, but it requires costly instrumentation. The experiments reported in this paper demonstrate a good correlation between AID and ID in a realistic seismic loading scenario consisting of dynamically applied arbitrary cyclic loads. Based on this correlation, ID can be estimated indirectly from AID, which requires much simpler and less expensive instrumentation.

Computational Algorithm for Nonlinear Large-scale/Multibody Structural Analysis Based on Co-rotational Formulation with FETI-local Method (Co-rotational 비선형 정식화 및 FETI-local 기법을 결합한 비선형 대용량/다물체 구조 해석 알고리듬 개발)

  • Cho, Haeseong;Joo, HyunShig;Lee, Younghun;Gwak, Min-cheol;Shin, SangJoon;Yoh, Jack J.
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.44 no.9
    • /
    • pp.775-780
    • /
    • 2016
  • In this paper, a computational algorithm for improved and versatile structural analysis of large, flexible nonlinear structures is developed. Specifically, a nonlinear finite element based on the co-rotational (CR) framework is developed. A finite element tearing and interconnecting method using local Lagrange multipliers (FETI-local) is then combined with the nonlinear CR finite element. The resulting algorithm is presented and applied to nonlinear static analyses of a cantilevered beam and a multibody structure. Finally, the parallel computation performance of the proposed analysis is evaluated and compared with that of serial computation using the sparse linear solver PARDISO.

Parallel solution of linear systems on the CRAY-2 using multi/micro tasking library (CRAY-2에서 멀티/마이크로 태스킹 라이브러리를 이용한 선형시스템의 병렬해법)

  • Ma, Sang-Back
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.11
    • /
    • pp.2711-2720
    • /
    • 1997
  • Multitasking and microtasking on the CRAY machine provide another way to improve computational power. Since the CRAY-2 has 4 processors, we can achieve a speedup of up to 4 with properly designed algorithms. In this paper we present two parallelizations of linear system solution on the CRAY-2 with the multitasking and microtasking libraries. One is LU decomposition of dense matrices, and the other is the iterative solution of large sparse linear systems with the preconditioner proposed by Radicati di Brozolo. In the first case we realized a speedup of 1.3 with 2 processors for a matrix of dimension 600 with multitasking, and in the second case a speedup of around 3 with 4 processors for a matrix of dimension 8192 with microtasking. In the first case the speedup is limited by the nonuniform vector lengths. In the second case the ILU(0) preconditioner with Radicati's technique seems to realize a reasonably high speedup with 4 processors.
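
To make the dense case concrete, here is a minimal serial sketch of LU decomposition (Doolittle form, no pivoting), written in plain Python with exact fractions. It is not the paper's CRAY-2 code, and the names `lu_decompose` and `parallel_efficiency` are hypothetical. The comments mark the row updates that are independent of one another, which is where a multitasked version could split work. They also show why the vector lengths are nonuniform: the trailing rows shrink as `k` grows.

```python
from fractions import Fraction

def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lu_decompose(a):
    """Return (L, U) with L unit lower triangular and U upper triangular, L*U == a."""
    n = len(a)
    u = [[Fraction(v) for v in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if u[k][k] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        # The updates of rows k+1..n-1 do not depend on each other, so they
        # could be distributed across processors. Each update touches n-k
        # entries, so the vectors get shorter at every step.
        for i in range(k + 1, n):
            factor = u[i][k] / u[k][k]
            l[i][k] = factor
            for j in range(k, n):
                u[i][j] -= factor * u[k][j]
    return l, u

def parallel_efficiency(speedup, processors):
    """Fraction of the ideal speedup that was actually achieved."""
    return speedup / processors

A = [[4, 3, 2],
     [2, 5, 1],
     [1, 2, 6]]
L, U = lu_decompose(A)
print(matmul(L, U) == A)

# Speedups reported in the abstract, expressed as parallel efficiency.
print(parallel_efficiency(1.3, 2), parallel_efficiency(3, 4))
```

Read this way, the reported figures correspond to roughly 65% efficiency for the dense LU case on 2 processors and about 75% for the sparse iterative case on 4 processors.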
