• Title/Summary/Keyword: Parallel Processing method


Variable Time Step Simulation and Analysis of Hydraulic Control Systems using Transmission Line Modeling (전달관로 모델링을 이용한 유압제어 시스템의 가변 시간스텝 시뮬레이션 및 해석)

  • Hwang, Un-Gyu;Jo, Seung-Ho
    • Transactions of the Korean Society of Mechanical Engineers A / v.26 no.5 / pp.843-850 / 2002
  • This paper presents a simulation method that uses transmission line modeling to reduce the simulation runtime of hydraulic control systems. The method separates the system components from each other with transmission line elements prior to simulation, which divides the simulated system into several subsystems that can be integrated more efficiently. It can also handle nonlinearities and discontinuities without a flag signal when restarting integration. By applying a variable integration time step to parallel hydraulic circuits via parallel processing, the simulation runtime is shown to be reduced significantly compared with that of the Runge-Kutta method.

The Voxelization of Surface Objects using File handling and Parallel Processing (파일 및 병렬 처리를 이용한 표면 객체의 복셀화 방안)

  • Lee, Su-Yeol;Ahn, Eun-Young
    • Journal of Korea Multimedia Society / v.18 no.2 / pp.113-119 / 2015
  • This paper suggests an efficient method for building a high-resolution voxelized model from a polygonal surface object. A distinctive strength of the method is that a surface model, however complex, can be transformed into a completely voxelized solid model at various resolutions. This is achieved by producing each voxel from the combined information of candidate voxels detected separately along each 3D axial direction. The method reduces memory complexity by storing the voxel information produced during the two-phase voxelization (surface and inner voxelization) of a surface object in a binary file. For computational efficiency, a parallel process using multiple threads is applied to the inner voxelization, which also reduces running time.
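
The abstract does not spell out how candidate voxels are detected, so the following is only a minimal sketch under an assumed rule: along every grid line parallel to one axis, every voxel between the first and last surface voxel is a candidate, and a voxel is kept only if it is a candidate along all three axes. The names `axis_candidates`, `voxelize_solid`, and `hollow_cube` are hypothetical, the binary-file storage step is omitted, and running one thread per axis is an illustrative choice rather than the paper's scheme.

```python
from concurrent.futures import ThreadPoolExecutor


def axis_candidates(surface, axis):
    """Candidate voxels along one axis (assumed rule, not the paper's).

    For each grid line parallel to `axis`, every voxel between the first
    and last surface voxel on that line is marked as a candidate.
    """
    spans = {}
    for v in surface:
        key = v[:axis] + v[axis + 1:]  # the two coordinates that identify the line
        lo, hi = spans.get(key, (v[axis], v[axis]))
        spans[key] = (min(lo, v[axis]), max(hi, v[axis]))
    candidates = set()
    for key, (lo, hi) in spans.items():
        for t in range(lo, hi + 1):
            candidates.add(key[:axis] + (t,) + key[axis:])
    return candidates


def voxelize_solid(surface):
    """Intersect the per-axis candidate sets to obtain the solid model."""
    # One thread per axis. In CPython the GIL limits the speedup for pure
    # Python code, so this only illustrates the structure of the computation.
    with ThreadPoolExecutor(max_workers=3) as pool:
        per_axis = list(pool.map(lambda a: axis_candidates(surface, a), range(3)))
    return per_axis[0] & per_axis[1] & per_axis[2]


def hollow_cube(n):
    """Surface voxels of an n x n x n cube shell, used as toy input."""
    return {
        (x, y, z)
        for x in range(n) for y in range(n) for z in range(n)
        if 0 in (x, y, z) or n - 1 in (x, y, z)
    }


if __name__ == "__main__":
    shell = hollow_cube(5)
    solid = voxelize_solid(shell)
    print(len(shell), len(solid))  # surface voxels, solid voxels
```

Intersecting the three axial candidate sets is what lets concave shapes be handled better than a single-axis fill, although this simple span rule can still overfill some concavities.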

Parallel Integration for Real-Time Simulation (실시간 시뮬레이션을 위한 병렬적분)

  • Lee, W.S.;Samson, J.
    • Transactions of the Korean Society of Automotive Engineers / v.2 no.1 / pp.106-115 / 1994
  • A parallel integration approach is proposed for real-time simulation of controlled mechanical systems. The proposed approach, which employs the dual-rate integration method in a parallel computing environment, is developed to deal effectively with the stiffness and high-frequency characteristics of controlled mechanical systems. Numerical experiments on two shared-memory multiprocessors, the Alliant FX/8 and Alliant FX/80, demonstrate the effectiveness of the approach.

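
To make the dual-rate idea concrete, here is a minimal sequential sketch, assuming a toy two-state system that is not taken from the paper: a slow state `x` advanced with a large step `H` and a fast, stiff state `y` advanced with `m` smaller substeps of size `H / m`, both with forward Euler. The function name, the test system, and all parameter values are assumptions, and the parallel execution described in the abstract is not reproduced.

```python
def dual_rate_euler(x0, y0, H, m, t_end):
    """Integrate a toy slow/fast system with two step sizes.

    Slow state:  dx/dt = 1 - 0.5*x - 0.5*y   (step H)
    Fast state:  dy/dt = 50 * (x - y)        (step H / m)
    The system and the coefficients are illustrative assumptions.
    """
    x, y = x0, y0
    h = H / m
    n_macro = round(t_end / H)
    for _ in range(n_macro):
        # Fast subsystem: m small steps with the slow state held constant.
        for _ in range(m):
            y += h * 50.0 * (x - y)
        # Slow subsystem: one large step using the updated fast state.
        x += H * (1.0 - 0.5 * x - 0.5 * y)
    return x, y


if __name__ == "__main__":
    # Forward Euler with step H = 0.05 alone would be unstable for the fast
    # state (50 * 0.05 > 2), while the substep 0.005 is stable.
    print(dual_rate_euler(0.0, 0.0, H=0.05, m=10, t_end=10.0))
```

Only the fast state pays for the small step, which is the point of a dual-rate scheme; in a parallel setting the two subsystems could be assigned to different processors and synchronized at each large step.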

Design of Low-power Serial-to-Parallel and Parallel-to-Serial Converter using Current-cut method (전류 컷 기법을 적용한 저전력형 직병렬/병직렬 변환기 설계)

  • Park, Yong-Woon;Hwang, Sung-Ho;Cha, Jae-Sang;Yang, Chung-Mo;Kim, Sung-Kweon
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.10A / pp.776-783 / 2009
  • A current-cut circuit is an effective way to achieve low power consumption in wireless communication systems such as high-speed OFDM. Operating a current-mode FFT LSI with analog signal processing requires a current-mode serial-to-parallel/parallel-to-serial converter with a multi-input, multi-output structure. However, the hold-mode operation of such a converter consumes power unnecessarily. We propose a novel current-mode serial-to-parallel/parallel-to-serial converter with a current-cut circuit; full-chip simulation results agree with experimental data showing low power consumption. The proposed converter promises wide application of current-mode analog signal processing in low-power wireless communication LSIs.

From WiFi to WiMAX: Efficient GPU-based Parameterized Transceiver across Different OFDM Protocols

  • Li, Rongchun;Dou, Yong;Zhou, Jie;Li, Baofeng;Xu, Jinbo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.8 / pp.1911-1932 / 2013
  • Orthogonal frequency-division multiplexing (OFDM) has become a popular modulation scheme for wireless protocols because of its spectral efficiency and robustness against multipath interference. Although the components of various OFDM protocols are functionally similar, they remain distinct because of the characteristics of the environment. Recently, graphics processing units (GPUs) have been used to accelerate the signal processing of the physical layer (PHY) because of their great computational power, high development efficiency, and flexibility. In this paper, we describe the implementation of parameterized baseband modules using GPUs for two different OFDM protocols, namely, 802.11a and 802.16. First, we introduce various modules in the modulator/demodulator parts of the transmitter and receiver and analyze the computational complexity of each module. We then describe the integration of the GPU-based baseband modules of the two protocols using the parameterized method. GPU-based implementations are presented to explain how to accelerate the baseband processing to achieve real-time throughput. Finally, the performance results of each signal processing module are evaluated and analyzed. The experiments show that the GPU-based 802.11a and 802.16 PHY meet the real-time requirement and demonstrate good bit error ratio (BER) performance. The performance comparison indicates that our GPU-based modules have better flexibility and throughput than existing ones.
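
The idea of one parameterized module serving two protocols can be illustrated with a small CPU-only sketch. It assumes that the same modulator and demodulator code is driven by a per-protocol FFT size and cyclic prefix length; the `PROFILES` table, the function names, and the simplification that every subcarrier carries data (no pilots or guard bands) are assumptions for illustration, and the GPU kernels of the paper are not reproduced. A naive DFT stands in for an FFT to keep the example self-contained.

```python
import cmath

# Assumed illustrative parameters: (FFT size, cyclic prefix length in samples).
PROFILES = {
    "802.11a": (64, 16),
    "802.16": (256, 64),
}


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def ofdm_modulate(symbols, n_fft, cp_len):
    """Map one block of subcarrier symbols to time samples with a cyclic prefix."""
    if len(symbols) != n_fft:
        raise ValueError("expected exactly n_fft subcarrier symbols")
    time = dft(symbols, inverse=True)
    return time[-cp_len:] + time  # prepend the last cp_len samples


def ofdm_demodulate(samples, n_fft, cp_len):
    """Drop the cyclic prefix and return the subcarrier symbols."""
    return dft(samples[cp_len:cp_len + n_fft])


if __name__ == "__main__":
    for name, (n_fft, cp_len) in PROFILES.items():
        # A fixed QPSK pattern on every subcarrier.
        qpsk = [complex(1 - 2 * (k % 2), 1 - 2 * ((k // 2) % 2)) for k in range(n_fft)]
        tx = ofdm_modulate(qpsk, n_fft, cp_len)
        rx = ofdm_demodulate(tx, n_fft, cp_len)
        err = max(abs(a - b) for a, b in zip(qpsk, rx))
        print(name, len(tx), err < 1e-6)
```

Keeping the protocol differences in a parameter table rather than in separate code paths is the design choice the abstract refers to as the parameterized method.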

A Parallel Processing Method for Partial Nodes in R*-tree Using GPU (GPU를 활용한 R*-tree에서의 부분 노드 병렬 처리 방법)

  • Kim, Seong;Oh, Byoung-Woo
    • Spatial Information Research / v.20 no.6 / pp.139-144 / 2012
  • The R*-tree manages hierarchical nodes for efficient access to spatial data. We propose a method that keeps a subset of the R*-tree nodes in GPU memory to improve efficiency through parallel processing. The proposed method attempts to load as many nodes as possible into GPU memory. New nodes are inserted to manage the remaining R*-tree nodes in main memory. The experimental results show that the proposed method is more efficient than a main-memory-based R*-tree.

An Efficient Face Detection Method using Skin Color Information and Parallel Processing in Multi-Core SoC (멀티코어 SoC에서 피부색상 정보와 병렬처리를 이용한 효율적인 얼굴 검출 방법)

  • Kim, Hong-Hee;Lee, Jae-Heung
    • Journal of IKEEE / v.16 no.4 / pp.375-381 / 2012
  • In this paper, we present an implementation of the Viola-Jones algorithm on a multi-core SoC that uses skin color information and a parallel processing method. To reduce unnecessary operations and improve detection speed, we adopt a face detection algorithm that relies on skin color and removes the background image. The algorithm is functionally divided into several parts, taking their size and dependencies into account, so that the parts can be executed in parallel. Experimental results on an SoC with a built-in multi-core Cortex-A9 show that it is about 1.8 times faster than the existing, undivided algorithm.
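
A skin-color prefilter of the kind described can be sketched as follows. The abstract gives no color space or thresholds, so the YCbCr conversion and the Cb/Cr ranges below are assumptions based on commonly used values, and `rgb_to_cbcr`, `is_skin`, and `skin_bounding_box` are hypothetical names. The sketch only finds the region a detector would need to scan; the Viola-Jones stage and the multi-core split are not shown.

```python
def rgb_to_cbcr(r, g, b):
    """Chroma components of an 8-bit RGB pixel (full-range BT.601 conversion)."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Threshold test in the CbCr plane; the ranges are assumed, not from the paper."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


def skin_bounding_box(image):
    """Return (top, left, bottom, right), inclusive, of all skin pixels, or None.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    rows = [y for y, row in enumerate(image) if any(is_skin(*p) for p in row)]
    cols = [x for row in image for x, p in enumerate(row) if is_skin(*p)]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


if __name__ == "__main__":
    blue, skin = (0, 0, 255), (220, 170, 140)
    image = [[blue] * 4 for _ in range(4)]
    image[1][1] = skin
    image[2][2] = skin
    print(skin_bounding_box(image))
```

Restricting the detector to the returned box is one way the skin-color step can cut unnecessary operations before the more expensive classification runs.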

A Road Region Extraction Using OpenCV CUDA To Advance The Processing Speed (처리 속도 향상을 위해 OpenCV CUDA를 활용한 도로 영역 검출)

  • Lee, Tae-Hee;Hwang, Bo-Hyun;Yun, Jong-Ho;Choi, Myung-Ryul
    • Journal of Digital Convergence / v.12 no.6 / pp.231-236 / 2014
  • In this paper, we propose improving processing speed by adding device-based (graphics card) parallel processing to a road region extraction that otherwise runs as host-based (PC) serial processing. OpenCV CUDA supports many parallel processing functions by linking conventional OpenCV with CUDA. In addition, when OpenCV and CUDA are linked, the configured OpenCV functions are optimized for the specifications of the user's device (graphics card). OpenCV CUDA therefore makes it easier to verify algorithms and to obtain simulation results. Experiments using OpenCV CUDA and an NVIDIA GeForce GTX 560 Ti graphics card verify that the proposed method is about 3.09 times faster than the conventional method.

The Design and Implementation of the ParaC Language (ParaC 언어의 설계 및 구현)

  • Lee, Kyoung-Seok;Woo, Young-Choon;Kim, Jin-Mee;Chi, Dong-Hae
    • The Transactions of the Korea Information Processing Society / v.4 no.11 / pp.2903-2913 / 1997
  • This paper describes the design and implementation of the ParaC language, which supports parallel programming on shared-memory and distributed-memory parallel machines. ParaC is designed for the effective use of the system resources of scalable parallel systems. This goal is achieved by adding parallel and synchronization constructs for shared address spaces and remote task constructs for distributed address spaces. The paper also presents the translation method and the implementation of the translator and the run-time library for parallel execution of the extended constructs.


Comparison of Parallel Preconditioners for Solving Large Sparse Linear Systems on a Massively Parallel Machine (대형이산 행렬 시스템의 초대형병렬컴퓨터에서의 해법을 위한 병렬준비 행렬의 비교)

  • Ma, Sang-Baek
    • The Transactions of the Korea Information Processing Society / v.2 no.4 / pp.535-542 / 1995
  • In this paper we present two preconditioners for solving large sparse linear systems arising from elliptic partial differential equations on massively parallel machines, such as the CM-5. Most massively parallel machines rely heavily on message passing for interprocessor communication, but by current manufacturing standards the cost of communication is very high compared with that of floating-point arithmetic. We therefore need an algorithm that minimizes the amount of interprocessor communication on massively parallel machines. By conducting experiments on the CM-5, we show that the Block SOR (Successive Over-Relaxation) method coupled with the multi-coloring technique is one such preconditioner. We also implemented the ADI (Alternating Direction Implicit) method on the CM-5, which has conventionally been one of the most powerful parallel preconditioners. Our experiments show that the Block SOR method coupled with the multi-coloring technique can yield a speedup with 50% efficiency as the number of processors ranges from 16 to 512, for a matrix of dimension 512x512. The ADI method, on the other hand, performs very poorly.

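
The multi-coloring idea can be shown with its simplest instance, red-black ordering. The sketch below is point SOR with two colors applied as a stand-alone solver for the five-point discretization of -∇²u = f on a square grid, which is a simplification of the Block SOR preconditioner in the paper; the function name and all parameter values are assumptions. All points of one color depend only on points of the other color, so each half-sweep could be updated in parallel with a single exchange of neighbor values.

```python
def red_black_sor(u, f, h, omega=1.5, sweeps=200):
    """Red-black point SOR for -laplace(u) = f with fixed boundary values.

    `u` is a square list-of-lists grid whose boundary entries hold the
    Dirichlet data; interior entries are updated in place.
    """
    n = len(u)
    for _ in range(sweeps):
        for color in (0, 1):  # 0 = "red" points, 1 = "black" points
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 != color:
                        continue
                    # Gauss-Seidel value from the four neighbors, all of
                    # which have the opposite color.
                    gs = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                 + u[i][j - 1] + u[i][j + 1]
                                 + h * h * f[i][j])
                    u[i][j] += omega * (gs - u[i][j])
    return u


if __name__ == "__main__":
    n = 10
    h = 1.0 / (n - 1)
    # Laplace problem: boundary fixed at 1, interior started at 0.
    u = [[1.0 if i in (0, n - 1) or j in (0, n - 1) else 0.0 for j in range(n)]
         for i in range(n)]
    f = [[0.0] * n for _ in range(n)]
    red_black_sor(u, f, h)
    print(max(abs(u[i][j] - 1.0) for i in range(n) for j in range(n)))
```

Because updates within one color are independent, the communication needed per sweep is limited to two neighbor exchanges, which is the property that makes colored orderings attractive when communication is expensive.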