• Title/Summary/Keyword: Parallel Processing method


Acceleration of Mesh Denoising Using GPU Parallel Processing (GPU의 병렬 처리 기능을 이용한 메쉬 평탄화 가속 방법)

  • Lee, Sang-Gil;Shin, Byeong-Seok
    • Journal of Korea Game Society / v.9 no.2 / pp.135-142 / 2009
  • Mesh denoising removes noise from a mesh by applying various filters. These methods are usually slow because the filtering runs on the CPU. Because the GPU is specialized for floating-point operations and is faster than the CPU for such workloads, real-time processing of complex operations becomes possible. Mesh denoising is particularly well suited to GPU parallel processing since it repeats the same operations for every vertex or triangle. In this paper, we propose a mesh denoising algorithm based on bilateral filtering that uses GPU parallel processing to reduce processing time. It finds the neighboring triangles of each vertex and computes the vertex normal vector. It then performs bilateral filtering to estimate the new vertex position and update its normal vector.

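
The steps named in the abstract above (gather each vertex's neighboring triangles, compute its normal, then apply a bilateral filter along that normal) can be sketched on the CPU as follows. This is a generic vertex-based bilateral filter, not the paper's GPU implementation; the function names and the parameters `sigma_c` and `sigma_s` are assumptions made for illustration.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vertex_normals(verts, tris):
    """Area-weighted vertex normals from the triangles incident to each vertex."""
    sums = [(0.0, 0.0, 0.0)] * len(verts)
    for i, j, k in tris:
        # The unnormalized cross product has length 2 * triangle area.
        face = cross(sub(verts[j], verts[i]), sub(verts[k], verts[i]))
        for v in (i, j, k):
            sums[v] = tuple(s + f for s, f in zip(sums[v], face))
    normals = []
    for n in sums:
        length = math.sqrt(dot(n, n)) or 1.0
        normals.append(tuple(c / length for c in n))
    return normals

def bilateral_step(verts, tris, sigma_c=1.0, sigma_s=0.5):
    """One filtering pass; every vertex reads only the old positions."""
    normals = vertex_normals(verts, tris)
    neighbors = [set() for _ in verts]
    for tri in tris:
        for v in tri:
            neighbors[v].update(u for u in tri if u != v)
    new_verts = []
    for v, (p, n) in enumerate(zip(verts, normals)):
        total = weight = 0.0
        for q in neighbors[v]:
            d = sub(verts[q], p)
            h = dot(n, d)  # signed height of the neighbor along the normal
            w = (math.exp(-dot(d, d) / (2 * sigma_c ** 2))
                 * math.exp(-h * h / (2 * sigma_s ** 2)))
            total += w * h
            weight += w
        offset = total / weight if weight else 0.0
        new_verts.append(tuple(pc + offset * nc for pc, nc in zip(p, n)))
    return new_verts
```

Because each vertex update depends only on the previous positions, the outer loop has no cross-iteration dependency, which is the property that makes this kind of filter a natural fit for one-thread-per-vertex execution.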

Development of Parallel Event-Driven Remote IT Convergence (병렬 이벤트 기반 원격 IT 융합 개발)

  • Kim, Jung-Sook;Kim, Sung-Wan;Kim, Hong-Sup
    • Journal of the Korea Society of Computer and Information / v.15 no.12 / pp.1-9 / 2010
  • This paper describes parallel event-driven remote IT convergence applications, which combine traditional industry with IT technology, including advanced communication. In an IT convergence system, events can occur concurrently from many device sensors or users, so the system must have a parallel processing method. In this paper, the parallel processing method was implemented using threads, and we developed a connection method between a device and its mode of communication, which is either wireless communication or power line communication. In addition, we developed object, device, user, and event models based on XML (eXtensible Markup Language) using an object-oriented modeling method. To show results efficiently in real time, the system provides various graphical user interfaces such as a bar graph, a table, and a combination of the two.
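
As a rough illustration of thread-based handling of concurrent events, the sketch below drains a queue of events with a small pool of worker threads. It is not the paper's system: the event format, the `flag_high` handler, and the sample sensor readings are all hypothetical.

```python
import queue
import threading

def run_event_workers(events, handler, num_workers=4):
    """Process (event_id, payload) pairs concurrently; returns {event_id: result}."""
    pending = queue.Queue()
    for event in events:
        pending.put(event)

    results = {}
    lock = threading.Lock()  # protects the shared results dictionary

    def worker():
        while True:
            try:
                event_id, payload = pending.get_nowait()
            except queue.Empty:
                return  # no more events to handle
            outcome = handler(payload)
            with lock:
                results[event_id] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Hypothetical sensor events: (event id, (device name, reading)).
sample_events = [(1, ("thermo", 21.5)), (2, ("door", 1.0)), (3, ("thermo", 35.0))]

def flag_high(payload):
    """Hypothetical handler: flag thermometer readings above 30.0."""
    device, value = payload
    return device == "thermo" and value > 30.0
```

Here all events are queued before the workers start; a long-running system would instead keep the workers alive and block on the queue as events arrive.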

Parallel-Addition Convolution Algorithm in Grayscale Image (그레이스케일 영상의 병렬가산 컨볼루션 알고리즘)

  • Choi, Jong-Ho
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.10 no.4 / pp.288-294 / 2017
  • Recently, deep learning using convolutional neural networks (CNNs) has been studied extensively in image recognition. Convolution consists of addition and multiplication. In hardware implementations, multiplication is computationally expensive relative to addition, and it is also an important factor limiting chip design in embedded deep learning systems. In this paper, I propose a parallel-addition processing algorithm that converts a grayscale image into a superposition of binary images and performs convolution with addition only. Experiments to verify the proposed algorithm confirm that convolution can be performed by the parallel-addition method, which can reduce processing time.
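
One way to read "a superposition of binary images" is a bit-plane decomposition, in which an 8-bit image equals the sum of its binary planes weighted by powers of two. The sketch below assumes that reading (the abstract does not specify the decomposition) and uses an integer kernel; inside each plane a kernel weight is either added or skipped, and the planes are recombined with bit shifts. The function names are hypothetical, and the window is applied unflipped, as is common in CNNs.

```python
def conv_direct(img, kernel):
    """Reference sliding-window result computed with multiplication."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    return [[sum(img[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)]

def conv_bitplane(img, kernel, bits=8):
    """Same result using only additions and shifts, one bit plane at a time."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    out = [[0] * (w - kw + 1) for _ in range(h - kh + 1)]
    for b in range(bits):  # each plane is independent of the others
        for y in range(h - kh + 1):
            for x in range(w - kw + 1):
                acc = 0
                for i in range(kh):
                    for j in range(kw):
                        # The plane's pixel is 0 or 1: add the weight or skip it.
                        if (img[y + i][x + j] >> b) & 1:
                            acc += kernel[i][j]
                out[y][x] += acc << b  # weight the plane by 2**b
    return out
```

Since no plane depends on another, the outer loop over `b` is the part that could run in parallel in hardware.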

An Efficient Solution Method to MDO Problems in Sequential and Parallel Computing Environments (순차 및 병렬처리 환경에서 효율적인 다분야통합최적설계 문제해결 방법)

  • Lee, Se-Jung
    • Korean Journal of Computational Design and Engineering / v.16 no.3 / pp.236-245 / 2011
  • Many researchers have recently studied multi-level formulation strategies for solving multidisciplinary design optimization (MDO) problems, which distribute the coupling compatibilities across all disciplines, whereas single-level formulations concentrate all control at the system level. In addition, approximation techniques have become a remedy for computationally expensive analyses and simulations. This paper compares MDO methods with respect to computing performance in both conventional sequential and modern distributed/parallel processing environments. The comparisons show that the Individual Disciplinary Feasible (IDF) formulation is the most efficient for sequential processing and that IDF with approximation (IDFa) is the most efficient for parallel processing. Results on popular design examples support this finding. The author suggests that design engineers first choose the IDF formulation to solve MDO problems because it is simple to implement and performs reasonably well. The single drawback of IDF is that it requires more memory for local design variables and coupling variables. Adding inexpensive memory can save engineers the valuable time and effort that complicated multi-level formulations demand, and it frees them from the no-solution problem of the Multi-Disciplinary Analysis (MDA) in the Multi-Disciplinary Feasible (MDF) formulation.

A NOVEL PARALLEL METHOD FOR SPECKLE MASKING RECONSTRUCTION USING THE OPENMP

  • LI, XUEBAO;ZHENG, YANFANG
    • Journal of The Korean Astronomical Society / v.49 no.4 / pp.157-162 / 2016
  • High-resolution reconstruction techniques such as speckle masking have been developed to enhance the spatial resolution of observational images from ground-based solar telescopes. Near real-time reconstruction performance has been achieved on a high-performance cluster using the Message Passing Interface (MPI). However, much of the time in such a speckle reconstruction is spent reconstructing solar subimages. We design and implement a novel parallel method for speckle masking reconstruction of a solar subimage on a shared-memory machine using OpenMP. Real tests are performed to verify the correctness of our code. We present the details of several parallel reconstruction steps. The parallel implementation across the various modules shows a large speed increase compared with the single-threaded serial implementation, and a speedup of about 2.5 is achieved in one subimage reconstruction. The timing results for reconstructing one subimage of 256×256 pixels show a clear advantage with a greater number of threads. This novel parallel method can be valuable in real-time reconstruction of solar images, especially after porting to a high-performance cluster.

Development of In-Plane Strength Analysis Software for Composite Laminated Structure with Parallel Processing Technique (병렬처리 기법을 이용한 복합재 적층 구조물의 면내 파손 해석 소프트웨어 개발)

  • Jung, Yeji;Choi, Soo Young;Ahn, Hyon Su;Ha, Seok Wun;Moon, Yong Ho
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.46 no.2 / pp.133-140 / 2018
  • In this paper, we develop automated software for in-plane structural analysis of composite laminated structures. The software supports various failure criteria and reports the analysis results with the user's convenience in mind. It also provides a batch job analysis function based on a parallel processing technique. To verify the performance of the software, we compared the margin of safety (MS) calculated by the software with the values obtained from an in-house method and from a specimen experiment. The results differed from the in-house method by less than 0.01 and were within about ±10% of the specimen experiment. In addition, we confirmed that the parallel processing technique improves the execution speed of batch job analysis.

An Optimization Method for Hologram Generation on Multiple GPU-based Parallel Processing (다중 GPU기반 홀로그램 생성을 위한 병렬처리 성능 최적화 기법)

  • Kook, Joongjin
    • Smart Media Journal / v.8 no.2 / pp.9-15 / 2019
  • Since the computational complexity of hologram generation increases exponentially with the size of the point cloud, parallel processing on multiple GPUs using the CUDA and/or OpenCL libraries has recently become popular. A CUDA kernel for parallelization must be organized into threads, blocks, and grids that suit the number of cores and the memory size of the GPU. In multi-GPU environments, the work must additionally be distributed grid by grid, block by block, or thread by thread according to the number of GPUs. To evaluate the performance of computer-generated hologram (CGH) generation, we compared the computational speed in CPU, single-GPU, and multi-GPU environments while gradually increasing the number of points in the point cloud from 10 to 1,000,000. We also present a memory structure design and a calculation method required in CUDA-based parallel processing to accelerate CGH generation in multi-GPU environments.

Parallel Processing of k-Means Clustering Algorithm for Unsupervised Classification of Large Satellite Images: A Hybrid Method Using Multicores and a PC-Cluster (대용량 위성영상의 무감독 분류를 위한 k-Means Clustering 알고리즘의 병렬처리: 다중코어와 PC-Cluster를 이용한 Hybrid 방식)

  • Han, Soohee;Song, Jeong Heon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.6 / pp.445-452 / 2019
  • In this study, parallel processing codes for the k-means clustering algorithm were developed and implemented on a PC-cluster for unsupervised classification of large satellite images. We implemented intra-node code using the multiple cores of a CPU (Central Processing Unit) based on OpenMP (Open Multi-Processing), inter-node code using a PC-cluster based on the Message Passing Interface (MPI), and hybrid code using both. The PC-cluster consists of one master node and eight slave nodes, and each node is equipped with eight cores. Two operating systems, Microsoft Windows and Canonical Ubuntu, were installed on the PC-cluster in turn and tested to compare parallel processing performance. Two multispectral satellite images were tested: a medium-capacity LANDSAT 8 OLI (Operational Land Imager) image and a high-capacity Sentinel 2A image. To evaluate the performance of parallel processing, speedup and efficiency were measured. Overall, the speedup was over N/2 and the efficiency was over 0.5. In the comparison of the two operating systems, the Ubuntu system was two to three times faster. To confirm that the results of sequential and parallel processing coincide with each other, the center value of each band and the number of classified pixels were compared, and the result images were examined by pixel-to-pixel comparison. It was found that care should be taken to avoid false sharing with OpenMP in the intra-node implementation. To process large satellite images on a PC-cluster, code and hardware should be designed to reduce the performance degradation caused by file I/O. It was also found that performance can differ depending on the operating system installed on the PC-cluster.
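
The two metrics reported above have standard definitions: speedup is the sequential run time divided by the parallel run time, and efficiency is speedup divided by the number of workers. The sketch below assumes N denotes that worker count, which the abstract does not state, and the timings in the usage lines are made up for illustration.

```python
def speedup(t_sequential, t_parallel):
    """Ratio of sequential run time to parallel run time."""
    return t_sequential / t_parallel

def efficiency(t_sequential, t_parallel, n_workers):
    """Speedup per worker; 1.0 would mean perfect linear scaling."""
    return speedup(t_sequential, t_parallel) / n_workers

# Hypothetical timings in seconds, not taken from the paper.
s = speedup(100.0, 20.0)        # 5.0
e = efficiency(100.0, 20.0, 8)  # 0.625
```

Under these definitions, a speedup above N/2 and an efficiency above 0.5 are the same condition, which matches the way the abstract pairs the two figures.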

Interfacing the Visual Projector to PC using the Parallel Port (PC 병렬 포트를 이용한 실물화상기 인터페이스)

  • 이재혁
    • Proceedings of the IEEK Conference / 2000.06c / pp.173-176 / 2000
  • In this study, a new multimedia data converter is proposed. A method for interfacing with a PC through its parallel port is also suggested. Image compression and decompression are based on the JPEG algorithm, which is widely used for effective compression in the image processing industry. The suggested interfacing method is based on the IEEE 1284 and IEEE 1284.3 protocols, which are standards for the PC parallel port interface.


A Parallel Genetic Algorithms with Diversity Controlled Migration and its Applicability to Multimodal Function Optimization

  • YAMAMOTO, Fujio;ARAKI, Tomoyuki
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 1998.06a / pp.629-633 / 1998
  • Proposed here is a parallel genetic algorithm with intermittent migration among subpopulations. It is intended to maintain diversity in the population for a long period. The method was applied to finding the global maximum of some multimodal functions for which no other methods seem to be useful. Favorable results and their detailed analysis are also presented.
