• Title/Summary/Keyword: GPU algorithm

The Heightfield Based Cartoon Liquid Simulation Using GPU (GPU를 이용한 높이맵 기반 카툰 액체 시뮬레이션)

  • Song, gi-won;Ryoo, seung-tack
    • Proceedings of the Korea Contents Association Conference / 2009.05a / pp.152-156 / 2009
  • The goal of this study is a heightfield-based cartoon liquid simulation on the GPU. To this end, we examine the optical and flow features of liquid. Non-photorealistic rendering (NPR) of gases and liquids has been an active research topic in computer graphics. In this study, the flow of the fluid is represented with a simple non-physical algorithm, and the optical properties of the liquid are represented by the ratio of reflection to refraction given by the Fresnel equations. Unlike existing methods, the cartoon representation of the liquid uses the reflection and refraction values on the environment map. Edge detection is used to represent the jump edges of the liquid. As a result, the liquid simulation can be rendered in a cartoon style.

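The reflection-to-refraction ratio mentioned in the abstract above can be illustrated with the standard Fresnel equations for unpolarized light at a dielectric interface. This is a minimal sketch, not the authors' shader: the function names, the air-to-water refractive indices (1.0 and 1.33), and the linear color blend are assumptions made for illustration.

```python
import math

def fresnel_reflectance(cos_i, n1=1.0, n2=1.33):
    """Fraction of unpolarized light reflected at an n1 -> n2 interface.

    cos_i is the cosine of the incidence angle. The default indices
    (air to water) are illustrative assumptions.
    """
    cos_i = max(0.0, min(1.0, cos_i))
    # Snell's law gives sin^2 of the transmission angle.
    sin_t2 = (n1 / n2) ** 2 * (1.0 - cos_i * cos_i)
    if sin_t2 >= 1.0:
        return 1.0  # total internal reflection
    cos_t = math.sqrt(1.0 - sin_t2)
    r_s = (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t)
    r_p = (n1 * cos_t - n2 * cos_i) / (n1 * cos_t + n2 * cos_i)
    return 0.5 * (r_s * r_s + r_p * r_p)

def blend(reflected_rgb, refracted_rgb, cos_i):
    """Mix two environment-map samples by the Fresnel reflectance (hypothetical helper)."""
    r = fresnel_reflectance(cos_i)
    return tuple(r * a + (1.0 - r) * b for a, b in zip(reflected_rgb, refracted_rgb))
```

Looking straight down at the surface (`cos_i = 1.0`) gives a reflectance of about 2%, so the refracted color dominates; at grazing angles the reflectance approaches 1.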

A Study on GPGPU Performance Improvement Technique on GCN Architecture Using OpenCL API (GCN 아키텍쳐 상에서의 OpenCL을 이용한 GPGPU 성능향상 기법 연구)

  • Woo, DongHee;Kim, YoonHo
    • The Journal of Society for e-Business Studies / v.23 no.1 / pp.37-45 / 2018
  • The systems on which a variety of programs run have continuously expanded from conventional single-core and multi-core systems to many-core and heterogeneous systems. However, existing research has focused mostly on parallelizing programs with the CUDA framework and rarely on optimization for AMD GCN-based GPUs. To address this gap, our study focuses on optimization techniques for the GCN architecture in a GPGPU environment and achieves a performance improvement. Specifically, using the techniques we propose, we reduce the computation time of matrix multiplication and convolution algorithms on the GPGPU by more than 30%. We also increase kernel throughput by more than 40%.

Search for broadband extended gravitational-wave emission bursts in LIGO S6 in 350-2000 Hz by GPU acceleration

  • van Putten, Maurice H.P.M.
    • The Bulletin of The Korean Astronomical Society / v.42 no.1 / pp.37.3-37.3 / 2017
  • We present a novel GPU accelerated search algorithm for broadband extended gravitational-wave emission (BEGE) with better than real-time analysis of H1-L1 LIGO S6 data. It performs matched filtering with over 8 million one-second duration chirps. Parseval's Theorem is used to predict the standard deviation $\sigma$ of filter output, taking advantage of near-Gaussian LIGO (H1,L1)-data in the high frequency range of 350-2000 Hz. A multiple of $\sigma$ serves as a threshold to filter output back to the central processing unit. This algorithm attains 80% efficiency, normalized to the Fast Fourier Transform (FFT). We apply it to a blind, all-sky search for BEGE in LIGO data, such as may be produced by long gamma-ray bursts and superluminous supernovae. We report on mysterious features that are excluded by exact simultaneous occurrence. Our results are consistent with no events within a radius of about 20 Mpc.

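One way to read the thresholding step in the abstract above: if the data are approximately white Gaussian noise with standard deviation s, the output of correlating them with a template h has standard deviation s·sqrt(Σ h²), and Parseval's theorem lets Σ h² be evaluated from the template's spectrum. The sketch below illustrates that idea on the CPU under the white-noise assumption. It is not the published pipeline; all function names and the factor `kappa` are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def predicted_sigma(template, noise_sigma):
    """Predict the std of the filter output for white noise of std noise_sigma.

    Parseval: sum(|h[k]|^2) == sum(|H[m]|^2) / N, so the template energy
    can be read off its spectrum.
    """
    spectrum = dft(template)
    energy = sum(abs(c) ** 2 for c in spectrum) / len(template)
    return noise_sigma * math.sqrt(energy)

def correlate(data, template):
    """Sliding correlation of data with the template (time domain)."""
    m = len(template)
    return [sum(template[k] * data[i + k] for k in range(m))
            for i in range(len(data) - m + 1)]

def above_threshold(output, sigma, kappa=5.0):
    """Keep only (index, value) pairs whose magnitude exceeds kappa * sigma."""
    return [(i, y) for i, y in enumerate(output) if abs(y) > kappa * sigma]
```

Only the samples returned by `above_threshold` would need to travel back to the host, which is the role the abstract assigns to the multiple of $\sigma$.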

Twowheeled Motor Vehicle License Plate Recognition Algorithm using CPU based Deep Learning Convolutional Neural Network (CPU 기반의 딥러닝 컨볼루션 신경망을 이용한 이륜 차량 번호판 인식 알고리즘)

  • Kim Jinho
    • Journal of Korea Society of Digital Industry and Information Management / v.19 no.4 / pp.127-136 / 2023
  • Many research results on traffic enforcement against illegal driving of two-wheeled motor vehicles using license plate recognition have been introduced. Deep learning convolutional neural networks can be used for character and word recognition of license plates because they generalize better than traditional backpropagation neural networks. The plates of two-wheeled motor vehicles include interdependent government and city words. If we implement mutually independent word recognizers and apply error correction rules to the two word recognition results, efficient license plate recognition results can be derived. A CPU-based convolutional neural network implemented without a library and running in real time has the advantage of low-cost deployment compared to a GPU-based convolutional neural network that relies on a library. In this paper, a two-wheeled motor vehicle license plate recognition algorithm using a CPU-based deep learning convolutional neural network is introduced. The experimental results show that the proposed plate recognizer achieves a 96.2% success rate on outdoor two-wheeled motor vehicle images in real time.

Development of GPU-Paralleled multi-resolution techniques for Lagrangian-based CFD code in nuclear thermal-hydraulics and safety

  • Do Hyun Kim;Yelyn Ahn;Eung Soo Kim
    • Nuclear Engineering and Technology / v.56 no.7 / pp.2498-2515 / 2024
  • In this study, we propose a fully parallelized adaptive particle refinement (APR) algorithm for smoothed particle hydrodynamics (SPH) to construct a stable and efficient multi-resolution computing system for nuclear safety analysis. The APR technique, widely employed by SPH research groups to adjust local particle resolutions, currently operates on a serialized algorithm. However, this serialized approach diminishes the computational efficiency of the system, negating the advantages of acceleration achieved through high-performance computing devices. To address this drawback, we propose a fully parallelized APR algorithm designed to enhance both efficiency and computational accuracy, facilitated by a new adaptive smoothing length model. For model validation, we simulated both hydrostatic and hydrodynamic benchmark cases in 2D and 3D environments. The results demonstrate improved computational efficiency compared to the conventional SPH method and APR with a serialized algorithm, and the model's accuracy was confirmed, revealing favorable outcomes near the resolution interface. Through the analysis of jet breakup, we verified the performance and accuracy of the model, emphasizing its applicability in practical nuclear safety analysis.

Parallel Range Query processing on R-tree with Graphics Processing Units (GPU를 이용한 R-tree에서의 범위 질의의 병렬 처리)

  • Yu, Bo-Seon;Kim, Hyun-Duk;Choi, Won-Ik;Kwon, Dong-Seop
    • Journal of Korea Multimedia Society / v.14 no.5 / pp.669-680 / 2011
  • R-trees are widely used in areas such as geographical information systems, CAD systems, and spatial databases to index multi-dimensional data efficiently. As the data sets used in these areas grow in size and complexity, however, range query operations on R-trees need to be even faster to meet area-specific constraints. To address this problem, various research efforts have developed strategies for accelerating query processing on R-trees, either by using a buffer mechanism or by parallelizing query processing across multiple disks and processors. As part of these strategies, approaches that parallelize query processing on R-trees with Graphics Processing Units (GPUs) have been explored. The use of GPUs may improve performance through faster calculations and fewer disk accesses, but may also incur additional overhead from high memory access latencies and a low data exchange rate between the GPU and the CPU. In this paper, to address these overhead problems and to use GPUs efficiently, we propose a novel approach that uses a GPU as a buffer to parallelize query processing on R-trees. The buffer algorithm can improve performance by reducing the number of disk accesses and by maximizing coalesced memory access, which minimizes GPU memory access latencies. Through extensive performance studies, we observed that the proposed approach achieved up to 5 times higher query performance than the original CPU-based R-trees.
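For readers unfamiliar with the operation being accelerated, the following is a minimal CPU-side sketch of a range query over an R-tree-like structure. It is not the buffer algorithm proposed in the paper; the `Node` layout, the `Rect` tuple convention, and the function names are assumptions for illustration. The per-entry overlap tests are independent of each other, which is the kind of work that lends itself to parallel evaluation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def overlaps(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles intersect (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

@dataclass
class Node:
    mbr: Rect                                   # minimum bounding rectangle
    children: List["Node"] = field(default_factory=list)
    entries: List[Tuple[int, Rect]] = field(default_factory=list)  # (object id, rect)

def range_query(root: Node, query: Rect) -> List[int]:
    """Collect ids of all stored rectangles that intersect the query window."""
    hits: List[int] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if not overlaps(node.mbr, query):
            continue  # prune the whole subtree
        hits.extend(oid for oid, rect in node.entries if overlaps(rect, query))
        stack.extend(node.children)
    return hits
```

An explicit stack is used instead of recursion because that form maps more directly onto environments where recursion is costly or unavailable.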

CUDA Acceleration of Super-Resolution Algorithm Using ELBP Classifier for Fisheye Images (광각 영상을 위한 ELBP 분류기를 이용한 초해상도 기법과 CUDA 기반 가속화)

  • Choi, Ji Hoon;Song, Byung Cheol
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.10 / pp.84-91 / 2016
  • Recently, around view monitoring (AVM) systems and security systems have come to provide users with images captured through a fisheye lens. Images captured through a fisheye lens have the advantage of covering a wider range of scenes. On the other hand, a fisheye lens also has the disadvantage of distorting images. In particular, it degrades the sharpness of images because the edge of the image is out of focus. The influence of this blur still remains at the end of the range when super-resolution techniques are applied to enhance sharpness. It degrades the clarity of the high-resolution images and introduces artifacts, which deteriorates the performance of the super-resolution algorithm. Therefore, in this paper we propose a self-similarity-based pre-processing method to improve sharpness at the edge. Additionally, we implement the entire algorithm in a GPU environment and verify the acceleration.

GPU-based modeling and rendering techniques of 3D clouds using procedural functions (절차적 함수를 이용한 GPU기반 실시간 3D구름 모델링 및 렌더링 기법)

  • Sung, Mankyu
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.4 / pp.416-422 / 2019
  • This paper proposes GPU-based modeling and rendering of 3D clouds using procedural functions. The formation of clouds is based on a modified noise function built on fBm (fractional Brownian motion). The noise values are converted into densities of liquid water droplets, a critical parameter for forming the three different types of clouds. At the rendering stage, the algorithm applies ray marching to determine the colors of the clouds from the density values obtained from the noise function. In this process, all lighting attenuation and scattering are calculated in a physically based manner. Once the clouds are generated, they are blended onto the sky, which is also rendered physically. We also move the clouds across the sky with a wind force. All algorithms are implemented and tested on the GPU using GLSL.

Efficient GPU Framework for Adaptive and Continuous Signed Distance Field Construction, and Its Applications

  • Kim, Jong-Hyun
    • Journal of the Korea Society of Computer and Information / v.27 no.3 / pp.63-69 / 2022
  • In this paper, we propose a new GPU-based framework for quickly calculating adaptive and continuous signed distance fields (SDFs), and examine rendering and collision-processing use cases built on them. The quadtree constructed from the triangle mesh is transferred to GPU memory, and each thread uses it to compute the Euclidean distance to the triangles in parallel, finding the shortest continuous distance, free of discontinuities, in the adaptive grid space. Experiments show that cutaway views of the adaptive distance field, distance queries at specific locations, real-time ray tracing, and collision handling can be performed quickly and efficiently. With the proposed method, the adaptive signed distance field can be calculated in about 1 second even for a high-polygon mesh, so the method can be used not only for rigid bodies but also for deformable bodies. Various experimental results demonstrate the stability of the algorithm and whether it can accurately sample and represent distance values across various models.

Dynamic Remeshing for Real-Time Representation of Thin-Shell Tearing Simulations on the GPU

  • Jong-Hyun Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.89-96 / 2023
  • In this paper, we propose a GPU-based method for real-time processing of the dynamic re-meshing required for tearing cloth. Thin-shell materials are used in various fields such as physics-based simulation and animation, games, and virtual reality. Tearing fabric requires dynamically updating the geometry and connectivity, which makes the process complex and computationally intensive. This process needs to be fast, especially in interactive content. Most methods either perform re-meshing on low-resolution simulations to maintain real-time performance or rely on a pre-segmented pattern, which does not count as dynamic re-meshing and yields low-quality tear patterns. We propose a new GPU-optimized dynamic re-meshing algorithm that enables real-time processing of high-resolution fabric tears. Because it performs dynamic re-meshing rather than relying on pre-split meshes, the proposed method can be used for virtual surgical simulation and for physics-based modeling in games and virtual environments that require real-time performance.