• Title/Summary/Keyword: GPU Process

A Software Method for Improving the Performance of Real-time Rendering of 3D Games (3D 게임의 실시간 렌더링 속도 향상을 위한 소프트웨어적 기법)

  • Whang, Suk-Min;Sung, Mee-Young;You, Yong-Hee;Kim, Nam-Joong
    • Journal of Korea Game Society / v.6 no.4 / pp.55-61 / 2006
  • The graphics rendering pipeline (application, geometry, and rasterizer stages) is the core of real-time graphics, which is the most important functionality for computer games. Rendering is usually carried out by both the CPU and the GPU, and a bottleneck can arise in either one. This paper focuses on reducing the bottleneck between the CPU and the GPU. We propose a method for improving the parallel performance of real-time graphics rendering by separating the CPU operations (usually performed in one thread) into two parts, pure CPU operations and GPU-related operations, and running them in parallel. This maximizes parallelism in the communication between the CPU and the GPU. Our experiments confirm that the proposed method yields faster graphics rendering. In addition to using a dedicated thread for GPU-related operations, we also propose an algorithm for balancing the graphics pipeline using the idle time caused by the bottleneck. We implemented both methods in our networked 3D game engine and verified that they are effective in real systems.

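The split described in the abstract above, pure CPU work on one thread and GPU-related work on a dedicated thread, can be illustrated with a small producer/consumer sketch. This is not the authors' engine code: `run_frames`, `submit`, and the queue size are hypothetical names and values, and `submit` merely stands in for issuing draw calls.

```python
import queue
import threading


def run_frames(num_frames, submit):
    """Run pure-CPU updates on the calling thread while a dedicated
    thread handles the GPU-related submissions (assumed design)."""
    # A small bounded queue keeps the CPU side from running far ahead.
    commands = queue.Queue(maxsize=2)

    def gpu_worker():
        while True:
            cmd = commands.get()
            if cmd is None:  # sentinel: no more frames
                break
            submit(cmd)  # stand-in for issuing draw calls to the GPU

    worker = threading.Thread(target=gpu_worker)
    worker.start()
    for frame in range(num_frames):
        state = frame * frame  # stand-in for game logic (pure CPU work)
        commands.put(("draw", frame, state))
    commands.put(None)
    worker.join()


submitted = []
run_frames(4, submitted.append)
```

Because a single consumer drains a FIFO queue, the commands reach `submit` in frame order even though the two threads overlap in time.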

Optimizing Skyline Query Processing Algorithms on CUDA Framework (CUDA 프레임워크 상에서 스카이라인 질의처리 알고리즘 최적화)

  • Min, Jun;Han, Hwan-Soo;Lee, Sang-Won
    • Journal of KIISE:Databases / v.37 no.5 / pp.275-284 / 2010
  • GPUs are multi-core stream processors that can process large volumes of data at high speed with high memory bandwidth. Furthermore, GPUs are less expensive than multi-core CPUs. Recently, the use of GPUs for general-purpose computing has become widespread. The CUDA architecture from Nvidia is one effort to help developers use GPUs in their application domains. In this paper, we propose techniques to parallelize a skyline algorithm that uses a simple nested-loop structure. To employ the CUDA programming model, we apply optimization techniques that fit our skyline algorithm to the performance restrictions of the CUDA architecture. According to our experimental results, our optimization techniques improve the original skyline algorithm by 80%.
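
For readers unfamiliar with skyline queries, the following is a minimal CPU sketch of a nested-loop skyline, assuming that smaller values are better in every dimension. The function names are illustrative and the paper's CUDA optimizations are not reproduced here.

```python
def dominates(p, q):
    """p dominates q if p is no worse in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Nested-loop skyline: keep each point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


result = skyline([(1, 9), (3, 3), (5, 1), (4, 4), (6, 6)])
```

Each outer-loop check depends only on the read-only input, which is the property that makes a one-thread-per-point mapping on a GPU plausible.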

Fast Hilbert R-tree Bulk-loading Scheme using GPGPU (GPGPU를 이용한 Hilbert R-tree 벌크로딩 고속화 기법)

  • Yang, Sidong;Choi, Wonik
    • Journal of KIISE / v.41 no.10 / pp.792-798 / 2014
  • In spatial databases, the R-tree is one of the most widely used indexing structures, and many variants have been proposed to improve its performance. Among these variants, the Hilbert R-tree is a representative method that uses the Hilbert curve to process large amounts of data without high-cost split techniques when constructing the R-tree. The Hilbert R-tree, however, is hardly applicable to large-scale applications in practice, mainly because of its high pre-processing cost and slow bulk-loading time. To overcome these limitations, we propose a novel approach that parallelizes the Hilbert mapping and thus accelerates bulk-loading of the Hilbert R-tree in GPU memory. The GPU-based Hilbert R-tree improves bulk-loading performance by applying the inversed-cell method and exploiting parallelism when packing the R-tree structure. Our experimental results show that the proposed scheme is up to 45 times faster than traditional CPU-based bulk-loading schemes.
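
As background for the Hilbert mapping mentioned above, here is a sequential sketch of the widely used bit-manipulation form of the Hilbert index, followed by a simple sort-and-pack step for leaf nodes. It assumes integer cell coordinates on an n-by-n grid with n a power of two; `pack_leaves` and its `capacity` parameter are hypothetical, and the paper's inversed-cell method and GPU parallelization are not shown.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along the Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def pack_leaves(points, n, capacity):
    """Sort points by Hilbert index, then cut the run into fixed-size leaves."""
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[0], p[1]))
    return [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]


leaves = pack_leaves([(3, 0), (0, 0), (1, 1), (0, 2)], 4, 2)
```

The index of each point is computed independently of all others, so the mapping step is a natural candidate for data-parallel execution.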

GPU-accelerated Global Illumination for Point Set Rendering (GPU 가속을 이용한 점집합 렌더링을 위한 전역 조명기법)

  • Min, Heajung;Kim, Young J.
    • Journal of the Korea Computer Graphics Society / v.26 no.1 / pp.7-15 / 2020
  • In the process of visualizing a point set representing a smooth manifold surface, global illumination techniques can be used to render a realistic scene with various lighting effects. Thanks to the continuous demand for ray tracing and the development of graphics hardware, dedicated GPUs and programmable pipelines for ray tracing have been introduced in recent years. In this paper, we study real-time global illumination rendering of point-set models using ray-tracing GPUs. We apply the moving least-squares (MLS) method to approximate the point set with a smooth implicit surface, and render it with global illumination by performing massive ray-intersection tests against the surface and generating shading effects at the intersection points. As a result, a complicated point-set scene consisting of more than 0.5M points can be rendered in real time.

GPU-based modeling and rendering techniques of 3D clouds using procedural functions (절차적 함수를 이용한 GPU기반 실시간 3D구름 모델링 및 렌더링 기법)

  • Sung, Mankyu
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.4 / pp.416-422 / 2019
  • This paper proposes GPU-based modeling and rendering of 3D clouds using procedural functions. Cloud formation is based on a modified noise function built with fBm (fractional Brownian motion). The noise values are converted into densities of liquid water droplets, a critical parameter for forming the three different types of clouds. At the rendering stage, the algorithm applies the ray marching technique to determine the cloud colors from the density values obtained from the noise function. In this process, all lighting attenuation and scattering are calculated in a physically based manner. Once the clouds are formed, they are blended onto the sky, which is also rendered physically. We also move the clouds across the sky with a wind force. All algorithms are implemented and tested on the GPU using GLSL.
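
To make the fBm and ray marching steps above concrete, the following CPU sketch sums octaves of a simple value noise and marches along a ray to accumulate optical depth. The hash constants, parameter defaults, and the Beer-Lambert `transmittance` helper are assumptions for illustration, not the paper's modified noise function or its GLSL shaders.

```python
import math


def hash_noise(ix, iy):
    """Deterministic pseudo-random value in [0, 1) for a lattice point."""
    n = math.sin(ix * 127.1 + iy * 311.7) * 43758.5453
    return n - math.floor(n)


def value_noise(x, y):
    """Smoothly interpolate the four lattice values around (x, y)."""
    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy
    ux, uy = fx * fx * (3 - 2 * fx), fy * fy * (3 - 2 * fy)  # smoothstep
    a, b = hash_noise(ix, iy), hash_noise(ix + 1, iy)
    c, d = hash_noise(ix, iy + 1), hash_noise(ix + 1, iy + 1)
    top = a + (b - a) * ux
    bottom = c + (d - c) * ux
    return top + (bottom - top) * uy


def fbm(x, y, octaves=5, lacunarity=2.0, gain=0.5):
    """Sum noise octaves with rising frequency and falling amplitude."""
    total, norm, amplitude, frequency = 0.0, 0.0, 1.0, 1.0
    for _ in range(octaves):
        total += amplitude * value_noise(x * frequency, y * frequency)
        norm += amplitude
        amplitude *= gain
        frequency *= lacunarity
    return total / norm


def transmittance(density_at, steps=32, step_size=0.1, absorption=1.0):
    """March along a ray and apply Beer-Lambert attenuation (assumed model)."""
    optical_depth = sum(density_at((i + 0.5) * step_size) * step_size
                        for i in range(steps))
    return math.exp(-absorption * optical_depth)


cloud_t = transmittance(lambda t: fbm(t, 0.5))
```

Here `density_at` plays the role of the droplet density along the ray; a shader version would evaluate the same loop per pixel.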

Efficient GPU Framework for Adaptive and Continuous Signed Distance Field Construction, and Its Applications

  • Kim, Jong-Hyun
    • Journal of the Korea Society of Computer and Information / v.27 no.3 / pp.63-69 / 2022
  • In this paper, we propose a new GPU-based framework for quickly calculating adaptive and continuous signed distance fields (SDFs), and examine its use in rendering and collision processing. The quadtree constructed from the triangle mesh is transferred to GPU memory, and each thread uses it to compute the Euclidean distance to the triangles in parallel, finding the shortest distance without discontinuities in the adaptive grid space. Experiments show that the cut-off view of the adaptive distance field, distance queries at specific locations, real-time ray tracing, and collision handling can all be performed quickly and efficiently. With the proposed method, the adaptive signed distance field can be calculated in about 1 second even for a high-polygon mesh, so the method can be used not only for rigid bodies but also for deformable bodies. Various experimental results show the stability of the algorithm and its ability to accurately sample and represent distance values in various models.

Real-Time Hierarchical Techniques for Rendering of Translucent Materials and Screen-Space Interpolation (반투명 재질의 렌더링과 화면 보간을 위한 실시간 계층화 알고리즘)

  • Ki, Hyun-Woo;Oh, Kyoung-Su
    • Journal of Korea Game Society / v.7 no.1 / pp.31-42 / 2007
  • In the natural world, most materials, such as skin, marble, and cloth, are translucent. Their appearance is smooth and soft compared with metals or mirrors. In this paper, we propose a new GPU-based hierarchical rendering technique for translucent materials that is based on the dipole diffusion approximation and runs at interactive rates. Information about the incident light on the surfaces (position, normal, and irradiance) is stored in 2D textures by rendering from a primary light view. Huge numbers of pixel photons are clustered into quad-tree image pyramids. For each pixel, we select clusters (sets of photons) and then approximate the multiple subsurface scattering term with those clusters. We also introduce a novel hierarchical screen-space interpolation technique that exploits spatial coherence with early-z culling on the GPU. We build image pyramids of the screen using mipmaps and a pixel shader. Each pixel of the pyramids stores the position, normal, and spatial similarity of its children pixels. If a pixel's similarity is high, we render that pixel and interpolate it to multiple pixels. Result images show that our method can interactively render deformable translucent objects by approximating hundreds of thousands of photons with only hundreds of clusters, without any preprocessing. We use an image-space approach for the entire process on the GPU, so our method is less dependent on scene complexity.


Fast Medical Volume Decompression Using GPGPU (GPGPU를 이용한 고속 의료 볼륨 영상의 압축 복원)

  • Kye, Hee-Won
    • Journal of Korea Multimedia Society / v.15 no.5 / pp.624-631 / 2012
  • In many medical imaging systems, volume datasets are stored in compressed form, so a dataset has to be decompressed before it is visualized. Since the decompression process takes quite a long time, we present an acceleration method for medical volume decompression using the GPU. Our method supports both lossy and lossless compression, as well as progressive refinement, to satisfy varying user requirements. Moreover, our decompression method is well parallelized for the GPU, so decompression takes a very short time. Finally, we designed the decompression and volume rendering to work within one framework so that selective decompression is available. As a result, we gained additional improvement in volume decompression.

HOG based Pedestrian Detection and Behavior Pattern Recognition for Traffic Signal Control (교통신호제어를 위한 HOG 기반 보행자 검출 및 행동패턴 인식)

  • Yang, Sung-Min;Jo, Kang-Hyun
    • Journal of Institute of Control, Robotics and Systems / v.19 no.11 / pp.1017-1021 / 2013
  • Traffic signals in current transport systems are widely operated with fixed time intervals. These intervals are set from experience so that vehicles wait while pedestrians cross the street. However, such fixed settings are inefficient in terms of both economy and crossing safety. In this research, we propose a monitoring algorithm that detects, tracks, and checks pedestrians crossing the crosswalk based on their behavior patterns. This monitoring system ensures pedestrian safety and keeps traffic flowing efficiently. In this algorithm, pedestrians are detected using the HOG feature, which is robust to illumination changes in outdoor environments. Because the computation is complex, parallel processing on both the GPU and the CPU is adopted for real-time operation. Pedestrians are then tracked by the relationship of the hue channel across the image sequence within the predefined pedestrian zone. Finally, the system checks pedestrians' crossing on the crosswalk by their HOG-based behavior patterns. In experiments, parallel processing on both the GPU and the CPU reached 16 FPS (frames per second). The accuracy of detection and tracking was 93.7% and 91.2%, respectively.
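
The HOG feature mentioned above is built from per-cell histograms of gradient orientations. The sketch below computes one such histogram in a simplified form, assuming central-difference gradients, nine unsigned orientation bins, and hard assignment to a single bin. Full HOG pipelines typically add interpolated voting and block normalization, and nothing here reflects the paper's GPU implementation.

```python
import math


def hog_cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) over the interior pixels of one cell."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    rows, cols = len(cell), len(cell[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


horizontal_ramp = [[x for x in range(6)] for _ in range(6)]
vertical_ramp = [[y for _ in range(6)] for y in range(6)]
h_hist = hog_cell_histogram(horizontal_ramp)
v_hist = hog_cell_histogram(vertical_ramp)
```

A horizontal intensity ramp puts all of its weight in the first bin, while a vertical ramp puts it in the bin covering 90 degrees, which is the orientation signal a detector builds on.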

GPU-accelerated Reliability Analysis Method using Dynamic Reliability Block Diagram based on DEVS Formalism (DEVS 형식론 기반의 Dynamic Reliability Block Diagram과 GPU 가속 기술을 이용한 신뢰도 분석 방법)

  • Ha, Sol;Ku, Namkug;Roh, Myung-Il
    • Journal of the Korea Society for Simulation / v.22 no.4 / pp.109-118 / 2013
  • Instead of building a fault tree (FT), the traditional way to analyze the reliability of a system, this paper assesses reliability from the system configuration itself using the reliability block diagram (RBD) method. The RBD method is a graphical presentation of a system diagram connecting the subsystems of components according to their functions or reliability relationships. The equipment model for the reliability simulation is based on the discrete event system specification (DEVS) formalism. To generate various alternatives of the target system, this paper also adopts the system entity structure (SES), an ontological framework that hierarchically represents the elements of a system and their relationships. To reduce the calculation time of the reliability analysis, GPU-based acceleration is applied to the reliability simulation.
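
As a point of reference for how a reliability block diagram combines components, the sketch below evaluates the textbook static formulas for series and parallel blocks, assuming independent component failures. The paper itself uses DEVS-based simulation with GPU acceleration, which this closed-form example does not attempt to model. The component values are made up for illustration.

```python
import math


def series(reliabilities):
    """A series block works only if every component works."""
    return math.prod(reliabilities)


def parallel(reliabilities):
    """A parallel (redundant) block fails only if every component fails."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical system: one controller in series with two redundant pumps.
system = series([0.95, parallel([0.9, 0.9])])
```

Nesting the two helpers mirrors the way blocks are connected in the diagram: redundancy raises the pump stage to 0.99, and the series connection then multiplies it by the controller's reliability.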