Search | Korea Science

Parallel Algorithm of Conjugate Gradient Solver using OpenGL Compute Shader

Va, Hongly;Lee, Do-keyong;Hong, Min
- Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.1-9 / 2021
The OpenGL compute shader is a shader stage that operates differently from the other shader stages and can be used for general-purpose parallel computation on arbitrary data. This paper proposes a GPU-based parallel algorithm for solving sparse linear systems with the conjugate gradient method, an iterative method, with the computation performed in OpenGL compute shaders. The solver targets large linear systems whose coefficient matrix is symmetric positive definite. Four well-known matrix formats (Dense, COO, ELL, and CSR) are used for matrix storage. A performance comparison on eight sparse matrices shows that the GPU-based solver is much faster than the CPU-based solver, with best average computing times of 0.64 ms on the GPU and 15.37 ms on the CPU.
https://doi.org/10.9708/jksci.2021.26.01.001
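To make the abstract above concrete, here is a minimal CPU-side sketch of conjugate gradient over a CSR matrix. It is not the paper's compute-shader code: the function names (`csr_matvec`, `conjugate_gradient`), the tolerance, and the 3x3 example matrix are all assumptions chosen for illustration. The row loop in `csr_matvec` is the part that would plausibly map to one GPU invocation per row.

```python
# Illustrative sketch only; names and parameters are hypothetical, not from the paper.

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    Rows are independent of each other, so each row could be handled
    by a separate parallel invocation.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def dot(a, b):
    """Inner product of two equally sized vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def conjugate_gradient(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (CSR), starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)            # residual r = b - A x, with x = 0
    p = list(r)            # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break          # residual norm is small enough
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Hypothetical 3x3 SPD matrix [[4, 1, 0], [1, 3, 0], [0, 0, 2]] in CSR form.
values = [4.0, 1.0, 1.0, 3.0, 2.0]
col_idx = [0, 1, 0, 1, 2]
row_ptr = [0, 2, 4, 5]
b = [1.0, 2.0, 4.0]
x = conjugate_gradient(values, col_idx, row_ptr, b)
```

CSR stores only the nonzero entries plus one offset per row, which is why it is a common choice for the sparse matrix-vector product that dominates each iteration.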

Accelerating Depth Image-Based Rendering Using GPU (GPU를 이용한 깊이 영상기반 렌더링의 가속)

Lee, Man-Hee;Park, In-Kyu
- Journal of KIISE:Computer Systems and Theory / v.33 no.11 / pp.853-858 / 2006
In this paper, we propose a practical method for hardware-accelerated rendering of the depth image-based representation (DIBR) of 3D graphic objects using a graphics processing unit (GPU). The proposed method overcomes the drawbacks of conventional rendering, namely that it is slow because it receives little assistance from graphics hardware and that its surface lighting is static. Utilizing the new features of modern GPUs and programmable shader support, we develop an efficient hardware-accelerated rendering algorithm for depth image-based 3D objects. Surface rendering in response to varying illumination is performed in the vertex shader, while adaptive point splatting is performed in the fragment shader. Experimental results show that rendering speed increases considerably compared with software-based rendering and the conventional OpenGL-based rendering method.

A Shader Technique that applies Noise Texture to Vertex Movement and Surface Texture Mapping of Polygon Mesh (폴리곤 메시의 정점 이동과 표면 텍스처 매핑에 노이즈 텍스처를 적용하는 쉐이더 기법)

Hong, Minseok;Park, Jinho
- Journal of Korea Game Society / v.21 no.2 / pp.79-88 / 2021
Particles and noise are effective for implementing nonspecific VFX such as explosions and magic. Particles can be created freely, but the more of them are used, the higher the CPU/GPU usage. To overcome this drawback and reduce CPU/GPU usage, this paper uses a polygon mesh, which is hard to change but consumes a fixed amount of resources. Using a shader, a noise texture suited to nonspecific patterns is applied to the vertex movement and surface texture mapping of the polygon mesh to implement VFX in Unity. In the experiment, a sphere polygon mesh with the shader applied showed 2 to 4 ms of CPU usage and 1 to 2 ms of GPU usage in the profiler. The results also show that shaders can be used to implement nonspecific VFX.
https://doi.org/10.7583/JKGS.2021.21.2.79

Design of Virtual Machine for Vertex Shader (정점 셰이더의 가상 기계 구현)

Ha, Chang-Soo;Kim, Ju-Hong;Choi, Byeong-Yoon
- Proceedings of the IEEK Conference / 2005.11a / pp.1003-1006 / 2005
The vertex shader of GPUs in personal computers has advanced in functionality to the point of covering half of the traditional fixed T&L functions, and the memory capacity for storing the resources needed to process instructions is unlimited. Programmable GPUs are needed for mobile systems as well as personal computers. In this paper, we implement a software virtual machine for the vertex shader in C++. Our goal is to design a hardware GPU that can be applied to mobile systems. The virtual machine implements nVidia GPU instructions. Its input is generated by the Microsoft fxc compiler; that is, the input is a compiled shader program written in HLSL, Cg, or ASM. The virtual machine will serve as a reference model for designing a hardware GPU and can be used as a testbed for testing added or modified instructions.

A Reconfigurable Lighting Engine for Mobile GPU Shaders

Ahn, Jonghun;Choi, Seongrim;Nam, Byeong-Gyu
- JSTS:Journal of Semiconductor Technology and Science / v.15 no.1 / pp.145-149 / 2015
A reconfigurable lighting engine for widely used lighting models is proposed for low-power GPU shaders. Conventionally, lighting operations, which involve many complex arithmetic operations, were calculated by shader programs on the GPU, which led to a significant energy overhead. In this letter, we propose a lighting engine that improves energy efficiency by supporting the widely used advanced lighting models in hardware. It supports the Blinn-Phong, Oren-Nayar, and Cook-Torrance models by exploiting logarithmic arithmetic and optimizing the trigonometric function evaluations for energy efficiency. Experimental results demonstrate 12.7%, 42.5%, and 35.5% reductions in power-delay product relative to the shader program implementations of the respective lighting models. Moreover, our work shows 10.1% higher energy efficiency for the Blinn-Phong model compared to the prior art.
https://doi.org/10.5573/JSTS.2015.15.1.145
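As a rough illustration of why logarithmic arithmetic suits lighting, the sketch below computes the standard Blinn-Phong specular term and then evaluates the same power through base-2 logarithms, using the identity x^s = 2^(s * log2 x). This is a software sketch of the textbook formula, not the engine described in the letter; the function names are hypothetical, and it assumes a positive shininess exponent and that the light and view directions are not exactly opposite.

```python
import math

# Illustrative sketch only; function names are hypothetical, not from the letter.

def normalize(v):
    """Scale a 3D vector to unit length (assumes a nonzero vector)."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def blinn_phong_specular(normal, light_dir, view_dir, shininess):
    """Textbook Blinn-Phong specular term: max(N.H, 0) ** shininess."""
    n = normalize(normal)
    l = normalize(light_dir)
    v = normalize(view_dir)
    h = normalize([a + b for a, b in zip(l, v)])   # half vector
    n_dot_h = max(sum(a * b for a, b in zip(n, h)), 0.0)
    return n_dot_h ** shininess


def pow_via_log2(base, exponent):
    """Evaluate base ** exponent as 2 ** (exponent * log2(base)).

    In the log domain the power becomes a single multiplication.
    Assumes exponent > 0, so a non-positive base maps to 0.
    """
    if base <= 0.0:
        return 0.0
    return 2.0 ** (exponent * math.log2(base))


direct = blinn_phong_specular([0, 0, 1], [0, 1, 1], [0, 0, 1], 16.0)
n_dot_h = math.cos(math.pi / 8)        # angle between N and H in this setup
via_log = pow_via_log2(n_dot_h, 16.0)
```

The two values agree up to floating-point rounding; the point of the log-domain form is that a power turns into a multiply between a logarithm and an antilogarithm.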

A Fully Programmable Shader Processor for Low Power Mobile Devices (저전력 모바일 장치를 위한 완전 프로그램 가능형 쉐이더 프로세서)

Jeong, Hyung-Ki;Lee, Joo-Sock;Park, Tae-Ryong;Lee, Kwang-Yeob
- Journal of IKEEE / v.13 no.2 / pp.253-259 / 2009
In this paper, we propose a novel architecture for a general graphics shader processor without dedicated hardware. Mobile devices now require a high-performance graphics processor that is also small and low-power. The proposed shader processor is a GP-GPU (General-Purpose computing on Graphics Processing Units) that executes the whole OpenGL ES 2.0 graphics pipeline using shader instructions. Because it is fully programmable, it does not require separate dedicated hardware for stages such as rasterization. The fully programmable 3D graphics shader processor can eliminate much of the graphics hardware. The chip size of the designed shader processor is 60% smaller than that of previous processors.

Proposal of 3D Graphic Processor Using Multi-Access Memory System (Multi-Access Memory System을 이용한 3D 그래픽 프로세서 제안)

Lee, S-Ra-El;Kim, Jae-Hee;Ko, Kyung-Sik;Park, Jong-Won
- The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.4 / pp.119-128 / 2019
Because 3D graphics processing requires many mathematical calculations, parallel processing on GPUs (Graphics Processing Units) is being studied for high-speed processing. In this paper, we propose a 3D graphics processor using MAMS, a parallel processor that does not use cache memory, to solve two GPU problems: the increased bandwidth caused by cache misses and the inconsistent 3D shader processing speed. For the proposed processor, the vertex shader, pixel shader, tiling, and rasterizing structures were designed based on an analysis of DirectX commands, and an FPGA board (Xilinx Virtex6 at 100 MHz) for MAMS was built and designed in Verilog. We compared the processing time of the developed FPGA (100 MHz) with that of an nVidia GeForce GTX 660 (980 MHz): the processing time on the GTX 660 was not constant, whereas the processing time using MAMS was constant.
https://doi.org/10.7236/JIIBC.2019.19.4.119

Matrix Addition & Scalar Multiplication on the GPU (GPU 기반 행렬 덧셈 및 스칼라 곱셈 알고리즘)

Park, Sangkun
- Journal of Institute of Convergence Technology / v.8 no.1 / pp.15-20 / 2018
GPUs have recently acquired the programmability to perform general-purpose computation quickly by running thousands of threads concurrently. This paper presents a parallel GPU algorithm for dense matrix-matrix addition and scalar multiplication using the OpenGL compute shader. Such an algorithm can serve as a fundamental building block for many high-performance computing applications. Experimental results on NVIDIA Quad 4000 show that the proposed algorithm runs 21 times faster than the CPU algorithm and achieves 16 GFLOPS in single precision for dense matrices of size 4,096. This performance shows that the algorithm is practical for real applications.
https://doi.org/10.22710/JICT.2018.8.1.015
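The element-wise structure described in the abstract above can be sketched as one invocation per matrix element, grouped into fixed-size work groups. The Python below only simulates that dispatch pattern on the CPU; the work-group size of 64, the flat row-major layout, and every function name are assumptions for illustration rather than details taken from the paper.

```python
# Illustrative CPU simulation of a compute-style dispatch; all names are hypothetical.

LOCAL_SIZE = 64  # assumed work-group size, not taken from the paper


def dispatch(kernel, total, *buffers):
    """Run `kernel` once per global index in [0, total), group by group."""
    num_groups = (total + LOCAL_SIZE - 1) // LOCAL_SIZE   # round up
    for group in range(num_groups):
        for local in range(LOCAL_SIZE):
            gid = group * LOCAL_SIZE + local
            if gid < total:        # guard the partially filled last group
                kernel(gid, *buffers)


def add_kernel(gid, a, b, out):
    """One invocation: add a single pair of elements."""
    out[gid] = a[gid] + b[gid]


def make_scale_kernel(scalar):
    """Build a kernel that multiplies one element by a fixed scalar."""
    def scale_kernel(gid, a, out):
        out[gid] = scalar * a[gid]
    return scale_kernel


# Two 10x10 matrices stored as flat row-major lists (100 elements each).
rows, cols = 10, 10
a = [float(i) for i in range(rows * cols)]
b = [1.0] * (rows * cols)

summed = [0.0] * (rows * cols)
dispatch(add_kernel, rows * cols, a, b, summed)

scaled = [0.0] * (rows * cols)
dispatch(make_scale_kernel(2.5), rows * cols, a, scaled)
```

Because 100 is not a multiple of 64, the second group is only partly used, which is why the bounds check on `gid` is needed.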

Photomosaic using a programmable GPU (프로그래밍 가능한 GPU를 이용한 포토 모자이크)

Kang, Dong-Wann;Yoon, Kyung-Hyun
- Journal of the Korea Computer Graphics Society / v.14 no.1 / pp.17-25 / 2008
We propose a method for photomosaic generation using a programmable GPU. We design vertices to generate a photomosaic through the graphics pipeline and suggest a texture representation of the image database which is used for the tiles. Both the source image and the tiles are stored in textures, which are matched by a vertex shader and drawn by a fragment shader. This is much faster than several techniques that achieve the best match for each tile.

Parallel Algorithm for Matrix-Matrix Multiplication on the GPU (GPU 기반 행렬 곱셈 병렬처리 알고리즘)

Park, Sangkun
- Journal of Institute of Convergence Technology / v.9 no.1 / pp.1-6 / 2019
Matrix multiplication is a fundamental mathematical operation that has numerous applications across most scientific fields. In this paper, we present a parallel GPU algorithm for dense matrix-matrix multiplication using the OpenGL compute shader, which can serve as a fundamental building block for many high-performance computing applications. Experimental results on NVIDIA Quad 4000 show that the proposed algorithm runs about 208 times faster than the previous CPU algorithm and achieves 75 GFLOPS in single precision for dense matrices of size 4,096. This performance shows that the algorithm is practical for real applications.
https://doi.org/10.22710/JICT.2019.9.1.001
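A minimal sketch of the parallel decomposition commonly used for dense matrix multiplication is one invocation per output element, each computing a single row-by-column dot product. The code below simulates that on the CPU and adds a helper for the conventional 2n^3 floating-point operation count behind GFLOPS figures. Whether the paper uses this exact decomposition or this exact count is an assumption, and all names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical and the decomposition is assumed.

def matmul_kernel(row, col, a, b, c, n):
    """One invocation: compute C[row][col] for n x n row-major matrices."""
    acc = 0.0
    for k in range(n):
        acc += a[row * n + k] * b[k * n + col]
    c[row * n + col] = acc


def matmul(a, b, n):
    """Launch one kernel invocation per output element (simulated serially)."""
    c = [0.0] * (n * n)
    for row in range(n):
        for col in range(n):
            matmul_kernel(row, col, a, b, c, n)
    return c


def gflops(n, seconds):
    """Throughput under the conventional count of 2 * n**3 operations."""
    return 2 * n ** 3 / seconds / 1e9


# [[1, 2], [3, 4]] times [[5, 6], [7, 8]]
product = matmul([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], 2)
```

Every output element reads a full row of the first matrix and a full column of the second, so the invocations are independent and need no synchronization in this naive form.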
