• Title/Summary/Keyword: GPU implementation


An Implementation of Graphic Offloading Computing using GPU Virtualization based on API Remoting on a Server-based Software Service (서버 기반 SW 서비스에서 API 리모팅 기반의 GPU 가상화를 이용한 그래픽 분할 실행의 구현)

  • Choi, Won-Hyuk;Kim, Won-Young
    • Journal of Internet Computing and Services / v.12 no.6 / pp.53-62 / 2011
  • In this paper, we introduce a graphics offloading method that uses GPU virtualization to deliver demanding software, such as 3D applications, as an online software service. When the offloaded software runs in the server's software virtualization environment, its graphics work is processed on the client's GPU through GPU virtualization, while its data work is processed on the server's CPU. To achieve this, we propose a method for rendering graphics information on the client-side GPU using API remoting. We also show that this approach performs better than server-based rendering when serving online software with dynamic 3D graphics whose displayed images change frequently. Moreover, we describe a method that virtualizes the offloaded software at the process level and manages each client's configuration information in order to reduce server load when serving multiple clients.

Improving the Performance of Document Similarity by using GPU Parallelism (GPU 병렬성을 이용한 문서 유사도 계산 성능 개선)

  • Park, Il-Nam;Bae, Byung-Gurl;Im, Eun-Jin;Kang, Seung-Shik
    • The KIPS Transactions:PartB / v.19B no.4 / pp.243-248 / 2012
  • In information retrieval systems, such as vector model implementations and document clustering, document similarity calculation accounts for a large part of overall system performance. In this paper, GPU parallelism is explored to speed up document similarity calculation in a CUDA framework. The proposed method is almost 15 times faster than a typical CPU-based implementation, and 5.2 and 3.4 times faster than methods using CUBLAS and Thrust, respectively.
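As a rough illustration of the computation being accelerated, the sketch below computes pairwise document similarity on the CPU in plain Python. The abstract does not name the similarity measure; cosine similarity is assumed here because it is the usual choice in the vector model, and the function names `cosine_similarity` and `similarity_matrix` are hypothetical. Each matrix entry depends only on two input vectors, which is the property a GPU implementation can exploit.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # An all-zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(docs):
    """All-pairs similarity; every (i, j) entry is independent of the others."""
    n = len(docs)
    return [[cosine_similarity(docs[i], docs[j]) for j in range(n)]
            for i in range(n)]
```

For example, `similarity_matrix([[1, 0], [0, 1]])` yields ones on the diagonal and zeros elsewhere, since the two documents share no terms.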

Efficient Implementation of Candidate Region Extractor for Pedestrian Detection System with Stereo Camera based on GP-GPU (스테레오 영상 보행자 인식 시스템의 후보 영역 검출을 위한 GP-GPU 기반의 효율적 구현)

  • Jeong, Geun-Yong;Jeong, Jun-Hee;Lee, Hee-Chul;Jeon, Gwang-Gil;Cho, Joong-Hwee
    • IEMEK Journal of Embedded Systems and Applications / v.8 no.2 / pp.121-128 / 2013
  • There have been various research efforts on pedestrian recognition in embedded imaging systems, but many suffer from heavy computational complexity. SVM classification has been widely used for pedestrian recognition, and reducing the number of candidate regions is crucial for a low-complexity scheme. In this paper, we propose a real-time HOG-based pedestrian detection system on a GPU, with images captured by a pair of cameras. To speed up the detection of people on the road, the proposed method reduces the number of detection windows with disparity-search and near-search algorithms and uses the GPU with the NVIDIA CUDA framework. The method achieves speedups of 20% or more compared with recent GPU implementations. The effectiveness of our algorithm is demonstrated in terms of processing time and detection performance.

Real-Time Water Surface Simulation on GPU (GPU기반 실시간 물 표면 시뮬레이션)

  • Sung, Mankyu;Kwon, DeokHo;Lee, JaeSung
    • KIPS Transactions on Software and Data Engineering / v.6 no.12 / pp.581-586 / 2017
  • This paper proposes a GPU-based water surface animation and rendering technique for interactive applications such as games. Many physical phenomena occur on a water surface, including reflection and refraction that depend on the viewing direction. When representing the water surface, these effects must not only be shown in real time but also adjust automatically. In our implementation, we capture the reflection and refraction with a render-to-texture technique and then modify the texture coordinates by applying a separate DU/DV map. We also make the ratio between reflection and refraction change automatically based on the Fresnel formula. All proposed methods are implemented using the OpenGL 3D graphics API.
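The view-dependent mix of reflection and refraction can be sketched as follows. The abstract says only that the ratio follows the Fresnel formula; this sketch swaps in Schlick's approximation, a common stand-in in real-time rendering, and assumes refractive indices of 1.0 for air and 1.33 for water. The function names and the RGB-tuple color representation are hypothetical, and the paper's actual shader code may differ.

```python
def schlick_reflectance(cos_theta, n1=1.0, n2=1.33):
    """Schlick's approximation of Fresnel reflectance.

    cos_theta is the cosine of the angle between the view direction and the
    surface normal; n1 and n2 are assumed refractive indices (air, water).
    """
    r0 = ((n1 - n2) / (n1 + n2)) ** 2  # reflectance when looking straight down
    return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5


def blend_water_color(reflection, refraction, cos_theta):
    """Mix reflection and refraction colors (RGB tuples) by the reflectance."""
    f = schlick_reflectance(cos_theta)
    return tuple(f * r + (1.0 - f) * t for r, t in zip(reflection, refraction))
```

Looking straight down (`cos_theta` near 1) the result is dominated by the refracted color, while at grazing angles (`cos_theta` near 0) it approaches the reflected color.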

Implementation of Stereo Matching Algorithm using GPU (GPU를 이용한 스테레오 정합 알고리즘의 구현)

  • Choi, Hyun-Jun;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.3 / pp.583-588 / 2011
  • In this paper, we propose an adaptive variable-sized matching window method that uses the characteristic points of the image, and a method to increase the reliability of the cross-consistency check in order to improve the correctness of the final disparity image. The proposed adaptive variable-sized window method segments the image using color information and finds the characteristic points inside the window. The proposed algorithm is implemented on a graphics processing unit (GPU). The GPU used in this paper is a GeForce GTX296 (NVIDIA), programmed with CUDA. The computation is approximately 128 times faster than on a CPU.
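For readers unfamiliar with the cross-consistency check, the sketch below shows the basic left-right version on a single scanline. It is not the improved check proposed in the paper, whose details the abstract does not give. The function name `cross_check`, the tolerance `tol`, and the `invalid` marker are hypothetical, and the convention that a left pixel `x` with disparity `d` matches right pixel `x - d` is an assumption.

```python
def cross_check(disp_left, disp_right, tol=0, invalid=-1):
    """Basic left-right consistency check on one scanline.

    disp_left[x] is the disparity found for left-image pixel x, and
    disp_right[x] the disparity found for right-image pixel x. A left
    disparity is kept only if the right pixel it points to reports a
    disparity within tol of it; otherwise it is marked invalid.
    """
    width = len(disp_left)
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matching column in the right image (assumed convention)
        if 0 <= xr < width and abs(d - disp_right[xr]) <= tol:
            checked.append(d)
        else:
            checked.append(invalid)
    return checked
```

Pixels marked invalid are typically occlusions or mismatches, and a later stage can fill them from neighbouring valid disparities.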

Evaluation of GPU Computing Capacity for All-in-view GNSS SDR Implementation

  • Yun Sub, Choi;Hung Seok, Seo;Young Baek, Kim
    • Journal of Positioning, Navigation, and Timing / v.12 no.1 / pp.75-81 / 2023
  • In this study, we design an optimized Graphics Processing Unit (GPU)-based GNSS signal processing technique, with the goal of designing and implementing a GNSS Software Defined Receiver (SDR) that can operate in real-time all-in-view mode in multi-constellation, multi-frequency signal environments. In the proposed structure, the correlators of the existing GNSS SDR are processed by the GPU. We designed a memory structure and processing method that minimize memory access bottlenecks and optimize the distribution of GPU memory resources. The designed GNSS SDR can select and process only the desired GNSS or the desired satellite signals according to user input. Parameters such as the number of quantization bits, the sampling rate, and the number of signal tracking arms can also be selected. The computing capability of the designed GPU-based GNSS SDR was evaluated, and it was confirmed that up to 2400 channels can be processed in real time. As a result, the GPU-based GNSS SDR has sufficient performance to operate in real-time all-in-view mode. In future studies, it will be used for more diverse GNSS signal processing and applied to multipath effect analysis using more tracking arms.
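The correlator workload that the abstract moves to the GPU can be pictured with the simplified sketch below. It correlates one code period of samples against delayed copies of a local code replica, one delay per tracking arm. This is a toy model under stated assumptions: carrier wipe-off, quantization, and resampling are omitted, the delay is a circular shift in whole samples, and the function name `correlate_arms` is hypothetical.

```python
def correlate_arms(samples, replica, offsets):
    """Correlate samples with the replica at each code-delay offset.

    samples and replica cover one code period and have equal length. Each
    entry of offsets corresponds to one tracking arm; the returned list holds
    one correlation sum per arm. All sums are independent of each other.
    """
    n = len(replica)
    if len(samples) != n:
        raise ValueError("samples and replica must have the same length")
    return [
        sum(samples[k] * replica[(k - off) % n] for k in range(n))
        for off in offsets
    ]
```

With a replica of length 7 and samples equal to that replica delayed by two chips, the output of `correlate_arms(samples, replica, range(7))` peaks at offset 2. Because every channel and every arm is an independent sum of products, the work maps naturally onto many GPU threads.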

Implementation of GPU Based Polymorphic Worm Detection Method and Its Performance Analysis on Different GPU Platforms (GPU를 이용한 Polymorphic worm 탐지 기법 구현 및 GPU 플랫폼에 따른 성능비교)

  • Lee, Sunwon;Song, Chihwan;Lee, Injoon;Joh, Taewon;Kang, Jaewoo
    • Proceedings of the Korea Information Processing Society Conference / 2010.11a / pp.1458-1461 / 2010
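  • The scale of damage caused by malicious code increases every year, as seen in the DDoS attack of July 7 last year. Polymorphic worms are especially dangerous because existing methods have difficulty detecting them at the first attack. In this study, we introduce local alignment, one of the methods used in bioinformatics to find similarities and characteristic features among genes, and apply it to polymorphic worm detection. We also overcome the $O(n^4)$ running time of the existing algorithm through parallelization and modification of the algorithm. The parallelization was carried out with CUDA programming on NVIDIA GPUs and OpenCL programming on AMD GPUs. We then compared the performance of the local-alignment-based polymorphic worm detection algorithm on each GPGPU platform.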
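To make the local alignment step concrete, the sketch below scores the best local alignment between two sequences with the Smith-Waterman recurrence and a linear gap penalty. The abstract names only "Local Alignment"; Smith-Waterman is assumed here as the standard algorithm of that kind, and the scoring values and the function name `local_alignment_score` are hypothetical. It is a sequential CPU reference, not the paper's modified or parallelized algorithm.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b (Smith-Waterman).

    Works on any indexable sequences, such as str or bytes. Only two rows of
    the score table are kept, so memory use is proportional to len(b).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A score never drops below zero, so an alignment can restart
            # anywhere; this is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

A high score between two payloads suggests a shared substring region despite surrounding differences. Cells on the same anti-diagonal of the score table do not depend on each other, which is a common starting point for GPU parallelization of this recurrence.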

Accelerating Scanline Block Gibbs Sampling Method using GPU (GPU 를 활용한 스캔라인 블록 Gibbs 샘플링 기법의 가속)

  • Zeng, Dongmeng;Kim, Wonsik;Yang, Yong;Park, In Kyu
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2014.06a / pp.77-78 / 2014
  • This paper presents a new MCMC method for optimization, called the scanline block Gibbs sampler. Traditional Markov chain Monte Carlo (MCMC) is not widely used because of its slow convergence. In contrast to conventional MCMC methods, the scanline block Gibbs sampler is more convenient to parallelize. Since the main part of the scanline block Gibbs sampler is calculating the messages for each edge, this message calculation is parallelized on the GPU. Experiments on the OpenGM2 benchmark show that the GPU implementation is faster than the CPU implementation.


Implementation of Progressive Radiosity on GPU for Image based Relighting (영상기반 재조명을 위한 GPU 기반 래디오시티 구현)

  • Kim, Jun-Hwan;Hwang, Yong-Ho;Hong, Hyun-Ki
    • Proceedings of the HCI Society of Korea Conference (한국HCI학회 학술대회논문집) / 2007.02a / pp.988-993 / 2007
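  • Among global illumination techniques, the radiosity method, which effectively represents the relationships between diffusely reflecting objects, models the equilibrium of the energy exchanged between objects. However, radiosity has not been suitable for real-time use because of its heavy computation. Recently, methods have been proposed that use cost-effective, high-performance graphics hardware (GPUs), which can greatly reduce the time needed to generate a scene. The energy exchanged between objects can be expressed as radiance, and this radiance can be obtained by building a radiance map from HDR (High Dynamic Range) images captured from the target scene. Constructing the lighting environment of the target scene on this basis makes it possible to generate fast and realistic composite scenes regardless of the scene's complexity. In this paper, we modify the progressive refinement algorithm proposed by G. Coombe et al. so that it can use a radiance map, and we obtain and analyze experimental results for different texel settings and for the use of interpolation. The implemented method is expected to be used in film, animation, virtual reality, games, and other areas as a technique for image-based relighting and image synthesis using graphics hardware.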


PDF 1.4-1.6 Password Cracking Optimal Implementation on CUDA GPU (CUDA GPU 상의 PDF 1.4-1.6 해독 최적 구현)

  • Kim, Hyun-Jun;Eum, Si-Uoo;Seo, Hwa-Jeong
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.187-190 / 2022
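  • PDF (Portable Document Format) is a file format developed by Adobe in 1992; it is standardized as ISO 32000 and used worldwide. Commonly used file types such as PDF can become targets of password cracking. In this paper, we present an optimized implementation on CUDA GPUs for PDF 1.4-1.6 password cracking. We optimized the MD5 and RC4 algorithms used in the cracking and made use of CUDA GPU features, achieving a 22.5% performance improvement over the cracking tool hashcat in an RTX 3060 environment.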