• Title/Summary/Keyword: Graphic Processing Unit (GPU)

Acceleration of Anisotropic Elastic Reverse-time Migration with GPUs (GPU를 이용한 이방성 탄성 거꿀 참반사 보정의 계산가속)

  • Choi, Hyungwook;Seol, Soon Jee;Byun, Joongmoo
    • Geophysics and Geophysical Exploration
    • /
    • v.18 no.2
    • /
    • pp.74-84
    • /
    • 2015
  • To yield physically meaningful images, elastic reverse-time migration requires wavefield separation, which extracts P- and S-waves from the vector wavefields reconstructed with the elastic wave equation. Extending elastic reverse-time migration to anisotropic media requires not only an anisotropic modelling algorithm but also anisotropic wavefield separation. Anisotropic wavefield separation uses pseudo-derivative filters determined by the vertical velocities and anisotropic parameters of the elastic medium, and so differs from the Helmholtz decomposition conventionally used for isotropic wavefield separation. Because applying these pseudo-derivative filters is computationally expensive, we developed an efficient anisotropic wavefield separation algorithm that runs in parallel on GPUs (Graphic Processing Units). We also developed a highly efficient anisotropic elastic reverse-time migration algorithm that uses MPI (Message-Passing Interface) and incorporates the GPU-based separation algorithm. To verify its efficiency and validity, we built a VTI elastic model based on Marmousi-II and generated a synthetic multicomponent seismic data set from it. Using GPUs and MPI dramatically increased the migration speed, and adopting anisotropic wavefield separation also improved image accuracy.
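
For orientation, the isotropic baseline that this abstract contrasts against (Helmholtz decomposition) can be sketched as a divergence for the P-wave part and a curl for the S-wave part of a 2D displacement field. This is only a minimal illustration on a regular grid with central differences; the function name `divergence_and_curl` and the grid layout are assumptions, and the paper's anisotropic pseudo-derivative filters are not reproduced here.

```python
def divergence_and_curl(ux, uz, dx=1.0, dz=1.0):
    """Central-difference divergence (P part) and curl (S part) of a 2D field.

    ux[i][j] and uz[i][j] are the x and z displacement components, with i the
    z index and j the x index (hypothetical layout). Border cells stay 0.0.
    """
    nz, nx = len(ux), len(ux[0])
    p = [[0.0] * nx for _ in range(nz)]
    s = [[0.0] * nx for _ in range(nz)]
    for i in range(1, nz - 1):
        for j in range(1, nx - 1):
            dux_dx = (ux[i][j + 1] - ux[i][j - 1]) / (2 * dx)
            duz_dz = (uz[i + 1][j] - uz[i - 1][j]) / (2 * dz)
            dux_dz = (ux[i + 1][j] - ux[i - 1][j]) / (2 * dz)
            duz_dx = (uz[i][j + 1] - uz[i][j - 1]) / (2 * dx)
            p[i][j] = dux_dx + duz_dz   # divergence: compressional (P) part
            s[i][j] = dux_dz - duz_dx   # curl (y component): shear (S) part
    return p, s


# A purely compressional test field u = (x, z): divergence 2, curl 0.
n = 5
ux = [[float(j) for j in range(n)] for i in range(n)]
uz = [[float(i) for j in range(n)] for i in range(n)]
p, s = divergence_and_curl(ux, uz)
print(p[2][2], s[2][2])
```

Each output cell depends only on its four neighbours, which is why this kind of separation maps naturally onto one GPU thread per grid point.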

Improving the Rendering Speed of 3D Model Animation on Smart Phones

  • Ng, Cong Jie;Hwang, Gi-Hyun;Kang, Dae-Ki
    • Journal of information and communication convergence engineering
    • /
    • v.9 no.3
    • /
    • pp.266-270
    • /
    • 2011
  • Advances in technology enable smart phones and handheld devices to render complex 3D graphics. However, the processing power and memory of smart phones remain too limited to render high-polygon, highly detailed 3D models, especially in games that require animation, a physics engine, or augmented reality. This paper introduces several techniques that speed up computation and reduce the number of vertices in 3D meshes without losing much detail.

Fast Multi-View Synthesis Using Duplex Forward Mapping and Parallel Processing (순차적 이중 전방 사상의 병렬 처리를 통한 다중 시점 고속 영상 합성)

  • Choi, Ji-Youn;Ryu, Sae-Woon;Shin, Hong-Chang;Park, Jong-Il
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.34 no.11B
    • /
    • pp.1303-1310
    • /
    • 2009
  • A glasses-free 3D display requires multiple images taken from different viewpoints to show a scene. The simplest way to obtain multi-view images is to use as many cameras as there are views, but then synchronizing the cameras and computing and transmitting the large volume of data become critical problems. Generating such a large number of viewpoint images efficiently is therefore emerging as a key technique in 3D video technology. Image-based view synthesis is an algorithm for generating various virtual viewpoint images from a limited number of views and depth maps. Because a virtual view image can be expressed as a real view image transformed under some depth condition, we propose an algorithm that synthesizes multiple views from two reference view images and their depth maps by stepwise duplex forward mapping. Moreover, because the geometric relationship between the real view and the virtual view is repetitive, we implement the algorithm in the OpenGL Shading Language, which runs on the programmable Graphic Processing Unit and allows parallel processing, to reduce computation time. We demonstrate the effectiveness of our algorithm for fast view synthesis through a variety of experiments with real data.
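
The idea of forward mapping from two reference views can be illustrated on a single scanline. The sketch below assumes rectified views, a per-pixel disparity derived from the depth map, and a virtual camera at fraction `alpha` between the left (0) and right (1) views; the names `forward_map` and `synthesize` are hypothetical, and the paper's stepwise procedure and GLSL implementation are not reproduced.

```python
def forward_map(row, disparity, shift):
    """Forward-map one scanline: pixel x moves to x + round(shift * disparity[x]).

    When two source pixels land on the same target, the one with the larger
    disparity (nearer to the camera) wins. Unfilled targets stay None.
    """
    out = [None] * len(row)
    best = [-1.0] * len(row)
    for x, (value, d) in enumerate(zip(row, disparity)):
        tx = x + round(shift * d)
        if 0 <= tx < len(row) and d > best[tx]:
            out[tx] = value
            best[tx] = d
    return out


def synthesize(left, left_disp, right, right_disp, alpha):
    """Blend two forward-mapped views; holes in the left warp are filled from the right."""
    from_left = forward_map(left, left_disp, -alpha)
    from_right = forward_map(right, right_disp, 1.0 - alpha)
    return [l if l is not None else r for l, r in zip(from_left, from_right)]


# Flat scene with disparity 2: the right view is the left view shifted by 2 pixels.
left = [1, 2, 3, 4, 5, 6]
right = [3, 4, 5, 6, 7, 8]
disp = [2] * 6
print(synthesize(left, disp, right, disp, alpha=0.5))
```

Using the second reference view to fill the holes left by the first is what makes the mapping "duplex" in this sketch; every target pixel can be processed independently, which suits a shader.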

Digital Hologram Generating of 3D Object with Super-multi-light-source (초다광원 3차원 물체의 디지털 홀로그램 고속 생성)

  • Song, Joongseok;Kim, Changseob;Park, Jong-Il
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2015.07a
    • /
    • pp.135-136
    • /
    • 2015
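  • Computer-generated hologram (CGH) techniques mathematically model conventional optical apparatus and parameters so that a digital hologram can be generated even on a general-purpose personal computer (PC). Because the computational load of the algorithm depends on the resolution of the digital hologram and on the number of light sources of the 3D object, research on reducing the algorithm's computational load or increasing the hardware's computation speed is essential for practical use. In this paper, we propose a method for fast generation of digital holograms of 3D objects with a super-multi-light-source. The proposed method uses one server PC and multiple client PCs, all equipped with commonly used general-purpose GPUs (graphic processing units). The server scans the light sources of the 3D object into data, partitions the light-source data according to the computing power of the client PCs, and sends one partition to each client. Each client performs multi-GPU CGH computation on the received data to generate interference patterns, and the generated patterns are sent back to the server PC. The digital hologram is obtained once the returned patterns are accumulated into one. In our experiment, the conventional method took about 2,250 ms to generate a hologram with a resolution of $1,024{\times}1,024$ for 139,655 light sources, whereas the proposed method took about 478 ms.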
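
The server/client division of work described above can be sketched as follows. This is a CPU-only illustration under stated assumptions: the fringe model is the common point-source form (each source adds `amp * cos(k * r)` to every hologram pixel), client capacities are relative integer weights, and the names `split_by_capacity`, `fringe_pattern`, and `accumulate` are hypothetical. Networking and the multi-GPU kernels of the paper are not reproduced.

```python
import math


def split_by_capacity(n_sources, capacities):
    """Number of light sources per client, proportional to integer capacity weights."""
    total = sum(capacities)
    counts = [n_sources * c // total for c in capacities]
    # Flooring loses fewer sources than there are clients; hand them out one by one.
    for i in range(n_sources - sum(counts)):
        counts[i] += 1
    return counts


def fringe_pattern(sources, width, height, pitch, wavelength):
    """Partial interference pattern of point sources given as (x, y, z, amplitude)."""
    k = 2 * math.pi / wavelength
    pattern = [[0.0] * width for _ in range(height)]
    for sx, sy, sz, amp in sources:
        for v in range(height):
            for u in range(width):
                r = math.sqrt((u * pitch - sx) ** 2 + (v * pitch - sy) ** 2 + sz ** 2)
                pattern[v][u] += amp * math.cos(k * r)
    return pattern


def accumulate(patterns):
    """Server-side step: sum the partial patterns returned by the clients."""
    height, width = len(patterns[0]), len(patterns[0][0])
    return [[sum(p[v][u] for p in patterns) for u in range(width)]
            for v in range(height)]


# Three clients, the third assumed to be twice as fast as the others.
print(split_by_capacity(139_655, [1, 1, 2]))
```

Because each source contributes additively, summing the clients' partial patterns gives the same hologram as computing all sources on one machine, which is what makes the partitioning safe.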

Novel Kernel Design for Implementing Volume Rendering in the PyCUDA Framework (PyCUDA 프레임워크에서 볼륨 렌더링을 구현하기 위한 새로운 커널 디자인)

  • Lee, SooHo;Kim, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2022.01a
    • /
    • pp.349-351
    • /
    • 2022
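  • This paper introduces a Python-based CUDA (Compute Unified Device Architecture) kernel design that can implement computationally heavy volume rendering. Python is now widely used not only in artificial intelligence but also in servers, security, GUIs, data visualization, and big data processing, so it has long since shed its image as a language used only for interfaces. In this paper, we design kernels in a Python environment using NVIDIA's CUDA, a massively parallel processing technique, and show that computationally heavy volume rendering can be computed quickly. As a result, we show through volume rendering that GPU (Graphic Processing Unit)-based applications can be developed not only with C-based CUDA but also in the Python environment, where development is comparatively efficient.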

Architecture Exploration of Optimal Many-Core Processors for a Vector-based Rasterization Algorithm (래스터화 알고리즘을 위한 최적의 매니코어 프로세서 구조 탐색)

  • Son, Dong-Koo;Kim, Cheol-Hong;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.9 no.1
    • /
    • pp.17-24
    • /
    • 2014
  • In this paper, we implement a vector-based rasterization algorithm for 3D graphics on a SIMD (single instruction, multiple data) many-core processor architecture and evaluate its performance. In addition, we evaluate the impact of the data-per-processing-element (DPE) ratio, defined as the amount of data directly mapped to each processing element (PE) within the many-core processor, on performance, energy efficiency, and area efficiency. For the experiment, we use seven different PE configurations obtained by varying the DPE ratio (or the number of PEs), all implemented in the same 130 nm CMOS technology with a 500 MHz clock frequency. Experimental results indicate that the optimal PE configuration is achieved when the DPE ratio is in the range from 16,384 to 256 (or the number of PEs is in the range from 16 to 1,024), which meets the requirements of mobile devices in terms of performance and efficiency.
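
The two endpoints quoted in this abstract are consistent with a fixed total data size split evenly across PEs, as the short check below shows. The total of 262,144 elements is inferred from the quoted numbers (16,384 per PE on 16 PEs) and is not stated in the abstract; `dpe_ratio` is a hypothetical helper name.

```python
def dpe_ratio(total_data, num_pes):
    """Data elements mapped to each PE when total_data is split evenly."""
    if total_data % num_pes:
        raise ValueError("total_data must divide evenly across the PEs")
    return total_data // num_pes


# Total implied by the abstract's first endpoint (inferred, not stated there).
TOTAL_DATA = 16_384 * 16  # 262,144 elements

for num_pes in (16, 1_024):
    print(num_pes, dpe_ratio(TOTAL_DATA, num_pes))
```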

Real-time Color Recognition Based on Graphic Hardware Acceleration (그래픽 하드웨어 가속을 이용한 실시간 색상 인식)

  • Kim, Ku-Jin;Yoon, Ji-Young;Choi, Yoo-Joo
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.1
    • /
    • pp.1-12
    • /
    • 2008
  • In this paper, we present a real-time algorithm, based on GPU (Graphics Processing Unit) acceleration, for recognizing vehicle color in indoor and outdoor vehicle images. In the preprocessing step, we construct feature vectors from sample vehicle images of different colors. We then combine the feature vectors for each color and store them as a reference texture to be used on the GPU. Given an input vehicle image, the CPU constructs its feature vector, and the GPU compares it with the sample feature vectors in the reference texture. The similarities between the input feature vector and the sample feature vectors for each color are measured, and the result is transferred to the CPU to recognize the vehicle color. The output is one of seven colors: three achromatic colors (black, silver, and white) and four chromatic colors (red, yellow, blue, and green). We construct the feature vectors from histograms of hue-saturation pairs and hue-intensity pairs, with a weight factor applied to the saturation values. Our algorithm achieves a successful color recognition rate of 94.67% by using a large number of sample images captured in various environments, by generating feature vectors that distinguish different colors, and by using an appropriate likelihood function. We also accelerate color recognition by using the parallel computation capability of the GPU. In the experiments, we constructed a reference texture from 7,168 sample images, with 1,024 images per color. The average time for generating a feature vector is 0.509 ms for a $150{\times}113$ resolution image. After the feature vector is constructed, GPU-based color recognition takes 2.316 ms on average, which is 5.47 times faster than executing the algorithm on the CPU. Our experiments were limited to vehicle images, but the algorithm can be extended to input images of general objects.
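
The feature construction and comparison steps can be sketched on the CPU as below. Several simplifications are assumptions, not the paper's method: only hue-saturation pairs are histogrammed (hue-intensity pairs and the saturation weight are omitted), the bin counts are arbitrary, and histogram intersection stands in for the unspecified likelihood function. The names `hs_histogram`, `similarity`, and `recognize` are hypothetical.

```python
import colorsys


def hs_histogram(pixels, hue_bins=8, sat_bins=4):
    """Normalized hue-saturation histogram of (r, g, b) pixels with 0-255 channels."""
    hist = [0.0] * (hue_bins * sat_bins)
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        hi = min(int(h * hue_bins), hue_bins - 1)
        si = min(int(s * sat_bins), sat_bins - 1)
        hist[hi * sat_bins + si] += 1.0
    return [count / len(pixels) for count in hist]


def similarity(a, b):
    """Histogram intersection (a stand-in for the paper's likelihood function)."""
    return sum(min(x, y) for x, y in zip(a, b))


def recognize(feature, references):
    """Return the color name whose reference feature is most similar to the input."""
    return max(references, key=lambda name: similarity(feature, references[name]))


references = {
    "red": hs_histogram([(255, 0, 0)] * 4),
    "blue": hs_histogram([(0, 0, 255)] * 4),
}
query = hs_histogram([(250, 10, 10)] * 3 + [(0, 0, 255)])
print(recognize(query, references))
```

Each comparison against a reference vector is independent of the others, which is the property the abstract exploits when it stores the references in a texture and compares them in parallel on the GPU.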

A Study of Galaxy Cluster Mergers Based on Cosmological Simulations -- On the Evolution of Galaxy Mass Functions

  • Yun, Ki-Yun;Ahn, Sung-Ho;Shin, Ji-Hye;Kim, Ju-Han;Kim, Sung-Soo;Yoon, Suk-Jin
    • The Bulletin of The Korean Astronomical Society
    • /
    • v.36 no.2
    • /
    • pp.67.2-67.2
    • /
    • 2011
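  • According to the luminosity functions of galaxies in galaxy clusters, the observed number of faint galaxies ($M_B > -18$, to be confirmed) is markedly smaller than theoretically predicted. To explain this "deficit of faint galaxies," we propose a dynamical origin such as collisions and mergers between galaxy clusters. This study is motivated by the idea that relatively low-mass galaxies are likely to escape the gravitational binding of their cluster during a cluster collision or merger. To test this possibility, we (a) use cosmological N-body simulations and (b) analyze the spatial distribution of the N-body particles with a new method, adopted and developed from hydrodynamic simulations, that analyzes "the distance from a given particle to its N-th nearest particle (N-th Particle)." To analyze this massive data set effectively, we independently developed an analysis algorithm designed for the GPU (Graphic Processing Unit).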
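
The "N-th Particle" quantity named above, the distance from a given particle to its N-th nearest neighbour, can be computed by brute force as in the sketch below. The function name `nth_neighbor_distance` is hypothetical, and this O(n log n)-per-particle version only illustrates the definition; it is not the authors' GPU algorithm.

```python
import math


def nth_neighbor_distance(particles, index, n):
    """Distance from particles[index] to its n-th nearest other particle (n >= 1)."""
    origin = particles[index]
    distances = sorted(
        math.dist(origin, other)
        for j, other in enumerate(particles)
        if j != index
    )
    return distances[n - 1]


particles = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
print(nth_neighbor_distance(particles, 0, 2))
```

A small N-th neighbour distance indicates a dense region, so the value acts as a local density estimate; the per-particle computations are independent, which makes the analysis a natural fit for one GPU thread per particle.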

Design and Implementation of a Framework for Collaboration Systems in the Shipbuilding and Marine Industry (조선해양 설계분야에서 협업시스템을 위한 프레임워크의 설계 및 구현)

  • Yun, Moon-Kyeong;Kim, Hyun-Ju;Park, Min-Gil;Han, Myeong-Ki;Kim, Wan-Kyoo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2015.05a
    • /
    • pp.270-273
    • /
    • 2015
  • In the shipbuilding and marine industry, engineering and design software has moved from the original CAD systems based on 2D schematic data to modern systems based on 3D drawings. As real-time data usage and the data volumes of engineering models, including graphic data, have grown, problems such as insufficient server resources and improper handling of 3D drawings have arisen. In addition, increasing the number of session connections per server can degrade server performance. The recent growth in shipyards' sophisticated design capabilities has highlighted the need for an engineering and design system that not only overcomes network performance issues but also provides an efficient collaborative design environment. This paper presents an overview of a framework for a collaborative engineering design system based on a virtual application (Citrix XenApp 6.5) and 3D graphics acceleration hardware (NVIDIA GRID K2 solution).

A Polarization-based Frequency Scanning Interferometer and the Measurement Processing Acceleration based on Parallel Programming (편광 기반 주파수 스캐닝 간섭 시스템 및 병렬 프로그래밍 기반 측정 고속화)

  • Lee, Seung Hyun;Kim, Min Young
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.8
    • /
    • pp.253-263
    • /
    • 2013
  • A Frequency Scanning Interferometry (FSI) system, one of the most promising optical surface measurement techniques, generally offers better optical performance than other 3-dimensional measuring methods, because its hardware structure is fixed during operation and only the light frequency is scanned within a specific spectral band, without vertical scanning of the target surface or the objective lens. An FSI system collects a set of interference fringe images while changing the frequency of the light source. It then transforms the intensity data of the acquired images into frequency information and calculates the height profile of the target objects by frequency analysis based on the Fast Fourier Transform (FFT). However, it still suffers from optical noise on target surfaces and from relatively long processing time due to the number of images acquired in the frequency scanning phase. First, a Polarization-based Frequency Scanning Interferometry (PFSI) system is proposed for robustness to optical noise. It consists of a tunable laser as the light source, a ${\lambda}/4$ plate in front of the reference mirror, a ${\lambda}/4$ plate in front of the target object, a polarizing beam splitter (PBS), a polarizer in front of the image sensor, a polarizer in front of the fiber-coupled light source, and a ${\lambda}/2$ plate between the PBS and the polarizer of the light source. With the proposed system, the polarization technique solves the problem of low-contrast fringe images and also lets us control the light distribution between the object beam and the reference beam. Second, a signal processing acceleration method is proposed for PFSI, based on a parallel processing architecture that combines parallel processing hardware and software, namely the Graphic Processing Unit (GPU) and Compute Unified Device Architecture (CUDA). As a result, the processing time reaches the tact-time level of real-time processing. Finally, the proposed system is evaluated in terms of accuracy and processing speed through a series of experiments, and the results show the effectiveness of the proposed system and method.
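
The per-pixel frequency analysis step can be sketched as follows. The model is an assumption based on standard two-beam interference rather than on the abstract: for a surface height h (optical path difference 2h), the intensity recorded at optical frequency ν varies as cos(4πhν/c), so scanning ν in N equal steps Δν yields a fringe whose DFT peak at bin m corresponds to h = m·c / (2·N·Δν). A plain DFT replaces the FFT for brevity, and the names `peak_bin` and `height_from_fringe` are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def peak_bin(samples):
    """Index of the strongest non-DC DFT bin of one pixel's intensity series."""
    n = len(samples)
    mean = sum(samples) / n
    best_bin, best_mag = 0, 0.0
    for m in range(1, n // 2):
        coeff = sum((s - mean) * cmath.exp(-2j * math.pi * m * i / n)
                    for i, s in enumerate(samples))
        if abs(coeff) > best_mag:
            best_bin, best_mag = m, abs(coeff)
    return best_bin


def height_from_fringe(samples, freq_step_hz):
    """Height implied by the fringe frequency, assuming an optical path difference of 2h."""
    return peak_bin(samples) * C / (2 * len(samples) * freq_step_hz)


# Synthetic pixel: 64 frames, 5 fringe cycles over the scan, arbitrary phase offset.
frames = [1.0 + 0.5 * math.cos(2 * math.pi * 5 * i / 64 + 0.3) for i in range(64)]
print(peak_bin(frames), height_from_fringe(frames, freq_step_hz=1e9))
```

Every pixel's intensity series is transformed independently of the others, so the workload is one small transform per pixel, which is the kind of data parallelism that GPU/CUDA acceleration targets.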