• Title/Summary/Keyword: Graphics acceleration


Pose Calibration of Inertial Measurement Units on Joint-Constrained Rigid Bodies (관절체에 고정된 관성 센서의 위치 및 자세 보정 기법)

  • Kim, Sinyoung;Kim, Hyejin;Lee, Sung-Hee
    • Journal of the Korea Computer Graphics Society / v.19 no.4 / pp.13-22 / 2013
  • Motion capture systems are widely used in the movie, computer game, and computer animation industries because they allow realistic human motions to be created efficiently. Inertial motion capture systems have several advantages over the more popular vision-based systems in terms of required space and cost. However, they suffer from low accuracy due to the relatively high noise levels of inertial sensors. In particular, the accelerometer used for measuring the gravity direction loses accuracy when the sensor is moving with non-zero linear acceleration. In this paper, we propose a method to remove the linear acceleration component from the accelerometer data in order to improve the accuracy of measuring the gravity direction. In addition, we develop a simple method to calibrate the joint axis of the link to which an inertial sensor is attached, as well as the position of the sensor with respect to the link. The calibration enables attaching inertial sensors at an arbitrary position and orientation with respect to a link.

GPGPU Acceleration of SAT Algorithm with Propagation Routine Parallelization (전달 루틴의 병렬화를 통한 SAT 알고리즘의 GPGPU 가속화)

  • Kang, Hyeong-Ju
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.10 / pp.1919-1926 / 2016
  • Because of its enormous processing ability, the General-Purpose Graphics Processing Unit (GPGPU) has been applied to many fields, including electronic design automation. The SAT algorithm is one of the core algorithms in many electronic design automation tools. There have been some efforts to apply GPGPUs to the SAT algorithm, but the algorithm is difficult to parallelize because of its characteristics. In this paper, I apply a GPGPU to the SAT algorithm by parallelizing the propagation routine, which is relatively well suited to parallel processing. On the basis of the similarity of the propagation routine to sparse matrix multiplication, the data structure for the SAT problem is constructed and the parallel propagation routine is described. To prevent data loss between parallel threads, atomic operations are used. The experimental results for some benchmark SAT problems show that the proposed algorithm is superior to the previous GPGPU-based SAT solver.
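
The propagation routine mentioned in the abstract above can be illustrated with a small serial sketch. This is not the paper's GPU kernel: the CSR-like clause layout (`row_ptr`, `lits`), the signed-integer literal encoding, and the function name `propagate` are assumptions made for illustration. Each clause is stored like one row of a sparse matrix, and the loop over clauses is the part that a GPU version would presumably hand out to parallel threads, with atomic writes to the shared assignment.

```python
def propagate(row_ptr, lits, assign):
    """Serial unit propagation over clauses stored in a CSR-like layout.

    row_ptr[c]..row_ptr[c+1] indexes the literals of clause c in `lits`.
    A literal is a signed int: +v means variable v, -v means NOT v.
    `assign` maps variable -> bool and is updated in place.
    Returns False if some clause becomes empty (conflict), else True.
    """
    changed = True
    while changed:
        changed = False
        # A GPU version would map this loop onto threads (assumption).
        for c in range(len(row_ptr) - 1):
            satisfied = False
            unassigned = []
            for k in range(row_ptr[c], row_ptr[c + 1]):
                lit = lits[k]
                val = assign.get(abs(lit))
                if val is None:
                    unassigned.append(lit)
                elif val == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return False  # every literal is false: conflict
            if len(unassigned) == 1:
                # Unit clause: the remaining literal is forced to be true.
                lit = unassigned[0]
                assign[abs(lit)] = lit > 0
                changed = True
    return True


# Clauses: (x1), (NOT x1 OR x2), (NOT x2 OR x3)
assignment = {}
ok = propagate([0, 1, 3, 5], [1, -1, 2, -2, 3], assignment)
```

In this example the single unit clause forces x1, which in turn forces x2 and then x3, so `ok` is true and all three variables end up assigned true.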

A Modified Method for Registration of 3D Point Clouds with a Low Overlap Ratio (적은 오버랩에서 사용 가능한 3차원 점군 정합 방법)

  • Kim, Jigun;Lee, Junhee;Park, Sangmin;Ko, Kwanghee
    • Journal of the Korea Computer Graphics Society / v.24 no.5 / pp.11-19 / 2018
  • In this paper, we propose an algorithm for improving the accuracy and rate of convergence when two noisy point clouds with a low overlapping area are registered to each other. We make the most of the geometric information of the geometry underlying the noisy point clouds for better accuracy. We select a reasonable region from the noisy point cloud for registration and combine it with a modified acceleration algorithm to improve speed. The conventional accuracy improvement method does not work under heavy noise; this paper resolves the problem by selecting a reasonable region for registration. This paper also applies an acceleration algorithm for a clone to a low-overlap point cloud pair. A simple algorithm is added to the conventional method, which makes it 3 or 4 times faster. In conclusion, this algorithm was developed to improve both the speed and accuracy of point cloud registration in noisy, low-overlap cases.

Implementation of OpenVG on Embedded Systems (임베디드 시스템을 위한 OpenVG 구현)

  • Lee, Hwan-Yong;Baek, Nak-Hoon
    • Journal of Korea Multimedia Society / v.12 no.3 / pp.335-344 / 2009
  • Embedded systems and web browsers have started to provide two-dimensional vector graphics features to support scalable graphics output, while traditional graphics systems have focused on raster and bitmap operations. Nowadays, SVG and Flash are actively used, while OpenVG from the Khronos Group plays the role of a de facto low-level API standard to support them. In this paper, we present the design and implementation process and the final results of an OpenVG implementation, AlexVG. From its design stage, our implementation has aimed at cooperation with SVG-Tiny, another de facto standard for embedded systems. Currently, our overall system provides not only the OpenVG core features but also a variety of OpenVG application programs and SVG-Tiny media file playing capabilities. For conformance with the standard specifications, our system completely passed the whole OpenVG conformance test suite and the graphics output portions of the SVG-Tiny conformance test suite. From the performance point of view, we focused on efficiency and effectiveness, especially on mobile phones and embedded devices with limited resources. As a result, it showed impressive benchmarks on small-scale CPUs such as ARM processors, even without any other libraries or acceleration hardware.

Parallel Implementation of the Recursive Least Square for Hyperspectral Image Compression on GPUs

  • Li, Changguo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.7 / pp.3543-3557 / 2017
  • Compression is a very important technique for remotely sensed hyperspectral images. Lossless compression based on the recursive least square (RLS), which eliminates the redundancy of hyperspectral images using both spatial and spectral correlations, is an extremely powerful tool for this purpose, but its relatively high computational complexity limits its application to time-critical scenarios. In order to improve the computational efficiency of the algorithm, we optimize its serial version and develop a new parallel implementation on graphics processing units (GPUs). Namely, an optimized recursive least square based on an optimal number of prediction bands is first introduced. Then we use this approach as a case study to illustrate the advantages and potential challenges of applying GPU parallel optimization principles to the considered problem. The proposed parallel method properly exploits the low-level architecture of GPUs and has been implemented using the compute unified device architecture (CUDA). The GPU parallel implementation is compared with the serial implementation on a CPU. Experimental results indicate remarkable acceleration factors and real-time performance, while retaining exactly the same bit rate as the serial version of the compressor.

CGRA Compilation Boost up for Acceleration of Graphics (영상처리 가속을 위한 CGRA compilation 속도 향상)

  • Kim, Wonsub;Choi, Yoonseo;Kim, Jaehyun
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2014.06a / pp.166-168 / 2014
  • Coarse-grained reconfigurable architectures (CGRAs) offer the potential for high compute throughput with energy efficiency. A CGRA consists of an array of functional units (FUs), which communicate with each other through an interconnect network containing transmission nodes and register files. To achieve high performance from the software solutions mapped onto CGRAs, modulo scheduling of loops is generally employed. One of the key challenges in modulo scheduling for CGRAs is to explicitly handle the routing of operands from a source operation to a destination operation through various routing resources. Existing modulo schedulers for CGRAs are slow because finding a valid routing is generally a search problem over a large space, even with the guidance of well-defined cost metrics. Applications in traditional embedded multimedia domains are regarded as relatively tolerant of a slow compile time in exchange for a high-quality solution. However, many rapidly growing application domains, such as 3D graphics, require fast compilation. The entry of CGRAs into these domains has been blocked mainly by their long compile times. We attack this problem by utilizing patternized routes, for which the resources and time slots needed for success can be estimated in advance when a source operation is placed. By conservatively reserving predefined resources at predefined time slots, future routings originating from the source operation are guaranteed. Experiments on a real-world 3D graphics benchmark suite show that our scheduler improves the compile time by up to 6000 times while achieving, on average, 70% of the throughput of the state-of-the-art CGRA modulo scheduler, the edge-centric modulo scheduler (EMS).

Acceleration for Removing Sea-fog using Graphic Processors and Parallel Processing (그래픽 프로세서를 이용한 병렬연산 기반 해무 제거 고속화)

  • Kim, Young-doo;Kwak, Jae-min;Seo, Young-ho;Choi, Hyun-jun
    • Journal of Advanced Navigation Technology / v.21 no.5 / pp.485-490 / 2017
  • In this paper, we propose a technique for high-speed removal of sea-fog using graphics processors. The technique uses a host processor (CPU) and several graphics processors (GPUs) capable of parallel processing to remove sea-fog from the input image. In the removal process, the dark channel extraction, the maximum brightness channel extraction, and the calculation of the transmission are performed by the host processor, and the refinement of the transmission by applying the bidirectional filter is performed in parallel on the graphics processors. To verify the proposed parallel processing method, a verification environment was built with three NVIDIA GTX 1070 GPUs. The processing takes about 140 ms when implemented with one graphics processor and 26 ms when implemented using OpenMP and multiple GPGPUs. The proposed GPU-based parallel processing algorithm can be used for safe navigation, port control, and monitoring systems.
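
The dark channel extraction and transmission calculation named in the abstract above can be sketched as follows. This is a minimal illustration of the widely used dark channel prior formulation, which is assumed here and may differ from the paper's exact steps. The function names, the nested-list image layout, the scalar `airlight`, and the `omega` and `radius` parameters are all hypothetical, and the filter-based refinement stage that the paper runs on the GPUs is not shown.

```python
def dark_channel(img, radius=1):
    """Minimum over colour channels, then minimum over a square window.

    img[y][x] is an (r, g, b) tuple with values in [0, 1].
    """
    h, w = len(img), len(img[0])
    per_pixel_min = [[min(px) for px in row] for row in img]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            out[y][x] = min(per_pixel_min[j][i] for j in ys for i in xs)
    return out


def transmission(img, airlight, omega=0.95, radius=1):
    """Coarse transmission estimate: t = 1 - omega * dark(I / A).

    `airlight` is a scalar atmospheric light estimate (an assumption;
    a per-channel estimate would also be possible).
    """
    normalized = [[tuple(c / airlight for c in px) for px in row] for row in img]
    dark = dark_channel(normalized, radius)
    return [[1.0 - omega * d for d in row] for row in dark]


image = [
    [(0.9, 0.8, 0.7), (0.5, 0.6, 0.4)],
    [(0.3, 0.9, 0.9), (1.0, 1.0, 1.0)],
]
dark_local = dark_channel(image, radius=0)
dark_window = dark_channel(image, radius=1)
t_map = transmission(image, airlight=1.0, omega=1.0, radius=0)
```

Every output pixel depends only on a small input window, which is why stages of this kind are natural candidates for per-pixel parallel execution.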

Research of accelerating method of video quality measurement program using GPGPU (GPGPU를 이용한 영상 품질 측정 프로그램의 가속화 연구)

  • Lee, Seonguk;Byeon, Gibeom;Kim, Kisu;Hong, Jiman
    • Smart Media Journal / v.5 no.4 / pp.69-74 / 2016
  • Recently, parallel computing using GPGPU (General-Purpose computing on Graphics Processing Units) has been expanding with the development of graphics processing units. It can achieve faster processing speeds than traditional computing environments across many fields, including science, medicine, engineering, and analysis. However, there are many constraints in using GPU technology to implement a parallel program. In this paper, we port a CPU-based program (Video Quality Measurement Program) to use GPU technology. The GPU-based port runs at about 1.83 times the execution speed of the CPU-based program. We study the acceleration of the GPU-based program. We also discuss the technical constraints and problems that occur when converting CPU-based programs to GPU-based programs.

A Study on the Reduction in VR Cybersickness using an Interactive Wind System (Interactive Wind System을 이용한 VR 사이버 멀미 개선 연구)

  • Lim, Dojeon;Lee, Yewon;Cho, Yesol;Ryoo, Taedong;Han, Daseong
    • Journal of the Korea Computer Graphics Society / v.27 no.3 / pp.43-53 / 2021
  • This paper presents an interactive wind system that generates artificial winds in a virtual reality (VR) environment according to online user inputs from a steering wheel and an acceleration pedal. Our system is composed of a head-mounted display (HMD) and three electric fans to make the user sense touch from the winds blowing from three different directions in a racing car VR application. To evaluate the effectiveness of the winds for reducing VR cybersickness, we employ the simulator sickness questionnaire (SSQ), which is one of the most common measures for cybersickness. We conducted experiments on 13 subjects for the racing car contents first with the winds and then without them or vice versa. Our results showed that the VR contents with the artificial winds clearly reduce cybersickness while providing a positive user experience.

GPU-Based ECC Decode Unit for Efficient Massive Data Reception Acceleration

  • Kwon, Jisu;Seok, Moon Gi;Park, Daejin
    • Journal of Information Processing Systems / v.16 no.6 / pp.1359-1371 / 2020
  • In transmitting and receiving such a large amount of data, reliable data communication is crucial for normal operation of a device and to prevent abnormal operations caused by errors. Therefore, in this paper, it is assumed that an error correction code (ECC) that can detect and correct errors by itself is used in an environment where massive data is sequentially received. Because an embedded system has limited resources, such as a low-performance processor or a small memory, it requires efficient operation of applications. In this paper, we propose using an accelerated ECC-decoding technique with a graphics processing unit (GPU) built into the embedded system when receiving a large amount of data. In the matrix-vector multiplication that forms the Hamming code used as a function of the ECC operation, the matrix is expressed in compressed sparse row (CSR) format, and a sparse matrix-vector product is used. The multiplication operation is performed in the kernel of the GPU, and we also accelerate the Hamming code computation so that the ECC operation can be performed in parallel. The proposed technique is implemented with CUDA on a GPU-embedded target board, the NVIDIA Jetson TX2, and its execution time is compared with that of the CPU.
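
The CSR-based sparse matrix-vector product described in the abstract above can be illustrated with a small serial sketch. The Hamming(7,4) code, the particular parity-check matrix layout (column j holds the binary representation of j), and the names `ROW_PTR`, `COL_IDX`, `syndrome`, and `correct` are assumptions chosen for illustration. The paper's code parameters and CUDA kernel are not reproduced here.

```python
# Parity-check matrix H of a Hamming(7,4) code in CSR form (assumed layout).
# All non-zero entries are 1, so only row pointers and column indices are kept.
ROW_PTR = [0, 4, 8, 12]
COL_IDX = [0, 2, 4, 6,   # row 0: positions 1, 3, 5, 7
           1, 2, 5, 6,   # row 1: positions 2, 3, 6, 7
           3, 4, 5, 6]   # row 2: positions 4, 5, 6, 7


def syndrome(received):
    """Sparse matrix-vector product H * r over GF(2).

    Each row is independent, so a GPU version could compute one row
    (or one received word) per thread (assumption).
    """
    result = []
    for row in range(len(ROW_PTR) - 1):
        acc = 0
        for k in range(ROW_PTR[row], ROW_PTR[row + 1]):
            acc ^= received[COL_IDX[k]]  # addition modulo 2
        result.append(acc)
    return result


def correct(received):
    """Fix at most one flipped bit and return the corrected word."""
    s = syndrome(received)
    position = s[0] + 2 * s[1] + 4 * s[2]  # 1-based error position, 0 = no error
    fixed = list(received)
    if position:
        fixed[position - 1] ^= 1
    return fixed


codeword = [1, 1, 1, 0, 0, 0, 0]   # satisfies H * c = 0 for this H
corrupted = [1, 1, 1, 0, 1, 0, 0]  # bit at position 5 flipped
recovered = correct(corrupted)
```

With this layout the syndrome, read as a binary number, is the 1-based position of the flipped bit, so correction is a single bit flip.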