• 제목/요약/키워드: CPU Processing Time

검색결과 332건 처리시간 0.024초

Real-time Ray-tracing Chip Architecture

  • Yoon, Hyung-Min;Lee, Byoung-Ok;Cheong, Cheol-Ho;Hur, Jin-Suk;Kim, Sang-Gon;Chung, Woo-Nam;Lee, Yong-Ho;Park, Woo-Chan
    • IEIE Transactions on Smart Processing and Computing
    • /
    • 제4권2호
    • /
    • pp.65-70
    • /
    • 2015
  • In this paper, we describe the world's first real-time ray-tracing chip architecture. Ray-tracing technology generates high-quality 3D graphics images better than current rasterization technology by providing four essential light effects: shadow, reflection, refraction and transmission. The real-time ray-tracing chip named RayChip includes a real-time ray-tracing graphics processing unit and an accelerating tree-building unit. An ARM Ltd. central processing unit (CPU) and other peripherals are also included to support all processes of 3D graphics applications. Using the accelerating tree-building unit named RayTree to minimize the CPU load, the chip uses a low-end CPU and decreases both silicon area and power consumption. The evaluation results with RayChip show appropriate performance to support real-time ray tracing in high-definition (HD) resolution, while the rendered images are scaled to full HD resolution. The chip also integrates the Linux operating system and the familiar OpenGL for Embedded Systems application programming interface for easy application development.

Implementation and Performance Evaluation of a Video-Equipped Real-Time Fire Detection Method at Different Resolutions using a GPU (GPU를 이용한 다양한 해상도의 비디오기반 실시간 화재감지 방법 구현 및 성능평가)

  • Shon, Dong-Koo;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • 제20권1호
    • /
    • pp.1-10
    • /
    • 2015
  • In this paper, we propose an efficient parallel implementation method of a widely used complex four-stage fire detection algorithm using a graphics processing unit (GPU) to improve the performance of the algorithm and analyze the performance of the parallel implementation method. In addition, we use seven different resolution videos (QVGA, VGA, SVGA, XGA, SXGA+, UXGA, QXGA) as inputs of the four-stage fire detection algorithm. Moreover, we compare the performance of the GPU-based approach with that of the CPU implementation for each different resolution video. Experimental results using five different fire videos with seven different resolutions indicate that the execution time of the proposed GPU implementation outperforms that of the CPU implementation in terms of execution time and takes a 25.11ms per frame for the UXGA resolution video, satisfying real-time processing (30 frames per second, 30fps) of the fire detection algorithm.

Real-time FCWS implementation using CPU-FPGA architecture (CPU-FPGA 구조를 이용한 실시간 FCWS 구현)

  • Han, Sungwoo;Jeong, Yongjin
    • Journal of IKEEE
    • /
    • 제21권4호
    • /
    • pp.358-367
    • /
    • 2017
  • Advanced Driver Assistance Systems(ADAS), such as Front Collision Warning System (FCWS) are currently being developed. FCWS require high processing speed because it must operate in real time while driving. In addition, a low-power system is required to operate in an automobile embedded system. In this paper, FCWS is implemented in CPU-FPGA architecture in embedded system to enable real-time processing. The lane detection enabled the use of the Inverse Transform Perspective (IPM) and sliding window methods to operate at fast speed. To detect the vehicle, a Convolutional Neural Network (CNN) with high recognition rate and accelerated by parallel processing in FPGA is used. The proposed architecture was verified using Intel FPGA Cyclone V SoC(System on Chip) with ARM-Core A9 which operates in low power and on-board FPGA. The performance of FCWS in HD resolution is 44FPS, which is real time, and energy efficiency is about 3.33 times higher than that of high performance PC enviroment.

Performance Comparison of Join Operations Parallelization by using GPGPU (GPGPU 기반 조인 연산 병렬화 성능 비교)

  • Lee, Jong-Sub;Lee, Sang-Back;Lee, Kyu-Chul
    • Database Research
    • /
    • 제34권3호
    • /
    • pp.28-44
    • /
    • 2018
  • In a database system, the most expensive operation among relational operations is a join operation. Generally, CPU-based join operations uses parallel processing with either 1 core or 16 cores at most, which does not significantly improve the function. On the other hand, GPGPU(General-Purpose computing on Graphics Processing Units) allows parallel processing through thousands of processing units, greatly reducing the time required to perform join operations. Parallelization of the operation using GPGPU uses NVIDIA's CUDA SDK. In this paper, we implement parallelization of the join operation using GPGPU and compare the performances. The used join operations are Nested Loop Join (NLJ), Sort Merge Join (SMJ) and Hash Join (HJ), and GPGPU equipment uses TITAN Xp, GTX 1080 Ti and GTX 1080. We measure and compare the performance of join operations based on CPU and GPGPU. We compare this performance with the performance of the previous study on the join operation based on GPGPU. The results of experiment show that the performance based on GPGPU is 6~328 times faster than the one based on CPU.

An Estimation Scheme on Processing Time and Processor Utilization for Real-Time System Development (실시간 시스템 개발을 위한 데이터 처리 시간과 프로세서 사용율 추정 기법)

  • Kim, Han-Dong;Choi, Tae-Bong;Ko, Soon-Ju
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 한국정보과학회 2005년도 한국컴퓨터종합학술대회 논문집 Vol.32 No.1 (A)
    • /
    • pp.820-822
    • /
    • 2005
  • The current paper is on a study of the performance estimation fer data processing time and CPU utilization to efficiently develop the real-time system. The analytical modeling and OPNET modeling and benchmarking tests are applied to perform the estimation for data processing time and CPU utilization in real-time system. We demonstrate that the estimation results can be predicted fairly and accurately through the benchmarking test results although there is a small variance between the estimation results and the benchmarking test results.

  • PDF

FPGA-based design and implementation of data acquisition and real-time processing for laser ultrasound propagation

  • Abbas, Syed Haider;Lee, Jung-Ryul;Kim, Zaeill
    • International Journal of Aeronautical and Space Sciences
    • /
    • 제17권4호
    • /
    • pp.467-475
    • /
    • 2016
  • Ultrasonic propagation imaging (UPI) has shown great potential for detection of impairments in complex structures and can be used in wide range of non-destructive evaluation and structural health monitoring applications. The software implementation of such algorithms showed a tendency in time-consumption with increment in scan area because the processor shares its resources with a number of programs running at the same time. This issue was addressed by using field programmable gate arrays (FPGA) that is a dedicated processing solution and used for high speed signal processing algorithms. For this purpose, we need an independent and flexible block of logic which can be used with continuously evolvable hardware based on FPGA. In this paper, we developed an FPGA-based ultrasonic propagation imaging system, where FPGA functions for both data acquisition system and real-time ultrasonic signal processing. The developed UPI system using FPGA board provides better cost-effectiveness and resolution than digitizers, and much faster signal processing time than CPU which was tested using basic ultrasonic propagation algorithms such as ultrasonic wave propagation imaging and multi-directional adjacent wave subtraction. Finally, a comparison of results for processing time between a CPU-based UPI system and the novel FPGA-based system were presented to justify the objective of this research.

Twowheeled Motor Vehicle License Plate Recognition Algorithm using CPU based Deep Learning Convolutional Neural Network (CPU 기반의 딥러닝 컨볼루션 신경망을 이용한 이륜 차량 번호판 인식 알고리즘)

  • Kim Jinho
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • 제19권4호
    • /
    • pp.127-136
    • /
    • 2023
  • Many research results on the traffic enforcement of illegal driving of twowheeled motor vehicles using license plate recognition are introduced. Deep learning convolutional neural networks can be used for character and word recognition of license plates because of better generalization capability compared to traditional Backpropagation neural networks. In the plates of twowheeled motor vehicles, the interdependent government and city words are included. If we implement the mutually independent word recognizers using error correction rules for two word recognition results, efficient license plate recognition results can be derived. The CPU based convolutional neural network without library under real time processing has an advantage of low cost real application compared to GPU based convolutional neural network with library. In this paper twowheeled motor vehicle license plate recognition algorithm is introduced using CPU based deep-learning convolutional neural network. The experimental results show that the proposed plate recognizer has 96.2% success rate for outdoor twowheeled motor vehicle images in real time.

Performance Improvement of a Real-time Traffic Identification System on a Multi-core CPU Environment (멀티 코어 환경에서 실시간 트래픽 분석 시스템 처리속도 향상)

  • Yoon, Sung-Ho;Park, Jun-Sang;Kim, Myung-Sup
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • 제37권5B호
    • /
    • pp.348-356
    • /
    • 2012
  • The application traffic analysis is getting more and more challenging due to the huge amount of traffic from high-speed network link and variety of applications running on wired and wireless Internet devices. Multi-level combination of various analysis methods is desired to achieve high completeness and accuracy of analysis results for a real-time analysis system, while requires much of processing burden on the contrary. This paper proposes a novel architecture for a real-time traffic analysis system which improves the processing performance on multi-core CPU environment. The main contribution of the proposed architecture is an efficient parallel processing mechanism with multiple threads of various analysis methods. The feasibility of the proposed architecture was proved by implementing and deploying it on our campus network.

Implementation and Performance Evaluation of the Faddev-Leverrier Algorithm using GPGPU (GPGPU를 이용한 파데브-레브리어 알고리즘 구현 및 성능 분석)

  • Park, Yong-Hun;Kim, Cheol-Hong;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • 제8권3호
    • /
    • pp.171-178
    • /
    • 2013
  • In this paper, we implement the Faddev-Leverier algorithm using GPGPU (General-Purpose Graphics Processing Unit) to accelerate singular value decomposition. In addition, we compare the performance of the algorithm using CPU and CPU plus GPGPU for eleven ${\times}n$ matrix sizes in order to decompose singular values, where =4, 8, 16, 32, 64, 128, 256, 512, 1,024, 2,048, and 4,096. Experimental results indicate that CPU achieves better performance than CPU plus GPGPU for $n{\leq}64$ because of a large number of read and write operations between CPU and GPGPU. However, CPU plus GPGPU outperforms CPU exponentially in the execution time for $n{\geq}64$.

Real-Time Scheduling Method to assign Virtual CPU in the Multocore Mobile Virtualization System (멀티코아 모바일 가상화 시스템에서 가상 CPU 할당 실시간 스케줄링 방법)

  • Kang, Yongho;Keum, Kimoon;Kim, Seongjong;Jin, Kwangyoun;Kim, Jooman
    • Journal of Digital Convergence
    • /
    • 제12권3호
    • /
    • pp.227-235
    • /
    • 2014
  • Mobile virtualization is an approach to mobile device management in which two virtual platforms are installed on a single wireless device. A smartphone, a single wireless device, might have one virtual environment for business use and one for personal use. Mobile virtualization might also allow one device to run two different operating systems, allowing the same phone to run both RTOS and Android apps. In this paper, we propose the techniques to virtualize the cores of a multicore, allowing the reassign any number of vCPUs that are exposed to a OS to any subset of the pCPUs. And then we also propose the real-time scheduling method to assigning the vCPUs to the pCPU. Suggested technology in this paper solves problem that increases time of real-time process when interrupt are handled, and is able more to fast processing than previous algorithm.