• Title/Summary/Keyword: GPU algorithm

A Study on Biomass Estimation Technique of Invertebrate Grazers Using Multi-object Tracking Model Based on Deep Learning (딥러닝 기반 다중 객체 추적 모델을 활용한 조식성 무척추동물 현존량 추정 기법 연구)

  • Bak, Suho;Kim, Heung-Min;Lee, Heeone;Han, Jeong-Ik;Kim, Tak-Young;Lim, Jae-Young;Jang, Seon Woong
    • Korean Journal of Remote Sensing / v.38 no.3 / pp.237-250 / 2022
  • In this study, we propose a method for estimating the biomass of invertebrate grazers from videos captured by underwater drones, using a deep-learning-based multi-object tracking model. To detect invertebrate grazers by class, we used YOLOv5 (You Only Look Once version 5). For biomass estimation, we used DeepSORT (Deep Simple Online and Realtime Tracking). The performance of each model was evaluated on a workstation with a GPU accelerator. YOLOv5 achieved a mean Average Precision (mAP) of 0.9 or higher, and the combination of the YOLOv5s model and the DeepSORT algorithm ran at about 59 fps at 4K resolution. When the proposed method was applied in the field, biomass tended to be overestimated by about 28%, but the error was lower than that of biomass estimation using the object detection model alone. A follow-up study is needed to improve accuracy in cases where frames stay out of focus for an extended period or the underwater drone turns rapidly. Once these issues are addressed, the method can be used to produce decision-support data for the control and monitoring of invertebrate grazers.
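
To illustrate why adding a tracker can lower the counting error relative to detection alone, the following minimal sketch counts individuals from tracker output. The record layout `(frame, track_id, class_name)`, the class names, the function names, and the `min_frames` filter are all hypothetical choices made for illustration; they are not taken from the paper.

```python
from collections import defaultdict


def count_by_detection(records):
    """Naive count: every per-frame detection is treated as a new individual."""
    counts = defaultdict(int)
    for _frame, _track_id, cls in records:
        counts[cls] += 1
    return dict(counts)


def count_by_tracking(records, min_frames=2):
    """Tracking-based count: one individual per track ID.

    Tracks seen in fewer than `min_frames` distinct frames are dropped
    as likely spurious (an assumed heuristic, not the paper's rule).
    """
    frames_per_track = defaultdict(set)
    class_of_track = {}
    for frame, track_id, cls in records:
        frames_per_track[track_id].add(frame)
        class_of_track[track_id] = cls
    counts = defaultdict(int)
    for track_id, frames in frames_per_track.items():
        if len(frames) >= min_frames:
            counts[class_of_track[track_id]] += 1
    return dict(counts)


# Hypothetical tracker output: (frame index, track ID, class name).
records = [
    (0, 1, "sea_urchin"), (1, 1, "sea_urchin"), (2, 1, "sea_urchin"),
    (1, 2, "turban_snail"), (2, 2, "turban_snail"),
    (2, 3, "sea_urchin"),  # single-frame track, filtered out below
]

print(count_by_detection(records))  # {'sea_urchin': 4, 'turban_snail': 2}
print(count_by_tracking(records))   # {'sea_urchin': 1, 'turban_snail': 1}
```

The same animal seen in several frames inflates the detection-only count, whereas the track-based count collapses those sightings into one individual.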

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun;Yang, Seong-Hun;Oh, Seung-Jin;Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the demand for analyzing and using that data. Because many industries lack skilled personnel to analyze videos, machine learning and artificial intelligence are actively used to assist them. As a result, demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and re-identification (Re-ID) has also grown rapidly. However, object detection and tracking face many difficulties that degrade performance, such as occlusion and the re-appearance of an object after it has left the recording location. Consequently, action and emotion detection models built on object detection and tracking models also have difficulty extracting data for each object. In addition, deep learning architectures composed of multiple models suffer from performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed model uses Re-ID based on single-linkage hierarchical clustering together with processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, offers near-real-time processing performance, and prevents tracking failures caused by object departure and re-emergence, occlusion, and similar events. By continuously linking the action and facial emotion detection results of each object to the same object, videos can be analyzed efficiently.
The re-identification model extracts a feature vector from the bounding box of each object detected by the object tracking model in each frame, and applies single-linkage hierarchical clustering against feature vectors from past frames to identify objects whose tracking has failed. Through this process, an object that was lost because of occlusion, or because it left the recording location and re-appeared, can be re-tracked as the same object. As a result, the action and facial emotion detection results of an object that was newly recognized because of a tracking failure can be linked to those of the object that appeared in the past. To improve processing performance, we introduce the Bounding Box Queue by Object and Feature Queue methods, which reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which links facial emotions recognized through AWS Rekognition with object tracking information. The academic significance of this study is that, thanks to these processing techniques, the two-stage re-identification model achieves real-time performance even in a high-cost environment that performs action and facial emotion detection, without sacrificing accuracy by resorting to simple metrics. The practical implication is that industrial fields that require action and facial emotion detection, but struggle with object tracking failures, can analyze videos effectively with the proposed model. With its high re-tracking accuracy and processing performance, the proposed model can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, an experiment should be conducted using the MOT Challenge dataset, which is used by many international conferences. We will investigate the problems that the IoF algorithm cannot solve in order to develop a complementary algorithm. In addition, we plan to conduct further research on applying this model to datasets from various fields related to intelligent video analysis.
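
The abstract names the IoF (Intersection over Face) algorithm without defining it. The sketch below shows one plausible reading, assumed here purely for illustration: the overlap between a face box and a tracked person box, divided by the area of the face box, with the face assigned to the best-scoring track. The box format `(x1, y1, x2, y2)`, the function names, and the `threshold` value are hypothetical.

```python
def intersection_over_face(face, person):
    """Overlap area between the two boxes divided by the face box area.

    Boxes are (x1, y1, x2, y2) tuples. This definition is an assumption
    inferred from the name "Intersection over Face".
    """
    ix1, iy1 = max(face[0], person[0]), max(face[1], person[1])
    ix2, iy2 = min(face[2], person[2]), min(face[3], person[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    face_area = (face[2] - face[0]) * (face[3] - face[1])
    return inter / face_area if face_area > 0 else 0.0


def link_face_to_track(face, tracks, threshold=0.5):
    """Return the ID of the track whose box covers the face best.

    Returns None when no track reaches `threshold` (a hypothetical cutoff).
    """
    best_id, best_score = None, threshold
    for track_id, box in tracks.items():
        score = intersection_over_face(face, box)
        if score >= best_score:
            best_id, best_score = track_id, score
    return best_id


# Hypothetical tracked person boxes keyed by track ID, and one face box.
tracks = {7: (100, 50, 200, 400), 9: (190, 60, 300, 420)}
face = (130, 60, 170, 110)

print(link_face_to_track(face, tracks))  # 7
```

Dividing by the face area rather than the union keeps the score high when a small face lies entirely inside a much larger person box, a case in which ordinary IoU would be close to zero.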

GPU Based Feature Profile Simulation for Deep Contact Hole Etching in Fluorocarbon Plasma

  • Im, Yeon-Ho;Chang, Won-Seok;Choi, Kwang-Sung;Yu, Dong-Hun;Cho, Deog-Gyun;Yook, Yeong-Geun;Chun, Poo-Reum;Lee, Se-A;Kim, Jin-Tae;Kwon, Deuk-Chul;Yoon, Jung-Sik;Kim, Dae-Woong;You, Shin-Jae
    • Proceedings of the Korean Vacuum Society Conference / 2012.08a / pp.80-81 / 2012
  • Recently, one of the critical issues in etching processes for nanoscale devices has been to achieve ultra-high aspect ratio contact (UHARC) profiles without anomalous behaviors such as sidewall bowing and profile twisting. To achieve this goal, fluorocarbon plasmas, whose major advantage is sidewall passivation, have commonly been used with numerous additives to obtain ideal etch profiles. However, they still face formidable challenges, such as tight limits on sidewall bowing and the control of randomly distorted features in nanoscale etch profiles. Furthermore, the lack of available plasma simulation tools has made it difficult to develop revolutionary technologies, including novel plasma chemistries and plasma sources, to overcome these process limitations. To address these issues, we performed fluorocarbon surface kinetic modeling based on experimental plasma diagnostic data for a silicon dioxide etching process under inductively coupled C4F6/Ar/O2 plasmas. For this work, SiO2 etch rates were investigated with bulk plasma diagnostic tools such as a Langmuir probe, a cutoff probe, and a quadrupole mass spectrometer (QMS). The surface chemistries of the etched samples were measured with an X-ray photoelectron spectrometer. To measure plasma parameters, a self-cleaned RF Langmuir probe was used to cope with polymer deposition on the probe tip, and the results were double-checked with the cutoff probe, which is known to be a precise diagnostic tool for electron density measurement. In addition, neutral and ion fluxes from the bulk plasma were monitored with appearance methods using the QMS signal. Based on these experimental data, we proposed a phenomenological and realistic two-layer surface reaction model of the SiO2 etch process under the overlying polymer passivation layer, considering the material balance of deposition and etching through a steady-state fluorocarbon layer.
The predicted surface reaction modeling results showed good agreement with the experimental data. Building on these studies of plasma surface reactions, we developed a 3D topography simulator using a multi-layer level set algorithm and a new memory-saving technique, which is suitable for 3D UHARC etch simulation. Ballistic transport of neutral and ion species inside the feature profile was treated with deterministic and Monte Carlo methods, respectively. For ultra-high aspect ratio contact hole etching, it is well known that realistic treatment of this ballistic transport imposes a huge computational burden. To address this issue, the related computational codes were parallelized for GPU (Graphics Processing Unit) computing, which made the total computation more than a few hundred times faster than the serial version. Finally, the 3D topography simulator was integrated with the ballistic transport module and the etch reaction model. Realistic etch-profile simulations that account for the sidewall polymer passivation layer were demonstrated.

An Accelerated Approach to Dose Distribution Calculation in Inverse Treatment Planning for Brachytherapy (근접 치료에서 역방향 치료 계획의 선량분포 계산 가속화 방법)

  • Byungdu Jo
    • Journal of the Korean Society of Radiology / v.17 no.5 / pp.633-640 / 2023
  • Static and dynamic modulated brachytherapy methods, which use radiation shielding to modulate the delivered dose distribution, have recently been developed. As a result, the number of parameters and the amount of data required for dose calculation in inverse treatment planning, and for treatment plan optimization algorithms suited to the new directional, intensity-modulated brachytherapy, are increasing. Although intensity-modulated brachytherapy enables accurate dose delivery, the additional parameters and data increase the time required for dose calculation. In this study, a CUDA-accelerated, GPU-based dose calculation algorithm was built to limit this increase in dose calculation time. The calculation was accelerated by parallelizing both the computation of the system matrix of the volume of interest and the dose calculation itself. All algorithms were run in the same computing environment, with an Intel (3.7 GHz, 6-core) CPU and a single NVIDIA GTX 1080 Ti graphics card, and only the dose calculation time was measured, excluding the time required for loading data from disk and for preprocessing. The results showed that the accelerated algorithm was about 30 times faster than the CPU-only calculation. The accelerated dose calculation algorithm can be expected to speed up treatment planning when new plans must be created to account for daily variations in applicator movement, as in adaptive radiotherapy, or when dose calculation must account for changing parameters, as in dynamically modulated brachytherapy.
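
The structure that makes this calculation parallelizable can be sketched in a few lines. In the sketch below, the dose in each voxel is a dot product between one row of a system matrix and the vector of dwell times, so every row (and every matrix entry) can be computed independently. The inverse-square kernel is a deliberately simplified stand-in chosen for illustration, not the dose model used in the paper, and all names and values are hypothetical.

```python
def system_matrix(voxels, dwell_positions):
    """Build a matrix of dose rate per unit dwell time.

    Entry [i][j] is the contribution of dwell position j to voxel i.
    The 1/r^2 kernel is a simplifying assumption for illustration only.
    """
    matrix = []
    for vx, vy, vz in voxels:
        row = []
        for sx, sy, sz in dwell_positions:
            r2 = (vx - sx) ** 2 + (vy - sy) ** 2 + (vz - sz) ** 2
            row.append(1.0 / max(r2, 1e-6))  # clamp to avoid division by zero
        matrix.append(row)
    return matrix


def dose(matrix, dwell_times):
    """Dose per voxel: an independent dot product for each matrix row.

    On a GPU, each row could be assigned to its own thread.
    """
    return [sum(a * t for a, t in zip(row, dwell_times)) for row in matrix]


# Hypothetical geometry (arbitrary units) and dwell times.
voxels = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
dwell_positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
dwell_times = [2.0, 1.0]

A = system_matrix(voxels, dwell_positions)
print(dose(A, dwell_times))  # [3.0, 0.625]
```

Because no row depends on any other, the work grows with the number of voxels times the number of dwell positions but maps directly onto many parallel threads.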

3D feature profile simulation for nanoscale semiconductor plasma processing

  • Im, Yeon Ho
    • Proceedings of the Korean Vacuum Society Conference / 2015.08a / pp.61.1-61.1 / 2015
  • Nanoscale semiconductor plasma processing has become one of the most challenging issues in the field because of the inherent complexity and limits of physicochemical fabrication routes. The mission of future and emerging plasma processing for next-generation semiconductor manufacturing is to achieve ideal nanostructures, such as 3D NAND cell arrays with ultra-high aspect ratios, cylinder capacitors, shallow trench isolation, and 3D logic devices, without abnormal profiles or damage. Despite significant contributions from the research frontier, these processes are still not well understood because of the inherent complexity of their physicochemical behavior, and gaps in academic research prevent their predictive simulation. To overcome these issues, a Korean plasma consortium was launched in 2009 with the principal aim of developing a realistic and ultrafast 3D topography simulator of semiconductor plasma processing coupled with zero-D bulk plasma models. In this work, aspects of this computational tool are introduced. The simulator is composed of a multiple 3D level-set based moving algorithm, a zero-D bulk plasma module that includes pulsed plasma processing, a 3D ballistic transport module, and a surface reaction module. The main rate coefficients in the bulk and surface reaction models were extracted from molecular simulations or by fitting experimental data from several diagnostic tools in an inductively coupled fluorocarbon plasma system. Furthermore, it is well known that realistic ballistic transport is a simulation bottleneck because of the brute-force computation required. In this work, parallel computing on graphics processing units was applied to reduce the computation time drastically, making computer-aided design of these processes feasible.
Finally, it is demonstrated that 3D feature profile simulations coupled with bulk plasma models can lead to better understanding of abnormal behaviors, such as necking, bowing, etch stops and twisting during high aspect ratio contact hole etch.

The Study on an Automated Generation Method of Road Drawings using Road Survey Vehicle (도로교통안전점검차량을 이용한 도로의 자동도면화 생성 연구)

  • Lee, Jun Seok;Yun, Duk Geun;Park, Jae Hong
    • International Journal of Highway Engineering / v.16 no.5 / pp.91-98 / 2014
  • PURPOSES : This study develops an automated road mapping system for road management using ARASEO (Automated Road Analysis and Safety Evaluation TOol). METHODS : The road survey van ARASEO was used to automatically generate highway drawings for Korea National Road No. 37. Generating highway drawings for road management requires information on highway alignment, road width, and road facilities such as safety barriers and road signs. The survey van therefore acquired and analyzed road width, median, and guardrail data using its rear-side laser sensor, and recognized traffic control signs and chevron signs from its front camera images. The highway alignment, which is the basic information for a highway drawing, was analyzed by acquiring position and attitude data every 1 m using the GPS and IMU sensors and a purpose-built algorithm. Finally, CAD-based drawing software was developed to produce highway drawings from the ARASEO analysis results. RESULTS : This study compared the surveyed road width with the drawing data. To draw the road, we wrote an AutoCAD ARX program that runs in the CAD menu interface. CONCLUSIONS : Using this program, the road center line and the horizontal and vertical ground plan drawings for every 500 m can be created automatically.

A Road Region Extraction Using OpenCV CUDA To Advance The Processing Speed (처리 속도 향상을 위해 OpenCV CUDA를 활용한 도로 영역 검출)

  • Lee, Tae-Hee;Hwang, Bo-Hyun;Yun, Jong-Ho;Choi, Myung-Ryul
    • Journal of Digital Convergence / v.12 no.6 / pp.231-236 / 2014
  • In this paper, we propose to speed up road region extraction, conventionally performed as serial processing on the host (PC), by adding parallel processing on the device (graphics card). OpenCV CUDA supports many parallel processing functions by combining conventional OpenCV with CUDA. In addition, when OpenCV and CUDA are combined, the configured OpenCV functions are optimized for the specifications of the user's device (graphics card). Thus, using OpenCV CUDA makes it easy to verify algorithms and to obtain simulation results. Experiments with OpenCV CUDA and an NVIDIA GeForce GTX 560 Ti graphics card show that the proposed method is about 3.09 times faster than the conventional method.

Acceleration techniques for GPGPU-based Maximum Intensity Projection (GPGPU 환경에서 최대휘소투영 렌더링의 고속화 방법)

  • Kye, Hee-Won;Kim, Jun-Ho
    • Journal of Korea Multimedia Society / v.14 no.8 / pp.981-991 / 2011
  • MIP (Maximum Intensity Projection) is a volume rendering technique that is essential for medical imaging systems. MIP rendering based on ray casting produces high-quality images but takes a long time. Our aim is to improve rendering speed using GPGPU (General-Purpose computing on Graphics Processing Units) techniques. In this paper, we present a ray casting algorithm based on CUDA (Compute Unified Device Architecture), a programming platform for GPGPU, and we suggest new acceleration methods for CUDA. Specifically, we propose block-based space leaping, which skips unnecessary regions of the volume data; the bisection method, which quickly finds a block edge; and the initial value estimation method, which increases the probability of space leaping. With the proposed methods, we noticeably improve the rendering speed without degrading image quality.
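
The idea behind block-based space leaping and initial value estimation can be sketched for a single ray. The version below is a simplified, CPU-only illustration under stated assumptions: the ray is a 1D list of samples, blocks are fixed-size runs of samples, and `initial` is an estimate that must be a valid lower bound on the true maximum. The function names and data are hypothetical, and the paper's bisection step for finding block edges is not modeled.

```python
def block_maxima(samples, block_size):
    """Precompute the maximum intensity of each fixed-size block."""
    return [max(samples[i:i + block_size])
            for i in range(0, len(samples), block_size)]


def mip_with_leaping(samples, block_size, initial=float("-inf")):
    """Maximum intensity along one ray, skipping blocks that cannot matter.

    Returns (maximum, number of samples actually read). `initial` must
    not exceed the true maximum, otherwise the result is wrong.
    """
    maxima = block_maxima(samples, block_size)
    current, visited = initial, 0
    for b, block_max in enumerate(maxima):
        if block_max <= current:
            continue  # leap over the block: it cannot raise the maximum
        start = b * block_size
        for value in samples[start:start + block_size]:
            visited += 1
            if value > current:
                current = value
    return current, visited


# Hypothetical intensities sampled along one ray.
ray = [3, 4, 1, 2, 2, 9, 5, 8, 7, 6, 2, 1]

print(mip_with_leaping(ray, block_size=4))             # (9, 8)
print(mip_with_leaping(ray, block_size=4, initial=5))  # (9, 4)
```

A good initial estimate lets the first block be skipped as well, which is why estimating the starting value raises the chance of leaping; the result is unchanged as long as the estimate is a true lower bound.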

Random Partial Haar Wavelet Transformation for Single Instruction Multiple Threads (단일 명령 다중 스레드 병렬 플랫폼을 위한 무작위 부분적 Haar 웨이블릿 변환)

  • Park, Taejung
    • Journal of Digital Contents Society / v.16 no.5 / pp.805-813 / 2015
  • Many researchers expect that compressive sensing and sparse recovery can overcome the limitations of conventional digital techniques. However, these new approaches require solving l1-norm optimization problems for signal reconstruction. In the reconstruction process, the transform computation, which multiplies a random matrix by a vector, consumes considerable computing power. To address this issue, parallel processing is applied to the optimization problems. In particular, because the original signal is very large, it is hard to store the random matrix directly in memory, so the random matrix must be handled procedurally. This paper presents a new parallel algorithm for computing the random partial Haar wavelet transform on a Single Instruction Multiple Threads (SIMT) platform.
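
One common reading of a "random partial" transform in compressive sensing is to keep a random subset of the coefficients of an orthonormal transform, which is equivalent to multiplying by randomly selected rows of the transform matrix. The serial sketch below follows that reading as an assumption; it is not the paper's parallel algorithm. The row indices are regenerated from a seed instead of being stored as a matrix, which illustrates the procedural handling mentioned in the abstract. All function names and the sample signal are hypothetical.

```python
import math
import random


def haar_transform(signal):
    """Orthonormal Haar wavelet transform.

    The length of `signal` must be a power of two.
    """
    out = list(signal)
    n = len(out)
    while n > 1:
        half = n // 2
        # Pairwise scaled averages and differences of the first n entries.
        avg = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = avg + diff
        n = half
    return out


def random_partial_haar(signal, m, seed):
    """Keep m Haar coefficients at row indices regenerated from `seed`.

    No random matrix is stored: the same seed reproduces the same rows.
    """
    rows = random.Random(seed).sample(range(len(signal)), m)
    coeffs = haar_transform(signal)
    return rows, [coeffs[r] for r in rows]


# Hypothetical signal of length 8, reduced to 3 measurements.
x = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
rows, y = random_partial_haar(x, m=3, seed=42)
print(rows, y)
```

Because the selection is driven by the seed alone, a reconstruction routine can regenerate the same rows on demand rather than keeping an explicit measurement matrix in memory.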

Development of small multi-copter system for indoor collision avoidance flight (실내 비행용 소형 충돌회피 멀티콥터 시스템 개발)

  • Moon, Jung-Ho
    • Journal of Aerospace System Engineering / v.15 no.1 / pp.102-110 / 2021
  • Recently, multi-copters equipped with various collision avoidance sensors have been introduced to improve flight stability. LiDAR is used to recognize three-dimensional positions. Multiple cameras and real-time SLAM technology are also used to calculate the position relative to obstacles. A three-dimensional depth sensor with a small processor and camera is also used. In this study, a small collision-avoidance multi-copter system capable of indoor flight was developed as a platform for developing collision avoidance software. The multi-copter system was equipped with LiDAR, a 3D depth sensor, and a small image processing board. Object recognition and collision avoidance functions based on the YOLO algorithm were verified through flight tests. This paper covers recent trends in drone collision avoidance technology, the system design and manufacturing process, and flight test results.