• Title/Summary/Keyword: Deep inference

Collaborative Inference for Deep Neural Networks in Edge Environments

  • Meizhao Liu;Yingcheng Gu;Sen Dong;Liu Wei;Kai Liu;Yuting Yan;Yu Song;Huanyu Cheng;Lei Tang;Sheng Zhang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.7 / pp.1749-1773 / 2024
  • Recent advances in deep neural networks (DNNs) have greatly improved the accuracy and universality of various intelligent applications, at the expense of increasing model size and computational demand. Since the resources of end devices are often too limited to deploy a complete DNN model, offloading DNN inference tasks to cloud servers is a common approach to bridge this gap. However, due to the limited bandwidth of WANs and the long distance between end devices and cloud servers, this approach may lead to significant data transmission latency. Therefore, device-edge collaborative inference has emerged as a promising paradigm for accelerating DNN inference tasks, in which DNN models are partitioned and executed sequentially on end devices and edge servers. Nevertheless, collaborative inference in heterogeneous edge environments with multiple edge servers, end devices, and DNN tasks has been overlooked in previous research. To fill this gap, we investigate the optimization problem of collaborative inference in a heterogeneous system and propose a collaborative inference scheme (CIS) that jointly considers DNN partition, task offloading, and scheduling to reduce the average weighted inference latency. CIS decomposes the problem into three parts to achieve the optimal average weighted inference latency. In addition, we build a prototype that implements CIS and conduct extensive experiments to demonstrate the scheme's effectiveness and efficiency. Experiments show that CIS reduces the average weighted inference latency by 29% to 71% compared with four existing schemes.
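
The partitioning idea in this abstract can be illustrated with a minimal sketch. It is not the CIS algorithm: it handles one device, one edge server, and one task, and simply tries every split point of a layer chain. The function name `best_partition`, the per-layer timings, and the payload sizes are all hypothetical, and the cost of returning the result to the device is ignored.

```python
def best_partition(device_ms, edge_ms, out_bytes, input_bytes, bandwidth_bytes_per_ms):
    """Return (k, latency): layers [0, k) run on the device, layers [k, n) on the edge."""
    n = len(device_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        if k == n:
            # Whole model runs on the device, so nothing crosses the link.
            transfer = 0.0
        else:
            # The raw input is sent if k == 0, otherwise the output of layer k-1.
            payload = input_bytes if k == 0 else out_bytes[k - 1]
            transfer = payload / bandwidth_bytes_per_ms
        latency = sum(device_ms[:k]) + transfer + sum(edge_ms[k:])
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Hypothetical profile of a 4-layer model (times in ms, sizes in bytes).
device_ms = [4.0, 6.0, 20.0, 30.0]
edge_ms = [1.0, 1.5, 5.0, 7.0]
out_bytes = [200_000, 20_000, 10_000, 1_000]

k, latency = best_partition(device_ms, edge_ms, out_bytes,
                            input_bytes=600_000, bandwidth_bytes_per_ms=10_000)
print(k, latency)  # prints: 2 24.0
```

With these made-up numbers, running the first two layers on the device wins because the second layer shrinks the data to be transmitted by a factor of ten.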

Failure Detection Method of Industrial Cartesian Coordinate Robots Based on a CNN Inference Window Using Ambient Sound (음향 데이터를 이용한 CNN 추론 윈도우 기반 산업용 직교 좌표 로봇의 고장 진단 기법)

  • Hyuntae Cho
    • IEMEK Journal of Embedded Systems and Applications / v.19 no.1 / pp.57-64 / 2024
  • In industry, robots are used to increase productivity by taking over dangerous, difficult, and strenuous tasks from human workers. However, the failure of an individual industrial robot within a production process may cause product defects or malfunctions, and may lead to serious accidents when the parts being manufactured are used in automobiles and aircraft. Although the demand for early diagnosis of industrial robot failures is steadily increasing, early detection still faces many limitations. This paper introduces methods for diagnosing robot failures using sound data and deep learning. It also analyzes, compares, and evaluates the performance of failure diagnosis using various deep learning technologies. Furthermore, to improve the performance of the deep learning-based fault diagnosis system, we propose a method that increases the accuracy of fault diagnosis based on an inference window. With the inference window, the accuracy of failure diagnosis increased to as much as 94%.
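
The abstract does not say how the inference window combines predictions, so the following is only one plausible reading: a sliding majority vote over the most recent per-frame CNN labels. The function name `windowed_predictions` and the label strings are hypothetical.

```python
from collections import Counter, deque


def windowed_predictions(frame_labels, window_size):
    """Return one smoothed label per frame: the majority label of the last `window_size` frames."""
    window = deque(maxlen=window_size)  # old labels fall out automatically
    smoothed = []
    for label in frame_labels:
        window.append(label)
        # most_common(1) gives the label with the highest count in the window.
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


# Hypothetical per-frame outputs of a sound classifier; frame 3 is a spurious "fault".
frames = ["normal", "normal", "fault", "normal", "normal", "fault", "fault", "fault"]
smoothed = windowed_predictions(frames, window_size=3)
print(smoothed)
# prints: ['normal', 'normal', 'normal', 'normal', 'normal', 'normal', 'fault', 'fault']
```

The isolated misclassification is suppressed, at the cost of reporting the sustained fault one frame late.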

Performance analysis of local exit for distributed deep neural networks over cloud and edge computing

  • Lee, Changsik;Hong, Seungwoo;Hong, Sungback;Kim, Taeyeon
    • ETRI Journal / v.42 no.5 / pp.658-668 / 2020
  • In edge computing, most procedures, including data collection, data processing, and service provision, are handled at edge nodes and not in the central cloud. This decreases the processing burden on the central cloud, enabling fast responses to end-device service requests in addition to reducing bandwidth consumption. However, edge nodes have restricted computing, storage, and energy resources to support computation-intensive tasks such as deep neural network (DNN) inference. In this study, we analyze the effect of models with single and multiple local exits on DNN inference in an edge-computing environment. Our test results show that a single-exit model performs better than a multi-exit model at all exit points with respect to the number of locally exited samples, inference accuracy, and inference latency. These results indicate that higher accuracy can be achieved with less computation when a single-exit model is adopted. In edge computing infrastructure, it is therefore more efficient to adopt a DNN model with only one or a few exit points to provide a fast and reliable inference service.
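
A local exit can be sketched as a confidence check on an early classifier head. The exit criterion used in the paper is not stated in the abstract; the sketch below assumes a threshold on the maximum softmax probability, and `infer_with_local_exit`, `run_cloud`, and the threshold value are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def infer_with_local_exit(local_logits, run_cloud, threshold=0.9):
    """Return (label, where): exit at the edge if confident, otherwise defer to the cloud."""
    probs = softmax(local_logits)
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), "edge"
    # Not confident enough: the remaining layers run remotely.
    return run_cloud(), "cloud"


# `run_cloud` stands in for the rest of the network; here it always answers class 2.
confident = infer_with_local_exit([4.0, 0.5, 0.1], run_cloud=lambda: 2)
uncertain = infer_with_local_exit([1.0, 0.8, 0.9], run_cloud=lambda: 2)
print(confident, uncertain)
```

Samples that exit locally never pay the transmission cost, which is why the number of locally exited samples appears among the reported metrics.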

Experiment on Intermediate Feature Coding for Object Detection and Segmentation

  • Jeong, Min Hyuk;Jin, Hoe-Yong;Kim, Sang-Kyun;Lee, Heekyung;Choo, Hyon-Gon;Lim, Hanshin;Seo, Jeongil
    • Journal of Broadcast Engineering / v.25 no.7 / pp.1081-1094 / 2020
  • With the recent development of deep learning, most computer vision-related tasks are being solved with deep learning-based network technologies such as CNN and RNN. Computer vision tasks such as object detection and segmentation use intermediate features extracted from a shared backbone such as ResNet or FPN for both training and inference. In this paper, an experiment was conducted to determine the compression efficiency and the effect of encoding on task inference performance when the features extracted at an intermediate stage of a CNN are encoded. A feature map that combines the features of 256 channels into one image, and the original image, were each encoded with HEVC to compare and analyze the inference performance for object detection and segmentation. Since the intermediate feature map encodes the five levels of feature maps (P2 to P6), the image size and resolution are increased compared to the original image. However, when the degree of compression is weakened, the use of feature maps yields inference results similar to or better than those of the original image.
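
Combining 256 channels into one image can be pictured as tiling. The abstract does not give the layout, so this sketch assumes a row-major 16 x 16 grid; `tile_channels` and the tiny 2 x 2 channels are hypothetical, and real feature values would also need scaling to the codec's sample range before HEVC encoding.

```python
def tile_channels(channels, cols):
    """Arrange equally sized 2-D channels row-major into one 2-D image, `cols` tiles per row."""
    if len(channels) % cols != 0:
        raise ValueError("channel count must be a multiple of cols")
    image = []
    for start in range(0, len(channels), cols):
        band = channels[start:start + cols]  # the tiles forming one horizontal band
        for y in range(len(band[0])):
            # Concatenate row y of every tile in this band, left to right.
            image.append([value for tile in band for value in tile[y]])
    return image


# 256 hypothetical 2x2 channels; channel c is filled with the value c.
channels = [[[c, c], [c, c]] for c in range(256)]
image = tile_channels(channels, cols=16)
print(len(image), len(image[0]))  # prints: 32 32
```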

YOLOv7 Model Inference Time Complexity Analysis in Different Computing Environments (다양한 컴퓨팅 환경에서 YOLOv7 모델의 추론 시간 복잡도 분석)

  • Park, Chun-Su
    • Journal of the Semiconductor & Display Technology / v.21 no.3 / pp.7-11 / 2022
  • Object detection technology is one of the main research topics in the field of computer vision and has established itself as an essential base technology for implementing various vision systems. Recent deep neural network (DNN)-based algorithms achieve much higher recognition accuracy than traditional algorithms. However, it is well known that DNN model inference requires relatively high computational power. In this paper, we analyze the inference time complexity of the state-of-the-art object detection architecture YOLOv7 in various environments. Specifically, we compare and analyze the time complexity of four variants of the YOLOv7 model (YOLOv7-tiny, YOLOv7, YOLOv7-X, and YOLOv7-E6) when performing inference on a CPU and on a GPU. Furthermore, we analyze how the time complexity varies when the same models are run with the PyTorch framework and with the ONNX Runtime engine.
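
Comparing inference time across models and runtimes needs a consistent measurement loop. The paper's protocol is not described in the abstract; the sketch below is a generic harness with warm-up runs and a median, where `measure_latency_ms` and `dummy_infer` are hypothetical and the workload is a stand-in for a real model call.

```python
import statistics
import time


def measure_latency_ms(infer, warmup=3, runs=10):
    """Call `infer` repeatedly and return (median latency in ms, all samples)."""
    for _ in range(warmup):
        infer()  # warm-up runs are discarded (caches, lazy initialization)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), samples


def dummy_infer():
    # Stand-in workload; replace with a real model's forward pass.
    return sum(i * i for i in range(10_000))


median_ms, samples = measure_latency_ms(dummy_infer)
print(f"median latency: {median_ms:.3f} ms over {len(samples)} runs")
```

When timing GPU inference, the callable should block until the device has finished (for example by calling `torch.cuda.synchronize()` in PyTorch), otherwise only the kernel launch time is measured.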

Performance Analysis of DNN inference using OpenCV Built in CPU and GPU Functions (OpenCV 내장 CPU 및 GPU 함수를 이용한 DNN 추론 시간 복잡도 분석)

  • Park, Chun-Su
    • Journal of the Semiconductor & Display Technology / v.21 no.1 / pp.75-78 / 2022
  • Deep neural networks (DNNs) have become an essential data processing architecture for implementing many computer vision tasks. Recent DNN-based algorithms achieve much higher recognition accuracy than traditional algorithms based on shallow learning. However, training DNNs and running inference with them require far more computational capability than everyday computer use. Moreover, as the size and depth of DNNs increase, CPUs may be inadequate because they process serially by default. GPUs offer greater speed than CPUs because of their parallel processing nature. In this paper, we analyze the inference time complexity of DNNs using the well-known computer vision library OpenCV. We measure and analyze inference time complexity for three cases: CPU, GPU-Float32, and GPU-Float16.

Trends in Deep Learning Inference Engines for Embedded Systems (임베디드 시스템용 딥러닝 추론엔진 기술 동향)

  • Yoo, Seung-mok;Lee, Kyung Hee;Park, Jaebok;Yoon, Seok Jin;Cho, Changsik;Jung, Yung Joon;Cho, Il Yeon
    • Electronics and Telecommunications Trends / v.34 no.4 / pp.23-31 / 2019
  • Deep learning is a hot topic in both academic and industrial fields. Deep learning applications can be categorized into two areas. The first category involves applications such as Google AlphaGo that use interfaces with human operators to run complicated inference engines on high-performance servers. The second category includes embedded applications for mobile Internet-of-Things devices, automotive vehicles, etc. Owing to the characteristics of the deployment environment, applications in the second category are bound by certain H/W and S/W restrictions depending on their running environment. For example, image recognition in an autonomous vehicle requires low latency, while that on a mobile device requires low power consumption. In this paper, we describe issues faced by embedded applications and review popular inference engines. We also introduce a project that is being developed to satisfy the H/W and S/W requirements.

Detection of Dangerous Situations using Deep Learning Model with Relational Inference

  • Jang, Sein;Battulga, Lkhagvadorj;Nasridinov, Aziz
    • Journal of Multimedia Information System / v.7 no.3 / pp.205-214 / 2020
  • Crime has become one of the major problems in modern society. Even though visual surveillance through closed-circuit television (CCTV) is extensively used for solving crimes, the number of crimes has not decreased. This is because the workforce is insufficient for 24-hour surveillance. In addition, CCTV surveillance by humans is not efficient for detecting dangerous situations owing to accuracy issues. In this paper, we propose the autonomous detection of dangerous situations in CCTV scenes using a deep learning model with relational inference. The main feature of the proposed method is that it can simultaneously perform object detection and relational inference to determine the danger of the situations captured by CCTV. This enables us to efficiently classify dangerous situations by inferring the relationship between detected objects (i.e., distance and position). Experimental results demonstrate that the proposed method outperforms existing methods in terms of image classification accuracy and false alarm rate, even when object detection accuracy is low.

Web Service Platform for Optimal Quantization of CNN Models (CNN 모델의 최적 양자화를 위한 웹 서비스 플랫폼)

  • Roh, Jaewon;Lim, Chaemin;Cho, Sang-Young
    • Journal of the Semiconductor & Display Technology / v.20 no.4 / pp.151-156 / 2021
  • Low-end IoT devices do not have enough computation and memory resources for DNN learning and inference. Integer quantization of real-valued neural network models can reduce model size, hardware computational burden, and power consumption. This paper describes the design and implementation of a web-based quantization platform for CNN deep learning accelerator chips. In the web service platform, we implemented visualization of the model through a convenient UI, analysis of each step of inference, and detailed editing of the model. Additionally, a data augmentation function and a function for managing the files that store models and intermediate inference results are provided. The implemented functions were verified using three YOLO models.
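
Integer quantization of real-valued weights can be sketched with the common affine (scale and zero-point) mapping to int8. This is a generic textbook scheme, not necessarily the one the platform uses; `quantize_int8`, `dequantize`, and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Affine quantization of floats to int8. Returns (quantized, scale, zero_point)."""
    # The representable range must contain 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any positive scale works
    zero_point = round(-128 - lo / scale)
    quantized = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map int8 codes back to approximate real values."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-0.82, -0.1, 0.0, 0.37, 1.25]  # hypothetical layer weights
quantized, scale, zero_point = quantize_int8(weights)
restored = dequantize(quantized, scale, zero_point)
print(quantized, scale, zero_point)
```

Each weight is stored in one byte instead of four, and the round-trip error stays within about one quantization step (`scale`).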

A Study on Realtime Drone Object Detection Using On-board Deep Learning (온-보드에서의 딥러닝을 활용한 드론의 실시간 객체 인식 연구)

  • Lee, Jang-Woo;Kim, Joo-Young;Kim, Jae-Kyung;Kwon, Cheol-Hee
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.49 no.10 / pp.883-892 / 2021
  • This paper presents a process for developing deep learning-based aerial object detection models that can run in real time onboard a drone. To improve object detection performance, we pre-process and augment the training data in the training stage. In addition, we perform transfer learning and apply a weighted cross-entropy method to reduce the variation in detection performance across classes. To improve the inference speed, we generate inference acceleration engines with quantization. Then, we analyze the real-time performance and detection performance on a custom aerial image dataset to verify generalization.
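
Weighted cross-entropy scales each sample's loss by a weight for its true class, so that rare classes contribute more. The abstract does not say how the weights are chosen; the sketch below assumes inverse class frequency, and both function names and the class counts are hypothetical.

```python
import math


def inverse_frequency_weights(class_counts):
    """Assumed weighting rule: weight_c = N / (C * count_c)."""
    total, num_classes = sum(class_counts), len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def weighted_cross_entropy(probs, target, class_weights):
    """Loss for one sample: -w[target] * log(p[target])."""
    return -class_weights[target] * math.log(probs[target])


# Hypothetical dataset: 900 samples of class 0, 100 samples of class 1.
weights = inverse_frequency_weights([900, 100])

# The same predicted probability costs more when the true class is the rare one.
loss_common = weighted_cross_entropy([0.5, 0.5], target=0, class_weights=weights)
loss_rare = weighted_cross_entropy([0.5, 0.5], target=1, class_weights=weights)
print(weights, loss_common, loss_rare)
```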