• Title/Summary/Keyword: computing speed

A new lightweight network based on MobileNetV3

  • Zhao, Liquan;Wang, Leilei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.1-15 / 2022
  • MobileNetV3 is designed for mobile devices with limited memory and computing power. To reduce the number of network parameters and improve inference speed, a new lightweight network based on MobileNetV3 is proposed. First, to reduce the computation of residual blocks, a partial residual structure is designed by dividing the input feature maps into two parts; this structure replaces the residual block in MobileNetV3. Second, a dual-path feature extraction structure is designed to further reduce the computation of MobileNetV3. The two paths use different convolution kernel sizes to extract feature maps of different sizes. In addition, a transition layer is designed to fuse features and reduce the influence of the new structure on accuracy. The proposed partial residual structure is tested on the CIFAR-100 and ImageNet datasets: a ResNet built on it has fewer parameters and FLOPs than the original ResNet. The improved MobileNetV3 is tested on the CIFAR-10, CIFAR-100 and ImageNet image classification datasets. Compared with MobileNetV3, GhostNet and MobileNetV2, the improved MobileNetV3 has fewer parameters and FLOPs. The improved MobileNetV3 is also tested on a CPU and a Raspberry Pi, where it is faster than the other networks.

Compression of DNN Integer Weight using Video Encoder (비디오 인코더를 통한 딥러닝 모델의 정수 가중치 압축)

  • Kim, Seunghwan;Ryu, Eun-Seok
    • Journal of Broadcast Engineering / v.26 no.6 / pp.778-789 / 2021
  • Recently, various lightweight methods for running Convolutional Neural Network (CNN) models on mobile devices have emerged. Weight quantization, which lowers the bit precision of weights, is a lightweight method that lets a model run with integer arithmetic in a mobile environment where GPU acceleration is unavailable. It is already used in various models to reduce computational complexity and model size with a small loss of accuracy. Considering memory size and computing speed, as well as the storage size of the device and the limited network environment, this paper proposes a method of compressing the integer weights after quantization using a video codec. To verify the performance of the proposed method, experiments were conducted on VGG16, ResNet50, and ResNet18 models trained on the ImageNet and Places365 datasets. The accuracy loss was less than 2% and high compression efficiency was achieved across the models. In addition, compared with similar compression methods, the compression efficiency was more than doubled.
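
As an illustration of the weight quantization step described in this abstract, the following is a minimal Python sketch of symmetric uniform quantization with a single shared scale. It is not the authors' implementation: the function names, the 8-bit default, and the per-tensor scale are assumptions made for the example, and the video-codec compression stage is not shown.

```python
def quantize_symmetric(weights, num_bits=8):
    """Map float weights to signed integers using one shared scale (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8-bit weights
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer levels."""
    return [v * scale for v in q]


weights = [0.6, -1.0, 0.25, 0.0]            # hypothetical example weights
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
```

The integer list `q` is what a later compression stage would operate on; the rounding error of each restored weight is at most half of `scale`.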

Numerical Study on Aerodynamic Performance of Counter-rotating Propeller in Hover Using Actuator Method (Actuator 기법을 이용한 제자리 비행하는 동축 반전 프로펠러 공력 성능에 관한 수치적 연구)

  • Kim, Dahye;Park, Youngmin;Oh, Sejong;Park, Donghun
    • Journal of Aerospace System Engineering / v.15 no.3 / pp.30-44 / 2021
  • Experimental investigation of counter-rotating propellers is subject to multiple time and cost constraints because, unlike a single propeller, they involve additional design parameters. Numerical analysis also requires considerable computing time and resources because the interference between the upper and lower propellers must be taken into account. In the present study, numerical simulations were conducted to investigate the hover performance of counter-rotating propellers using the actuator method, which is considered to be time-efficient. The accuracy of the present numerical method was validated by comparison with ANSYS Fluent, a commercial CFD code. The axial spacing and rotational speed were selected as the analysis variables, and the aerodynamic performance was obtained under various conditions. Based on the results, the Figure of Merit (FM) of the single propeller and the counter-rotating propellers was derived, together with a prediction factor that enables the performance of counter-rotating propellers to be predicted from a single propeller, in order to evaluate the applicability of the actuator method.
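
For readers unfamiliar with the Figure of Merit, the sketch below computes it for a single hovering rotor from momentum theory, as the ratio of ideal induced power to actual power. This is the common textbook definition, assumed here for illustration; the abstract does not state which definition the authors use for the counter-rotating case, and the function and parameter names are hypothetical.

```python
import math


def figure_of_merit(thrust, power, rho, disk_area):
    """Hover Figure of Merit of a single rotor (momentum-theory definition, assumed).

    thrust in N, power in W, rho (air density) in kg/m^3, disk_area in m^2.
    """
    if thrust <= 0 or power <= 0 or rho <= 0 or disk_area <= 0:
        raise ValueError("all inputs must be positive")
    # Ideal induced velocity at the rotor disk in hover.
    induced_velocity = math.sqrt(thrust / (2.0 * rho * disk_area))
    # Ideal (minimum) power needed to produce the thrust.
    ideal_power = thrust * induced_velocity
    return ideal_power / power
```

An FM closer to 1 means the rotor needs less power beyond the ideal minimum to produce the same thrust.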

A Design of Network Topology Discovery System based on Traffic In-out Count Analysis (네트워크 트래픽 입출량 분석을 통한 네트워크 토폴로지 탐색 시스템 설계)

  • Park, Ji-Tae;Baek, Ui-Jun;Shin, Mu-Gon;Lee, Min-Seong;Kim, Myung-Sup
    • KNOM Review / v.23 no.1 / pp.1-9 / 2020
  • With the rapid development of science and technology in recent years, network environments are growing and a huge amount of traffic is being generated. In particular, the development of 5G networks and edge computing will accelerate this trend. At the same time, malicious network behavior and traffic overloads are occurring more frequently. To solve these problems, network administrators need to build a network management system to operate a high-speed network, and they need to know the exact connection topology of the network devices through that system. However, the existing network topology discovery method is inefficient because it is passively managed by an administrator and is time-consuming. Therefore, we propose a method of network topology discovery based on the amount of inbound and outbound network traffic. The proposed method is applied to a real network to verify its validity.

Parallel Implementations of Digital Focus Indices Based on Minimax Search Using Multi-Core Processors

  • HyungTae, Kim;Duk-Yeon, Lee;Dongwoon, Choi;Jaehyeon, Kang;Dong-Wook, Lee
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.2 / pp.542-558 / 2023
  • A digital focus index (DFI) is a value used to determine image focus in scientific apparatus and smart devices. Automatic focus (AF) is an iterative and time-consuming procedure; however, its processing time can be reduced using a graphics processing unit (GPU) and a multi-core processor (MCP). In this study, parallel architectures of a minimax search algorithm (MSA) are applied to two DFIs: range algorithm (RA) and image contrast (CT). The DFIs are based on a histogram; however, parallel computation of a histogram is conventionally inefficient because of bank conflicts in shared memory. The parallel architectures of RA and CT are constructed using parallel reduction for the MSA, which is performed through parallel relative rating of image pixel pairs, halving the number of ratings at every step. The array size is thus decreased to one, and the minimax is determined at the final reduction. Kernels for the architectures are constructed using open source software to make them relatively platform independent. The kernels are tested on a hexa-core PC and an embedded device using Lenna images of various sizes based on the resolutions of industrial cameras. The performance of the kernels for the DFIs was investigated in terms of processing speed and computational acceleration; the maximum acceleration was 32.6× in the best case, and the MCP exhibited higher performance.
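
The pairwise reduction described in this abstract can be sketched as follows. This is a sequential Python illustration of the general parallel-reduction pattern, not the authors' kernels: the function name is hypothetical, and the inner loop stands in for work that a parallel kernel would distribute across threads.

```python
def minimax_reduce(values):
    """Return (min, max) by pairwise reduction, roughly halving the array each step."""
    if not values:
        raise ValueError("values must be non-empty")
    mins = list(values)
    maxs = list(values)
    n = len(values)
    while n > 1:
        half = (n + 1) // 2
        # In a parallel kernel, each index i would be handled by its own thread.
        # Reads come from the upper half and writes go to the lower half,
        # so the iterations of this loop are independent of each other.
        for i in range(n // 2):
            j = i + half
            mins[i] = min(mins[i], mins[j])
            maxs[i] = max(maxs[i], maxs[j])
        n = half  # the active array shrinks until one element remains
    return mins[0], maxs[0]


lo, hi = minimax_reduce([7, 3, 9, 1, 5])    # hypothetical pixel values
```

With n elements the loop runs about log2(n) steps, which is what makes the pattern attractive on multi-core hardware compared with building a histogram in shared memory.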

A study on an artificial intelligence model for measuring object speed using road markers that can respond to external forces (외부력에 대응할 수 있는 도로 마커 활용 개체 속도 측정 인공지능 모델 연구)

  • Lim, Dong Hyun;Park, Dae-woo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.228-231 / 2022
  • Most CCTVs operated by public institutions for crime prevention and parking enforcement are located on roads. The viewing angle of these CCTVs often changes for various reasons, such as bolts loosened by vibration or impacts from vehicles and workers. To provide AI services effectively based on the collected images, the service target area (region of interest, ROI) must remain within the image without interruption. This also matters for the efficient use of computing power for image analysis. This study explains how to maximize the application of artificial intelligence technology by setting the ROI based on markers on the road, restricting image analysis to that area, and studying the process of finding the ROI.

A High Speed Optimized Implementation of Lightweight Cryptography TinyJAMBU on Internet of Things Processor 8-Bit AVR (사물 인터넷 프로세서 8-bit AVR 상에서의 경량암호 TinyJAMBU 고속 최적 구현)

  • Hyeok-Dong Kwon;Si-Woo Eum;Min-Joo Sim;Yu-Jin Yang;Hwa-Jeong Seo
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.2 / pp.183-191 / 2023
  • Cryptographic algorithms require extensive computational resources and rely on complex mathematical principles for security. However, IoT devices have limited resources, leading to insufficient computing power. As a result, lightweight cryptography, which uses fewer computational resources, has emerged. NIST organized a competition to standardize lightweight cryptography, and TinyJAMBU, one of the algorithms in the competition, is a permutation-based algorithm that repeats many permutation operations. In this paper, we implement TinyJAMBU on an 8-bit AVR processor with a proposed technique that includes a reverse shift method and precomputation of some operations in a fixed key and nonce environment. Our techniques showed a maximum performance improvement of 7.03 times in permutation operations and 5.87 times in the TinyJAMBU algorithm, improving up to 9.19 times in a fixed key and nonce environment.

Analyses of Security into End-to-End Point Healthcare System based on Internet of Things (사물인터넷 기반의 헬스케어 시스템의 종단간 보안성 분석)

  • Kim, Jung Tae
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.6 / pp.871-880 / 2017
  • Recently, internet-based services have become interconnected and integrated through a variety of connections. The Internet of Things (IoT) consists of heterogeneous devices such as sensor nodes, devices, and end-to-end equipment that use conventional protocols and services. A representative example is the healthcare system. With IoT-based healthcare appliances, patients and doctors can use healthcare information safely and manage it at high speed, and mobility makes management very convenient. However, this introduces security and vulnerability issues because the devices have small memory capacity, a low power supply, and low computing power. These constraints make it impossible to implement security algorithms with a hardware-based embedded engine, so conventional standard algorithms cannot currently be realized on such devices. Therefore, we analyzed and compared conventional methods and proposed techniques. Finally, we evaluated the security issues and requirements for an end-to-end healthcare system based on the Internet of Things.

Development of the sediment transport model using GPU arithmetic (GPU 연산을 활용한 유사이송 예측모형 개발)

  • Noh, Junsu;Son, Sangyoung
    • Journal of Korea Water Resources Association / v.56 no.7 / pp.431-438 / 2023
  • Many shorelines are facing beach erosion. Considering climate change and the growth of coastal populations, the erosion problem could accelerate. To address this issue, it is crucial to develop a sediment transport model that can rapidly predict terrain change. In this study, a sediment transport model based on GPU parallel arithmetic is introduced; it is intended to simulate terrain change well at a higher computing speed than a CPU-based model. We also investigate the model performance and the GPU computational efficiency. We applied several dam-break cases to verify the model and found that the simulated results were close to the observed results. The computational efficiency of the GPU was defined by comparison with the operation time of the CPU-based model, and the GPU-based model was more efficient than the CPU-based model.

Research of Deep Learning-Based Multi Object Classification and Tracking for Intelligent Manager System (지능형 관제시스템을 위한 딥러닝 기반의 다중 객체 분류 및 추적에 관한 연구)

  • June-hwan Lee
    • Smart Media Journal / v.12 no.5 / pp.73-80 / 2023
  • Recently, intelligent control systems are developing rapidly in various application fields, and methods for utilizing technologies such as deep learning, IoT, and cloud computing for intelligent control systems are being studied. An important technology in an intelligent control system is recognizing and tracking objects in images. However, existing multi-object tracking technology has problems in accuracy and speed. In this paper, a real-time intelligent control system was implemented using YOLO v5 and YOLO v6 based on a one-shot architecture that increases the accuracy of object tracking and enables fast and accurate tracking even when objects overlap each other or when there are many objects belonging to the same class. The experiment was evaluated by comparing YOLO v5 and YOLO v6. As a result of the experiment, the YOLO v6 model shows performance suitable for the intelligent control system.