• Title/Summary/Keyword: Resnet

Search Result 61, Processing Time 0.022 seconds

Analyzing DNN Model Performance Depending on Backbone Network (백본 네트워크에 따른 사람 속성 검출 모델의 성능 변화 분석)

  • Chun-Su Park
    • Journal of the Semiconductor & Display Technology
    • /
    • v.22 no.2
    • /
    • pp.128-132
    • /
    • 2023
  • Recently, with the development of deep learning technology, research on pedestrian attribute recognition technology using deep neural networks has been actively conducted. Existing pedestrian attribute recognition techniques can be obtained in such a way as global-based, regional-area-based, visual attention-based, sequential prediction-based, and newly designed loss function-based, depending on how pedestrian attributes are detected. It is known that the performance of these pedestrian attribute recognition technologies varies greatly depending on the type of backbone network that constitutes the deep neural networks model. Therefore, in this paper, several backbone networks are applied to the baseline pedestrian attribute recognition model and the performance changes of the model are analyzed. In this paper, the analysis is conducted using Resnet34, Resnet50, Resnet101, Swin-tiny, and Swinv2-tiny, which are representative backbone networks used in the fields of image classification, object detection, etc. Furthermore, this paper analyzes the change in time complexity when inferencing each backbone network using a CPU and a GPU.

  • PDF

Empirical Comparison of Deep Learning Networks on Backbone Method of Human Pose Estimation

  • Rim, Beanbonyka;Kim, Junseob;Choi, Yoo-Joo;Hong, Min
    • Journal of Internet Computing and Services
    • /
    • v.21 no.5
    • /
    • pp.21-29
    • /
    • 2020
  • Accurate estimation of human pose relies on backbone method in which its role is to extract feature map. Up to dated, the method of backbone feature extraction is conducted by the plain convolutional neural networks named by CNN and the residual neural networks named by Resnet, both of which have various architectures and performances. The CNN family network such as VGG which is well-known as a multiple stacked hidden layers architecture of deep learning methods, is base and simple while Resnet which is a bottleneck layers architecture yields fewer parameters and outperform. They have achieved inspired results as a backbone network in human pose estimation. However, they were used then followed by different pose estimation networks named by pose parsing module. Therefore, in this paper, we present a comparison between the plain CNN family network (VGG) and bottleneck network (Resnet) as a backbone method in the same pose parsing module. We investigate their performances such as number of parameters, loss score, precision and recall. We experiment them in the bottom-up method of human pose estimation system by adapted the pose parsing module of openpose. Our experimental results show that the backbone method using VGG network outperforms the Resent network with fewer parameter, lower loss score and higher accuracy of precision and recall.

Compression of DNN Integer Weight using Video Encoder (비디오 인코더를 통한 딥러닝 모델의 정수 가중치 압축)

  • Kim, Seunghwan;Ryu, Eun-Seok
    • Journal of Broadcast Engineering
    • /
    • v.26 no.6
    • /
    • pp.778-789
    • /
    • 2021
  • Recently, various lightweight methods for using Convolutional Neural Network(CNN) models in mobile devices have emerged. Weight quantization, which lowers bit precision of weights, is a lightweight method that enables a model to be used through integer calculation in a mobile environment where GPU acceleration is unable. Weight quantization has already been used in various models as a lightweight method to reduce computational complexity and model size with a small loss of accuracy. Considering the size of memory and computing speed as well as the storage size of the device and the limited network environment, this paper proposes a method of compressing integer weights after quantization using a video codec as a method. To verify the performance of the proposed method, experiments were conducted on VGG16, Resnet50, and Resnet18 models trained with ImageNet and Places365 datasets. As a result, loss of accuracy less than 2% and high compression efficiency were achieved in various models. In addition, as a result of comparison with similar compression methods, it was verified that the compression efficiency was more than doubled.

Analyze weeds classification with visual explanation based on Convolutional Neural Networks

  • Vo, Hoang-Trong;Yu, Gwang-Hyun;Nguyen, Huy-Toan;Lee, Ju-Hwan;Dang, Thanh-Vu;Kim, Jin-Young
    • Smart Media Journal
    • /
    • v.8 no.3
    • /
    • pp.31-40
    • /
    • 2019
  • To understand how a Convolutional Neural Network (CNN) model captures the features of a pattern to determine which class it belongs to, in this paper, we use Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize and analyze how well a CNN model behave on the CNU weeds dataset. We apply this technique to Resnet model and figure out which features this model captures to determine a specific class, what makes the model get a correct/wrong classification, and how those wrong label images can cause a negative effect to a CNN model during the training process. In the experiment, Grad-CAM highlights the important regions of weeds, depending on the patterns learned by Resnet, such as the lobe and limb on 미국가막사리, or the entire leaf surface on 단풍잎돼지풀. Besides, Grad-CAM points out a CNN model can localize the object even though it is trained only for the classification problem.

Automatic detection of icing wind turbine using deep learning method

  • Hacıefendioglu, Kemal;Basaga, Hasan Basri;Ayas, Selen;Karimi, Mohammad Tordi
    • Wind and Structures
    • /
    • v.34 no.6
    • /
    • pp.511-523
    • /
    • 2022
  • Detecting the icing on wind turbine blades built-in cold regions with conventional methods is always a very laborious, expensive and very difficult task. Regarding this issue, the use of smart systems has recently come to the agenda. It is quite possible to eliminate this issue by using the deep learning method, which is one of these methods. In this study, an application has been implemented that can detect icing on wind turbine blades images with visualization techniques based on deep learning using images. Pre-trained models of Resnet-50, VGG-16, VGG-19 and Inception-V3, which are well-known deep learning approaches, are used to classify objects automatically. Grad-CAM, Grad-CAM++, and Score-CAM visualization techniques were considered depending on the deep learning methods used to predict the location of icing regions on the wind turbine blades accurately. It was clearly shown that the best visualization technique for localization is Score-CAM. Finally, visualization performance analyses in various cases which are close-up and remote photos of a wind turbine, density of icing and light were carried out using Score-CAM for Resnet-50. As a result, it is understood that these methods can detect icing occurring on the wind turbine with acceptable high accuracy.

Improving Chest X-ray Image Classification via Integration of Self-Supervised Learning and Machine Learning Algorithms

  • Tri-Thuc Vo;Thanh-Nghi Do
    • Journal of information and communication convergence engineering
    • /
    • v.22 no.2
    • /
    • pp.165-171
    • /
    • 2024
  • In this study, we present a novel approach for enhancing chest X-ray image classification (normal, Covid-19, edema, mass nodules, and pneumothorax) by combining contrastive learning and machine learning algorithms. A vast amount of unlabeled data was leveraged to learn representations so that data efficiency is improved as a means of addressing the limited availability of labeled data in X-ray images. Our approach involves training classification algorithms using the extracted features from a linear fine-tuned Momentum Contrast (MoCo) model. The MoCo architecture with a Resnet34, Resnet50, or Resnet101 backbone is trained to learn features from unlabeled data. Instead of only fine-tuning the linear classifier layer on the MoCopretrained model, we propose training nonlinear classifiers as substitutes for softmax in deep networks. The empirical results show that while the linear fine-tuned ImageNet-pretrained models achieved the highest accuracy of only 82.9% and the linear fine-tuned MoCo-pretrained models an increased highest accuracy of 84.8%, our proposed method offered a significant improvement and achieved the highest accuracy of 87.9%.

Image Recognition System for Early Detection of Oral Cancer (구강암 조기발견을 위한 영상인식 시스템)

  • Cahyadi, Edward Dwijayanto;Song, Mi-Hwa
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2022.05a
    • /
    • pp.309-311
    • /
    • 2022
  • Oral cancer is a type of cancer that has a high possibility to be cured if it is threatened earlier. The convolutional neural network is very popular for being a good algorithm for image recognition. In this research, we try to compare 4 different architectures of the CNN algorithm: Convnet, VGG16, Inception V3, and Resnet. As we compared those 4 architectures we found that VGG16 and Resnet model has better performance with an 85.35% accuracy rate compared to the other 3 architectures. In the future, we are sure that image recognition can be more developed to identify oral cancer earlier.

A Construction of Web Application Platform for Detection and Identification of Various Diseases in Tomato Plants Using a Deep Learning Algorithm (딥러닝 알고리즘을 이용한 토마토에서 발생하는 여러가지 병해충의 탐지와 식별에 대한 웹응용 플렛폼의 구축)

  • Na, Myung Hwan;Cho, Wanhyun;Kim, SangKyoon
    • Journal of Korean Society for Quality Management
    • /
    • v.48 no.4
    • /
    • pp.581-596
    • /
    • 2020
  • Purpose: purpose of this study was to propose the web application platform which can be to detect and discriminate various diseases and pest of tomato plant based on the large amount of disease image data observed in the facility or the open field. Methods: The deep learning algorithms uesed at the web applivation platform are consisted as the combining form of Faster R-CNN with the pre-trained convolution neural network (CNN) models such as SSD_mobilenet v1, Inception v2, Resnet50 and Resnet101 models. To evaluate the superiority of the newly proposed web application platform, we collected 850 images of four diseases such as Bacterial cankers, Late blight, Leaf miners, and Powdery mildew that occur the most frequent in tomato plants. Of these, 750 were used to learn the algorithm, and the remaining 100 images were used to evaluate the algorithm. Results: From the experiments, the deep learning algorithm combining Faster R-CNN with SSD_mobilnet v1, Inception v2, Resnet50, and Restnet101 showed detection accuracy of 31.0%, 87.7%, 84.4%, and 90.8% respectively. Finally, we constructed a web application platform that can detect and discriminate various tomato deseases using best deep learning algorithm. If farmers uploaded image captured by their digital cameras such as smart phone camera or DSLR (Digital Single Lens Reflex) camera, then they can receive an information for detection, identification and disease control about captured tomato disease through the proposed web application platform. Conclusion: Incheon Port needs to act actively paying.

Study on the White Noise effect Against Adversarial Attack for Deep Learning Model for Image Recognition (영상 인식을 위한 딥러닝 모델의 적대적 공격에 대한 백색 잡음 효과에 관한 연구)

  • Lee, Youngseok;Kim, Jongweon
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.15 no.1
    • /
    • pp.27-35
    • /
    • 2022
  • In this paper we propose white noise adding method to prevent missclassification of deep learning system by adversarial attacks. The proposed method is that adding white noise to input image that is benign or adversarial example. The experimental results are showing that the proposed method is robustness to 3 adversarial attacks such as FGSM attack, BIN attack and CW attack. The recognition accuracies of Resnet model with 18, 34, 50 and 101 layers are enhanced when white noise is added to test data set while it does not affect to classification of benign test dataset. The proposed model is applicable to defense to adversarial attacks and replace to time- consuming and high expensive defense method against adversarial attacks such as adversarial training method and deep learning replacing method.

Comparison of Fine-Tuned Convolutional Neural Networks for Clipart Style Classification

  • Lee, Seungbin;Kim, Hyungon;Seok, Hyekyoung;Nang, Jongho
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.9 no.4
    • /
    • pp.1-7
    • /
    • 2017
  • Clipart is artificial visual contents that are created using various tools such as Illustrator to highlight some information. Here, the style of the clipart plays a critical role in determining how it looks. However, previous studies on clipart are focused only on the object recognition [16], segmentation, and retrieval of clipart images using hand-craft image features. Recently, some clipart classification researches based on the style similarity using CNN have been proposed, however, they have used different CNN-models and experimented with different benchmark dataset so that it is very hard to compare their performances. This paper presents an experimental analysis of the clipart classification based on the style similarity with two well-known CNN-models (Inception Resnet V2 [13] and VGG-16 [14] and transfers learning with the same benchmark dataset (Microsoft Style Dataset 3.6K). From this experiment, we find out that the accuracy of Inception Resnet V2 is better than VGG for clipart style classification because of its deep nature and convolution map with various sizes in parallel. We also find out that the end-to-end training can improve the accuracy more than 20% in both CNN models.