• 제목/요약/키워드: Resnet

검색결과 60건 처리시간 0.03초

백본 네트워크에 따른 사람 속성 검출 모델의 성능 변화 분석 (Analyzing DNN Model Performance Depending on Backbone Network )

  • 박천수
    • 반도체디스플레이기술학회지
    • /
    • 제22권2호
    • /
    • pp.128-132
    • /
    • 2023
  • Recently, with the development of deep learning technology, research on pedestrian attribute recognition technology using deep neural networks has been actively conducted. Existing pedestrian attribute recognition techniques can be obtained in such a way as global-based, regional-area-based, visual attention-based, sequential prediction-based, and newly designed loss function-based, depending on how pedestrian attributes are detected. It is known that the performance of these pedestrian attribute recognition technologies varies greatly depending on the type of backbone network that constitutes the deep neural networks model. Therefore, in this paper, several backbone networks are applied to the baseline pedestrian attribute recognition model and the performance changes of the model are analyzed. In this paper, the analysis is conducted using Resnet34, Resnet50, Resnet101, Swin-tiny, and Swinv2-tiny, which are representative backbone networks used in the fields of image classification, object detection, etc. Furthermore, this paper analyzes the change in time complexity when inferencing each backbone network using a CPU and a GPU.

  • PDF

Empirical Comparison of Deep Learning Networks on Backbone Method of Human Pose Estimation

  • Rim, Beanbonyka;Kim, Junseob;Choi, Yoo-Joo;Hong, Min
    • 인터넷정보학회논문지
    • /
    • 제21권5호
    • /
    • pp.21-29
    • /
    • 2020
  • Accurate estimation of human pose relies on backbone method in which its role is to extract feature map. Up to dated, the method of backbone feature extraction is conducted by the plain convolutional neural networks named by CNN and the residual neural networks named by Resnet, both of which have various architectures and performances. The CNN family network such as VGG which is well-known as a multiple stacked hidden layers architecture of deep learning methods, is base and simple while Resnet which is a bottleneck layers architecture yields fewer parameters and outperform. They have achieved inspired results as a backbone network in human pose estimation. However, they were used then followed by different pose estimation networks named by pose parsing module. Therefore, in this paper, we present a comparison between the plain CNN family network (VGG) and bottleneck network (Resnet) as a backbone method in the same pose parsing module. We investigate their performances such as number of parameters, loss score, precision and recall. We experiment them in the bottom-up method of human pose estimation system by adapted the pose parsing module of openpose. Our experimental results show that the backbone method using VGG network outperforms the Resent network with fewer parameter, lower loss score and higher accuracy of precision and recall.

비디오 인코더를 통한 딥러닝 모델의 정수 가중치 압축 (Compression of DNN Integer Weight using Video Encoder)

  • 김승환;류은석
    • 방송공학회논문지
    • /
    • 제26권6호
    • /
    • pp.778-789
    • /
    • 2021
  • 최근 다양한 분야에서 뛰어난 성능을 나타내는 Convolutional Neural Network(CNN)모델을 모바일 기기에서 사용하기 위한 다양한 연구가 진행되고 있다. 기존의 CNN 모델은 모바일 장비에서 사용하기에는 가중치의 크기가 크고 연산복잡도가 높다는 문제점이 있다. 이를 해결하기 위해 가중치의 표현 비트를 낮추는 가중치 양자화를 포함한 여러 경량화 방법들이 등장하였다. 많은 방법들이 다양한 모델에서 적은 정확도 손실과 높은 압축률을 나타냈지만, 대부분의 압축 모델들은 정확도 손실을 복구하기 위한 재학습 과정을 포함시켰다. 재학습 과정은 압축된 모델의 정확도 손실을 최소화하지만 많은 시간과 데이터를 필요로 하는 작업이다. Weight Quantization이후 각 층의 가중치는 정수형 행렬로 나타나는데 이는 이미지의 형태와 유사하다. 본 논문에서는 Weight Quantization이후 각 층의 정수 가중치 행렬을 이미지의 형태로 비디오 코덱을 사용하여 압축하는 방법을 제안한다. 제안하는 방법의 성능을 검증하기 위해 ImageNet과 Places365 데이터 셋으로 학습된 VGG16, Resnet50, Resnet18모델에 실험을 진행하였다. 그 결과 다양한 모델에서 2%이하의 정확도 손실과 높은 압축 효율을 달성했다. 또한, 재학습 과정을 제외한 압축방법인 No Fine-tuning Pruning(NFP)와 ThiNet과의 성능비교 결과 2배 이상의 압축효율이 있음을 검증했다.

Analyze weeds classification with visual explanation based on Convolutional Neural Networks

  • Vo, Hoang-Trong;Yu, Gwang-Hyun;Nguyen, Huy-Toan;Lee, Ju-Hwan;Dang, Thanh-Vu;Kim, Jin-Young
    • 스마트미디어저널
    • /
    • 제8권3호
    • /
    • pp.31-40
    • /
    • 2019
  • To understand how a Convolutional Neural Network (CNN) model captures the features of a pattern to determine which class it belongs to, in this paper, we use Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize and analyze how well a CNN model behave on the CNU weeds dataset. We apply this technique to Resnet model and figure out which features this model captures to determine a specific class, what makes the model get a correct/wrong classification, and how those wrong label images can cause a negative effect to a CNN model during the training process. In the experiment, Grad-CAM highlights the important regions of weeds, depending on the patterns learned by Resnet, such as the lobe and limb on 미국가막사리, or the entire leaf surface on 단풍잎돼지풀. Besides, Grad-CAM points out a CNN model can localize the object even though it is trained only for the classification problem.

Automatic detection of icing wind turbine using deep learning method

  • Hacıefendioglu, Kemal;Basaga, Hasan Basri;Ayas, Selen;Karimi, Mohammad Tordi
    • Wind and Structures
    • /
    • 제34권6호
    • /
    • pp.511-523
    • /
    • 2022
  • Detecting the icing on wind turbine blades built-in cold regions with conventional methods is always a very laborious, expensive and very difficult task. Regarding this issue, the use of smart systems has recently come to the agenda. It is quite possible to eliminate this issue by using the deep learning method, which is one of these methods. In this study, an application has been implemented that can detect icing on wind turbine blades images with visualization techniques based on deep learning using images. Pre-trained models of Resnet-50, VGG-16, VGG-19 and Inception-V3, which are well-known deep learning approaches, are used to classify objects automatically. Grad-CAM, Grad-CAM++, and Score-CAM visualization techniques were considered depending on the deep learning methods used to predict the location of icing regions on the wind turbine blades accurately. It was clearly shown that the best visualization technique for localization is Score-CAM. Finally, visualization performance analyses in various cases which are close-up and remote photos of a wind turbine, density of icing and light were carried out using Score-CAM for Resnet-50. As a result, it is understood that these methods can detect icing occurring on the wind turbine with acceptable high accuracy.

Improving Chest X-ray Image Classification via Integration of Self-Supervised Learning and Machine Learning Algorithms

  • Tri-Thuc Vo;Thanh-Nghi Do
    • Journal of information and communication convergence engineering
    • /
    • 제22권2호
    • /
    • pp.165-171
    • /
    • 2024
  • In this study, we present a novel approach for enhancing chest X-ray image classification (normal, Covid-19, edema, mass nodules, and pneumothorax) by combining contrastive learning and machine learning algorithms. A vast amount of unlabeled data was leveraged to learn representations so that data efficiency is improved as a means of addressing the limited availability of labeled data in X-ray images. Our approach involves training classification algorithms using the extracted features from a linear fine-tuned Momentum Contrast (MoCo) model. The MoCo architecture with a Resnet34, Resnet50, or Resnet101 backbone is trained to learn features from unlabeled data. Instead of only fine-tuning the linear classifier layer on the MoCopretrained model, we propose training nonlinear classifiers as substitutes for softmax in deep networks. The empirical results show that while the linear fine-tuned ImageNet-pretrained models achieved the highest accuracy of only 82.9% and the linear fine-tuned MoCo-pretrained models an increased highest accuracy of 84.8%, our proposed method offered a significant improvement and achieved the highest accuracy of 87.9%.

구강암 조기발견을 위한 영상인식 시스템 (Image Recognition System for Early Detection of Oral Cancer)

  • 에드워드 카야디;송미화
    • 한국정보처리학회:학술대회논문집
    • /
    • 한국정보처리학회 2022년도 춘계학술발표대회
    • /
    • pp.309-311
    • /
    • 2022
  • Oral cancer is a type of cancer that has a high possibility to be cured if it is threatened earlier. The convolutional neural network is very popular for being a good algorithm for image recognition. In this research, we try to compare 4 different architectures of the CNN algorithm: Convnet, VGG16, Inception V3, and Resnet. As we compared those 4 architectures we found that VGG16 and Resnet model has better performance with an 85.35% accuracy rate compared to the other 3 architectures. In the future, we are sure that image recognition can be more developed to identify oral cancer earlier.

딥러닝 알고리즘을 이용한 토마토에서 발생하는 여러가지 병해충의 탐지와 식별에 대한 웹응용 플렛폼의 구축 (A Construction of Web Application Platform for Detection and Identification of Various Diseases in Tomato Plants Using a Deep Learning Algorithm)

  • 나명환;조완현;김상균
    • 품질경영학회지
    • /
    • 제48권4호
    • /
    • pp.581-596
    • /
    • 2020
  • Purpose: purpose of this study was to propose the web application platform which can be to detect and discriminate various diseases and pest of tomato plant based on the large amount of disease image data observed in the facility or the open field. Methods: The deep learning algorithms uesed at the web applivation platform are consisted as the combining form of Faster R-CNN with the pre-trained convolution neural network (CNN) models such as SSD_mobilenet v1, Inception v2, Resnet50 and Resnet101 models. To evaluate the superiority of the newly proposed web application platform, we collected 850 images of four diseases such as Bacterial cankers, Late blight, Leaf miners, and Powdery mildew that occur the most frequent in tomato plants. Of these, 750 were used to learn the algorithm, and the remaining 100 images were used to evaluate the algorithm. Results: From the experiments, the deep learning algorithm combining Faster R-CNN with SSD_mobilnet v1, Inception v2, Resnet50, and Restnet101 showed detection accuracy of 31.0%, 87.7%, 84.4%, and 90.8% respectively. Finally, we constructed a web application platform that can detect and discriminate various tomato deseases using best deep learning algorithm. If farmers uploaded image captured by their digital cameras such as smart phone camera or DSLR (Digital Single Lens Reflex) camera, then they can receive an information for detection, identification and disease control about captured tomato disease through the proposed web application platform. Conclusion: Incheon Port needs to act actively paying.

영상 인식을 위한 딥러닝 모델의 적대적 공격에 대한 백색 잡음 효과에 관한 연구 (Study on the White Noise effect Against Adversarial Attack for Deep Learning Model for Image Recognition)

  • 이영석;김종원
    • 한국정보전자통신기술학회논문지
    • /
    • 제15권1호
    • /
    • pp.27-35
    • /
    • 2022
  • 본 논문에서는 영상 데이터에 대한 적대적 공격으로부터 생성된 적대적 예제로 인하여 발생할 수 있는 딥러닝 시스템의 오분류를 방어하기 위한 방법으로 분류기의 입력 영상에 백색 잡음을 가산하는 방법을 제안하였다. 제안된 방법은 적대적이든 적대적이지 않던 구분하지 않고 분류기의 입력 영상에 백색 잡음을 더하여 적대적 예제가 분류기에서 올바른 출력을 발생할 수 있도록 유도하는 것이다. 제안한 방법은 FGSM 공격, BIM 공격 및 CW 공격으로 생성된 적대적 예제에 대하여 서로 다른 레이어 수를 갖는 Resnet 모델에 적용하고 결과를 고찰하였다. 백색 잡음의 가산된 데이터의 경우 모든 Resnet 모델에서 인식률이 향상되었음을 관찰할 수 있다. 제안된 방법은 단순히 백색 잡음을 경험적인 방법으로 가산하고 결과를 관찰하였으나 에 대한 엄밀한 분석이 추가되는 경우 기존의 적대적 훈련 방법과 같이 비용과 시간이 많이 소요되는 적대적 공격에 대한 방어 기술을 제공할 수 있을 것으로 사료된다.

Comparison of Fine-Tuned Convolutional Neural Networks for Clipart Style Classification

  • Lee, Seungbin;Kim, Hyungon;Seok, Hyekyoung;Nang, Jongho
    • International Journal of Internet, Broadcasting and Communication
    • /
    • 제9권4호
    • /
    • pp.1-7
    • /
    • 2017
  • Clipart is artificial visual contents that are created using various tools such as Illustrator to highlight some information. Here, the style of the clipart plays a critical role in determining how it looks. However, previous studies on clipart are focused only on the object recognition [16], segmentation, and retrieval of clipart images using hand-craft image features. Recently, some clipart classification researches based on the style similarity using CNN have been proposed, however, they have used different CNN-models and experimented with different benchmark dataset so that it is very hard to compare their performances. This paper presents an experimental analysis of the clipart classification based on the style similarity with two well-known CNN-models (Inception Resnet V2 [13] and VGG-16 [14] and transfers learning with the same benchmark dataset (Microsoft Style Dataset 3.6K). From this experiment, we find out that the accuracy of Inception Resnet V2 is better than VGG for clipart style classification because of its deep nature and convolution map with various sizes in parallel. We also find out that the end-to-end training can improve the accuracy more than 20% in both CNN models.