• Title/Summary/Keyword: Vgg16

Apple Detection Algorithm based on an Improved SSD (개선 된 SSD 기반 사과 감지 알고리즘)

  • Ding, Xilong;Li, Qiutan;Wang, Xufei;Chen, Le;Son, Jinku;Song, Jeong-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.3 / pp.81-89 / 2021
  • Under natural conditions, apple detection suffers from occlusion and from the difficulty of detecting small objects. This paper proposes an improved model based on SSD. The SSD backbone network, VGG16, is replaced with the ResNet50 network model, and a receptive field block (RFB) structure is introduced. The RFB module amplifies the feature information of small objects and improves their detection accuracy. Combined with a squeeze-and-excitation (SE) attention mechanism that filters out the information that needs to be retained, the semantic information of the detected objects is enhanced. The improved SSD algorithm is trained on the VOC2007 dataset. Compared with SSD, the improved algorithm increases the detection accuracy for occluded and small objects by 3.4% and 3.9%, respectively. It also improves the false detection rate and the missed detection rate, and it is more efficient.

Implementation of the Stone Classification with AI Algorithm Based on VGGNet Neural Networks (VGGNet을 활용한 석재분류 인공지능 알고리즘 구현)

  • Choi, Kyung Nam
    • Smart Media Journal / v.10 no.1 / pp.32-38 / 2021
  • Image classification of photographs through deep learning has been a very active research field for the past several years. In this paper, we propose a method for automatically discriminating images of domestically sourced stone through deep learning. Python's hash library is used to scan 300×300 pixel photo images of granites such as Hwangdeungseok, Goheungseok, and Pocheonseok. As data preprocessing, the images for each stone are checked for duplicates, and images with the same hash value are removed to create the training images for each stone. To use VGGNet, the images are then resized to 224×224 pixels and learned with VGG16, with the training and validation data split 80% to 20%. After training, the loss function graph and the accuracy graph were generated, and the prediction results of the deep learning model were output for the three kinds of stone images.
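The preprocessing described in this abstract (hash-based duplicate removal followed by an 80/20 split) can be sketched roughly as follows. This is only an illustration under assumptions: the abstract does not say which hash function was used, so MD5 from `hashlib` is a guess, and the helper names `remove_duplicates` and `split_train_val` are hypothetical. Resizing to 224×224 would need an image library and is left out.

```python
import hashlib
import random


def remove_duplicates(images):
    """Keep the first image seen for each distinct content hash.

    `images` maps a file name to the raw bytes of that image.
    """
    seen = set()
    unique = {}
    for name, data in images.items():
        # MD5 is an assumption; the abstract only mentions "Python's hash library".
        digest = hashlib.md5(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = data
    return unique


def split_train_val(names, train_ratio=0.8, seed=0):
    """Shuffle the names deterministically and split them into two lists."""
    names = sorted(names)
    random.Random(seed).shuffle(names)
    cut = int(len(names) * train_ratio)
    return names[:cut], names[cut:]


# Toy in-memory "images"; real use would read the bytes from image files.
images = {
    "a.jpg": b"granite-1",
    "b.jpg": b"granite-2",
    "c.jpg": b"granite-1",  # same bytes as a.jpg, so it is dropped
    "d.jpg": b"granite-3",
    "e.jpg": b"granite-4",
    "f.jpg": b"granite-5",
}
unique = remove_duplicates(images)
train, val = split_train_val(unique.keys())
```

An exact content hash only catches byte-identical files; re-encoded or resized copies of the same photo would need a perceptual hash instead.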

Compression of DNN Integer Weight using Video Encoder (비디오 인코더를 통한 딥러닝 모델의 정수 가중치 압축)

  • Kim, Seunghwan;Ryu, Eun-Seok
    • Journal of Broadcast Engineering / v.26 no.6 / pp.778-789 / 2021
  • Recently, various lightweight methods for using convolutional neural network (CNN) models on mobile devices have emerged. Weight quantization, which lowers the bit precision of weights, is a lightweight method that lets a model run with integer arithmetic in mobile environments where GPU acceleration is unavailable. It has already been used in various models to reduce computational complexity and model size with a small loss of accuracy. Considering memory size and computing speed as well as device storage and limited network environments, this paper proposes compressing the integer weights after quantization using a video codec. To verify the performance of the proposed method, experiments were conducted on VGG16, ResNet50, and ResNet18 models trained on the ImageNet and Places365 datasets. The results show an accuracy loss of less than 2% and high compression efficiency across the models. In addition, a comparison with similar compression methods verified that the compression efficiency was more than doubled.
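To make the idea of weight quantization concrete, here is a minimal sketch of symmetric uniform quantization to 8-bit integers. It is an assumption for illustration, not the scheme used in the paper, which the abstract does not specify. The video-codec compression step is not reproduced, and the function names are hypothetical.

```python
def quantize(weights, num_bits=8):
    """Map floats to signed integers with one shared scale factor."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer levels."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 0.731, 1.27]
q, scale = quantize(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The rounding error per weight is bounded by half a quantization step (`scale / 2`), which is why a small accuracy loss is plausible after quantization.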

Deep Learning-based Pes Planus Classification Model Using Transfer Learning

  • Kim, Yeonho;Kim, Namgyu
    • Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.21-28 / 2021
  • This study proposes a deep learning-based flat foot classification methodology using transfer learning. We used transfer learning with a pre-trained VGG16 model and a data augmentation technique to build a model with high predictive accuracy from a total of 176 images, consisting of 88 flat feet and 88 normal feet. To evaluate the performance of the proposed model, we performed an experiment comparing the prediction accuracy of a basic CNN-based model with that of the model derived through the proposed methodology. For the basic CNN model, the training accuracy was 77.27%, the validation accuracy was 61.36%, and the test accuracy was 59.09%. For our proposed model, the training accuracy was 94.32%, the validation accuracy was 86.36%, and the test accuracy was 84.09%, indicating that the accuracy of our model was significantly higher than that of the basic CNN model.

Classification of Whole Body Bone Scan Image with Bone Metastasis using CNN-based Transfer Learning (CNN 기반 전이학습을 이용한 뼈 전이가 존재하는 뼈 스캔 영상 분류)

  • Yim, Ji Yeong;Do, Thanh Cong;Kim, Soo Hyung;Lee, Guee Sang;Lee, Min Hee;Min, Jung Joon;Bom, Hee Seung;Kim, Hyeon Sik;Kang, Sae Ryung;Yang, Hyung Jeong
    • Journal of Korea Multimedia Society / v.25 no.8 / pp.1224-1232 / 2022
  • The whole body bone scan is the most frequently performed nuclear medicine imaging study for evaluating bone metastasis in cancer patients. We evaluated the performance of a VGG16-based transfer learning classifier for bone scan images in which metastatic bone lesions were present. A total of 1,000 bone scans from 1,000 cancer patients (500 patients with bone metastasis, 500 patients without bone metastasis) were evaluated. Bone scans were labeled as abnormal or normal for bone metastasis using medical reports and image review. Subsequently, gradient-weighted class activation maps (Grad-CAMs) were generated for explainable AI. The proposed model showed an AUROC of 0.96 and an F1-score of 0.90, indicating that it outperforms VGG16, ResNet50, Xception, DenseNet121, and InceptionV3. Grad-CAM visualization showed that the proposed model focuses on hot uptakes, which indicate active bone lesions, when classifying whole body bone scan images with bone metastases.
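For readers unfamiliar with the two metrics reported in this abstract, the following sketch shows how F1-score and AUROC are computed for a binary classifier. The labels and scores are made-up toy values, not data from the paper, and the function names are hypothetical.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = abnormal (metastasis), 0 = normal.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

f1 = f1_score(y_true, y_pred)
auc = auroc(y_true, scores)
```

F1 depends on the chosen decision threshold, while AUROC summarizes ranking quality across all thresholds, which is why the two are usually reported together.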

Classification of Apple Tree Leaves Diseases using Deep Learning Methods

  • Alsayed, Ashwaq;Alsabei, Amani;Arif, Muhammad
    • International Journal of Computer Science & Network Security / v.21 no.7 / pp.324-330 / 2021
  • Agriculture is one of the essential needs of human life on planet Earth. It is the source of food and earnings for many individuals around the world, and the economy of many countries is tied to the agriculture sector. Many diseases attack various fruits and crops. Apple tree leaves also suffer from different types of pathological conditions that affect production, including apple scab, cedar apple rust, and multiple diseases. In this paper, an automatic detection framework based on deep learning is investigated for apple leaf disease classification. Different pre-trained models, VGG16, ResNetV2, InceptionV3, and MobileNetV2, are considered for transfer learning. Combinations of parameters such as learning rate, batch size, and optimizer are analyzed, and the best combination, ResNetV2 with the Adam optimizer, achieved a classification accuracy of 94%.

Road Damage Detection and Classification based on Multi-level Feature Pyramids

  • Yin, Junru;Qu, Jiantao;Huang, Wei;Chen, Qiqiang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.2 / pp.786-799 / 2021
  • Road damage detection is important for road maintenance. With the development of deep learning, more and more road damage detection methods have been proposed, such as Fast R-CNN, Faster R-CNN, Mask R-CNN, and RetinaNet. However, because shallow and deep layers cannot be extracted at the same time, the existing methods do not perform well in detecting objects with fewer samples. In addition, these methods cannot obtain highly accurate bounding boxes. This paper presents a Multi-level Feature Pyramids method based on M2Det. Because the feature layer has a multi-scale and multi-level architecture, feature layers containing more information and more obvious features can be extracted. Moreover, an attention mechanism is used to improve the accuracy of local bounding boxes in the dataset. Experimental results show that the proposed method is better than the current state-of-the-art methods.

An Improved Deep Learning Method for Animal Images (동물 이미지를 위한 향상된 딥러닝 학습)

  • Wang, Guangxing;Shin, Seong-Yoon;Shin, Kwang-Weong;Lee, Hyun-Chang
    • Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.123-124 / 2019
  • This paper proposes an improved deep learning method based on small datasets for animal image classification. First, we use a CNN to build a training model for small datasets and use data augmentation to expand the data samples of the training set. Second, using a network pre-trained on large-scale datasets, such as VGG16, the bottleneck features of the small dataset are extracted and stored in two NumPy files as the new training and test datasets. Finally, a fully connected network is trained on the new datasets. We use the well-known Kaggle Dogs vs. Cats dataset, a two-class classification dataset, as the experimental dataset.

SIFT Image Feature Detect based on Deep learning (딥 러닝 기반의 SIFT 이미지 특징 검출)

  • Lee, Jae-Eun;Moon, Won-Jun;Seo, Young-Ho;Kim, Dong-Wook
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.122-123 / 2018
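  • This paper proposes a method for detecting SIFT image features based on the deep learning model VGG-16, using a dataset built from RobHess SIFT features detected with the octave (scale vector) set to 0, sigma set to 1.6, and intervals set to 3. The dataset was constructed by cropping the DIV2K dataset into 33×33 patches, and, unlike SIFT, which operates on grayscale images, RGB images were used. The original images were transformed by horizontal flipping and by adjusting brightness, rotation, and scale for network training and evaluation. The network determines whether the pixel located at the center of the image is a feature point. An accuracy of 98.207% was obtained on the validation data.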

Automatic Detection of Work Distraction with Deep Learning Technique for Remote Management of Telecommuting

  • Lee, Wan Yeon
    • International journal of advanced smart convergence / v.10 no.1 / pp.82-88 / 2021
  • In this paper, we propose an automatic scheme for detecting work distraction for the remote management of telecommuting. The proposed scheme periodically captures two consecutive computer screens and generates the difference image of the two captured images. The scheme feeds the difference image to our deep learning model, which decides whether the difference image contains abnormal patterns. Our deep learning model is designed with transfer learning from VGG16. When the scheme detects an abnormal pattern in the difference image, it hides all text in the difference image to prevent disclosure of privacy-related information. Evaluation shows that the proposed scheme provides about 96% detection accuracy.
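The difference-image step in this abstract can be illustrated with a small sketch. It assumes two same-sized grayscale captures represented as nested lists of pixel intensities. The function names, and the `changed_ratio` helper in particular, are hypothetical and only there to inspect the result; in the paper the difference image is passed to a VGG16-based model, which is not reproduced here.

```python
def difference_image(prev, curr):
    """Per-pixel absolute difference of two same-sized grayscale images."""
    return [
        [abs(a - b) for a, b in zip(row_prev, row_curr)]
        for row_prev, row_curr in zip(prev, curr)
    ]


def changed_ratio(diff, threshold=0):
    """Fraction of pixels whose difference exceeds the threshold."""
    total = sum(len(row) for row in diff)
    changed = sum(1 for row in diff for value in row if value > threshold)
    return changed / total


# Two toy 2x3 "screen captures" taken one period apart.
prev = [[10, 10, 10], [20, 20, 20]]
curr = [[10, 50, 10], [20, 20, 0]]

diff = difference_image(prev, curr)
ratio = changed_ratio(diff)
```

Unchanged regions come out as zeros, so the difference image keeps only what moved between the two captures.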