• Title/Summary/Keyword: deep transfer learning

Image-Based Automatic Detection of Construction Helmets Using R-FCN and Transfer Learning (R-FCN과 Transfer Learning 기법을 이용한 영상기반 건설 안전모 자동 탐지)

  • Park, Sangyoon;Yoon, Sanghyun;Heo, Joon
    • KSCE Journal of Civil and Environmental Engineering Research, v.39 no.3, pp.399-407, 2019
  • In Korea, the construction industry is known to have the highest risk of safety accidents among all industries, and many studies have therefore sought to improve construction safety. This study aims to improve the safety of workers on construction sites by building an effective automatic safety helmet detection system that applies an object detection algorithm to image data from the construction field. Deep learning was conducted using a Region-based Fully Convolutional Network (R-FCN), an object detection algorithm based on Convolutional Neural Networks (CNN), together with transfer learning. Training used 1,089 images containing humans and safety helmets collected from ImageNet, and the mean Average Precision (mAP) for the human and safety helmet classes was measured as 0.86 and 0.83, respectively.

Robot Vision to Audio Description Based on Deep Learning for Effective Human-Robot Interaction (효과적인 인간-로봇 상호작용을 위한 딥러닝 기반 로봇 비전 자연어 설명문 생성 및 발화 기술)

  • Park, Dongkeon;Kang, Kyeong-Min;Bae, Jin-Woo;Han, Ji-Hyeong
    • The Journal of Korea Robotics Society, v.14 no.1, pp.22-30, 2019
  • For effective human-robot interaction, robots need not only to understand the current situational context well but also to convey that understanding to the human participant efficiently. The most convenient way for a robot to do so is to express its understanding using voice and natural language. Recently, artificial intelligence for video understanding and natural language processing has developed very rapidly, especially based on deep learning. Thus, this paper proposes a robot vision to audio description method using deep learning. The applied model is a pipeline of two deep learning models: one generates a natural language sentence from robot vision, and the other generates voice from the generated sentence. We also conduct a real robot experiment to show the effectiveness of our method in human-robot interaction.

Transfer Learning Backbone Network Model Analysis for Human Activity Classification Using Imagery (영상기반 인체행위분류를 위한 전이학습 중추네트워크모델 분석)

  • Kim, Jong-Hwan;Ryu, Junyeul
    • Journal of the Korea Society for Simulation, v.31 no.1, pp.11-18, 2022
  • Recently, research on classifying human activity from imagery has been actively conducted for crime prevention and facility safety in public places and facilities. To improve the performance of human activity classification, most studies have applied deep learning-based transfer learning. However, despite the growing number of backbone network models that underlie deep learning and the diversification of their architectures, research on finding a backbone network model suited to the operational purpose remains insufficient because of the tendency to default to a particular model. Thus, this study applies transfer learning to recently developed deep learning backbone network models to build an intelligent system that classifies human activity from imagery. For this, 12 types of active, high-contact human activities based on sports, rather than basic human behaviors, were selected and 7,200 images were collected. After 20 epochs of transfer learning were applied equally to five backbone network models, we quantitatively analyzed them in terms of learning process and resulting performance to find the best backbone network model for human activity classification. As a result, the XceptionNet model achieved 0.99 and 0.91 in training and validation accuracy, 0.96 and 0.91 in Top-2 accuracy and average precision, a training time of 1,566 sec, and a model memory size of 260.4 MB. This confirmed that XceptionNet performed better than the other models.
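
The Top-2 accuracy reported above counts a prediction as correct when the true class is among the two highest-scoring classes. A minimal sketch of that metric follows; the function name `top_k_accuracy` and the toy scores are illustrative assumptions, not code or data from the paper.

```python
def top_k_accuracy(scores, labels, k=2):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one per sample (hypothetical model outputs).
    labels: list of true class indices.
    Ties are broken in favour of the lower class index (sorted() is stable).
    """
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy scores for four samples over three classes (made up for illustration).
scores = [
    [0.6, 0.3, 0.1],  # true class 0: correct at top-1
    [0.5, 0.4, 0.1],  # true class 1: correct only at top-2
    [0.7, 0.2, 0.1],  # true class 2: missed even at top-2
    [0.1, 0.2, 0.7],  # true class 2: correct at top-1
]
labels = [0, 1, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

Top-2 accuracy can never be lower than top-1 accuracy on the same predictions, which is why the two figures are usually reported side by side.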

A Fully Convolutional Network Model for Classifying Liver Fibrosis Stages from Ultrasound B-mode Images (초음파 B-모드 영상에서 FCN(fully convolutional network) 모델을 이용한 간 섬유화 단계 분류 알고리즘)

  • Kang, Sung Ho;You, Sun Kyoung;Lee, Jeong Eun;Ahn, Chi Young
    • Journal of Biomedical Engineering Research, v.41 no.1, pp.48-54, 2020
  • In this paper, we deal with a liver fibrosis classification problem using ultrasound B-mode images. Representative methods for classifying the stages of liver fibrosis include liver biopsy and diagnosis based on ultrasound images. The overall liver shape and the smoothness or roughness of the speckle pattern in ultrasound images are used to determine the fibrosis stage. Although ultrasound image-based classification is frequently used as an alternative or complement to invasive biopsy, it has the limitation that the fibrosis stage decision depends on image quality and the doctor's experience. With the rapid development of deep learning algorithms, several studies have applied deep learning to automated liver fibrosis classification and reported high accuracy. The performance of those deep learning methods depends closely on the amount of data. We propose an enhanced U-net architecture to maximize classification accuracy with a limited, small image dataset. U-net is well known as a neural network for fast and precise segmentation of medical images; we redesign it for the purpose of classifying liver fibrosis stages. To assess the performance of the proposed architecture, numerical experiments are conducted on a total of 118 ultrasound B-mode images acquired from 78 patients with liver fibrosis symptoms at stages F0 to F4. The experimental results indicate that the proposed architecture performs much better than transfer learning using a pre-trained VGGNet model.

Multiple Fusion-based Deep Cross-domain Recommendation (다중 융합 기반 심층 교차 도메인 추천)

  • Hong, Minsung;Lee, WonJin
    • Journal of Korea Multimedia Society, v.25 no.6, pp.819-832, 2022
  • A cross-domain recommender system transfers knowledge across different domains to improve recommendation performance in a target domain that has a relatively sparse model. However, such systems suffer from "negative transfer", in which transferred knowledge acts as noise. This paper proposes a novel Multiple Fusion-based Deep Cross-Domain Recommendation model named MFDCR. We exploit Doc2Vec, a well-known word embedding technique, to fuse data user-wise and transfer knowledge across multiple domains, which alleviates the "negative transfer" problem. Additionally, we introduce a simple multi-layer perceptron to learn the user-item interactions and predict how likely users are to prefer items. Extensive experiments with three domain datasets from Amazon, one of the most famous services, demonstrate that MFDCR outperforms recent single-domain and cross-domain recommendation algorithms. Furthermore, the experimental results show that MFDCR can address the "negative transfer" problem and improve recommendation performance for multiple domains simultaneously. In addition, we show that our approach extends efficiently to more domains.

Unsupervised Transfer Learning for Plant Anomaly Recognition

  • Xu, Mingle;Yoon, Sook;Lee, Jaesu;Park, Dong Sun
    • Smart Media Journal, v.11 no.4, pp.30-37, 2022
  • Disease threatens plant growth, and recognizing the type of disease is essential to finding a remedy. In recent years, deep learning has brought significant improvement to this task; however, a large volume of labeled images is required to obtain decent performance, and annotated images are difficult and expensive to obtain in the agricultural field. Designing an efficient and effective strategy for settings with few labeled data is therefore one of the challenges in this area. Transfer learning, which takes knowledge from a source domain to a target domain, has been borrowed to address this issue and has yielded comparable results. However, current transfer learning strategies can be regarded as supervised methods because they assume that many labeled images are available in the source domain. In contrast, unsupervised transfer learning, which uses only the images in a source domain, is more convenient because collecting images is much easier than annotating them. In this paper, we leverage unsupervised transfer learning to perform plant disease recognition and achieve better performance than supervised transfer learning in many cases. In addition, a vision transformer, which has a larger model capacity than convolutional networks, is used to obtain a better pretrained feature space. With vision transformer-based unsupervised transfer learning, we achieve better results than current works on two datasets. In particular, we obtain 97.3% accuracy with only 30 training images per class on the Plant Village dataset. We hope that our work encourages the community to pay attention to vision transformer-based unsupervised transfer learning in the agricultural field when few labeled images are available.

Deep Learning-Based Brain Tumor Classification in MRI images using Ensemble of Deep Features

  • Kang, Jaeyong;Gwak, Jeonghwan
    • Journal of the Korea Society of Computer and Information, v.26 no.7, pp.37-44, 2021
  • Automatic classification of brain MRI images plays an important role in the early diagnosis of brain tumors. In this work, we present a deep learning-based brain tumor classification model for MRI images using an ensemble of deep features. In our proposed framework, three different deep features are extracted from a brain MR image using three different pre-trained models and then fed to the classification module. In the classification module, the three deep features are first fed into fully-connected layers individually to reduce their dimension. The output features from these fully-connected layers are then concatenated and fed into a final fully-connected layer to predict the output. To evaluate our proposed model, we use an openly accessible brain MRI dataset from the web. Experimental results show that our proposed model outperforms other machine learning-based models.
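
The classification module described above (a per-branch fully-connected reduction, then concatenation, then a final fully-connected layer) can be sketched in plain Python as follows. This is only an illustration of the data flow: the feature sizes, the reduced dimension, the class count, the ReLU and softmax activations, and the random untrained weights are all assumptions, since the abstract does not specify them.

```python
import math
import random


def linear(x, weights, bias):
    """Fully-connected layer without activation: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights; a real model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
FEATURE_DIMS = [8, 6, 4]  # hypothetical sizes of the three deep feature vectors
REDUCED_DIM = 3           # hypothetical size after each per-branch layer
NUM_CLASSES = 2           # hypothetical; the abstract does not state the class count

# Stand-ins for the features extracted by three pre-trained models.
features = [[rng.random() for _ in range(d)] for d in FEATURE_DIMS]

# Step 1: one fully-connected layer per branch reduces each feature's dimension.
branch_layers = [init_layer(d, REDUCED_DIM, rng) for d in FEATURE_DIMS]
reduced = [relu(linear(f, w, b)) for f, (w, b) in zip(features, branch_layers)]

# Step 2: concatenate the reduced features into one fused vector.
fused = [v for branch in reduced for v in branch]

# Step 3: a final fully-connected layer maps the fused vector to class probabilities.
w_out, b_out = init_layer(len(fused), NUM_CLASSES, rng)
probs = softmax(linear(fused, w_out, b_out))
print(len(fused), probs)
```

Reducing each branch before concatenation keeps any one extractor with a very large feature vector from dominating the fused representation by sheer size.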

Knowledge Distillation Based Continual Learning for PCB Part Detection (PCB 부품 검출을 위한 Knowledge Distillation 기반 Continual Learning)

  • Gang, Su Myung;Chung, Daewon;Lee, Joon Jae
    • Journal of Korea Multimedia Society, v.24 no.7, pp.868-879, 2021
  • PCB (Printed Circuit Board) inspection using a deep learning model requires a large amount of data and storage. As the amount of stored data increases, problems such as long training time and insufficient storage space arise. In this study, an existing object detection model is converted into a continual learning model so that it can recognize and classify the constantly growing set of PCB components. By changing the structure of the object detection model to a knowledge distillation model, we propose a method that distills knowledge of previously classified parts while simultaneously learning new components. In the classification scenario, the transfer learning model achieves 75.9%, whereas the continual learning model proposed in this study achieves 90.7%.
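
The abstract does not give the loss used for distillation. As a rough illustration of the general idea, the sketch below uses the common temperature-softened formulation, in which a student (here standing in for the model learning new components) is trained to match both the true label and the output distribution of a teacher (standing in for the model that already knows the existing parts). The function names, the temperature, and the weighting `alpha` are assumptions, not the paper's settings.

```python
import math


def softmax(logits, temperature=1.0):
    scaled = [v / temperature for v in logits]
    m = max(scaled)
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    # Hard-label term: cross-entropy against the ground-truth class.
    ce = -math.log(softmax(student_logits)[true_label])
    # Soft-label term: match the teacher's temperature-softened distribution.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    kd = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    return alpha * ce + (1 - alpha) * kd


# Toy logits over three classes (made up for illustration).
teacher_logits = [2.0, 1.0, 0.1]
student_off = [0.1, 1.0, 2.0]

# A student that reproduces the teacher pays only the hard-label term.
loss_same = distillation_loss(teacher_logits, teacher_logits, true_label=0)
# A student that disagrees with the teacher is penalised by both terms.
loss_off = distillation_loss(student_off, teacher_logits, true_label=0)
print(loss_same, loss_off)
```

The soft-label term is what lets the new model retain what the old one knew without storing the old training data, which is the storage problem the abstract raises.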

Novel Algorithms for Early Cancer Diagnosis Using Transfer Learning with MobileNetV2 in Thermal Images

  • Swapna Davies;Jaison Jacob
    • KSII Transactions on Internet and Information Systems (TIIS), v.18 no.3, pp.570-590, 2024
  • Breast cancer ranks among the most prevalent forms of malignancy and is the foremost cause of cancer death worldwide. It is not preventable. Early and precise detection is the only remedy for lowering the mortality rate and improving patients' probability of survival. In contrast to present procedures, thermography aids in the early diagnosis of cancer and thereby saves lives. However, its accuracy suffers from low sensitivity for small and deep tumours and from physicians' subjectivity in interpreting the images. Employing deep learning approaches for cancer detection can enhance its efficacy. This study explored the use of thermography for early identification of breast cancer using a publicly released dataset known as the DMR-IR dataset. For this purpose, we employed a novel approach that takes a pre-trained MobileNetV2 model and fine-tunes it through transfer learning techniques. We created three models using MobileNetV2: the first was a baseline transfer learning model with weights trained on the ImageNet dataset, the second was a fine-tuned model with an adaptive learning rate, and the third used early stopping with callbacks during fine-tuning. The results showed that the proposed methods achieved average accuracy rates of 85.15%, 95.19%, and 98.69%, respectively; other performance indicators such as precision, sensitivity and specificity were also investigated.
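
The abstract mentions an adaptive learning rate and early stopping without detailing either. One common reading is "reduce the learning rate when validation loss plateaus" combined with "stop when it has not improved for a set number of epochs". The sketch below simulates that control logic on a made-up sequence of validation losses; the function name, the patience values, the reduction factor, and the loss values are all assumptions for illustration, and no real training is performed.

```python
def run_schedule(val_losses, lr=1e-3, lr_patience=2, lr_factor=0.5, stop_patience=4):
    """Simulate reduce-on-plateau plus early stopping over a list of validation losses.

    Returns the best loss seen and a per-epoch history of (epoch, loss, lr).
    """
    best = float("inf")
    stale = 0  # epochs since the last improvement
    history = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            # Adaptive learning rate: shrink it every lr_patience stale epochs.
            if stale % lr_patience == 0:
                lr *= lr_factor
        history.append((epoch, loss, lr))
        # Early stopping: give up after stop_patience epochs without improvement.
        if stale >= stop_patience:
            break
    return best, history


# Hypothetical validation losses: improvement stalls after the third epoch.
val_losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.63, 0.64, 0.5]
best_loss, history = run_schedule(val_losses)
for epoch, loss, lr in history:
    print(epoch, loss, lr)
```

With these settings the run halts before reaching the final, lower loss value, which illustrates the usual trade-off: a short patience saves training time but can stop before a late improvement.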

Efficient Driver Attention Monitoring Using Pre-Trained Deep Convolution Neural Network Models

  • Kim, JongBae
    • International Journal of Internet, Broadcasting and Communication, v.14 no.2, pp.119-128, 2022
  • Recently, with the development of technologies for autonomous vehicles, driving is becoming safer. However, the development of support technologies for level 5 fully autonomous driving is still insufficient. That is, even in an autonomous vehicle, the driver needs to keep paying attention to the road ahead while driving. In this paper, we propose a method to monitor driving tasks by recognizing driver behavior. The proposed method uses pre-trained deep convolutional neural network models to recognize whether the driver's face or body makes unnecessary movements. The use of pre-trained Deep Convolutional Neural Network (DCNN) models enables high accuracy in a relatively short time and has the advantage of overcoming the limitation of having only a small amount of driver behavior training data. The proposed method can be applied to intelligent vehicle safe-driving support systems, such as drowsy driving detection and abnormal driving detection.