• Title/Summary/Keyword: CNN training strategy

Search Result 8, Processing Time 0.021 seconds

Divide and Conquer Strategy for CNN Model in Facial Emotion Recognition based on Thermal Images (얼굴 열화상 기반 감정인식을 위한 CNN 학습전략)

  • Lee, Donghwan;Yoo, Jang-Hee
    • Journal of Software Assessment and Valuation
    • /
    • v.17 no.2
    • /
    • pp.1-10
    • /
    • 2021
  • The ability to recognize human emotions by computer vision is a very important task, with many potential applications. Therefore the demand for emotion recognition using not only RGB images but also thermal images is increasing. Compared to RGB images, thermal images has the advantage of being less affected by lighting conditions but require a more sophisticated recognition method with low-resolution sources. In this paper, we propose a Divide and Conquer-based CNN training strategy to improve the performance of facial thermal image-based emotion recognition. The proposed method first trains to classify difficult-to-classify similar emotion classes into the same class group by confusion matrix analysis and then divides and solves the problem so that the emotion group classified into the same class group is recognized again as actual emotions. In experiments, the proposed method has improved accuracy in all the tests than when recognizing all the presented emotions with a single CNN model.

Comparison of Machine Learning-Based Radioisotope Identifiers for Plastic Scintillation Detector

  • Jeon, Byoungil;Kim, Jongyul;Yu, Yonggyun;Moon, Myungkook
    • Journal of Radiation Protection and Research
    • /
    • v.46 no.4
    • /
    • pp.204-212
    • /
    • 2021
  • Background: Identification of radioisotopes for plastic scintillation detectors is challenging because their spectra have poor energy resolutions and lack photo peaks. To overcome this weakness, many researchers have conducted radioisotope identification studies using machine learning algorithms; however, the effect of data normalization on radioisotope identification has not been addressed yet. Furthermore, studies on machine learning-based radioisotope identifiers for plastic scintillation detectors are limited. Materials and Methods: In this study, machine learning-based radioisotope identifiers were implemented, and their performances according to data normalization methods were compared. Eight classes of radioisotopes consisting of combinations of 22Na, 60Co, and 137Cs, and the background, were defined. The training set was generated by the random sampling technique based on probabilistic density functions acquired by experiments and simulations, and test set was acquired by experiments. Support vector machine (SVM), artificial neural network (ANN), and convolutional neural network (CNN) were implemented as radioisotope identifiers with six data normalization methods, and trained using the generated training set. Results and Discussion: The implemented identifiers were evaluated by test sets acquired by experiments with and without gain shifts to confirm the robustness of the identifiers against the gain shift effect. Among the three machine learning-based radioisotope identifiers, prediction accuracy followed the order SVM > ANN > CNN, while the training time followed the order SVM > ANN > CNN. Conclusion: The prediction accuracy for the combined test sets was highest with the SVM. The CNN exhibited a minimum variation in prediction accuracy for each class, even though it had the lowest prediction accuracy for the combined test sets among three identifiers. The SVM exhibited the highest prediction accuracy for the combined test sets, and its training time was the shortest among three identifiers.

Epileptic Seizure Detection Using CNN Ensemble Models Based on Overlapping Segments of EEG Signals (뇌파의 중첩 분할에 기반한 CNN 앙상블 모델을 이용한 뇌전증 발작 검출)

  • Kim, Min-Ki
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.12
    • /
    • pp.587-594
    • /
    • 2021
  • As the diagnosis using encephalography(EEG) has been expanded, various studies have been actively performed for classifying EEG automatically. This paper proposes a CNN model that can effectively classify EEG signals acquired from healthy persons and patients with epilepsy. We segment the EEG signals into sub-signals with smaller dimension to augment the EEG data that is necessary to train the CNN model. Then the sub-signals are segmented again with overlap and they are used for training the CNN model. We also propose ensemble strategy in order to improve the classification accuracy. Experimental result using public Bonn dataset shows that the CNN can detect the epileptic seizure with the accuracy above 99.0%. It also shows that the ensemble method improves the accuracy of 3-class and 5-class EEG classification.

Building Detection by Convolutional Neural Network with Infrared Image, LiDAR Data and Characteristic Information Fusion (적외선 영상, 라이다 데이터 및 특성정보 융합 기반의 합성곱 인공신경망을 이용한 건물탐지)

  • Cho, Eun Ji;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.38 no.6
    • /
    • pp.635-644
    • /
    • 2020
  • Object recognition, detection and instance segmentation based on DL (Deep Learning) have being used in various practices, and mainly optical images are used as training data for DL models. The major objective of this paper is object segmentation and building detection by utilizing multimodal datasets as well as optical images for training Detectron2 model that is one of the improved R-CNN (Region-based Convolutional Neural Network). For the implementation, infrared aerial images, LiDAR data, and edges from the images, and Haralick features, that are representing statistical texture information, from LiDAR (Light Detection And Ranging) data were generated. The performance of the DL models depends on not only on the amount and characteristics of the training data, but also on the fusion method especially for the multimodal data. The results of segmenting objects and detecting buildings by applying hybrid fusion - which is a mixed method of early fusion and late fusion - results in a 32.65% improvement in building detection rate compared to training by optical image only. The experiments demonstrated complementary effect of the training multimodal data having unique characteristics and fusion strategy.

Accurate Human Localization for Automatic Labelling of Human from Fisheye Images

  • Than, Van Pha;Nguyen, Thanh Binh;Chung, Sun-Tae
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.5
    • /
    • pp.769-781
    • /
    • 2017
  • Deep learning networks like Convolutional Neural Networks (CNNs) show successful performances in many computer vision applications such as image classification, object detection, and so on. For implementation of deep learning networks in embedded system with limited processing power and memory, deep learning network may need to be simplified. However, simplified deep learning network cannot learn every possible scene. One realistic strategy for embedded deep learning network is to construct a simplified deep learning network model optimized for the scene images of the installation place. Then, automatic training will be necessitated for commercialization. In this paper, as an intermediate step toward automatic training under fisheye camera environments, we study more precise human localization in fisheye images, and propose an accurate human localization method, Automatic Ground-Truth Labelling Method (AGTLM). AGTLM first localizes candidate human object bounding boxes by utilizing GoogLeNet-LSTM approach, and after reassurance process by GoogLeNet-based CNN network, finally refines them more correctly and precisely(tightly) by applying saliency object detection technique. The performance improvement of the proposed human localization method, AGTLM with respect to accuracy and tightness is shown through several experiments.

A Method of License Plate Location and Character Recognition based on CNN

  • Fang, Wei;Yi, Weinan;Pang, Lin;Hou, Shuonan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.8
    • /
    • pp.3488-3500
    • /
    • 2020
  • At the present time, the economy continues to flourish, and private cars have become the means of choice for most people. Therefore, the license plate recognition technology has become an indispensable part of intelligent transportation, with research and application value. In recent years, the convolution neural network for image classification is an application of deep learning on image processing. This paper proposes a strategy to improve the YOLO model by studying the deep learning convolutional neural network (CNN) and related target detection methods, and combines the OpenCV and TensorFlow frameworks to achieve efficient recognition of license plate characters. The experimental results show that target detection method based on YOLO is beneficial to shorten the training process and achieve a good level of accuracy.

Crack segmentation in high-resolution images using cascaded deep convolutional neural networks and Bayesian data fusion

  • Tang, Wen;Wu, Rih-Teng;Jahanshahi, Mohammad R.
    • Smart Structures and Systems
    • /
    • v.29 no.1
    • /
    • pp.221-235
    • /
    • 2022
  • Manual inspection of steel box girders on long span bridges is time-consuming and labor-intensive. The quality of inspection relies on the subjective judgements of the inspectors. This study proposes an automated approach to detect and segment cracks in high-resolution images. An end-to-end cascaded framework is proposed to first detect the existence of cracks using a deep convolutional neural network (CNN) and then segment the crack using a modified U-Net encoder-decoder architecture. A Naïve Bayes data fusion scheme is proposed to reduce the false positives and false negatives effectively. To generate the binary crack mask, first, the original images are divided into 448 × 448 overlapping image patches where these image patches are classified as cracks versus non-cracks using a deep CNN. Next, a modified U-Net is trained from scratch using only the crack patches for segmentation. A customized loss function that consists of binary cross entropy loss and the Dice loss is introduced to enhance the segmentation performance. Additionally, a Naïve Bayes fusion strategy is employed to integrate the crack score maps from different overlapping crack patches and to decide whether a pixel is crack or not. Comprehensive experiments have demonstrated that the proposed approach achieves an 81.71% mean intersection over union (mIoU) score across 5 different training/test splits, which is 7.29% higher than the baseline reference implemented with the original U-Net.

Classroom Roll-Call System Based on ResNet Networks

  • Zhu, Jinlong;Yu, Fanhua;Liu, Guangjie;Sun, Mingyu;Zhao, Dong;Geng, Qingtian;Su, Jinbo
    • Journal of Information Processing Systems
    • /
    • v.16 no.5
    • /
    • pp.1145-1157
    • /
    • 2020
  • A convolution neural networks (CNNs) has demonstrated outstanding performance compared to other algorithms in the field of face recognition. Regarding the over-fitting problem of CNN, researchers have proposed a residual network to ease the training for recognition accuracy improvement. In this study, a novel face recognition model based on game theory for call-over in the classroom was proposed. In the proposed scheme, an image with multiple faces was used as input, and the residual network identified each face with a confidence score to form a list of student identities. Face tracking of the same identity or low confidence were determined to be the optimisation objective, with the game participants set formed from the student identity list. Game theory optimises the authentication strategy according to the confidence value and identity set to improve recognition accuracy. We observed that there exists an optimal mapping relation between face and identity to avoid multiple faces associated with one identity in the proposed scheme and that the proposed game-based scheme can reduce the error rate, as compared to the existing schemes with deeper neural network.