• Title/Summary/Keyword: Deep Convolutional Neural Networks

Search Result 388, Processing Time 0.026 seconds

Arabic Text Recognition with Harakat Using Deep Learning

  • Ashwag, Maghraby;Esraa, Samkari
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.1
    • /
    • pp.41-46
    • /
    • 2023
  • Because of the significant role that harakat plays in Arabic text, this paper used deep learning to extract Arabic text with its harakat from an image. Convolutional neural networks and recurrent neural network algorithms were applied to the dataset, which contained 110 images, each representing one word. The results showed the ability to extract some letters with harakat.

Human Motion Recognition Based on Spatio-temporal Convolutional Neural Network

  • Hu, Zeyuan;Park, Sange-yun;Lee, Eung-Joo
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.8
    • /
    • pp.977-985
    • /
    • 2020
  • Aiming at the problem of complex feature extraction and low accuracy in human action recognition, this paper proposed a network structure combining batch normalization algorithm with GoogLeNet network model. Applying Batch Normalization idea in the field of image classification to action recognition field, it improved the algorithm by normalizing the network input training sample by mini-batch. For convolutional network, RGB image was the spatial input, and stacked optical flows was the temporal input. Then, it fused the spatio-temporal networks to get the final action recognition result. It trained and evaluated the architecture on the standard video actions benchmarks of UCF101 and HMDB51, which achieved the accuracy of 93.42% and 67.82%. The results show that the improved convolutional neural network has a significant improvement in improving the recognition rate and has obvious advantages in action recognition.

A Study on the Accuracy Improvement of Movie Recommender System Using Word2Vec and Ensemble Convolutional Neural Networks (Word2Vec과 앙상블 합성곱 신경망을 활용한 영화추천 시스템의 정확도 개선에 관한 연구)

  • Kang, Boo-Sik
    • Journal of Digital Convergence
    • /
    • v.17 no.1
    • /
    • pp.123-130
    • /
    • 2019
  • One of the most commonly used methods of web recommendation techniques is collaborative filtering. Many studies on collaborative filtering have suggested ways to improve accuracy. This study proposes a method of movie recommendation using Word2Vec and an ensemble convolutional neural networks. First, in the user, movie, and rating information, construct the user sentences and movie sentences. It inputs user sentences and movie sentences into Word2Vec to obtain user vectors and movie vectors. User vectors are entered into user convolution model and movie vectors are input to movie convolution model. The user and the movie convolution models are linked to a fully connected neural network model. Finally, the output layer of the fully connected neural network outputs forecasts of user movie ratings. Experimentation results showed that the accuracy of the technique proposed in this study accuracy of conventional collaborative filtering techniques was improved compared to those of conventional collaborative filtering technique and the technique using Word2Vec and deep neural networks proposed in a similar study.

Localization of ripe tomato bunch using deep neural networks and class activation mapping

  • Seung-Woo Kang;Soo-Hyun Cho;Dae-Hyun Lee;Kyung-Chul Kim
    • Korean Journal of Agricultural Science
    • /
    • v.50 no.3
    • /
    • pp.357-364
    • /
    • 2023
  • In this study, we propose a ripe tomato bunch localization method based on convolutional neural networks, to be applied in robotic harvesting systems. Tomato images were obtained from a smart greenhouse at the Rural Development Administration (RDA). The sample images for training were extracted based on tomato maturity and resized to 128 × 128 pixels for use in the classification model. The model was constructed based on four-layer convolutional neural networks, and the classes were determined based on stage of maturity, using a Softmax classifier. The localization of the ripe tomato bunch region was indicated on a class activation map. The class activation map could show the approximate location of the tomato bunch but tends to present a local part or a large part of the ripe tomato bunch region, which could lead to poor performance. Therefore, we suggest a recursive method to improve the performance of the model. The classification results indicated that the accuracy, precision, recall, and F1-score were 0.98, 0.87, 0.98, and 0.92, respectively. The localization performance was 0.52, estimated by the Intersection over Union (IoU), and through input recursion, the IoU was improved by 13%. Based on the results, the proposed localization of the ripe tomato bunch area can be incorporated in robotic harvesting systems to establish the optimal harvesting paths.

A Study on Optimal Convolutional Neural Networks Backbone for Reinforced Concrete Damage Feature Extraction (철근콘크리트 손상 특성 추출을 위한 최적 컨볼루션 신경망 백본 연구)

  • Park, Younghoon
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.43 no.4
    • /
    • pp.511-523
    • /
    • 2023
  • Research on the integration of unmanned aerial vehicles and deep learning for reinforced concrete damage detection is actively underway. Convolutional neural networks have a high impact on the performance of image classification, detection, and segmentation as backbones. The MobileNet, a pre-trained convolutional neural network, is efficient as a backbone for an unmanned aerial vehicle-based damage detection model because it can achieve sufficient accuracy with low computational complexity. Analyzing vanilla convolutional neural networks and MobileNet under various conditions, MobileNet was evaluated to have a verification accuracy 6.0~9.0% higher than vanilla convolutional neural networks with 15.9~22.9% lower computational complexity. MobileNetV2, MobileNetV3Large and MobileNetV3Small showed almost identical maximum verification accuracy, and the optimal conditions for MobileNet's reinforced concrete damage image feature extraction were analyzed to be the optimizer RMSprop, no dropout, and average pooling. The maximum validation accuracy of 75.49% for 7 types of damage detection based on MobilenetV2 derived in this study can be improved by image accumulation and continuous learning.

Multimodal Face Biometrics by Using Convolutional Neural Networks

  • Tiong, Leslie Ching Ow;Kim, Seong Tae;Ro, Yong Man
    • Journal of Korea Multimedia Society
    • /
    • v.20 no.2
    • /
    • pp.170-178
    • /
    • 2017
  • Biometric recognition is one of the major challenging topics which needs high performance of recognition accuracy. Most of existing methods rely on a single source of biometric to achieve recognition. The recognition accuracy in biometrics is affected by the variability of effects, including illumination and appearance variations. In this paper, we propose a new multimodal biometrics recognition using convolutional neural network. We focus on multimodal biometrics from face and periocular regions. Through experiments, we have demonstrated that facial multimodal biometrics features deep learning framework is helpful for achieving high recognition performance.

Automated Ulna and Radius Segmentation model based on Deep Learning on DEXA (DEXA에서 딥러닝 기반의 척골 및 요골 자동 분할 모델)

  • Kim, Young Jae;Park, Sung Jin;Kim, Kyung Rae;Kim, Kwang Gi
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.12
    • /
    • pp.1407-1416
    • /
    • 2018
  • The purpose of this study was to train a model for the ulna and radius bone segmentation based on Convolutional Neural Networks and to verify the segmentation model. The data consisted of 840 training data, 210 tuning data, and 200 verification data. The learning model for the ulna and radius bone bwas based on U-Net (19 convolutional and 8 maximum pooling) and trained with 8 batch sizes, 0.0001 learning rate, and 200 epochs. As a result, the average sensitivity of the training data was 0.998, the specificity was 0.972, the accuracy was 0.979, and the Dice's similarity coefficient was 0.968. In the validation data, the average sensitivity was 0.961, specificity was 0.978, accuracy was 0.972, and Dice's similarity coefficient was 0.961. The performance of deep convolutional neural network based models for the segmentation was good for ulna and radius bone.

A Gradient-Based Explanation Method for Node Classification Using Graph Convolutional Networks

  • Chaehyeon Kim;Hyewon Ryu;Ki Yong Lee
    • Journal of Information Processing Systems
    • /
    • v.19 no.6
    • /
    • pp.803-816
    • /
    • 2023
  • Explainable artificial intelligence is a method that explains how a complex model (e.g., a deep neural network) yields its output from a given input. Recently, graph-type data have been widely used in various fields, and diverse graph neural networks (GNNs) have been developed for graph-type data. However, methods to explain the behavior of GNNs have not been studied much, and only a limited understanding of GNNs is currently available. Therefore, in this paper, we propose an explanation method for node classification using graph convolutional networks (GCNs), which is a representative type of GNN. The proposed method finds out which features of each node have the greatest influence on the classification of that node using GCN. The proposed method identifies influential features by backtracking the layers of the GCN from the output layer to the input layer using the gradients. The experimental results on both synthetic and real datasets demonstrate that the proposed explanation method accurately identifies the features of each node that have the greatest influence on its classification.

Toward Practical Augmentation of Raman Spectra for Deep Learning Classification of Contamination in HDD

  • Seksan Laitrakun;Somrudee Deepaisarn;Sarun Gulyanon;Chayud Srisumarnk;Nattapol Chiewnawintawat;Angkoon Angkoonsawaengsuk;Pakorn Opaprakasit;Jirawan Jindakaew;Narisara Jaikaew
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.3
    • /
    • pp.208-215
    • /
    • 2023
  • Deep learning techniques provide powerful solutions to several pattern-recognition problems, including Raman spectral classification. However, these networks require large amounts of labeled data to perform well. Labeled data, which are typically obtained in a laboratory, can potentially be alleviated by data augmentation. This study investigated various data augmentation techniques and applied multiple deep learning methods to Raman spectral classification. Raman spectra yield fingerprint-like information about chemical compositions, but are prone to noise when the particles of the material are small. Five augmentation models were investigated to build robust deep learning classifiers: weighted sums of spectral signals, imitated chemical backgrounds, extended multiplicative signal augmentation, and generated Gaussian and Poisson-distributed noise. We compared the performance of nine state-of-the-art convolutional neural networks with all the augmentation techniques. The LeNet5 models with background noise augmentation yielded the highest accuracy when tested on real-world Raman spectral classification at 88.33% accuracy. A class activation map of the model was generated to provide a qualitative observation of the results.

Deep compression of convolutional neural networks with low-rank approximation

  • Astrid, Marcella;Lee, Seung-Ik
    • ETRI Journal
    • /
    • v.40 no.4
    • /
    • pp.421-434
    • /
    • 2018
  • The application of deep neural networks (DNNs) to connect the world with cyber physical systems (CPSs) has attracted much attention. However, DNNs require a large amount of memory and computational cost, which hinders their use in the relatively low-end smart devices that are widely used in CPSs. In this paper, we aim to determine whether DNNs can be efficiently deployed and operated in low-end smart devices. To do this, we develop a method to reduce the memory requirement of DNNs and increase the inference speed, while maintaining the performance (for example, accuracy) close to the original level. The parameters of DNNs are decomposed using a hybrid of canonical polyadic-singular value decomposition, approximated using a tensor power method, and fine-tuned by performing iterative one-shot hybrid fine-tuning to recover from a decreased accuracy. In this study, we evaluate our method on frequently used networks. We also present results from extensive experiments on the effects of several fine-tuning methods, the importance of iterative fine-tuning, and decomposition techniques. We demonstrate the effectiveness of the proposed method by deploying compressed networks in smartphones.