• Title/Summary/Keyword: convolution layer


Quality grading of Hanwoo (Korean native cattle breed) sub-images using convolutional neural network

  • Kwon, Kyung-Do;Lee, Ahyeong;Lim, Jongkuk;Cho, Soohyun;Lee, Wanghee;Cho, Byoung-Kwan;Seo, Youngwook
    • Korean Journal of Agricultural Science / v.47 no.4 / pp.1109-1122 / 2020
  • The aim of this study was to develop a marbling classification and prediction model using small sub-images of sirloin, based on a deep learning algorithm, namely a convolutional neural network (CNN). Samples were purchased from a commercial slaughterhouse in Korea, images were acquired for each grade, and the images (n = 500) were labeled by grade: 1++, 1+, 1, and a combined 2 & 3 class. The image acquisition system consisted of a DSLR camera with a polarization filter to remove diffuse reflectance and two light sources (55 W). A radial correction algorithm was applied to correct distortion in the original images. Color images of Hanwoo sirloins (a mix of feeder cattle, steers, and calves) were divided into sub-images of size 161 × 161 to train the marbling prediction model. The CNN has four convolution layers and outputs a prediction over the marbling grades (1++, 1+, 1, and 2 & 3). Every layer uses a rectified linear unit (ReLU) activation function, and max-pooling is used to extract the edges between fat and muscle and to reduce the variance of the data. Prediction performance was measured using the accuracy and the kappa coefficient computed from a confusion matrix. The sub-image predictions were aggregated to determine the overall average prediction accuracy. Training accuracy was 100% and test accuracy was 86%, indicating comparatively good performance for the CNN. This study demonstrates the potential of color images and a CNN for predicting the marbling grade.
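
The abstract above reports accuracy and a kappa coefficient derived from a confusion matrix. As a minimal sketch of that calculation (the standard Cohen's kappa formula, not the authors' code), the following computes both from a 4 × 4 matrix. The function name and the matrix values are hypothetical and are not data from the paper.

```python
def accuracy_and_kappa(cm):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column marginals.
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical matrix for the four classes (1++, 1+, 1, 2&3); illustrative only.
cm = [
    [22, 2, 1, 0],
    [3, 20, 2, 0],
    [1, 3, 21, 0],
    [0, 0, 1, 24],
]
acc, kappa = accuracy_and_kappa(cm)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Kappa corrects the raw accuracy for the agreement that would occur by chance, which is why it is commonly reported alongside accuracy for multi-class grading tasks.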

Grad-CAM based deep learning network for location detection of the main object (주 객체 위치 검출을 위한 Grad-CAM 기반의 딥러닝 네트워크)

  • Kim, Seon-Jin;Lee, Jong-Keun;Kwak, Nae-Jung;Ryu, Sung-Pil;Ahn, Jae-Hyeong
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.204-211 / 2020
  • In this paper, we propose an optimal deep learning network architecture for main-object location detection through weakly supervised learning. The proposed network adds convolution blocks to improve the localization accuracy of the main object. The added part consists of five additional blocks that add convolution layers on top of VGG-16. The network was trained with weakly supervised learning, which does not require ground-truth location information for objects. In addition, Grad-CAM was used to compensate for the weakness of global average pooling (GAP) in CAM, one of the weakly supervised learning methods. Tested on the CUB-200-2011 data set, the proposed network obtained a top-1 localization error of 50.13%. The proposed network also shows higher accuracy in detecting the main object than the existing method.
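
For readers unfamiliar with Grad-CAM, the sketch below shows the core computation in plain Python: each feature map is weighted by the spatial average of its gradient, the weighted maps are summed, and a ReLU is applied. It assumes the activations and gradients have already been obtained from some network. The function name and the toy values are hypothetical, and this is not the architecture proposed in the paper.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heat map.

    activations[k][i][j]: value of feature map k at position (i, j).
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j]).
    """
    num_maps = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])
    # Channel weights: global average of the gradients of each feature map.
    weights = [
        sum(sum(row) for row in gradients[k]) / (height * width)
        for k in range(num_maps)
    ]
    # Weighted sum of feature maps followed by ReLU.
    return [
        [
            max(0.0, sum(weights[k] * activations[k][i][j] for k in range(num_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]


# Toy example with two 2 x 2 feature maps (illustrative values only).
activations = [[[1, 0], [0, 2]], [[0, 3], [1, 0]]]
gradients = [[[1, 1], [1, 1]], [[-2, 0], [0, -2]]]
cam = grad_cam(activations, gradients)
print(cam)
```

In practice the heat map is upsampled to the input resolution and thresholded to obtain a bounding box for the main object.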

Real Time Hornet Classification System Based on Deep Learning (딥러닝을 이용한 실시간 말벌 분류 시스템)

  • Jeong, Yunju;Lee, Yeung-Hak;Ansari, Israfil;Lee, Cheol-Hee
    • Journal of IKEEE / v.24 no.4 / pp.1141-1147 / 2020
  • Hornet species are so similar in shape that they are difficult for non-experts to classify, and because the insects are small and move fast, detecting and classifying the species in real time is even more difficult. In this paper, we developed a system that classifies hornet species in real time based on a deep learning algorithm using bounding boxes. To minimize the background area included in the bounding box when labeling the training images, we propose a method of selecting only the head and body of the hornet. We also experimentally compare existing bounding-box-based object recognition algorithms to find the best algorithm for detecting hornets in real time and classifying their species. In the experiments, when the Mish function was applied as the activation function of the convolution layers and the hornet images were tested using the YOLOv4 model with the Spatial Attention Module (SAM) applied before the object detection block, the average precision was 97.89% and the average recall was 98.69%.
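
The Mish activation mentioned above is defined as mish(x) = x · tanh(softplus(x)), with softplus(x) = ln(1 + e^x). A minimal scalar sketch follows. The numerically stable softplus form is an implementation choice made here, not something stated in the abstract.

```python
import math


def softplus(x):
    """Numerically stable ln(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for value in (-2.0, 0.0, 1.0, 3.0):
    print(value, round(mish(value), 4))
```

Unlike ReLU, Mish is smooth and lets small negative values pass through, approaching the identity for large positive inputs.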

A new lightweight network based on MobileNetV3

  • Zhao, Liquan;Wang, Leilei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.1-15 / 2022
  • MobileNetV3 is specially designed for mobile devices with limited memory and computing power. To reduce the number of network parameters and improve inference speed, a new lightweight network based on MobileNetV3 is proposed. First, to reduce the computation of residual blocks, a partial residual structure is designed by dividing the input feature maps into two parts. This partial residual structure replaces the residual block in MobileNetV3. Second, a dual-path feature extraction structure is designed to further reduce the computation of MobileNetV3. Different convolution kernel sizes are used in the two paths to extract feature maps of different sizes. In addition, a transition layer is designed to fuse features and reduce the influence of the new structure on accuracy. The CIFAR-100 and ImageNet datasets are used to test the performance of the proposed partial residual structure. A ResNet built on the proposed partial residual structure has fewer parameters and FLOPs than the original ResNet. The improved MobileNetV3 is tested on the CIFAR-10, CIFAR-100, and ImageNet image classification datasets. Compared with MobileNetV3, GhostNet, and MobileNetV2, the improved MobileNetV3 has fewer parameters and FLOPs. The improved MobileNetV3 is also tested on a CPU and a Raspberry Pi, where it is faster than the other networks.
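
To illustrate why splitting the input feature maps can reduce cost, the sketch below counts the weights of a standard k × k convolution applied to all channels versus to only half of them. It assumes, purely for illustration, that one half of the channels is convolved and the other half bypasses the convolution. The abstract does not give the exact layout of the partial residual structure, and the channel counts are hypothetical.

```python
def conv_params(c_in, c_out, kernel_size, bias=False):
    """Number of learnable weights in a standard 2D convolution layer."""
    params = kernel_size * kernel_size * c_in * c_out
    return params + (c_out if bias else 0)


channels = 64  # hypothetical channel count
full = conv_params(channels, channels, 3)
half = conv_params(channels // 2, channels // 2, 3)
print(full, half, full / half)
```

Because the weight count scales with the product of input and output channels, convolving half the channels needs one quarter of the weights of the full convolution.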

A Deep Learning-based Automatic Modulation Classification Method on SDR Platforms (SDR 플랫폼을 위한 딥러닝 기반의 무선 자동 변조 분류 기술 연구)

  • Jung-Ik, Jang;Jaehyuk, Choi;Young-Il, Yoon
    • Journal of IKEEE / v.26 no.4 / pp.568-576 / 2022
  • Automatic modulation classification (AMC) is a core technique in Software Defined Radio (SDR) platforms that enables smart and flexible spectrum sensing and access over a wide frequency band. In this study, we propose a simple yet accurate deep learning-based method that performs AMC on variable-size radio signals. To this end, we design a classification architecture consisting of two Convolutional Neural Network (CNN)-based models, a main model and a small model, each trained on radio signal datasets with a different signal size. For a received signal of arbitrary length, modulation classification is then performed by augmenting the input samples with a self-replicating padding technique to fit the input layer size of the model. Experiments using the RadioML 2018.01A dataset demonstrated that the proposed method provides higher accuracy than existing methods across all signal-to-noise ratio (SNR) ranges with less computational overhead.
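
One plausible reading of the self-replicating padding described above is that a short signal is repeated end to end until it fills the model's input length. The sketch below implements that reading. The function name, the truncation of over-long inputs, and the sample values are assumptions, and the paper's exact procedure may differ.

```python
def self_replicate_pad(samples, target_len):
    """Repeat `samples` cyclically until exactly `target_len` items remain.

    Assumption: inputs longer than `target_len` are simply truncated.
    """
    if not samples:
        raise ValueError("cannot pad an empty signal")
    if len(samples) >= target_len:
        return list(samples[:target_len])
    repeats = -(-target_len // len(samples))  # ceiling division
    return (list(samples) * repeats)[:target_len]


# Hypothetical I/Q samples as (in-phase, quadrature) pairs.
signal = [(0.1, -0.2), (0.3, 0.4), (-0.5, 0.6)]
padded = self_replicate_pad(signal, 8)
print(len(padded), padded[:4])
```

Compared with zero padding, repeating the signal keeps the padded region statistically similar to the received samples.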

Optimizing CNN Structure to Improve Accuracy of Artwork Artist Classification

  • Ji-Seon Park;So-Yeon Kim;Yeo-Chan Yoon;Soo Kyun Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.9 / pp.9-15 / 2023
  • Metaverse is a modern new technology that is advancing quickly. The goal of this study is to investigate this technique from the perspective of computer vision as well as general perspective. A thorough analysis of computer vision related Metaverse topics has been done in this study. Its history, method, architecture, benefits, and drawbacks are all covered. The Metaverse's future and the steps that must be taken to adapt to this technology are described. The concepts of Mixed Reality (MR), Augmented Reality (AR), Extended Reality (XR) and Virtual Reality (VR) are briefly discussed. The role of computer vision and its application, advantages and disadvantages and the future research areas are discussed.

Hierarchical Flow-Based Anomaly Detection Model for Motor Gearbox Defect Detection

  • Younghwa Lee;Il-Sik Chang;Suseong Oh;Youngjin Nam;Youngteuk Chae;Geonyoung Choi;Gooman Park
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.6 / pp.1516-1529 / 2023
  • In this paper, a motor gearbox fault-detection system based on a hierarchical flow-based model is proposed. The proposed system is used for the anomaly detection of a motion sound-based actuator module. The proposed flow-based model, which is a generative model, learns by directly modeling a data distribution function. As the objective function is the maximum likelihood value of the input data, the training is stable and simple to use for anomaly detection. The operation sound of a car's side-view mirror motor is converted into a Mel-spectrogram image, consisting of a folding signal and an unfolding signal, and used as training data in this experiment. The proposed system is composed of an encoder and a decoder. The data extracted from the layer of the pretrained feature extractor are used as the decoder input data in the encoder. This information is used in the decoder by performing an interlayer cross-scale convolution operation. The experimental results indicate that the context information of various dimensions extracted from the interlayer hierarchical data improves the defect detection accuracy. This paper is notable because it uses acoustic data and a normalizing flow model to detect outliers based on the features of experimental data.
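
The idea of training a flow by maximum likelihood and using the likelihood for anomaly detection can be shown with a deliberately tiny example: a one-dimensional affine flow z = (x − shift) / scale with a standard normal base distribution. This is only a sketch of the change-of-variables principle. The parameter values are hypothetical, and the model is far simpler than the hierarchical flow proposed in the paper.

```python
import math


def affine_flow_log_likelihood(x, shift, scale):
    """log p(x) for z = (x - shift) / scale with z ~ N(0, 1).

    Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|.
    """
    z = (x - shift) / scale
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    log_det = -math.log(abs(scale))
    return log_base + log_det


def anomaly_score(x, shift=0.0, scale=1.0):
    """Negative log-likelihood: larger values mean more anomalous."""
    return -affine_flow_log_likelihood(x, shift, scale)


print(anomaly_score(0.1), anomaly_score(5.0))
```

A sample that the trained flow assigns low likelihood receives a high score and can be flagged as a defect once a threshold is chosen.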

Arabic Words Extraction and Character Recognition from Picturesque Image Macros with Enhanced VGG-16 based Model Functionality Using Neural Networks

  • Ayed Ahmad Hamdan Al-Radaideh;Mohd Shafry bin Mohd Rahim;Wad Ghaban;Majdi Bsoul;Shahid Kamal;Naveed Abbas
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.7 / pp.1807-1822 / 2023
  • Innovation and rapidly increasing functionality in user-friendly smartphones have encouraged shutterbugs to capture picturesque image macros at work or while traveling. Formal signboards are placed with marketing objectives and are enriched with text to attract people. Extracting and recognizing text from natural images is an emerging research issue that needs consideration. Compared with conventional optical character recognition (OCR), the complex background, implicit noise, lighting, and orientation of these scenic text photos make the problem more difficult. Arabic scene text extraction and recognition adds a number of complications and difficulties. The method described in this paper uses a two-phase methodology to extract Arabic text, with word-boundary awareness, from scenic images with varying text orientations. The first phase uses a convolutional autoencoder, and the second uses Arabic Character Segmentation (ACS), followed by traditional two-layer neural networks for recognition. This study also presents how an Arabic training and synthetic dataset can be created to exemplify superimposed text in different scene images. For this purpose, a dataset of 10K cropped images in which Arabic text was found was created for the detection phase, along with a 127K Arabic character dataset for the recognition phase. The phase-1 labels were generated from an Arabic corpus of 15K quotes and sentences. The Arabic Word Awareness Region Detection (AWARD) approach, which is highly flexible in identifying complex Arabic scene text, such as text that is arbitrarily oriented, curved, or deformed, is used to detect these texts. Our experiments show that the system achieves a 91.8% word segmentation accuracy and a 94.2% character recognition accuracy. We believe that in the future researchers will excel in the field of image processing, treating text images to reduce noise in scene images in any language by enhancing the functionality of VGG-16 based models using neural networks.

Attention-Based Heart Rate Estimation using MobilenetV3

  • Yeo-Chan Yoon
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.1-7 / 2023
  • The advent of deep learning technologies has led to the development of various medical applications, making healthcare services more convenient and effective. Among these applications, heart rate estimation is considered a vital method for assessing an individual's health. Traditional methods, such as photoplethysmography through smart watches, have been widely used but are invasive and require additional hardware. Recent advancements allow for contactless heart rate estimation through facial image analysis, providing a more hygienic and convenient approach. In this paper, we propose a lightweight methodology capable of accurately estimating heart rate in mobile environments, using a specialized 2-channel network structure based on 2D convolution. Our method considers both subtle facial movements and color changes resulting from blood flow and muscle contractions. The approach comprises two major components: an Encoder for analyzing image features and a regression layer for evaluating Blood Volume Pulse. By incorporating both features simultaneously our methodology delivers more accurate results even in computing environments with limited resources. The proposed approach is expected to offer a more efficient way to monitor heart rate without invasive technology, particularly well-suited for mobile devices.

MLCNN-COV: A multilabel convolutional neural network-based framework to identify negative COVID medicine responses from the chemical three-dimensional conformer

  • Pranab Das;Dilwar Hussain Mazumder
    • ETRI Journal / v.46 no.2 / pp.290-306 / 2024
  • Comparatively few medicines have been approved to treat the novel COronaVIrus Disease (COVID). Because of the global pandemic status of COVID, several medicines are being developed to treat patients. The modern COVID medicine development process faces various challenges, including predicting and detecting hazardous responses to COVID medicines, and correctly predicting harmful reactions is essential for health safety. Significant developments in computational models for medicine development can make it possible to identify adverse COVID medicine reactions. Since the beginning of the COVID pandemic, there has been significant demand for developing COVID medicines. Therefore, this paper presents a transfer-learning methodology and a multilabel convolutional neural network for COVID (MLCNN-COV) medicine development model to identify negative responses to COVID medicines. For analysis, a framework is proposed with five multilabel transfer-learning models, namely MobileNetv2, ResNet50, VGG19, DenseNet201, and Inceptionv3, and an MLCNN-COV model is designed with an image augmentation (IA) technique and validated through experiments on images of the three-dimensional chemical conformers of 17 COVID medicines. The RGB color channels are used to represent the image features, which are extracted by Convolution2D and MaxPooling2D layers. The findings for the MLCNN-COV model are promising: it can identify individual adverse reactions of medicines with accuracy ranging from 88.24% to 100%, outperforming the transfer-learning models. This shows that three-dimensional conformers are adequate for identifying negative COVID medicine responses.
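
In a multilabel setting each sample can carry several labels at once, so accuracy is often reported separately for each label, which is consistent with the per-reaction accuracy range quoted above. The sketch below computes per-label accuracy from binary label vectors. The function name and the data are hypothetical, and the paper's exact evaluation protocol is not given in the abstract.

```python
def per_label_accuracy(y_true, y_pred):
    """Accuracy of each label column across all samples.

    y_true, y_pred: lists of equal-length 0/1 label vectors, one per sample.
    """
    num_samples = len(y_true)
    num_labels = len(y_true[0])
    return [
        sum(1 for t, p in zip(y_true, y_pred) if t[j] == p[j]) / num_samples
        for j in range(num_labels)
    ]


# Hypothetical labels: 4 samples, 3 possible adverse reactions each.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 0, 1]]
scores = per_label_accuracy(y_true, y_pred)
print(scores)
```

Reporting the minimum and maximum of these per-label scores gives a range of the kind quoted in the abstract.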