• Title/Summary/Keyword: CNN(Convolution Neural Network)


A Convolutional Neural Network Model with Weighted Combination of Multi-scale Spatial Features for Crop Classification (작물 분류를 위한 다중 규모 공간특징의 가중 결합 기반 합성곱 신경망 모델)

  • Park, Min-Gyu;Kwak, Geun-Ho;Park, No-Wook
    • Korean Journal of Remote Sensing / v.35 no.6_3 / pp.1273-1283 / 2019
  • This paper proposes an advanced crop classification model that adds a weighted combination of spatial features, extracted from multi-scale input images, to a conventional convolutional neural network (CNN) structure. The proposed model first extracts spatial features from patches of different sizes in its convolution layers, and then assigns different weights to the extracted features according to their feature-specific importance, using sets of squeeze-and-excitation blocks. The novelty of the model lies in its ability to extract spatial features useful for classification and to account for their relative importance. A case study of crop classification with multi-temporal Landsat-8 OLI images in Illinois, USA was carried out to evaluate the classification performance of the proposed model. The impact of patch size on crop classification was first assessed in a single-patch model to find useful patch sizes. The classification performance of the proposed model was then compared with that of two conventional CNN models: the single-patch model and a multi-patch model that does not consider feature-specific weights. In the comparison experiments, the proposed model alleviated misclassification patterns by considering the spatial characteristics of the different crops in the study area, and achieved the best classification accuracy among the models. Based on the case study results, the proposed model, which can account for the relative importance of spatial features, could be effectively applied to the classification of other objects with different spatial characteristics, not only crops.
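
The feature weighting step described in this abstract can be illustrated with a minimal sketch of a generic squeeze-and-excitation operation. It is not the authors' implementation: the function name, the weight matrices `w1` and `w2`, and the tiny feature maps are hypothetical stand-ins for learned parameters and real convolution outputs.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Reweight channels in the style of a squeeze-and-excitation block.

    feature_maps: list of channels, each a 2-D list of floats.
    w1: reduction weights, shape [hidden][channels] (hypothetical values).
    w2: expansion weights, shape [channels][hidden] (hypothetical values).
    Returns the per-channel gates and the rescaled feature maps.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in fmap]
        for fmap, gate in zip(feature_maps, gates)
    ]
    return gates, scaled


# Two 2x2 channels and made-up weights, for illustration only.
maps = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
gates, scaled = squeeze_excite(maps, w1=[[1.0, 0.0]], w2=[[0.0], [1.0]])
```

Each gate lies between 0 and 1, so a channel judged less important is damped rather than removed.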

Watermarking for Digital Hologram by a Deep Neural Network and its Training Considering the Hologram Data Characteristics (딥 뉴럴 네트워크에 의한 디지털 홀로그램의 워터마킹 및 홀로그램 데이터 특성을 고려한 학습)

  • Lee, Juwon;Lee, Jae-Eun;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of Broadcast Engineering / v.26 no.3 / pp.296-307 / 2021
  • A digital hologram (DH) is ultra-high value-added video content that carries 3D information in 2D data, so its intellectual property rights must be protected when it is distributed. To this end, this paper proposes a watermarking method for DHs that uses a deep neural network. The method provides watermark (WM) invisibility and robustness against attacks, and it is a blind watermarking method that does not use host information in WM extraction. The proposed network consists of four sub-networks: a pre-processing network for the host, a pre-processing network for the WM, a WM embedding network, and a WM extraction network. Considering that a DH has strong high-frequency components, the network expands the WM data to match the host, rather than shrinking the host data to match the WM, and concatenates the expanded WM with the host to embed it. In addition, the difference in performance according to the data distribution properties of DHs is identified during training, and a method is presented for selecting the training data set that gives the best performance across all types of DH. The proposed method is tested against attacks of various types and strengths to show its performance. The paper also shows that the method is highly practical because it operates independently of the resolution of the host DH and of the WM data.

Super-resolution based on multi-channel input convolutional residual neural network (다중 채널 입력 Convolution residual neural networks 기반의 초해상화 기법)

  • Youm, Gwang-Young;Kim, Munchurl
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2016.06a / pp.37-39 / 2016
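  • Recently, Super-Resolution Convolutional Neural Networks (SRCNN), a super-resolution method based on convolutional neural networks (CNN), was reported to achieve good PSNR performance [1]. However, just as many proposed methods are limited in restoring high-frequency components, SRCNN is also limited in this respect. In addition, although it is widely known that making the SRCNN network deeper yields good PSNR performance, deepening the network tends to make its parameters harder to train. This is because, in a deep network, the gradient values diverge or converge to 0 toward the lower layers (in the backward direction), so the network parameters are not trained properly. Therefore, instead of deepening the network, this paper proposes composing the input from multiple channels to give the network additional information about high-frequency components. Noting that many super-resolution methods lack the ability to restore high-frequency components, we assumed that the network needs more information about them. Accordingly, we constructed the low-resolution input images so that high-frequency components are fed to the network at several strengths. We also introduced residual networks so that parameter training can focus on restoring high-frequency components. To verify the effectiveness of the proposed method, experiments were conducted on the Set5 and Set14 data. Compared with SRCNN, the method improved average PSNR on Set5 by 0.29, 0.35, and 0.17 dB at scale factors of 2, 3, and 4, respectively, and improved average PSNR on Set14 by 0.20 dB at a scale factor of 3.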

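
The PSNR figures quoted in the preceding abstract follow the standard definition, PSNR = 10 log10(MAX² / MSE). The sketch below shows that calculation for small grayscale images stored as nested lists. The function name and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not state the exact evaluation protocol (for example, which color channel was measured).

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are 2-D lists of pixel intensities. max_value is the peak
    intensity (255 assumes 8-bit pixels, which is an assumption here).
    """
    flat_ref = [p for row in reference for p in row]
    flat_est = [p for row in estimate for p in row]
    if len(flat_ref) != len(flat_est):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel of the estimate is off by one intensity level.
ground_truth = [[10, 20], [30, 40]]
restored = [[11, 21], [31, 41]]
score = psnr(ground_truth, restored)
```

Because the scale is logarithmic, a gain of a few tenths of a dB, as reported above, corresponds to a reduction in mean squared error of a few percent.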

Improving Test Accuracy on the MNIST Dataset using a Simple CNN with Batch Normalization

  • Seungbin Lee;Jungsoo Rhee
    • Journal of the Korea Society of Computer and Information / v.29 no.9 / pp.1-7 / 2024
  • In this paper, we propose a convolutional neural network (CNN) equipped with batch normalization (BN) for handwritten digit recognition, trained on the MNIST dataset. Aiming to surpass the performance of LeNet-5 by LeCun et al., a 6-layer neural network was designed. The proposed model processes 28×28 pixel images through convolution, max pooling, and fully connected layers, with batch normalization to improve learning stability and performance. The experiment used 60,000 training images and 10,000 test images and applied the Momentum optimization algorithm. The model used 30 filters of size 5×5 with padding 0 and stride 1, and ReLU as the activation function. Training used a mini-batch size of 100, 20 epochs in total, and a learning rate of 0.1. As a result, the proposed model achieved a test accuracy of 99.22%, surpassing LeNet-5's 99.05%, and recorded an F1-score of 0.9919. Moreover, the proposed 6-layer model has a simpler structure than LeCun et al.'s LeNet-5 (7-layer model) and the model proposed by Ji, Chun and Kim (10-layer model), which emphasizes model efficiency. The results show potential for real industrial applications such as AI vision inspection systems. The model is expected to be effectively applied in smart factories, particularly in determining whether parts are defective.
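
The convolution settings quoted in this abstract (28×28 input, 5×5 filters, padding 0, stride 1, 30 filters) determine the size of the feature maps and the number of convolution weights. The following is a small arithmetic sketch; the helper names are hypothetical, and the count assumes a single grayscale input channel and one bias per filter, which the abstract does not state explicitly.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Spatial size of a convolution output along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv_param_count(num_filters, kernel_size, in_channels=1, bias=True):
    """Trainable parameters in one convolution layer.

    Assumes square kernels; in_channels=1 assumes grayscale MNIST input,
    and bias=True assumes one bias term per filter.
    """
    per_filter = kernel_size * kernel_size * in_channels + (1 if bias else 0)
    return num_filters * per_filter


side = conv_output_size(28, 5, padding=0, stride=1)  # 24
params = conv_param_count(30, 5)                     # 780
print(f"feature maps: 30 x {side} x {side}, conv parameters: {params}")
```

Under these assumptions the convolution layer produces 30 feature maps of 24×24 and holds 780 parameters. The pooling window and the fully connected layer sizes are not given in the abstract, so they are left out of the sketch.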

Analyzing Media Bias in News Articles Using RNN and CNN (순환 신경망과 합성곱 신경망을 이용한 뉴스 기사 편향도 분석)

  • Oh, Seungbin;Kim, Hyunmin;Kim, Seungjae
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.8 / pp.999-1005 / 2020
  • While the 'Portal News' services of search portals account for the largest share among news aggregation outlets, their neutrality as outlets is questionable, because news aggregation may lead to prejudiced information consumption by recommending biased news articles. In this paper, we introduce a new method of measuring the political bias of news articles using deep learning, which can give readers insight for critical thinking. For this method, we build the deep learning dataset by analyzing article bias from keywords sourced from the National Assembly proceedings and assigning a bias to each keyword. Based on these data, news article bias is calculated by applying deep learning with a combination of a convolutional neural network and a recurrent neural network. With this method, 95.6% of sentences are correctly classified as either conservative-biased or progressive-biased; for entire articles, the accuracy is 46.0%. Unlike previous methods, which were limited to particular article subjects, this method enables analysis of the conservative or progressive bias of any article.

Efficient Tire Wear and Defect Detection Algorithm Based on Deep Learning (심층학습 기법을 활용한 효과적인 타이어 마모도 분류 및 손상 부위 검출 알고리즘)

  • Park, Hye-Jin;Lee, Young-Woon;Kim, Byung-Gyu
    • Journal of Korea Multimedia Society / v.24 no.8 / pp.1026-1034 / 2021
  • Tire wear and defects are important factors for safe driving. They are generally inspected by specialized experts or with very expensive equipment such as stereo depth cameras and depth gauges. In this paper, we propose a tire safety vision inspector based on a deep neural network (DNN). Tire wear status is categorized into three classes, 'safety', 'warning', and 'danger', based on the depth of the tire tread. We propose an attention mechanism that emphasizes the features of the tread area. The attention-based feature is concatenated to the output feature maps of the last convolution layer of ResNet-101 to extract more robust features. In experiments, the proposed tire wear classification model improves accuracy by 1.8% compared with the existing ResNet-101 model. For detecting tire defects, the developed defect detection model, which uses the Mask R-CNN model, achieves up to 91% accuracy. These results suggest that the proposed models are useful for checking the safety condition of tires in use in real environments.

Analysis of Livestock Vocal Data using Lightweight MobileNet (경량화 MobileNet을 활용한 축산 데이터 음성 분석)

  • Se Yeon Chung;Sang Cheol Kim
    • Smart Media Journal / v.13 no.6 / pp.16-23 / 2024
  • Pigs express their reactions to their environment and health status through a variety of sounds, such as grunting, coughing, and screaming. Given the significance of pig vocalizations, their study has recently become a vital source of data for livestock industry workers. To facilitate this, we propose a lightweight deep learning model based on MobileNet that analyzes pig vocal patterns to distinguish pig voices from farm noise and differentiate between vocal sounds and coughing. This model was able to accurately identify pig vocalizations amidst a variety of background noises and cough sounds within the pigsty. Test results demonstrated that this model achieved a high accuracy of 98.2%. Based on these results, future research is expected to address issues such as analyzing pig emotions and identifying stress levels.

A Study on the Prediction of Buried Rebar Thickness Using CNN Based on GPR Heatmap Image Data (GPR 히트맵 이미지 데이터 기반 CNN을 이용한 철근 두께 예측에 관한 연구)

  • Park, Sehwan;Kim, Juwon;Kim, Wonkyu;Kim, Hansun;Park, Seunghee
    • Journal of the Korea Institute for Structural Maintenance and Inspection / v.23 no.7 / pp.66-71 / 2019
  • This paper studies a method of using GPR data to predict the thickness of rebar inside a facility. As cases of poor construction show, such as the use of rebar below the domestic standard and the construction of reinforcement, information on rebar thickness is essential for precise safety diagnosis of structures. For this purpose, specimens were made and GPR B-scan data were obtained while gradually increasing the rebar diameter. Because GPR B-scan data are hard to interpret visually, they were converted into heatmap image data through migration to make them more intuitive. To compare the results of applying the commonly used B-scan data and the heatmap data to a CNN, this study extracted the rebar regions from the B-scan and heatmap data to build training and validation data, and applied a CNN to the data thus built. Better results were obtained with the heatmap data than with the B-scan data. This confirms that rebar thickness can be predicted with higher accuracy from GPR heatmap data than from B-scan data, and verifies the possibility of predicting rebar thickness inside a facility.

Sensor Fault Detection Scheme based on Deep Learning and Support Vector Machine (딥 러닝 및 서포트 벡터 머신기반 센서 고장 검출 기법)

  • Yang, Jae-Wan;Lee, Young-Doo;Koo, In-Soo
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.2 / pp.185-195 / 2018
  • As industrial machines have become automated in recent years, managing and maintaining automated machines is of paramount importance. When a fault occurs in a sensor attached to a machine, the machine may malfunction and, further, cause huge damage to the process line. To prevent this, sensor faults should be monitored, diagnosed, and classified properly. In this paper, we propose a sensor fault detection scheme based on SVM and CNN to detect and classify typical sensor errors such as erratic, drift, hard-over, spike, and stuck faults. Time-domain statistical features are used for learning and testing in the proposed scheme, and a genetic algorithm is used to select the optimal subset of features. To classify multiple sensor faults, a multi-layer SVM is used, and an ensemble technique is used for the CNN. As a result, the SVM that uses the subset of features selected by the genetic algorithm performs better than the SVM that uses all the features. However, the CNN outperforms the SVM.
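
The abstract above mentions time-domain statistical features without listing them. As a minimal sketch of what such features can look like, the function below computes four common ones over a window of sensor samples. The choice of mean, standard deviation, RMS, and peak-to-peak is an assumption for illustration, not the authors' feature set, and the sample windows are made up.

```python
import math
import statistics


def time_domain_features(window):
    """Simple statistical features of one window of sensor samples.

    The feature set here is an illustrative assumption; the paper's
    actual features are not listed in the abstract.
    """
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),  # population standard deviation
        "rms": math.sqrt(sum(x * x for x in window) / len(window)),
        "peak_to_peak": max(window) - min(window),
    }


# Made-up windows: an oscillating signal and a sensor stuck at one value.
healthy = time_domain_features([1.0, -1.0, 1.0, -1.0])
stuck = time_domain_features([3.0, 3.0, 3.0, 3.0])
```

In this toy case the stuck window has zero standard deviation and zero peak-to-peak range, which is the kind of separation a classifier such as an SVM or CNN could exploit.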

Low Resolution Infrared Image Deep Convolution Neural Network for Embedded System

  • Hong, Yong-hee;Jin, Sang-hun;Kim, Dae-hyeon;Jhee, Ho-Jin
    • Journal of the Korea Society of Computer and Information / v.26 no.6 / pp.1-8 / 2021
  • In this paper, we propose a reinforced VGG-style network structure for classifying low-resolution infrared images on low-performance embedded systems. Combining the reinforced VGG-style network structure with global average pooling yields lower computational complexity and higher accuracy. The proposed method classifies synthesized images: 3,723,328 images in 9 classes generated with the OKTAL-SE tool. The reinforced VGG-style network structure, composed of 4 filters on the input and 16 filters on the output of the max pooling layer, shows about 34% lower computational complexity and about 2.4% higher accuracy than the first parameter-minimized network structure made for embedded systems, which is composed of 8 filters on the input and 8 filters on the output of the max pooling layer. Finally, we obtain a model with 96.1% accuracy. Additionally, we confirmed about 31% lower inference lead time in the ported C code.
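
One reason global average pooling can lower complexity is that it shrinks the input of the final classifier from every feature-map value to one value per channel. The sketch below compares the two parameter counts. The 9 classes come from the abstract, but the 16 channels of 8×8 feature maps are hypothetical sizes chosen for illustration, and the helper name is not from the paper.

```python
def dense_param_count(num_inputs, num_outputs):
    """Weights plus biases of one fully connected layer."""
    return num_inputs * num_outputs + num_outputs


# Hypothetical final feature maps: 16 channels of 8 x 8 values.
channels, height, width = 16, 8, 8
num_classes = 9  # number of classes stated in the abstract

# Classifier fed with the flattened feature maps.
flatten_params = dense_param_count(channels * height * width, num_classes)

# Classifier fed with one averaged value per channel (global average pooling).
gap_params = dense_param_count(channels, num_classes)

print(flatten_params, gap_params)  # 9225 153
```

The averaging itself has no trainable parameters, so under these assumed sizes the classifier shrinks from 9,225 to 153 parameters.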