• Title/Summary/Keyword: CNN Feature


Facial Expression Classification Using Deep Convolutional Neural Network (깊은 Convolutional Neural Network를 이용한 얼굴표정 분류 기법)

  • Choi, In-kyu;Song, Hyok;Lee, Sangyong;Yoo, Jisang
    • Journal of Broadcast Engineering / v.22 no.2 / pp.162-172 / 2017
  • In this paper, we propose a facial expression recognition method using a CNN (Convolutional Neural Network), one of the deep learning technologies. To overcome the disadvantages of existing facial expression databases, several databases are used together. In the proposed technique, we construct data sets for six facial expressions: 'expressionless', 'happiness', 'sadness', 'angry', 'surprise', and 'disgust'. Pre-processing and data augmentation techniques are also applied to improve learning efficiency and classification performance. Starting from an existing CNN structure, the structure that best expresses the features of the six facial expressions is found by adjusting the number of feature maps in the convolutional layers and the number of nodes in the fully-connected layers. Experimental results show that the proposed scheme achieves the highest classification performance, 96.88%, while taking the least time to pass through the CNN structure compared to other models.

Intra Prediction Method for Depth Picture Using CNN and Attention Mechanism (CNN과 Attention을 통한 깊이 화면 내 예측 방법)

  • Jae-hyuk Yoon;Dong-seok Lee;Byoung-ju Yun;Soon-kak Kwon
    • Journal of Korea Society of Industrial Information Systems / v.29 no.2 / pp.35-45 / 2024
  • In this paper, we propose an intra prediction method for depth pictures using a CNN and an attention mechanism. The proposed method lets each pixel in a block be predicted by selecting pixels from the reference area. Spatial features in the vertical and horizontal directions are extracted through a CNN layer from the reference pixels in the top and left areas adjacent to the block, respectively. The two spatial features are merged along the feature direction and along the spatial direction to predict features for the prediction block and for the reference pixels, respectively. The correlation between the prediction block and the reference pixels is then predicted through the attention mechanism. The predicted correlations are restored to the pixel domain through CNN layers to predict the pixels in the block. The average prediction error of intra prediction is reduced by 5.8% when the proposed method is added to the VVC intra modes.
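
The attention step in this abstract can be illustrated with a minimal sketch of scaled dot-product attention, in which each block pixel weights the reference pixels by feature similarity. This is not the authors' network: the function names, the toy feature vectors, and the use of a plain weighted sum of reference depth values (instead of CNN layers that restore the pixel domain) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(block_feats, ref_feats, ref_values):
    """Predict one value per block pixel as an attention-weighted sum.

    block_feats: one feature vector per pixel of the prediction block (queries)
    ref_feats:   one feature vector per reference pixel (keys)
    ref_values:  depth value of each reference pixel (hypothetical stand-in
                 for the features the paper feeds to its restoring CNN layers)
    """
    dim = len(ref_feats[0])
    predictions = []
    for query in block_feats:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in ref_feats
        ]
        weights = softmax(scores)
        predictions.append(sum(w * v for w, v in zip(weights, ref_values)))
    return predictions


# Toy data: two reference pixels with clearly different features.
ref_feats = [[10.0, 0.0], [0.0, 10.0]]
ref_values = [100.0, 200.0]
block_feats = [[10.0, 0.0], [0.0, 10.0]]
predicted = attend(block_feats, ref_feats, ref_values)
```

With these toy inputs each block pixel matches one reference pixel almost exclusively, so its prediction lands very close to that reference pixel's depth value.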

Active pulse classification algorithm using convolutional neural networks (콘볼루션 신경회로망을 이용한 능동펄스 식별 알고리즘)

  • Kim, Geunhwan;Choi, Seung-Ryul;Yoon, Kyung-Sik;Lee, Kyun-Kyung;Lee, Donghwa
    • The Journal of the Acoustical Society of Korea / v.38 no.1 / pp.106-113 / 2019
  • In this paper, we propose an algorithm to classify received active pulses when an active sonar system is operated in a non-cooperative mode. The proposed algorithm uses a CNN (Convolutional Neural Network), which has shown good performance in various fields. The input to the CNN is time-frequency data obtained by applying the STFT (Short Time Fourier Transform) to the received signal. The CNN used in this paper consists of two convolution and pooling layers. We designed a database-based neural network and a pulse-feature-based neural network, which differ in the design of the output layer. To verify the performance of the algorithm, data from 3110 CW (Continuous Wave) and LFM (Linear Frequency Modulated) pulses received in the actual ocean were processed to construct training and test data. In simulation, the database-based neural network showed 99.9% accuracy and the feature-based neural network showed about 96% accuracy when allowing a 2-pixel error.
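
To make the STFT input concrete, here is a small standard-library sketch that builds a time-frequency magnitude map for a synthetic CW pulse and a synthetic LFM pulse. The sampling rate, pulse frequencies, frame length, hop size, and Hann window are assumptions chosen for illustration; the abstract does not state the parameters used in the paper.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum (bins 0..frame_len//2) per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]  # periodic Hann window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def cw_pulse(n_samples, freq, fs):
    """Constant-frequency (CW) tone."""
    return [math.sin(2 * math.pi * freq * t / fs) for t in range(n_samples)]


def lfm_pulse(n_samples, f0, f1, fs):
    """Linear chirp sweeping from f0 to f1 over the pulse duration."""
    rate = (f1 - f0) / (n_samples / fs)  # sweep rate in Hz per second
    return [math.sin(2 * math.pi * (f0 * (t / fs) + 0.5 * rate * (t / fs) ** 2))
            for t in range(n_samples)]


def peak_bins(frames):
    """Index of the strongest frequency bin in each frame."""
    return [max(range(len(f)), key=f.__getitem__) for f in frames]


FS = 1024  # hypothetical sampling rate in Hz
cw_track = peak_bins(stft_magnitude(cw_pulse(512, 128, FS)))
lfm_track = peak_bins(stft_magnitude(lfm_pulse(512, 64, 320, FS)))
```

In the resulting maps the CW pulse stays in a single frequency bin across all frames, while the LFM pulse climbs through the bins over time. This difference in time-frequency shape is the kind of pattern a CNN can learn to separate.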

Fault Diagnosis of Bearing Based on Convolutional Neural Network Using Multi-Domain Features

  • Shao, Xiaorui;Wang, Lijiang;Kim, Chang Soo;Ra, Ilkyeun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1610-1629 / 2021
  • Failures occur frequently in manufacturing machines due to complex and changeable manufacturing environments, increasing downtime and maintenance costs. This manuscript develops a novel deep learning-based method, named Multi-Domain Convolutional Neural Network (MDCNN), to address this challenging task using vibration signals. The proposed MDCNN consists of time-domain, frequency-domain, and statistical-domain feature channels. The time-domain channel models the hidden patterns of signals in the time domain. The frequency-domain channel uses the Discrete Wavelet Transform (DWT) to obtain rich feature representations of signals in the frequency domain. The statistical-domain channel contains six statistical variables, which reflect the signals' macro statistical-domain features. First, the time-domain and frequency-domain channels are each processed by a CNN with various filters. Second, the CNN-extracted features from the time and frequency domains are merged into time-frequency features. Last, the time-frequency features are fused with the six statistical variables to form the comprehensive features used to identify the fault. The proposed method can thereby make full use of all three feature domains for fault diagnosis while keeping high distinguishability thanks to the CNN. The authors designed extensive experiments with 10-fold cross-validation to validate the proposed method's effectiveness on the CWRU bearing data set. The experimental results are reported as accuracy averaged over ten runs. They confirm that the proposed MDCNN can detect faults intelligently, accurately, and in a timely manner in complex manufacturing environments, with an accuracy of nearly 100%.
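
The raw inputs to the frequency-domain and statistical-domain channels can be sketched as follows. The abstract names neither the wavelet nor the six statistical variables, so the one-level Haar transform and the particular six statistics below (mean, standard deviation, RMS, peak-to-peak, skewness, kurtosis) are assumptions. The CNN processing and feature fusion described in the abstract are not reproduced here.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    scale = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def six_statistics(signal):
    """Hypothetical choice of six summary statistics for a vibration segment."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n  # population variance
    std = math.sqrt(var)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak_to_peak = max(signal) - min(signal)
    skewness = sum((x - mean) ** 3 for x in signal) / (n * std ** 3)
    kurtosis = sum((x - mean) ** 4 for x in signal) / (n * var ** 2)
    return [mean, std, rms, peak_to_peak, skewness, kurtosis]


# Toy vibration segment: a decaying oscillation.
segment = [math.exp(-0.05 * t) * math.sin(0.8 * t) for t in range(64)]
approx, detail = haar_dwt(segment)
stats = six_statistics(segment)
```

The Haar transform is orthonormal, so the energy of the approximation and detail coefficients together equals the energy of the input segment, which is a convenient sanity check.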

Distracted Driver Detection and Characteristic Area Localization by Combining CAM-Based Hierarchical and Horizontal Classification Models (CAM 기반의 계층적 및 수평적 분류 모델을 결합한 운전자 부주의 검출 및 특징 영역 지역화)

  • Go, Sooyeon;Choi, Yeongwoo
    • KIPS Transactions on Software and Data Engineering / v.10 no.11 / pp.439-448 / 2021
  • Driver negligence accounts for the largest proportion of the causes of traffic accidents, and research on detecting it is ongoing. This paper proposes a method to accurately detect a distracted driver and localize the most characteristic parts of the driver. To detect driver distraction, the proposed method hierarchically constructs a CAM-based basic CNN model that classifies 10 classes, together with 4 subclass models for detailed classification of the classes that are confused in the basic model or share a common feature area. The classification result output by each model can be regarded as a new feature indicating the degree of matching with the CNN feature maps, and classification accuracy is improved by horizontally combining these results and learning from them. In addition, by combining the heat maps that reflect the classification results of the basic and detailed classification models, the characteristic areas of attention in the image are found. The proposed method obtained an accuracy of 95.14% in an experiment using the State Farm data set, 2.94 percentage points higher than 92.2%, the highest accuracy among existing results on this data set. The experiment also confirmed that the attention areas found are more meaningful and accurate than those found when only the basic model is used.

Diagnosis of Valve Internal Leakage for Ship Piping System using Acoustic Emission Signal-based Machine Learning Approach (선박용 밸브의 내부 누설 진단을 위한 음향방출신호의 머신러닝 기법 적용 연구)

  • Lee, Jung-Hyung
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.1 / pp.184-192 / 2022
  • Valve internal leakage is caused by damage to the internal parts of the valve, resulting in accidents and shutdowns of the piping system. This study investigated the possibility of a real-time leak detection method using the acoustic emission (AE) signal generated from the piping system during the internal leakage of a butterfly valve. Datasets of raw time-domain AE signals were systematically collected and postprocessed for each operation mode of the valve to develop a data-driven model for the detection and classification of internal leakage by applying machine learning algorithms. The aim of this study was to determine whether leak detection can be treated as a classification problem by applying two classification algorithms: support vector machine (SVM) and convolutional neural network (CNN). Performance differed across the algorithms and datasets used. The SVM-based binary classification models, which rely on feature extraction from the data, achieved an overall accuracy of 83% to 90%, while the accuracy of the multi-class classification model dropped to 66%. By contrast, the CNN-based classification model achieved an accuracy of 99.85%, superior to that of any of the SVM-based models. The results revealed that the SVM classification model requires effective feature extraction of the AE signals to improve the accuracy of multi-class classification. Moreover, the CNN-based classification can be a promising approach to detect both leakage and valve opening as long as the performance of the processor does not degrade.

Multi-resolution DenseNet based acoustic models for reverberant speech recognition (잔향 환경 음성인식을 위한 다중 해상도 DenseNet 기반 음향 모델)

  • Park, Sunchan;Jeong, Yongwon;Kim, Hyung Soon
    • Phonetics and Speech Sciences / v.10 no.1 / pp.33-38 / 2018
  • Although deep neural network-based acoustic models have greatly improved the performance of automatic speech recognition (ASR), reverberation still degrades the performance of distant speech recognition in indoor environments. In this paper, we adopt the DenseNet, which has shown strong performance in image classification tasks, to improve the performance of reverberant speech recognition. The DenseNet enables a deep convolutional neural network (CNN) to be trained effectively by concatenating feature maps in each convolutional layer. In addition, we extend the concept of the multi-resolution CNN to a multi-resolution DenseNet for robust speech recognition in reverberant environments. We evaluate the performance of reverberant speech recognition on the single-channel ASR task of the REverberant Voice Enhancement and Recognition Benchmark (REVERB) challenge 2014. According to the experimental results, the DenseNet-based acoustic models perform better than the conventional CNN-based ones, and the multi-resolution DenseNet provides an additional performance improvement.
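
Because feature maps are concatenated, each layer in a dense block receives the block input plus the outputs of all earlier layers. The sketch below only does the resulting channel bookkeeping. The input channel count, growth rate, and layer count are illustrative values, not figures from the paper.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Return (input channels seen by each layer, output channels of the block).

    Each layer emits `growth_rate` new feature maps, which are concatenated
    onto everything produced before it.
    """
    per_layer_inputs = [in_channels + growth_rate * i for i in range(num_layers)]
    out_channels = in_channels + growth_rate * num_layers
    return per_layer_inputs, out_channels


# Hypothetical block: 16 input channels, growth rate 12, 4 layers.
layer_inputs, block_out = dense_block_channels(16, 12, 4)
print(layer_inputs, block_out)  # [16, 28, 40, 52] 64
```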

A Study on the Two-Phased Hybrid Neural Network Approach to an Effective Decision-Making (효과적인 의사결정을 위한 2단계 하이브리드 인공신경망 접근방법에 관한 연구)

  • Lee, Geon-Chang
    • Asia pacific journal of information systems / v.5 no.1 / pp.36-51 / 1995
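  • In this paper, we propose HYNEN (HYbrid NEural Network), a hybrid neural network model that combines a supervised-learning neural network model with an unsupervised-learning neural network model to solve unstructured decision-making problems effectively. The HYNEN model consists of two phases: a CNN (Clustering Neural Network), which clusters the given data, and an ONN (Output Neural Network), which provides the final output. First, the CNN forms clusters in order to find appropriate fuzzy rules from the given data. Then, using these clusters as a knowledge base, the ONN makes the final decision. In the CNN, clusters are formed with SOFM (Self Organizing Feature Map) and LVQ (Learning Vector Quantization) and are then learned by a backpropagation neural network model. In the ONN, a backpropagation neural network model is used to learn the contents of each cluster. The proposed HYNEN model was applied to bankruptcy data of Korean firms, and its results were compared with experimental results obtained using multivariate discriminant analysis (MDA), ACLS (Analog Concept Learning System), fuzzy ARTMAP, and a conventional backpropagation neural network.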


Gait Recognition Based on GF-CNN and Metric Learning

  • Wen, Junqin
    • Journal of Information Processing Systems / v.16 no.5 / pp.1105-1112 / 2020
  • Gait recognition, as a promising biometric, can be used in video-based surveillance and other security systems. However, due to the complexity of leg movement and variations in external sampling conditions, gait recognition still faces many problems. In this paper, an improved convolutional neural network (CNN) based on the Gabor filter is therefore proposed for gait recognition. First, a gait feature extraction layer based on the Gabor filter is inserted into a traditional CNN to extract gait features from gait silhouette images. Then, for gait classification, we take the output of the CNN as input, use metric learning techniques to calculate the distance between two gaits, and classify gaits with a k-nearest neighbors classifier. Finally, experiments conducted on two open-access gait datasets, OULP and CASIA-B, demonstrate that our method reaches state-of-the-art performance in terms of correct recognition rate.
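
The final classification stage can be sketched as a k-nearest neighbors vote over embedding vectors. This sketch uses plain Euclidean distance as a stand-in for the learned metric. The function name, the toy embeddings, and the subject labels are hypothetical.

```python
import math
from collections import Counter


def knn_classify(query, gallery, k=3):
    """Return the majority label among the k gallery embeddings nearest to query.

    gallery: list of (embedding, subject_id) pairs.
    Euclidean distance is an assumption; the paper learns its distance metric.
    """
    nearest = sorted(gallery, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy gallery: 2-D embeddings standing in for CNN outputs of two subjects.
gallery = [
    ([0.0, 0.1], "subject_A"),
    ([0.1, 0.0], "subject_A"),
    ([0.2, 0.1], "subject_A"),
    ([5.0, 5.1], "subject_B"),
    ([5.1, 4.9], "subject_B"),
    ([4.9, 5.0], "subject_B"),
]
probe_label = knn_classify([0.05, 0.05], gallery, k=3)
```

A learned metric would replace `math.dist` with a distance under which gaits of the same subject sit closer together than gaits of different subjects.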

Lightweight image classifier for CIFAR-10

  • Sharma, Akshay Kumar;Rana, Amrita;Kim, Kyung Ki
    • Journal of Sensor Science and Technology / v.30 no.5 / pp.286-289 / 2021
  • Image classification is one of the fundamental applications of computer vision. It enables a system to identify an object in an image. Recently, image classification applications have broadened their scope from computer applications to edge devices. The convolutional neural network (CNN) is the main class of deep learning neural networks widely used in computer vision tasks, and it delivers high accuracy. However, CNN algorithms use a large number of parameters and incur high computational costs, which hinder their implementation in edge hardware devices. To address this issue, this paper proposes a lightweight image classifier that provides good accuracy while using fewer parameters. The proposed image classifier splits the input into three paths and uses receptive fields of different scales to extract more feature maps while using fewer parameters during training. This results in a small model. The model is tested on the CIFAR-10 dataset and achieves an accuracy of 90% using 0.26M parameters. This is better than the state-of-the-art models, and it can be implemented on edge devices.
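
The parameter savings from parallel paths with different receptive fields can be illustrated with simple arithmetic, since a square 2-D convolution has k·k·C_in·C_out weights plus C_out biases. The kernel sizes (1, 3, 5) and channel counts below are hypothetical; the abstract does not give the layer configuration of the proposed classifier.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of one 2-D convolution layer with a square kernel."""
    return kernel * kernel * c_in * c_out + c_out


C_IN = 32  # hypothetical number of input channels

# Three parallel paths, 16 output channels each, with different kernel sizes.
three_path = sum(conv_params(k, C_IN, 16) for k in (1, 3, 5))

# One wide 5x5 layer producing the same total of 48 output channels.
single_wide = conv_params(5, C_IN, 48)

print(three_path, single_wide)  # 17968 38448
```

In this toy configuration the three-path layout yields the same number of output feature maps with fewer than half the parameters of the single wide layer.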