• Title/Summary/Keyword: 3-Dimensional Convolutional Network

Search Result 40, Processing Time 0.025 seconds

Dynamic Hand Gesture Recognition Using CNN Model and FMM Neural Networks (CNN 모델과 FMM 신경망을 이용한 동적 수신호 인식 기법)

  • Kim, Ho-Joon
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.2
    • /
    • pp.95-108
    • /
    • 2010
  • In this paper, we present a hybrid neural network model for dynamic hand gesture recognition. The model consists of two modules, feature extraction module and pattern classification module. We first propose a modified CNN(convolutional Neural Network) a pattern recognition model for the feature extraction module. Then we introduce a weighted fuzzy min-max(WFMM) neural network for the pattern classification module. The data representation proposed in this research is a spatiotemporal template which is based on the motion information of the target object. To minimize the influence caused by the spatial and temporal variation of the feature points, we extend the receptive field of the CNN model to a three-dimensional structure. We discuss the learning capability of the WFMM neural networks in which the weight concept is added to represent the frequency factor in training pattern set. The model can overcome the performance degradation which may be caused by the hyperbox contraction process of conventional FMM neural networks. From the experimental results of human action recognition and dynamic hand gesture recognition for remote-control electric home appliances, the validity of the proposed models is discussed.

CNN based data anomaly detection using multi-channel imagery for structural health monitoring

  • Shajihan, Shaik Althaf V.;Wang, Shuo;Zhai, Guanghao;Spencer, Billie F. Jr.
    • Smart Structures and Systems
    • /
    • v.29 no.1
    • /
    • pp.181-193
    • /
    • 2022
  • Data-driven structural health monitoring (SHM) of civil infrastructure can be used to continuously assess the state of a structure, allowing preemptive safety measures to be carried out. Long-term monitoring of large-scale civil infrastructure often involves data-collection using a network of numerous sensors of various types. Malfunctioning sensors in the network are common, which can disrupt the condition assessment and even lead to false-negative indications of damage. The overwhelming size of the data collected renders manual approaches to ensure data quality intractable. The task of detecting and classifying an anomaly in the raw data is non-trivial. We propose an approach to automate this task, improving upon the previously developed technique of image-based pre-processing on one-dimensional (1D) data by enriching the features of the neural network input data with multiple channels. In particular, feature engineering is employed to convert the measured time histories into a 3-channel image comprised of (i) the time history, (ii) the spectrogram, and (iii) the probability density function representation of the signal. To demonstrate this approach, a CNN model is designed and trained on a dataset consisting of acceleration records of sensors installed on a long-span bridge, with the goal of fault detection and classification. The effect of imbalance in anomaly patterns observed is studied to better account for unseen test cases. The proposed framework achieves high overall accuracy and recall even when tested on an unseen dataset that is much larger than the samples used for training, offering a viable solution for implementation on full-scale structures where limited labeled-training data is available.

Development of Image Defect Detection Model Using Machine Learning (기계 학습을 활용한 이미지 결함 검출 모델 개발)

  • Lee, Nam-Yeong;Cho, Hyug-Hyun;Ceong, Hyi-Thaek
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.15 no.3
    • /
    • pp.513-520
    • /
    • 2020
  • Recently, the development of a vision inspection system using machine learning has become more active. This study seeks to develop a defect inspection model using machine learning. Defect detection problems for images correspond to classification problems, which are the method of supervised learning in machine learning. In this study, defect detection models are developed based on algorithms that automatically extract features and algorithms that do not extract features. One-dimensional CNN and two-dimensional CNN are used as algorithms for automatic extraction of features, and MLP and SVM are used as algorithms for non-extracting features. A defect detection model is developed based on four models and their accuracy and AUC compare based on AUC. Although image classification is common in the development of models using CNN, high accuracy and AUC is achieved when developing SVM models by converting pixels from images into RGB values in this study.

Deep learning based Person Re-identification with RGB-D sensors

  • Kim, Min;Park, Dong-Hyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.3
    • /
    • pp.35-42
    • /
    • 2021
  • In this paper, we propose a deep learning-based person re-identification method using a three-dimensional RGB-Depth Xtion2 camera considering joint coordinates and dynamic features(velocity, acceleration). The main idea of the proposed identification methodology is to easily extract gait data such as joint coordinates, dynamic features with an RGB-D camera and automatically identify gait patterns through a self-designed one-dimensional convolutional neural network classifier(1D-ConvNet). The accuracy was measured based on the F1 Score, and the influence was measured by comparing the accuracy with the classifier model (JC) that did not consider dynamic characteristics. As a result, our proposed classifier model in the case of considering the dynamic characteristics(JCSpeed) showed about 8% higher F1-Score than JC.

Performance Evaluation of Machine Learning and Deep Learning Algorithms in Crop Classification: Impact of Hyper-parameters and Training Sample Size (작물분류에서 기계학습 및 딥러닝 알고리즘의 분류 성능 평가: 하이퍼파라미터와 훈련자료 크기의 영향 분석)

  • Kim, Yeseul;Kwak, Geun-Ho;Lee, Kyung-Do;Na, Sang-Il;Park, Chan-Won;Park, No-Wook
    • Korean Journal of Remote Sensing
    • /
    • v.34 no.5
    • /
    • pp.811-827
    • /
    • 2018
  • The purpose of this study is to compare machine learning algorithm and deep learning algorithm in crop classification using multi-temporal remote sensing data. For this, impacts of machine learning and deep learning algorithms on (a) hyper-parameter and (2) training sample size were compared and analyzed for Haenam-gun, Korea and Illinois State, USA. In the comparison experiment, support vector machine (SVM) was applied as machine learning algorithm and convolutional neural network (CNN) was applied as deep learning algorithm. In particular, 2D-CNN considering 2-dimensional spatial information and 3D-CNN with extended time dimension from 2D-CNN were applied as CNN. As a result of the experiment, it was found that the hyper-parameter values of CNN, considering various hyper-parameter, defined in the two study areas were similar compared with SVM. Based on this result, although it takes much time to optimize the model in CNN, it is considered that it is possible to apply transfer learning that can extend optimized CNN model to other regions. Then, in the experiment results with various training sample size, the impact of that on CNN was larger than SVM. In particular, this impact was exaggerated in Illinois State with heterogeneous spatial patterns. In addition, the lowest classification performance of 3D-CNN was presented in Illinois State, which is considered to be due to over-fitting as complexity of the model. That is, the classification performance was relatively degraded due to heterogeneous patterns and noise effect of input data, although the training accuracy of 3D-CNN model was high. This result simply that a proper classification algorithms should be selected considering spatial characteristics of study areas. Also, a large amount of training samples is necessary to guarantee higher classification performance in CNN, particularly in 3D-CNN.

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.205-225
    • /
    • 2018
  • Convolutional Neural Network (ConvNet) is one class of the powerful Deep Neural Network that can analyze and learn hierarchies of visual features. Originally, first neural network (Neocognitron) was introduced in the 80s. At that time, the neural network was not broadly used in both industry and academic field by cause of large-scale dataset shortage and low computational power. However, after a few decades later in 2012, Krizhevsky made a breakthrough on ILSVRC-12 visual recognition competition using Convolutional Neural Network. That breakthrough revived people interest in the neural network. The success of Convolutional Neural Network is achieved with two main factors. First of them is the emergence of advanced hardware (GPUs) for sufficient parallel computation. Second is the availability of large-scale datasets such as ImageNet (ILSVRC) dataset for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, it is difficult and requires lots of effort to gather large-scale dataset to train a ConvNet. Moreover, even if we have a large-scale dataset, training ConvNet from scratch is required expensive resource and time-consuming. These two obstacles can be solved by using transfer learning. Transfer learning is a method for transferring the knowledge from a source domain to new domain. There are two major Transfer learning cases. First one is ConvNet as fixed feature extractor, and the second one is Fine-tune the ConvNet on a new dataset. In the first case, using pre-trained ConvNet (such as on ImageNet) to compute feed-forward activations of the image into the ConvNet and extract activation features from specific layers. In the second case, replacing and retraining the ConvNet classifier on the new dataset, then fine-tune the weights of the pre-trained network with the backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying features with high dimensional complexity that is directly extracted from multiple ConvNet layers is still a challenging problem. We observe that features extracted from multiple ConvNet layers address the different characteristics of the image which means better representation could be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our primary pipeline has three steps. Firstly, images from target task are given as input to ConvNet, then that image will be feed-forwarded into pre-trained AlexNet, and the activation features from three fully connected convolutional layers are extracted. Secondly, activation features of three ConvNet layers are concatenated to obtain multiple ConvNet layers representation because it will gain more information about an image. When three fully connected layer features concatenated, the occurring image representation would have 9192 (4096+4096+1000) dimension features. However, features extracted from multiple ConvNet layers are redundant and noisy since they are extracted from the same ConvNet. Thus, a third step, we will use Principal Component Analysis (PCA) to select salient features before the training phase. When salient features are obtained, the classifier can classify image more accurately, and the performance of transfer learning can be improved. To evaluate proposed method, experiments are conducted in three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against single ConvNet layer representation by using PCA for feature selection and dimension reduction. Our experiments demonstrated the importance of feature selection for multiple ConvNet layer representation. Moreover, our proposed approach achieved 75.6% accuracy compared to 73.9% accuracy achieved by FC7 layer on the Caltech-256 dataset, 73.1% accuracy compared to 69.2% accuracy achieved by FC8 layer on the VOC07 dataset, 52.2% accuracy compared to 48.7% accuracy achieved by FC7 layer on the SUN397 dataset. We also showed that our proposed approach achieved superior performance, 2.8%, 2.1% and 3.1% accuracy improvement on Caltech-256, VOC07, and SUN397 dataset respectively compare to existing work.

Estimation for Nominal Wake Field of Ships by Using Machine Learning Model (기계학습 모델을 활용한 선박 공칭반류장 예측)

  • Yoo-Chul Kim;Gun-Do Kim;Seongmo Yeon;Seung-Hyun Hwang;Young-Yeon Lee;Kwang-Soo Kim
    • Journal of the Society of Naval Architects of Korea
    • /
    • v.61 no.5
    • /
    • pp.343-351
    • /
    • 2024
  • In this paper, we introduce the machine learning model to estimate the nominal wake field of a ship from the afterbody hullform using a 3 dimensional CNN (Convolutional Neural Network) model. The convolution layers extract the features of the hullform and they are connected to the nominal wake field. In this research, two different models were tested. The one learns the velocity field itself while the other learns the Fourier coefficients expressing the wake field. Both models showed about 4% volumetric mean velocity error for the test data not used in the learning process. In the case study of two sample ships included in the test data, the direct prediction model showed the better estimation results than the Fourier coefficient based model. Application cases for estimating cavitation performance using the developed model were also introduced.

Comparison of Prediction Accuracy Between Classification and Convolution Algorithm in Fault Diagnosis of Rotatory Machines at Varying Speed (회전수가 변하는 기기의 고장진단에 있어서 특성 기반 분류와 합성곱 기반 알고리즘의 예측 정확도 비교)

  • Moon, Ki-Yeong;Kim, Hyung-Jin;Hwang, Se-Yun;Lee, Jang Hyun
    • Journal of Navigation and Port Research
    • /
    • v.46 no.3
    • /
    • pp.280-288
    • /
    • 2022
  • This study examined the diagnostics of abnormalities and faults of equipment, whose rotational speed changes even during regular operation. The purpose of this study was to suggest a procedure that can properly apply machine learning to the time series data, comprising non-stationary characteristics as the rotational speed changes. Anomaly and fault diagnosis was performed using machine learning: k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Random Forest. To compare the diagnostic accuracy, an autoencoder was used for anomaly detection and a convolution based Conv1D was additionally used for fault diagnosis. Feature vectors comprising statistical and frequency attributes were extracted, and normalization & dimensional reduction were applied to the extracted feature vectors. Changes in the diagnostic accuracy of machine learning according to feature selection, normalization, and dimensional reduction are explained. The hyperparameter optimization process and the layered structure are also described for each algorithm. Finally, results show that machine learning can accurately diagnose the failure of a variable-rotation machine under the appropriate feature treatment, although the convolution algorithms have been widely applied to the considered problem.

Improvements in Patch-Based Machine Learning for Analyzing Three-Dimensional Seismic Sequence Data (3차원 탄성파자료의 층서구분을 위한 패치기반 기계학습 방법의 개선)

  • Lee, Donguk;Moon, Hye-Jin;Kim, Chung-Ho;Moon, Seonghoon;Lee, Su Hwan;Jou, Hyeong-Tae
    • Geophysics and Geophysical Exploration
    • /
    • v.25 no.2
    • /
    • pp.59-70
    • /
    • 2022
  • Recent studies demonstrate that machine learning has expanded in the field of seismic interpretation. Many convolutional neural networks have been developed for seismic sequence identification, which is important for seismic interpretation. However, expense and time limitations indicate that there is insufficient data available to provide a sufficient dataset to train supervised machine learning programs to identify seismic sequences. In this study, patch division and data augmentation are applied to mitigate this lack of data. Furthermore, to obtain spatial information that could be lost during patch division, an artificial channel is added to the original data to indicate depth. Seismic sequence identification is performed using a U-Net network and the Netherlands F3 block dataset from the dGB Open Seismic Repository, which offers datasets for machine learning, and the predicted results are evaluated. The results show that patch-based U-Net seismic sequence identification is improved by data augmentation and the addition of an artificial channel.

A Vision Transformer Based Recommender System Using Side Information (부가 정보를 활용한 비전 트랜스포머 기반의 추천시스템)

  • Kwon, Yujin;Choi, Minseok;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.3
    • /
    • pp.119-137
    • /
    • 2022
  • Recent recommendation system studies apply various deep learning models to represent user and item interactions better. One of the noteworthy studies is ONCF(Outer product-based Neural Collaborative Filtering) which builds a two-dimensional interaction map via outer product and employs CNN (Convolutional Neural Networks) to learn high-order correlations from the map. However, ONCF has limitations in recommendation performance due to the problems with CNN and the absence of side information. ONCF using CNN has an inductive bias problem that causes poor performances for data with a distribution that does not appear in the training data. This paper proposes to employ a Vision Transformer (ViT) instead of the vanilla CNN used in ONCF. The reason is that ViT showed better results than state-of-the-art CNN in many image classification cases. In addition, we propose a new architecture to reflect side information that ONCF did not consider. Unlike previous studies that reflect side information in a neural network using simple input combination methods, this study uses an independent auxiliary classifier to reflect side information more effectively in the recommender system. ONCF used a single latent vector for user and item, but in this study, a channel is constructed using multiple vectors to enable the model to learn more diverse expressions and to obtain an ensemble effect. The experiments showed our deep learning model improved performance in recommendation compared to ONCF.