• Title, Summary, Keyword: 3D CNN

CNN Based 2D and 2.5D Face Recognition For Home Security System (홈보안 시스템을 위한 CNN 기반 2D와 2.5D 얼굴 인식)

  • Ma, Ying;Kim, Kang-Chul
    • The Journal of the Korea institute of electronic communication sciences, v.14 no.6, pp.1207-1214, 2019
  • Technologies of the 4th industrial revolution are steadily seeping into our lives. Many IoT-based home security systems use the convolutional neural network (CNN) as a biometric tool to recognize faces and protect home and family from intruders, since CNNs have demonstrated excellent ability in image recognition. In this paper, three CNN layouts for 2D and 2.5D images of a small dataset are explored with various input image sizes and filter sizes. The simulation results show that the layout with a 50*50 input size for 2.5D images, two convolution and max-pooling layers, and a 3*3 filter size is optimal for a home security system, with a recognition accuracy of 0.966. In addition, the longest CPU time for one input image is 0.057 s. The proposed CNN layout for face recognition is suitable for controlling the actuators in a home security system, because such a system requires both accurate face recognition and short recognition time.
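
As a concrete reading of the reported optimal layout, the sketch below builds a CNN with a 50*50 single-channel (2.5D depth) input, two convolution plus max-pooling stages, and 3*3 filters in PyTorch. The channel counts, dense-layer width, and number of identities are our assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class Face25DCNN(nn.Module):
    """Minimal sketch of the 2-conv/2-pool, 3x3-filter layout described above."""
    def __init__(self, num_classes=10):          # number of identities: assumed
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),  # 2.5D depth image, 1 channel
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 50x50 -> 25x25
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 25x25 -> 12x12
        )
        self.classifier = nn.Linear(64 * 12 * 12, num_classes)

    def forward(self, x):                                 # x: (N, 1, 50, 50)
        return self.classifier(self.features(x).flatten(1))

logits = Face25DCNN()(torch.randn(1, 1, 50, 50))          # -> (1, 10)
```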

The Impact of the PCA Dimensionality Reduction for CNN based Hyperspectral Image Classification (CNN 기반 초분광 영상 분류를 위한 PCA 차원축소의 영향 분석)

  • Kwak, Taehong;Song, Ahram;Kim, Yongil
    • Korean Journal of Remote Sensing, v.35 no.6_1, pp.959-971, 2019
  • CNN (Convolutional Neural Network) is a representative deep learning algorithm that can extract high-level spatial and spectral features, and it has been applied to hyperspectral image classification. However, one significant drawback of applying CNNs to hyperspectral images is the high dimensionality of the data, which increases training time and processing complexity. To address this problem, several CNN-based hyperspectral image classification studies have exploited PCA (Principal Component Analysis) for dimensionality reduction. One limitation is that the spectral information of the original image can be lost through PCA. Although it is clear that the use of PCA affects accuracy and CNN training time, the impact of PCA on CNN-based hyperspectral image classification has been understudied. The purpose of this study is to analyze the quantitative effect of PCA in CNNs for hyperspectral image classification. The hyperspectral images were first transformed through PCA and fed into the CNN model while varying the size of the reduced dimensionality. In addition, 2D-CNN and 3D-CNN frameworks were applied to analyze the sensitivity of PCA with respect to the convolution kernel in the model. Experimental results were evaluated based on classification accuracy, learning time, variance ratio, and training process. The reduced dimensionality was most efficient when the explained variance ratio reached 99.7%~99.8%. Since the advantage of the original-image CNN over the PCA-CNN was larger with the 3D kernel than with the 2D kernel, the results revealed that dimensionality reduction was relatively less effective for the 3D kernel.
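
For readers unfamiliar with the preprocessing step, the following sketch shows how a hyperspectral cube can be reduced with scikit-learn's PCA so that the retained components explain roughly 99.7% of the variance, the range the study found most efficient. The cube size and band count are illustrative, not the study's data.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative cube: 64x64 pixels, 200 spectral bands (assumed, not the paper's).
H, W, B = 64, 64, 200
cube = np.random.rand(H, W, B).astype(np.float32)

# A float n_components makes sklearn keep the smallest number of components
# whose cumulative explained variance reaches that fraction.
pca = PCA(n_components=0.997)
reduced = pca.fit_transform(cube.reshape(-1, B))   # (n_pixels, n_kept)
reduced_cube = reduced.reshape(H, W, -1)           # back to an image cube for the CNN

print(pca.n_components_, pca.explained_variance_ratio_.sum())
```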

Performance Evaluation of Machine Learning and Deep Learning Algorithms in Crop Classification: Impact of Hyper-parameters and Training Sample Size (작물분류에서 기계학습 및 딥러닝 알고리즘의 분류 성능 평가: 하이퍼파라미터와 훈련자료 크기의 영향 분석)

  • Kim, Yeseul;Kwak, Geun-Ho;Lee, Kyung-Do;Na, Sang-Il;Park, Chan-Won;Park, No-Wook
    • Korean Journal of Remote Sensing, v.34 no.5, pp.811-827, 2018
  • The purpose of this study is to compare a machine learning algorithm and deep learning algorithms for crop classification using multi-temporal remote sensing data. To this end, the impacts of (1) hyper-parameters and (2) training sample size on machine learning and deep learning algorithms were compared and analyzed for Haenam-gun, Korea and Illinois State, USA. In the comparison experiment, a support vector machine (SVM) was applied as the machine learning algorithm and convolutional neural networks (CNN) were applied as the deep learning algorithms. In particular, a 2D-CNN considering 2-dimensional spatial information and a 3D-CNN that extends the 2D-CNN with a time dimension were applied. The experiment showed that, across the various hyper-parameters considered, the optimal hyper-parameter values of the CNNs were similar between the two study areas, unlike those of the SVM. Based on this result, although optimizing a CNN model takes much time, it appears feasible to apply transfer learning that extends an optimized CNN model to other regions. In the experiments with various training sample sizes, the impact of sample size was larger on the CNNs than on the SVM. In particular, this impact was exaggerated in Illinois State, which has heterogeneous spatial patterns. In addition, 3D-CNN showed the lowest classification performance in Illinois State, which is attributed to over-fitting caused by the complexity of the model: although the training accuracy of the 3D-CNN model was high, its classification performance was relatively degraded by the heterogeneous patterns and noise in the input data. This result implies that a proper classification algorithm should be selected considering the spatial characteristics of the study area. Also, a large number of training samples is necessary to guarantee high classification performance with CNNs, particularly with 3D-CNN.
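
To make the 2D/3D distinction concrete, the sketch below shows a minimal 3D-CNN that convolves over the time axis of a multi-temporal image stack, as opposed to a 2D-CNN that would flatten the acquisition dates into channels. All layer sizes, the band count, and the number of crop classes are our assumptions.

```python
import torch
import torch.nn as nn

class Crop3DCNN(nn.Module):
    """Sketch: treat the multi-temporal stack as a (dates, H, W) volume."""
    def __init__(self, num_bands=4, num_classes=5):      # both assumed
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(num_bands, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),        # pool space, keep the time axis
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):                    # x: (N, bands, dates, H, W)
        return self.net(x)

out = Crop3DCNN()(torch.randn(2, 4, 6, 32, 32))   # 6 acquisition dates -> (2, 5)
```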

One Step Measurements of hippocampal Pure Volumes from MRI Data Using an Ensemble Model of 3-D Convolutional Neural Network

  • Basher, Abol;Ahmed, Samsuddin;Jung, Ho Yub
    • Smart Media Journal, v.9 no.2, pp.22-32, 2020
  • Hippocampal volume atrophy is known to be linked with neuro-degenerative disorders, and it is also one of the most important early biomarkers for Alzheimer's disease detection. Measuring hippocampal pure volumes from Magnetic Resonance Imaging (MRI) is a crucial task, and state-of-the-art methods require a large amount of time. In addition, structural brain development is investigated using MRI data, where brain morphometry (e.g. cortical thickness, volume, surface area, etc.) is one of the significant parts of the analysis. In this study, we propose a patch-based ensemble model of 3-D convolutional neural networks (CNN) to measure hippocampal pure volume from MRI data. 3-D patches were extracted from the volumetric MRI scans to train the proposed 3-D CNN models. The trained models are used to construct the ensemble 3-D CNN model, and the aggregated model predicts the pure volume in one step in the test phase. Our approach takes only 5 seconds to estimate the volumes from an MRI scan. The average errors of the proposed ensemble 3-D CNN model are 11.7±8.8 (error%±STD) and 12.5±12.8 (error%±STD) for the left and right hippocampi of 65 test MRI scans, respectively. A quantitative comparison of the predicted volumes against the ground truth shows that the proposed approach can be used as a proxy for conventional measurement.
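
The one-step ensemble prediction could look like the following sketch: each trained member regresses a volume from its own 3-D patch, and the aggregated model averages the member outputs. The patch size and the way patch locations are assigned to members are assumptions; the paper's exact aggregation may differ.

```python
import torch

def predict_pure_volume(models, scan, patch_coords):
    """Sketch of one-step ensemble volume prediction.

    models: trained 3-D CNN regressors (placeholders here);
    scan: (1, 1, D, H, W) MRI tensor; patch_coords: one (z, y, x) per member.
    """
    preds = []
    for model, (z, y, x) in zip(models, patch_coords):
        patch = scan[..., z:z + 32, y:y + 32, x:x + 32]   # 32^3 patch size: assumed
        with torch.no_grad():
            preds.append(model(patch))                    # scalar volume estimate
    return torch.stack(preds).mean()                      # aggregate in one step
```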

Sketch-based 3D object retrieval using Wasserstein Center Loss (Wasserstein Center 손실을 이용한 스케치 기반 3차원 물체 검색)

  • Ji, Myunggeun;Chun, Junchul;Kim, Namgi
    • Journal of Internet Computing and Services, v.19 no.6, pp.91-99, 2018
  • Sketch-based 3D object retrieval is a convenient way to search for various 3D data using human-drawn sketches as queries. In this paper, we propose a new method that uses a Sketch CNN, a Wasserstein CNN, and a Wasserstein center loss for sketch-based 3D object search. Specifically, the Wasserstein center loss learns the center of each object category and reduces the Wasserstein distance between the center and the features of the same category. The proposed 3D object retrieval is performed as follows. First, the Wasserstein CNN extracts features from 2D images taken from various directions of a 3D object using a CNN, and derives the features of the 3D data by computing the Wasserstein barycenter of the features of each image. Second, the features of the sketch are extracted using a separate Sketch CNN. Finally, the features of the extracted 3D object and the features of the sketch are trained with the proposed Wasserstein center loss. To demonstrate the superiority of the proposed method, we evaluated it on two benchmark datasets, SHREC 13 and SHREC 14; the proposed method shows better performance on all conventional metrics compared to state-of-the-art methods.
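
The loss itself is not given in the abstract, so the following is only a hedged sketch of one plausible reading: a learnable center per category, with an entropic (Sinkhorn) approximation of the Wasserstein distance between softmax-normalized features and their class center. The ground cost, regularization strength, and iteration count are our choices, not the paper's formulation.

```python
import torch

def sinkhorn_distance(a, b, cost, eps=0.1, iters=50):
    """Entropic OT distance between probability vectors a (n,) and b (m,)."""
    K = torch.exp(-cost / eps)
    u = torch.ones_like(a)
    for _ in range(iters):
        v = b / (K.t() @ u)
        u = a / (K @ v)
    P = u.unsqueeze(1) * K * v.unsqueeze(0)   # transport plan
    return (P * cost).sum()

class WassersteinCenterLoss(torch.nn.Module):
    """Hypothetical reading of the loss: pull features toward their class center."""
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = torch.nn.Parameter(torch.randn(num_classes, feat_dim))
        d = torch.arange(feat_dim, dtype=torch.float32)
        self.register_buffer("cost", (d.unsqueeze(1) - d.unsqueeze(0)).abs())

    def forward(self, feats, labels):
        # Treat softmax-normalized features/centers as discrete distributions.
        loss = 0.0
        for f, y in zip(feats, labels):
            a = torch.softmax(f, dim=0)
            b = torch.softmax(self.centers[y], dim=0)
            loss = loss + sinkhorn_distance(a, b, self.cost)
        return loss / len(feats)
```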

A Sketch-based 3D Object Retrieval Approach for Augmented Reality Models Using Deep Learning

  • Ji, Myunggeun;Chun, Junchul
    • Journal of Internet Computing and Services, v.21 no.1, pp.33-43, 2020
  • Retrieving a 3D model from a 3D database and simultaneously augmenting the retrieved model in an Augmented Reality (AR) system has become an issue in conveniently developing plausible AR environments. Sketch-based 3D object retrieval is an intuitive way to search for 3D objects using human-drawn sketches as queries. In this paper, we propose a novel deep-learning-based approach for retrieving a sketch-based 3D object as an Augmented Reality model. For this work, we introduce a method that uses a Sketch CNN, a Wasserstein CNN, and a Wasserstein center loss for retrieving a sketch-based 3D object. In particular, the Wasserstein center loss is used to learn the center of each object category and to reduce the Wasserstein distance between the center and the features of the same category. The proposed 3D object retrieval and augmentation consist of three major steps. First, the Wasserstein CNN extracts features from 2D images taken from various directions of a 3D object using a CNN, and derives the features of the 3D data by computing the Wasserstein barycenter of the features of each image. Second, the features of the sketch are extracted using a separate Sketch CNN. Finally, we adopt a sketch-based object matching method to localize the natural marker in the images and register a 3D virtual object in the AR system. Using the detected marker, the retrieved 3D virtual object is augmented in the AR system automatically. Through experiments, we show that the proposed method is efficient for retrieving and augmenting objects.
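
Once gallery features (the Wasserstein barycenters of the multi-view CNN features) and the query sketch feature are available, the retrieval step reduces to a ranking; the sketch below illustrates one with cosine similarity. The similarity measure and feature dimension are our assumptions, not the paper's specification.

```python
import torch
import torch.nn.functional as F

def retrieve_top_k(sketch_feat, gallery_feats, k=5):
    """Rank 3D-object features by cosine similarity to a sketch feature.

    sketch_feat: (d,); gallery_feats: (num_objects, d).
    Returns the ids of the 3D models to augment, best first.
    """
    sims = F.cosine_similarity(sketch_feat.unsqueeze(0), gallery_feats, dim=1)
    scores, indices = sims.topk(k)
    return indices, scores

# Illustrative 128-dim features for a gallery of 1000 objects.
ids, scores = retrieve_top_k(torch.randn(128), torch.randn(1000, 128))
```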

A Study of Video-Based Abnormal Behavior Recognition Model Using Deep Learning

  • Lee, Jiyoo;Shin, Seung-Jung
    • International journal of advanced smart convergence, v.9 no.4, pp.115-119, 2020
  • Recently, CCTV installations have been rapidly increasing in the public and private sectors to prevent various crimes. In line with the increasing number of CCTVs, video-based abnormal behavior detection in control systems is one of the key technologies for safety, because it is difficult for the surveillance personnel who monitor multiple CCTVs to manually track all abnormal behaviors in the video. To solve this problem, research on recognizing abnormal behavior using deep learning is being actively conducted. In this paper, we propose a model for detecting abnormal behavior based on deep learning models that are currently widely used. Using the abnormal behavior video data provided by AI Hub, we performed a comparative experiment to detect anomalous behavior by learning violence and fainting in videos with 2D CNN-LSTM, 3D CNN, and I3D models. We hope that the experimental results of this abnormal behavior learning model will be helpful in developing intelligent CCTV.
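
Of the three compared models, the 2D CNN-LSTM is the simplest to outline: a shared 2D CNN encodes each frame and an LSTM aggregates the per-frame features before a normal/abnormal head. The sketch below is a minimal version with illustrative layer sizes, not the configuration used in the experiments.

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Minimal 2D CNN-LSTM sketch for clip-level behavior classification."""
    def __init__(self, feat_dim=128, num_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(                 # shared per-frame encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, 64, batch_first=True)
        self.head = nn.Linear(64, num_classes)

    def forward(self, clip):                      # clip: (N, T, 3, H, W)
        n, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(n, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])              # classify from the last step

logits = CNNLSTM()(torch.randn(2, 16, 3, 112, 112))   # 16-frame clips -> (2, 2)
```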

Real-time 3D Pose Estimation of Both Human Hands via RGB-Depth Camera and Deep Convolutional Neural Networks (RGB-Depth 카메라와 Deep Convolution Neural Networks 기반의 실시간 사람 양손 3D 포즈 추정)

  • Park, Na Hyeon;Ji, Yong Bin;Gi, Geon;Kim, Tae Yeon;Park, Hye Min;Kim, Tae-Seong
    • Proceedings of the Korea Information Processing Society Conference, pp.686-689, 2018
  • 3D hand pose estimation (HPE) is an important technology for smart human-computer interfaces. This study presents a deep-learning-based hand pose estimation system that recognizes the 3D pose of both hands in real time from a single RGB-Depth camera. The system consists of four stages. First, both hands are detected and extracted from the RGB and depth images using skin detection and depth-cutting algorithms. Second, a convolutional neural network (CNN) classifier is used to distinguish the right hand from the left; it consists of three convolution layers and two fully connected layers and takes the extracted depth image as input. Third, a trained CNN regressor, composed of multiple convolutional, pooling, and fully connected layers, estimates the hand joints from the extracted depth images of the left and right hands. The CNN classifier and regressor are trained on a dataset of 22,000 depth images. Finally, the 3D pose of each hand is reconstructed from the estimated joint information. In tests, the CNN classifier distinguished the right and left hands with 96.9% accuracy, and the CNN regressor estimated the 3D hand joints with an average error of 8.48 mm. The proposed hand pose estimation system can be used in a variety of applications, including virtual reality (VR), augmented reality (AR), and mixed reality (MR).
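
The first stage can be illustrated with a simple depth cut combined with an HSV skin mask, as in the hedged sketch below; the depth band and skin thresholds are our placeholders, not the values used in the study.

```python
import numpy as np
import cv2

def extract_hands(rgb, depth, near=400, far=800):
    """Sketch of hand extraction via depth cutting plus a skin mask.

    rgb: HxWx3 uint8 (BGR); depth: HxW uint16 in millimetres.
    near/far and HSV bounds are illustrative placeholders.
    """
    depth_mask = ((depth > near) & (depth < far)).astype(np.uint8)  # near-range band
    hsv = cv2.cvtColor(rgb, cv2.COLOR_BGR2HSV)
    skin_mask = cv2.inRange(hsv, (0, 30, 60), (25, 180, 255)) // 255
    mask = depth_mask & skin_mask
    return rgb * mask[..., None], depth * mask    # masked inputs for the CNNs
```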

Spatial-temporal Ensemble Method for Action Recognition (행동 인식을 위한 시공간 앙상블 기법)

  • Seo, Minseok;Lee, Sangwoo;Choi, Dong-Geol
    • The Journal of Korea Robotics Society, v.15 no.4, pp.385-391, 2020
  • As deep learning technology has developed and been applied to various fields, human behavior recognition is gradually shifting from single-image-based applications to video-based applications with a time axis. However, unlike a 2D CNN on a single image, a 3D CNN on video incurs a very large increase in computation and parameters due to the added time axis, so improving accuracy in action recognition is more difficult than for a single image. To solve this problem, we investigate and analyze various techniques that improve the performance of 3D CNN-based video recognition without additional training time or parameter increase. We propose a temporal ensemble that uses the time axis unique to videos, as well as an ensemble over the input frames. With a combination of techniques, we achieved an accuracy improvement of up to 7.1% over the existing performance, and we also revealed the trade-off between computation and accuracy.
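
A minimal form of the proposed time-axis ensemble is sketched below: the same trained 3D CNN is run over several temporally shifted clips of one video and the softmax outputs are averaged, adding no parameters and no extra training. Clip length and stride are our assumptions.

```python
import torch

def temporal_ensemble(model, video, clip_len=16, stride=8):
    """Average a 3D CNN's softmax outputs over temporally shifted clips.

    video: (1, C, T, H, W); returns averaged class probabilities.
    """
    T = video.shape[2]
    probs = []
    with torch.no_grad():
        for start in range(0, T - clip_len + 1, stride):
            clip = video[:, :, start:start + clip_len]
            probs.append(torch.softmax(model(clip), dim=1))
    return torch.stack(probs).mean(0)   # no new parameters, no retraining
```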

Application Research on Obstruction Area Detection of Building Wall using R-CNN Technique (R-CNN 기법을 이용한 건물 벽 폐색영역 추출 적용 연구)

  • Kim, Hye Jin;Lee, Jeong Min;Bae, Kyoung Ho;Eo, Yang Dam
    • Journal of Cadastre & Land InformatiX, v.48 no.2, pp.213-225, 2018
  • When constructing three-dimensional (3D) spatial information, an occlusion problem arises in the process of capturing building textures. To solve this problem, it is necessary to investigate automated methods that recognize the occlusion region, extract it, and automatically complement the texture. In practice, a very large number of structures and occlusions can occur, so alternatives to overcome this are being considered. In this study, we apply a recently emerging deep learning algorithm to automatically detect occlusion regions, based on learning the patterns of blocked regions. Experiments were conducted to evaluate the automatic detection of people, banners, vehicles, and traffic lights that cause occlusion on building walls, using two advanced algorithms of the Convolutional Neural Network (CNN) family: Faster Region-based Convolutional Neural Network (Faster R-CNN) and Mask R-CNN. The detection results obtained by training the pre-trained Mask R-CNN model on banners were found to be excellent.
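
As a starting point for reproducing the detection step, the sketch below runs torchvision's off-the-shelf Mask R-CNN; its COCO weights already cover people, vehicles, and traffic lights, while banners would require fine-tuning as in the study. The confidence threshold is our choice.

```python
import torch
import torchvision

# Pre-trained Mask R-CNN with COCO weights, as a stand-in for the fine-tuned model.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

image = torch.rand(3, 600, 800)                 # facade photo, CHW in [0, 1]
with torch.no_grad():
    out = model([image])[0]                     # dict: boxes, labels, scores, masks

keep = out["scores"] > 0.7                      # keep confident occluders only
occlusion_masks = out["masks"][keep] > 0.5      # binary masks for texture complement
```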