• Title/Summary/Keyword: 3D Convolutional Neural Network (3D CNN)

Search Result 71, Processing Time 0.029 seconds

Depth Image Restoration Using Generative Adversarial Network (Generative Adversarial Network를 이용한 손실된 깊이 영상 복원)

  • Nah, John Junyeop;Sim, Chang Hun;Park, In Kyu
    • Journal of Broadcast Engineering
    • /
    • v.23 no.5
    • /
    • pp.614-621
    • /
    • 2018
  • This paper proposes a method of restoring corrupted depth image captured by depth camera through unsupervised learning using generative adversarial network (GAN). The proposed method generates restored face depth images using 3D morphable model convolutional neural network (3DMM CNN) with large-scale CelebFaces Attribute (CelebA) and FaceWarehouse dataset for training deep convolutional generative adversarial network (DCGAN). The generator and discriminator equip with Wasserstein distance for loss function by utilizing minimax game. Then the DCGAN restore the loss of captured facial depth images by performing another learning procedure using trained generator and new loss function.

CNN Based 2D and 2.5D Face Recognition For Home Security System (홈보안 시스템을 위한 CNN 기반 2D와 2.5D 얼굴 인식)

  • MaYing, MaYing;Kim, Kang-Chul
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.14 no.6
    • /
    • pp.1207-1214
    • /
    • 2019
  • Technologies of the 4th industrial revolution have been unknowingly seeping into our lives. Many IoT based home security systems are using the convolutional neural network(CNN) as good biometrics to recognize a face and protect home and family from intruders since CNN has demonstrated its excellent ability in image recognition. In this paper, three layouts of CNN for 2D and 2.5D image of small dataset with various input image size and filter size are explored. The simulation results show that the layout of CNN with 50*50 input size of 2.5D image, 2 convolution and max pooling layer, and 3*3 filter size for small dataset of 2.5D image is optimal for a home security system with recognition accuracy of 0.966. In addition, the longest CPU time consumption for one input image is 0.057S. The proposed layout of CNN for a face recognition is suitable to control the actuators in the home security system because a home security system requires good face recognition and short recognition time.

One Step Measurements of hippocampal Pure Volumes from MRI Data Using an Ensemble Model of 3-D Convolutional Neural Network

  • Basher, Abol;Ahmed, Samsuddin;Jung, Ho Yub
    • Smart Media Journal
    • /
    • v.9 no.2
    • /
    • pp.22-32
    • /
    • 2020
  • The hippocampal volume atrophy is known to be linked with neuro-degenerative disorders and it is also one of the most important early biomarkers for Alzheimer's disease detection. The measurements of hippocampal pure volumes from Magnetic Resonance Imaging (MRI) is a crucial task and state-of-the-art methods require a large amount of time. In addition, the structural brain development is investigated using MRI data, where brain morphometry (e.g. cortical thickness, volume, surface area etc.) study is one of the significant parts of the analysis. In this study, we have proposed a patch-based ensemble model of 3-D convolutional neural network (CNN) to measure the hippocampal pure volume from MRI data. The 3-D patches were extracted from the volumetric MRI scans to train the proposed 3-D CNN models. The trained models are used to construct the ensemble 3-D CNN model and the aggregated model predicts the pure volume in one-step in the test phase. Our approach takes only 5 seconds to estimate the volumes from an MRI scan. The average errors for the proposed ensemble 3-D CNN model are 11.7±8.8 (error%±STD) and 12.5±12.8 (error%±STD) for the left and right hippocampi of 65 test MRI scans, respectively. The quantitative study on the predicted volumes over the ground truth volumes shows that the proposed approach can be used as a proxy.

A Study on Combine Artificial Intelligence Models for multi-classification for an Abnormal Behaviors in CCTV images (CCTV 영상의 이상행동 다중 분류를 위한 결합 인공지능 모델에 관한 연구)

  • Lee, Hongrae;Kim, Youngtae;Seo, Byung-suk
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2022.05a
    • /
    • pp.498-500
    • /
    • 2022
  • CCTV protects people and assets safely by identifying dangerous situations and responding promptly. However, it is difficult to continuously monitor the increasing number of CCTV images. For this reason, there is a need for a device that continuously monitors CCTV images and notifies when abnormal behavior occurs. Recently, many studies using artificial intelligence models for image data analysis have been conducted. This study simultaneously learns spatial and temporal characteristic information between image data to classify various abnormal behaviors that can be observed in CCTV images. As an artificial intelligence model used for learning, we propose a multi-classification deep learning model that combines an end-to-end 3D convolutional neural network(CNN) and ResNet.

  • PDF

Depth Map Estimation Model Using 3D Feature Volume (3차원 특징볼륨을 이용한 깊이영상 생성 모델)

  • Shin, Soo-Yeon;Kim, Dong-Myung;Suh, Jae-Won
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.11
    • /
    • pp.447-454
    • /
    • 2018
  • This paper proposes a depth image generation algorithm of stereo images using a deep learning model composed of a CNN (convolutional neural network). The proposed algorithm consists of a feature extraction unit which extracts the main features of each parallax image and a depth learning unit which learns the parallax information using extracted features. First, the feature extraction unit extracts a feature map for each parallax image through the Xception module and the ASPP(Atrous spatial pyramid pooling) module, which are composed of 2D CNN layers. Then, the feature map for each parallax is accumulated in 3D form according to the time difference and the depth image is estimated after passing through the depth learning unit for learning the depth estimation weight through 3D CNN. The proposed algorithm estimates the depth of object region more accurately than other algorithms.

Customized AI Exercise Recommendation Service for the Balanced Physical Activity (균형적인 신체활동을 위한 맞춤형 AI 운동 추천 서비스)

  • Chang-Min Kim;Woo-Beom Lee
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.23 no.4
    • /
    • pp.234-240
    • /
    • 2022
  • This paper proposes a customized AI exercise recommendation service for balancing the relative amount of exercise according to the working environment by each occupation. WISDM database is collected by using acceleration and gyro sensors, and is a dataset that classifies physical activities into 18 categories. Our system recommends a adaptive exercise using the analyzed activity type after classifying 18 physical activities into 3 physical activities types such as whole body, upper body and lower body. 1 Dimensional convolutional neural network is used for classifying a physical activity in this paper. Proposed model is composed of a convolution blocks in which 1D convolution layers with a various sized kernel are connected in parallel. Convolution blocks can extract a detailed local features of input pattern effectively that can be extracted from deep neural network models, as applying multi 1D convolution layers to input pattern. To evaluate performance of the proposed neural network model, as a result of comparing the previous recurrent neural network, our method showed a remarkable 98.4% accuracy.

CNN Architecture for Accurately and Efficiently Learning a 3D Triangular Mesh (3차원 삼각형 메쉬를 정확하고 효율적으로 학습하기 위한 CNN 아키텍처)

  • Hong Eun Na;Jong-Hyun Kim
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2023.01a
    • /
    • pp.369-372
    • /
    • 2023
  • 본 논문에서는 삼각형 구조로 구성된 3차원 메쉬(Mesh)에서 합성곱 신경망(Convolution Neural Network, CNN)을 응용하여 정확도가 높은 새로운 학습 표현 기법을 제시한다. 우리는 메쉬를 구성하고 있는 폴리곤의 edge와 face의 로컬 특징을 기반으로 학습을 진행한다. 일반적으로 딥러닝은 인공신경망을 수많은 계층 형태로 연결한 기법을 말하며, 주요 처리 대상은 1, 2차원 데이터 형태인 오디오 파일과 이미지였다. 인공지능에 대한 연구가 지속되면서 3차원 딥러닝이 도입되었지만, 기존의 학습과는 달리 3차원 딥러닝은 데이터의 확보가 쉽지 않다. 혼합현실과 메타버스 시장의 확대로 인해 3차원 모델링 시장이 증가하고, 기술의 발전으로 데이터를 획득할 수 있는 방법이 생겼지만, 3차원 데이터를 직접적으로 학습에 이용하는 방식으로 적용하는 것은 쉽지 않다. 그렇게 때문에 본 논문에서는 산업 현장에서 이용되는 데이터인 메쉬 구조를 폴리곤의 최소 단위인 삼각형 형태로 구성하여 학습 데이터를 구성해 기존의 방법보다 정확도가 높은 학습 기법을 제안한다.

  • PDF

A Review of 3D Object Tracking Methods Using Deep Learning (딥러닝 기술을 이용한 3차원 객체 추적 기술 리뷰)

  • Park, Hanhoon
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.22 no.1
    • /
    • pp.30-37
    • /
    • 2021
  • Accurate 3D object tracking with camera images is a key enabling technology for augmented reality applications. Motivated by the impressive success of convolutional neural networks (CNNs) in computer vision tasks such as image classification, object detection, image segmentation, recent studies for 3D object tracking have focused on leveraging deep learning. In this paper, we review deep learning approaches for 3D object tracking. We describe key methods in this field and discuss potential future research directions.

A Deep Convolutional Neural Network Based 6-DOF Relocalization with Sensor Fusion System (센서 융합 시스템을 이용한 심층 컨벌루션 신경망 기반 6자유도 위치 재인식)

  • Jo, HyungGi;Cho, Hae Min;Lee, Seongwon;Kim, Euntai
    • The Journal of Korea Robotics Society
    • /
    • v.14 no.2
    • /
    • pp.87-93
    • /
    • 2019
  • This paper presents a 6-DOF relocalization using a 3D laser scanner and a monocular camera. A relocalization problem in robotics is to estimate pose of sensor when a robot revisits the area. A deep convolutional neural network (CNN) is designed to regress 6-DOF sensor pose and trained using both RGB image and 3D point cloud information in end-to-end manner. We generate the new input that consists of RGB and range information. After training step, the relocalization system results in the pose of the sensor corresponding to each input when a new input is received. However, most of cases, mobile robot navigation system has successive sensor measurements. In order to improve the localization performance, the output of CNN is used for measurements of the particle filter that smooth the trajectory. We evaluate our relocalization method on real world datasets using a mobile robot platform.

Sketch-based 3D object retrieval using Wasserstein Center Loss (Wasserstein Center 손실을 이용한 스케치 기반 3차원 물체 검색)

  • Ji, Myunggeun;Chun, Junchul;Kim, Namgi
    • Journal of Internet Computing and Services
    • /
    • v.19 no.6
    • /
    • pp.91-99
    • /
    • 2018
  • Sketch-based 3D object retrieval is a convenient way to search for various 3D data using human-drawn sketches as query. In this paper, we propose a new method of using Sketch CNN, Wasserstein CNN and Wasserstein center loss for sketch-based 3D object search. Specifically, Wasserstein center loss is a method of learning the center of each object category and reducing the Wasserstein distance between center and features of the same category. To do this, the proposed 3D object retrieval is performed as follows. Firstly, Wasserstein CNN extracts 2D images taken from various directions of 3D object using CNN, and extracts features of 3D data by computing the Wasserstein barycenters of features of each image. Secondly, the features of the sketch are extracted using a separate Sketch CNN. Finally, we learn the features of the extracted 3D object and the features of the sketch using the proposed Wasserstein center loss. In order to demonstrate the superiority of the proposed method, we evaluated two sets of benchmark data sets, SHREC 13 and SHREC 14, and the proposed method shows better performance in all conventional metrics compared to the state of the art methods.