• Title/Summary/Keyword: 3D convolutional neural network

Learning-Based Multiple Pooling Fusion in Multi-View Convolutional Neural Network for 3D Model Classification and Retrieval

  • Zeng, Hui;Wang, Qi;Li, Chen;Song, Wei
    • Journal of Information Processing Systems / v.15 no.5 / pp.1179-1191 / 2019
  • We design a view-pooling method named learning-based multiple pooling fusion (LMPF) and apply it to the multi-view convolutional neural network (MVCNN) for 3D model classification and retrieval. With this method, the multi-view feature maps projected from a 3D model can be compiled into a simple and effective feature descriptor. LMPF fuses max pooling and mean pooling by learning a set of optimal weights. Compared with hand-crafted approaches such as max pooling and mean pooling, LMPF reduces information loss effectively because the fusion weights are learned. Experiments on the ModelNet40 and McGill datasets verify that LMPF substantially outperforms these earlier methods.
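
The fusion idea in the abstract above can be illustrated with a minimal sketch. It assumes the fusion is a weighted sum of element-wise max pooling and mean pooling across views; the abstract does not give the exact form, so the function name `fuse_view_pooling` and the two scalar weights `w_max` and `w_mean` are hypothetical, and in the paper's setting the weights would be learned rather than fixed by hand.

```python
from typing import List


def fuse_view_pooling(views: List[List[float]], w_max: float, w_mean: float) -> List[float]:
    """Fuse element-wise max and mean pooling across views with two weights.

    Assumption: a plain weighted sum; the weights would normally be learned.
    """
    if not views:
        raise ValueError("at least one view is required")
    dim = len(views[0])
    if any(len(v) != dim for v in views):
        raise ValueError("all views must have the same feature length")

    fused = []
    for d in range(dim):
        column = [v[d] for v in views]           # feature d across all views
        max_pooled = max(column)                 # max pooling over views
        mean_pooled = sum(column) / len(column)  # mean pooling over views
        fused.append(w_max * max_pooled + w_mean * mean_pooled)
    return fused


# Two views with two features each; fixed weights stand in for learned ones.
views = [[1.0, 4.0], [3.0, 0.0]]
descriptor = fuse_view_pooling(views, w_max=0.5, w_mean=0.5)
print(descriptor)  # [2.5, 3.0]
```

Setting `w_max=1.0, w_mean=0.0` recovers plain max pooling, which is why a learned weighting can do no worse than either hand-crafted choice on the training objective.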

Depth Image Restoration Using Generative Adversarial Network (Generative Adversarial Network를 이용한 손실된 깊이 영상 복원)

  • Nah, John Junyeop;Sim, Chang Hun;Park, In Kyu
    • Journal of Broadcast Engineering / v.23 no.5 / pp.614-621 / 2018
  • This paper proposes a method for restoring corrupted depth images captured by a depth camera through unsupervised learning with a generative adversarial network (GAN). The proposed method generates restored face depth images using a 3D morphable model convolutional neural network (3DMM CNN) with the large-scale CelebFaces Attributes (CelebA) and FaceWarehouse datasets to train a deep convolutional generative adversarial network (DCGAN). The generator and discriminator are trained in a minimax game with the Wasserstein distance as the loss function. The DCGAN then restores the missing parts of captured facial depth images through a further learning procedure that uses the trained generator and a new loss function.

1D CNN and Machine Learning Methods for Fall Detection (1D CNN과 기계 학습을 사용한 낙상 검출)

  • Kim, Inkyung;Kim, Daehee;Noh, Song;Lee, Jaekoo
    • KIPS Transactions on Software and Data Engineering / v.10 no.3 / pp.85-90 / 2021
  • This paper considers fall detection for older people using individual wearable devices. To design a low-cost wearable device for reliable fall detection, we present a comprehensive analysis of two representative models. One is a machine learning model composed of a decision tree, a random forest, and a support vector machine (SVM). The other is a deep learning model based on a one-dimensional (1D) convolutional neural network (CNN). We also evaluate the validity of these models with respect to the data segmentation, preprocessing, and feature extraction methods applied to the input data. Simulation results verify the efficacy of the deep learning model, which shows improved overall performance.

Enhancing Alzheimer's Disease Classification using 3D Convolutional Neural Network and Multilayer Perceptron Model with Attention Network

  • Enoch A. Frimpong;Zhiguang Qin;Regina E. Turkson;Bernard M. Cobbinah;Edward Y. Baagyere;Edwin K. Tenagyei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.11 / pp.2924-2944 / 2023
  • Alzheimer's disease (AD) is a neurological condition recognized as one of the primary causes of memory loss. AD currently has no cure, so an efficient, high-precision model for timely detection of the disease is essential. When AD is detected early, treatment is most likely to succeed. The most often utilized indicators for AD identification are the Mini-Mental State Examination (MMSE) and the clinical dementia. However, using these indicators as ground-truth labels could be imprecise for AD detection. Researchers have proposed several computer-aided frameworks, and lately supervised models have been the most widely used. In this study, we propose a novel 3D Convolutional Neural Network Multilayer Perceptron (3D CNN-MLP) based model for AD classification. The model uses an attention mechanism to automatically extract relevant features from magnetic resonance images (MRI) and generate probability maps, which serve as input to the MLP classifier. Three MRI scan categories were considered: AD dementia patients, mild cognitive impairment (MCI) patients, and normal control (NC), or healthy, subjects. The performance of the model is assessed by comparing it with a basic CNN, VGG16, DenseNet models, and other state-of-the-art works. The comparison models were adjusted to fit the 3D images before the comparison was done. Our model exhibited excellent classification performance, with an accuracy of 91.27% for AD versus NC, 80.85% for MCI versus NC, and 87.34% for AD versus MCI.

Customized AI Exercise Recommendation Service for the Balanced Physical Activity (균형적인 신체활동을 위한 맞춤형 AI 운동 추천 서비스)

  • Chang-Min Kim;Woo-Beom Lee
    • Journal of the Institute of Convergence Signal Processing / v.23 no.4 / pp.234-240 / 2022
  • This paper proposes a customized AI exercise recommendation service that balances the relative amount of exercise according to the working environment of each occupation. The WISDM database, collected with acceleration and gyro sensors, is a dataset that classifies physical activities into 18 categories. Our system groups the 18 physical activities into 3 activity types (whole body, upper body, and lower body) and recommends an adaptive exercise based on the analyzed activity type. A one-dimensional (1D) convolutional neural network is used to classify physical activity. The proposed model is composed of convolution blocks in which 1D convolution layers with kernels of various sizes are connected in parallel. By applying multiple 1D convolution layers to the input pattern, the convolution blocks effectively extract the detailed local features that deep neural network models can capture. In a performance comparison with a previous recurrent neural network, our method achieved 98.4% accuracy.
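
The parallel convolution block described in the abstract above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' architecture: the helper names `conv1d_same` and `parallel_conv_block`, the zero padding, and the hand-picked kernels are all hypothetical, and a real model would learn the kernel values and stack many such blocks.

```python
from typing import List


def conv1d_same(signal: List[float], kernel: List[float]) -> List[float]:
    """Zero-padded 1D cross-correlation that keeps the input length.

    Assumption: odd kernel sizes only, so the padding is symmetric.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel size must be odd")
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def parallel_conv_block(signal: List[float], kernels: List[List[float]]) -> List[List[float]]:
    """Apply kernels of different sizes to the same input and stack the results as channels."""
    return [conv1d_same(signal, kernel) for kernel in kernels]


# One sensor channel, two parallel branches with kernel sizes 1 and 3.
signal = [1.0, 2.0, 3.0, 4.0]
channels = parallel_conv_block(signal, [[1.0], [1.0, 1.0, 1.0]])
print(channels)  # [[1.0, 2.0, 3.0, 4.0], [3.0, 6.0, 9.0, 7.0]]
```

Because every branch preserves the input length, the outputs can be stacked directly, with the small kernel keeping sample-level detail and the larger one summarizing a wider window.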

A Deep Convolutional Neural Network Based 6-DOF Relocalization with Sensor Fusion System (센서 융합 시스템을 이용한 심층 컨벌루션 신경망 기반 6자유도 위치 재인식)

  • Jo, HyungGi;Cho, Hae Min;Lee, Seongwon;Kim, Euntai
    • The Journal of Korea Robotics Society / v.14 no.2 / pp.87-93 / 2019
  • This paper presents a 6-DOF relocalization method using a 3D laser scanner and a monocular camera. The relocalization problem in robotics is to estimate the pose of a sensor when a robot revisits an area. A deep convolutional neural network (CNN) is designed to regress the 6-DOF sensor pose and is trained end to end using both RGB image and 3D point cloud information. We generate a new input that consists of RGB and range information. After training, the relocalization system outputs the sensor pose corresponding to each new input it receives. In most cases, however, a mobile robot navigation system receives successive sensor measurements. To improve localization performance, the CNN output is used as the measurement for a particle filter that smooths the trajectory. We evaluate our relocalization method on real-world datasets using a mobile robot platform.

Development of Combined Architecture of Multiple Deep Convolutional Neural Networks for Improving Video Face Identification (비디오 얼굴 식별 성능개선을 위한 다중 심층합성곱신경망 결합 구조 개발)

  • Kim, Kyeong Tae;Choi, Jae Young
    • Journal of Korea Multimedia Society / v.22 no.6 / pp.655-664 / 2019
  • In this paper, we propose a novel way of combining multiple deep convolutional neural network (DCNN) architectures for accurate video face identification by adopting a serial combination of 3D and 2D DCNNs. The proposed method first divides an input video sequence (to be recognized) into a number of sub-video sequences. The resulting sub-video sequences are used as input to the 3D DCNN to obtain class-confidence scores for the input video sequence, taking into account both its temporal and spatial face feature characteristics. The class-confidence scores obtained from the sub-video sequences are combined to form our proposed class-confidence matrix. The resulting class-confidence matrix is then used as input for training the 2D DCNN, which is serially linked to the 3D DCNN. Finally, the fine-tuned, serially combined DCNN framework is applied to recognize the identity present in a given test video sequence. To verify the effectiveness of the proposed method, extensive comparative experiments were conducted on the COX face databases with their standard face identification protocols. Experimental results show that our method achieves a better or comparable identification rate compared with other state-of-the-art video face recognition (FR) methods.
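
The step of forming a class-confidence matrix from sub-video sequences can be sketched as follows. This is only a data-flow illustration: the non-overlapping split, the function names, and `toy_scorer` (a stand-in for the 3D DCNN, here scoring integer "frames" by parity) are assumptions made for the example and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def split_into_subsequences(frames: Sequence, length: int) -> List[Sequence]:
    """Split a frame sequence into non-overlapping sub-sequences of equal length.

    Assumption: trailing frames that do not fill a sub-sequence are dropped.
    """
    return [frames[i:i + length] for i in range(0, len(frames) - length + 1, length)]


def class_confidence_matrix(
    frames: Sequence,
    length: int,
    score_fn: Callable[[Sequence], List[float]],
) -> List[List[float]]:
    """Stack per-sub-sequence class-confidence vectors into a matrix.

    Rows are sub-video sequences, columns are classes.
    """
    return [score_fn(sub) for sub in split_into_subsequences(frames, length)]


def toy_scorer(sub: Sequence) -> List[float]:
    """Hypothetical stand-in for the 3D DCNN: two 'classes' scored by frame parity."""
    even = sum(1 for frame in sub if frame % 2 == 0)
    return [even / len(sub), (len(sub) - even) / len(sub)]


frames = [0, 2, 4, 1, 3, 5, 0, 1]  # integers stand in for video frames
matrix = class_confidence_matrix(frames, length=3, score_fn=toy_scorer)
print(matrix)  # [[1.0, 0.0], [0.0, 1.0]]
```

The resulting matrix is a small 2D array, which is what makes it a natural input for a second-stage 2D network.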

Local Feature Map Using Triangle Area and Variation for Efficient Learning of 3D Mesh (3차원 메쉬의 효율적인 학습을 위한 삼각형의 면적과 변화를 이용한 로컬 특징맵)

  • Na, Hong Eun;Kim, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.573-576 / 2022
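  • In this paper, we present a new learning representation technique that can improve the accuracy of convolutional neural networks (CNNs) on 3D meshes composed of triangles. Training is based on the areas of the triangles that make up the mesh and on their local features. In general, deep learning refers to techniques that connect artificial neural networks in many layers, and its main targets have been audio files and images. As research on artificial intelligence has continued, 3D deep learning has been introduced, but unlike conventional learning, data for 3D learning is not easy to obtain. As the 3D modeling market has grown with the mixed reality and metaverse markets, technological advances have provided ways to acquire data, but applying 3D data directly as a learning representation remains difficult. This paper therefore proposes a learning technique with higher accuracy than existing methods, based on the triangle mesh structure, a data format used in industry.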
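
The per-triangle area that the abstract above uses as a learning feature can be computed with the standard cross-product formula. The sketch below shows only that area computation; the function names and the indexed vertex/face layout are assumptions for illustration, and the paper's additional local and variation features are not reproduced here.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def triangle_area(a: Vec3, b: Vec3, c: Vec3) -> float:
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    ab = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    ac = (c[0] - a[0], c[1] - a[1], c[2] - a[2])
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def face_areas(vertices: List[Vec3], faces: List[Tuple[int, int, int]]) -> List[float]:
    """One area value per triangular face of an indexed mesh."""
    return [triangle_area(vertices[i], vertices[j], vertices[k]) for i, j, k in faces]


# A unit square in the z = 0 plane, split into two triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
faces = [(0, 1, 2), (1, 3, 2)]
print(face_areas(vertices, faces))  # [0.5, 0.5]
```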

Application Research on Obstruction Area Detection of Building Wall using R-CNN Technique (R-CNN 기법을 이용한 건물 벽 폐색영역 추출 적용 연구)

  • Kim, Hye Jin;Lee, Jeong Min;Bae, Kyoung Ho;Eo, Yang Dam
    • Journal of Cadastre & Land InformatiX / v.48 no.2 / pp.213-225 / 2018
  • When constructing three-dimensional (3D) spatial information, occlusion regions become a problem in the process of capturing building textures. Solving this problem requires an automated method that recognizes the occlusion region, extracts it, and complements the texture automatically. In practice, a very large number of structures and occlusions can occur, so alternatives for overcoming them are being considered. In this study, we apply a learning-based approach that automatically detects occlusion regions by learning the patterns of blocked regions with recently emerging deep learning algorithms. We experimentally evaluate the automatic detection of people, banners, vehicles, and traffic lights, which cause occlusion on building walls, using two advanced convolutional neural network (CNN) techniques: Faster Region-based Convolutional Neural Network (Faster R-CNN) and Mask R-CNN. The automatic detection results obtained by additionally learning banners on the pre-trained Mask R-CNN model were found to be excellent.

A Review of 3D Object Tracking Methods Using Deep Learning (딥러닝 기술을 이용한 3차원 객체 추적 기술 리뷰)

  • Park, Hanhoon
    • Journal of the Institute of Convergence Signal Processing / v.22 no.1 / pp.30-37 / 2021
  • Accurate 3D object tracking with camera images is a key enabling technology for augmented reality applications. Motivated by the impressive success of convolutional neural networks (CNNs) in computer vision tasks such as image classification, object detection, and image segmentation, recent studies on 3D object tracking have focused on leveraging deep learning. In this paper, we review deep learning approaches to 3D object tracking. We describe key methods in this field and discuss potential future research directions.