• Title/Summary/Keyword: 3D-CNN

A Study on a Mask R-CNN-Based Diagnostic System Measuring DDH Angles on Ultrasound Scans (다중 트레이닝 기법을 이용한 MASK R-CNN의 초음파 DDH 각도 측정 진단 시스템 연구)

  • Hwang, Seok-Min;Lee, Si-Wook;Lee, Jong-Ha
    • Journal of the Institute of Convergence Signal Processing / v.21 no.4 / pp.183-194 / 2020
  • Recently, the number of cases of developmental dysplasia of the hip (DDH) occurring during infant and child growth has been increasing. DDH should be detected and treated as early as possible because it hinders infant growth and causes many other side effects. In this study, two modelling techniques were used for multiple training. Based on the results after the first transformation, the training was designed to be possible even with a small amount of data. Vertical flip, rotation, and width and height shift functions were used to improve the efficiency of the model. Adam optimization was applied for parameter learning, with the learning rate initially set at 2.0 x 10e-4. Training was stopped when the validation loss was at its minimum. A novel image overlay system using a 3D laser scanner and a non-rigid registration method is implemented, and its accuracy is evaluated. By using the proposed system, we successfully related the preoperative images with an open organ in the operating room.
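
The stopping rule mentioned in this abstract (halt training once the validation loss reaches its minimum) can be sketched roughly as follows. The abstract does not say how the minimum was detected, so the `patience` counter and the function name `early_stopping_epoch` are assumptions made for illustration only.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, best_loss) under a simple patience rule.

    `patience` is a hypothetical setting: training is considered finished
    once the validation loss has failed to improve for that many epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New minimum: remember it and reset the counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs
    return best_epoch, best_loss
```

A larger `patience` tolerates noisier validation curves at the cost of extra epochs.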

Moving Human Shape and Pose Reconstruction from Video (비디오로부터의 움직이는 3D 인체 형상 및 자세 복원)

  • Han, Ji Soo;Cho, Myung Rai;Park, In Kyu
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.66-68 / 2018
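  • This paper proposes a technique that reconstructs a 3D human model from frames extracted from a video and corrects it so that it plays back smoothly. The approach uses a parametric model to reconstruct pose and body shape. The parametric human model is built by learning from a variety of human body data, and reconstruction proceeds by finding the optimal pose and shape parameter values for the input image. For pose reconstruction, a CNN estimates the positions of the human joints in the image, and the parameter values that minimize the distance between those joints and the joints of the 3D model projected into 2D are found. Shape is reconstructed by matching the silhouette data of the person obtained from the 2D image against the silhouette data of the 3D model. Extending this single-image approach to multiple input images such as video, a Kalman filter is applied to detect erroneous frames, and a more natural and accurate model is generated by interpolating with the parameters of the preceding and following frames.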
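
The final step of this abstract (Kalman filtering to detect erroneous frames, then interpolation from neighbouring frames) might look roughly like the sketch below for a single scalar parameter. The abstract gives no state model or thresholds, so the random-walk model, the variances, the `gate` value, and the name `repair_track` are all assumptions.

```python
def repair_track(values, process_var=1e-3, meas_var=1e-2, gate=3.0):
    """Flag outlier frames with a scalar Kalman filter, then interpolate them.

    Returns (repaired_values, outlier_indices). The random-walk state model
    and all default numbers are illustrative assumptions.
    """
    estimate, variance = values[0], meas_var
    outliers = []
    for i in range(1, len(values)):
        predicted_var = variance + process_var      # predict step (random walk)
        innovation = values[i] - estimate
        innovation_var = predicted_var + meas_var
        if innovation ** 2 > gate ** 2 * innovation_var:
            outliers.append(i)                      # gate failed: skip the update
            variance = predicted_var
            continue
        gain = predicted_var / innovation_var       # update step
        estimate += gain * innovation
        variance = (1.0 - gain) * predicted_var

    repaired = list(values)
    flagged = set(outliers)
    for i in outliers:
        # Nearest unflagged frames before and after the bad one.
        left = next((j for j in range(i - 1, -1, -1) if j not in flagged), None)
        right = next((j for j in range(i + 1, len(values)) if j not in flagged), None)
        if left is None or right is None:
            continue  # no neighbour on one side: leave the frame untouched
        t = (i - left) / (right - left)
        repaired[i] = values[left] + t * (values[right] - values[left])
    return repaired, outliers
```

In a full system the same procedure would presumably run over every pose and shape parameter of the body model, not over one scalar.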

Proposal of 3D Camera-Based Digital Coordinate Recognition Technology (3D 카메라 기반 디지털 좌표 인식 기술 제안)

  • Koh, Jun-Young;Lee, Kang-Hee
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.229-230 / 2022
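  • This paper proposes a 3D camera-based digital coordinate recognition technology combined with CNN object detection. The technology uses Intel's RealSense D455, a 3D depth camera, to detect and classify targets and determine their positions. Unlike the depth camera's built-in distance measurement, it recognizes coordinates and can therefore also compute the distance between coordinates. It also includes multithreading that reduces wasted system resources and increases speed through memory sharing with the TensorFlow SSD architecture. By calculating the distance between coordinates, the technology can be used in a variety of settings such as sports, psychology, play, and industry.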
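
One way to compute a distance between two recognized coordinates, as this abstract describes, is to back-project each detected pixel and its depth into 3D and take the Euclidean distance. The sketch below assumes an ideal pinhole camera without lens distortion; the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical values, and the code does not use or reproduce the RealSense SDK.

```python
import math

def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to a 3D camera-frame point.

    Assumes an ideal pinhole model (no distortion); the intrinsics are
    hypothetical and would normally come from the camera's calibration.
    """
    x = (u - cx) / fx * depth_m
    y = (v - cy) / fy * depth_m
    return (x, y, depth_m)

def distance_3d(p, q):
    """Euclidean distance between two 3D points, in the same unit as the inputs."""
    return math.dist(p, q)
```

For example, with `fx = fy = 600` and a principal point of `(320, 240)`, two detections at pixels `(320, 240)` and `(620, 240)`, both 2 m away, come out 1 m apart.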

A Deep Convolutional Neural Network Based 6-DOF Relocalization with Sensor Fusion System (센서 융합 시스템을 이용한 심층 컨벌루션 신경망 기반 6자유도 위치 재인식)

  • Jo, HyungGi;Cho, Hae Min;Lee, Seongwon;Kim, Euntai
    • The Journal of Korea Robotics Society / v.14 no.2 / pp.87-93 / 2019
  • This paper presents a 6-DOF relocalization method using a 3D laser scanner and a monocular camera. The relocalization problem in robotics is to estimate the pose of a sensor when a robot revisits an area. A deep convolutional neural network (CNN) is designed to regress the 6-DOF sensor pose and is trained end-to-end using both RGB image and 3D point cloud information. We generate a new input that consists of RGB and range information. After training, the relocalization system outputs the sensor pose corresponding to each new input it receives. In most cases, however, a mobile robot navigation system has successive sensor measurements. To improve localization performance, the output of the CNN is used as the measurement for a particle filter that smooths the trajectory. We evaluate our relocalization method on real-world datasets using a mobile robot platform.
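
The idea of feeding per-frame CNN pose estimates into a particle filter as measurements can be illustrated with a deliberately reduced example. The paper works with 6-DOF poses; the sketch below tracks a single coordinate only, and the motion model, noise levels, particle count, and the name `particle_filter_1d` are assumptions that do not come from the abstract.

```python
import math
import random

def particle_filter_1d(measurements, n=500, motion_std=0.2, meas_std=0.5, seed=0):
    """Smooth a sequence of noisy 1D position measurements with a particle filter.

    Each measurement stands in for one CNN pose output. All noise parameters
    are illustrative assumptions.
    """
    rng = random.Random(seed)
    particles = [measurements[0] + rng.gauss(0.0, meas_std) for _ in range(n)]
    estimates = []
    for z in measurements:
        # Predict: diffuse the particles with a random-walk motion model.
        particles = [p + rng.gauss(0.0, motion_std) for p in particles]
        # Update: weight each particle by the Gaussian measurement likelihood.
        weights = [math.exp(-0.5 * ((z - p) / meas_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed: fall back to uniform weights.
            weights, total = [1.0] * n, float(n)
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n)
    return estimates
```

The returned list holds one smoothed estimate per input measurement; a real system would replace the random-walk prediction with odometry.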

Customized AI Exercise Recommendation Service for the Balanced Physical Activity (균형적인 신체활동을 위한 맞춤형 AI 운동 추천 서비스)

  • Chang-Min Kim;Woo-Beom Lee
    • Journal of the Institute of Convergence Signal Processing / v.23 no.4 / pp.234-240 / 2022
  • This paper proposes a customized AI exercise recommendation service that balances the relative amount of exercise according to the working environment of each occupation. The WISDM database, collected using acceleration and gyro sensors, is a dataset that classifies physical activities into 18 categories. Our system groups the 18 physical activities into 3 activity types (whole body, upper body, and lower body) and recommends an adaptive exercise based on the analyzed activity type. A one-dimensional convolutional neural network is used to classify physical activity. The proposed model is composed of convolution blocks in which 1D convolution layers with kernels of various sizes are connected in parallel. By applying multiple 1D convolution layers to the input pattern, the convolution blocks can effectively extract the detailed local features that deep neural network models can extract. In a performance comparison with a previous recurrent neural network, our method showed a remarkable 98.4% accuracy.
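
A convolution block with parallel 1D convolutions of different kernel sizes, as described in this abstract, can be sketched in plain Python. The kernel values, the zero padding, and the ReLU activation here are assumptions; the abstract specifies neither the kernel sizes nor how the branches are combined, so the branches are simply returned as separate channels.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`, zero-padded to keep length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]

def parallel_conv_block(signal, kernels):
    """Apply every kernel to the same input and return one ReLU-activated channel per kernel."""
    return [[max(0.0, v) for v in conv1d_same(signal, k)] for k in kernels]
```

Because every branch sees the same input, short kernels pick up sharp local changes while longer ones respond to slower patterns in the sensor signal.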

Hierarchical Grouping of Line Segments for Building Model Generation (건물 형태 발생을 위한 3차원 선소의 계층적 군집화)

  • Han, Ji-Ho;Park, Dong-Chul;Woo, Dong-Min;Jeong, Tai-Kyeong;Lee, Yun-Sik;Min, Soo-Young
    • Journal of IKEEE / v.16 no.2 / pp.95-101 / 2012
  • This paper proposes a novel approach for reconstructing 3D building models from aerial image data. In this approach, a Centroid Neural Network (CNN) with a metric of line segments is proposed for connecting low-level linear structures. After the straight lines are extracted from an edge image using the CNN, rectangular boundaries are found using an edge-based grouping approach. To avoid producing unrealistic building models when grouping line segments, a hierarchical grouping method is proposed. The proposed hierarchical grouping method is evaluated on a set of aerial image data. The results show that the proposed method can be successfully applied to the reconstruction of 3D building models from satellite images.

Autonomous-Driving Vehicle Learning Environments using Unity Real-time Engine and End-to-End CNN Approach (유니티 실시간 엔진과 End-to-End CNN 접근법을 이용한 자율주행차 학습환경)

  • Hossain, Sabir;Lee, Deok-Jin
    • The Journal of Korea Robotics Society / v.14 no.2 / pp.122-130 / 2019
  • Collecting rich but meaningful training data plays a key role in machine learning and deep learning research for self-driving vehicles. This paper gives a detailed overview of existing open-source simulators that could be used for training self-driving vehicles. After reviewing the simulators, we propose a new, effective approach to building a synthetic autonomous vehicle simulation platform suitable for learning and training artificial intelligence algorithms. Specifically, we develop a synthetic simulator with various realistic situations and weather conditions, which lets the autonomous shuttle learn more realistic situations and handle some unexpected events. The virtual environment mimics the activity of a genuine shuttle vehicle in the physical world. Instead of running the whole training experiment in the real physical world, scenarios in 3D virtual worlds are built to calculate the parameters and train the model. From the simulator, the user can obtain data for various situations and use it for training. Flexible options are available to choose sensors, monitor the output, and implement any autonomous driving algorithm. Finally, we verify the effectiveness of the developed simulator by implementing an end-to-end CNN algorithm for training a self-driving shuttle.

Three-Dimensional Convolutional Vision Transformer for Sign Language Translation (수어 번역을 위한 3차원 컨볼루션 비전 트랜스포머)

  • Horyeor Seong;Hyeonjoong Cho
    • The Transactions of the Korea Information Processing Society / v.13 no.3 / pp.140-147 / 2024
  • In the Republic of Korea, people with hearing impairments are the second-largest demographic within the registered disability community, following those with physical disabilities. Despite this demographic significance, research on sign language translation technology is limited for several reasons, including the limited market size and the lack of adequately annotated datasets. Despite these difficulties, a few researchers continue to improve the performance of sign language translation technologies by employing recent advances in deep learning such as the transformer architecture, since transformer-based models have demonstrated noteworthy performance in tasks such as action recognition and video classification. This study focuses on enhancing the recognition performance of sign language translation by combining transformers with a 3D-CNN. Through experimental evaluations using the PHOENIX-Weather-2014T dataset [1], we show that the proposed model exhibits comparable performance to existing models in terms of Floating Point Operations Per Second (FLOPs).

Basic Implementation of Multi Input CNN for Face Recognition (얼굴인식을 위한 다중입력 CNN의 기본 구현)

  • Cheema, Usman;Moon, Seungbin
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.1002-1003 / 2019
  • Face recognition is an extensively researched area of computer vision. Visible, infrared, thermal, and 3D modalities have been used to address various challenges of face recognition such as illumination, pose, expression, partial information, and disguise. In this paper, we present a multi-modal approach to face recognition using convolutional neural networks. We use visible and thermal face images as two separate inputs to a multi-input deep learning network for face recognition. The experiments are performed on the IRIS visible and thermal face database, and high face verification rates are achieved.

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.167-181 / 2018
  • Over the past decade, deep learning has been in the spotlight among various machine learning algorithms. In particular, the CNN (Convolutional Neural Network), which is known as an effective solution for recognizing and classifying images or voices, has been widely applied to classification and prediction problems. In this study, we investigate how to apply CNNs to business problem solving. Specifically, this study proposes applying a CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNNs are strong at interpreting images. Thus, the model proposed in this study adopts a CNN as a binary classifier that predicts stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future financial price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. Each graph is drawn in an image of $40 \times 40$ pixels, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices in order to express the color values on the R (red), G (green), and B (blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset, and the remaining 20% as the validation dataset. In the final step, CNN classifiers are trained using the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters ($5 \times 5 \times 6$ and $5 \times 5 \times 9$) in the convolution layer. In the pooling layer, a $2 \times 2$ max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend, and the other for a downward trend). The activation functions for the convolution layer and the hidden layers were set to ReLU (Rectified Linear Unit), and the one for the output layer to the Softmax function. To validate our model, CNN-FG, we applied it to the prediction of KOSPI200 for 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups in the dependent variable (i.e. tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset using 80% of the total dataset (1,560 samples), and the validation dataset using 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators widely used in previous studies. They include Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), CCI (commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models using these graphs can be effective from the perspective of prediction accuracy.
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
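
The data-preparation steps in this abstract (5-day intervals, an up/down label for the next day, and an 80/20 split) can be sketched as below. The abstract does not say whether the 5-day intervals overlap, and it draws graphs of technical indicators, not raw prices; the sliding windows over closing prices, the shuffling seed, and both function names are therefore assumptions made to keep the example small.

```python
import random

def make_windows(closes, window=5):
    """Pair each `window`-day history with the next day's direction (1 = up, 0 = down or flat).

    Sliding (overlapping) windows over closing prices are an assumption.
    """
    samples = []
    for end in range(window, len(closes)):
        history = closes[end - window:end]
        label = 1 if closes[end] > closes[end - 1] else 0
        samples.append((history, label))
    return samples

def split_train_validation(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With 1,950 samples and `train_fraction=0.8`, the split yields the 1,560 training and 390 validation samples reported in the abstract.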