• Title/Summary/Keyword: 2D Convolutional Neural Network


Efficient Super-Resolution of 2D Smoke Data with Optimized Quadtree (최적화된 쿼드트리를 이용한 2차원 연기 데이터의 효율적인 슈퍼 해상도 기법)

  • Choe, YooYeon;Kim, Donghui;Kim, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.261-264 / 2021
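  • This paper proposes a quadtree-based optimization technique that efficiently classifies and partitions the data needed to compute super-resolution (SR), enabling fast SR computation. The proposed method downscales the smoke data used as input to reduce the time required for the quadtree computation, and binarizes the smoke density so that density is not lost during downscaling. The training data is the COCO 2017 Dataset, and the neural network is a VGG19-based network. To prevent data loss when passing through the convolutional layers, the output of the previous layer is added during training, similar to a residual scheme. As a result, the proposed method achieves a speedup of about 15 to 18 times over the previous method.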
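
The abstract gives no implementation details, so the following is only a minimal sketch of the idea under stated assumptions: the density field is a square 2D list whose side is a power of two, the binarization threshold is hypothetical, downscaling takes the maximum over each block (so an occupied cell is never averaged away), and the quadtree stops splitting once a region is uniform. All function names are illustrative and do not come from the paper.

```python
def binarize(density, threshold=0.0):
    """Mark a cell 1 if its density exceeds the (assumed) threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in density]


def downscale_max(grid, factor=2):
    """Downscale a binary grid; a coarse cell is 1 if any fine cell in it is 1."""
    n = len(grid) // factor
    return [
        [
            max(grid[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor))
            for x in range(n)
        ]
        for y in range(n)
    ]


def quadtree_leaves(grid, x=0, y=0, size=None):
    """Return (x, y, size, value) leaves; uniform regions are not split further."""
    if size is None:
        size = len(grid)
    cells = [grid[y + j][x + i] for j in range(size) for i in range(size)]
    if size == 1 or min(cells) == max(cells):
        return [(x, y, size, cells[0])]
    half = size // 2
    leaves = []
    for oy in (0, half):
        for ox in (0, half):
            leaves += quadtree_leaves(grid, x + ox, y + oy, half)
    return leaves


# Toy 4x4 density field with a single thin wisp of smoke.
density = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
coarse = downscale_max(binarize(density))
# Only leaves that contain smoke would be handed to the SR network.
smoke_leaves = [leaf for leaf in quadtree_leaves(coarse) if leaf[3] == 1]
```

In this toy case the single smoke cell survives the downscale as one occupied coarse cell, so only that region would need SR computation.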


CNN Based Spectrum Sensing Technique for Cognitive Radio Communications (인지 무선 통신을 위한 합성곱 신경망 기반 스펙트럼 센싱 기법)

  • Jung, Tae-Yun;Lee, Eui-Soo;Kim, Do-Kyoung;Oh, Ji-Myung;Noh, Woo-Young;Jeong, Eui-Rim
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.276-284 / 2020
  • This paper proposes a new convolutional neural network (CNN) based spectrum sensing technique for cognitive radio communications. The proposed technique determines the existence of the primary user (PU) by using energy detection without any prior knowledge of the PU's signal. In the proposed method, the received signal is sampled at a high rate to sense all spectrum bands of interest. A fast Fourier transform (FFT) then converts the time-domain signal to a frequency-domain spectrum, and consecutive spectra are stacked to form a two-dimensional signal. The two-dimensional signal is cut by the sensing channel bandwidth and fed to the CNN, which determines whether the primary user is present. Since there are only two states (existence or non-existence), a binary classification CNN is used. The performance of the proposed method is examined through computer simulation and an indoor experiment. According to the results, the proposed method outperforms the conventional threshold-based method by over 2 dB.
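
The preprocessing described above (transform, stack, cut per channel) might look roughly like the sketch below. It is not the authors' code: the frame length, the use of non-overlapping frames, the power spectrum as the stacked quantity, and the `bins_per_channel` parameter are all assumptions, and a naive DFT stands in for the FFT so that the example needs only the standard library. The CNN classifier itself is not shown.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (an FFT would be used in practice)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n)
    ]


def spectrogram(samples, frame_len):
    """Stack the spectra of consecutive frames into a 2D list (time x frequency)."""
    return [
        power_spectrum(samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def cut_channels(spec, bins_per_channel):
    """Split the 2D signal along frequency into one patch per sensing channel."""
    n_bins = len(spec[0])
    return [
        [row[start:start + bins_per_channel] for row in spec]
        for start in range(0, n_bins, bins_per_channel)
    ]


# Toy input: a complex tone at bin 1 of an 8-point frame, lasting 4 frames.
frame_len = 8
samples = [cmath.exp(2j * math.pi * t / frame_len) for t in range(4 * frame_len)]
patches = cut_channels(spectrogram(samples, frame_len), bins_per_channel=4)
# patches[0] covers the channel that holds the tone; patches[1] is an empty channel.
```

Each patch is a small time-by-frequency image, which is the kind of input a binary (occupied or vacant) classifier would receive.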

Design of Beacon System for Estimating 6DOF and Central Management Based on the Convolutional Neural Network in an augmented reality environment (증강현실 환경에서 합성곱 신경망 기반 6 자유도 자세 추정 및 중앙 관리가 가능한 비콘 시스템 설계)

  • An, Hyeon Woo;Cho, Jae Hyeon;Moon, Nammee
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.06a / pp.178-179 / 2018
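  • Capturing and digitizing real-world objects in an augmented reality environment is a very important technology for improving immersion. Faster R-CNN is one of the techniques for recognizing multiple objects in an image, and it has been studied extensively alongside the development of many applications. This paper proposes a method for estimating the six degrees of freedom (6DOF) of multiple beacons in an augmented reality environment by using Faster R-CNN together with a homography, which describes the 2D transformation of a planar object. It also introduces a beacon structure that can overcome the drawbacks of the marker techniques commonly used in augmented reality, and proposes a system that makes it easy to manage multiple beacons.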
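
As background for the homography mentioned above, the sketch below shows only how a 3x3 homography maps a 2D point on a plane to another 2D point. The matrices are made-up illustrations; how the paper estimates the homography or derives 6DOF from it is not stated in the abstract and is not shown here.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


# Hypothetical matrices: a pure translation and a uniform projective scaling.
translate = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]
shrink = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]

moved = apply_homography(translate, (1, 1))
scaled = apply_homography(shrink, (1, 1))
```

The division by `w` is what distinguishes a homography from an affine map: it lets the same matrix express perspective effects on a planar object.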


A Manually Captured and Modified Phone Screen Image Dataset for Widget Classification on CNNs

  • Byun, SungChul;Han, Seong-Soo;Jeong, Chang-Sung
    • Journal of Information Processing Systems / v.18 no.2 / pp.197-207 / 2022
  • The applications and user interfaces (UIs) of smart mobile devices are constantly diversifying. For example, deep learning can be an innovative solution to classify widgets in screen images for increasing convenience. To this end, the present research leverages captured images and the ReDraw dataset to build deep learning datasets for image classification purposes. First, as validation of the datasets using ResNet50 and EfficientNet, the experiments show that the dataset composed in this study is helpful for classification according to a widget's functionality. An implementation for widget detection and classification on RetinaNet and EfficientNet is then executed. Finally, the research proposes the Widg-C and Widg-D datasets, deep learning datasets for identifying the widgets of smart devices, and implements them for use with representative convolutional neural network models.

1D-CNN-LSTM Hybrid-Model-Based Pet Behavior Recognition through Wearable Sensor Data Augmentation

  • Hyungju Kim;Nammee Moon
    • Journal of Information Processing Systems / v.20 no.2 / pp.159-172 / 2024
  • The number of healthcare products available for pets has increased in recent times, which has prompted active research into wearable devices for pets. However, the data collected through such devices are limited by outliers and missing values owing to the anomalous and irregular characteristics of pets. Hence, we propose pet behavior recognition based on a hybrid one-dimensional convolutional neural network (CNN) and long short-term memory (LSTM) model using pet wearable devices. An Arduino-based pet wearable device was first fabricated to collect data for behavior recognition, where gyroscope and accelerometer values were collected using the device. Then, data augmentation was performed after replacing any missing values and outliers via preprocessing. At this time, the behaviors were classified into five types. To prevent bias from specific actions in the data augmentation, the number of datasets was compared and balanced, and CNN-LSTM-based deep learning was performed. The five subdivided behaviors and overall performance were then evaluated, and the overall accuracy of behavior recognition was found to be about 88.76%.

A Study on Design and Implementation of Driver's Blind Spot Assist System Using CNN Technique (CNN 기법을 활용한 운전자 시선 사각지대 보조 시스템 설계 및 구현 연구)

  • Lim, Seung-Cheol;Go, Jae-Seung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.149-155 / 2020
  • The Korea Highway Traffic Authority provides statistics that analyze the causes of traffic accidents that have occurred since 2015 using the Traffic Accident Analysis System (TAAS). According to TAAS, the driver's inattention to the road ahead was the main cause of traffic accidents in 2018. In the statistics on the causes of traffic accidents, 51.2 percent involved using mobile phones or watching DMB while driving, 14 percent involved failing to keep a safe distance, and 3.6 percent involved violating the duty to protect pedestrians, for a total of 68.8 percent. In this paper, we propose a system that improves on advanced driver assistance systems (ADAS) by utilizing a convolutional neural network (CNN), one of the algorithms of deep learning. The proposed system trains a model that classifies the movement of the driver's face and eyes using Conv2D techniques, which are mainly used for image processing, while recognizing and detecting objects around the vehicle with cameras attached to the front of the vehicle to recognize the driving environment. Then, using the trained visual steering model and driving environment data, the hazard is classified and detected in three stages, depending on the driver's view and the driving environment, to assist the driver with the forward view and blind spots.

Robust Deep Age Estimation Method Using Artificially Generated Image Set

  • Jang, Jaeyoon;Jeon, Seung-Hyuk;Kim, Jaehong;Yoon, Hosub
    • ETRI Journal / v.39 no.5 / pp.643-651 / 2017
  • Human age estimation is one of the key factors in the field of Human-Robot Interaction/Human-Computer Interaction (HRI/HCI). Owing to the development of deep-learning technologies, age recognition has recently been attempted. In general, however, deep learning techniques require a large-scale database, and for age learning with variations, a conventional database is insufficient. For this reason, we propose an age estimation method using artificially generated data. Image data are artificially generated through 3D information, thus solving the problem of shortage of training data, and helping with the training of the deep-learning technique. Augmentation using 3D has advantages over 2D because it creates new images with more information. We use a deep architecture as a pre-trained model, and improve the estimation capacity using artificially augmented training images. The deep architecture can outperform traditional estimation methods, and the improved method showed increased reliability. We have achieved state-of-the-art performance using the proposed method in the Morph-II dataset and have proven that the proposed method can be used effectively using the Adience dataset.

A Three-Dimensional Deep Convolutional Neural Network for Automatic Segmentation and Diameter Measurement of Type B Aortic Dissection

  • Yitong Yu;Yang Gao;Jianyong Wei;Fangzhou Liao;Qianjiang Xiao;Jie Zhang;Weihua Yin;Bin Lu
    • Korean Journal of Radiology / v.22 no.2 / pp.168-178 / 2021
  • Objective: To provide an automatic method for segmentation and diameter measurement of type B aortic dissection (TBAD). Materials and Methods: Aortic computed tomography angiographic images from 139 patients with TBAD were consecutively collected. We implemented a deep learning method based on a three-dimensional (3D) deep convolutional neural network (CNN), which realizes automatic segmentation and measurement of the entire aorta (EA), true lumen (TL), and false lumen (FL). The accuracy, stability, and measurement time were compared between deep learning and manual methods. The intra- and inter-observer reproducibility of the manual method was also evaluated. Results: The mean dice coefficient scores were 0.958, 0.961, and 0.932 for EA, TL, and FL, respectively. There was a linear relationship between the reference standard and measurement by the manual and deep learning method (r = 0.964 and 0.991, respectively). The average measurement error of the deep learning method was less than that of the manual method (EA, 1.64% vs. 4.13%; TL, 2.46% vs. 11.67%; FL, 2.50% vs. 8.02%). Bland-Altman plots revealed that the deviations of the diameters between the deep learning method and the reference standard were -0.042 mm (-3.412 to 3.330 mm), -0.376 mm (-3.328 to 2.577 mm), and 0.026 mm (-3.040 to 3.092 mm) for EA, TL, and FL, respectively. For the manual method, the corresponding deviations were -0.166 mm (-1.419 to 1.086 mm), -0.050 mm (-0.970 to 1.070 mm), and -0.085 mm (-1.010 to 0.084 mm). Intra- and inter-observer differences were found in measurements with the manual method, but not with the deep learning method. The measurement time with the deep learning method was markedly shorter than with the manual method (21.7 ± 1.1 vs. 82.5 ± 16.1 minutes, p < 0.001). Conclusion: The performance of efficient segmentation and diameter measurement of TBADs based on the 3D deep CNN was both accurate and stable. This method is promising for evaluating aortic morphology automatically and alleviating the workload of radiologists in the near future.
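
The Dice coefficient reported above measures the overlap between a predicted mask and a reference mask. A minimal sketch, assuming both masks are flattened binary lists of equal length (the toy masks are invented and unrelated to the study's data):

```python
def dice_coefficient(pred, ref):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: 3 voxels each, 2 of them shared.
predicted_mask = [1, 1, 1, 0, 0, 0]
reference_mask = [0, 1, 1, 1, 0, 0]
score = dice_coefficient(predicted_mask, reference_mask)
```

A score of 1 means the masks coincide exactly and 0 means they do not overlap at all, which puts the reported values of 0.93 to 0.96 close to full agreement.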

Accuracy of artificial intelligence-assisted landmark identification in serial lateral cephalograms of Class III patients who underwent orthodontic treatment and two-jaw orthognathic surgery

  • Hong, Mihee;Kim, Inhwan;Cho, Jin-Hyoung;Kang, Kyung-Hwa;Kim, Minji;Kim, Su-Jung;Kim, Yoon-Ji;Sung, Sang-Jin;Kim, Young Ho;Lim, Sung-Hoon;Kim, Namkug;Baek, Seung-Hak
    • The Korean Journal of Orthodontics / v.52 no.4 / pp.287-297 / 2022
  • Objective: To investigate the pattern of accuracy change in artificial intelligence-assisted landmark identification (LI) using a convolutional neural network (CNN) algorithm in serial lateral cephalograms (Lat-cephs) of Class III (C-III) patients who underwent two-jaw orthognathic surgery. Methods: A total of 3,188 Lat-cephs of C-III patients were allocated into the training and validation sets (3,004 Lat-cephs of 751 patients) and test set (184 Lat-cephs of 46 patients; subdivided into the genioplasty and non-genioplasty groups, n = 23 per group) for LI. Each C-III patient in the test set had four Lat-cephs: initial (T0), pre-surgery (T1, presence of orthodontic brackets [OBs]), post-surgery (T2, presence of OBs and surgical plates and screws [S-PS]), and debonding (T3, presence of S-PS and fixed retainers [FR]). After mean errors of 20 landmarks between human gold standard and the CNN model were calculated, statistical analysis was performed. Results: The total mean error was 1.17 mm without significant difference among the four time-points (T0, 1.20 mm; T1, 1.14 mm; T2, 1.18 mm; T3, 1.15 mm). In comparison of two time-points ([T0, T1] vs. [T2, T3]), ANS, A point, and B point showed an increase in error (p < 0.01, 0.05, 0.01, respectively), while Mx6D and Md6D showed a decrease in error (all p < 0.01). No difference in errors existed at B point, Pogonion, Menton, Md1C, and Md1R between the genioplasty and non-genioplasty groups. Conclusions: The CNN model can be used for LI in serial Lat-cephs despite the presence of OB, S-PS, FR, genioplasty, and bone remodeling.
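
The abstract does not define how the landmark error was computed. A common choice, assumed here, is the Euclidean distance between each predicted landmark and its gold-standard position, averaged over landmarks. The coordinates below are invented for illustration and are taken to be in millimetres.

```python
import math


def mean_landmark_error(predicted, reference):
    """Mean Euclidean distance between paired (x, y) landmarks."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty landmark lists of equal length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    return sum(distances) / len(distances)


# Two hypothetical landmarks with errors of 1 mm and 5 mm.
cnn_points = [(0.0, 0.0), (3.0, 4.0)]
gold_points = [(0.0, 1.0), (0.0, 0.0)]
error_mm = mean_landmark_error(cnn_points, gold_points)
```

Computing this value separately for each time-point (T0 to T3) would give per-stage errors comparable in form to those listed in the results.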

Semantic Segmentation of Drone Imagery Using Deep Learning for Seagrass Habitat Monitoring (잘피 서식지 모니터링을 위한 딥러닝 기반의 드론 영상 의미론적 분할)

  • Jeon, Eui-Ik;Kim, Seong-Hak;Kim, Byoung-Sub;Park, Kyung-Hyun;Choi, Ock-In
    • Korean Journal of Remote Sensing / v.36 no.2_1 / pp.199-215 / 2020
  • Seagrasses are marine vascular plants that play an important role in the marine ecosystem, so seagrass habitats are monitored periodically. Recently, drones, which can easily acquire very high-resolution imagery, have been increasingly used to monitor seagrass habitats efficiently. Deep learning based on convolutional neural networks has shown excellent performance in semantic segmentation, and studies applying deep learning models have been actively conducted in remote sensing. However, segmentation accuracy varies with the hyperparameters, the deep learning model, and the imagery, and the image normalization, tile size, and batch size are not standardized. In this study, seagrass habitats were therefore segmented from drone-borne imagery using a deep learning model that shows excellent performance, and the results were compared and analyzed with a focus on normalization and tile size. To compare the results according to normalization, tile size, and batch size, a grayscale image and grayscale imagery converted with the Z-score and Min-Max normalization methods were used. The tile size was increased at a specific interval, while the batch size was set as large as the available memory allowed. As a result, the IoU of the Z-score normalized imagery was 0.26 to 0.4 higher than that of the other imagery. A difference of up to 0.09 was also found depending on the tile and batch size. Because the results differed according to normalization, tile size, and batch size, this experiment indicates that these factors need a suitable decision process.
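
The two normalizations and the IoU metric named above can be sketched as follows. This is a generic illustration on flat lists of pixel values, not the study's pipeline: whether statistics were computed per tile or per image, and whether population or sample standard deviation was used, are not stated in the abstract and are assumed here.

```python
import statistics


def min_max_normalize(pixels):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0.0 for _ in pixels]
    return [(p - lo) / (hi - lo) for p in pixels]


def z_score_normalize(pixels):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    if std == 0:
        return [0.0 for _ in pixels]
    return [(p - mean) / std for p in pixels]


def iou(pred, ref):
    """Intersection over union of two flat binary masks of equal length."""
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    union = sum(1 for p, r in zip(pred, ref) if p or r)
    return 1.0 if union == 0 else intersection / union


# Toy grayscale values and toy segmentation masks.
gray = [0, 50, 100, 150, 200]
scaled = min_max_normalize(gray)
standardized = z_score_normalize(gray)
overlap = iou([1, 1, 0, 0], [1, 0, 1, 0])
```

Min-Max scaling depends only on the extreme values, while Z-score scaling depends on the whole distribution, which is one plausible reason the two could lead to different segmentation accuracy.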