• Title/Abstract/Keywords: Improved Convolutional Neural Network


Deep Learning을 위한 GPGPU 기반 Convolution 가속기 구현 (An Implementation of a Convolutional Accelerator based on a GPGPU for a Deep Learning)

  • 전희경;이광엽;김치용
    • 전기전자학회논문지 / Vol. 20, No. 3 / pp. 303-306 / 2016
  • In this paper, we propose a method for accelerating a convolutional neural network (CNN) using a GPGPU. A CNN is a type of neural network that learns and classifies image features, making it well suited to image processing tasks that must learn from large amounts of data. The convolution layers of a conventional CNN require a large number of multiplications, which makes real-time operation difficult in embedded environments. To address this drawback, we reduce the number of multiplications with the Winograd convolution algorithm and process the convolution operations in parallel using the SIMT architecture of the GPGPU. Experiments were conducted with ModelSim and TestDrive, and the results show that processing time improved by about 17% compared with conventional convolution.
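As a sketch of the multiplication savings the abstract describes, the minimal Winograd transform F(2,3) computes two outputs of a 1-D, 3-tap convolution with 4 multiplications instead of 6 (2-D convolution layers nest this same transform). This is a generic illustration of the algorithm, not the paper's GPGPU implementation:

```python
def conv1d_direct(d, g):
    """Direct 1-D valid convolution: 6 multiplications for 2 outputs."""
    return [d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
            d[1]*g[0] + d[2]*g[1] + d[3]*g[2]]

def conv1d_winograd_f23(d, g):
    """Winograd F(2,3): the same 2 outputs with only 4 multiplications."""
    # Filter transform (can be precomputed once per trained filter).
    g0 = g[0]
    g1 = (g[0] + g[1] + g[2]) / 2
    g2 = (g[0] - g[1] + g[2]) / 2
    g3 = g[2]
    # Elementwise products of transformed input and transformed filter.
    m0 = (d[0] - d[2]) * g0
    m1 = (d[1] + d[2]) * g1
    m2 = (d[2] - d[1]) * g2
    m3 = (d[1] - d[3]) * g3
    # Output transform.
    return [m0 + m1 + m2, m1 - m2 - m3]
```

Since the filter transform is amortized over the whole feature map, the per-tile cost drops from 6 multiplications to 4, which is the kind of saving the SIMT implementation then parallelizes.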

Human Motion Recognition Based on Spatio-temporal Convolutional Neural Network

  • Hu, Zeyuan;Park, Sange-yun;Lee, Eung-Joo
    • 한국멀티미디어학회논문지 / Vol. 23, No. 8 / pp. 977-985 / 2020
  • Aiming at the problems of complex feature extraction and low accuracy in human action recognition, this paper proposes a network structure that combines the batch normalization algorithm with the GoogLeNet network model. Applying the batch normalization idea from image classification to action recognition, the algorithm normalizes the network's input training samples by mini-batch. For the convolutional network, RGB images serve as the spatial input and stacked optical flow as the temporal input; the spatial and temporal networks are then fused to produce the final action recognition result. The architecture was trained and evaluated on the standard video action benchmarks UCF101 and HMDB51, achieving accuracies of 93.42% and 67.82%, respectively. The results show that the improved convolutional neural network significantly increases the recognition rate and has clear advantages for action recognition.
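The per-mini-batch normalization step the abstract refers to can be sketched in a few lines of plain Python (a 1-D illustration over scalar activations with default gamma/beta, not the paper's GoogLeNet integration):

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a mini-batch of scalar activations to zero mean and
    unit variance, then apply the learnable scale (gamma) and shift (beta)."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
```

In a real network this is applied per channel across the mini-batch, and gamma/beta are learned during training.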

Two-Stream Convolutional Neural Network for Video Action Recognition

  • Qiao, Han;Liu, Shuang;Xu, Qingzhen;Liu, Shouqiang;Yang, Wanggan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 10 / pp. 3668-3684 / 2021
  • Video action recognition is widely used in video surveillance, behavior detection, human-computer interaction, medically assisted diagnosis, and motion analysis. However, it can be disturbed by many factors, such as background and illumination. A two-stream convolutional neural network trains separate spatial and temporal models on the video and fuses them at the output. The multi-segment two-stream model extracts temporal and spatial features from the video, fuses them, and then determines the action category. This paper adopts Google's Xception model with transfer learning, using the Xception weights trained on ImageNet as the initialization. This largely overcomes the model underfitting caused by the limited size of video behavior datasets and effectively reduces the influence of confounding factors in the video, which also greatly improves accuracy and reduces training time. Furthermore, to compensate for the shortage of data, the Kinetics-400 dataset was used for pre-training, which further improved the model's accuracy. Through this applied research the expected goal was essentially achieved, and the design of the original two-stream model was improved.
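The output-end fusion of the two streams can be illustrated as a weighted average of each stream's softmax scores (a generic late-fusion sketch; the equal stream weight here is a hypothetical default, not a value from the paper):

```python
import math

def softmax(logits):
    """Convert raw scores to class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_two_stream(spatial_logits, temporal_logits, w_spatial=0.5):
    """Late fusion: average the class probabilities of the spatial (RGB)
    and temporal (optical-flow) streams, then pick the top class."""
    ps = softmax(spatial_logits)
    pt = softmax(temporal_logits)
    fused = [w_spatial * a + (1 - w_spatial) * b for a, b in zip(ps, pt)]
    return fused.index(max(fused)), fused
```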

IPC-CNN: A Robust Solution for Precise Brain Tumor Segmentation Using Improved Privacy-Preserving Collaborative Convolutional Neural Network

  • Abdul Raheem;Zhen Yang;Haiyang Yu;Muhammad Yaqub;Fahad Sabah;Shahzad Ahmed;Malik Abdul Manan;Imran Shabir Chuhan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 9 / pp. 2589-2604 / 2024
  • Brain tumors, characterized by uncontrollable cellular growths, are a significant global health challenge. Navigating the complexities of tumor identification due to their varied dimensions and positions, our research introduces enhanced methods for precise detection. Utilizing advanced learning techniques, we've improved early identification by preprocessing clinical dataset-derived images, augmenting them via a Generative Adversarial Network, and applying an Improved Privacy-Preserving Collaborative Convolutional Neural Network (IPC-CNN) for segmentation. Recognizing the critical importance of data security in today's digital era, our framework emphasizes the preservation of patient privacy. We evaluated the performance of our proposed model on the Figshare and BRATS 2018 datasets. By facilitating a collaborative model training environment across multiple healthcare institutions, we harness the power of distributed computing to securely aggregate model updates, ensuring individual data protection while leveraging collective expertise. Our IPC-CNN model achieved an accuracy of 99.40%, marking a notable advancement in brain tumor classification and offering invaluable insights for both the medical imaging and machine learning communities.
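The secure aggregation of model updates across institutions can be sketched as federated averaging, the standard scheme this kind of collaborative training builds on: each site trains locally, and only weight vectors, weighted by local sample count, are combined. The values below are illustrative, not from the paper:

```python
def federated_average(client_weights, client_sizes):
    """Aggregate per-client weight vectors into a global model, weighting
    each client by its number of local training samples, so raw patient
    data never leaves the institution."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]
```

In each round the server broadcasts the averaged weights back to the clients, which resume local training from them.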

Convolutional Neural Network와 Monte Carlo Tree Search를 이용한 인공지능 바둑 프로그램의 구현 (Implementation of Artificial Intelligence Computer Go Program Using a Convolutional Neural Network and Monte Carlo Tree Search)

  • 기철민;조태훈
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2016년도 추계학술대회 / pp. 405-408 / 2016
  • Games such as Go, chess, and janggi have long contributed to people's mental development. These games have also been implemented as computer programs, and many algorithms have been developed so that people can play alone. While a chess program capable of beating humans was developed in the 1990s, Go has so many possible positions that defeating a professional Go player was considered impossible. However, the use of MCTS (Monte Carlo Tree Search) and CNNs (convolutional neural networks) has greatly improved the performance of Go algorithms. In this paper, we develop a Go algorithm using a CNN and MCTS. A CNN trained on Go game records is used to find the best move, and MCTS is used to simulate the game and compute the winning probability. In addition, pattern information is extracted from existing game records and used to improve speed and performance. This method outperformed commonly used Go algorithms, and performance is expected to improve further given sufficient computing power.

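The move-selection step inside MCTS can be illustrated with the standard UCT rule, which balances the simulated win rate of a move against how rarely it has been explored (a generic sketch of the selection phase, not the authors' program):

```python
import math

def uct_select(children, c=1.4):
    """Pick the child with the highest UCB1 score:
    win_rate + c * sqrt(ln(parent_visits) / child_visits).
    children: list of (wins, visits) tuples for each candidate move."""
    parent_visits = sum(v for _, v in children)
    def score(wins, visits):
        if visits == 0:
            return float("inf")  # always try unexplored moves first
        return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)
    scores = [score(w, v) for w, v in children]
    return scores.index(max(scores))
```

In a CNN-guided variant like the one described, the network's move probabilities bias this selection toward promising moves before many simulations have run.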

OpenCV 와 Convolutional neural network를 이용한 눈동자 모션인식 시스템 구현 (Implementation to eye motion tracking system using OpenCV and convolutional neural network)

  • 이승준;허승원;이희빈;유윤섭
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2018년도 추계학술대회 / pp. 379-380 / 2018
  • This paper introduces an eye motion recognition system that supplements our previously published "Implementation of an eye motion recognition system using a convolutional neural network" with OpenCV-based eye region detection, and that uses NumPy to construct and compute the neural network. The neural network is built with NumPy, the face and eye regions are detected with OpenCV, the network is trained on eye-region images, and the trained network recognizes the movement of the pupil. The system was implemented on a DE1-SoC board, an SoC development board.

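Once the eye region has been localized, the pupil's offset within that region can be mapped to a coarse motion label. A minimal stdlib-only sketch of that final step is below; the margin threshold is hypothetical, and the actual system uses the trained network rather than this rule for the decision:

```python
def gaze_direction(pupil_x, pupil_y, eye_w, eye_h, margin=0.15):
    """Classify the pupil position inside the detected eye box into a
    coarse direction label from its normalized offset from center."""
    dx = pupil_x / eye_w - 0.5   # -0.5 (left edge) .. +0.5 (right edge)
    dy = pupil_y / eye_h - 0.5   # -0.5 (top edge)  .. +0.5 (bottom edge)
    if abs(dx) < margin and abs(dy) < margin:
        return "center"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"
```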

합성곱 신경망의 학습 가속화를 위한 방법 (A Method for accelerating training of Convolutional Neural Network)

  • 최세진;정준모
    • 문화기술의 융합 / Vol. 3, No. 4 / pp. 171-175 / 2017
  • Recently, CNN (convolutional neural network) architectures have become more complex and networks have grown deeper. As a result, the amount of computation and the training time required have increased, and research on accelerating neural network training with GPGPUs and FPGAs is being actively conducted. In this paper, we present a method for accelerating the computation of the feature-extraction and classification parts of a CNN using CUDA, which controls NVIDIA GPGPUs. The computations of the feature-extraction and classification parts are assigned to GPGPU blocks and threads and processed in parallel. We compared the training speed of the proposed method against training the same CNN on a conventional CPU. Training for five epochs on the MNIST dataset, the proposed method was about 314% faster than the CPU-based training.
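The block/thread assignment the abstract describes can be mimicked in plain Python by mapping each (block, thread) pair to one output element of a convolution, the same indexing a CUDA kernel would use. This is a serial simulation for illustration, not the authors' CUDA code:

```python
def conv2d_valid(image, kernel):
    """Reference 2-D valid convolution (single channel, nested loops)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(image[i + u][j + v] * kernel[u][v]
                            for u in range(kh) for v in range(kw))
    return out

def conv2d_block_thread(image, kernel, threads_per_block=4):
    """Same result, but each output element is computed by the 'thread'
    with global id (block_idx * threads_per_block + thread_idx), mirroring
    how a CUDA kernel distributes independent output pixels."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    n_elems = oh * ow
    n_blocks = (n_elems + threads_per_block - 1) // threads_per_block
    for block_idx in range(n_blocks):                 # grid of blocks
        for thread_idx in range(threads_per_block):   # threads in a block
            gid = block_idx * threads_per_block + thread_idx
            if gid >= n_elems:
                continue                              # guard for the last block
            i, j = divmod(gid, ow)
            out[i][j] = sum(image[i + u][j + v] * kernel[u][v]
                            for u in range(kh) for v in range(kw))
    return out
```

Because every output element is independent, a GPU runs all these "threads" concurrently, which is where the reported speedup comes from.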

무인기를 이용한 심층 신경망 기반 해파리 분포 인식 시스템 (Deep Neural Network-based Jellyfish Distribution Recognition System Using a UAV)

  • 구정모;명현
    • 로봇학회논문지 / Vol. 12, No. 4 / pp. 432-440 / 2017
  • In this paper, we propose a jellyfish distribution recognition and monitoring system using a UAV (unmanned aerial vehicle). The UAV was designed to satisfy the requirements for flight in an ocean environment. The target jellyfish, Aurelia aurita, is recognized through a convolutional neural network and its distribution is calculated. The modified deep neural network architecture was developed to provide reliable recognition accuracy and fast operation; by using a lightweight network architecture, recognition is about 400 times faster than GoogLeNet. We also introduce the method for selecting the candidate regions used as inputs to the proposed network. Recognition accuracy for jellyfish is improved by removing the probability of the meaningless class from the probability vector of the evaluated input image and re-evaluating after normalization. The jellyfish distribution is calculated from the recognized unit jellyfish images, and the distribution level is defined using the novel concept of a distribution map buffer.
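The accuracy trick described, removing the probability of a meaningless class and rescaling the remaining probability vector before re-evaluation, can be sketched directly:

```python
def renormalize_without(probs, drop_idx):
    """Zero out the probability of a meaningless class and rescale the
    remaining class probabilities so they sum to 1 again."""
    kept = sum(p for i, p in enumerate(probs) if i != drop_idx)
    return [0.0 if i == drop_idx else p / kept for i, p in enumerate(probs)]
```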

Number Plate Detection with a Multi-Convolutional Neural Network Approach with Optical Character Recognition for Mobile Devices

  • Gerber, Christian;Chung, Mokdong
    • Journal of Information Processing Systems / Vol. 12, No. 1 / pp. 100-108 / 2016
  • In this paper, we propose a method to achieve improved number plate detection on mobile devices by applying a multiple convolutional neural network (CNN) approach. First, we perform supervised CNN-verified car detection, and then we pass the detected car regions to the next supervised CNN verifier for number plate detection. In the final step, the detected number plate regions are verified through optical character recognition by another CNN verifier. Since mobile devices are limited in computational power, we propose a fast method to recognize number plates, and we expect it to be used in the field of intelligent transportation systems.
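The three-stage cascade, car detection, then plate detection inside the car regions, then OCR verification, can be sketched as composed filters. The detector functions below are hypothetical stand-ins for the paper's CNN verifiers:

```python
def cascade_detect(frame, detect_cars, detect_plates, verify_ocr):
    """Run each verifier only on the regions the previous stage accepted,
    so the later, costlier stages see very few candidates; verify_ocr
    returns the plate text, or None to reject a false detection."""
    plates = []
    for car_region in detect_cars(frame):
        for plate_region in detect_plates(car_region):
            text = verify_ocr(plate_region)
            if text is not None:          # OCR-confirmed plate
                plates.append((plate_region, text))
    return plates
```

This cascading is what keeps the method fast enough for mobile devices: most of the frame is discarded by the cheap first stage.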

잔향 환경 음성인식을 위한 다중 해상도 DenseNet 기반 음향 모델 (Multi-resolution DenseNet based acoustic models for reverberant speech recognition)

  • 박순찬;정용원;김형순
    • 말소리와 음성과학 / Vol. 10, No. 1 / pp. 33-38 / 2018
  • Although deep neural network-based acoustic models have greatly improved the performance of automatic speech recognition (ASR), reverberation still degrades the performance of distant speech recognition in indoor environments. In this paper, we adopt DenseNet, which has shown excellent results in image classification tasks, to improve reverberant speech recognition. DenseNet enables a deep convolutional neural network (CNN) to be trained effectively by concatenating the feature maps of each convolutional layer. In addition, we extend the concept of the multi-resolution CNN to a multi-resolution DenseNet for robust speech recognition in reverberant environments. We evaluate performance on the single-channel ASR task of the REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge 2014. According to the experimental results, the DenseNet-based acoustic models outperform conventional CNN-based ones, and the multi-resolution DenseNet provides additional improvement.
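DenseNet's defining operation, feeding each layer the concatenation of the block input and every earlier layer's output, can be sketched with feature maps represented as channel lists. The layer functions here are hypothetical stand-ins for conv-BN-ReLU units:

```python
def dense_block(x, layers):
    """Each layer receives the concatenation of the block input and all
    previous layers' outputs; this feature reuse is what lets deep CNNs
    train effectively in DenseNet."""
    features = [x]                    # list of feature maps (channel lists)
    for layer in layers:
        concatenated = [c for fmap in features for c in fmap]
        features.append(layer(concatenated))
    # The block output is the concatenation of everything produced.
    return [c for fmap in features for c in fmap]
```

With a growth rate of k channels per layer, the input to layer n has the original channels plus n*k extra, which is why DenseNet gets wide feature reuse from narrow layers.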