• Title/Summary/Keyword: Improved Convolutional Neural Network

Intelligent Activity Recognition based on Improved Convolutional Neural Network

  • Park, Jin-Ho; Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.25 no.6 / pp.807-818 / 2022
  • To further improve the accuracy and time efficiency of behavior recognition in intelligent monitoring scenarios, a human behavior recognition algorithm based on YOLO combined with LSTM and CNN is proposed. First, exploiting the real-time nature of YOLO target detection, the specific behavior in the surveillance video is detected in real time, and deep features are extracted once the target size, location, and other information have been obtained. Then, noise data from irrelevant areas of the image are removed. Finally, an LSTM models the time series, and the final behavior classification is made for the action sequence in the surveillance video. Experiments on the MSR and KTH datasets show that the average recognition rate over the behaviors reaches 98.42% and 96.6%, and the average recognition time is 210 ms and 220 ms, respectively. The proposed method is effective for intelligent behavior recognition.

Crack Detection in Tunnel Using Convolutional Encoder-Decoder Network (컨볼루셔널 인코더-디코더 네트워크를 이용한 터널에서의 균열 검출)

  • Han, Bok Gyu; Yang, Hyeon Seok; Lee, Jong Min; Moon, Young Shik
    • Journal of the Institute of Electronics and Information Engineers / v.54 no.6 / pp.80-89 / 2017
  • Classical crack detection is performed by experienced inspectors who annotate crack patterns manually. Because each inspector relies on subjective personal experience, objectivity is hard to guarantee. To solve this issue, automated crack detection methods have been proposed; however, these methods are sensitive to image noise, so their overall performance depends on the quality of the acquired image. In this paper, we propose a crack detection method using a convolutional encoder-decoder network to overcome these weaknesses. Compared with previous methods, the proposed method significantly improves recall, precision, and F-measure.
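
The recall, precision, and F-measure named in this abstract can be illustrated with a minimal sketch. It assumes pixel-wise evaluation on binary masks (1 marks a crack pixel), which the abstract does not confirm; the function name and the toy masks are hypothetical.

```python
def precision_recall_f1(predicted, truth):
    """Pixel-wise precision, recall, and F-measure for binary masks.

    `predicted` and `truth` are equal-sized 2D lists of 0/1 values,
    where 1 marks a crack pixel (an assumption made for this sketch).
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1      # crack pixel correctly detected
            elif p:
                fp += 1      # background pixel flagged as crack
            elif t:
                fn += 1      # crack pixel that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy masks, invented for illustration only.
predicted = [[0, 1, 1, 0],
             [0, 1, 0, 0],
             [0, 1, 0, 0]]
truth = [[0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 1, 0]]
precision, recall, f1 = precision_recall_f1(predicted, truth)
```

With these toy masks there are three true positives, one false positive, and one false negative, so all three scores come out to 0.75.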

Medical Image Denoising using Wavelet Transform-Based CNN Model

  • Seoyun Jang; Dong Hoon Lim
    • Journal of the Korea Society of Computer and Information / v.29 no.10 / pp.21-34 / 2024
  • In medical images such as MRI (Magnetic Resonance Imaging) and CT (Computed Tomography) images, noise removal has a significant impact on the performance of medical imaging systems. Recently, the introduction of deep learning into image processing has improved the performance of noise removal methods. However, methods that work in the image domain are limited in how well they can remove noise while preserving detail. In this paper, we propose a wavelet transform-based CNN (Convolutional Neural Network) model, namely the WT-DnCNN (Wavelet Transform-Denoising Convolutional Neural Network) model, to improve noise removal performance. The model first divides the noisy image into frequency bands using the wavelet transform and then applies the existing DnCNN model to each band to remove the noise. To evaluate the proposed WT-DnCNN model, experiments were conducted on MRI and CT images corrupted by Gaussian noise, Poisson noise, and speckle noise. In qualitative comparison, the WT-DnCNN model is superior to the traditional BM3D (Block-Matching and 3D Filtering) filter as well as to the existing deep learning models DnCNN and CDAE (Convolution Denoising AutoEncoder). In quantitative comparison, the PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index Measure) values were 36 to 43 and 0.93 to 0.98 for MRI images, and 38 to 43 and 0.95 to 0.98 for CT images, respectively. In the execution-speed comparison, the DnCNN model took much less time than the BM3D model, while the proposed model took longer than DnCNN because of the added wavelet transform.
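
The band-splitting step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a one-level Haar wavelet (the abstract names neither the wavelet nor the number of levels), and `denoise_band` is a hypothetical stand-in for the DnCNN model.

```python
def haar_decompose(image):
    """One-level 2D Haar transform of an image with even height and width.

    Returns four half-size sub-bands: the approximation band `ll` and
    three detail bands (`lh`, `hl`, `hh`; naming conventions vary).
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(image), 2):
        rows = ([], [], [], [])
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            rows[0].append((p + q + s + t) / 2)  # approximation
            rows[1].append((p - q + s - t) / 2)  # left-right difference
            rows[2].append((p + q - s - t) / 2)  # top-bottom difference
            rows[3].append((p - q - s + t) / 2)  # diagonal difference
        for band, row in zip((ll, lh, hl, hh), rows):
            band.append(row)
    return ll, lh, hl, hh


def haar_reconstruct(ll, lh, hl, hh):
    """Invert `haar_decompose`, rebuilding each 2x2 block of pixels."""
    image = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            a, h, v, d = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [(a + h + v + d) / 2, (a - h + v - d) / 2]
            bottom += [(a + h - v - d) / 2, (a - h - v + d) / 2]
        image += [top, bottom]
    return image


def denoise_in_wavelet_domain(image, denoise_band):
    """Split into bands, denoise each band, and merge the bands back."""
    bands = haar_decompose(image)
    return haar_reconstruct(*(denoise_band(band) for band in bands))


def identity(band):
    """Placeholder denoiser; a trained network would go here."""
    return band


noisy = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
restored = denoise_in_wavelet_domain(noisy, identity)
```

With the identity placeholder the pipeline returns the input unchanged, which confirms that the transform pair is lossless and that any change in the output would come from the per-band denoiser.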

Low-Quality Banknote Serial Number Recognition Based on Deep Neural Network

  • Jang, Unsoo; Suh, Kun Ha; Lee, Eui Chul
    • Journal of Information Processing Systems / v.16 no.1 / pp.224-237 / 2020
  • Recognition of banknote serial numbers is an important function of an intelligent banknote counter and can be used for various purposes. However, previous character recognition methods are of limited use because of the font of the serial number, the variation caused by soiling, and recognition speed. In this paper, we propose an aspect-ratio-based character region segmentation method and a convolutional neural network (CNN) based banknote serial number recognition method. To detect the character regions, banknote area detection and de-skewing are performed first, and each character area is then determined from the aspect ratio of each character within the serial number candidate area. We then designed and compared four CNN models and selected the best one for serial number recognition. Experimental results showed a per-character recognition accuracy of 99.85%. Data augmentation was also confirmed to improve recognition performance. Because the banknotes used in the experiment are Indian rupee notes, which are badly soiled and use an unusual font, this result can be regarded as good performance. Recognition was also fast enough to run in real time on a device that counts 800 banknotes per minute.

Deep Unsupervised Learning for Rain Streak Removal using Time-varying Rain Streak Scene (시간에 따라 변화하는 빗줄기 장면을 이용한 딥러닝 기반 비지도 학습 빗줄기 제거 기법)

  • Cho, Jaehoon; Jang, Hyunsung; Ha, Namkoo; Lee, Seungha; Park, Sungsoon; Sohn, Kwanghoon
    • Journal of Korea Multimedia Society / v.22 no.1 / pp.1-9 / 2019
  • Single-image rain removal is a typical inverse problem that decomposes an image into a background scene and rain streaks. Recent work has made substantial progress on the task thanks to the development of convolutional neural networks (CNNs). However, existing CNN-based approaches train the network on synthetically generated examples, which tends to bias the network toward synthetic scenes. In this paper, we present an unsupervised framework for removing rain streaks from real-world rainy images. We exploit the natural phenomenon that images of a static rainy scene share a common background but contain different rain streaks. Based on this observation, we train a Siamese network on pairs of real rain images so that it outputs identical backgrounds for both images of a pair. To train the network, a dataset of real rainy images is constructed via web crawling. We show that our unsupervised framework outperforms recent CNN-based approaches trained in a supervised manner. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of our framework, showing improved performance over previous approaches.
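
The pairing idea in this abstract (two rainy images of the same static scene should map to the same background) can be sketched as a consistency loss. This is a hedged illustration only: the abstract does not give the actual loss, so the mean absolute difference, the function names, and the toy `derain` callables are all assumptions.

```python
def background_consistency_loss(background_a, background_b):
    """Mean absolute difference between two predicted backgrounds."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(background_a, background_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def siamese_loss(derain, rainy_a, rainy_b):
    """Apply the same function (shared weights) to both images of a pair."""
    return background_consistency_loss(derain(rainy_a), derain(rainy_b))


# Toy data, invented for illustration: one scene, two different streaks.
scene = [[0.2, 0.4],
         [0.3, 0.1]]
rainy_a = [[0.9, 0.4],
           [0.3, 0.1]]   # streak on the top-left pixel
rainy_b = [[0.2, 0.4],
           [0.3, 0.8]]   # streak on the bottom-right pixel


def untrained(image):
    """Hypothetical network that leaves the rain in place."""
    return image


def oracle(image):
    """Hypothetical perfect network that always recovers the scene."""
    return scene


loss_before = siamese_loss(untrained, rainy_a, rainy_b)
loss_after = siamese_loss(oracle, rainy_a, rainy_b)
```

Here `loss_before` is positive because the two outputs still differ where the streaks are, while `loss_after` is zero. On its own this term is also minimized by any constant output, so a practical objective would need additional terms; the abstract does not say which ones are used.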

A Study on the Accuracy Improvement of Movie Recommender System Using Word2Vec and Ensemble Convolutional Neural Networks (Word2Vec과 앙상블 합성곱 신경망을 활용한 영화추천 시스템의 정확도 개선에 관한 연구)

  • Kang, Boo-Sik
    • Journal of Digital Convergence / v.17 no.1 / pp.123-130 / 2019
  • Collaborative filtering is one of the most commonly used web recommendation techniques, and many studies have suggested ways to improve its accuracy. This study proposes a movie recommendation method using Word2Vec and ensemble convolutional neural networks. First, user sentences and movie sentences are constructed from the user, movie, and rating information. The user sentences and movie sentences are input into Word2Vec to obtain user vectors and movie vectors. The user vectors are input to a user convolution model, and the movie vectors are input to a movie convolution model. The user and movie convolution models are linked to a fully connected neural network model. Finally, the output layer of the fully connected neural network outputs predicted user movie ratings. Experimental results showed that the proposed technique improved accuracy compared with the conventional collaborative filtering technique and with the technique using Word2Vec and deep neural networks proposed in a similar study.

An Improved Image Classification Using Batch Normalization and CNN (배치 정규화와 CNN을 이용한 개선된 영상분류 방법)

  • Ji, Myunggeun; Chun, Junchul; Kim, Namgi
    • Journal of Internet Computing and Services / v.19 no.3 / pp.35-42 / 2018
  • Deep learning is known as a highly accurate approach among image classification methods. In this paper, we propose a method that adds a batch normalization layer to an existing deep CNN (Convolutional Neural Network) to enhance the accuracy of image classification. Batch normalization computes the mean and variance of each batch and uses them to shift and rescale the data, reducing the deviation in each layer. To demonstrate the superiority of the proposed method, accuracy and mAP are measured in image classification experiments on five image datasets: SHREC13, MNIST, SVHN, CIFAR-10, and CIFAR-100. Experimental results showed that the CNN with batch normalization achieves better classification accuracy and mAP than the conventional CNN.
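
The batch normalization step described in this abstract can be sketched for a single feature. This is the standard training-time formula rather than code from the paper; the parameter names `gamma`, `beta`, and `eps` follow common usage, and the running averages used at inference time are omitted.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift.

    `gamma` and `beta` are the learnable scale and shift; `eps` guards
    against division by zero when the batch variance is tiny.
    """
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    scale = gamma / math.sqrt(variance + eps)
    return [scale * (x - mean) + beta for x in batch]


# Toy activations of one feature across a batch of four samples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
```

After normalization the batch has a mean of approximately zero and a variance of approximately one, whatever the scale of the incoming activations.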

Deep Window Detection in Street Scenes

  • Ma, Wenguang; Ma, Wei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.2 / pp.855-870 / 2020
  • Windows are key components of building facades. Detecting windows, which is crucial to 3D semantic reconstruction and scene parsing, is a challenging task in computer vision. Early methods tried to solve window detection using hand-crafted features and traditional classifiers. However, these methods cannot handle the diversity of window instances in real scenes and suffer from heavy computational costs. Recently, object detection algorithms based on convolutional neural networks have attracted much attention because of their good performance. Unfortunately, directly training them for the challenging window detection task does not achieve satisfactory results. In this paper, we propose an approach to window detection built on an improved Faster R-CNN architecture that features a window region proposal network, RoI feature fusion, and a context enhancement module. In addition, a post-optimization process that exploits the regular distribution of windows refines the detection results obtained by the improved deep architecture. Furthermore, we present a newly collected dataset, which is the largest one for window detection in real street scenes to date. Experimental results on both existing datasets and the new dataset show that the proposed method has outstanding performance.

MRU-Net: A remote sensing image segmentation network for enhanced edge contour Detection

  • Jing Han; Weiyu Wang; Yuqi Lin; Xueqiang LYU
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.12 / pp.3364-3382 / 2023
  • Remote sensing image segmentation plays an important role in intelligent city construction. Current mainstream segmentation networks effectively improve the segmentation of remote sensing images by deeply mining the rich texture and semantic features of images. However, problems remain, such as rough segmentation of small target regions and poor segmentation of edge contours. To overcome these challenges, we propose an improved semantic segmentation model, referred to as MRU-Net, which adopts the U-Net architecture as its backbone. First, the convolutional layers in U-Net are replaced by the BasicBlock structure to extract features, and the activation function is replaced to reduce the computational load of the model. Second, a hybrid multi-scale recognition module is added to the encoder to improve segmentation accuracy for small targets and edges. Finally, experiments on the Massachusetts Buildings Dataset and the WHU Dataset show that, compared with the original network, the ACC, mIoU, and F1 values are improved, and that the improved network shows good robustness and portability across datasets.
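
The mIoU metric reported in this abstract can be computed as in the following minimal sketch. It assumes integer class labels per pixel and averages only over classes that appear in either map; the paper's exact evaluation protocol is not stated in the abstract, and the toy label maps are invented.

```python
def mean_iou(predicted, truth, num_classes):
    """Mean intersection-over-union across classes for two label maps.

    `predicted` and `truth` are equal-sized 2D lists of integer class
    labels in the range [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p == t:
                intersection[p] += 1
                union[p] += 1
            else:
                # A mismatched pixel enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy label maps: 0 = background, 1 = building.
predicted = [[0, 0, 1],
             [0, 1, 1]]
truth = [[0, 1, 1],
         [0, 1, 1]]
miou = mean_iou(predicted, truth, num_classes=2)
```

In this toy case the background IoU is 2/3 and the building IoU is 3/4, so the mean is about 0.708.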

Evaluation of Transfer Learning in Gastroscopy Image Classification using Convolutional Neural Network (합성곱 신경망을 활용한 위내시경 이미지 분류에서 전이학습의 효용성 평가)

  • Park, Sung Jin; Kim, Young Jae; Park, Dong Kyun; Chung, Jun Won; Kim, Kwang Gi
    • Journal of Biomedical Engineering Research / v.39 no.5 / pp.213-219 / 2018
  • Stomach cancer is the most diagnosed cancer in Korea. When gastric cancer is detected early, the 5-year survival rate is as high as 90%. Gastroscopy is a very useful method for early diagnosis. However, the false-negative rate for gastric cancer in gastroscopy was 4.6 to 25.8% owing to the subjective judgment of the physician. Recently, convolutional neural networks have advanced image classification performance in the image recognition field. Convolutional neural networks perform well when diverse and sufficient data are available. However, medical data are not easy to access, and it is difficult to gather enough high-quality data with expert annotations. This paper therefore evaluates the efficacy of transfer learning in gastroscopy classification and diagnosis. We obtained 787 gastric endoscopy images at Gil Medical Center, Gachon University: 200 normal images and 587 abnormal images. The images were resized and normalized. For the ResNet50 structure, applying transfer learning improved the classification accuracy from 0.9 to 0.947 and the AUC from 0.94 to 0.98. For the InceptionV3 structure, it improved the classification accuracy from 0.862 to 0.924 and the AUC from 0.89 to 0.97. For the VGG16 structure, it improved the classification accuracy from 0.87 to 0.938 and the AUC from 0.89 to 0.98. The difference in CNN model performance before and after transfer learning was statistically significant according to a t-test (p < 0.05). Transfer learning is therefore judged to be an effective method for medical data, for which good-quality data are difficult to collect.