• Title/Summary/Keyword: Region Convolutional Neural Network

Detection Model of Fruit Epidermal Defects Using YOLOv3: A Case of Peach (YOLOv3을 이용한 과일표피 불량검출 모델: 복숭아 사례)

  • Hee Jun Lee;Won Seok Lee;In Hyeok Choi;Choong Kwon Lee
    • Information Systems Review / v.22 no.1 / pp.113-124 / 2020
  • In farm operations, it is very important to evaluate the quality of harvested crops and to sort out defective products. However, farmers have difficulty coping with the cost and time required for quality assessment because of insufficient capital and manpower. This study therefore aims to detect defects by analyzing the epidermis of fruit using a deep learning algorithm. We developed a model that analyzes the epidermis by applying the YOLOv3 algorithm, based on the Region Convolutional Neural Network, to video images of peaches. A total of four classes were selected and trained. After 97,600 epochs, a high-performance detection model was obtained. The crop defect detection model proposed in this study can be used to automate data collection, quality evaluation based on the analyzed data, and defect detection. In particular, because we developed the analytical model for the peach, which is the most vulnerable to external wounds among crops, it is expected to be applicable to other crops in farming.

Uniform Motion Deblurring using Shock Filter and Convolutional Neural Network (쇼크 필터와 합성곱 신경망 기반의 균일 모션 디블러링 기법)

  • Jeong, Minso;Jeong, Jechang
    • Journal of Broadcast Engineering / v.23 no.4 / pp.484-494 / 2018
  • The uniform motion blur removal algorithm of Cho et al. cannot restore the edge regions of an image clearly. To overcome this problem, we propose an effective algorithm that uses a shock filter, which reconstructs a blurred step signal into a sharp edge, and a convolutional neural network (CNN), which learns by extracting features from the image. The uniform motion blur kernel is then estimated from the latent sharp image to remove the blur from the image. The proposed algorithm overcomes the disadvantages of the conventional algorithm by reconstructing the latent sharp image using the shock filter and the CNN. Experimental results confirmed that the proposed algorithm achieves better reconstruction performance than the conventional algorithm in both objective and subjective image quality.
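
The edge-sharpening behavior attributed to the shock filter above can be illustrated in one dimension. The following is a minimal sketch, not the authors' implementation: it assumes an Osher-Rudin style update with a minmod slope limiter, and the function names, step size `dt`, and iteration count are illustrative choices.

```python
def minmod(a, b):
    """Return the smaller-magnitude slope, or 0 if the slopes disagree in sign."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def shock_filter_1d(signal, iterations=50, dt=0.5):
    """Sharpen a blurred 1D signal with an assumed Osher-Rudin style scheme."""
    u = list(signal)
    for _ in range(iterations):
        nxt = u[:]
        for i in range(1, len(u) - 1):
            forward = u[i + 1] - u[i]
            backward = u[i] - u[i - 1]
            slope = abs(minmod(forward, backward))  # limited gradient magnitude
            curvature = forward - backward          # discrete second derivative
            # Move each sample away from the inflection point of the edge.
            nxt[i] = u[i] - dt * sign(curvature) * slope
        u = nxt
    return u


def max_jump(signal):
    """Largest difference between neighboring samples."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


blurred_step = [0.0, 0.0, 0.1, 0.3, 0.7, 0.9, 1.0, 1.0]  # hypothetical blurred edge
sharpened = shock_filter_1d(blurred_step)
print(max_jump(blurred_step), max_jump(sharpened))
```

Samples on the convex side of the edge are pushed down and samples on the concave side are pushed up, so the ramp collapses toward a single jump while staying within the original value range.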

Two person Interaction Recognition Based on Effective Hybrid Learning

  • Ahmed, Minhaz Uddin;Kim, Yeong Hyeon;Kim, Jin Woo;Bashar, Md Rezaul;Rhee, Phill Kyu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.751-770 / 2019
  • Action recognition is an essential task in computer vision due to the variety of prospective applications, such as security surveillance, machine learning, and human-computer interaction. The availability of more video data than ever before and the high performance of deep convolutional neural networks also make it essential for action recognition in video. Unfortunately, limited crafted video features and the scarcity of benchmark datasets make it challenging to address the multi-person action recognition task in video data. In this work, we propose a deep convolutional neural network-based Effective Hybrid Learning (EHL) framework for two-person interaction classification in video data. Our approach exploits a pre-trained network model (the VGG16 from the University of Oxford Visual Geometry Group) and extends the Faster R-CNN (region-based convolutional neural network, a state-of-the-art detector for image classification). We broaden a semi-supervised learning method combined with an active learning method to improve overall performance. Numerous types of two-person interactions exist in the real world, which makes this a challenging task. In our experiments, we consider a limited number of actions, such as hugging, fighting, linking arms, talking, and kidnapping, in two environments, simple and complex. We show that our trained model with an active semi-supervised learning architecture gradually improves the performance. In a simple environment using an Intelligent Technology Laboratory (ITLab) dataset from Inha University, performance increased to 95.6% accuracy, and in a complex environment, performance reached 81% accuracy. Our method reduces data-labeling time, compared to supervised learning methods, for the ITLab dataset. We also conduct extensive experiments on Human Action Recognition benchmarks such as the UT-Interaction and HMDB51 datasets and obtain better performance than state-of-the-art approaches.
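
One generic way to combine semi-supervised and active learning, as described above, is to pseudo-label confident predictions and send the least confident ones to a human annotator. The sketch below is an assumption about the general pattern only; the abstract does not give the authors' selection rule, and `route_unlabeled`, the threshold, and the query budget are hypothetical.

```python
def route_unlabeled(predictions, pseudo_threshold=0.9, query_budget=2):
    """Split model predictions on unlabeled clips into two groups.

    predictions: list of (clip_id, predicted_label, confidence) tuples.
    Returns (pseudo_labeled, to_annotate):
      pseudo_labeled: (clip_id, label) pairs trusted as training labels
      to_annotate:    clip ids to show to a human, least confident first
    """
    pseudo_labeled = [
        (clip_id, label)
        for clip_id, label, confidence in predictions
        if confidence >= pseudo_threshold
    ]
    uncertain = sorted(
        (p for p in predictions if p[2] < pseudo_threshold),
        key=lambda p: p[2],
    )
    to_annotate = [clip_id for clip_id, _, _ in uncertain[:query_budget]]
    return pseudo_labeled, to_annotate


# Hypothetical predictions for five unlabeled two-person interaction clips.
predictions = [
    ("clip1", "hugging", 0.97),
    ("clip2", "fighting", 0.55),
    ("clip3", "talking", 0.40),
    ("clip4", "kidnapping", 0.92),
    ("clip5", "linking arms", 0.70),
]
pseudo, queries = route_unlabeled(predictions)
print(pseudo)
print(queries)
```

Repeating this routing after each retraining round is what lets the labeled set grow with limited annotation effort.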

Image Retrieval Based on the Weighted and Regional Integration of CNN Features

  • Liao, Kaiyang;Fan, Bing;Zheng, Yuanlin;Lin, Guangfeng;Cao, Congjun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.3 / pp.894-907 / 2022
  • The features extracted by convolutional neural networks are more descriptive of images than traditional features, and their convolutional layers are more suitable for retrieving images than are fully connected layers. The convolutional layer features will consume considerable time and memory if used directly to match an image. Therefore, this paper proposes a feature weighting and region integration method for convolutional layer features to form global feature vectors and subsequently use them for image matching. First, the 3D feature of the last convolutional layer is extracted, and the convolutional feature is subsequently weighted again to highlight the edge information and position information of the image. Next, we integrate several regional eigenvectors that are processed by sliding windows into a global eigenvector. Finally, the initial retrieval ranking is obtained by measuring the similarity between the query image and the test image using the cosine distance, and the final mean Average Precision (mAP) is obtained after re-ranking with the extended query method. We conduct experiments using the Oxford5k and Paris6k datasets and their extended datasets, Oxford105k and Paris106k. These experimental results indicate that the global feature extracted by the new method can better describe an image.
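
The region integration and cosine ranking steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: it uses sum pooling per sliding window, omits the feature weighting and query expansion steps, and all function names, window sizes, and the tiny feature maps are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def region_vectors(fmap, window, stride):
    """Sum-pool each sliding window of fmap, where fmap[h][w] is a channel list."""
    height, width, channels = len(fmap), len(fmap[0]), len(fmap[0][0])
    regions = []
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            pooled = [0.0] * channels
            for r in range(top, top + window):
                for c in range(left, left + window):
                    for k in range(channels):
                        pooled[k] += fmap[r][c][k]
            regions.append(l2_normalize(pooled))
    return regions


def global_descriptor(fmap, window=2, stride=1):
    """Integrate the regional vectors into one unit-length global vector."""
    regions = region_vectors(fmap, window, stride)
    channels = len(fmap[0][0])
    summed = [sum(region[k] for region in regions) for k in range(channels)]
    return l2_normalize(summed)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_cosine(query, database):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because every image is reduced to a single vector, matching costs one dot product per database image instead of a comparison over full convolutional feature maps.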

AMD Identification from OCT Volume Data Acquired from Heterogeneous OCT Machines using Deep Convolutional Neural Network (이종의 OCT 기기로부터 생성된 볼륨 데이터로부터 심층 컨볼루션 신경망을 이용한 AMD 진단)

  • Kwon, Oh-Heum;Jung, Yoo Jin;Kwon, Ki-Ryong;Song, Ha-Joo
    • Database Research / v.34 no.3 / pp.124-136 / 2018
  • There has been active research on using neural networks to analyze OCT images and make medical decisions. One requirement for these approaches to be promising solutions is that the trained network must generalize to new devices without a substantial loss of performance. In this paper, we use a deep convolutional neural network to distinguish AMD patients from normal patients. The network was trained using a data set generated from an OCT device. We observed a significant performance degradation when it was applied to a new data set obtained from a different OCT device. To overcome this performance degradation, we propose an image normalization method which performs segmentation of OCT images to identify the retina area and aligns images so that the retina region lies horizontally in the image. We experimentally evaluated the performance of the proposed method. The experiment confirmed a significant performance improvement of our approach.

Performance Comparison of the Optimizers in a Faster R-CNN Model for Object Detection of Metaphase Chromosomes (중기 염색체 객체 검출을 위한 Faster R-CNN 모델의 최적화기 성능 비교)

  • Jung, Wonseok;Lee, Byeong-Soo;Seo, Jeongwook
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.11 / pp.1357-1363 / 2019
  • In this paper, we compare the performance of the gradient descent optimizers of the Faster Region-based Convolutional Neural Network (R-CNN) model for chromosome object detection in digital images composed of human metaphase chromosomes. In Faster R-CNN, the gradient descent optimizer is used to minimize the objective function of the region proposal network (RPN) module and the classification score and bounding box regression blocks. Through performance comparisons among four gradient descent optimizers in our experiments, we found that the Adamax optimizer could achieve a mean average precision (mAP) of about 52% when considering Faster R-CNN with a base network, VGG16. In the case of Faster R-CNN with a base network, ResNet50, the Adadelta optimizer could achieve an mAP of about 58%.
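
To make the optimizer comparison above concrete, the sketch below applies the standard Adamax and Adadelta update rules to a toy one-dimensional objective. It is an illustration only: the objective `f(x) = (x - 3)^2`, the learning rate, and the step counts are hypothetical and unrelated to the Faster R-CNN loss or the settings used in the paper.

```python
import math


def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


def loss(x):
    """Toy objective with its minimum at x = 3."""
    return (x - 3.0) ** 2


def run_adamax(x, steps, lr=0.1, beta1=0.9, beta2=0.999):
    """Adamax: Adam variant that scales steps by an infinity-norm estimate."""
    m, u = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g  # first-moment estimate
        u = max(beta2 * u, abs(g))         # decayed running maximum of |g|
        if u > 0.0:
            x -= (lr / (1.0 - beta1 ** t)) * m / u  # bias-corrected step
    return x


def run_adadelta(x, steps, rho=0.95, eps=1e-6):
    """Adadelta: step size from the ratio of RMS past steps to RMS gradients."""
    avg_sq_grad, avg_sq_step = 0.0, 0.0
    for _ in range(steps):
        g = grad(x)
        avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * g * g
        step = -math.sqrt(avg_sq_step + eps) / math.sqrt(avg_sq_grad + eps) * g
        avg_sq_step = rho * avg_sq_step + (1.0 - rho) * step * step
        x += step
    return x


print(loss(run_adamax(0.0, steps=500)))
print(loss(run_adadelta(0.0, steps=100)))
```

The two rules differ in how they normalize the gradient, which is one reason the best optimizer can change with the base network, as the reported results suggest.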

Korean License Plate Recognition Using CNN (CNN 기반 한국 번호판 인식)

  • Hieu, Tang Quang;Yeon, Seungho;Kim, Jaemin
    • Journal of IKEEE / v.23 no.4 / pp.1337-1342 / 2019
  • Automatic Korean license plate recognition (AKLPR) is used in many fields. For many applications, a high recognition rate and fast processing speed of automatic license plate recognition (ALPR) are important. Recent advances in deep learning have improved the accuracy and speed of object detection and recognition, and convolutional neural networks (CNNs) have been applied to ALPR. ALPR is divided into a stage that detects the license plate (LP) region and a stage that detects and recognizes the characters in the LP region, and each stage is implemented with a separate CNN. In this paper, we propose a single-stage CNN architecture that recognizes license plate characters at high speed while maintaining a high recognition rate.

Region of Interest Localization for Bone Age Estimation Using Whole-Body Bone Scintigraphy

  • Do, Thanh-Cong;Yang, Hyung Jeong;Kim, Soo Hyung;Lee, Guee Sang;Kang, Sae Ryung;Min, Jung Joon
    • Smart Media Journal / v.10 no.2 / pp.22-29 / 2021
  • In the past decade, deep learning has been applied to various medical image analysis tasks. Skeletal bone age estimation is clinically important as it can help prevent age-related illness and pave the way for new anti-aging therapies. Recent research has applied deep learning techniques to the task of bone age assessment and achieved positive results. In this paper, we propose a bone age prediction method using a deep convolutional neural network. Specifically, we first train a classification model that automatically localizes the most discriminative region of an image and crops it from the original image. The regions of interest are then used as input for a regression model to estimate the age of the patient. The experiments are conducted on a whole-body scintigraphy dataset that was collected by Chonnam National University Hwasun Hospital. The experimental results illustrate the potential of our proposed method, which has a mean absolute error of 3.35 years. Our proposed framework can be used as a robust supporting tool for clinicians to prevent age-related diseases.
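
The two-stage idea above (localize and crop a discriminative region, then regress age and score it by mean absolute error) can be sketched in a few lines. This is a hedged illustration: the abstract does not say how the region is localized, so the heatmap thresholding in `crop_most_discriminative` is an assumption, and all names and numbers are hypothetical.

```python
def crop_most_discriminative(image, heatmap, keep_ratio=0.5):
    """Crop the bounding box of heatmap cells >= keep_ratio * peak value.

    image and heatmap are 2D lists of the same height and width.
    """
    peak = max(max(row) for row in heatmap)
    threshold = keep_ratio * peak
    rows = [r for r, row in enumerate(heatmap) if any(v >= threshold for v in row)]
    cols = [
        c for c in range(len(heatmap[0]))
        if any(row[c] >= threshold for row in heatmap)
    ]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def mean_absolute_error(predicted_ages, true_ages):
    """Average absolute difference between predicted and true ages, in years."""
    if len(predicted_ages) != len(true_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(predicted_ages, true_ages))
    return total / len(true_ages)


# Hypothetical predictions for three patients: errors of 3, 1, and 2 years.
print(mean_absolute_error([30.0, 41.0, 58.0], [33.0, 40.0, 60.0]))  # 2.0
```

A mean absolute error of 3.35 years, as reported, means the predicted age differs from the true age by 3.35 years on average across the test patients.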

Road Surface Damage Detection based on Object Recognition using Fast R-CNN (Fast R-CNN을 이용한 객체 인식 기반의 도로 노면 파손 탐지 기법)

  • Shim, Seungbo;Chun, Chanjun;Ryu, Seung-Ki
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.18 no.2 / pp.104-113 / 2019
  • Road management agencies incur high costs to repair road surface damage. Such damage is inevitable because of natural factors and aging, but maintenance technologies for efficiently repairing damaged roads are needed. Various technologies have been developed and applied to meet this demand. Recently, maintenance technology for road surface damage repair has been developed using image information collected by a black box installed in a vehicle. There are various methods for extracting the damaged region; here, we discuss image recognition technology based on deep neural network structures, which has been actively studied recently. In this paper, we introduce a new neural network that can estimate road damage and its location in the image using a region-based convolutional neural network algorithm. To develop the algorithm, about 600 images were collected through actual driving. Training was then carried out and, compared with the existing model, we developed a neural network with 10.67% accuracy.

Evaluation of Building Detection from Aerial Images Using Region-based Convolutional Neural Network for Deep Learning (딥러닝을 위한 영역기반 합성곱 신경망에 의한 항공영상에서 건물탐지 평가)

  • Lee, Dae Geon;Cho, Eun Ji;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.36 no.6 / pp.469-481 / 2018
  • DL (Deep Learning) is becoming popular in various fields as a way to implement artificial intelligence that resembles human learning and cognition. DL, which is based on the complicated structure of the ANN (Artificial Neural Network), requires computing power and incurs computation cost. A variety of DL models with improved performance have been developed with powerful computer specifications. The main purpose of this paper is to detect buildings from aerial images and evaluate the performance of Mask R-CNN (Region-based Convolutional Neural Network), recently developed by the FAIR (Facebook AI Research) team. Mask R-CNN is an R-CNN that is considered one of the best ANN models in terms of performance for semantic segmentation with pixel-level accuracy. The performance of DL models is determined by training ability as well as by the architecture of the ANN. In this paper, we examine the characteristics of Mask R-CNN with various types of images and evaluate the possibility of generalization, which is the ultimate goal of DL. As for future study, it is expected that the reliability and generalization of DL will be improved by using a variety of spatial information data for training DL models.