• Title/Abstract/Keywords: Deep Learning Dataset

Search results: 815 (processing time: 0.027 s)

Sign Language Translation Using Deep Convolutional Neural Networks

  • Abiyev, Rahib H.;Arslan, Murat;Idoko, John Bush
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 2 / pp.631-653 / 2020
  • Sign language is a natural, visually oriented, non-verbal communication channel that facilitates communication through facial and bodily expressions, postures, and a set of gestures. It is primarily used for communication with people who are deaf or hard of hearing. To understand such communication quickly and accurately, this paper considers the design of a sign language translation system. The proposed system includes object detection and classification stages. First, the Single Shot MultiBox Detector (SSD) architecture is used for hand detection; then a deep learning structure based on Inception v3 combined with a Support Vector Machine (SVM), which unifies the feature extraction and classification stages, is proposed to translate the detected hand gestures. A sign language fingerspelling dataset is used to design the proposed model. The results and comparative analysis demonstrate the efficiency of the proposed hybrid structure for sign language translation.
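The two-stage detect-then-classify pipeline described above can be sketched with stand-ins: a brightest-window search in place of the SSD detector, a mean-profile feature in place of Inception v3, and a nearest-centroid rule in place of the SVM (all three are illustrative substitutes, not the paper's components):

```python
import numpy as np

def detect(image, box_size=8):
    """Crude 'hand' proposal: return the window with the highest total intensity."""
    h, w = image.shape
    best, best_sum = (0, 0), -1.0
    for r in range(0, h - box_size + 1, box_size):
        for c in range(0, w - box_size + 1, box_size):
            s = image[r:r + box_size, c:c + box_size].sum()
            if s > best_sum:
                best, best_sum = (r, c), s
    r, c = best
    return image[r:r + box_size, c:c + box_size]

def features(crop):
    """Stand-in for the Inception v3 extractor: row/column mean profile."""
    return np.concatenate([crop.mean(axis=0), crop.mean(axis=1)])

def classify(feat, centroids):
    """Stand-in for the SVM: label of the nearest class centroid."""
    return min(centroids, key=lambda k: np.linalg.norm(feat - centroids[k]))
```

The point is the pipeline shape, detection feeding a feature extractor feeding a classifier, rather than any one component.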

Adaptive low-resolution palmprint image recognition based on channel attention mechanism and modified deep residual network

  • Xu, Xuebin;Meng, Kan;Xing, Xiaomin;Chen, Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 3 / pp.757-770 / 2022
  • Palmprint recognition has drawn increasing attention over the past decade due to its uniqueness and reliability. Traditional palmprint recognition methods usually use high-resolution images as the identification basis to achieve relatively high precision. However, high-resolution images entail more computation in the recognition process, which usually cannot be afforded in mobile computing. Therefore, this paper proposes an improved low-resolution palmprint image recognition method based on residual networks. The main contributions are: 1) a channel attention mechanism that refactors the extracted feature maps, paying more attention to informative feature maps and suppressing useless ones; 2) a ResStage group structure that divides the original residual block into three stages and stabilizes the signal before each stage with a batch normalization (BN) operation to enhance the feature channels. Comparison experiments are conducted on a public dataset provided by The Hong Kong Polytechnic University. Experimental results show that the proposed method achieves a rank-1 accuracy of 98.17% on low-resolution (12 dpi) images, clearly outperforming all compared methods.
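A squeeze-and-excitation style gate is one common form of the channel attention described here; a minimal NumPy sketch (the weight shapes `w1`/`w2` and the implied reduction ratio are assumptions, not the paper's exact design):

```python
import numpy as np

def channel_attention(feature_maps, w1, w2):
    """Squeeze-and-excitation style channel attention.
    feature_maps: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    squeeze = feature_maps.mean(axis=(1, 2))          # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)            # bottleneck FC + ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate per channel
    return feature_maps * weights[:, None, None]      # rescale each channel
```

Because each gate value lies in (0, 1), informative channels are passed through nearly unchanged while uninformative ones are attenuated.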

Deep Local Multi-level Feature Aggregation Based High-speed Train Image Matching

  • Li, Jun;Li, Xiang;Wei, Yifei;Wang, Xiaojun
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 5 / pp.1597-1610 / 2022
  • At present, the main approach to high-speed train chassis inspection uses computer vision: keypoints are first extracted from two related chassis images, these keypoints are matched to find the pixel-level correspondence between the images, and defect detection and other steps are then performed. The quality and accuracy of image matching are critical for the subsequent defect detection. Traditional matching methods struggle to generalize to complex scene changes such as weather, illumination, and season, so studying high-speed train image matching based on deep learning is of great significance. This paper establishes a high-speed train chassis image matching dataset that includes random perspective changes and optical distortion to simulate, as closely as possible, the actual working environment of the high-speed rail system. This work designs a convolutional neural network that densely extracts keypoints, so as to alleviate the problems of current methods. Using multi-level features, the network both restores low-level details, improving keypoint localization accuracy, and generates robust keypoint descriptors. Detailed experiments show a substantial improvement of the proposed network over traditional methods.
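Once descriptors are extracted, pixel-level correspondence is typically found by nearest-neighbour matching; a minimal sketch using Lowe's ratio test (the threshold value is an assumption; the paper's matching rule is not specified in the abstract):

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match descriptors from image A to image B, keeping a match only when
    the best B-candidate is clearly closer than the second best (ratio test)."""
    # pairwise L2 distances, shape (len(desc_a), len(desc_b))
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(d):
        order = np.argsort(row)
        best, second = order[0], order[1]
        if row[best] < ratio * row[second]:   # reject ambiguous matches
            matches.append((i, int(best)))
    return matches
```

The ratio test discards ambiguous correspondences, which matters for chassis images where repetitive structures produce many near-duplicate descriptors.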

A study on the characteristics of applying oversampling algorithms to Fosberg Fire-Weather Index (FFWI) data

  • Sang Yeob Kim;Dongsoo Lee;Jung-Doung Yu;Hyung-Koo Yoon
    • Smart Structures and Systems / Vol. 34, No. 1 / pp.9-15 / 2024
  • Oversampling algorithms are machine learning methods for addressing constraints on data quantity. This study explores how reliability varies as data volume is progressively increased through oversampling. For this purpose, the synthetic minority oversampling technique (SMOTE) and the borderline synthetic minority oversampling technique (BSMOTE) are chosen. The input data, comprising air temperature, humidity, and wind speed, are the parameters of the Fosberg Fire-Weather Index (FFWI). Starting from a base of 52 entries, new datasets are generated by increasing the data volume in 10% increments up to a total increase of 100%. The augmented data are then used to predict the FFWI with a deep neural network, and the coefficient of determination (R2) is calculated for predictions made with both the original and augmented datasets. The results suggest that increasing the data volume by more than 50% of the original dataset yields more reliable outcomes. This study introduces a methodology for establishing a standard for data augmentation when employing oversampling algorithms, as well as a means of assessing reliability.
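Basic SMOTE generates synthetic samples by interpolating each sample toward one of its k nearest neighbours; a minimal NumPy sketch (parameter names are illustrative, and BSMOTE adds a borderline-selection step not shown here):

```python
import numpy as np

def smote(samples, n_new, k=3, rng=None):
    """Generate n_new synthetic samples by linear interpolation between
    each chosen sample and one of its k nearest neighbours (basic SMOTE)."""
    rng = np.random.default_rng(rng)
    n = len(samples)
    d = np.linalg.norm(samples[:, None] - samples[None, :], axis=2)
    np.fill_diagonal(d, np.inf)                   # exclude self-distance
    neighbours = np.argsort(d, axis=1)[:, :k]     # k nearest per sample
    new = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = rng.choice(neighbours[i])
        gap = rng.random()                        # interpolation factor in [0, 1)
        new.append(samples[i] + gap * (samples[j] - samples[i]))
    return np.array(new)
```

Since every synthetic point lies on a segment between two real points, the augmented set stays inside the convex hull of the original data, which is why augmentation alone cannot introduce genuinely new operating conditions.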

An Efficient Disease Inspection Model for Untrained Crops Using VGG16

  • 정석봉;윤협상
    • Journal of the Korea Society for Simulation / Vol. 29, No. 4 / pp.1-7 / 2020
  • Early diagnosis of crop diseases plays an important role in suppressing their spread and increasing agricultural productivity. Recently, many studies have diagnosed crop diseases by analyzing crop leaf image datasets with deep learning techniques such as convolutional neural networks (CNNs). Such studies can classify crop diseases with over 90% accuracy, but they cannot diagnose diseases of crops that were not part of the training data. This study proposes a model that efficiently diagnoses disease in untrained crops. To this end, we first build a crop disease classifier (CDC) based on VGG16 and train it on the PlantVillage dataset. We then propose a modified disease classifier (mCDC) that can diagnose diseases of untrained crops. Experiments confirm that the proposed mCDC outperforms the original CDC in diagnosing diseases of untrained crops.
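The abstract does not detail how the modified classifier flags untrained crops; one common mechanism for rejecting unseen classes, shown here purely as an illustration, is thresholding the maximum softmax score (the threshold value is an assumption):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def diagnose(logits, threshold=0.6):
    """Return the predicted class index, or -1 when the classifier is
    not confident enough (i.e. the input may be an untrained crop)."""
    p = softmax(np.asarray(logits, dtype=float))
    k = int(np.argmax(p))
    return k if p[k] >= threshold else -1
```

A confident prediction passes through unchanged; a near-uniform output, typical for inputs unlike anything seen in training, is rejected.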

Comparison of Off-the-Shelf DCNN Models for Extracting Bark Features and Tree Species Recognition Using a Multi-layer Perceptron

  • 김민기
    • Journal of Korea Multimedia Society / Vol. 23, No. 9 / pp.1155-1163 / 2020
  • Deep learning is emerging as a way to improve the accuracy of tree species identification from bark images. However, the approach has not been studied enough because of the difficulty of acquiring a large bark image dataset. This study addresses the problem by utilizing pretrained off-the-shelf DCNN models. It compares the discriminative power of the bark features extracted by each DCNN model, then extracts features with the selected model and feeds them to a multi-layer perceptron (MLP). We found that the ResNet50 model is effective for extracting bark features and that the MLP can be trained well on features reduced by principal component analysis. The proposed approach achieves accuracies of 99.1% and 98.4% on the BarkTex and Trunk12 datasets, respectively.
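Reducing the extracted DCNN features by principal component analysis before feeding the MLP can be sketched via the SVD of the centered feature matrix (a generic PCA sketch, not the paper's exact configuration):

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-wise features onto their top principal components.
    Returns (reduced features, components, feature mean)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # rows of vt are the principal directions, ordered by variance
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, components, mean
```

The reduced vectors (plus the stored components and mean for projecting new samples) are what a downstream MLP would train on.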

Study on a Video Stabilizer Based on a Triplet CNN and Training Dataset Synthesis

  • 양병호;이명진
    • Journal of Broadcast Engineering / Vol. 25, No. 3 / pp.428-438 / 2020
  • Shaking within a video degrades its visibility and reduces the efficiency of image processing and video compression. Deep learning has recently been applied in earnest to digital image processing, but its application to video stabilization is still at an early stage. This paper proposes a triplet CNN-based video stabilizer architecture to reduce wobbling distortion, along with a training-data synthesis method for training the stabilizer. The proposed CNN-based stabilizer was compared with an existing deep learning-based video stabilizer, showing reduced wobbling distortion and more stable training.
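Training-data synthesis for a stabilizer typically perturbs stable footage to create (stable, shaky) pairs with known ground-truth motion; a minimal sketch using random per-frame integer shifts (the perturbation model is an assumption, and the paper's synthesis method is certainly more elaborate):

```python
import numpy as np

def synthesize_shaky(frames, max_shift=2, rng=None):
    """Create a shaky copy of a stable frame sequence by applying a random
    integer (dy, dx) shift per frame; np.roll keeps the sketch simple.
    Returns the shaky frames and the applied offsets (the training targets)."""
    rng = np.random.default_rng(rng)
    shaky, offsets = [], []
    for f in frames:
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        shaky.append(np.roll(f, (int(dy), int(dx)), axis=(0, 1)))
        offsets.append((int(dy), int(dx)))
    return shaky, offsets
```

Because the offsets are recorded, the network can be supervised to predict (and undo) the exact perturbation that was applied.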

Content-Aware Convolutional Neural Network for Object Recognition Task

  • Poernomo, Alvin;Kang, Dae-Ki
    • International journal of advanced smart convergence / Vol. 5, No. 3 / pp.1-7 / 2016
  • In existing convolutional neural networks (CNNs) for object recognition, few efforts are known to reduce noise in the input images. Both convolution and pooling layers perform feature extraction without considering noise, treating all pixels as equally important. In computer vision, however, pixel importance weighting has been studied: seam carving resizes an image by sacrificing the least important pixels, leaving only the most important ones. We propose a new way to combine the seam carving approach with an existing CNN model for object recognition. We attempt to remove the noise, i.e. the "unimportant" pixels, from the image before convolution and pooling, in order to obtain better feature representations. Our model shows promising results on the CIFAR-10 dataset.
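Seam carving ranks pixels by an energy map and removes the lowest-energy connected seam; a minimal sketch with a gradient-magnitude energy and the standard dynamic-programming seam search:

```python
import numpy as np

def energy_map(gray):
    """Simple gradient-magnitude energy: low energy = 'unimportant' pixel."""
    gx = np.abs(np.diff(gray, axis=1, append=gray[:, -1:]))
    gy = np.abs(np.diff(gray, axis=0, append=gray[-1:, :]))
    return gx + gy

def min_seam(energy):
    """Return the column index per row of the minimum-energy vertical seam."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for r in range(1, h):                      # accumulate minimal path cost
        for c in range(w):
            lo, hi = max(c - 1, 0), min(c + 2, w)
            cost[r, c] += cost[r - 1, lo:hi].min()
    seam = [int(np.argmin(cost[-1]))]          # backtrack from the bottom row
    for r in range(h - 2, -1, -1):
        c = seam[-1]
        lo, hi = max(c - 1, 0), min(c + 2, w)
        seam.append(lo + int(np.argmin(cost[r, lo:hi])))
    return seam[::-1]
```

Removing the returned seam from each row deletes one pixel per row while preserving high-energy (informative) regions, which is the preprocessing idea the paper applies before convolution and pooling.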

Document Summarization Model Based on General Context in RNN

  • Kim, Heechan;Lee, Soowon
    • Journal of Information Processing Systems / Vol. 15, No. 6 / pp.1378-1391 / 2019
  • In recent years, automatic document summarization has been widely studied in natural language processing, thanks to remarkable developments in deep learning models. To decode a word, existing models for abstractive summarization usually represent the context of a document as a weighted sum of the hidden states of the input words. Because the weights change at each decoding step, they reflect only the local context of the document, making it difficult to generate a summary that reflects the overall context. To solve this problem, we introduce the notion of a general context, which reflects the overall context of the document independently of each decoding step, and propose a summarization model based on it. Experimental results on the CNN/Daily Mail dataset show that the proposed model outperforms existing models.
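The distinction between a step-dependent (local) context and a step-independent general context can be shown in a few lines; the uniform-mean general context below is one simple instantiation for illustration, not necessarily the paper's exact formulation:

```python
import numpy as np

def local_context(hidden, attn_scores):
    """Per-step context: attention weights change at every decoding step.
    hidden: (T, d) encoder states; attn_scores: (T,) scores for this step."""
    w = np.exp(attn_scores - attn_scores.max())   # softmax over input positions
    w = w / w.sum()
    return w @ hidden                             # weighted sum -> (d,)

def general_context(hidden):
    """Step-independent summary of the whole document (uniform weights)."""
    return hidden.mean(axis=0)                    # -> (d,)
```

The local context is recomputed with new scores at each decoding step, whereas the general context is computed once and stays fixed for the whole summary.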

Lightweight image classifier for CIFAR-10

  • Sharma, Akshay Kumar;Rana, Amrita;Kim, Kyung Ki
    • Journal of Sensor Science and Technology / Vol. 30, No. 5 / pp.286-289 / 2021
  • Image classification is one of the fundamental applications of computer vision, enabling a system to identify an object in an image. Recently, image classification applications have broadened their scope from computer applications to edge devices. The convolutional neural network (CNN) is the main class of deep learning neural networks widely used for such tasks, and it delivers high accuracy. However, CNN algorithms use a large number of parameters and incur high computational costs, which hinder their implementation on edge hardware devices. To address this issue, this paper proposes a lightweight image classifier that provides good accuracy while using fewer parameters. The proposed classifier splits the input into three paths and utilizes different scales of receptive fields to extract more feature maps while using fewer parameters during training, resulting in a small model. Tested on the CIFAR-10 dataset, the model achieves an accuracy of 90% with 0.26M parameters, which is better than state-of-the-art models, and it can be implemented on edge devices.
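The parameter saving from several narrow small-kernel paths versus one wide large-kernel layer is simple arithmetic; the channel counts and kernel sizes below are illustrative, not the paper's configuration:

```python
def conv_params(in_ch, out_ch, k):
    """Weight count of a k x k convolution layer (bias terms ignored)."""
    return in_ch * out_ch * k * k

# one wide path with a single large kernel
single = conv_params(64, 96, 7)
# three narrow paths with different receptive field scales (1x1, 3x3, 5x5)
multi = (conv_params(64, 32, 1)
         + conv_params(64, 32, 3)
         + conv_params(64, 32, 5))
```

With these assumed sizes, the three-path design produces the same 96 output channels with roughly a quarter of the weights, which is the kind of trade-off that makes multi-path receptive fields attractive on edge devices.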