• Title/Summary/Keyword: Residual Network (ResNet)

Spatio-Temporal Residual Networks for Slide Transition Detection in Lecture Videos

  • Liu, Zhijin;Li, Kai;Shen, Liquan;Ma, Ran;An, Ping
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.4026-4040 / 2019
  • In this paper, we present an approach for detecting slide transitions in lecture videos using spatio-temporal residual networks. Given a lecture video in which multiple cameras record the digital slides, the speaker, and the audience, our goal is to find the keyframes where the slide content changes. Because temporal dependency among video frames is important for detecting slide changes, 3D convolutional networks (3D ConvNets) are regarded as an efficient way to learn spatio-temporal features from videos. However, a 3D ConvNet takes a long time to train and needs a large amount of memory. We therefore use ResNet, which is easy to optimize, to ease the training of the network. The result is a novel ConvNet architecture based on 3D ConvNet and ResNet for slide transition detection in lecture videos. Experimental results show that the proposed architecture achieves better accuracy than other slide progression detection approaches.
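
The abstract credits ResNet with making the network easier to optimize. As a rough illustration of the residual idea only, the sketch below computes y = x + F(x) on a plain list of numbers. The names `residual_block`, `toy_transform`, and `zero_transform` are hypothetical, and the toy transform stands in for the 3D convolutional layers that the paper's architecture would actually use.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return x + F(x): the identity shortcut added to the learned transform."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("shortcut and transform outputs must have the same shape")
    return [xi + fi for xi, fi in zip(x, fx)]


def toy_transform(x: Vector) -> Vector:
    """Hypothetical stand-in for the convolutional layers: scale by 0.5, then ReLU."""
    return [max(0.0, 0.5 * xi) for xi in x]


def zero_transform(x: Vector) -> Vector:
    """A transform whose output is all zeros, as if its weights were zero."""
    return [0.0 for _ in x]


features = [1.0, -2.0, 4.0]
out = residual_block(features, toy_transform)
identity_out = residual_block(features, zero_transform)
```

When the transform contributes nothing, the block reduces to the identity, so stacking more blocks does not have to make the mapping harder to fit. This is the usual intuition for why residual networks train more easily.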

A ResNet based multiscale feature extraction for classifying multi-variate medical time series

  • Zhu, Junke;Sun, Le;Wang, Yilin;Subramani, Sudha;Peng, Dandan;Nicolas, Shangwe Charmant
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.5 / pp.1431-1445 / 2022
  • We construct a deep neural network model named ECGResNet. The model diagnoses eight common cardiovascular diseases from 12-lead ECG data with high accuracy. We chose the 16 blocks of ResNet50 as the main body of the model and added a Squeeze-and-Excitation (SE) module to adaptively learn the information shared between channels. As our feature extraction method, we replaced the first convolutional layer of ResNet50, which has a kernel size of 7, with a superposition of convolutional kernels of sizes 8 and 16. This allows the model to follow the overall trend of the ECG signal while also noticing subtle changes. The model further improves the accuracy of cardiovascular and cerebrovascular disease classification by using a fully connected layer that integrates factors such as gender and age. ECGResNet adds Dropout layers to both the residual block and the SE module of ResNet50 to further reduce overfitting. The model was trained using five-fold cross-validation and the Flooding training method, reaching an accuracy of 95% and an F1-score of 0.841 on the test set. In summary, we design a new deep neural network, introduce a multi-scale feature extraction method, and apply the SE module to extract features from ECG data.
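
The Squeeze-and-Excitation step mentioned above can be sketched in a few lines: average each channel over time (squeeze), pass the averages through a small bottleneck with a sigmoid (excitation), and rescale each channel by the resulting gate. This is a minimal sketch of the generic SE idea, not the ECGResNet implementation; the function names, the two-channel input, and the weight matrices `w1` and `w2` are made up for the example.

```python
import math
from typing import List

Matrix = List[List[float]]


def matvec(w: Matrix, v: List[float]) -> List[float]:
    """Multiply matrix w (rows x cols) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def squeeze_excite(x: Matrix, w1: Matrix, w2: Matrix) -> Matrix:
    """x is channels x time; each channel is rescaled by a gate in (0, 1)."""
    squeezed = [sum(ch) / len(ch) for ch in x]            # squeeze: mean per channel
    hidden = [max(0.0, h) for h in matvec(w1, squeezed)]  # bottleneck layer + ReLU
    gates = [1.0 / (1.0 + math.exp(-g)) for g in matvec(w2, hidden)]  # sigmoid gates
    return [[g * v for v in ch] for g, ch in zip(gates, x)]


# Two channels, three time steps; all values are illustrative assumptions.
signal = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
w1 = [[1.0, 1.0]]      # 2 channels -> 1 hidden unit
w2 = [[0.0], [1.0]]    # 1 hidden unit -> 2 channel gates
scaled = squeeze_excite(signal, w1, w2)
```

Here the first channel receives a gate of 0.5 and is halved, which shows how the module can reweight channels based on their global statistics.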

Research on Damage Identification of Buried Pipeline Based on Fiber Optic Vibration Signal

  • Weihong Lin;Wei Peng;Yong Kong;Zimin Shen;Yuzhou Du;Leihong Zhang;Dawei Zhang
    • Current Optics and Photonics / v.7 no.5 / pp.511-517 / 2023
  • Pipelines play an important role in urban water supply and drainage, oil and gas transmission, etc. This paper presents a technique for pattern recognition of fiber optic vibration signals collected by a distributed vibration sensing (DVS) system using a deep learning residual network (ResNet). The optical fiber is laid on the pipeline, and the signal is collected by the DVS system and converted into a 64 × 64 single-channel grayscale image. The grayscale image is input into the ResNet to extract features, and finally the K-nearest-neighbors (KNN) algorithm is used to achieve the classification and recognition of pipeline damage.
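
The pipeline described above (signal to 64 × 64 grayscale image, feature extraction, KNN classification) can be sketched as follows. The abstract does not say how the signal is mapped to pixels, so the row-by-row folding and min-max scaling in `signal_to_image` are assumptions. The 2-D feature vectors and the labels "intact" and "damaged" are hypothetical stand-ins for real ResNet features and the paper's damage classes.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def signal_to_image(samples: Sequence[float], size: int = 64) -> List[List[int]]:
    """Min-max scale size*size samples to 0..255 and fold them row by row."""
    if len(samples) != size * size:
        raise ValueError(f"expected {size * size} samples, got {len(samples)}")
    lo, hi = min(samples), max(samples)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
    pixels = [round(255 * (s - lo) / span) for s in samples]
    return [pixels[r * size:(r + 1) * size] for r in range(size)]


def knn_predict(train: List[Tuple[List[float], str]],
                query: List[float], k: int = 3) -> str:
    """Majority vote among the k training features closest to the query."""
    def sq_dist(a: List[float], b: List[float]) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Synthetic signal with 4096 samples, used only to exercise the reshaping.
image = signal_to_image([float(i % 256) for i in range(64 * 64)])

# Hypothetical feature vectors standing in for ResNet outputs.
train = [([0.0, 0.1], "intact"), ([0.2, 0.0], "intact"), ([0.1, 0.2], "intact"),
         ([5.0, 5.1], "damaged"), ([5.2, 4.9], "damaged"), ([4.8, 5.0], "damaged")]
label = knn_predict(train, [4.9, 5.2])
```

In a real system the query and training vectors would come from the network's feature layer rather than being written by hand.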

A Study on the Outlet Blockage Determination Technology of Conveyor System using Deep Learning

  • Jeong, Eui-Han;Suh, Young-Joo;Kim, Dong-Ju
    • Journal of the Korea Society of Computer and Information / v.25 no.5 / pp.11-18 / 2020
  • This study proposes a deep learning technique for determining outlet blockage in a conveyor system. To find the best model for the actual process, we train various CNN models to determine outlet blockage from images collected by CCTV at an industrial site. We used well-known CNN models such as VGGNet, ResNet, DenseNet, and NASNet, with 18,000 CCTV images for model training and performance evaluation. In the experiments, VGGNet showed the best performance, with 99.03% accuracy and a processing time of 29.05 ms, confirming that VGGNet is suitable for determining outlet blockage.

Development of ResNet-based WBC Classification Algorithm Using Super-pixel Image Segmentation

  • Lee, Kyu-Man;Kang, Soon-Ah
    • Journal of the Korea Society of Computer and Information / v.23 no.4 / pp.147-153 / 2018
  • In this paper, we propose an efficient WBC 14-Diff classification method that uses WBC-ResNet-152, a type of CNN model. The main idea is to use super-pixels to segment the WBC images and ResNet to classify the WBCs. A total of 136,164 blood image samples (224x224) were grouped for image segmentation, training, training verification, and final test performance analysis. Because super-pixel image segmentation yields a different number of images for each class, a weighted average was applied, and the resulting image segmentation error was low, at 7.23%. After training 50 times on the training data set with a soft-max classifier, an average TPR of 80.3% was achieved on the training set of 8,827 images. On the verification data set of 21,437 images, the average 14-Diff classification TPR was 93.4% for normal WBCs and 83.3% for abnormal WBCs. The results and methodology of this research demonstrate the usefulness of artificial intelligence in blood cell image classification, and the WBC-ResNet-152-based morphology approach is shown to be a meaningful and worthwhile method. Based on stored medical data, in-depth diagnosis and early detection of curable diseases are expected to improve the quality of treatment.
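
The abstract reports average true positive rates (TPR) and mentions a weighted average for classes with unequal image counts. The sketch below shows one common way to compute per-class TPR and two kinds of average. The abstract does not state which averaging the authors used, so both the unweighted and the support-weighted versions are shown as assumptions, and the two-class labels are illustrative rather than the paper's 14 classes.

```python
from collections import Counter
from typing import Dict, List


def per_class_tpr(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """TPR per class: correctly predicted samples / all samples of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {cls: hits[cls] / n for cls, n in support.items()}


def macro_average(tpr: Dict[str, float]) -> float:
    """Unweighted mean over classes: every class counts equally."""
    return sum(tpr.values()) / len(tpr)


def weighted_average(tpr: Dict[str, float], y_true: List[str]) -> float:
    """Mean over classes weighted by how many samples each class has."""
    support = Counter(y_true)
    return sum(tpr[cls] * support[cls] for cls in tpr) / len(y_true)


# Illustrative labels only; not data from the paper.
y_true = ["neutrophil"] * 4 + ["lymphocyte"] * 2
y_pred = ["neutrophil", "neutrophil", "neutrophil", "lymphocyte",
          "lymphocyte", "neutrophil"]
tpr = per_class_tpr(y_true, y_pred)
macro = macro_average(tpr)
weighted = weighted_average(tpr, y_true)
```

With imbalanced classes the two averages differ, which is why it matters which one a reported "TPR average" refers to.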

Effective Hand Gesture Recognition by Key Frame Selection and 3D Neural Network

  • Hoang, Nguyen Ngoc;Lee, Guee-Sang;Kim, Soo-Hyung;Yang, Hyung-Jeong
    • Smart Media Journal / v.9 no.1 / pp.23-29 / 2020
  • This paper presents an approach to dynamic hand gesture recognition that uses an algorithm based on a 3D Convolutional Neural Network (3D_CNN), later extended to 3D Residual Networks (3D_ResNet), together with neural-network-based key frame selection. Typically, a 3D deep neural network classifies gestures from image frames sampled randomly from the video data. In this work, to improve classification performance, we instead feed the classification network key frames that represent the overall video. The key frames are extracted by SegNet rather than by conventional clustering algorithms for video summarization (VSUMM), which require heavy computation. Using a deep neural network allows key frame selection to run in a real-time system. Experiments are conducted with models that use 3D convolutional kernels, such as 3D_CNN, Inflated 3D_CNN (I3D), and 3D_ResNet, for gesture classification. Our algorithm achieved up to 97.8% classification accuracy on the Cambridge gesture dataset. The experimental results show that the proposed approach is efficient and outperforms existing methods.

Classroom Roll-Call System Based on ResNet Networks

  • Zhu, Jinlong;Yu, Fanhua;Liu, Guangjie;Sun, Mingyu;Zhao, Dong;Geng, Qingtian;Su, Jinbo
    • Journal of Information Processing Systems / v.16 no.5 / pp.1145-1157 / 2020
  • Convolutional neural networks (CNNs) have demonstrated outstanding performance compared to other algorithms in the field of face recognition. To address the over-fitting problem of CNNs, researchers have proposed the residual network, which eases training and improves recognition accuracy. In this study, a novel face recognition model based on game theory is proposed for roll call in the classroom. In the proposed scheme, an image with multiple faces is used as input, and the residual network identifies each face with a confidence score to form a list of student identities. Face tracks that share an identity or have low confidence are taken as the optimisation objective, and the set of game participants is formed from the student identity list. Game theory optimises the authentication strategy according to the confidence value and identity set to improve recognition accuracy. We observed that the proposed scheme has an optimal mapping between faces and identities that avoids associating multiple faces with one identity, and that the proposed game-based scheme reduces the error rate compared to existing schemes with deeper neural networks.

A Deep Neural Network Architecture for Real-Time Semantic Segmentation on Embedded Board (임베디드 보드에서 실시간 의미론적 분할을 위한 심층 신경망 구조)

  • Lee, Junyeop;Lee, Youngwan
    • Journal of KIISE / v.45 no.1 / pp.94-98 / 2018
  • We propose Wide Inception ResNet (WIR Net), an optimized neural network architecture for real-time semantic segmentation in autonomous driving. The architecture consists of an encoder that extracts features using residual connections and inception modules, and a decoder that increases the resolution using transposed convolution and low-layer feature maps. We also improved performance by applying the ELU activation function, and optimized the network by reducing the number of layers and increasing the number of filters. Performance was evaluated on an NVIDIA GeForce GTX 1080 and a TX1 board, measuring class and category IoU on Cityscapes data for the driving environment. The experimental results show a class IoU of 53.4 and a category IoU of 81.8, with execution speeds of 17.8 fps and 13.0 fps on the TX1 board for 640 × 360 and 720 × 480 images, respectively.
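
The IoU figures above follow the usual intersection-over-union definition, TP / (TP + FP + FN), computed per class and then averaged. The sketch below applies that definition to flattened per-pixel labels. The function names and the six-pixel example are illustrative assumptions, and the paper's own evaluation code is not shown in the abstract.

```python
from typing import List, Optional


def class_iou(y_true: List[str], y_pred: List[str], cls: str) -> Optional[float]:
    """IoU for one class over flattened pixel labels; None if the class never occurs."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    union = tp + fp + fn
    return tp / union if union else None


def mean_iou(y_true: List[str], y_pred: List[str]) -> float:
    """Average the per-class IoU over the classes present in the ground truth."""
    scores = [class_iou(y_true, y_pred, cls) for cls in sorted(set(y_true))]
    return sum(scores) / len(scores)


# Six illustrative pixels; not data from the paper.
truth = ["road", "road", "road", "car", "car", "sky"]
pred = ["road", "road", "car", "car", "car", "sky"]
road_iou = class_iou(truth, pred, "road")
miou = mean_iou(truth, pred)
```

A category-level score can be obtained with the same functions by first mapping each class label to its coarser category.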

Dog recognition system using Deep Learning (딥러닝을 이용한 반려견 개체 인식 시스템)

  • Donguk Kim;Jihyeon Lee;Jihyuk Kong;Hwang Kim;Ho-young Kwak
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.519-520 / 2023
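  • As pet registration systems have recently been expanding, this paper studies a method for recognizing individual dogs from images, avoiding the conventional microchip implantation method. An intelligent system that identifies an individual dog by learning from whole-body images was implemented using the ResNet algorithm, and it was trained on collected photographs of individual dogs so that the required individual can be identified.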

Enhanced 3D Residual Network for Human Fall Detection in Video Surveillance

  • Li, Suyuan;Song, Xin;Cao, Jing;Xu, Siyang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.12 / pp.3991-4007 / 2022
  • In public healthcare, a computational system that can automatically and efficiently detect and classify falls from a video sequence has significant potential. With the advancement of deep learning, methods that can extract temporal and spatial information have become more widespread. However, traditional 3D CNNs, which usually adopt shallow networks, cannot reach the recognition accuracy of deeper networks. In addition, experience with neural networks shows that gradient explosion occurs as the number of network layers increases. To address these issues, an enhanced three-dimensional ResNet-based method for fall detection (3D-ERes-FD) is proposed that directly extracts spatio-temporal features. In our method, a 50-layer 3D residual network is used to deepen the network and improve fall recognition accuracy. Furthermore, enhanced residual units with four convolutional layers are developed to efficiently reduce the number of parameters and increase the depth of the network. According to the experimental results, the proposed method outperformed several state-of-the-art methods.