• Title/Summary/Keyword: Public dataset

Image Semantic Segmentation Using Improved ENet Network

  • Dong, Chaoxian
    • Journal of Information Processing Systems, v.17 no.5, pp.892-904, 2021
  • An image semantic segmentation model based on an improved ENet network is proposed to address the low accuracy of image semantic segmentation in complex environments. First, pruning and convolution optimization are applied to the ENet network: the network structure is adjusted for better segmentation results by reducing the convolution operations in the decoder and introducing a bottleneck convolution structure. A squeeze-and-excitation (SE) module is then integrated into the optimized ENet network, which improves segmentation accuracy for small-scale targets by automatically learning the importance of each feature channel. Finally, the method is verified experimentally on a public dataset. It outperforms the existing comparison methods in mean pixel accuracy (MPA) and mean intersection over union (MIoU), and it maintains both segmentation accuracy and operating efficiency within a short running time.
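
The two metrics named in this abstract can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions of MPA and MIoU; the function name `mpa_and_miou` and the toy counts are hypothetical and are not taken from the paper.

```python
def mpa_and_miou(confusion):
    """Return (MPA, MIoU) for a square confusion matrix.

    confusion[i][j] counts pixels of true class i predicted as class j.
    Classes with a zero denominator are skipped (an assumption of this sketch).
    """
    n = len(confusion)
    accuracies, ious = [], []
    for c in range(n):
        true_pos = confusion[c][c]
        actual = sum(confusion[c])                          # pixels whose true class is c
        predicted = sum(confusion[r][c] for r in range(n))  # pixels predicted as c
        union = actual + predicted - true_pos
        if actual > 0:
            accuracies.append(true_pos / actual)
        if union > 0:
            ious.append(true_pos / union)
    return sum(accuracies) / len(accuracies), sum(ious) / len(ious)


# Toy two-class example with made-up counts (not results from the paper).
toy = [[8, 2],
       [1, 9]]
mpa, miou = mpa_and_miou(toy)
```

MPA averages per-class pixel accuracy, while MIoU also penalizes false positives through the union term, so MIoU is never higher than MPA on the same matrix.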

Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of information and communication convergence engineering, v.19 no.3, pp.148-154, 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among them, speech emotion recognition (SER) is a method of recognizing a speaker's emotions from speech information. Successful SER depends on selecting distinctive features and classifying them appropriately. In this paper, the performance of SER using neural network models (e.g., fully connected network (FCN), convolutional neural network (CNN)) with Mel-frequency cepstral coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. On the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, after tuning model parameters, a two-dimensional convolutional neural network (2D-CNN) model with MFCC showed the best performance, with an average accuracy of 88.54% for five emotions (anger, happiness, calm, fear, and sadness) across male and female speakers. In addition, an examination of the distribution of emotion recognition accuracies across neural network models indicates that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.

A Fall Detection Technique using Features from Multiple Sliding Windows

  • Pant, Sudarshan;Kim, Jinsoo;Lee, Sangdon
    • Smart Media Journal, v.7 no.4, pp.79-89, 2018
  • In recent years, falls among elderly people have gained serious attention as a major cause of injuries. Falls often lead to fatal consequences due to the lack of prompt response and rescue. Therefore, a more accurate fall detection system and an effective feature extraction technique are required to prevent and reduce the risk of such incidents. In this paper, we propose an efficient feature extraction technique based on multiple sliding windows and validate it through a series of experiments using supervised learning algorithms. The experiments were conducted using public datasets obtained from tri-axial accelerometers. The results show that extracting features from adjacent sliding windows leads to high accuracy in supervised machine-learning-based fall detection. The experiments also suggest that the best accuracy can be achieved with a window size as small as 2 seconds. With the kNN classifier and a dataset from wearable sensors, the experiments achieved an accuracy of 94%.
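
The idea of building one feature vector from adjacent windows can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which statistics are extracted or whether windows overlap, so the per-window features (mean, standard deviation, and peak of acceleration magnitude), the non-overlapping 2-second windows, and the names `window_features` and `adjacent_window_features` are all hypothetical.

```python
import math
from statistics import mean, pstdev


def window_features(samples):
    """Mean, standard deviation, and peak of acceleration magnitude in one window."""
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    return [mean(mags), pstdev(mags), max(mags)]


def adjacent_window_features(signal, rate_hz, window_s=2.0):
    """Split the signal into non-overlapping windows (an assumption of this
    sketch) and concatenate the features of each pair of adjacent windows."""
    size = int(rate_hz * window_s)
    windows = [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]
    per_window = [window_features(w) for w in windows]
    return [a + b for a, b in zip(per_window, per_window[1:])]


# Synthetic tri-axial trace at 10 Hz: 4 s at rest (about 1 g), then a 2 s burst.
rest = [(0.0, 0.0, 1.0)] * 40
burst = [(2.0, 0.0, 1.0)] * 20
features = adjacent_window_features(rest + burst, rate_hz=10)
```

Each resulting vector could then be passed to a supervised classifier such as kNN; pairing a window with its neighbor lets the classifier see the change between "before" and "after", not just one window in isolation.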

Integrated Navigation Algorithm using Velocity Incremental Vector Approach with ORB-SLAM and Inertial Measurement (속도증분벡터를 활용한 ORB-SLAM 및 관성항법 결합 알고리즘 연구)

  • Kim, Yeonjo;Son, Hyunjin;Lee, Young Jae;Sung, Sangkyung
    • The Transactions of The Korean Institute of Electrical Engineers, v.68 no.1, pp.189-198, 2019
  • In recent years, visual-inertial odometry (VIO) algorithms have been extensively studied for indoor and urban environments because they are more robust to dynamic scenes and environmental changes. In this paper, we propose a loosely coupled (LC) VIO algorithm that uses the velocity vectors from both visual odometry (VO) and an inertial measurement unit (IMU) as the filter measurement of an extended Kalman filter. Our approach improves the estimation performance of the filter without adding extra sensors while maintaining a simple integration framework that treats VO as a black box. For the VO algorithm, we employ a fundamental part of ORB-SLAM, which uses ORB features. We performed an outdoor experiment using an RGB-D camera to evaluate the accuracy of the presented algorithm. We also evaluated our algorithm on a public dataset to compare it with other visual navigation systems.

L1-norm Minimization based Sparse Approximation Method of EEG for Epileptic Seizure Detection

  • Shin, Younghak;Seong, Jin-Taek
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology, v.12 no.5, pp.521-528, 2019
  • Epilepsy is one of the most prevalent neurological diseases. Electroencephalogram (EEG) signals are widely used as a monitoring and diagnosis tool for epileptic seizures. Typically, a huge amount of EEG signals is needed, and they are visually examined by experienced clinicians. In this study, we propose a simple automatic seizure detection framework using intracranial EEG signals. We suggest a sparse approximation based classification (SAC) scheme that solves an overdetermined system. L1-norm minimization algorithms are used for efficient sparse signal recovery. To evaluate the proposed scheme, a public EEG dataset obtained from five healthy subjects and five epileptic patients is used. The results show that the proposed fast L1-norm minimization based SAC methods achieve 99.5% classification accuracy, a 1% improvement over the conventional L2-norm based method, with a negligibly increased execution time (42 msec).

Breast Cancer Classification Using Convolutional Neural Network

  • Alshanbari, Eman;Alamri, Hanaa;Alzahrani, Walaa;Alghamdi, Manal
    • International Journal of Computer Science & Network Security, v.21 no.6, pp.101-106, 2021
  • Breast cancer is the leading cause of cancer deaths in women, and identifying the type of breast cancer in its early stages can help prevent the dangers of later stages. The performance of deep learning depends on a large amount of labeled data. This paper presents a convolutional neural network for classifying breast cancer images as benign or malignant. The network contains 11 layers and ends with a softmax output. In experiments on the public BreakHis dataset, the proposed method outperformed state-of-the-art methods.

X-Ray Security Checkpoint System Using Storage Media Detection Method Based on Deep Learning for Information Security

  • Lee, Han-Sung;Kim, Kang-San;Kim, Won-Chan;Woo, Tea-Kun;Jung, Se-Hoon
    • Journal of Korea Multimedia Society, v.25 no.10, pp.1433-1447, 2022
  • Recently, as the demand for physical security technology to prevent leakage of the technical and business information of companies and public institutions has increased, high-tech companies have been operating X-ray security checkpoints at building entrances to protect their intellectual property and technology. X-ray security checkpoints are operated to detect cameras and storage media that could store or leak important technologies in the bags of people entering and leaving the building. In this study, we propose an X-ray security checkpoint system that automatically detects storage media in X-ray images using a deep-learning-based object detection method. The proposed system consists of an edge computing unit and a cloud computing unit. We employ RetinaNet for automatic storage media detection in X-ray security checkpoint images. The proposed approach achieved an mAP of 95.92% on a private dataset.

Line-Based SLAM Using Vanishing Point Measurements Loss Function (소실점 정보의 Loss 함수를 이용한 특징선 기반 SLAM)

  • Hyunjun Lim;Hyun Myung
    • The Journal of Korea Robotics Society, v.18 no.3, pp.330-336, 2023
  • In this paper, a novel line-based simultaneous localization and mapping (SLAM) method using a loss function for vanishing point measurements is proposed. In general, the Huber norm is used as the loss function for point and line features in feature-based SLAM. Because point and line feature measurements define the reprojection error in the image plane as the residual, linear loss functions such as the Huber norm are used. However, these typical loss functions are not suitable for vanishing point measurements, which suffer from unbounded problems. To tackle this problem, we propose a loss function for vanishing point measurements that is based on the unit sphere model. Finally, we validate the loss function for vanishing points through experiments on a public dataset.
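
For reference, the Huber loss that this abstract names as the usual choice for point and line residuals can be written in a few lines. This sketch shows only the standard Huber function with a hypothetical threshold `delta`; it does not reproduce the vanishing-point loss proposed in the paper, whose form the abstract does not give.

```python
def huber(residual, delta=1.0):
    """Standard Huber loss: quadratic for |residual| <= delta, linear beyond it."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)


small = huber(0.5)   # falls in the quadratic region
large = huber(3.0)   # falls in the linear region
```

The linear tail limits the influence of outliers relative to a pure squared error, but the loss still grows without bound as the residual grows.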

A Contrastive Learning Framework for Weakly Supervised Video Anomaly Detection

  • Hyeon Jeong Park;Je Hyeong Hong
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2022.11a, pp.171-174, 2022
  • Weakly supervised learning is a widely adopted approach in video anomaly detection in which only video-level labels are used instead of expensive frame-level annotations. Since the success of multi-instance learning (MIL), almost all recent approaches have been based on maximizing the margin between the set of abnormal video snippets and that of normal video snippets. In this work, we present a simple contrastive approach for weakly supervised video anomaly detection (WS-VAD) that aims to enhance the performance of existing models. The method is generic and introduces a loss function that encourages output features from the same video class to attract and those from different video classes to repel. Experimental results demonstrate that our method can be applied to existing algorithms to improve detection accuracy on a public video anomaly dataset.
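
The attract-and-repel behavior described here can be illustrated with a classic margin-based pairwise contrastive loss. This is a swapped-in textbook formulation, not the loss from the paper, which the abstract does not specify; the function name `pairwise_contrastive_loss`, the `margin` parameter, and the toy features are hypothetical.

```python
import math


def pairwise_contrastive_loss(features, labels, margin=1.0):
    """Average margin-based contrastive loss over all feature pairs.

    Same-label pairs are penalized by squared distance (attraction);
    different-label pairs are penalized only when closer than margin (repulsion).
    """
    total, pairs = 0.0, 0
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            dist = math.dist(features[i], features[j])
            if labels[i] == labels[j]:
                total += dist ** 2
            else:
                total += max(0.0, margin - dist) ** 2
            pairs += 1
    return total / pairs


# Made-up 2-D features: two coincide at the origin, one lies far away.
feats = [(0.0, 0.0), (0.0, 0.0), (3.0, 4.0)]
well_separated = pairwise_contrastive_loss(feats, [0, 0, 1])  # classes already apart
mixed = pairwise_contrastive_loss(feats, [0, 1, 1])           # classes entangled
```

The loss is zero when same-class features coincide and different-class features are at least `margin` apart, and it grows when the classes are entangled.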

Image Enhancement for Visual SLAM in Low Illumination (저조도 환경에서 Visual SLAM을 위한 이미지 개선 방법)

  • Donggil You;Jihoon Jung;Hyeongjun Jeon;Changwan Han;Ilwoo Park;Junghyun Oh
    • The Journal of Korea Robotics Society, v.18 no.1, pp.66-71, 2023
  • As cameras have become primary sensors for mobile robots, vision-based simultaneous localization and mapping (SLAM) has achieved impressive results with the recent development of computer vision and deep learning. However, visual information has the disadvantage that much of it is lost in a low-light environment. To overcome this problem, we propose an image enhancement method for performing visual SLAM in low-light environments. The quality of low-light images is improved using deep generative adversarial models and modified gamma correction. The output of the proposed method is less sharp than that of the existing method, but the method can be applied to ORB-SLAM in real time because it dramatically reduces the amount of computation. Experiments on the public TUM and VIVID++ datasets validate the proposed method.
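
As background for the gamma correction step mentioned in this abstract, a minimal sketch of plain (unmodified) gamma correction on 8-bit intensities is given below. The paper's modification is not described in the abstract and is not reproduced here; the function name `gamma_correct` and the sample values are hypothetical.

```python
def gamma_correct(pixels, gamma):
    """Apply out = 255 * (in / 255) ** gamma to 8-bit intensities.

    gamma < 1 brightens dark regions; gamma > 1 darkens them.
    """
    return [round(255 * (p / 255) ** gamma) for p in pixels]


dark_row = [0, 16, 64, 255]                     # made-up 8-bit intensities
brightened = gamma_correct(dark_row, gamma=0.5)
```

Because the mapping is a single per-pixel power function, it is cheap to compute, which is consistent with the emphasis on real-time use.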