• Title/Summary/Keyword: deep network

Search Results: 2,983

Dysarthric speaker identification with different degrees of dysarthria severity using deep belief networks

  • Farhadipour, Aref;Veisi, Hadi;Asgari, Mohammad;Keyvanrad, Mohammad Ali
    • ETRI Journal
    • /
    • v.40 no.5
    • /
    • pp.643-652
    • /
    • 2018
  • Dysarthria is a degenerative disorder of the central nervous system that affects the control of articulation and pitch; therefore, it affects the uniqueness of the sound produced by the speaker. Hence, dysarthric speaker recognition is a challenging task. In this paper, a feature-extraction method based on deep belief networks is presented for the task of identifying a speaker suffering from dysarthria. The effectiveness of the proposed method is demonstrated and compared with well-known Mel-frequency cepstral coefficient features. For classification, a multi-layer perceptron neural network with two structures is proposed. Our evaluations using the Universal Access speech database produced promising results that outperformed other baseline methods. In addition, speaker identification under both text-dependent and text-independent conditions is explored. The highest accuracy achieved with the proposed system is 97.3%.
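
A minimal sketch of the kind of classifier described above: a multi-layer perceptron operating on fixed-length feature vectors (stand-ins for DBN-derived or MFCC features). The feature dimension, hidden size, and speaker count are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SpeakerMLP(nn.Module):
    def __init__(self, feat_dim=257, hidden=512, n_speakers=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_speakers),          # one logit per speaker identity
        )

    def forward(self, x):                            # x: (batch, feat_dim)
        return self.net(x)

model = SpeakerMLP()
features = torch.randn(8, 257)                       # stand-in for DBN-extracted features
targets = torch.randint(0, 100, (8,))                # assumed speaker labels
loss = nn.CrossEntropyLoss()(model(features), targets)
```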

Real-time photoplethysmographic heart rate measurement using deep neural network filters

  • Kim, Ji Woon;Park, Sung Min;Choi, Seong Wook
    • ETRI Journal
    • /
    • v.43 no.5
    • /
    • pp.881-890
    • /
    • 2021
  • Photoplethysmography (PPG) is a noninvasive technique that can be used to conveniently measure heart rate (HR) and thus obtain relevant health-related information. However, developing an automated PPG system is challenging because its waveforms are susceptible to motion artifacts and between-patient variation, which makes them difficult to interpret. We use deep neural network (DNN) filters to mimic the cognitive ability of a human expert who can distinguish the features of PPG altered by noise from various sources. Systolic (S), onset (O), and first derivative (W) peaks are recognized by three different DNN filters. In addition, the boundaries of uninformative regions caused by artifacts are identified by two further filters. The algorithm reliably derives the HR and presents recognition scores for the S, O, and W peaks and artifacts with only a 0.7-s delay. In an evaluation using data from 11 patients obtained from PhysioNet, the algorithm yields 8,643 (86.12%) reliable HR measurements from a total of 10,036 heartbeats, including some with uninformative data resulting from arrhythmias and artifacts.
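
A rough sketch, not the published model: a 1D convolutional "filter" that produces a per-sample recognition score for one peak type (e.g., the systolic peak) in a PPG window. The paper uses separate filters for the S, O, and W peaks and for artifact boundaries; the kernel sizes, channel counts, and window length here are assumptions.

```python
import torch
import torch.nn as nn

class PeakFilter(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 1, kernel_size=1),        # per-sample recognition score
            nn.Sigmoid(),
        )

    def forward(self, ppg):                          # ppg: (batch, 1, n_samples)
        return self.net(ppg)

scores = PeakFilter()(torch.randn(1, 1, 500))        # scores for a 500-sample PPG window
```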

Traffic Light Recognition Using a Deep Convolutional Neural Network (심층 합성곱 신경망을 이용한 교통신호등 인식)

  • Kim, Min-Ki
    • Journal of Korea Multimedia Society
    • /
    • v.21 no.11
    • /
    • pp.1244-1253
    • /
    • 2018
  • The color of a traffic light is sensitive to varying illumination conditions; in particular, the hue information is lost when the lighting area becomes oversaturated. This paper proposes a traffic light recognition method that is robust to these illumination variations. The method consists of two steps: traffic light detection and signal recognition. The first step uses only intensity and saturation, deferring the use of hue information to the second step, which recognizes the signal of the traffic light. The second step utilizes a deep learning technique: a deep convolutional neural network (DCNN) composed of three convolutional networks and two fully connected networks. Twelve video clips were used to evaluate the performance of the proposed method. Experimental results show a traffic light detection precision of 93.9%, a recall of 91.6%, and a recognition accuracy of 89.4%. Considering that the maximum distance between the camera and the traffic lights is 70 m, these results show that the proposed method is effective.
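
A hedged sketch of a small DCNN in the spirit described above: three convolutional stages followed by two fully connected layers classifying a cropped traffic light region. The input size, channel counts, and the four signal classes are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class TrafficLightDCNN(nn.Module):
    def __init__(self, n_classes=4):                 # e.g. red, yellow, green, arrow (assumed)
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):                            # x: (batch, 3, 32, 32) detected crops
        return self.classifier(self.features(x))

logits = TrafficLightDCNN()(torch.randn(1, 3, 32, 32))
```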

Application of Deep Recurrent Q Network with Dueling Architecture for Optimal Sepsis Treatment Policy

  • Do, Thanh-Cong;Yang, Hyung Jeong;Ho, Ngoc-Huynh
    • Smart Media Journal
    • /
    • v.10 no.2
    • /
    • pp.48-54
    • /
    • 2021
  • Sepsis is one of the leading causes of mortality globally, and it costs billions of dollars annually. However, treating septic patients is currently highly challenging, and more research is needed into a general treatment method for sepsis. Therefore, in this work, we propose a reinforcement learning method for learning the optimal treatment strategies for septic patients. We model the patient physiological time series data as the input for a deep recurrent Q-network that learns reliable treatment policies. We evaluate our model using an off-policy evaluation method, and the experimental results indicate that it outperforms the physicians' policy, reducing patient mortality by up to 3.04%. Thus, our model can be used as a tool to reduce patient mortality by supporting clinicians in making dynamic decisions.
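
An illustrative sketch of a dueling recurrent Q-network, the general architecture named in the title: an LSTM summarizes the patient's physiological time series, and separate value and advantage streams are combined into Q-values over a discrete treatment space. The state dimension and the 25-action space are assumptions for illustration.

```python
import torch
import torch.nn as nn

class DuelingDRQN(nn.Module):
    def __init__(self, state_dim=48, hidden=128, n_actions=25):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden, batch_first=True)
        self.value = nn.Sequential(nn.Linear(hidden, 64), nn.ReLU(), nn.Linear(64, 1))
        self.advantage = nn.Sequential(nn.Linear(hidden, 64), nn.ReLU(), nn.Linear(64, n_actions))

    def forward(self, states):                        # states: (batch, time, state_dim)
        h, _ = self.lstm(states)
        h = h[:, -1]                                   # summary of the trajectory so far
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=1, keepdim=True)     # dueling aggregation of V and A

q_values = DuelingDRQN()(torch.randn(4, 10, 48))       # one Q-value per candidate treatment
```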

Video Saliency Detection Using Bi-directional LSTM

  • Chi, Yang;Li, Jinjiang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.6
    • /
    • pp.2444-2463
    • /
    • 2020
  • Saliency detection in video allows computing resources to be allocated more rationally, reducing the amount of computation while improving accuracy. Deep learning can extract the edge features of an image, providing technical support for video saliency. This paper proposes a new detection method that combines a convolutional neural network (CNN) and a deep bidirectional LSTM network (DB-LSTM) to learn spatio-temporal features, exploiting object motion information to generate saliency maps for consecutive video frames. We also analyzed the sample database and found that human attention and saliency transitions are time-dependent, so we additionally considered cross-frame saliency detection. Finally, experiments show that our method is superior to other state-of-the-art methods.
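
A rough sketch of the combination described above (an assumed architecture, not the authors' code): per-frame CNN features are passed through a bidirectional LSTM so that the saliency prediction for each frame can use information from both earlier and later frames. All layer sizes and the coarse output resolution are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CNNBiLSTMSaliency(nn.Module):
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                      # tiny per-frame encoder
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8),
            nn.Flatten(), nn.Linear(16 * 8 * 8, feat_dim), nn.ReLU(),
        )
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 64 * 64)     # coarse per-frame saliency map

    def forward(self, frames):                         # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        h, _ = self.bilstm(feats)                      # each frame sees past and future context
        return self.head(h).view(b, t, 64, 64)

maps = CNNBiLSTMSaliency()(torch.randn(1, 5, 3, 128, 128))
```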

Lightweight CNN based Meter Digit Recognition

  • Sharma, Akshay Kumar;Kim, Kyung Ki
    • Journal of Sensor Science and Technology
    • /
    • v.30 no.1
    • /
    • pp.15-19
    • /
    • 2021
  • Image processing is one of the major techniques used for computer vision, and researchers now apply machine learning and deep learning to such tasks. In recent years, digit recognition tasks, i.e., automatic meter reading from electric or water meters, have been studied several times. However, two major issues arise in previous studies: first, the deep learning techniques used involve a large number of parameters, which increases the computational cost and power consumption; and second, recent studies are limited to detecting digits and do not store or provide the detected digits to a database or mobile application. This paper proposes a system that detects the digits of meter readings using a lightweight deep neural network (DNN) for low power consumption and sends those digits to an Android mobile application in real time for storage and convenient access. The proposed lightweight DNN is computationally inexpensive and exhibits accuracy similar to that of conventional DNNs.
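
A minimal sketch of a lightweight digit classifier in the spirit described above: few parameters, depthwise-separable convolutions, and ten output classes (0-9). The layer sizes are assumptions and this is not the published network.

```python
import torch
import torch.nn as nn

def sep_conv(cin, cout):
    """Depthwise + pointwise convolution: far fewer parameters than a full convolution."""
    return nn.Sequential(
        nn.Conv2d(cin, cin, 3, padding=1, groups=cin), nn.ReLU(),
        nn.Conv2d(cin, cout, 1), nn.ReLU(), nn.MaxPool2d(2),
    )

class LightDigitNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            sep_conv(8, 16), sep_conv(16, 32),
            nn.Flatten(), nn.Linear(32 * 4 * 4, 10),   # ten digit classes
        )

    def forward(self, x):                              # x: (batch, 1, 32, 32) digit crops
        return self.net(x)

print(sum(p.numel() for p in LightDigitNet().parameters()))  # parameter count stays small
```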

Implementation of an Autostereoscopic Virtual 3D Button in Non-contact Manner Using Simple Deep Learning Network

  • You, Sang-Hee;Hwang, Min;Kim, Ki-Hoon;Cho, Chang-Suk
    • Journal of Information Processing Systems
    • /
    • v.17 no.3
    • /
    • pp.505-517
    • /
    • 2021
  • This research presents a non-contact implementation of an autostereoscopic virtual three-dimensional (3D) button device. The proposed device combines an autostereoscopic display, non-contact operation, and an artificial intelligence (AI) engine. It was designed to be contactless to prevent virus contamination and consists of 3D buttons presented in a virtual stereoscopic view. To identify the button pressed virtually by fingertip pointing, a simple two-stage deep learning network without convolution filters was designed. As confirmed in the experiments, if the composition of the input data is clearly designed, the deep learning network does not need to be configured so complexly. Testing and evaluation by a certification institute showed that the proposed button device offers high reliability and stability.
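
A minimal sketch, assuming the "two-stage network without convolution filters" is a small fully connected classifier that maps a fingertip-pointing measurement (here assumed to be 3D coordinates) to one of the virtual buttons. The input encoding and the number of buttons are illustrative assumptions.

```python
import torch
import torch.nn as nn

n_buttons = 12                                         # assumed keypad-style layout
button_net = nn.Sequential(
    nn.Linear(3, 32), nn.ReLU(),                       # stage 1: fingertip position (x, y, z)
    nn.Linear(32, n_buttons),                          # stage 2: one logit per virtual button
)

fingertip = torch.tensor([[0.12, -0.30, 0.45]])        # example pointing measurement
pressed = button_net(fingertip).argmax(dim=1)          # index of the selected button
```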

No-Reference Sports Video-Quality Assessment Using 3D Shearlet Transform and Deep Residual Neural Network (3차원 쉐어렛 변환과 심층 잔류 신경망을 이용한 무참조 스포츠 비디오 화질 평가)

  • Lee, Gi Yong;Shin, Seung-Su;Kim, Hyoung-Gook
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.12
    • /
    • pp.1447-1453
    • /
    • 2020
  • In this paper, we propose a method for no-reference quality assessment of sports videos using the 3D shearlet transform and deep residual neural networks. In the proposed method, 3D shearlet transform-based spatiotemporal features are extracted from overlapping video blocks. These features are applied to logistic regression concatenated with a deep residual neural network, based on a conditional video block-wise constraint, to learn the spatiotemporal correlation and predict the quality score. Our evaluation reveals that the proposed method predicts video quality with higher accuracy than conventional no-reference video quality assessment methods.
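
A hedged, simplified sketch of the regression side of such a pipeline: a small residual network over precomputed per-block spatiotemporal features (stand-ins for the 3D shearlet features) ending in a single quality-score output. The feature dimensionality and depth are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return torch.relu(x + self.fc(x))              # residual (skip) connection

class QualityRegressor(nn.Module):
    def __init__(self, feat_dim=256, n_blocks=3):
        super().__init__()
        self.blocks = nn.Sequential(*[ResidualBlock(feat_dim) for _ in range(n_blocks)])
        self.score = nn.Linear(feat_dim, 1)            # predicted quality score

    def forward(self, block_feats):                    # (batch, feat_dim) per video block
        return self.score(self.blocks(block_feats))

score = QualityRegressor()(torch.randn(4, 256))         # one score per video block
```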

SkelGAN: A Font Image Skeletonization Method

  • Ko, Debbie Honghee;Hassan, Ammar Ul;Majeed, Saima;Choi, Jaeyoung
    • Journal of Information Processing Systems
    • /
    • v.17 no.1
    • /
    • pp.1-13
    • /
    • 2021
  • In this research, we study the problem of font image skeletonization using an end-to-end deep adversarial network, SkelGAN, in contrast with state-of-the-art methods that use mathematical algorithms. Several studies have been concerned with skeletonization, but few have utilized deep learning, and no study has considered generative models based on deep neural networks for font character skeletonization, where the characters are more delicate than natural objects. In this work, we take a step closer to producing realistic synthesized skeletons of font characters. The proposed skeleton generator proves superior to all well-known mathematical skeletonization methods in terms of character structure, including delicate strokes, serifs, and even special styles. Experimental results also demonstrate the dominance of our method over a state-of-the-art supervised image-to-image translation method in the font character skeletonization task.
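
An illustrative sketch of a generic image-to-image adversarial setup (not SkelGAN itself): a generator maps a font glyph image to its skeleton, a discriminator judges (glyph, skeleton) pairs, and the generator is trained with an adversarial plus L1 reconstruction loss. Image size, layer widths, and the L1 weight are assumptions.

```python
import torch
import torch.nn as nn

G = nn.Sequential(                                     # generator: glyph -> skeleton image
    nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
)
D = nn.Sequential(                                     # discriminator on (glyph, skeleton) pairs
    nn.Conv2d(2, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(32, 1, 4, stride=2, padding=1),          # patch-wise real/fake logits
)

glyph, skeleton = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
fake = G(glyph)
d_fake = D(torch.cat([glyph, fake], dim=1))
g_loss = (nn.functional.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
          + 100 * nn.functional.l1_loss(fake, skeleton))   # adversarial + L1 reconstruction term
```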

Efficient Driver Attention Monitoring Using Pre-Trained Deep Convolution Neural Network Models

  • Kim, JongBae
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.14 no.2
    • /
    • pp.119-128
    • /
    • 2022
  • Recently, with the development of related technologies for autonomous vehicles, driving is becoming safer. However, the supporting technologies for level 5 fully autonomous driving are still insufficient; that is, even in an autonomous vehicle, the driver still needs to keep his or her attention on the road ahead while driving. In this paper, we propose a method to monitor the driving task by recognizing driver behavior. The proposed method uses pre-trained deep convolutional neural network (DCNN) models to recognize whether the driver's face or body shows unnecessary movement. The use of pre-trained DCNN models enables high accuracy in a relatively short time and overcomes the limitation of having only a small amount of driver behavior training data. The proposed method can be applied to intelligent vehicle safety driving support systems, such as drowsy driving detection and abnormal driving detection.
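
A minimal transfer-learning sketch consistent with the idea above: start from a CNN pre-trained on ImageNet and replace only the final layer to classify driver behaviors. The ResNet18 backbone and the list of behavior classes are assumptions, not the paper's choices.

```python
import torch
import torch.nn as nn
from torchvision import models

behaviors = ["attentive", "phone_use", "looking_away", "drowsy"]    # assumed behavior classes

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)    # pre-trained backbone
for p in model.parameters():
    p.requires_grad = False                                         # freeze the learned features
model.fc = nn.Linear(model.fc.in_features, len(behaviors))          # new, trainable classification head

logits = model(torch.randn(1, 3, 224, 224))                         # a driver-facing camera frame
print(behaviors[logits.argmax(dim=1).item()])
```

Because only the small final layer is trained, this setup can reach useful accuracy with relatively little driver behavior data, which matches the motivation given in the abstract.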