Search | Korea Science

Audio Event Detection Based on Attention CRNN (Attention CRNN에 기반한 오디오 이벤트 검출)

Kwak, Jin-Yeol;Chung, Yong-Joo
- The Journal of the Korea institute of electronic communication sciences / v.15 no.3 / pp.465-472 / 2020
Recently, various deep neural network-based methods have been proposed for audio event detection. In this study, we improved the performance of audio event detection by applying an attention approach to a baseline CRNN. We applied context gating at the input of the baseline CRNN and added an attention layer at the output. We further improved the performance of the attention-based CRNN by using strongly labeled audio data annotated at the frame level in addition to weakly labeled data annotated at the clip level. In audio event detection experiments using the audio data from Task 4 of the DCASE 2018/2019 Challenge, the proposed attention-based CRNN achieved up to a 66% relative increase in F-score over the baseline CRNN.
https://doi.org/10.13067/JKIECS.2020.15.3.465

Solar Energy Prediction using Environmental Data via Recurrent Neural Network (RNN을 이용한 태양광 에너지 생산 예측)

Liaq, Mudassar;Byun, Yungcheol;Lee, Sang-Joon
- Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.1023-1025 / 2019
Coal and natural gas are the two biggest contributors to energy generation worldwide. Most of these resources cause environmental pollution during energy production, which affects natural habitats. Many approaches have been proposed as alternatives to these sources. One of the leading alternatives is solar energy, which is usually harnessed using solar farms. In artificial intelligence, the most researched area in recent times is machine learning. With machine learning, many tasks that were previously thought to be doable only by humans are now done by machines. Neural networks have two major subtypes: convolutional neural networks (CNNs), which are used primarily for classification, and recurrent neural networks (RNNs), which are used for time-series prediction. In this paper, we predict the energy generated by solar fields and the optimal angles for the solar panels in these farms for the upcoming seven days using environmental and historical data. We experiment with multiple RNN configurations, using vanilla and LSTM (Long Short-Term Memory) RNNs. We achieve an RMSE of 0.20739 using LSTMs.
https://doi.org/10.3745/PKIPS.y2019m10a.1023

Social Media based Real-time Event Detection by using Deep Learning Methods

Nguyen, Van Quan;Yang, Hyung-Jeong;Kim, Young-chul;Kim, Soo-hyung;Kim, Kyungbaek
- Smart Media Journal / v.6 no.3 / pp.41-48 / 2017
Event detection using social media has become widespread as social network services have become an active channel for connecting with others and spreading news. In particular, the real-time nature of social media has created opportunities to support real-time applications and systems. A social network such as Twitter is a potential data source for extracting useful information by mining messages posted by the user community. This paper proposes a novel system for temporal event detection based on analyzing social data. The resulting information can be used by first responders, decision makers, or news agents to gain insight into a situation. The proposed approach takes advantage of deep learning methods, which serve as the core techniques for the two main tasks: identifying informative data in a noisy environment and detecting temporal events. The former is handled by a Convolutional Neural Network model trained on labeled Twitter data. The latter is handled by a Recurrent Neural Network module. We demonstrate our approach and experimental results in a case study of earthquake situations. Our system is more adaptive than systems that use traditional methods, since deep learning extracts features from data without requiring time-consuming manual feature construction. This benefit makes our approach easy to extend to new contexts of practice. Moreover, the proposed system promises to respond with an acceptable delay of several minutes, which makes it a helpful means of supporting news channel agents or relief planning in disaster events.

Exploiting Neural Network for Temporal Multi-variate Air Quality and Pollutant Prediction

Khan, Muneeb A.;Kim, Hyun-chul;Park, Heemin
- Journal of Korea Multimedia Society / v.25 no.2 / pp.440-449 / 2022
In recent years, air pollution and the Air Quality Index (AQI) have been a focal point for researchers because of their effect on human health. Various studies have addressed AQI prediction, but most either lack dense temporal data or cover only one or two air pollutants. In this paper, a hybrid approach that integrates a convolutional neural network with a recurrent neural network architecture (CNN-LSTM) is presented to infer air pollution from a multivariate air pollutant dataset. The aim of this research is to design a robust, real-time air pollutant forecasting system by exploiting a neural network. The proposed approach is implemented on a 24-month dataset from Seoul, Republic of Korea. The predicted results are cross-validated against the real dataset and compared with state-of-the-art techniques to evaluate robustness and performance. The proposed model outperforms the SVM, SVM-Polynomial, ANN, and RF models by 60.17%, 68.99%, 14.6%, and 6.29%, respectively. In predicting O3, the model outperforms SVM and SVM-Polynomial by 78.04% and 83.79%, respectively. Overall performance of the model is measured in terms of Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE).
https://doi.org/10.9717/kmms.2022.25.2.440

Estimating the workability of self-compacting concrete in different mixing conditions based on deep learning

Yang, Liu;An, Xuehui
- Computers and Concrete / v.25 no.5 / pp.433-445 / 2020
This paper proposes a deep learning (DL) method that estimates the workability of self-compacting concrete (SCC) under different mixing conditions, with different mixers and mixing volumes, from recordings of the mixing process. The SCC mixing videos were transformed into image sequences to fit the DL model, which predicts the SF and VF values of SCC; there were four groups in total and approximately thirty thousand image sequence samples. The workability of three SCC groups whose mixing conditions were learned by the DL model was estimated. One additionally collected SCC group whose mixing condition was not learned was also predicted. The results indicate that, regardless of whether the SCC mixing condition is included in the training set and learned by the model, the trained model can effectively estimate SCC with different workability at the same time. Our goal of estimating SCC workability under different mixing conditions is achieved.
https://doi.org/10.12989/cac.2020.25.5.433

Deepfake Detection using Supervised Temporal Feature Extraction model and LSTM (지도 학습한 시계열적 특징 추출 모델과 LSTM을 활용한 딥페이크 판별 방법)

Lee, Chunghwan;Kim, Jaihoon;Yoon, Kijung
- Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.91-94 / 2021
As deep learning technologies advance, realistic fake videos synthesized by deep learning models, called "Deepfake" videos, have become even more difficult to distinguish from original videos. Because fake news and Deepfake blackmail are causing confusion and serious problems, this paper proposes a novel model for detecting Deepfake videos. We chose a Residual Convolutional Neural Network (Resnet50) as the extraction model and Long Short-Term Memory (LSTM), a form of Recurrent Neural Network (RNN), as the classification model. We adopted cosine similarity with hinge loss to train the extraction model to embed the features of Deepfake and original videos. The results in this paper demonstrate that temporal features in videos are essential for detecting Deepfake videos.

Higher-Order Conditional Random Field established with CNNs for Video Object Segmentation

Hao, Chuanyan;Wang, Yuqi;Jiang, Bo;Liu, Sijiang;Yang, Zhi-Xin
- KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.9 / pp.3204-3220 / 2021
We perform the task of video object segmentation by incorporating a conditional random field (CRF) and convolutional neural networks (CNNs). Most methods employ a CRF to refine a coarse output from fully convolutional networks. Others treat the inference process of the CRF as a recurrent neural network and then combine CNNs and the CRF into an end-to-end model for video object segmentation. In contrast to these methods, we propose a novel higher-order CRF model to solve the problem of video object segmentation. Specifically, we use CNNs to establish a higher-order dependence among pixels, and this dependence can provide critical global information for a segmentation model to enhance the global consistency of segmentation. In general, the optimization of the higher-order energy is extremely difficult. To make the problem tractable, we decompose the higher-order energy into two parts by utilizing auxiliary variables and then solve it by using an iterative process. We conduct quantitative and qualitative analyses on multiple datasets, and the proposed method achieves competitive results.
https://doi.org/10.3837/tiis.2021.09.007

Sound event detection based on multi-channel multi-scale neural networks for home monitoring system used by the hard-of-hearing (청각 장애인용 홈 모니터링 시스템을 위한 다채널 다중 스케일 신경망 기반의 사운드 이벤트 검출)

Lee, Gi Yong;Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.600-605 / 2020
In this paper, we propose a sound event detection method using a multi-channel multi-scale neural network for a sound-sensing home monitoring system for the hearing impaired. In the proposed system, two channels with high signal quality are selected from several wireless microphone sensors in the home. Three features extracted from the sensor signals (time difference of arrival, pitch range, and the outputs obtained by applying a multi-scale convolutional neural network to the log mel spectrogram) are fed to a classifier based on a bidirectional gated recurrent neural network to further improve sound event detection performance. The detected sound event is converted into text along with the sensor position of the selected channel and provided to the hearing impaired. The experimental results show that the sound event detection method of the proposed system is superior to the existing method and can effectively deliver sound information to the hearing impaired.
https://doi.org/10.7776/ASK.2020.39.6.600

Design of a 1-D CRNN Model for Prediction of Fine Dust Risk Level (미세먼지 위험 단계 예측을 위한 1-D CRNN 모델 설계)

Lee, Ki-Hyeok;Hwang, Woo-Sung;Choi, Myung-Ryul
- Journal of Digital Convergence / v.19 no.2 / pp.215-220 / 2021
To reduce the harmful effects on the human body caused by the recent increase in fine dust in Korea, technology is needed that helps predict the level of fine dust so that precautions can be taken. In this paper, we propose a 1-D Convolutional-Recurrent Neural Network (1-D CRNN) model to predict the level of fine dust in Korea. The proposed model combines a CNN and an RNN, and uses domestic and foreign fine dust, wind direction, and wind speed data for prediction. The proposed model achieved an accuracy of about 76% (partially up to 84%). In the future, the proposed model is intended to serve as a prediction model for time-series datasets that need to consider various kinds of data.
https://doi.org/10.14400/JDC.2021.19.2.215

Performance Improvement of Mean-Teacher Models in Audio Event Detection Using Derivative Features (차분 특징을 이용한 평균-교사 모델의 음향 이벤트 검출 성능 향상)

Kwak, Jin-Yeol;Chung, Yong-Joo
- The Journal of the Korea institute of electronic communication sciences / v.16 no.3 / pp.401-406 / 2021
Recently, mean-teacher models based on convolutional recurrent neural networks (CRNNs) have become popular in audio event detection. The mean-teacher model is an architecture consisting of two parallel CRNNs, which can be trained effectively on weakly labeled and unlabeled audio data by applying a consistency learning metric at the outputs of the two networks. In this study, we tried to improve the performance of the mean-teacher model by using additional derivative features of the log-mel spectrum. In audio event detection experiments using the training and test data from Task 4 of the DCASE 2018/2019 Challenges, the mean-teacher model using the proposed derivative features achieved up to an 8.1% relative decrease in the error rate (ER).
https://doi.org/10.13067/JKIECS.2021.16.3.401

