• Title/Summary/Keyword: 합성곱 순환 신경망 (convolutional recurrent neural network)


Residual Convolutional Recurrent Neural Network-Based Sound Event Classification Applicable to Broadcast Captioning Services (자막방송을 위한 잔차 합성곱 순환 신경망 기반 음향 사건 분류)

  • Kim, Nam Kyun;Kim, Hong Kook;Ahn, Chung Hyun
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.26-27 / 2021
  • This paper proposes a residual convolutional recurrent neural network-based sound event classification method as a way of understanding broadcast content for closed-caption services. The proposed method connects a residual convolutional neural network to a recurrent neural network. Mel-filterbank features are used as the network input, and the residual convolutional network consists of one stem block and five residual convolutional blocks. Each residual block combines residual learning with a convolutional block attention module to improve the representational power of its feature maps over a conventional convolutional network. The extracted feature maps are fed into the recurrent neural network and finally into fully connected layers that output the sound event class and its temporal information. The model is trained with the mean teacher approach, which can exploit unlabeled data. Evaluated on the DCASE 2020 Challenge Task 4 dataset, the proposed model achieves an event-based F1-score of 46.8 %.

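As a rough illustration of the architecture described in this abstract, the PyTorch sketch below combines a stem block, residual convolutional blocks with a convolutional block attention module (CBAM), a recurrent layer, and a frame-wise classification layer. All channel counts, kernel sizes, the pooling scheme, and the CBAM details are assumptions rather than the authors' exact configuration, and the mean-teacher training procedure is not shown.

```python
# Hypothetical sketch of a residual CRNN with CBAM for sound event detection (PyTorch).
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention, then spatial attention."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)           # channel attention
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))                   # spatial attention

class ResidualBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch))
        self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
        self.cbam = CBAM(out_ch)
        self.pool = nn.MaxPool2d((1, 2))     # pool along frequency only, keep time frames

    def forward(self, x):
        y = torch.relu(self.body(x) + self.skip(x))
        return self.pool(self.cbam(y))

class ResCRNN(nn.Module):
    def __init__(self, n_mels=64, n_classes=10):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1),
                                  nn.BatchNorm2d(32), nn.ReLU())
        chans = [32, 64, 64, 128, 128, 128]
        self.blocks = nn.Sequential(*[ResidualBlock(chans[i], chans[i + 1])
                                      for i in range(5)])
        self.gru = nn.GRU(128 * (n_mels // 2 ** 5), 128,
                          batch_first=True, bidirectional=True)
        self.fc = nn.Linear(256, n_classes)    # frame-wise (strong) predictions

    def forward(self, mel):                    # mel: (batch, 1, time, n_mels)
        h = self.blocks(self.stem(mel))        # (batch, 128, time, n_mels/32)
        h = h.permute(0, 2, 1, 3).flatten(2)   # (batch, time, features)
        h, _ = self.gru(h)
        return torch.sigmoid(self.fc(h))       # per-frame class probabilities

probs = ResCRNN()(torch.randn(2, 1, 250, 64))  # -> (2, 250, 10)
```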

Graph Convolutional-Network Architecture Search: Network Architecture Search Using Graph Convolution Neural Networks (그래프 합성곱-신경망 구조 탐색 : 그래프 합성곱 신경망을 이용한 신경망 구조 탐색)

  • Su-Youn Choi;Jong-Youel Park
    • The Journal of the Convergence on Culture Technology / v.9 no.1 / pp.649-654 / 2023
  • This paper proposes a neural architecture search model built on graph convolutional neural networks. Because deep learning models learn as black boxes, it is difficult to verify whether a designed model has a structure with optimized performance. A neural architecture search model consists of a recurrent neural network that generates candidate models and the convolutional neural network that is generated. Conventional architecture search models use recurrent neural networks, but in this paper we propose GC-NAS, which uses graph convolutional neural networks instead of a recurrent controller to generate convolutional network models. The proposed GC-NAS uses a Layer Extraction Block to explore depth and a Hyper Parameter Prediction Block to explore spatial and temporal information (hyperparameters) in parallel, conditioned on the depth information. Because depth information is reflected, the search space is wider, and the parallel, depth-guided search gives each part of the search a clear purpose, so GC-NAS is judged to be structurally superior to existing RNN-based search models. Through its graph convolutional blocks and graph generation algorithm, GC-NAS is expected to relieve the high-dimensional temporal and spatial search burden of recurrent neural networks in existing architecture search models. We also hope that the GC-NAS proposed in this paper will encourage active research on applying graph convolutional neural networks to neural architecture search.
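
The Layer Extraction Block and Hyper Parameter Prediction Block are specific to the paper and are not reproduced here; the sketch below shows only a minimal graph-convolution layer of the kind GC-NAS builds on, with node features standing in for candidate-layer encodings. The dimensions and the chain-shaped toy graph are illustrative assumptions.

```python
# Minimal graph-convolution layer; node features could encode candidate layers of an architecture.
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (nodes, in_dim), adj: (nodes, nodes) adjacency with self-loops
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        return torch.relu(self.lin(adj @ x / deg))   # mean-aggregate neighbours, then project

# toy usage: 4 candidate layers as graph nodes with 8-dimensional encodings
x = torch.randn(4, 8)
adj = torch.eye(4) + torch.diag(torch.ones(3), 1)    # simple chain graph with self-loops
h = GraphConv(8, 16)(x, adj)                         # -> (4, 16) node embeddings
```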

Artificial neural network for classifying with epilepsy MEG data (뇌전증 환자의 MEG 데이터에 대한 분류를 위한 인공신경망 적용 연구)

  • Yujin Han;Junsik Kim;Jaehee Kim
    • The Korean Journal of Applied Statistics / v.37 no.2 / pp.139-155 / 2024
  • This study performed a multi-class classification task to distinguish mesial temporal lobe epilepsy patients with left hippocampal sclerosis (left mTLE), mesial temporal lobe epilepsy patients with right hippocampal sclerosis (right mTLE), and healthy controls (HC) using magnetoencephalography (MEG) data. We applied several artificial neural networks and compared the results. Modeling with convolutional neural networks (CNN), recurrent neural networks (RNN), and graph neural networks (GNN) showed that the average k-fold accuracy was best for the CNN-based model, followed by the GNN-based and RNN-based models, whereas the wall time was best for the RNN-based model, followed by the GNN-based and CNN-based models. The graph neural network, which performs well in accuracy and time and scales naturally to network-structured data, appears to be the most suitable model for future brain research.
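
The abstract reports average k-fold accuracy and wall time for each model family. A hypothetical comparison harness of that kind, with sklearn-style models passed in by the caller, might look like the following; the fold count and metric bookkeeping are assumptions, not the authors' protocol.

```python
# Hedged sketch of a k-fold accuracy / wall-time comparison across model families.
import time
import numpy as np
from sklearn.model_selection import StratifiedKFold

def kfold_compare(models, X, y, k=5):
    """models: dict of name -> factory returning an object with fit/predict (sklearn-style)."""
    results = {}
    for name, make_model in models.items():
        accs, start = [], time.perf_counter()
        splitter = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
        for train_idx, test_idx in splitter.split(X, y):
            model = make_model()
            model.fit(X[train_idx], y[train_idx])
            accs.append(np.mean(model.predict(X[test_idx]) == y[test_idx]))
        results[name] = {"mean_acc": float(np.mean(accs)),
                         "wall_time_s": time.perf_counter() - start}
    return results
```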

Earthquake events classification using convolutional recurrent neural network (합성곱 순환 신경망 구조를 이용한 지진 이벤트 분류 기법)

  • Ku, Bonhwa;Kim, Gwantae;Jang, Su;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.592-599 / 2020
  • This paper proposes a Convolutional Recurrent Neural Network (CRNN) structure that can simultaneously reflect both the static and dynamic characteristics of seismic waveforms for classifying various earthquake events. Handling diverse earthquake events, including micro-earthquakes and artificial earthquakes as well as macro-earthquakes, requires both effective feature extraction and a classifier that can discriminate seismic waveforms in noisy environments. First, we extract the static characteristics of the seismic waveform through an attention-based convolutional layer. The extracted feature map is then fed sequentially into a multi-input single-output Long Short-Term Memory (LSTM) network to extract the dynamic characteristics used to classify the various seismic events. Finally, classification is performed through two fully connected layers and a softmax function. Representative experiments on domestic and foreign earthquake databases show that the proposed model provides an effective structure for classifying various earthquake events.
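
A rough PyTorch sketch of such an attention-based CRNN is given below: a gated 1-D convolutional front end for the static waveform characteristics, an LSTM over the resulting feature sequence for the dynamic characteristics, and two fully connected layers with softmax. The channel widths, kernel sizes, and the exact attention form are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

class AttentiveConv(nn.Module):
    """1-D convolution over the waveform followed by a learned channel-attention gate."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(in_ch, out_ch, 7, stride=2, padding=3),
                                  nn.BatchNorm1d(out_ch), nn.ReLU())
        self.gate = nn.Sequential(nn.Linear(out_ch, out_ch), nn.Sigmoid())

    def forward(self, x):                        # x: (batch, channels, samples)
        h = self.conv(x)
        attn = self.gate(h.mean(dim=2))          # global context -> channel weights
        return h * attn.unsqueeze(-1)

class SeismicCRNN(nn.Module):
    def __init__(self, n_classes=3, in_ch=3):    # e.g. a 3-component seismogram
        super().__init__()
        self.features = nn.Sequential(AttentiveConv(in_ch, 32), AttentiveConv(32, 64))
        self.lstm = nn.LSTM(64, 64, batch_first=True)
        self.head = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, n_classes))

    def forward(self, x):                        # x: (batch, 3, samples)
        h = self.features(x).transpose(1, 2)     # (batch, time, 64) feature sequence
        _, (h_n, _) = self.lstm(h)               # keep the last hidden state only
        return self.head(h_n[-1])                # logits; apply softmax for probabilities

probs = torch.softmax(SeismicCRNN()(torch.randn(4, 3, 2000)), dim=1)  # -> (4, 3)
```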

Multi-channel EEG classification method according to music tempo stimuli using 3D convolutional bidirectional gated recurrent neural network (3차원 합성곱 양방향 게이트 순환 신경망을 이용한 음악 템포 자극에 따른 다채널 뇌파 분류 방식)

  • Kim, Min-Soo;Lee, Gi Yong;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.40 no.3 / pp.228-233 / 2021
  • In this paper, we propose a method to extract and classify features of multi-channel electroencephalography (EEG) that change according to various musical tempo stimuli. In the proposed method, a 3D convolutional bidirectional gated recurrent neural network extracts spatio-temporal and long-term dependent features from a 3D EEG input representation obtained through preprocessing. The experimental results show that the proposed tempo-stimulus classification method outperforms the existing method and demonstrates the feasibility of building a music-based brain-computer interface.
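
The sketch below illustrates one plausible shape of a 3-D convolutional bidirectional gated recurrent network: 3-D convolutions over a (time, electrode-grid) EEG volume followed by a bidirectional GRU. The 9x9 electrode grid, layer sizes, and pooling are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class Conv3dBiGRU(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 3, 3)),                     # pool spatially, keep every time step
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 3, 3)))                     # a 9x9 grid collapses to 1x1 here
        self.gru = nn.GRU(32, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, n_classes)

    def forward(self, x):                                # x: (batch, 1, time, 9, 9) EEG volume
        h = self.conv(x)                                 # (batch, 32, time, 1, 1)
        h = h.squeeze(-1).squeeze(-1).transpose(1, 2)    # (batch, time, 32) feature sequence
        out, _ = self.gru(h)
        return self.fc(out[:, -1])                       # classify from the final time step

logits = Conv3dBiGRU()(torch.randn(2, 1, 128, 9, 9))     # -> (2, 3) tempo-class logits
```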

Shooting sound analysis using convolutional neural networks and long short-term memory (합성곱 신경망과 장단기 메모리를 이용한 사격음 분석 기법)

  • Kang, Se Hyeok;Cho, Ji Woong
    • The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.312-318 / 2022
  • This paper proposes a model that classifies the type of gun and information about the sound source location using a deep neural network. The proposed classification model is composed of convolutional neural networks (CNN) and long short-term memory (LSTM). For training and testing the model, we use the Gunshot Audio Forensic Dataset generated by a project supported by the National Institute of Justice (NIJ). The acoustic signals are transformed into Mel-spectrograms and provided as training and test data for the proposed model. Compared with a control model consisting of convolutional neural networks only, the proposed model achieves an accuracy of more than 90 %.
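
A minimal PyTorch sketch of a CNN + LSTM classifier over Mel-spectrogram input, in the spirit of the model described above, is shown below; the layer sizes and the number of gun classes are placeholders, and the Mel-spectrogram extraction itself is assumed to be done beforehand.

```python
import torch
import torch.nn as nn

class GunshotCNNLSTM(nn.Module):
    def __init__(self, n_mels=64, n_guns=20):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 2)),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 2)))
        self.lstm = nn.LSTM(32 * (n_mels // 4), 64, batch_first=True)
        self.fc = nn.Linear(64, n_guns)

    def forward(self, mel):                        # mel: (batch, 1, time, n_mels)
        h = self.cnn(mel)                          # (batch, 32, time/4, n_mels/4)
        h = h.permute(0, 2, 1, 3).flatten(2)       # (batch, time/4, 32 * n_mels/4)
        _, (h_n, _) = self.lstm(h)
        return self.fc(h_n[-1])                    # gun-type logits

logits = GunshotCNNLSTM()(torch.randn(8, 1, 128, 64))   # -> (8, 20)
```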

Recurrent Neural Network Based Spectrum Sensing Technique for Cognitive Radio Communications (인지 무선 통신을 위한 순환 신경망 기반 스펙트럼 센싱 기법)

  • Jung, Tae-Yun;Jeong, Eui-Rim
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.6 / pp.759-767 / 2020
  • This paper proposes a new recurrent neural network (RNN) based spectrum sensing technique for cognitive radio communications. The proposed technique determines the presence of a primary user's signal without any prior information about the primary users. The method samples at high speed over the whole sensing bandwidth and then converts the signal into a frequency spectrum via the fast Fourier transform (FFT). This spectrum is sliced into sensing-channel bandwidths and fed into the RNN to determine channel vacancy. The performance of the proposed technique is verified through computer simulations. According to the results, the proposed technique outperforms the existing threshold-based technique by more than 2 dB and performs similarly to the existing convolutional neural network (CNN) based method. In addition, experiments carried out in indoor environments show that the proposed technique performs more than 4 dB better than both the conventional threshold-based and CNN-based methods.
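
The pipeline described above (wideband sampling, FFT, per-channel spectrum slices, RNN decision) could be sketched as follows. The FFT size, the number of bins per sensing channel, and the GRU size are illustrative assumptions, and the network here is of course untrained.

```python
import torch
import torch.nn as nn

class SpectrumSenseRNN(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.fc = nn.Linear(hidden, 2)              # two classes: vacant / occupied

    def forward(self, bins):                        # bins: (channels, bins_per_channel)
        seq = bins.unsqueeze(-1)                    # treat frequency bins as a sequence
        _, h_n = self.rnn(seq)
        return self.fc(h_n[-1])

def sense_channels(iq, n_fft=4096, bins_per_channel=64, model=None):
    """Split a wideband capture into channels and classify each one."""
    spectrum = torch.fft.fft(iq, n=n_fft).abs()     # magnitude spectrum of the capture
    chans = spectrum[:n_fft // 2].reshape(-1, bins_per_channel)  # slice into channels
    model = model or SpectrumSenseRNN()
    return model(chans).argmax(dim=1)               # 0 = vacant, 1 = occupied

decisions = sense_channels(torch.randn(4096, dtype=torch.complex64))  # -> (32,)
```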

Real-Time Lip Reading System Implementation Based on Deep Learning (딥러닝 기반의 실시간 입모양 인식 시스템 구현)

  • Cho, Dong-Hun;Kim, Won-Jun
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.11a / pp.267-269 / 2020
  • Lip reading is a technique for analyzing speech from lip movements. In this paper, we study real-time classification of ten commonly used phrases by analyzing the speaker's facial movements. Considering that video data consists of temporally ordered frames, we first used a 3D convolutional neural network, but reducing the computational load was necessary to implement a real-time system. To address this, we designed a model that combines a 2D convolutional neural network applied to difference images with an LSTM (Long Short-Term Memory) recurrent neural network, and we successfully implemented a real-time system with this model.

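A possible shape of the difference-image 2-D CNN + LSTM pipeline described in this entry is sketched below; the lip-crop size, sequence length, and layer widths are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

class LipReader(nn.Module):
    def __init__(self, n_phrases=10):
        super().__init__()
        self.cnn = nn.Sequential(                        # 2-D CNN applied per difference image
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4))
        self.lstm = nn.LSTM(32 * 4 * 4, 128, batch_first=True)
        self.fc = nn.Linear(128, n_phrases)

    def forward(self, frames):                           # frames: (batch, T, 1, H, W) lip crops
        diffs = frames[:, 1:] - frames[:, :-1]            # frame differences cut computation
        b, t = diffs.shape[:2]
        feats = self.cnn(diffs.flatten(0, 1)).flatten(1)  # (b * (T-1), 512) per-frame features
        feats = feats.view(b, t, -1)
        _, (h_n, _) = self.lstm(feats)
        return self.fc(h_n[-1])                           # logits over the ten phrases

logits = LipReader()(torch.randn(2, 16, 1, 64, 64))       # -> (2, 10)
```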

Learning Recurrent Neural Networks for Activity Detection from Untrimmed Videos (비분할 비디오로부터 행동 탐지를 위한 순환 신경망 학습)

  • Song, YeongTaek;Suh, Junbae;Kim, Incheol
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.892-895 / 2017
  • This paper proposes deep neural network models for effectively detecting human activities in untrimmed videos. In general, detecting human activities in video involves two main steps: extracting features that are effective for activity detection, and detecting the activities based on those features. This paper presents deep neural network models for both steps. In particular, C3D and I-ResNet convolutional neural network models are used to extract features that capture the temporal and spatial patterns of each activity, and a bidirectional LSTM (Bi-LSTM) recurrent neural network model is used to automatically recognize activities from the resulting sequences of feature vectors. Experiments on ActivityNet, a large public benchmark video dataset, confirmed the performance and effectiveness of the proposed deep neural network models.
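
The sketch below covers only the second stage described above: a bidirectional LSTM that tags each video segment with an activity class, given per-segment feature vectors from a backbone such as C3D or a ResNet variant. The feature dimension and class count are placeholder assumptions.

```python
import torch
import torch.nn as nn

class ActivityBiLSTM(nn.Module):
    def __init__(self, feat_dim=2048, n_classes=200):     # 200 ≈ ActivityNet categories
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, 256, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 256, n_classes + 1)        # +1 for a "background" label

    def forward(self, feats):                              # feats: (batch, segments, feat_dim)
        out, _ = self.lstm(feats)
        return self.fc(out)                                # per-segment class logits

scores = ActivityBiLSTM()(torch.randn(2, 40, 2048))        # -> (2, 40, 201)
```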

CNN-LSTM based Autonomous Driving Technology (CNN-LSTM 기반의 자율주행 기술)

  • Ga-Eun Park;Chi Un Hwang;Lim Se Ryung;Han Seung Jang
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1259-1268 / 2023
  • This study proposes a throttle and steering control technology using visual sensors based on deep learning's convolutional and recurrent neural networks. It collects camera image and control value data while driving a training track in clockwise and counterclockwise directions, and generates a model to predict throttle and steering through data sampling and preprocessing for efficient learning. Afterward, the model was validated on a test track in a different environment that was not used for training to find the optimal model and compare it with a CNN (Convolutional Neural Network). As a result, we found that the proposed deep learning model has excellent performance.