Search | Korea Science

Residual Convolutional Recurrent Neural Network-Based Sound Event Classification Applicable to Broadcast Captioning Services (자막방송을 위한 잔차 합성곱 순환 신경망 기반 음향 사건 분류)

Kim, Nam Kyun;Kim, Hong Kook;Ahn, Chung Hyun
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.26-27 / 2021
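This paper proposes a sound event classification method based on a residual convolutional recurrent neural network as a way of understanding broadcast content for captioning services. The proposed method connects a residual convolutional neural network to a recurrent neural network. Mel-filterbank features serve as the network input, and the residual convolutional neural network consists of one stem block and five residual convolutional neural networks. Each residual convolutional neural network combines a convolutional neural network built on residual learning with a convolutional block attention module, which improves the representational power of the feature maps relative to a conventional convolutional neural network. The extracted feature maps are passed to the recurrent neural network and finally to a fully connected layer that outputs the sound event class and its timing information. The proposed model was trained with the mean teacher model, which can make use of unlabeled data. The model was evaluated on the DCASE 2020 Challenge Task 4 dataset and achieved an event-based F1-score of 46.8%.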
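
The mean teacher training mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flat weight lists, and the smoothing factor `alpha` are assumptions chosen for illustration. The idea shown is only the general one, that teacher weights track an exponential moving average of student weights and that a consistency loss between the two models' predictions needs no labels.

```python
def ema_update(teacher, student, alpha=0.999):
    """Return new teacher weights as an exponential moving average
    of the student weights (alpha is a hypothetical smoothing factor)."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared error between student and teacher predictions.
    No ground-truth label is used, so unlabeled clips can contribute."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


# Toy usage: weights are plain lists of floats here for illustration only.
teacher_weights = [1.0, 0.0]
student_weights = [0.0, 1.0]
teacher_weights = ema_update(teacher_weights, student_weights, alpha=0.5)
loss = consistency_loss([0.9, 0.1], [0.8, 0.2])
```

In a real system the weights would be network tensors and the consistency term would be added to a supervised loss on the labeled portion of the data.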

System Identification Using Hybrid Recurrent Neural Networks (Hybrid 리커런트 신경망을 이용한 시스템 식별)

Choi Han-Go;Go Il-Whan;Kim Jong-In
- Journal of the Institute of Convergence Signal Processing / v.6 no.1 / pp.45-52 / 2005
Dynamic neural networks have been applied to diverse fields requiring temporal signal processing. This paper describes system identification using a hybrid neural network, composed of locally recurrent (LRNN) and globally recurrent (GRNN) neural networks, to improve the dynamics of multilayered recurrent neural networks (RNN). The hybrid network combines an IIR-MLP as the LRNN and an Elman RNN as the GRNN. It is evaluated on linear and nonlinear system identification and compared with the Elman RNN and IIR-MLP networks. Simulation results show that the hybrid network performs better in convergence and accuracy, indicating that it can be more effective than conventional multilayered recurrent networks for system identification.

Graph Convolutional - Network Architecture Search : Network architecture search Using Graph Convolution Neural Networks (그래프 합성곱-신경망 구조 탐색 : 그래프 합성곱 신경망을 이용한 신경망 구조 탐색)

Su-Youn Choi;Jong-Youel Park
- The Journal of the Convergence on Culture Technology / v.9 no.1 / pp.649-654 / 2023
This paper proposes a neural network structure search model built on graph convolutional neural networks. Because deep learning models learn as black boxes, it is difficult to verify whether a designed model has a performance-optimal structure. A neural network structure search model consists of a recurrent neural network that generates a model and the convolutional neural network that it generates. Conventional neural network structure search models use recurrent neural networks; in this paper we propose GC-NAS, which instead uses graph convolutional neural networks to generate convolutional neural network models. GC-NAS uses the Layer Extraction Block to explore depth and the Hyper Parameter Prediction Block to explore spatial and temporal information (hyperparameters) in parallel, based on the depth information. Because depth information is reflected, the search area is wider, and because the search runs in parallel with depth information, the purpose of each search area is clear; GC-NAS is therefore judged to be superior in theoretical structure to conventional search models. Through its graph convolutional neural network block and graph generation algorithm, GC-NAS is expected to address the high-dimensional time axis and the limited spatial search range of the recurrent neural networks used in existing neural network structure search models. We also hope that GC-NAS will prompt active research on applying graph convolutional neural networks to neural network structure search.
https://doi.org/10.17703/JCCT.2023.9.1.649

Efficient Fixed-Point Representation for ResNet-50 Convolutional Neural Network (ResNet-50 합성곱 신경망을 위한 고정 소수점 표현 방법)

Kang, Hyeong-Ju
- Journal of the Korea Institute of Information and Communication Engineering / v.22 no.1 / pp.1-8 / 2018
Convolutional neural networks have recently shown high performance in many computer vision tasks. However, they require an enormous amount of computation, which makes them difficult to adopt in embedded environments. To address this problem, many studies target ASIC or FPGA implementation, where an efficient representation method is required. Fixed-point representation suits ASIC or FPGA implementation but causes performance degradation. This paper proposes optimizing the representations of the convolutional layers and the batch normalization layers separately. With the proposed method, the bit width required for the convolutional layers of the ResNet-50 neural network is reduced from 16 bits to 10 bits. Since the convolutional layers account for most of the total computation, reducing their bit width enables efficient implementation of convolutional neural networks.
https://doi.org/10.6109/jkiice.2018.22.1.1 KSCI
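
To make the 16-bit versus 10-bit comparison concrete, here is a minimal sketch of signed fixed-point quantization. It does not reproduce the paper's separate optimization of convolutional and batch normalization layers. The function name `to_fixed_point` and the split between integer and fractional bits are assumptions made for illustration.

```python
def to_fixed_point(x, total_bits, frac_bits):
    """Quantize x to a signed fixed-point value with `total_bits` bits,
    of which `frac_bits` are fractional, and return it as a float.
    Values outside the representable range saturate."""
    scale = 1 << frac_bits                    # 2 ** frac_bits
    q_min = -(1 << (total_bits - 1))          # most negative integer code
    q_max = (1 << (total_bits - 1)) - 1       # most positive integer code
    q = max(q_min, min(q_max, round(x * scale)))
    return q / scale


# Compare quantization error for a sample weight at two bit widths.
# The fractional-bit choices (14 and 8) are hypothetical.
weight = 0.3
err_16 = abs(to_fixed_point(weight, 16, 14) - weight)
err_10 = abs(to_fixed_point(weight, 10, 8) - weight)
```

Fewer bits give a coarser grid and a larger rounding error, which is the degradation the abstract refers to; the benefit is smaller multipliers and less memory traffic in hardware.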

Learning of Artificial Neural Networks about the Prosody of Korean Sentences. (인공 신경망의 한국어 운율 학습)

Shin Dong-Yup;Min Kyung-Joong;Lim Un-Cheon
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.121-124 / 2001
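To make synthesized speech sound more natural, the prosodic rules inherent in natural speech must be identified accurately and implemented in the speech synthesis system. The prosodic rules needed by the text-to-speech synthesizer of an unlimited-vocabulary speech synthesis system are either derived from linguistic information or extracted from natural speech. However, if the extracted rules fail to reflect all the prosodic rules inherent in natural speech, or are implemented incorrectly, the naturalness of the synthesized speech suffers. With this in mind, in this paper an artificial neural network was trained on prosodic information extracted by analyzing natural Korean speech; sentences were then fed to the trained network, and its output prosodic information was compared with that of natural speech. The comparison showed that the proposed network can learn the prosody inherent in natural speech. The three main elements of prosody are variations in pitch, duration, and amplitude. The proposed network takes the phoneme sequence of a Korean sentence as input and outputs the pitch and amplitude variation over each phoneme's duration; it was designed so that its weights are adjusted to minimize the error between the output pattern and the target pattern, which is the prosodic information of each phoneme obtained by analyzing natural speech. To learn the pitch and amplitude variation of each phoneme over its duration, separate pitch and amplitude networks were constructed. To train these networks, a set of phonetically balanced sentences first had to be built; a particular speaker read this material in a fixed environment, and the recordings were analyzed to build a prosody database. For each phoneme in a sentence, the duration, pitch variation, and amplitude variation are obtained, and curve fitting is used to obtain the polynomial coefficients and initial value of each variation curve, which make up the prosody database. Part of this database was used to train the networks and the remainder to evaluate their performance, which showed that the networks could learn the prosodic rules. Enlarging the prosody database by increasing the number of sentences and utterances should improve network performance, and the number of input nodes should be chosen in view of the number of phonemes in a sentence, the computational cost, and suprasegmental factors.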

A Study on the Input Pattern of Neural Network for Prosody Control in a Korean Sentence (문장 단위 운율 제어를 위한 신경망의 입력 패턴에 관한 연구)

민경중
- Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.105-109 / 1998
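Rule-based synthesis systems vary widely in synthesis unit, synthesizer, and synthesis method. Concatenative systems, which are not pure rule-based systems but generate speech by joining basic synthesis units, fail to produce smooth changes in the synthesis parameters between units and across a sentence, so their output lacks naturalness. Because one major cause of this loss of naturalness is the inaccurate implementation of prosodic rules, we had a neural network learn the prosodic rules instead of coding rules extracted from natural speech into an algorithm, aiming to generate prosody closer to that of natural speech. To generate prosody with a neural network, the factors that affect prosody must first be identified and the network's input pattern chosen. To account for segmental effects, the three phonemes before and after are input together; to account for syntactic effects within the sentence, the input pattern also includes the position of the phoneme within the sentence and information about the prosodic phrase.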

Asphalt Concrete Pavement Surface Crack Detection using Convolutional Neural Network (합성곱 신경망을 이용한 아스팔트 콘크리트 도로포장 표면균열 검출)

Choi, Yoon-Soo;Kim, Jong-Ho;Cho, Hyun-Chul;Lee, Chang-Joon
- Journal of the Korea institute for structural maintenance and inspection / v.23 no.6 / pp.38-44 / 2019
A convolutional neural network (CNN) model was used to detect surface cracks in asphalt concrete pavements. The CNN used in this study consists of five layers with 3×3 convolution filters and 2×2 pooling kernels. Pavement surface crack images collected by automated road surveying equipment were used to train and test the CNN. Its performance was evaluated using the accuracy, precision, recall, missing rate, and over rate of surface crack detection. The CNN trained with the largest amount of data achieves accuracy, precision, and recall above 96.6%, with missing and over rates below 3.4%.
https://doi.org/10.11112/jksmi.2019.23.6.38 KSCI
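
The five evaluation measures named in this abstract can be computed from a confusion matrix, as in the sketch below. The abstract does not define missing rate or over rate, so the definitions here are assumptions: missing rate is taken as FN / (TP + FN), the complement of recall, and over rate as FP / (TP + FP), the complement of precision. The reported figures (above 96.6% and below 3.4%) are consistent with these definitions but do not confirm them. The function name is hypothetical.

```python
def crack_detection_metrics(tp, fp, tn, fn):
    """Compute detection metrics from confusion-matrix counts.
    missing_rate and over_rate follow assumed definitions (see lead-in)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "missing_rate": fn / (tp + fn),   # assumed: cracks that were missed
        "over_rate": fp / (tp + fp),      # assumed: detections that were not cracks
    }


# Toy counts for illustration only; they are not taken from the paper.
metrics = crack_detection_metrics(tp=90, fp=10, tn=80, fn=20)
```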

Nonlinear Adaptive Prediction using Locally and Globally Recurrent Neural Networks (지역 및 광역 리커런트 신경망을 이용한 비선형 적응예측)

최한고
- Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.1 / pp.139-147 / 2003
Dynamic neural networks have been applied to diverse fields requiring temporal signal processing, such as signal prediction. This paper proposes a hybrid network, composed of locally recurrent (LRNN) and globally recurrent (GRNN) neural networks, to improve the dynamics of multilayered recurrent neural networks (RNN), and then describes nonlinear adaptive prediction using the proposed network as an adaptive filter. The hybrid network consists of an IIR-MLP as the LRNN and an Elman RNN as the GRNN. The proposed network is evaluated on nonlinear signal prediction and compared with the Elman RNN and IIR-MLP networks. Experimental results show that the hybrid network performs better in convergence speed and accuracy, indicating that it can be a more effective prediction model than conventional multilayered recurrent networks for nonlinear prediction of nonstationary signals.
KSCI

Depth map generation using convolutional neural network (합성곱 신경망을 이용한 깊이맵 생성)

Kim, Hong-Jin;Kim, Manbae
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2017.11a / pp.34-35 / 2017
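This paper proposes a method that uses a convolutional neural network (CNN) to regenerate depth maps generated from images. CNNs perform well in image recognition and image classification; we apply this technique to depth map generation, aiming to implement an existing depth map generation method with a simple CNN. In the performance experiment, applying the proposed method to ten video sets produced satisfactory results.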

Artificial neural network for classifying with epilepsy MEG data (뇌전증 환자의 MEG 데이터에 대한 분류를 위한 인공신경망 적용 연구)

Yujin Han;Junsik Kim;Jaehee Kim
- The Korean Journal of Applied Statistics / v.37 no.2 / pp.139-155 / 2024
This study performed a multi-class classification task using magnetoencephalography (MEG) data to distinguish patients with mesial temporal lobe epilepsy with left hippocampal sclerosis (left mTLE), patients with mesial temporal lobe epilepsy with right hippocampal sclerosis (right mTLE), and healthy controls (HC). We applied several artificial neural networks and compared the results. Among models based on convolutional neural networks (CNN), recurrent neural networks (RNN), and graph neural networks (GNN), the average k-fold accuracy was highest for the CNN-based model, followed by the GNN-based and RNN-based models. Wall time was shortest for the RNN-based model, followed by the GNN-based and CNN-based models. The graph neural network, which shows good accuracy, performance, and run time and scales well to network data, is the most suitable model for future brain research.
https://doi.org/10.5351/KJAS.2024.37.2.139

Search Result 641, Processing Time 0.028 seconds
