Search | Korea Science

Korean Phoneme Recognition Model with Deep CNN (Deep CNN 기반의 한국어 음소 인식 모델 연구)

Hong, Yoon Seok;Ki, Kyung Seo;Gweon, Gahgene
- Proceedings of the Korea Information Processing Society Conference / 2018.05a / pp.398-401 / 2018
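This study proposes a phoneme recognition model, built on a deep convolutional neural network (Deep CNN) and the Connectionist Temporal Classification (CTC) algorithm, that can be trained without a force-aligned corpus. Deep learning-based phoneme recognition models that combine recurrent neural networks (RNNs) with CTC have recently been studied actively abroad. Korean phoneme recognition, however, has relied mainly on HMM-GMM systems or hybrid systems that combine artificial neural networks with HMMs; these approaches leave less room for performance improvement than the recent work abroad and cannot be trained without a force-aligned corpus prepared by experts. RNNs, in turn, need large amounts of training data and are difficult to train, which limits their use for Korean, where corpora are scarce and foundational research has been limited. This study therefore adopts the CTC algorithm, which does not require a force-aligned corpus, and builds a deep learning model on a convolutional neural network (CNN), which trains faster than an RNN and can learn from less data, to perform Korean phoneme recognition. With this model, we built three phoneme recognizers that extract the 49 phonemes of Korean; the finally selected model achieved a phoneme error rate (PER) of 9.44. An indirect comparison with prior work suggests that the proposed model performs on par with or slightly better than existing approaches.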
https://doi.org/10.3745/PKIPS.y2018m05a.398
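
The two quantities this abstract relies on, a CTC output and the phoneme error rate, can be illustrated with a small sketch. It is not the authors' code: it assumes greedy (best-path) CTC decoding and the usual edit-distance definition of PER, and the blank symbol and romanized phoneme labels are made up for illustration.

```python
def ctc_collapse(frame_labels, blank="_"):
    """Collapse a frame-level CTC path: merge repeated labels, then drop blanks."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


def edit_distance(ref, hyp):
    """Levenshtein distance (substitutions + deletions + insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """PER in percent: edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example: per-frame labels from a recognizer, no alignment needed.
frames = ["_", "k", "k", "_", "a", "a", "_", "n"]
hypothesis = ctc_collapse(frames)   # ["k", "a", "n"]
reference = ["k", "a", "m"]
per = phoneme_error_rate(reference, hypothesis)  # one substitution in three phonemes
```

Because the collapse step maps many frame-level paths to the same phoneme sequence, training only needs the phoneme sequence itself, which is why no force-aligned corpus is required.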

Microcontroller-based Gesture Recognition using 1D CNN (1D CNN을 이용한 마이크로컨트롤러기반 제스처 인식)

Kim, Ji-Hye;Choi, Kwon-Taeg
- Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.219-220 / 2021
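This paper proposes an optimized learning method for recognizing gestures on a microcontroller using a 6-axis IMU sensor. When the 6-axis sensor values are sampled 119 times, the feature dimension becomes very large, so a multilayer neural network would need more trainable parameters than the microcontroller's memory allows. We propose a 1D CNN optimized for microcontrollers that effectively reduces the number of trainable parameters while maintaining performance.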
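
A rough parameter count shows why the flattened input is a problem. This is a sketch only: the 6 axes and 119 samples come from the abstract, while the hidden width, filter count, and kernel size below are hypothetical values chosen for illustration, not the paper's architecture.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 1D convolution layer (shared across time steps)."""
    return in_channels * kernel_size * out_channels + out_channels


AXES, SAMPLES = 6, 119            # from the abstract
flat_inputs = AXES * SAMPLES      # 714 features after flattening

HIDDEN = 64                       # hypothetical hidden width
FILTERS, KERNEL = 8, 3            # hypothetical 1D CNN first layer

first_dense = dense_params(flat_inputs, HIDDEN)     # 45760
first_conv = conv1d_params(AXES, FILTERS, KERNEL)   # 152
```

With these assumed sizes the first dense layer alone holds 45,760 parameters, against 152 for the first convolution layer. A complete 1D CNN still needs a classifier after pooling, so this compares first layers only.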

A Study on the Gender and Age Classification of Speech Data Using CNN (CNN을 이용한 음성 데이터 성별 및 연령 분류 기술 연구)

Park, Dae-Seo;Bang, Joon-Il;Kim, Hwa-Jong;Ko, Young-Jun
- The Journal of Korean Institute of Information Technology / v.16 no.11 / pp.11-21 / 2018
This study uses deep learning to classify voices. It reviews neural network-based sound classification studies and proposes an improved neural network for voice classification. Related studies addressed urban data classification, but their shallow neural networks performed poorly. In this paper, we first preprocess the voice data and extract feature values. Next, we classify the voices by feeding the feature values into both the previous sound classification network and the proposed network. Finally, we compare and evaluate the classification performance of the two networks. The proposed network is deeper and wider so that it learns better. The network from related studies scored 84.8 percent and the proposed network 91.4 percent, a gain of roughly 6.6 percentage points.
https://doi.org/10.14801/jkiit.2018.16.11.11

A Study on Compression of Connections in Deep Artificial Neural Networks (인공신경망의 연결압축에 대한 연구)

Ahn, Heejune
- Journal of Korea Society of Industrial Information Systems / v.22 no.5 / pp.17-24 / 2017
Recently, deep learning, that is, technology using large or deep artificial neural networks, has shown remarkable performance, and the increasing size of the networks contributes to that improvement. However, larger networks require more computation, which causes problems such as circuit complexity, cost, heat generation, and real-time constraints. In this paper, we propose and test a method that reduces the number of network connections by pruning redundant connections while keeping the performance difference from the original network within a desired range. In particular, we propose a simple method that improves performance through retraining and guarantees the desired performance by allocating an error rate to each layer, to account for the differences between layers. Experiments on typical network structures, a fully connected network (FCN) and a convolutional neural network (CNN), confirmed that performance similar to that of the original network can be obtained with only about 1/10 of the connections.
https://doi.org/10.9723/jksiis.2017.22.5.017
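
Connection pruning can be sketched as follows. The abstract does not say how redundant connections are selected, so this example assumes plain magnitude-based pruning with a per-layer keep ratio; the function name and the weight values are hypothetical, and the retraining step is omitted.

```python
def prune_by_magnitude(weights, keep_ratio):
    """Zero out all but the largest-magnitude fraction of a layer's weights.

    weights: list of rows (a 2D weight matrix as nested lists).
    keep_ratio: fraction of connections to keep, e.g. 0.1 for about 1/10.
    Ties at the threshold are all kept, so slightly more may survive.
    """
    magnitudes = sorted((abs(w) for row in weights for w in row), reverse=True)
    n_keep = max(1, round(len(magnitudes) * keep_ratio))
    threshold = magnitudes[n_keep - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


# Hypothetical 2 x 5 layer; keep 20% of the 10 connections.
layer = [[0.9, -0.05, 0.01, 0.2, -0.03],
         [0.02, -0.7, 0.04, 0.06, -0.08]]
pruned = prune_by_magnitude(layer, keep_ratio=0.2)
remaining = sum(1 for row in pruned for w in row if w != 0.0)
```

Giving each layer its own `keep_ratio` is one simple way to reflect the per-layer error allocation the abstract mentions, since layers differ in how much pruning they tolerate.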

Speech emotion recognition using attention mechanism-based deep neural networks (주목 메커니즘 기반의 심층신경망을 이용한 음성 감정인식)

Ko, Sang-Sun;Cho, Hye-Seung;Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea / v.36 no.6 / pp.407-412 / 2017
In this paper, we propose a speech emotion recognition method using a deep neural network based on the attention mechanism. The proposed method combines a CNN (Convolutional Neural Network), a GRU (Gated Recurrent Unit), a DNN (Deep Neural Network), and an attention mechanism. The spectrogram of a speech signal contains characteristic patterns that depend on the emotion. We therefore model these patterns by applying tuned Gabor filters as the convolutional filters of a typical CNN. In addition, we apply the attention mechanism with CNN and FC (Fully-Connected) layers to obtain attention weights that take the context of the extracted features into account, and we use them for emotion recognition. To verify the proposed method, we conducted recognition experiments on six emotions. The experimental results show that the proposed method achieves higher performance in speech emotion recognition than the conventional methods.
https://doi.org/10.7776/ASK.2017.36.6.407
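
The role of attention weights can be illustrated with a generic softmax-weighted pooling over per-frame features. This is an assumption-laden sketch, not the paper's network: the scores would come from the CNN and FC layers described above, whereas here they are supplied by hand, and `attention_pool` is a hypothetical name.

```python
import math


def attention_pool(features, scores):
    """Softmax the per-frame scores and return (weights, weighted feature sum)."""
    peak = max(scores)                              # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled


# Two hypothetical frames with 2-dimensional features and equal scores.
frame_features = [[1.0, 0.0], [0.0, 1.0]]
weights, pooled = attention_pool(frame_features, scores=[0.0, 0.0])
```

Frames with higher scores contribute more to the pooled vector, so emotionally salient stretches of the utterance dominate the representation passed to the classifier.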

Design of CNN with MLP Layer (MLP 층을 갖는 CNN의 설계)

Park, Jin-Hyun;Hwang, Kwang-Bok;Choi, Young-Kiu
- Journal of the Korean Society of Mechanical Technology / v.20 no.6 / pp.776-782 / 2018
Since LeCun introduced the basic CNN structure in 1989, there has been no major structural change until recently other than deeper networks. A deep network gains expressive power by improving the network's abstraction ability, and it can learn complex problems by increasing nonlinearity. However, training a deep network brings vanishing gradients or longer training time. In this study, we propose a CNN structure with an MLP layer. The proposed CNNs outperform a general CNN in classification. Experiments confirm that the classification accuracy is high because the MLP layer improves nonlinearity. This shows that performance can be improved without deepening the network, by increasing its nonlinearity.
https://doi.org/10.17958/ksmt.20.6.201812.776

CNN-Based Novelty Detection with Effectively Incorporating Document-Level Information (효과적인 문서 수준의 정보를 이용한 합성곱 신경망 기반의 신규성 탐지)

Jo, Seongung;Oh, Heung-Seon;Im, Sanghun;Kim, Seonho
- KIPS Transactions on Computer and Communication Systems / v.9 no.10 / pp.231-238 / 2020
With a large number of documents appearing on the web, document-level novelty detection has become important, since it reduces the effort of finding novel documents by discarding those that share redundant, already seen information. A recent work proposed a convolutional neural network (CNN)-based novelty detection model with significant performance improvements. We observed that this model is restricted in how it uses document-level information when determining novelty, and we assumed that document-level information is more important. As a solution, this paper proposes two methods of effectively incorporating document-level information into a CNN-based novelty detection model. Our methods focus on constructing a feature vector for the target document to be classified by extracting relative information between the target document and the source documents given as evidence. A series of experiments showed the superiority of our methods on a standard benchmark collection, TAP-DLND 1.0.
https://doi.org/10.3745/KTCCS.2020.9.10.231

Robust Deep Learning-Based Profiling Side-Channel Analysis for Jitter (지터에 강건한 딥러닝 기반 프로파일링 부채널 분석 방안)

Kim, Ju-Hwan;Woo, Ji-Eun;Park, So-Yeon;Kim, Soo-Jin;Han, Dong-Guk
- Journal of the Korea Institute of Information Security & Cryptology / v.30 no.6 / pp.1271-1278 / 2020
Deep learning-based profiling side-channel analysis is a powerful analysis method that uses a neural network to profile the relationship between the side-channel information and the intermediate value. Since the neural network interprets each point of the signal as a different dimension, jitter makes it much harder for a neural network with dimension-wise weights to learn the relationship. This paper shows that replacing the fully-connected layer of a traditional CNN (Convolutional Neural Network) with global average pooling (GAP) allows us to design a neural network that is inherently robust to jitter. We demonstrated the proposed method on the ChipWhisperer-Lite board: the validation accuracy of the CNN with a fully-connected layer reached only 1.4%, whereas the validation accuracy of the CNN with GAP reached 41.7%.
https://doi.org/10.13089/JKIISC.2020.30.6.1271
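
The jitter argument can be reproduced on a toy trace. The sketch below is illustrative only and assumes a single convolution filter with no padding; the trace values, kernel, and dense weights are invented. It compares a position-specific dense readout with global average pooling when the same leakage peak arrives two samples late.

```python
def conv1d_valid(signal, kernel):
    """1D cross-correlation without padding."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def global_average_pool(feature_map):
    """Average over all positions: position information is discarded."""
    return sum(feature_map) / len(feature_map)


def dense_readout(feature_map, weights):
    """Fully connected readout: one weight per position."""
    return sum(f * w for f, w in zip(feature_map, weights))


kernel = [1, 2, 1]
trace = [0, 0, 0, 1, 2, 1, 0, 0, 0, 0]      # leakage peak at its nominal position
jittered = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]   # same peak, delayed by two samples

fm_trace = conv1d_valid(trace, kernel)
fm_jittered = conv1d_valid(jittered, kernel)

# Hypothetical dense weights tuned to the nominal peak position.
dense_weights = [0, 0, 0, 1, 0, 0, 0, 0]

gap_pair = (global_average_pool(fm_trace), global_average_pool(fm_jittered))
dense_pair = (dense_readout(fm_trace, dense_weights),
              dense_readout(fm_jittered, dense_weights))
```

The GAP outputs match for both traces, while the dense readout changes with the delay. The match is exact here only because the peak stays away from the trace boundaries; in general GAP reduces, rather than eliminates, sensitivity to misalignment.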

A Design of a Cellular Neural Network for the Real Image Processing (실영상처리를 위한 셀룰러 신경망 설계)

Kim Seung-Soo;Jeon Heung-Woo
- Journal of the Korea Institute of Information and Communication Engineering / v.10 no.2 / pp.283-290 / 2006
Cellular neural networks (CNNs) consist of an array of identical cells, each a simple processing element with local connectivity and space-invariant templates, which makes the structure well suited to hardware implementation. However, a one-to-one mapping between CNN hardware processors and the pixels of a large practical image is not feasible. In this paper, we design a $5 \times 5$ CNN hardware processor with pipelined input and output for a time-multiplexing scheme, which processes a large image with a small CNN cell block. The operation of the implemented $5 \times 5$ CNN hardware processor is verified through edge detection and shadow detection experiments.
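
The time-multiplexing idea, reusing one small cell block across a large image, can be sketched as a tile schedule. This is a simplification under stated assumptions: tiles are non-overlapping, whereas a practical design typically overlaps neighboring blocks so that boundary cells see their full neighborhood, and `block_origins` is a hypothetical helper rather than part of the paper.

```python
def block_origins(height, width, block=5):
    """Top-left corners of the block positions visited one after another."""
    return [(row, col)
            for row in range(0, height, block)
            for col in range(0, width, block)]


# A hypothetical 12 x 10 image needs 3 x 2 = 6 passes of the 5 x 5 block;
# blocks on the bottom edge are only partly filled.
schedule = block_origins(12, 10)
passes = len(schedule)
```

Each pass loads one block of pixels into the processor, lets the cells settle, and reads the result out, so the hardware cost stays fixed while the processing time grows with the number of passes.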

Human Tracking Technology using Convolutional Neural Network in Visual Surveillance (서베일런스에서 회선 신경망 기술을 이용한 사람 추적 기법)

Kang, Sung-Kwan;Chun, Sang-Hun
- Journal of Digital Convergence / v.15 no.2 / pp.173-181 / 2017
In this paper, we treat tracking as a learning problem: estimating the position and scale of a person given the person's previous position and scale together with the next and forward image fractions. Unlike other learning methods, the CNN learns both temporal and spatial features from the images of two consecutive frames. We introduce multiple pathways in the CNN to better fuse local and global information. A shift-variant CNN architecture is designed to alleviate the drift problem when distracting objects resemble the target in cluttered environments. Furthermore, we employ CNNs to estimate the scale through accurate localization of key points. These techniques are object-independent, so the proposed method can be applied to track other types of objects. The tracker's ability to handle complex situations is demonstrated on many test sequences. The accuracy of an SVM classifier using the features learned by the CNN is equivalent to the accuracy of the CNN, which confirms the importance of automatically optimized features. However, classifying a person with the convolutional neural network classifier takes less than approximately 1/40 of the SVM computation time, regardless of the type of features used.
https://doi.org/10.14400/JDC.2017.15.2.173

Search Result 533, Processing Time 0.025 seconds