Search | Korea Science

Speech Recognition Model Based on CNN using Spectrogram (스펙트로그램을 이용한 CNN 음성인식 모델)

Won-Seog Jeong;Haeng-Woo Lee
- The Journal of the Korea institute of electronic communication sciences / v.19 no.4 / pp.685-692 / 2024
In this paper, we propose a new CNN model to improve the recognition performance of voice command signals. The method applies a short-time Fourier transform (STFT) to the input signal, Fourier transforming each short-time segment to obtain a spectrogram image, and then performs multi-class classification through supervised learning with a CNN deep learning model. Converting the time-domain voice signal to the frequency domain expresses its characteristics well, and training on the spectrogram image of the transformed parameters allows commands to be classified effectively. To verify the performance of the proposed speech recognition system, a simulation program was written using the TensorFlow and Keras libraries and a simulation experiment was performed. The experiment confirmed that the proposed deep learning algorithm achieves an accuracy of 92.5%.
https://doi.org/10.13067/JKIECS.2024.19.4.685
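
The STFT step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stft_magnitude`, the frame length, the hop size, and the test tone are all assumptions chosen for illustration, and a direct DFT stands in for the FFT so the example needs only the standard library. The resulting list of magnitude frames is the kind of 2-D array that would be rendered as a spectrogram image and passed to a CNN.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum (bins 0..frame_len//2) per overlapping frame."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of a single bin; an FFT would be used in practice.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical test tone: 1 kHz sampled at 8 kHz.
# Bin spacing is 8000 / 64 = 125 Hz, so the tone sits in bin 1000 / 125 = 8.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spectrogram = stft_magnitude(tone)
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spectrogram]
```

Each row of `spectrogram` is one time slice and each column one frequency bin; for a steady tone, every frame peaks in the same bin.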

WFMM Neural Networks Based Skin Color Filter for Face Detection (얼굴패턴 검출 문제에서 WFMM 신경망 기반의 피부색 검출 기법)

Cho Il-Gook;Kim Ho-Joon
- Proceedings of the Korea Information Processing Society Conference / 2006.05a / pp.299-302 / 2006
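This paper introduces a face detection system composed of multiple filters and a hybrid neural network, together with a skin color detection technique based on the WFMM neural network. The multiple filters, which form the preprocessing stage, reduce the number of candidate regions and thereby improve the speed of the system. The color filter among them extracts skin color feature values from a total of 11 color spaces for use as training data, and classifies skin color through the hyperboxes generated from this training data. In addition, the relevance factor property of the WFMM neural network is used to analyze the relative importance of each color space, so that the color spaces useful for skin color detection can be identified and extracted. The hybrid neural network for face pattern detection first generates feature maps through a CNN that uses the Gabor transform, and then verifies the final face pattern with the WFMM neural network.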

CNN (Convolutional Neural Network) based in-loop filter in HEVC (컨볼루션 신경망을 이용한 고효율 비디오 부호화에서의 인-루프 필터)

Park, Woonsung;Kim, Munchurl
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2016.06a / pp.369-372 / 2016
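This paper proposes a method that improves coding efficiency by replacing SAO (sample adaptive offset), one of the in-loop filters adopted in High Efficiency Video Coding (HEVC), with a convolutional neural network. SAO transmits appropriate offset values from the encoder to the decoder in order to reduce quantization error. With the proposed CNN-based in-loop filter, the encoder and decoder use the same convolutional neural network, so quantization error can be reduced without transmitting additional bits to the decoder. Two network structures were used, and for each structure a different model was applied according to the mean squared error between the input image and the original image. Applied to HEVC, the proposed method yields better picture quality with fewer bits than the conventional method, giving a BD-rate gain, and it also shows better results in subjective quality comparison.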

A Study on the Deep Learning-based Tree Species Classification by using High-resolution Orthophoto Images (고해상도 정사영상을 이용한 딥러닝 기반의 산림수종 분류에 관한 연구)

JANG, Kwangmin
- Journal of the Korean Association of Geographic Information Studies / v.24 no.3 / pp.1-9 / 2021
In this study, we evaluated the accuracy of a deep learning-based tree species classification model trained on high-resolution images. We selected five species classes for classification: pine, birch, larch, Korean pine, and Mongolian oak. We created 5,000 datasets using high-resolution orthophotos and a forest type map. A CNN deep learning model was used for tree species classification. We divided the datasets into training, verification, and test data at a 5:3:2 ratio and used them to train and evaluate the model. The overall accuracy of the model was 89%. The accuracy for each species was 95% for pine, 89% for birch, 80% for larch, 86% for Korean pine, and 98% for Mongolian oak.
https://doi.org/10.11108/kagis.2021.24.3.001
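
The 5:3:2 division of the 5,000 datasets mentioned in the abstract above works out to 2,500 training, 1,500 verification, and 1,000 test items. A minimal sketch of such a split follows; the function name `split_532`, the shuffling, and the fixed seed are assumptions for illustration, since the abstract does not say how the items were assigned.

```python
import random


def split_532(items, seed=0):
    """Shuffle and split into training, verification, and test subsets at 5:3:2."""
    shuffled = list(items)
    # A fixed seed (an assumption here) keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 5 // 10
    n_val = len(shuffled) * 3 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, 2 parts out of 10
    return train, val, test


# Integer IDs stand in for the 5,000 image datasets.
train, val, test = split_532(range(5000))
print(len(train), len(val), len(test))  # 2500 1500 1000
```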

A Statistical Downscaling of Climate Change Scenarios Using Deep Convolutional Neural Networks (합성곱 신경망(CNN)기반 한반도 지역 대상 기후 변화 시나리오의 통계학적 상세화 기법 개발)

Kim, Yun-Sung;Uranchimeg, Sumiya;Yu, Jae-Ung;Cho, Hemie;Kwon, Hyun-Han
- Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.326-326 / 2022
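Climate change scenarios are used to produce future climate projections (temperature, precipitation, wind, humidity, etc.) by applying changes in radiative forcing caused by anthropogenic factors such as greenhouse gases, aerosols, and land-use change to Earth system models. They also serve as proactive information for assessing the future impacts of climate change and minimizing damage. Because of limitations in their structure, parameterization, and uncertainty, GCMs and RCMs have relatively coarse spatiotemporal scales, exhibit spatiotemporal discrepancies, that is, bias, when reproducing observed meteorological variables, and are reported to be inherently unable to reproduce the temporal variability of observed meteorological variables. For this reason, spatiotemporal downscaling and bias correction are essential before information produced by climate models can be applied in hydrology. In this study, reanalysis data were bias-corrected using observational data, the climate change scenarios were then downscaled with a convolutional neural network (CNN) to produce high-resolution data, and the applicability of the CNN-based downscaling technique was evaluated against ground observations.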

Development of an Automatic Monitoring System for Ultrasound Signals Using Artificial Intelligence and Convolutional Neural Networks (인공지능을 활용한 초음파 신호와 합성곱 신경망 기반 자동 적조 모니터링 시스템)

Daehun Kim;Hyeon-Ju Jeon;O-Joun Lee;Hae Gyun Lim
- Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.662-664 / 2023
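The growth of marine phytoplankton can cause harmful red tides, which damage ecosystems in many countries. Monitoring red tides is important for preventing and controlling the proliferation of phytoplankton. However, current red tide monitoring techniques are limited in measurement accuracy by weather, time constraints, and the difficulty of real-time monitoring. This study demonstrates the successful development of an automatic real-time monitoring system designed specifically to detect red tide outbreaks. The system processes acoustic echo data and uses convolutional neural networks (CNN) to accurately distinguish phytoplankton concentrations. In particular, the CNN model proved highly effective at establishing the correlation between the transformed frequency spectrum of the acoustic signal and the concentration of Cochlodinium polykrikoides (C. polykrikoides). The CNN achieves an accuracy of 0.90 in detecting C. polykrikoides. This combination of monitoring and CNN classification shows significant potential for real-time measurement and is expected to enable an automatic monitoring system that requires no additional procedures.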
https://doi.org/10.3745/PKIPS.y2023m11a.662

Uniform Motion Deblurring using Shock Filter and Convolutional Neural Network (쇼크 필터와 합성곱 신경망 기반의 균일 모션 디블러링 기법)

Jeong, Minso;Jeong, Jechang
- Journal of Broadcast Engineering / v.23 no.4 / pp.484-494 / 2018
The uniform motion blur removal algorithm of Cho et al. cannot clearly restore the edge regions of an image. We propose an effective algorithm that overcomes this problem by using a shock filter, which reconstructs a blurred step signal into a sharp edge, and a convolutional neural network (CNN), which learns by extracting features from the image. The uniform motion blur kernel is then estimated from the latent sharp image to remove the blur from the image. By reconstructing the latent sharp image with the shock filter and CNN, the proposed algorithm remedies the shortcomings of the conventional algorithm. Experimental results confirmed that the proposed algorithm achieves better reconstruction performance than the conventional algorithm in both objective and subjective image quality.
https://doi.org/10.5909/JBE.2018.23.4.484

LeafNet: Plants Segmentation using CNN (LeafNet: 합성곱 신경망을 이용한 식물체 분할)

Jo, Jeong Won;Lee, Min Hye;Lee, Hong Ro;Chung, Yong Suk;Baek, Jeong Ho;Kim, Kyung Hwan;Lee, Chang Woo
- Journal of Korea Society of Industrial Information Systems / v.24 no.4 / pp.1-8 / 2019
Plant phenomics is a technique for observing and analyzing morphological features in order to select plant varieties with excellent traits. Conventional methods are difficult to apply to a phenomics system because the color threshold value must be changed manually according to the detection target. In this paper, we propose a convolutional neural network (CNN) structure that can automatically segment plants from the background for a phenomics system. LeafNet consists of nine convolution layers and a sigmoid activation function for determining the presence of plants. Training with LeafNet yielded a precision of 98.0% and a recall of 90.3% on plant seedling images. This confirms its applicability to phenomics systems.
https://doi.org/10.9723/jksiis.2019.24.4.001
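
The precision and recall figures reported in the abstract above are standard measures for a binary plant/background decision. The sketch below shows how they are computed from per-pixel labels; the function name `precision_recall` and the tiny example masks are hypothetical and are not the paper's data.

```python
def precision_recall(predicted, truth):
    """Compute precision and recall for binary labels (1 = plant, 0 = background)."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)      # true positives
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)  # false positives
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical flattened masks of eight pixels each.
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 1, 0, 1, 1]
precision, recall = precision_recall(predicted, truth)
print(precision, recall)  # 0.75 0.6
```

High precision with lower recall, as in the reported 98.0% and 90.3%, means that pixels labeled as plant are almost always correct while some plant pixels are missed.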

Classifying Images of The ASL Alphabet using Dual Homogeneous CNNs Structure (이중 동종 CNN 구조를 이용한 ASL 알파벳의 이미지 분류)

Erniyozov Shokhrukh;Man-Sung Kwan;Seong-Jong Park;Gwang-Jun Kim
- The Journal of the Korea institute of electronic communication sciences / v.18 no.3 / pp.449-458 / 2023
Many people think that sign language is only for people who are deaf and cannot speak, but it is of course also necessary for people who want to talk with them. One of the biggest challenges in ASL (American Sign Language) alphabet recognition is the high inter-class similarity and high intra-class variance. In this paper, we propose an architecture that can overcome these two problems by performing similarity learning to reduce inter-class similarity and intra-class variance between images. The proposed architecture consists of the same convolutional neural network in a double configuration that shares parameters (weights and biases), and it applies the Keras API to reduce similarity and variance through this pathway. The similarity learning results show that using the dual CNN improves accuracy by reducing the similarity and variability between classes, with the poor results of two classes not included.
https://doi.org/10.13067/JKIECS.2023.18.3.449

Recurrent Neural Network Based Spectrum Sensing Technique for Cognitive Radio Communications (인지 무선 통신을 위한 순환 신경망 기반 스펙트럼 센싱 기법)

Jung, Tae-Yun;Jeong, Eui-Rim
- Journal of the Korea Institute of Information and Communication Engineering / v.24 no.6 / pp.759-767 / 2020
This paper proposes a new recurrent neural network (RNN) based spectrum sensing technique for cognitive radio communications. The proposed technique determines whether a primary user's signal is present without any prior information about the primary users. The method samples at high speed across the whole sensing bandwidth and then converts the signal into a frequency spectrum via the fast Fourier transform (FFT). This spectrum is cut into sensing channel bandwidths and fed into the RNN to determine whether each channel is vacant. The performance of the proposed technique is verified through computer simulations. According to the results, the proposed technique outperforms the existing threshold-based technique by more than 2 dB and performs similarly to the existing convolutional neural network (CNN) based method. In addition, experiments carried out in indoor environments show that the proposed technique performs more than 4 dB better than both the conventional threshold-based and CNN-based methods.
https://doi.org/10.6109/jkiice.2020.24.6.759
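
The preprocessing described in the abstract above (wideband sampling, transform to a frequency spectrum, then cutting the spectrum into per-channel segments) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the names `power_spectrum` and `slice_channels`, the 128-sample noise-free test signal, and the four equal-width channels are hypothetical, a direct DFT replaces the FFT to stay within the standard library, and the RNN that would consume each segment is not shown. The per-channel energy at the end is only a simple check of which segment is occupied.

```python
import cmath
import math


def power_spectrum(samples):
    """Power per frequency bin via a direct DFT (an FFT would be used in practice)."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2 / n
            for k in range(n)]


def slice_channels(spectrum, num_channels):
    """Cut the wideband spectrum into equal-width per-channel segments."""
    width = len(spectrum) // num_channels
    return [spectrum[c * width:(c + 1) * width] for c in range(num_channels)]


# Hypothetical wideband capture: one complex tone in bin 40 of 128, no noise.
n = 128
samples = [cmath.exp(2j * math.pi * 40 * t / n) for t in range(n)]

channels = slice_channels(power_spectrum(samples), 4)  # 32 bins per channel
energies = [sum(segment) for segment in channels]
occupied = max(range(len(energies)), key=energies.__getitem__)
```

With four channels of 32 bins each, bin 40 falls in the second segment (index 1), so that segment carries essentially all of the signal energy while the others are near zero.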

Search Result 533, Processing Time 0.028 seconds
