Search | Korea Science

Speech Recognition Model Based on CNN using Spectrogram (스펙트로그램을 이용한 CNN 음성인식 모델)

Won-Seog Jeong;Haeng-Woo Lee
- The Journal of the Korea institute of electronic communication sciences / v.19 no.4 / pp.685-692 / 2024
In this paper, we propose a new CNN model to improve the recognition performance of command voice signals. The method applies a short-time Fourier transform (STFT) to each short-time section of the input signal to obtain a spectrogram image, then performs supervised multi-class learning on that image with a CNN deep learning model. Converting the time-domain voice signal to the frequency domain expresses its characteristics well, and training on the resulting spectrogram images classifies commands effectively. To verify the performance of the proposed speech recognition system, a simulation program was written with the TensorFlow and Keras libraries and a simulation experiment was performed. The experiment confirmed that the proposed deep learning algorithm achieves an accuracy of 92.5%.
https://doi.org/10.13067/JKIECS.2024.19.4.685
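
The spectrogram step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Hann window, the 64-sample frame, the 32-sample hop, the synthetic test tone, and the function names `hann` and `stft_magnitude` are all assumptions, since the abstract does not state them. The CNN classification stage is not shown.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (assumed; the abstract names no window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a list of magnitude spectra, one per short-time frame.

    Each frame is windowed and transformed with a direct DFT; only the
    non-negative frequency bins (0 .. frame_len // 2) are kept.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic stand-in for a voice command: one second of a 100 Hz tone
# sampled at 800 Hz (illustrative values only).
sample_rate = 800
tone_hz = 100
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(sample_rate)]

spec = stft_magnitude(signal)  # time-by-frequency grid, i.e. the spectrogram "image"
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64  # bin index to frequency for a 64-sample frame
```

The resulting grid of frames by frequency bins is what would be fed to a CNN as an image; in practice a library routine and a log-magnitude scale would likely replace the direct DFT used here.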

Normal data based rotating machine anomaly detection using CNN with self-labeling

Bae, Jaewoong;Jung, Wonho;Park, Yong-Hwa
- Smart Structures and Systems / v.29 no.6 / pp.757-766 / 2022
To train deep learning algorithms, a sufficient amount of data is required. However, in most engineering systems, the acquisition of fault data is difficult or sometimes not feasible, while normal data are readily secured. The dearth of data is one of the major challenges to developing deep learning models, and fault diagnosis in particular cannot be made in the absence of fault data. In this context, this paper proposes an anomaly detection methodology for rotating machines using only normal data with self-labeling. Since only normal data are used for anomaly detection, a self-labeling method is used to generate a new labeled dataset. The overall procedure includes the following three steps: (1) transformation of normal data to self-labeled data based on a pretext task, (2) training the convolutional neural networks (CNN), and (3) anomaly detection using a defined anomaly score based on the softmax output of the trained CNN. The softmax value of an abnormal sample behaves differently from the softmax values of normal samples. To verify the proposed method, four case studies were conducted, on the Case Western Reserve University (CWRU) bearing dataset, the IEEE PHM 2012 data challenge dataset, the PHMAP 2021 data challenge dataset, and a laboratory bearing testbed, and the results were compared to those of existing machine learning and deep learning methods. The results showed that the proposed algorithm could detect faults in the bearing testbed and compressor with over 99.7% accuracy. In particular, it was possible to detect not only bearing faults but also structural faults such as unbalance and belt looseness with very high accuracy. The proposed method showed high anomaly detection performance compared with existing GAN and autoencoder-based anomaly detection algorithms.
https://doi.org/10.12989/sss.2022.29.6.757

Development of a modified model for predicting cabbage yield based on soil properties using GIS (GIS를 이용한 토양정보 기반의 배추 생산량 예측 수정모델 개발)

Choi, Yeon Oh;Lee, Jaehyeon;Sim, Jae Hoo;Lee, Seung Woo
- Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.5 / pp.449-456 / 2022
This study proposes a deep learning algorithm that predicts crop yield, using GIS (Geographic Information System) to extract soil properties from Soilgrids and soil suitability class maps. The proposed model modifies the structure of a published CNN-RNN (Convolutional Neural Network-Recurrent Neural Network) based crop yield prediction model to suit the domestic crop environment. The existing model has two characteristics: it replaces the original yield with the average yield of the year, and it trains on data from the year being predicted. The new model uses the original field value to ensure accuracy, and its network structure has been improved so that it can be trained only with data prior to the year to be predicted. The proposed model predicted the yield per unit area of autumn cabbage for kimchi by region based on weather, soil, soil suitability classes, and yield data from 1980 to 2020. When data were computed and predicted for each of the four years from 2018 to 2021, the error on the test data set was about 10%, enabling accurate yield prediction, especially in regions with a large proportion of total yield. In addition, both the proposed model and the existing model show that the error gradually decreases as the number of years of training data increases, resulting in better general-purpose performance as the amount of training data grows.
https://doi.org/10.7848/ksgpc.2022.40.5.449

Automatically Diagnosing Skull Fractures Using an Object Detection Method and Deep Learning Algorithm in Plain Radiography Images

Tae Seok, Jeong;Gi Taek, Yee;Kwang Gi, Kim;Young Jae, Kim;Sang Gu, Lee;Woo Kyung, Kim
- Journal of Korean Neurosurgical Society / v.66 no.1 / pp.53-62 / 2023
Objective: Deep learning is a machine learning approach based on artificial neural network training, and object detection algorithms using deep learning are used as the most powerful tools in image analysis. We analyzed and evaluated the diagnostic performance of a deep learning algorithm to identify skull fractures in plain radiographic images and investigated its clinical applicability. Methods: A total of 2026 plain radiographic images of the skull (fracture, 991; normal, 1035) were obtained from 741 patients. The RetinaNet architecture was used as the deep learning model. Precision, recall, and average precision were measured to evaluate the deep learning algorithm's diagnostic performance. Results: With ResNet-152, the average precision at intersection over union (IOU) thresholds of 0.1, 0.3, and 0.5 was 0.7240, 0.6698, and 0.3687, respectively. When the IOU and confidence thresholds were both 0.1, the precision was 0.7292 and the recall was 0.7650. When the IOU threshold was 0.1 and the confidence threshold was 0.6, the true and false rates were 82.9% and 17.1%, respectively. There were significant differences in the true/false and false-positive/false-negative ratios between the anterior-posterior, Towne, and both lateral views (p=0.032 and p=0.003). Objects detected as false positives had vascular grooves and suture lines. Among false negatives, detection performance was poor for diastatic fractures, fractures crossing the suture line, and fractures around the vascular grooves and orbit. Conclusion: The object detection algorithm applied with deep learning is expected to be a valuable tool in diagnosing skull fractures.
https://doi.org/10.3340/jkns.2022.0062
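
The evaluation quantities named in this abstract (intersection over union, precision, recall) can be illustrated with a short sketch. The box format `(x1, y1, x2, y2)`, the example boxes, the counts, and the helper names `iou` and `precision_recall` are assumptions made for illustration; they are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical predicted and ground-truth fracture boxes.
predicted = (0, 0, 10, 10)
ground_truth = (5, 0, 15, 10)
overlap = iou(predicted, ground_truth)  # intersection 50, union 150

# Whether this prediction counts as a match at each IOU threshold from the abstract.
matches = {threshold: overlap >= threshold for threshold in (0.1, 0.3, 0.5)}
```

A looser IOU threshold accepts more loosely localized detections as correct, which is consistent with the average precision in the abstract falling as the threshold rises from 0.1 to 0.5.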

Anomalous Trajectory Detection in Surveillance Systems Using Pedestrian and Surrounding Information

Doan, Trung Nghia;Kim, Sunwoong;Vo, Le Cuong;Lee, Hyuk-Jae
- IEIE Transactions on Smart Processing and Computing / v.5 no.4 / pp.256-266 / 2016
Concurrently detected and annotated abnormal events can have a significant impact on surveillance systems. By considering the specific domain of pedestrian trajectories, this paper presents two main contributions. First, much of the work on trajectory-based anomaly detection in the literature considers only information about pedestrian paths, such as direction and speed. In contrast, this paper proposes a framework that deals with additional types of trajectory-based anomalies. These abnormal events take place when a person enters prohibited areas. Those restricted regions are constructed by an online learning algorithm that uses surrounding information, including detected pedestrians and background scenes. Second, a simple data-boosting technique is introduced to overcome a lack of training data; this problem particularly challenges all previous work, owing to the significantly low frequency of abnormal events. The technique requires only normal trajectories and fundamental information about scenes to increase the amount of training data for both normal and abnormal trajectories. With the increased amount of training data, the conventional abnormal trajectory classifier is able to achieve better prediction accuracy without falling into the over-fitting problem caused by complex learning models. Finally, the proposed framework (which annotates tracks that enter prohibited areas) and a conventional abnormal trajectory detector (using the data-boosting technique) are integrated to form a unified detector. This detector deals with different types of anomalous trajectories in a hierarchical order. The experimental results show that all proposed detectors can effectively detect anomalous trajectories in the test phase.
https://doi.org/10.5573/IEIESPC.2016.5.4.256

Application of CNN for Fish Species Classification (어종 분류를 위한 CNN의 적용)

Park, Jin-Hyun;Hwang, Kwang-Bok;Park, Hee-Mun;Choi, Young-Kiu
- Journal of the Korea Institute of Information and Communication Engineering / v.23 no.1 / pp.39-46 / 2019
In this study, as a step before developing a system for the elimination of foreign fish species, we propose an algorithm that classifies fish species by training CNNs on fish images. The raw data for CNN learning were images captured directly for each species. Two datasets were constructed and used as training and test data: Dataset 1, which increases the number of images to improve the classification of fish species, and Dataset 2, which contains images close to the natural environment. The classification performance of the four CNNs is over 99.97% for Dataset 1 and 99.5% for Dataset 2; in particular, we confirm that the CNNs trained on Dataset 2 perform satisfactorily on fish images similar to the natural environment. Among the four CNNs, AlexNet achieves satisfactory performance with the shortest execution time and training time, and we confirm that it is the most suitable structure for developing the system for the elimination of foreign fish species.
https://doi.org/10.6109/jkiice.2019.23.1.39

Age and Gender Classification with Small Scale CNN (소규모 합성곱 신경망을 사용한 연령 및 성별 분류)

Jamoliddin, Uraimov;Yoo, Jae Hung
- The Journal of the Korea institute of electronic communication sciences / v.17 no.1 / pp.99-104 / 2022
Artificial intelligence is becoming a crucial part of our lives with its incredible benefits. Machines outperform humans in recognizing objects in images, particularly in classifying people into correct age and gender groups. In this respect, age and gender classification has been one of the hot topics among computer vision researchers in recent decades. The deployment of deep Convolutional Neural Network (CNN) models has achieved state-of-the-art performance. However, most CNN-based architectures are very complex, with several dozens of training parameters, so they require much computation time and resources. For this reason, we propose a new CNN-based classification algorithm with significantly fewer training parameters and less training time than the existing methods. Despite its lower complexity, our model shows better accuracy of age and gender classification on the UTKFace dataset.
https://doi.org/10.13067/JKIECS.2022.17.1.99

Computational intelligence models for predicting the frictional resistance of driven pile foundations in cold regions

Shiguan Chen;Huimei Zhang;Kseniya I. Zykova;Hamed Gholizadeh Touchaei;Chao Yuan;Hossein Moayedi;Binh Nguyen Le
- Computers and Concrete / v.32 no.2 / pp.217-232 / 2023
Numerous studies have been performed on the behavior of pile foundations in cold regions. This study first attempted to employ artificial neural networks (ANN) to predict pile-bearing capacity, focusing on pile data recorded primarily in cold regions. As the ANN technique has disadvantages such as difficulty finding global minima and slower convergence rates, the second phase of this study deals with the development of an ANN-based predictive model improved with the Elephant Herding Optimizer (EHO), Dragonfly Algorithm (DA), Genetic Algorithm (GA), and Evolution Strategy (ES) methods for predicting the piles' bearing capacity. The network inputs included the pile geometrical features, pile area (m²), pile length (m), internal friction angle along the pile body and pile tip (φ°), and effective vertical stress. The MLP model's output was the pile's ultimate bearing capacity. A sensitivity analysis was performed to determine the optimum parameters to select the best predictive model. A trial-and-error technique was also used to find the optimum network architecture and the number of hidden nodes. According to the results, there is good consistency between the bearing capacities predicted by DA-MLP and the measured bearing capacities. Based on R² (coefficient of determination) values of 0.90364 and 0.8643 for the testing and training datasets, respectively, it is suggested that the DA-MLP model can be effectively implemented with higher reliability, efficiency, and practicability to predict the bearing capacity of piles.
https://doi.org/10.12989/cac.2023.32.2.217
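
For readers unfamiliar with the metric reported in this abstract, the coefficient of determination compares the squared prediction error with the variance of the measured values. The sketch below uses the standard definition; the helper name `r_squared` and the sample numbers are made up for illustration and do not come from the paper's data.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    # Residual sum of squares: error between measured and predicted values.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted bearing capacities (arbitrary units).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(measured, predicted)  # close to 1 means predictions track measurements
```

A value near 1 indicates that predictions track the measurements closely, which is how the reported testing and training scores are usually read.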

Improving Learning Performance of Support Vector Machine using the Kernel Relaxation and the Dynamic Momentum (Kernel Relaxation과 동적 모멘트를 조합한 Support Vector Machine의 학습 성능 향상)

Kim, Eun-Mi;Lee, Bae-Ho
- The KIPS Transactions:PartB / v.9B no.6 / pp.735-744 / 2002
This paper proposes improving the learning performance of the support vector machine using kernel relaxation and dynamic momentum. The dynamic momentum applies a different momentum according to the current state. While static momentum exerts the same influence throughout training, the proposed dynamic momentum algorithm can control the convergence rate and performance as the momentum changes during training. The proposed algorithm has been applied to kernel relaxation, a recently presented sequential learning method for the support vector machine. It has also been applied to the SONAR data, which is used as a standard classification problem for evaluating neural networks. The simulation results show that the proposed algorithm has a better convergence rate and performance than those using kernel relaxation and static momentum, respectively.
https://doi.org/10.3745/KIPSTB.2002.9B.6.735

Encoder Type Semantic Segmentation Algorithm Using Multi-scale Learning Type for Road Surface Damage Recognition (도로 노면 파손 인식을 위한 Multi-scale 학습 방식의 암호화 형식 의미론적 분할 알고리즘)

Shim, Seungbo;Song, Young Eun
- The Journal of The Korea Institute of Intelligent Transport Systems / v.19 no.2 / pp.89-103 / 2020
As we face an aging society, the demand for personal mobility for disabled and aged people is increasing. In fact, as of 2017, the number of electric wheelchairs in the country had continued to increase, reaching 90,000. However, people with disabilities and seniors are more likely to have accidents while driving, because their judgment and coordination are weaker than those of the general population. One cause of such accidents is interference with the steering control of the personal vehicle due to uneven road surface conditions. In this paper, we introduce an encoder type semantic segmentation algorithm that can recognize road conditions at high speed to prevent such accidents. To this end, more than 1,500 training data and 150 test data including road surface damage were newly secured. With these data, we proposed a deep neural network composed of encoder stages only, unlike the auto-encoding type consisting of encoder and decoder stages. Compared to the conventional method, this deep neural network has a 4.45% increase in mean accuracy, a 59.2% decrease in parameters, and an 11.9% increase in computation speed. Such a high-speed algorithm is expected to bring safe personal transportation soon.
https://doi.org/10.12815/kits.2020.19.2.89

Search Result 612, Processing Time 0.036 seconds
