Search | Korea Science

RoutingConvNet: A Light-weight Speech Emotion Recognition Model Based on Bidirectional MFCC (RoutingConvNet: 양방향 MFCC 기반 경량 음성감정인식 모델)

Hyun Taek Lim;Soo Hyung Kim;Guee Sang Lee;Hyung Jeong Yang
- Smart Media Journal / v.12 no.5 / pp.28-35 / 2023
In this study, we propose RoutingConvNet, a new light-weight model with fewer parameters, to improve the applicability and practicality of speech emotion recognition. To reduce the number of learnable parameters, the proposed model connects bidirectional MFCCs on a channel-by-channel basis to learn long-term emotion dependence and extract contextual features. A light-weight deep CNN is constructed for low-level feature extraction, and self-attention is used to obtain channel and spatial information from speech signals. In addition, we apply dynamic routing to improve accuracy and make the model robust to feature variations. Across experiments on speech emotion datasets (EMO-DB, RAVDESS, and IEMOCAP), the proposed model reduces parameters and improves accuracy, achieving 87.86%, 83.44%, and 66.06% accuracy, respectively, with about 156,000 parameters. We also propose a metric that quantifies the trade-off between the number of parameters and accuracy for evaluating light-weight models.
https://doi.org/10.30693/SMJ.2023.12.5.28
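
The channel-wise connection of bidirectional MFCCs mentioned in the abstract above can be pictured with a minimal sketch. It assumes that "bidirectional" means the original frame sequence plus its time-reversed copy, which the abstract does not spell out; the function name `stack_bidirectional` and the toy values are hypothetical.

```python
def stack_bidirectional(mfcc):
    """Return a 2-channel feature from a time-major MFCC matrix.

    mfcc is a list of frames, each a list of coefficients.
    Channel 0 keeps the original frame order; channel 1 is the
    time-reversed order (an assumed reading of "bidirectional").
    """
    forward = [list(frame) for frame in mfcc]
    backward = [list(frame) for frame in reversed(mfcc)]
    return [forward, backward]


# Toy input: 3 frames x 2 coefficients (made-up values).
toy_mfcc = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
channels = stack_bidirectional(toy_mfcc)
print(len(channels), len(channels[0]), len(channels[0][0]))  # prints: 2 3 2
```

Stacking along a channel axis leaves the frame count unchanged, so a downstream CNN would see both directions at every time step.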

A Study on the Health Index Based on Degradation Patterns in Time Series Data Using ProphetNet Model (ProphetNet 모델을 활용한 시계열 데이터의 열화 패턴 기반 Health Index 연구)

Sun-Ju Won;Yong Soo Kim
- Journal of Korean Society of Industrial and Systems Engineering / v.46 no.3 / pp.123-138 / 2023
The Fourth Industrial Revolution and sensor technology have led to increased utilization of sensor data. In our modern society, data complexity is rising, and the extraction of valuable information has become crucial with the rapid changes in information technology (IT). Recurrent neural networks (RNN) and long short-term memory (LSTM) models have shown remarkable performance in natural language processing (NLP) and time series prediction. Consequently, there is a strong expectation that models excelling in NLP will also excel in time series prediction. However, current research on Transformer models for time series prediction remains limited. Traditional RNN and LSTM models have demonstrated superior performance compared to Transformers in big data analysis. Nevertheless, with continuous advancements in Transformer models, such as GPT-2 (Generative Pre-trained Transformer 2) and ProphetNet, they have gained attention in the field of time series prediction. This study aims to evaluate the classification performance and interval prediction of remaining useful life (RUL) using an advanced Transformer model. The performance of each model will be utilized to establish a health index (HI) for cutting blades, enabling real-time monitoring of machine health. The results are expected to provide valuable insights for machine monitoring, evaluation, and management, confirming the effectiveness of advanced Transformer models in time series analysis when applied in industrial settings.
https://doi.org/10.11627/jksie.2023.46.3.123

Efficient Neural Network Architecture for Fast Target Detection and Recognition (목표물의 고속 탐지 및 인식을 위한 효율적인 신경망 구조)

Weon, Yong-Kwan;Baek, Yong-Chang;Lee, Jeong-Su
- The Transactions of the Korea Information Processing Society / v.4 no.10 / pp.2461-2469 / 1997
Target detection and recognition problems, in which neural networks are widely used, require translation invariance and real-time processing in addition to the requirements of general pattern recognition problems. This paper presents a novel architecture that meets these requirements and explains an effective methodology for training the network. The proposed neural network is an architectural extension of the shared-weight neural network, which is composed of a feature extraction stage followed by a pattern recognition stage. Its feature extraction stage performs a correlation operation on the input with a weight kernel, and the entire neural network can be considered a nonlinear correlation filter. Therefore, the output of the proposed neural network is a correlation plane with peak values at the location of the target. The architecture is suitable for implementation on parallel or distributed computers, which allows it to be applied to problems that require real-time processing. A training methodology that overcomes the problem caused by the imbalance between the numbers of targets and non-targets is also introduced. To verify its performance, the proposed network is applied to the detection and recognition of a specific automobile driving around in a parking lot. The results show no false alarms and processing fast enough to track a target moving at about 190 km per hour.
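
The idea of a correlation plane that peaks at the target location can be illustrated with a plain, linear 2-D cross-correlation. This is only a sketch: the network in the abstract above is described as a nonlinear correlation filter with learned weights, whereas the kernel, image, and function names here are hypothetical toy values.

```python
def correlation_plane(image, kernel):
    """Valid-mode 2-D cross-correlation of image with kernel (lists of lists)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    plane = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Sum of element-wise products over the kernel window at (r, c).
            s = 0.0
            for i in range(kh):
                for j in range(kw):
                    s += image[r + i][c + j] * kernel[i][j]
            row.append(s)
        plane.append(row)
    return plane


def peak_location(plane):
    """Return (row, col) of the largest value in the plane."""
    best = max((v, r, c) for r, row in enumerate(plane) for c, v in enumerate(row))
    return best[1], best[2]


# Toy scene: the 2x2 diagonal pattern sits with its top-left corner at (1, 2).
image = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
kernel = [[1, 0], [0, 1]]
plane = correlation_plane(image, kernel)
print(peak_location(plane))  # prints: (1, 2)
```

Because the same kernel is applied at every position, shifting the target simply shifts the peak, which is the translation behaviour the abstract asks for. Each output cell is independent of the others, so the loops parallelize naturally.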

A High-Voltage Compliant Neural Stimulation IC for Implant Devices Using Standard CMOS Process (체내 이식 기기용 표준 CMOS 고전압 신경 자극 집적 회로)

Abdi, Alfian;Cha, Hyouk-Kyu
- Journal of the Institute of Electronics and Information Engineers / v.52 no.5 / pp.58-65 / 2015
This paper presents the design of an implantable stimulation IC intended for neural prosthetic devices using a 0.18-$\mu$m standard CMOS technology. The proposed single-channel biphasic current stimulator prototype is designed to deliver up to 1 mA of current to a tissue-equivalent 10-k$\Omega$ load using a 12.8-V supply voltage. To use only low-voltage standard CMOS transistors in the design, transistor stacking with a dynamic gate biasing technique is used for reliable operation at high voltage. In addition, an active charge balancing circuit maintains zero net charge at the stimulation site over the complete stimulation cycle. The total area of the stimulator IC, consisting of a DAC, current stimulation output driver, level shifters, digital logic, and active charge balancer, is 0.13 mm$^2$, making it suitable for multi-channel neural prosthetic devices.
https://doi.org/10.5573/ieie.2015.52.5.058
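
A short worked calculation shows why a supply well above typical low-voltage CMOS levels is needed for the figures quoted in the abstract above. It uses only Ohm's law with the stated current and load; the symbol names are ours, and reading the remainder as headroom for the output stage is an assumption rather than a figure from the paper.

```latex
V_{\text{load}} = I_{\max}\, R_{\text{load}}
                = 1\,\text{mA} \times 10\,\text{k}\Omega
                = 10\,\text{V},
\qquad
V_{\text{supply}} - V_{\text{load}} = 12.8\,\text{V} - 10\,\text{V} = 2.8\,\text{V}.
```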

PowerShell-based Malware Detection Method Using Command Execution Monitoring and Deep Learning (명령 실행 모니터링과 딥 러닝을 이용한 파워셸 기반 악성코드 탐지 방법)

Lee, Seung-Hyeon;Moon, Jong-Sub
- Journal of the Korea Institute of Information Security & Cryptology / v.28 no.5 / pp.1197-1207 / 2018
PowerShell is a command-line shell and scripting language built on the .NET framework, and it has several advantages as an attack tool, including built-in support on Windows, easy code concealment and persistence, and a variety of pen-test frameworks. Accordingly, malware using PowerShell is increasing rapidly, but conventional malware detection techniques are limited in coping with it. In this paper, we propose an improved monitoring method to observe commands executed in PowerShell and a deep learning-based malware classification model that extracts features from commands using a Convolutional Neural Network (CNN) and feeds them to a Recurrent Neural Network (RNN) in order of execution. Tested with 5-fold cross validation on 1,916 PowerShell-based malware samples collected from a malware sharing site and 38,148 benign scripts released by an obfuscation detection study, the model detects malware effectively, with about 97% True Positive Rate (TPR) and 1% False Positive Rate (FPR).
https://doi.org/10.13089/JKIISC.2018.28.5.1197
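
For reference, the two rates quoted in the abstract above follow the standard confusion-matrix definitions. The sketch below uses made-up toy counts, not the paper's data, and the function name `rates` is hypothetical.

```python
def rates(tp, fn, fp, tn):
    """Return (TPR, FPR) from confusion-matrix counts.

    TPR = TP / (TP + FN): share of malicious samples that were flagged.
    FPR = FP / (FP + TN): share of benign samples that were wrongly flagged.
    """
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return tpr, fpr


# Toy counts chosen for illustration only.
tpr, fpr = rates(tp=97, fn=3, fp=1, tn=99)
print(tpr, fpr)  # prints: 0.97 0.01
```

With a benign set much larger than the malware set, as in the study, even a small FPR can correspond to more false alarms than there are missed detections, which is why both rates are reported.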

Development of Roughness Estimation Model for Plunge Grinding of Valve Parts Using Neural Network (뉴럴 네트워크를 이용한 밸브 부품 생산용 플런지 연삭의 거칠기 예측모델 개발)

Choi, Jeong-Ju;Park, Joon-Hong
- Journal of the Korea Academia-Industrial cooperation Society / v.12 no.1 / pp.62-67 / 2011
The grinding process is carried out in the final machining stage to meet quality requirements. In general, the ground surface of a workpiece is affected by the dressing conditions as well as the grinding conditions. Several roughness models have been studied in order to estimate the roughness of the workpiece. These models define specific parameters and treat the parameters that affect roughness as having a multiplicative relationship. However, a multiplicative relationship among parameters is not sufficient to describe the complicated grinding mechanism. Therefore, a neural network algorithm is used in this paper to predict the ground roughness for plunge grinding. The proposed structure consists of an initial-roughness model and a final-roughness model. The input parameters of the proposed neural network are taken from those of the existing roughness models. The performance of the proposed model is verified through experiments.
https://doi.org/10.5762/KAIS.2011.12.1.062

A Dynamic Three Dimensional Neuro System with Multi-Discriminator (다중 판별자를 가지는 동적 삼차원 뉴로 시스템)

Kim, Seong-Jin;Lee, Dong-Hyung;Lee, Soo-Dong
- Journal of KIISE:Software and Applications / v.34 no.7 / pp.585-594 / 2007
The back propagation algorithm takes a long time to learn input patterns and makes additional or repeated learning of patterns difficult. Aleksander therefore proposed the binary neural network, which overcomes these disadvantages of the BP network. However, it was limited in repeated learning and could not extract a generalized pattern. In this paper, we propose a dynamic three-dimensional neuro system consisting of a learning network based on a weightless neural network and a feedback module that accumulates characteristics. The proposed system can train on additional and repeated patterns. It can also produce a generalized pattern by applying a proper threshold to the discriminator of each learning net obtained from the learning procedure. We then reuse the generalized pattern to raise the recognition rate. In the final processing step, a maximum response detector decides the category. In experiments on the MNIST database of NIST, we obtained a recognition rate of 99.3% on the training data.
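
The weightless-network vocabulary in the abstract above (discriminators, maximum response detector) can be made concrete with a generic n-tuple sketch. This is not the authors' three-dimensional system: it omits their feedback module, thresholds, and generalized patterns, and the class name, the consecutive-bit tuple mapping, and the toy patterns are all assumptions for illustration.

```python
class Discriminator:
    """One class's discriminator: a bank of RAM nodes, each addressed
    by a fixed tuple of input bits (generic n-tuple scheme)."""

    def __init__(self, input_size, tuple_size):
        # Group consecutive bit positions into tuples; practical systems
        # usually use a random mapping instead.
        self.tuples = [
            range(start, min(start + tuple_size, input_size))
            for start in range(0, input_size, tuple_size)
        ]
        # Each RAM node is modelled as the set of addresses seen in training.
        self.rams = [set() for _ in self.tuples]

    def _addresses(self, bits):
        return [tuple(bits[i] for i in idx) for idx in self.tuples]

    def train(self, bits):
        # Training only writes to memory, so adding patterns later is cheap.
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def response(self, bits):
        # Number of RAM nodes that recognise their part of the input.
        return sum(addr in ram for ram, addr in zip(self.rams, self._addresses(bits)))


def classify(discriminators, bits):
    """Maximum response detector: the class whose discriminator fires most."""
    return max(discriminators, key=lambda label: discriminators[label].response(bits))


nets = {"A": Discriminator(8, 2), "B": Discriminator(8, 2)}
nets["A"].train([1, 1, 1, 1, 0, 0, 0, 0])
nets["B"].train([0, 0, 0, 0, 1, 1, 1, 1])
print(classify(nets, [1, 1, 1, 0, 0, 0, 0, 0]))  # prints: A
```

Since training is a one-pass memory write rather than iterative weight adjustment, additional patterns can be learned without retraining, which is the property the abstract contrasts with back propagation.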

A novel radioactive particle tracking algorithm based on deep rectifier neural network

Dam, Roos Sophia de Freitas;dos Santos, Marcelo Carvalho;do Desterro, Filipe Santana Moreira;Salgado, William Luna;Schirru, Roberto;Salgado, Cesar Marques
- Nuclear Engineering and Technology / v.53 no.7 / pp.2334-2340 / 2021
Radioactive particle tracking (RPT) is a minimally invasive nuclear technique that tracks a radioactive particle inside a volume of interest by means of a mathematical location algorithm. Over the past decades, many algorithms have been developed, including ones based on artificial intelligence techniques. In this study, the RPT technique is applied in a simulated test section that employs a simplified mixer filled with concrete, six scintillator detectors, and a ¹³⁷Cs radioactive particle emitting gamma rays of 662 keV. The test section was developed using the MCNPX code, which is a mathematical code based on Monte Carlo simulation, and 3516 different radioactive particle positions (x, y, z) were simulated. The novelty of this paper is the use of a location algorithm based on a deep learning model, more specifically a 6-layer deep rectifier neural network (DRNN), whose hyperparameters were defined using a Bayesian optimization method. A DRNN is a type of deep feedforward neural network that replaces the usual sigmoid-based activation functions, traditionally used in vanilla Multilayer Perceptron networks, with rectified activation functions. Results show the high accuracy of the DRNN in an RPT system. The root mean squared errors for the x, y, and z coordinates of the radioactive particle are, respectively, 0.03064, 0.02523, and 0.07653.
https://doi.org/10.1016/j.net.2021.01.002
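
Two ingredients named in the abstract above, the rectified activation that replaces the sigmoid and the per-coordinate root mean squared error, can be written out in a few lines. The sketch uses the standard definitions; the function names and the toy positions are hypothetical and unrelated to the paper's simulated data.

```python
import math


def relu(x):
    """Rectified linear activation: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic activation, shown only for comparison with relu."""
    return 1.0 / (1.0 + math.exp(-x))


def per_axis_rmse(predicted, actual):
    """RMSE computed separately for x, y, z over lists of (x, y, z) tuples."""
    n = len(actual)
    return tuple(
        math.sqrt(sum((p[k] - a[k]) ** 2 for p, a in zip(predicted, actual)) / n)
        for k in range(3)
    )


# Toy positions (made-up values).
predicted = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
actual = [(3.0, 0.0, 1.0), (4.0, 0.0, 1.0)]
rmse_x, rmse_y, rmse_z = per_axis_rmse(predicted, actual)
print(relu(-2.0), relu(3.0), sigmoid(0.0))  # prints: 0.0 3.0 0.5
```

Reporting one RMSE per axis, as the abstract does, makes it visible when one coordinate is harder to locate than the others.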

A Deep Learning-based Hand Gesture Recognition Robust to External Environments (외부 환경에 강인한 딥러닝 기반 손 제스처 인식)

Oh, Dong-Han;Lee, Byeong-Hee;Kim, Tae-Young
- The Journal of Korean Institute of Next Generation Computing / v.14 no.5 / pp.31-39 / 2018
Recently, there have been active studies on providing user-friendly interfaces in virtual reality environments by recognizing user hand gestures with deep learning. However, most studies use separate sensors to obtain hand information or require pre-processing for efficient learning. They also fail to account for changes in the external environment, such as changes in lighting or partial occlusion of the hand. This paper proposes a deep learning-based hand gesture recognition method that is robust to external environments and requires no pre-processing of RGB images obtained from an ordinary webcam. We improve the VGGNet and GoogLeNet structures and compare the performance of each. The VGGNet and GoogLeNet structures presented in this paper showed recognition rates of 93.88% and 93.75%, respectively, on data containing dim, partially obscured, or partially out-of-sight hand images. In terms of memory and speed, GoogLeNet used about 3 times less memory than VGGNet, and its processing was 10 times faster. The proposed method can run in real time and can be used as a hand gesture interface in various areas such as games, education, and medical services in virtual reality environments.

Semantic Segmentation of Hazardous Facilities in Rural Area Using U-Net from KOMPSAT Ortho Mosaic Imagery (KOMPSAT 정사모자이크 영상으로부터 U-Net 모델을 활용한 농촌위해시설 분류)

Sung-Hyun Gong;Hyung-Sup Jung;Moung-Jin Lee;Kwang-Jae Lee;Kwan-Young Oh;Jae-Young Chang
- Korean Journal of Remote Sensing / v.39 no.6_3 / pp.1693-1705 / 2023
Rural areas, which account for about 90% of the country's land area, are increasing in importance and value as spaces that perform various public functions. However, facilities that adversely affect residents' lives, such as livestock facilities, factories, and solar panels, are being built indiscriminately near residential areas, damaging the rural environment and landscape and lowering residents' quality of life. To prevent disorderly development in rural areas and manage rural space in a planned manner, detection and monitoring of hazardous facilities in rural areas is necessary. Satellite imagery can be acquired periodically and provides information on the entire region. Effective detection is possible with image-based deep learning techniques using convolutional neural networks. Therefore, the U-Net model, which shows high performance in semantic segmentation, was used to classify potentially hazardous facilities in rural areas. In this study, KOMPSAT ortho-mosaic optical imagery provided by the Korea Aerospace Research Institute in 2020 with a spatial resolution of 0.7 meters was used, and AI training data for livestock facilities, factories, and solar panels were produced by hand for training and inference. After training with U-Net, a pixel accuracy of 0.9739 and a mean Intersection over Union (mIoU) of 0.7025 were achieved. The results of this study can be used for monitoring hazardous facilities in rural areas and are expected to serve as a basis for rural planning.
https://doi.org/10.7780/kjrs.2023.39.6.3.3
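
The two segmentation scores reported in the abstract above can be computed from a predicted label map and a ground-truth map as sketched below. The definitions are the standard ones; how the study averaged over classes (for example, whether background is included) is not stated, so the class handling here is an assumption, and the toy maps and function names are hypothetical.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the true label."""
    total = correct = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            total += 1
            correct += (p == t)
    return correct / total


def mean_iou(pred, truth, num_classes):
    """Mean over classes of intersection / union of predicted and true masks."""
    ious = []
    for cls in range(num_classes):
        inter = union = 0
        for p_row, t_row in zip(pred, truth):
            for p, t in zip(p_row, t_row):
                inter += (p == cls and t == cls)
                union += (p == cls or t == cls)
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 2x2 label maps with two classes (made-up values).
truth = [[0, 0], [1, 1]]
pred = [[0, 1], [1, 1]]
acc = pixel_accuracy(pred, truth)
miou = mean_iou(pred, truth, num_classes=2)
print(acc)  # prints: 0.75
```

In this toy case the mIoU (7/12, about 0.58) is lower than the pixel accuracy (0.75). The same gap appears in the reported 0.9739 versus 0.7025: pixel accuracy is dominated by large, easy regions, while mIoU weights every class equally.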
