• Title/Summary/Keyword: Short-Term Memory

A Method for Generating Malware Countermeasure Samples Based on Pixel Attention Mechanism

  • Xiangyu Ma;Yuntao Zhao;Yongxin Feng;Yutao Hu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.456-477 / 2024
  • With the rapid development of information technology, the Internet faces serious security problems. Studies have shown that malware has become a primary means of attacking the Internet. Therefore, adversarial samples have become a vital breakthrough point for studying malware. By studying adversarial samples, we can gain insights into the behavior and characteristics of malware, evaluate the performance of existing detectors in the face of deceptive samples, and help to discover vulnerabilities and improve detection methods for better performance. However, existing adversarial sample generation methods still fall short in escape effectiveness and mobility. For instance, researchers have attempted to incorporate perturbation methods such as the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), and others into adversarial samples to obfuscate detectors. However, these methods are only effective in specific environments and yield limited evasion effectiveness. To solve the above problems, this paper proposes a malware adversarial sample generation method (PixGAN) based on the pixel attention mechanism, which aims to improve the escape effect and mobility of adversarial samples. The method transforms malware into grey-scale images and introduces the pixel attention mechanism into the Deep Convolutional Generative Adversarial Network (DCGAN) model to weight the critical pixels in the grey-scale map, which improves the modeling ability of the generator and discriminator, thus enhancing the escape effect and mobility of the adversarial samples. The escape rate (ASR) is used as an evaluation index of the quality of the adversarial samples. 
The experimental results show that the adversarial samples generated by PixGAN achieve escape rates of 97%, 94%, 35%, 39%, and 43% on the Random Forest (RF), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Convolutional Neural Network and Recurrent Neural Network (CNN_RNN), and Convolutional Neural Network and Long Short Term Memory (CNN_LSTM) algorithmic detectors, respectively.
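The grey-scale conversion and the escape-rate metric mentioned in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' PixGAN pipeline: the function names, the fixed image width, and the zero-padding of the last row are all assumptions made for the example.

```python
import numpy as np

def bytes_to_grayscale(data: bytes, width: int = 16) -> np.ndarray:
    """Map each byte (0-255) to one pixel; zero-pad the final row.

    The width of 16 is an assumption, not a value from the abstract.
    """
    arr = np.frombuffer(data, dtype=np.uint8)
    height = -(-len(arr) // width)  # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[:len(arr)] = arr
    return padded.reshape(height, width)

def escape_rate(detector_flags) -> float:
    """Fraction of adversarial samples that the detector did not flag."""
    flags = np.asarray(detector_flags, dtype=bool)
    return float(np.mean(~flags))
```

For example, `escape_rate([True, False, False, False])` treats one of four samples as detected, so three quarters of them escape.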

KOSPI index prediction using topic modeling and LSTM

  • Jin-Hyeon Joo;Geun-Duk Park
    • Journal of the Korea Society of Computer and Information / v.29 no.7 / pp.73-80 / 2024
  • In this paper, we propose a method to improve the accuracy of predicting the Korea Composite Stock Price Index (KOSPI) by combining topic modeling and Long Short-Term Memory (LSTM) neural networks. We use the Latent Dirichlet Allocation (LDA) technique to extract ten major topics related to interest rate increases and decreases from financial news data. The extracted topics, along with historical KOSPI index data, are input into an LSTM model to predict the KOSPI index. The proposed model thus combines two approaches: time series prediction, by feeding the historical KOSPI index into the LSTM model, and topic modeling, by feeding in news data. To verify the performance of the proposed model, this paper designs four models (LSTM_K model, LSTM_KNS model, LDA_K model, LDA_KNS model) based on the types of input data for the LSTM and presents the predictive performance of each model. The comparison of prediction performance shows that the LSTM model (LDA_K model), which uses financial news topic data and historical KOSPI index data as inputs, recorded the lowest RMSE (Root Mean Square Error), demonstrating the best predictive performance.

Comparison of regression model and LSTM-RNN model in predicting deterioration of prestressed concrete box girder bridges

  • Gao Jing;Lin Ruiying;Zhang Yao
    • Structural Engineering and Mechanics / v.91 no.1 / pp.39-47 / 2024
  • Bridge deterioration reflects the change in bridge condition during operation, and predicting bridge deterioration is important for implementing predictive protection and planning future maintenance. However, in practical application, the raw inspection data of bridges are not continuous, which has a great impact on the accuracy of the prediction results. Therefore, two kinds of bridge deterioration models are established in this paper. The first is based on traditional regression theory, combined with distribution fitting theory to preprocess the data, which addresses the irregular distribution and incomplete quantity of the raw data. The second is based on the theory of the Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN); the network is trained using the raw inspection data and predicts the future deterioration of bridges from historical data. The inspection data of 60 prestressed concrete box girder bridges in Xiamen, China are used as an example for validation and comparative analysis, and the results show that both deterioration models can predict the deterioration of prestressed concrete box girder bridges. The regression model shows that the bridge deteriorates gradually, while the LSTM-RNN model shows that the bridge remains in good condition during the first 5 years and degrades rapidly from 5 years to 15 years. Based on the current inspection database, the LSTM-RNN model performs better than the regression model because it has a smaller prediction error. With the continuous improvement of the database, the results of this study can be extended to other bridge types, or other degradation factors can be introduced to improve the accuracy and usefulness of the deterioration model.

Network Anomaly Traffic Detection Using WGAN-CNN-BiLSTM in Big Data Cloud-Edge Collaborative Computing Environment

  • Yue Wang
    • Journal of Information Processing Systems / v.20 no.3 / pp.375-390 / 2024
  • Edge computing architecture has effectively alleviated the computing pressure on cloud platforms, reduced network bandwidth consumption, and improved the quality of service for user experience; however, it has also introduced new security issues. Existing anomaly detection methods in big data scenarios with cloud-edge computing collaboration face several challenges, such as sample imbalance, difficulty in dealing with complex network traffic attacks, and difficulty in effectively training large-scale data or overly complex deep-learning network models. A lightweight deep-learning model was proposed to address these challenges. First, normalization on the user side was used to preprocess the traffic data. On the edge side, a trained Wasserstein generative adversarial network (WGAN) was used to supplement the data samples, which effectively alleviates the imbalance issue of a few types of samples while occupying a small amount of edge-computing resources. Finally, a trained lightweight deep learning network model is deployed on the edge side, and the preprocessed and expanded local data are used to fine-tune the trained model. This ensures that the data of each edge node are more consistent with the local characteristics, effectively improving the system's detection ability. In the designed lightweight deep learning network model, two sets of convolutional pooling layers of convolutional neural networks (CNN) were used to extract spatial features. The bidirectional long short-term memory network (BiLSTM) was used to collect time sequence features, and the weight of traffic features was adjusted through the attention mechanism, improving the model's ability to identify abnormal traffic features. The proposed model was experimentally demonstrated using the NSL-KDD, UNSW-NB15, and CIC-ISD2018 datasets. 
The accuracies of the proposed model on the three datasets were as high as 0.974, 0.925, and 0.953, respectively, showing superior accuracy to other comparative models. The proposed lightweight deep learning network model has good application prospects for anomaly traffic detection in cloud-edge collaborative computing architectures.

Speech Emotion Recognition in People at High Risk of Dementia

  • Dongseon Kim;Bongwon Yi;Yugwon Won
    • Dementia and Neurocognitive Disorders / v.23 no.3 / pp.146-160 / 2024
  • Background and Purpose: The emotions of people at various stages of dementia need to be effectively utilized for prevention, early intervention, and care planning. With technology available for understanding and addressing the emotional needs of people, this study aims to develop speech emotion recognition (SER) technology to classify emotions for people at high risk of dementia. Methods: Speech samples from people at high risk of dementia were categorized into distinct emotions via human auditory assessment, the outcomes of which were annotated for a guided deep-learning method. The architecture incorporated a convolutional neural network, long short-term memory, attention layers, and Wav2Vec2, a novel feature extractor, to develop automated speech-emotion recognition. Results: Twenty-seven kinds of emotions were found in the speech of the participants. These emotions were grouped into 6 detailed emotions: happiness, interest, sadness, frustration, anger, and neutrality, and further into 3 basic emotions: positive, negative, and neutral. To improve algorithmic performance, multiple learning approaches were applied using different data sources (voice and text) and varying the number of emotions. Ultimately, a 2-stage algorithm (initial text-based classification followed by voice-based analysis) achieved the highest accuracy, reaching 70%. Conclusions: The diverse emotions identified in this study were attributed to the characteristics of the participants and the method of data collection. That the speech of people at high risk of dementia was directed to companion robots also explains the relatively low performance of the SER algorithm. Accordingly, this study suggests the systematic and comprehensive construction of a dataset from people with dementia.

Performance Analysis of Deep Learning-based Normalization According to Input-output Structure and Neural Network Model (입출력구조와 신경망 모델에 따른 딥러닝 기반 정규화 기법의 성능 분석)

  • Changsoo Ryu;Geunhwan Kim
    • Journal of Korea Society of Industrial Information Systems / v.29 no.4 / pp.13-24 / 2024
  • In this paper, we analyzed the performance of normalization according to various neural network models and input-output structures. For the analysis, we used a simulation-based dataset covering homogeneous noise environments and environments with up to three interfering signals. As a result, the end-to-end structure that directly outputs the noise variance showed superior performance when using a 1-D convolutional neural network and BiLSTM model, and was found to be particularly robust against interference signals. This is because the 1-D convolutional neural network and bidirectional long short-term memory models have a stronger inductive bias than the multilayer perceptron and transformer models. The analysis in this paper is expected to serve as a useful reference for future research on deep learning-based normalization.

Life prediction of IGBT module for nuclear power plant rod position indicating and rod control system based on SDAE-LSTM

  • Zhi Chen;Miaoxin Dai;Jie Liu;Wei Jiang;Yuan Min
    • Nuclear Engineering and Technology / v.56 no.9 / pp.3740-3749 / 2024
  • To reduce the losses caused by aging failure of the insulated gate bipolar transistor (IGBT), which is the core component of the nuclear power plant rod position indicating and rod control (RPC) system, it is necessary to conduct studies on its life prediction. The selection of IGBT failure characteristic parameters in existing research relies heavily on failure principles and expert experience. Moreover, the analysis and learning of time-domain degradation data have not been fully conducted, resulting in low prediction efficiency owing to the poor monotonicity, time correlation, and anti-interference ability of the extracted degradation features. This paper uses the adaptive feature extraction and denoising capabilities of the stacked denoising autoencoder (SDAE) network to perform adaptive feature extraction on IGBT time-domain degradation data. It then establishes a long short-term memory (LSTM) prediction model and optimizes the learning rate, number of nodes in the hidden layer, and number of hidden layers using the Gray Wolf Optimization (GWO) algorithm. Verification experiments are conducted on the IGBT accelerated aging dataset provided by the NASA PCoE Research Center, and performance evaluation indicators are selected to compare and analyze the prediction results of the SDAE-LSTM model, PSOLSTM model, and BP model. The results show that the SDAE-LSTM model can achieve more accurate and stable IGBT life prediction.

Comparative Analysis of RNN Architectures and Activation Functions with Attention Mechanisms for Mars Weather Prediction

  • Jaehyeok Jo;Yunho Sin;Bo-Young Kim;Jihoon Moon
    • Journal of the Korea Society of Computer and Information / v.29 no.10 / pp.1-9 / 2024
  • In this paper, we propose a comparative analysis to evaluate the impact of activation functions and attention mechanisms on the performance of time-series models for Mars meteorological data. Mars meteorological data are nonlinear and irregular due to low atmospheric density, rapid temperature variations, and complex terrain. We use long short-term memory (LSTM), bidirectional LSTM (BiLSTM), gated recurrent unit (GRU), and bidirectional GRU (BiGRU) architectures to evaluate the effectiveness of different activation functions and attention mechanisms. The activation functions tested include rectified linear unit (ReLU), leaky ReLU, exponential linear unit (ELU), Gaussian error linear unit (GELU), Swish, and scaled ELU (SELU), and model performance was measured using mean absolute error (MAE) and root mean square error (RMSE) metrics. Our results show that the integration of attentional mechanisms improves both MAE and RMSE, with Swish and ReLU achieving the best performance for minimum temperature prediction. Conversely, GELU and ELU were less effective for pressure prediction. These results highlight the critical role of selecting appropriate activation functions and attention mechanisms in improving model accuracy for complex time-series forecasting.
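For reference, the six activation functions and the two error metrics named in the abstract above have standard definitions, sketched here in NumPy. The default slope and alpha values, and the use of the tanh approximation for GELU, are conventional choices assumed for this example; the abstract does not state which settings the authors used.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):  # alpha is an assumed default
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # np.minimum keeps exp() from overflowing on large positive inputs
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def gelu(x):
    # tanh approximation of the Gaussian error linear unit
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def swish(x):
    # x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def selu(x, scale=1.0507009873554805, alpha=1.6732632423543772):
    return scale * np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2)))
```

RMSE penalizes large errors more heavily than MAE, which is one reason studies of this kind often report both.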

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.163-177 / 2019
  • As smartphones become widely used, human activity recognition (HAR) tasks for recognizing the personal activities of smartphone users with multimodal data have been actively studied. The research area is expanding from the recognition of the simple body movement of an individual user to the recognition of low-level behavior and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. In contrast, physical sensors, including accelerometer, magnetic field, and gyroscope sensors, are less vulnerable to privacy issues and can collect a large amount of data within a short time. This paper proposes a method for detecting accompanying status with a deep learning model that uses only multimodal physical sensor data, such as accelerometer, magnetic field, and gyroscope data. The accompanying status was defined as a redefinition of a part of the user interaction behavior, including whether the user is accompanying an acquaintance at a close distance and whether the user is actively communicating with the acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method consisting of time synchronization of multimodal data from different physical sensors, data normalization, and sequence data generation was introduced. We applied nearest interpolation to synchronize the time of the data collected from different sensors. 
Normalization was performed for each x, y, z axis value of the sensor data, and the sequence data was generated according to the sliding window method. The sequence data then became the input for the CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of 3 convolutional layers and had no pooling layer, in order to maintain the temporal information of the sequence data. Next, the LSTM recurrent networks received the feature maps, learned long-term dependencies from them, and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by a softmax classifier. The loss function of the model was the cross-entropy function, and the weights of the model were randomly initialized from a normal distribution with a mean of 0 and a standard deviation of 0.1. The model was trained using the adaptive moment estimation (ADAM) optimization algorithm, and the mini-batch size was set to 128. We applied dropout to the input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it was decayed exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data. We collected smartphone data from a total of 18 subjects. Using the data, the model classified accompanying and conversation with accuracies of 98.74% and 98.83%, respectively. Both the F1 score and accuracy of the model were higher than those of the majority vote classifier, support vector machine, and deep recurrent neural network. In future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize the time stamp differences. In addition, we will further study transfer learning methods that enable models tailored to the training data to be transferred to evaluation data that follows a different distribution. 
It is expected that a model capable of exhibiting robust recognition performance against changes in data that is not considered in the model learning stage will be obtained.
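The preprocessing steps and learning-rate schedule described in the abstract above might look roughly like the sketch below. The abstract does not say which normalization was used, nor the window length or stride, so the z-score normalization and the `window` and `stride` parameters here are assumptions; only the initial rate of 0.001 and the per-epoch factor of 0.99 come from the text.

```python
import numpy as np

def normalize_per_axis(samples):
    """Standardize each sensor axis (column) to zero mean and unit variance.

    Z-score normalization is an assumption; the abstract only says
    normalization was applied per x, y, z axis.
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)
    std = samples.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant axes
    return (samples - mean) / std

def sliding_windows(samples, window, stride):
    """Cut a (time, axes) array into overlapping (window, axes) sequences.

    Assumes at least `window` time steps are available.
    """
    samples = np.asarray(samples)
    starts = range(0, len(samples) - window + 1, stride)
    return np.stack([samples[s:s + window] for s in starts])

def learning_rate(epoch, initial=0.001, decay=0.99):
    """Exponentially decayed learning rate after `epoch` completed epochs."""
    return initial * decay ** epoch
```

With 10 time steps, a window of 4, and a stride of 2, `sliding_windows` yields four sequences starting at steps 0, 2, 4, and 6.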

Prediction of Urban Flood Extent by LSTM Model and Logistic Regression (LSTM 모형과 로지스틱 회귀를 통한 도시 침수 범위의 예측)

  • Kim, Hyun Il;Han, Kun Yeun;Lee, Jae Yeong
    • KSCE Journal of Civil and Environmental Engineering Research / v.40 no.3 / pp.273-283 / 2020
  • Because of climate change, the occurrence of localized and heavy rainfall is increasing. It is important to predict floods in urban areas that have suffered inundation in the past. For flood prediction, not only numerical analysis models but also machine learning-based models can be applied. The LSTM (Long Short-Term Memory) neural network used in this study is appropriate for sequence data, but it demands a lot of data. However, rainfall that causes flooding does not appear every year in a single urban basin, meaning it is difficult to collect enough data for deep learning. Therefore, in addition to the rainfall observed in the study area, the observed rainfall in another urban basin was applied in the predictive model. The LSTM neural network was used for predicting the total overflow, and the result of the SWMM (Storm Water Management Model) was applied as target data. The prediction of the inundation map was performed by using logistic regression; the independent variable was the total overflow and the dependent variable was the presence or absence of flooding in each grid. The dependent variable of the logistic regression was collected from the simulation results of a two-dimensional flood model. The input data of the two-dimensional flood model were the overflow at each manhole calculated by the SWMM. The prediction results of total overflow were compared across LSTM neural network parameter settings. Four predictive models were used in this study, depending on the parameters of the LSTM. The average RMSE (Root Mean Square Error) for verification and testing was 1.4279 ㎥/s and 1.0079 ㎥/s, respectively, for the four LSTM models. The minimum RMSE for verification and testing was 1.1655 ㎥/s and 0.8797 ㎥/s, respectively. It was confirmed that the total overflow can be predicted similarly to the SWMM simulation results. 
The prediction of inundation extent was performed by linking the logistic regression with the results of the LSTM neural network, and the maximum area fitness was 97.33 % when more than 0.5 m depth was considered. The methodology presented in this study would be helpful in improving urban flood response based on deep learning methodology.
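The logistic-regression step in the abstract above, which maps a predicted total overflow to the presence or absence of flooding in each grid cell, can be illustrated with the sketch below. The per-cell `intercept` and `slope` coefficients and the 0.5 probability threshold are hypothetical; in the study such coefficients would be fitted to the two-dimensional flood model results, and their values are not given in the abstract.

```python
import numpy as np

def flood_probability(total_overflow, intercept, slope):
    """Logistic model: P(flooded) = 1 / (1 + exp(-(intercept + slope * overflow))).

    intercept and slope may be arrays with one entry per grid cell
    (hypothetical fitted coefficients).
    """
    z = np.asarray(intercept, float) + np.asarray(slope, float) * np.asarray(total_overflow, float)
    return 1.0 / (1.0 + np.exp(-z))

def flooded_cells(total_overflow, intercept, slope, threshold=0.5):
    """Boolean flood map: True where the predicted probability meets the threshold."""
    return flood_probability(total_overflow, intercept, slope) >= threshold
```

Calling `flooded_cells` once per predicted overflow value produces one boolean map per time step, which is the sense in which the LSTM output is linked to the inundation extent.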