• Title/Summary/Keyword: Convolutional Long Short Term Memory Layer


Development of Deep Learning Models for Multi-class Sentiment Analysis (딥러닝 기반의 다범주 감성분석 모델 개발)

  • Syaekhoni, M. Alex; Seo, Sang Hyun; Kwon, Young S.
    • Journal of Information Technology Services / v.16 no.4 / pp.149-160 / 2017
  • Sentiment analysis is the process of determining whether a document, text, or conversation expresses a positive, negative, neutral, or other emotion. It has been applied in several real-world applications, such as chatbots. In the last five years, the practical use of chatbots has spread across many fields of industry. In chatbot applications, sentiment analysis must be performed in advance to recognize the user's emotion and understand the speaker's intent, and a specific emotion conveys more than a simple positive or negative label. In light of this context, we propose deep learning models for multi-class sentiment analysis that identify the speaker's emotion, categorized as joy, fear, guilt, sadness, shame, disgust, and anger. We develop convolutional neural network (CNN), long short-term memory (LSTM), and multi-layer neural network models for detecting emotion in a sentence, and a word embedding process is also applied. In our experiments, the LSTM model performs best compared with the convolutional neural network and multi-layer neural network models. We also show the practical applicability of the deep learning models to sentiment analysis for chatbots.
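A minimal sketch, in Python with tf.keras (an assumed framework, not stated in the paper), of the LSTM branch this abstract describes: a word embedding, a single LSTM layer, and a softmax over the seven emotion categories. The vocabulary size, sequence length, and layer widths are illustrative assumptions.

```python
import tensorflow as tf

NUM_CLASSES = 7      # joy, fear, guilt, sadness, shame, disgust, anger
VOCAB_SIZE = 20000   # assumed vocabulary size
MAX_LEN = 100        # assumed maximum sentence length in tokens

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAX_LEN,)),                   # integer-encoded tokens
    tf.keras.layers.Embedding(VOCAB_SIZE, 128),                # word embedding
    tf.keras.layers.LSTM(128),                                 # sentence encoding
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # emotion probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Swapping the LSTM layer for a Conv1D plus global pooling, or for stacked Dense layers, would give rough analogues of the CNN and multi-layer baselines the paper compares against.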

Malicious URL Recognition and Detection Using Attention-based CNN-LSTM

  • Peng, Yongfang; Tian, Shengwei; Yu, Long; Lv, Yalong; Wang, Ruijin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.11 / pp.5580-5593 / 2019
  • A malicious Uniform Resource Locator (URL) recognition and detection method based on the combination of an Attention mechanism with a Convolutional Neural Network and a Long Short-Term Memory network (Attention-Based CNN-LSTM) is proposed. Firstly, the WHOIS check method is used to extract and filter features, including URL texture information, statistical information on URL string attributes, and WHOIS information; the features are then encoded, pre-processed, and input to the convolution layer of the constructed Convolutional Neural Network (CNN) to extract local features. Secondly, weighted according to the Attention mechanism, the generated local features are input into the Long Short-Term Memory (LSTM) model and subsequently pooled to calculate the global features of the URLs. Finally, the URLs are detected and classified by the SoftMax function using the global features. The results demonstrate that, compared with existing methods, the Attention-based CNN-LSTM mechanism achieves higher accuracy for malicious URL detection.
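A minimal sketch of an attention-based CNN-LSTM URL classifier in the spirit of this abstract, assuming tf.keras and a character-level encoding of the URL. The WHOIS-derived features are omitted, all sizes are illustrative assumptions, and the attention weighting is applied after the LSTM here rather than in the paper's exact ordering.

```python
import tensorflow as tf
from tensorflow.keras import layers

MAX_URL_LEN = 200    # assumed maximum URL length in characters
CHAR_VOCAB = 128     # assumed character vocabulary (printable ASCII)

inputs = layers.Input(shape=(MAX_URL_LEN,))
x = layers.Embedding(CHAR_VOCAB, 32)(inputs)                  # character embedding
x = layers.Conv1D(64, kernel_size=5, padding="same",
                  activation="relu")(x)                       # local URL features
h = layers.LSTM(64, return_sequences=True)(x)                 # sequential dependencies
# Simple additive attention: score every time step, normalise over time,
# then take the attention-weighted sum of the LSTM outputs.
scores = layers.Dense(1, activation="tanh")(h)                # (batch, T, 1)
weights = layers.Softmax(axis=1)(scores)                      # attention weights over time
context = layers.Flatten()(layers.Dot(axes=1)([weights, h]))  # weighted sum -> (batch, 64)
outputs = layers.Dense(2, activation="softmax")(context)      # benign vs. malicious
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```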

Earthquake events classification using convolutional recurrent neural network (합성곱 순환 신경망 구조를 이용한 지진 이벤트 분류 기법)

  • Ku, Bonhwa; Kim, Gwantae; Jang, Su; Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.39 no.6 / pp.592-599 / 2020
  • This paper proposes a Convolutional Recurrent Neural Network (CRNN) structure that can simultaneously reflect both the static and dynamic characteristics of seismic waveforms for classifying various earthquake events. Addressing various earthquake events, including not only micro-earthquakes and artificial earthquakes but also macro-earthquakes, requires both effective feature extraction and a classifier that can discriminate seismic waveforms in noisy environments. First, we extract the static characteristics of the seismic waveform through an attention-based convolution layer. The extracted feature map is then fed sequentially into a multi-input single-output Long Short-Term Memory (LSTM) network structure to extract the dynamic characteristics for classifying the various seismic events. Finally, we perform earthquake event classification through two fully connected layers and a softmax function. Representative experimental results using domestic and foreign earthquake databases show that the proposed model provides an effective structure for classifying various earthquake events.
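A minimal sketch of the convolutional-recurrent pipeline described here, assuming tf.keras: convolution over the raw waveform, an LSTM over the resulting feature map, two fully connected layers, and a softmax. The attention mechanism is omitted, and the window length, channel count, and class count are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

WAVEFORM_LEN = 3000   # assumed samples per waveform window
N_CHANNELS = 3        # assumed 3-component seismogram
N_CLASSES = 3         # e.g., micro-, macro-, artificial-earthquake (assumed)

inputs = layers.Input(shape=(WAVEFORM_LEN, N_CHANNELS))
x = layers.Conv1D(32, kernel_size=7, strides=2, activation="relu")(inputs)  # static features
x = layers.Conv1D(64, kernel_size=5, strides=2, activation="relu")(x)
x = layers.LSTM(64)(x)                                                      # dynamic features
x = layers.Dense(64, activation="relu")(x)                                  # fully connected 1
x = layers.Dense(32, activation="relu")(x)                                  # fully connected 2
outputs = layers.Dense(N_CLASSES, activation="softmax")(x)                  # event classes
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```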

Change Detection for High-resolution Satellite Images Using Transfer Learning and Deep Learning Network (전이학습과 딥러닝 네트워크를 활용한 고해상도 위성영상의 변화탐지)

  • Song, Ah Ram; Choi, Jae Wan; Kim, Yong Il
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.3 / pp.199-208 / 2019
  • As the number of available satellites increases and technology advances, image information outputs are becoming increasingly diverse and a large amount of data is accumulating. In this study, we propose a change detection method for high-resolution satellite images that uses transfer learning and a deep learning network to overcome the limitation caused by insufficient training data through the use of pre-trained information. The deep learning network used in this study comprises convolutional layers to extract spatial and spectral information and convolutional long short-term memory layers to analyze the time-series information. To use the learned information, the two initial convolutional layers of the change detection network are designed to use values learned from 40,000 patches of the ISPRS (International Society for Photogrammetry and Remote Sensing) dataset as initial values. In addition, 2D (2-dimensional) and 3D (3-dimensional) kernels were used to find the optimal structure for the high-resolution satellite images. The experimental results for KOMPSAT-3A (KOrean Multi-Purpose SATellite-3A) images show that the proposed method can effectively extract changed/unchanged pixels while remaining less sensitive to spurious changes caused by shadow and relief displacement. In addition, the change detection accuracy at two sites was improved by using 3D kernels, because a 3D kernel can consider not only spatial but also spectral information. This study indicates that changes in high-resolution satellite images can be detected effectively using the constructed image information and deep learning network. In future work, a pre-trained change detection network will be applied to newly obtained images to extend the scope of application.
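Because this entry is the one that actually contains a convolutional long short-term memory layer, a minimal sketch of such a network is given below, assuming tf.keras: per-date convolutions for spatial/spectral features, a ConvLSTM2D layer over the bi-temporal sequence, and a per-pixel change map. The patch size, band count, and filter counts are illustrative assumptions, and the transfer-learning initialisation from the ISPRS patches is not reproduced.

```python
import tensorflow as tf
from tensorflow.keras import layers

T, H, W, BANDS = 2, 64, 64, 4   # two dates, 64x64 patches, 4 spectral bands (assumed)

inputs = layers.Input(shape=(T, H, W, BANDS))
# Per-date convolutions; in the paper these first layers receive pre-trained initial weights.
x = layers.TimeDistributed(layers.Conv2D(32, 3, padding="same", activation="relu"))(inputs)
x = layers.TimeDistributed(layers.Conv2D(64, 3, padding="same", activation="relu"))(x)
# Convolutional LSTM over the temporal dimension of the bi-temporal feature maps.
x = layers.ConvLSTM2D(32, kernel_size=3, padding="same", return_sequences=False)(x)
# Per-pixel changed / unchanged probability.
outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
```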

A Study on People Counting in Public Metro Service using Hybrid CNN-LSTM Algorithm (Hybrid CNN-LSTM 알고리즘을 활용한 도시철도 내 피플 카운팅 연구)

  • Choi, Ji-Hye; Kim, Min-Seung; Lee, Chan-Ho; Choi, Jung-Hwan; Lee, Jeong-Hee; Sung, Tae-Eung
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.131-145 / 2020
  • In line with the trend of industrial innovation, IoT technology applied in a variety of fields is emerging as a key element in creating new business models and providing user-friendly services through its combination with big data. The data accumulated from Internet-of-Things (IoT) devices are being used in many ways to build convenience-based smart systems, as they enable customized intelligent services through the analysis of user environments and patterns. Recently, such technology has been applied to innovation in the public domain, for example in smart cities and smart transportation, where CCTV is used to address traffic and crime problems. In particular, when planning underground services or establishing a passenger-flow control information system to enhance the convenience of citizens and commuters in congested public transportation such as subways and urban railways, both the ease of securing real-time service data and the stability of security must be considered comprehensively. However, previous studies that use image data suffer from privacy issues and from degraded object-detection performance under abnormal conditions. The IoT sensor data used in this study are free from privacy issues because they do not require the identification of individuals, and they can therefore be used effectively to build intelligent public services for unspecified people. We use IoT-based infrared sensor devices for an intelligent pedestrian tracking system in a metro service that many people use daily, with the temperature data measured by the sensors transmitted in real time. The experimental environment for collecting the data was established at the equally spaced midpoints of a 4×4 grid on the ceiling of subway entrances where the actual passenger movement is high, and the temperature change was measured for objects entering and leaving the detection spots. The measured data went through a preprocessing step in which reference values for the 16 areas are set and the differences between the temperatures in the 16 areas and their reference values are calculated per unit of time; this corresponds to a methodology that maximizes the detected movement within each area. In addition, the data were scaled by a factor of 10 to reflect temperature differences between areas more sensitively; for example, a temperature reading of 28.5℃ was converted to 285 for the analysis. The data collected from the sensors thus have the characteristics of both time-series data and image data with a 4×4 resolution. Reflecting these characteristics, we propose a hybrid algorithm, CNN-LSTM (Convolutional Neural Network-Long Short-Term Memory), that combines a CNN, which performs well for image classification, with an LSTM, which is especially suitable for analyzing time-series data. In this study, the CNN-LSTM algorithm is used to predict the number of passing persons in one of the 4×4 detection areas.
We validated the proposed model through a performance comparison with other artificial intelligence algorithms: the Multi-Layer Perceptron (MLP), Long Short-Term Memory (LSTM), and RNN-LSTM (Recurrent Neural Network-Long Short-Term Memory). The experiments show that the proposed CNN-LSTM hybrid model has the best predictive performance among MLP, LSTM, and RNN-LSTM. By utilizing the proposed devices and models, various metro services, such as real-time monitoring of public transport facilities and congestion-based emergency response, are expected to be provided without concerns over personal information. However, the data were collected from only one side of the entrances, and data collected over a short period were used for the prediction, so verification in other environments remains a limitation. In the future, the proposed model is expected to become more reliable if experimental data are collected in a sufficient variety of environments or if the training data are supplemented with measurements from additional sensors.
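A minimal sketch, assuming tf.keras, of the preprocessing and hybrid CNN-LSTM this abstract describes: per-cell differences from reference temperatures scaled by 10, sequences of 4×4 frames, a small per-frame CNN, an LSTM over time, and a regression head for the passing-person count. The sequence length, filter sizes, ordering of the scaling and differencing steps, and reference values are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN = 30   # assumed number of time steps per sample

def preprocess(frames, reference):
    """frames: (T, 4, 4) raw temperatures; reference: (4, 4) per-cell baselines.
    Difference from the reference, amplified by 10 (ordering of the steps assumed;
    the paper's example scales 28.5 C to 285 before the analysis)."""
    return (frames - reference) * 10.0

inputs = layers.Input(shape=(SEQ_LEN, 4, 4, 1))
x = layers.TimeDistributed(layers.Conv2D(16, 2, activation="relu"))(inputs)  # per-frame CNN
x = layers.TimeDistributed(layers.Flatten())(x)
x = layers.LSTM(32)(x)                                                       # temporal pattern
outputs = layers.Dense(1, activation="relu")(x)                              # count estimate
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
```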

A Network Intrusion Security Detection Method Using BiLSTM-CNN in Big Data Environment

  • Hong Wang
    • Journal of Information Processing Systems / v.19 no.5 / pp.688-701 / 2023
  • Conventional network intrusion detection system (NIDS) methods cannot effectively measure the trend of intrusion-detection targets, which leads to low detection accuracy. In this study, a NIDS method based on a deep neural network in a big-data environment is proposed. First, the overall framework of the NIDS model is constructed in two stages, with feature reduction and anomaly-probability output at the core of the two stages. Subsequently, a convolutional neural network comprising a down-sampling layer and a feature extractor built from convolution layers is constructed, and the correlation of the inputs is captured by introducing a bidirectional long short-term memory network. Finally, a pooling layer is added after the convolution layer to sample the required features according to different sampling rules, which improves the overall performance of the NIDS model. The proposed NIDS method is compared with three other methods on two databases through simulation experiments. The results demonstrate that the proposed model is superior to the other three NIDS methods on both databases in terms of precision, accuracy, F1-score, and recall, which reach 91.64%, 93.35%, 92.25%, and 91.87%, respectively. The proposed algorithm is significant for improving the accuracy of NIDS.
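A minimal sketch of a BiLSTM-CNN intrusion-detection model in the spirit of this abstract, assuming tf.keras: a 1D convolution with down-sampling as the feature extractor, a bidirectional LSTM for input correlations, pooling, and an anomaly-probability output. The per-record feature count and layer widths are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

N_FEATURES = 41   # assumed per-record feature count (e.g., an NSL-KDD-style feature vector)

inputs = layers.Input(shape=(N_FEATURES, 1))
x = layers.Conv1D(64, kernel_size=3, padding="same", activation="relu")(inputs)  # feature extractor
x = layers.MaxPooling1D(pool_size=2)(x)                                          # down-sampling
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)              # input correlations
x = layers.GlobalMaxPooling1D()(x)                                               # pooling after the BiLSTM
outputs = layers.Dense(1, activation="sigmoid")(x)                               # anomaly probability
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
```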

Using machine learning to forecast and assess the uncertainty in the response of a typical PWR undergoing a steam generator tube rupture accident

  • Tran Canh Hai Nguyen; Aya Diab
    • Nuclear Engineering and Technology / v.55 no.9 / pp.3423-3440 / 2023
  • In this work, a multivariate time-series machine learning meta-model is developed to predict the transient response of a typical nuclear power plant (NPP) undergoing a steam generator tube rupture (SGTR). The model employs Recurrent Neural Networks (RNNs), including the Long Short-Term Memory (LSTM), the Gated Recurrent Unit (GRU), and a hybrid CNN-LSTM model. To address the uncertainty inherent in such predictions, a Bayesian Neural Network (BNN) was implemented. The models were trained using a database generated by the Best Estimate Plus Uncertainty (BEPU) methodology, coupling the thermal-hydraulics code RELAP5/SCDAP/MOD3.4 to the statistical tool DAKOTA to predict the variation in system response under various operational and phenomenological uncertainties. The RNN models successfully capture the underlying characteristics of the data with reasonable accuracy, and the BNN-LSTM approach offers an additional layer of insight into the level of uncertainty associated with the predictions. The results demonstrate that the LSTM outperforms the GRU, while the hybrid CNN-LSTM model is computationally the most efficient. This study aims to develop a better understanding of the capabilities and limitations of machine learning models in the context of nuclear safety. By expanding the application of ML models to more severe accident scenarios, where operators are under extreme stress and prone to errors, ML models can provide valuable support and act as expert systems to assist in decision-making while minimizing the chance of human error.
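A minimal sketch, assuming tf.keras, of a multivariate time-series meta-model of this kind: an LSTM maps a window of plant parameters to the system response, and Monte Carlo dropout is used here as a simple stand-in for the paper's Bayesian Neural Network when estimating predictive uncertainty. The window length, parameter count, output count, and dropout rate are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

WINDOW, N_PARAMS, N_OUTPUTS = 50, 12, 4   # assumed shapes of the transient data

inputs = layers.Input(shape=(WINDOW, N_PARAMS))
x = layers.LSTM(64)(inputs)
x = layers.Dropout(0.2)(x)                # kept active at inference time for MC sampling
outputs = layers.Dense(N_OUTPUTS)(x)      # predicted system response
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")

def predict_with_uncertainty(model, x, n_samples=100):
    """Mean and standard deviation over stochastic forward passes (training=True keeps dropout on)."""
    samples = np.stack([model(x, training=True).numpy() for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)
```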

Analysis of streamflow prediction performance by various deep learning schemes

  • Le, Xuan-Hien; Lee, Giha
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.131-131 / 2021
  • Deep learning models, especially those based on long short-term memory (LSTM), have recently demonstrated their superiority in addressing time-series data problems. This study comprehensively evaluates the performance of supervised deep learning models for streamflow prediction. Six deep learning models were of interest: standard LSTM, standard gated recurrent unit (GRU), stacked LSTM, bidirectional LSTM (BiLSTM), feed-forward neural network (FFNN), and convolutional neural network (CNN). The Red River system, one of the largest river basins in Vietnam, was adopted as a case study. The deep learning models were designed to forecast the flowrate one and two days ahead at the Son Tay hydrological station on the Red River, using observed flowrate series from seven hydrological stations on the three major branches of the Red River system (the Thao, Da, and Lo Rivers) as input data for training, validation, and testing. The comparison indicates that the four LSTM-based models exhibit significantly better performance and stability than the FFNN and CNN models. Moreover, the LSTM-based models can produce impressive predictions even in the presence of upstream reservoirs and dams. For the stacked LSTM and BiLSTM models, the added complexity is not accompanied by a performance improvement, because their performance is not higher than that of the two standard models (LSTM and GRU). We therefore conclude that, for hydrological forecasting problems of this kind, simple architectures such as LSTM and GRU with one hidden layer are sufficient to produce highly reliable forecasts while minimizing computation time, owing to the sequential nature of the data.
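A minimal sketch, assuming tf.keras, of the simple architecture the study found sufficient: a single-hidden-layer LSTM (or GRU) mapping a window of flowrates from the seven upstream stations to the flowrate one and two days ahead at Son Tay. The look-back window and layer width are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

LOOKBACK, N_STATIONS, LEAD_TIMES = 14, 7, 2   # assumed 14-day window; 1- and 2-day leads

model = tf.keras.Sequential([
    layers.Input(shape=(LOOKBACK, N_STATIONS)),   # flowrates at seven upstream stations
    layers.LSTM(64),                              # swap for layers.GRU(64) to get the GRU variant
    layers.Dense(LEAD_TIMES),                     # forecasts for the two lead times
])
model.compile(optimizer="adam", loss="mse")
```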

Improved Convolutional Neural Network Based Cooperative Spectrum Sensing For Cognitive Radio

  • Uppala, Appala Raju; Narasimhulu C, Venkata; Prasad K, Satya
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.2128-2147 / 2021
  • Cognitive radio systems have recently been implemented to tackle spectrum under-utilization and to support efficient data traffic. Spectrum sensing is the crucial step in cognitive applications: the cognitive user detects the presence of a primary user (PU) in a particular channel and switches to another channel for continuous transmission. In cognitive radio systems, the ability to identify the primary user's signal precisely is essential for the secondary user to make use of the idle licensed spectrum. Based on this capability, a new spectrum sensing technique is proposed in this paper to identify all types of primary user signals in a cognitive radio environment. Hence, a spectrum sensing algorithm using an improved convolutional neural network and long short-term memory (CNN-LSTM) is presented. Our approach uses simulated annealing to find a reasonable number of neurons for each layer of a fully connected deep neural network, thereby tackling the optimization problem. The probability of detection is taken as the measure of the efficiency of the proposed algorithm. Experiments are carried out under different signal-to-noise ratios and indicate the better performance of the proposed algorithm. Because the PU signal has an associated modulation format, identifying the presence of that modulation format itself establishes the presence of the PU signal.
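A minimal sketch of the simulated-annealing idea mentioned in this abstract: randomly perturb the number of neurons in each fully connected layer, accept worse candidates with a temperature-dependent probability, and keep the best configuration found. The toy objective, search ranges, and cooling schedule are illustrative assumptions; in the paper, the score would be the probability of detection of the CNN-LSTM sensing model.

```python
import math
import random

def evaluate(layer_sizes):
    """Placeholder score (higher is better); replace with the validation accuracy or
    probability of detection of the CNN-LSTM built with these layer sizes."""
    return -sum((n - 96) ** 2 for n in layer_sizes)   # toy objective (assumed)

def simulated_annealing(n_layers=3, t0=1.0, cooling=0.95, steps=200):
    current = [random.randint(16, 256) for _ in range(n_layers)]
    score = evaluate(current)
    best, best_score = list(current), score
    t = t0
    for _ in range(steps):
        candidate = list(current)
        i = random.randrange(n_layers)
        candidate[i] = min(256, max(16, candidate[i] + random.randint(-32, 32)))
        cand_score = evaluate(candidate)
        # Always accept improvements; accept worse candidates with probability exp(delta / t).
        if cand_score > score or random.random() < math.exp((cand_score - score) / t):
            current, score = candidate, cand_score
            if score > best_score:
                best, best_score = list(current), score
        t *= cooling
    return best, best_score

print(simulated_annealing())
```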

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. They have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models. In recent years, these supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because the supervised models have shown successful applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation, an abbreviation of "backward propagation of errors," is a common method of training artificial neural networks used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network; the gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. Using this architecture makes convolutional networks fast to train, which in turn helps us train deep, multi-layer networks that are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks rely on three basic ideas: local receptive fields, shared weights, and pooling. By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each of the local receptive fields, so all the neurons in a hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers, which are usually used immediately after convolutional layers; their role is to simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep learning networks took weeks several years ago, but thanks to progress in GPUs and algorithmic enhancements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network that allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks, because of the unstable gradient problem, i.e., vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through the layers.
This makes learning in early layers extremely slow. The problem actually gets worse in RNNs, since gradients aren't just propagated backward through layers, they're propagated backward through time. If the network runs for a long time, that can make the gradient extremely unstable and hard to learn from. It has been possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
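A minimal sketch, assuming tf.keras and a 28×28 grayscale classification task, illustrating the three convolutional-network ideas described in this abstract: each 5×5 kernel is a local receptive field, its weights are shared across the whole image, and a pooling layer condenses the convolutional output before the final classifier.

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    # Each 5x5 kernel is a local receptive field whose weights are shared
    # across the image, producing 32 feature maps.
    layers.Conv2D(32, kernel_size=5, activation="relu"),
    # Pooling simplifies the convolutional output by keeping local maxima.
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

For sequential inputs, the same framework's layers.LSTM provides the long short-term memory units discussed at the end of the abstract, which mitigate the unstable-gradient problem of plain RNNs.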