• Title/Summary/Keyword: Convolutional recurrent neural network

Search Result 90, Processing Time 0.021 seconds

A Study on Person Re-Identification System using Enhanced RNN (확장된 RNN을 활용한 사람재인식 시스템에 관한 연구)

  • Choi, Seok-Gyu;Xu, Wenjie
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.17 no.2
    • /
    • pp.15-23
    • /
    • 2017
  • The person Re-identification is the most challenging part of computer vision due to the significant changes in human pose and background clutter with occlusions. The picture from non-overlapping cameras enhance the difficulty to distinguish some person from the other. To reach a better performance match, most methods use feature selection and distance metrics separately to get discriminative representations and proper distance to describe the similarity between person and kind of ignoring some significant features. This situation has encouraged us to consider a novel method to deal with this problem. In this paper, we proposed an enhanced recurrent neural network with three-tier hierarchical network for person re-identification. Specifically, the proposed recurrent neural network (RNN) model contain an iterative expectation maximum (EM) algorithm and three-tier Hierarchical network to jointly learn both the discriminative features and metrics distance. The iterative EM algorithm can fully use of the feature extraction ability of convolutional neural network (CNN) which is in series before the RNN. By unsupervised learning, the EM framework can change the labels of the patches and train larger datasets. Through the three-tier hierarchical network, the convolutional neural network, recurrent network and pooling layer can jointly be a feature extractor to better train the network. The experimental result shows that comparing with other researchers' approaches in this field, this method also can get a competitive accuracy. The influence of different component of this method will be analyzed and evaluated in the future research.

Using machine learning to forecast and assess the uncertainty in the response of a typical PWR undergoing a steam generator tube rupture accident

  • Tran Canh Hai Nguyen ;Aya Diab
    • Nuclear Engineering and Technology
    • /
    • v.55 no.9
    • /
    • pp.3423-3440
    • /
    • 2023
  • In this work, a multivariate time-series machine learning meta-model is developed to predict the transient response of a typical nuclear power plant (NPP) undergoing a steam generator tube rupture (SGTR). The model employs Recurrent Neural Networks (RNNs), including the Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and a hybrid CNN-LSTM model. To address the uncertainty inherent in such predictions, a Bayesian Neural Network (BNN) was implemented. The models were trained using a database generated by the Best Estimate Plus Uncertainty (BEPU) methodology; coupling the thermal hydraulics code, RELAP5/SCDAP/MOD3.4 to the statistical tool, DAKOTA, to predict the variation in system response under various operational and phenomenological uncertainties. The RNN models successfully captures the underlying characteristics of the data with reasonable accuracy, and the BNN-LSTM approach offers an additional layer of insight into the level of uncertainty associated with the predictions. The results demonstrate that LSTM outperforms GRU, while the hybrid CNN-LSTM model is computationally the most efficient. This study aims to gain a better understanding of the capabilities and limitations of machine learning models in the context of nuclear safety. By expanding the application of ML models to more severe accident scenarios, where operators are under extreme stress and prone to errors, ML models can provide valuable support and act as expert systems to assist in decision-making while minimizing the chances of human error.

DeepAct: A Deep Neural Network Model for Activity Detection in Untrimmed Videos

  • Song, Yeongtaek;Kim, Incheol
    • Journal of Information Processing Systems
    • /
    • v.14 no.1
    • /
    • pp.150-161
    • /
    • 2018
  • We propose a novel deep neural network model for detecting human activities in untrimmed videos. The process of human activity detection in a video involves two steps: a step to extract features that are effective in recognizing human activities in a long untrimmed video, followed by a step to detect human activities from those extracted features. To extract the rich features from video segments that could express unique patterns for each activity, we employ two different convolutional neural network models, C3D and I-ResNet. For detecting human activities from the sequence of extracted feature vectors, we use BLSTM, a bi-directional recurrent neural network model. By conducting experiments with ActivityNet 200, a large-scale benchmark dataset, we show the high performance of the proposed DeepAct model.

Text Classification Using Parallel Word-level and Character-level Embeddings in Convolutional Neural Networks

  • Geonu Kim;Jungyeon Jang;Juwon Lee;Kitae Kim;Woonyoung Yeo;Jong Woo Kim
    • Asia pacific journal of information systems
    • /
    • v.29 no.4
    • /
    • pp.771-788
    • /
    • 2019
  • Deep learning techniques such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) show superior performance in text classification than traditional approaches such as Support Vector Machines (SVMs) and Naïve Bayesian approaches. When using CNNs for text classification tasks, word embedding or character embedding is a step to transform words or characters to fixed size vectors before feeding them into convolutional layers. In this paper, we propose a parallel word-level and character-level embedding approach in CNNs for text classification. The proposed approach can capture word-level and character-level patterns concurrently in CNNs. To show the usefulness of proposed approach, we perform experiments with two English and three Korean text datasets. The experimental results show that character-level embedding works better in Korean and word-level embedding performs well in English. Also the experimental results reveal that the proposed approach provides better performance than traditional CNNs with word-level embedding or character-level embedding in both Korean and English documents. From more detail investigation, we find that the proposed approach tends to perform better when there is relatively small amount of data comparing to the traditional embedding approaches.

Deep Learning-based Prediction of PM10 Fluctuation from Gwanak-gu Urban Area, Seoul, Korea (서울 관악구 도심지역 미세먼지(PM10) 관측 값을 활용한 딥러닝 기반의 농도변동 예측)

  • Choi, Han-Soo;Kang, Myungjoo;Kim, Yong Cheol;Choi, Hanna
    • Journal of Soil and Groundwater Environment
    • /
    • v.25 no.3
    • /
    • pp.74-83
    • /
    • 2020
  • Since fine dust (PM10) has a significant influence on soil and groundwater composition during dry and wet deposition processes, it is of a vital importance to understand the fate and transport of aerosol in geological environments. Fine dust is formed after the chemical reaction of several precursors, typically observed in short intervals within a few hours. In this study, deep learning approach was applied to predict the fate of fine dust in an urban area. Deep learning training was performed by combining convolutional neural network (CNN) and recurrent neural network (RNN) techniques. The PM10 concentration after 1 hour was predicted based on three-hour data by setting SO2, CO, O3, NO2, and PM10 as training data. The obtained coefficient of determination value, R2, was 0.8973 between predicted and measured values for the entire concentration range of PM10, suggesting deep learning method can be developed into a reliable and viable tool for prediction of fine dust concentration.

PowerShell-based Malware Detection Method Using Command Execution Monitoring and Deep Learning (명령 실행 모니터링과 딥 러닝을 이용한 파워셸 기반 악성코드 탐지 방법)

  • Lee, Seung-Hyeon;Moon, Jong-Sub
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.28 no.5
    • /
    • pp.1197-1207
    • /
    • 2018
  • PowerShell is command line shell and scripting language, built on the .NET framework, and it has several advantages as an attack tool, including built-in support for Windows, easy code concealment and persistence, and various pen-test frameworks. Accordingly, malwares using PowerShell are increasing rapidly, however, there is a limit to cope with the conventional malware detection technique. In this paper, we propose an improved monitoring method to observe commands executed in the PowerShell and a deep learning based malware classification model that extract features from commands using Convolutional Neural Network(CNN) and send them to Recurrent Neural Network(RNN) according to the order of execution. As a result of testing the proposed model with 5-fold cross validation using 1,916 PowerShell-based malwares collected at malware sharing site and 38,148 benign scripts disclosed by an obfuscation detection study, it shows that the model effectively detects malwares with about 97% True Positive Rate(TPR) and 1% False Positive Rate(FPR).

Detection of NoSQL Injection Attack in Non-Relational Database Using Convolutional Neural Network and Recurrent Neural Network (비관계형 데이터베이스 환경에서 CNN과 RNN을 활용한 NoSQL 삽입 공격 탐지 모델)

  • Seo, Jeong-eun;Moon, Jong-sub
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.30 no.3
    • /
    • pp.455-464
    • /
    • 2020
  • With a variety of data types and high utilization of data, non-relational databases are a popular data storage because it supports better availability and scalability. The increasing use of this technology also brings the risk of NoSQL injection attacks. Existing works mostly discuss the rule-based detection of NoSQL injection attacks that it is hard to deal with NoSQL queries beyond the coverage of the rules. In this paper, we propose a model for detecting NoSQL injection attacks. Our model is based on deep learning algorithms that select features from NoSQL queries using CNN, and classify NoSQL queries using RNN. Also, we experiment the proposed model to compare with existing models, and find that our model outperforms traditional models in terms of detection rate.

A study on training DenseNet-Recurrent Neural Network for sound event detection (음향 이벤트 검출을 위한 DenseNet-Recurrent Neural Network 학습 방법에 관한 연구)

  • Hyeonjin Cha;Sangwook Park
    • The Journal of the Acoustical Society of Korea
    • /
    • v.42 no.5
    • /
    • pp.395-401
    • /
    • 2023
  • Sound Event Detection (SED) aims to identify not only sound category but also time interval for target sounds in an audio waveform. It is a critical technique in field of acoustic surveillance system and monitoring system. Recently, various models have introduced through Detection and Classification of Acoustic Scenes and Events (DCASE) Task 4. This paper explored how to design optimal parameters of DenseNet based model, which has led to outstanding performance in other recognition system. In experiment, DenseRNN as an SED model consists of DensNet-BC and bi-directional Gated Recurrent Units (GRU). This model is trained with Mean teacher model. With an event-based f-score, evaluation is performed depending on parameters, related to model architecture as well as model training, under the assessment protocol of DCASE task4. Experimental result shows that the performance goes up and has been saturated to near the best. Also, DenseRNN would be trained more effectively without dropout technique.

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.1-25
    • /
    • 2020
  • In this paper, we suggest an application system architecture which provides accurate, fast and efficient automatic gasometer reading function. The system captures gasometer image using mobile device camera, transmits the image to a cloud server on top of private LTE network, and analyzes the image to extract character information of device ID and gas usage amount by selective optical character recognition based on deep learning technology. In general, there are many types of character in an image and optical character recognition technology extracts all character information in an image. But some applications need to ignore non-of-interest types of character and only have to focus on some specific types of characters. For an example of the application, automatic gasometer reading system only need to extract device ID and gas usage amount character information from gasometer images to send bill to users. Non-of-interest character strings, such as device type, manufacturer, manufacturing date, specification and etc., are not valuable information to the application. Thus, the application have to analyze point of interest region and specific types of characters to extract valuable information only. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition which only analyze point of interest region for selective character information extraction. We build up 3 neural networks for the application system. The first is a convolutional neural network which detects point of interest region of gas usage amount and device ID information character strings, the second is another convolutional neural network which transforms spatial information of point of interest region to spatial sequential feature vectors, and the third is bi-directional long short term memory network which converts spatial sequential information to character strings using time-series analysis mapping from feature vectors to character strings. In this research, point of interest character strings are device ID and gas usage amount. Device ID consists of 12 arabic character strings and gas usage amount consists of 4 ~ 5 arabic character strings. All system components are implemented in Amazon Web Service Cloud with Intel Zeon E5-2686 v4 CPU and NVidia TESLA V100 GPU. The system architecture adopts master-lave processing structure for efficient and fast parallel processing coping with about 700,000 requests per day. Mobile device captures gasometer image and transmits to master process in AWS cloud. Master process runs on Intel Zeon CPU and pushes reading request from mobile device to an input queue with FIFO (First In First Out) structure. Slave process consists of 3 types of deep neural networks which conduct character recognition process and runs on NVidia GPU module. Slave process is always polling the input queue to get recognition request. If there are some requests from master process in the input queue, slave process converts the image in the input queue to device ID character string, gas usage amount character string and position information of the strings, returns the information to output queue, and switch to idle mode to poll the input queue. Master process gets final information form the output queue and delivers the information to the mobile device. We used total 27,120 gasometer images for training, validation and testing of 3 types of deep neural network. 22,985 images were used for training and validation, 4,135 images were used for testing. We randomly splitted 22,985 images with 8:2 ratio for training and validation respectively for each training epoch. 4,135 test image were categorized into 5 types (Normal, noise, reflex, scale and slant). Normal data is clean image data, noise means image with noise signal, relfex means image with light reflection in gasometer region, scale means images with small object size due to long-distance capturing and slant means images which is not horizontally flat. Final character string recognition accuracies for device ID and gas usage amount of normal data are 0.960 and 0.864 respectively.

LSTM Network with Tracking Association for Multi-Object Tracking

  • Farhodov, Xurshedjon;Moon, Kwang-Seok;Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.10
    • /
    • pp.1236-1249
    • /
    • 2020
  • In a most recent object tracking research work, applying Convolutional Neural Network and Recurrent Neural Network-based strategies become relevant for resolving the noticeable challenges in it, like, occlusion, motion, object, and camera viewpoint variations, changing several targets, lighting variations. In this paper, the LSTM Network-based Tracking association method has proposed where the technique capable of real-time multi-object tracking by creating one of the useful LSTM networks that associated with tracking, which supports the long term tracking along with solving challenges. The LSTM network is a different neural network defined in Keras as a sequence of layers, where the Sequential classes would be a container for these layers. This purposing network structure builds with the integration of tracking association on Keras neural-network library. The tracking process has been associated with the LSTM Network feature learning output and obtained outstanding real-time detection and tracking performance. In this work, the main focus was learning trackable objects locations, appearance, and motion details, then predicting the feature location of objects on boxes according to their initial position. The performance of the joint object tracking system has shown that the LSTM network is more powerful and capable of working on a real-time multi-object tracking process.