• Title/Summary/Keyword: Long Short Term Memory

Search Result 639, Processing Time 0.023 seconds

Korean Abbreviation Generation using Sequence to Sequence Learning (Sequence-to-sequence 학습을 이용한 한국어 약어 생성)

  • Choi, Su Jeong;Park, Seong-Bae;Kim, Kweon-Yang
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.3
    • /
    • pp.183-187
    • /
    • 2017
  • Smart phone users prefer fast reading and texting. Hence, users frequently use abbreviated sequences of words and phrases. Nowadays, abbreviations are widely used from chat terms to technical terms. Therefore, gathering abbreviations would be helpful to many services, including information retrieval, recommendation system, and so on. However, manually gathering abbreviations needs to much effort and cost. This is because new abbreviations are continuously generated whenever a new material such as a TV program or a phenomenon is made. Thus it is required to generate of abbreviations automatically. To generate Korean abbreviations, the existing methods use the rule-based approach. The rule-based approach has limitations, in that it is unable to generate irregular abbreviations. Another problem is to decide the correct abbreviation among candidate abbreviations generated rules. To address the limitations, we propose a method of generating Korean abbreviations automatically using sequence-to-sequence learning in this paper. The sequence-to-sequence learning can generate irregular abbreviation and does not lead to the problem of deciding correct abbreviation among candidate abbreviations. Accordingly, it is suitable for generating Korean abbreviations. To evaluate the proposed method, we use dataset of two type. As experimental results, we prove that our method is effective for irregular abbreviations.

The Ability of L2 LSTM Language Models to Learn the Filler-Gap Dependency

  • Kim, Euhee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.11
    • /
    • pp.27-40
    • /
    • 2020
  • In this paper, we investigate the correlation between the amount of English sentences that Korean English learners (L2ers) are exposed to and their sentence processing patterns by examining what Long Short-Term Memory (LSTM) language models (LMs) can learn about implicit syntactic relationship: that is, the filler-gap dependency. The filler-gap dependency refers to a relationship between a (wh-)filler, which is a wh-phrase like 'what' or 'who' overtly in clause-peripheral position, and its gap in clause-internal position, which is an invisible, empty syntactic position to be filled by the (wh-)filler for proper interpretation. Here to implement L2ers' English learning, we build LSTM LMs that in turn learn a subset of the known restrictions on the filler-gap dependency from English sentences in the L2 corpus that L2ers can potentially encounter in their English learning. Examining LSTM LMs' behaviors on controlled sentences designed with the filler-gap dependency, we show the characteristics of L2ers' sentence processing using the information-theoretic metric of surprisal that quantifies violations of the filler-gap dependency or wh-licensing interaction effects. Furthermore, comparing L2ers' LMs with native speakers' LM in light of processing the filler-gap dependency, we not only note that in their sentence processing both L2ers' LM and native speakers' LM can track abstract syntactic structures involved in the filler-gap dependency, but also show using linear mixed-effects regression models that there exist significant differences between them in processing such a dependency.

Proposal of a Step-by-Step Optimized Campus Power Forecast Model using CNN-LSTM Deep Learning (CNN-LSTM 딥러닝 기반 캠퍼스 전력 예측 모델 최적화 단계 제시)

  • Kim, Yein;Lee, Seeun;Kwon, Youngsung
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.10
    • /
    • pp.8-15
    • /
    • 2020
  • A forecasting method using deep learning does not have consistent results due to the differences in the characteristics of the dataset, even though they have the same forecasting models and parameters. For example, the forecasting model X optimized with dataset A would not produce the optimized result with another dataset B. The forecasting model with the characteristics of the dataset needs to be optimized to increase the accuracy of the forecasting model. Therefore, this paper proposes novel optimization steps for outlier removal, dataset classification, and a CNN-LSTM-based hyperparameter tuning process to forecast the daily power usage of a university campus based on the hourly interval. The proposing model produces high forecasting accuracy with a 2% of MAPE with a single power input variable. The proposing model can be used in EMS to suggest improved strategies to users and consequently to improve the power efficiency.

Comparison of physics-based and data-driven models for streamflow simulation of the Mekong river (메콩강 유출모의를 위한 물리적 및 데이터 기반 모형의 비교·분석)

  • Lee, Giha;Jung, Sungho;Lee, Daeeop
    • Journal of Korea Water Resources Association
    • /
    • v.51 no.6
    • /
    • pp.503-514
    • /
    • 2018
  • In recent, the hydrological regime of the Mekong river is changing drastically due to climate change and haphazard watershed development including dam construction. Information of hydrologic feature like streamflow of the Mekong river are required for water disaster prevention and sustainable water resources development in the river sharing countries. In this study, runoff simulations at the Kratie station of the lower Mekong river are performed using SWAT (Soil and Water Assessment Tool), a physics-based hydrologic model, and LSTM (Long Short-Term Memory), a data-driven deep learning algorithm. The SWAT model was set up based on globally-available database (topography: HydroSHED, landuse: GLCF-MODIS, soil: FAO-Soil map, rainfall: APHRODITE, etc) and then simulated daily discharge from 2003 to 2007. The LSTM was built using deep learning open-source library TensorFlow and the deep-layer neural networks of the LSTM were trained based merely on daily water level data of 10 upper stations of the Kratie during two periods: 2000~2002 and 2008~2014. Then, LSTM simulated daily discharge for 2003~2007 as in SWAT model. The simulation results show that Nash-Sutcliffe Efficiency (NSE) of each model were calculated at 0.9(SWAT) and 0.99(LSTM), respectively. In order to simply simulate hydrological time series of ungauged large watersheds, data-driven model like the LSTM method is more applicable than the physics-based hydrological model having complexity due to various database pressure because it is able to memorize the preceding time series sequences and reflect them to prediction.

Automated Vehicle Research by Recognizing Maneuvering Modes using LSTM Model (LSTM 모델 기반 주행 모드 인식을 통한 자율 주행에 관한 연구)

  • Kim, Eunhui;Oh, Alice
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.16 no.4
    • /
    • pp.153-163
    • /
    • 2017
  • This research is based on the previous research that personally preferred safe distance, rotating angle and speed are differentiated. Thus, we use machine learning model for recognizing maneuvering modes trained per personal or per similar driving pattern groups, and we evaluate automatic driving according to maneuvering modes. By utilizing driving knowledge, we subdivided 8 kinds of longitudinal modes and 4 kinds of lateral modes, and by combining the longitudinal and lateral modes, we build 21 kinds of maneuvering modes. we train the labeled data set per time stamp through RNN, LSTM and Bi-LSTM models by the trips of drivers, which are supervised deep learning models, and evaluate the maneuvering modes of automatic driving for the test data set. The evaluation dataset is aggregated of living trips of 3,000 populations by VTTI in USA for 3 years and we use 1500 trips of 22 people and training, validation and test dataset ratio is 80%, 10% and 10%, respectively. For recognizing longitudinal 8 kinds of maneuvering modes, RNN achieves better accuracy compared to LSTM, Bi-LSTM. However, Bi-LSTM improves the accuracy in recognizing 21 kinds of longitudinal and lateral maneuvering modes in comparison with RNN and LSTM as 1.54% and 0.47%, respectively.

Real-time PM10 Concentration Prediction LSTM Model based on IoT Streaming Sensor data (IoT 스트리밍 센서 데이터에 기반한 실시간 PM10 농도 예측 LSTM 모델)

  • Kim, Sam-Keun;Oh, Tack-Il
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.19 no.11
    • /
    • pp.310-318
    • /
    • 2018
  • Recently, the importance of big data analysis is increasing as a large amount of data is generated by various devices connected to the Internet with the advent of Internet of Things (IoT). Especially, it is necessary to analyze various large-scale IoT streaming sensor data generated in real time and provide various services through new meaningful prediction. This paper proposes a real-time indoor PM10 concentration prediction LSTM model based on streaming data generated from IoT sensor using AWS. We also construct a real-time indoor PM10 concentration prediction service based on the proposed model. Data used in the paper is streaming data collected from the PM10 IoT sensor for 24 hours. This time series data is converted into sequence data consisting of 30 consecutive values from time series data for use as input data of LSTM. The LSTM model is learned through a sliding window process of moving to the immediately adjacent dataset. In order to improve the performance of the model, incremental learning method is applied to the streaming data collected every 24 hours. The linear regression and recurrent neural networks (RNN) models are compared to evaluate the performance of LSTM model. Experimental results show that the proposed LSTM prediction model has 700% improvement over linear regression and 140% improvement over RNN model for its performance level.

Radar rainfall prediction based on deep learning considering temporal consistency (시간 연속성을 고려한 딥러닝 기반 레이더 강우예측)

  • Shin, Hongjoon;Yoon, Seongsim;Choi, Jaemin
    • Journal of Korea Water Resources Association
    • /
    • v.54 no.5
    • /
    • pp.301-309
    • /
    • 2021
  • In this study, we tried to improve the performance of the existing U-net-based deep learning rainfall prediction model, which can weaken the meaning of time series order. For this, ConvLSTM2D U-Net structure model considering temporal consistency of data was applied, and we evaluated accuracy of the ConvLSTM2D U-Net model using a RainNet model and an extrapolation-based advection model. In addition, we tried to improve the uncertainty in the model training process by performing learning not only with a single model but also with 10 ensemble models. The trained neural network rainfall prediction model was optimized to generate 10-minute advance prediction data using four consecutive data of the past 30 minutes from the present. The results of deep learning rainfall prediction models are difficult to identify schematically distinct differences, but with ConvLSTM2D U-Net, the magnitude of the prediction error is the smallest and the location of rainfall is relatively accurate. In particular, the ensemble ConvLSTM2D U-Net showed high CSI, low MAE, and a narrow error range, and predicted rainfall more accurately and stable prediction performance than other models. However, the prediction performance for a specific point was very low compared to the prediction performance for the entire area, and the deep learning rainfall prediction model also had limitations. Through this study, it was confirmed that the ConvLSTM2D U-Net neural network structure to account for the change of time could increase the prediction accuracy, but there is still a limitation of the convolution deep neural network model due to spatial smoothing in the strong rainfall region or detailed rainfall prediction.

A Comparative Study of Machine Learning Algorithms Based on Tensorflow for Data Prediction (데이터 예측을 위한 텐서플로우 기반 기계학습 알고리즘 비교 연구)

  • Abbas, Qalab E.;Jang, Sung-Bong
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.10 no.3
    • /
    • pp.71-80
    • /
    • 2021
  • The selection of an appropriate neural network algorithm is an important step for accurate data prediction in machine learning. Many algorithms based on basic artificial neural networks have been devised to efficiently predict future data. These networks include deep neural networks (DNNs), recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and gated recurrent unit (GRU) neural networks. Developers face difficulties when choosing among these networks because sufficient information on their performance is unavailable. To alleviate this difficulty, we evaluated the performance of each algorithm by comparing their errors and processing times. Each neural network model was trained using a tax dataset, and the trained model was used for data prediction to compare accuracies among the various algorithms. Furthermore, the effects of activation functions and various optimizers on the performance of the models were analyzed The experimental results show that the GRU and LSTM algorithms yields the lowest prediction error with an average RMSE of 0.12 and an average R2 score of 0.78 and 0.75 respectively, and the basic DNN model achieves the lowest processing time but highest average RMSE of 0.163. Furthermore, the Adam optimizer yields the best performance (with DNN, GRU, and LSTM) in terms of error and the worst performance in terms of processing time. The findings of this study are thus expected to be useful for scientists and developers.

A Study on Performance Improvement of Recurrent Neural Networks Algorithm using Word Group Expansion Technique (단어그룹 확장 기법을 활용한 순환신경망 알고리즘 성능개선 연구)

  • Park, Dae Seung;Sung, Yeol Woo;Kim, Cheong Ghil
    • Journal of Industrial Convergence
    • /
    • v.20 no.4
    • /
    • pp.23-30
    • /
    • 2022
  • Recently, with the development of artificial intelligence (AI) and deep learning, the importance of conversational artificial intelligence chatbots is being highlighted. In addition, chatbot research is being conducted in various fields. To build a chatbot, it is developed using an open source platform or a commercial platform for ease of development. These chatbot platforms mainly use RNN and application algorithms. The RNN algorithm has the advantages of fast learning speed, ease of monitoring and verification, and good inference performance. In this paper, a method for improving the inference performance of RNNs and applied algorithms was studied. The proposed method used the word group expansion learning technique of key words for each sentence when RNN and applied algorithm were applied. As a result of this study, the RNN, GRU, and LSTM three algorithms with a cyclic structure achieved a minimum of 0.37% and a maximum of 1.25% inference performance improvement. The research results obtained through this study can accelerate the adoption of artificial intelligence chatbots in related industries. In addition, it can contribute to utilizing various RNN application algorithms. In future research, it will be necessary to study the effect of various activation functions on the performance improvement of artificial neural network algorithms.

Experimental Comparison of Network Intrusion Detection Models Solving Imbalanced Data Problem (데이터의 불균형성을 제거한 네트워크 침입 탐지 모델 비교 분석)

  • Lee, Jong-Hwa;Bang, Jiwon;Kim, Jong-Wouk;Choi, Mi-Jung
    • KNOM Review
    • /
    • v.23 no.2
    • /
    • pp.18-28
    • /
    • 2020
  • With the development of the virtual community, the benefits that IT technology provides to people in fields such as healthcare, industry, communication, and culture are increasing, and the quality of life is also improving. Accordingly, there are various malicious attacks targeting the developed network environment. Firewalls and intrusion detection systems exist to detect these attacks in advance, but there is a limit to detecting malicious attacks that are evolving day by day. In order to solve this problem, intrusion detection research using machine learning is being actively conducted, but false positives and false negatives are occurring due to imbalance of the learning dataset. In this paper, a Random Oversampling method is used to solve the unbalance problem of the UNSW-NB15 dataset used for network intrusion detection. And through experiments, we compared and analyzed the accuracy, precision, recall, F1-score, training and prediction time, and hardware resource consumption of the models. Based on this study using the Random Oversampling method, we develop a more efficient network intrusion detection model study using other methods and high-performance models that can solve the unbalanced data problem.