• Title/Summary/Keyword: Bidirectional long short-term memory

Chinese-clinical-record Named Entity Recognition using IDCNN-BiLSTM-Highway Network

  • Tinglong Tang;Yunqiao Guo;Qixin Li;Mate Zhou;Wei Huang;Yirong Wu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.7 / pp.1759-1772 / 2023
  • Chinese named entity recognition (NER) is a challenging task that seeks to find, recognize and classify various types of information elements in unstructured text. Because Chinese text has no natural boundaries like the spaces in English text, Chinese named entity recognition is much more difficult. At present, most deep learning-based NER models are built on a bidirectional long short-term memory network (BiLSTM), yet their performance still has room to improve. To further improve performance on Chinese NER tasks, we propose a new NER model, IDCNN-BiLSTM-Highway, which combines the BiLSTM, the iterated dilated convolutional neural network (IDCNN) and the highway network. In our model, the IDCNN is used to achieve multiscale context aggregation from a long sequence of words. The highway network is used to effectively connect different network layers, allowing information to pass through them smoothly without attenuation. Finally, the globally optimal tag sequence is obtained by introducing a conditional random field (CRF). The experimental results show that, compared with other popular deep learning-based NER models, our model achieves superior performance on two Chinese NER data sets, Resume and Yidu-S4k, with F1-scores of 94.98 and 77.59, respectively.
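
The abstract above says the highway network lets information pass through layers without attenuation. A minimal sketch of the standard highway formulation, y = T(x)·H(x) + (1 − T(x))·x, may make that concrete. This is not the authors' implementation: the function names, the tanh transform, and the toy weights are assumptions chosen for illustration, and the sketch uses plain Python lists rather than a deep learning framework.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, bias, x):
    # Dense layer without activation: one output per weight row.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def highway_layer(x, w_h, b_h, w_t, b_t):
    # H(x): candidate transform (tanh is an assumed choice of activation).
    h = [math.tanh(v) for v in affine(w_h, b_h, x)]
    # T(x): transform gate in (0, 1); 1 - T(x) acts as the carry gate.
    t = [sigmoid(v) for v in affine(w_t, b_t, x)]
    # Mix the transformed signal with the untouched input, element-wise.
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

# Hypothetical toy weights: a strongly negative gate bias closes the
# transform gate, so the input is carried through almost unchanged.
x = [0.2, -0.7]
w_h = [[1.0, 0.5], [-0.3, 0.8]]
b_h = [0.0, 0.0]
w_t = [[0.0, 0.0], [0.0, 0.0]]
b_t_closed = [-50.0, -50.0]
carried = highway_layer(x, w_h, b_h, w_t, b_t_closed)
```

With the gate nearly closed, `carried` is numerically indistinguishable from `x`, which is the "pass through without attenuation" behaviour; with a large positive gate bias the layer instead behaves like an ordinary tanh layer.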

Assessment of maximum liquefaction distance using soft computing approaches

  • Kishan Kumar;Pijush Samui;Shiva S. Choudhary
    • Geomechanics and Engineering / v.37 no.4 / pp.395-418 / 2024
  • Liquefaction-related damage typically takes place in the epicentral region of an earthquake. To determine the maximum distance, such as the maximum epicentral distance (Re), maximum fault distance (Rf), or maximum hypocentral distance (Rh), at which an earthquake of a given magnitude can inflict damage, this study builds multiple ML models on a recently updated global liquefaction database to predict these limiting distances. Four machine learning models, LSTM (Long Short-Term Memory), BiLSTM (Bidirectional Long Short-Term Memory), CNN (Convolutional Neural Network), and XGB (Extreme Gradient Boosting), are developed using the Python programming language. All four proposed ML models performed better than empirical models for limiting distance assessment. Among them, the XGB model outperformed all the others. A number of statistical parameters were studied to determine how well the proposed models can predict limiting distances. To compare the accuracy of the proposed models, rank analysis, an error matrix, and a Taylor diagram were developed. The ML models proposed in this paper are more robust than other current models and may be used to assess the minimal energy of a liquefaction disaster caused by an earthquake or, in rapid disaster mapping, to estimate the maximum distance of a liquefied site for a given earthquake.

Prediction of Dissolved Oxygen in Jindong Bay Using Time Series Analysis (시계열 분석을 이용한 진동만의 용존산소량 예측)

  • Han, Myeong-Soo;Park, Sung-Eun;Choi, Youngjin;Kim, Youngmin;Hwang, Jae-Dong
    • Journal of the Korean Society of Marine Environment & Safety / v.26 no.4 / pp.382-391 / 2020
  • In this study, we used artificial intelligence algorithms to predict dissolved oxygen in Jindong Bay. To fill in missing values in the observational data, we used the Bidirectional Recurrent Imputation for Time Series (BRITS) deep learning algorithm. To predict the dissolved oxygen, we used Auto-Regressive Integrated Moving Average (ARIMA), a widely used time series analysis method, and the Long Short-Term Memory (LSTM) deep learning method, and we compared the accuracy of the two. BRITS determined the missing values with high accuracy in the surface layer; however, its accuracy was low in the lower layers. In the middle layer, the accuracy of BRITS was unstable due to the experimental conditions. In the middle and bottom layers, the LSTM model showed higher accuracy than the ARIMA model, whereas the ARIMA model showed superior performance in the surface layer.

A Text Content Classification Using LSTM For Objective Category Classification

  • Noh, Young-Dan;Cho, Kyu-Cheol
    • Journal of the Korea Society of Computer and Information / v.26 no.5 / pp.39-46 / 2021
  • AI is widely applied in algorithms that assist us, not only in everyday technologies such as translators and Face ID but also in innumerable fields of industry. In this research, we aim to provide convenience through AI categorization: rather than having users check all of the immense amount of content on the internet, objective classification extracts only the data they need. We propose a model using LSTM (Long Short-Term Memory network), which stands out in text classification, and compare its performance with models based on RNN (Recurrent Neural Network) and BiLSTM (Bidirectional LSTM), a structure well suited to natural language processing. The performance of the three models is compared using accuracy, precision, and recall. As a result, the LSTM model appears to have the best performance. Therefore, this research recommends LSTM for text classification.
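
The three measurements named in the abstract above can be computed from true and predicted labels as in the following sketch. It assumes a single "positive" category (for a multi-category task, one category at a time treated one-vs-rest); the function name and the toy labels are hypothetical and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted
    # or never present; returning 0.0 here is an assumed convention.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical toy labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Accuracy counts every correct label, while precision and recall look only at the positive category, so a model can rank differently on the three measures.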

Transition-Based Korean Dependency Parsing using Bidirectional LSTM (Bidirectional LSTM을 이용한 전이기반 한국어 의존 구문분석)

  • Ha, Tae-Bin;Lee, Tae-Hyeon;Seo, Young-Hoon
    • Annual Conference on Human and Language Technology / 2018.10a / pp.527-529 / 2018
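  • Compared with early studies that applied FNNs (Feedforward Neural Networks) to natural language processing, LSTM (Long Short-Term Memory) carries information not only from the current time step but also from previous time steps, and therefore performs well on sequential data such as the eojeols (spacing units) that make up a sentence and the morphemes that make up an eojeol. In this paper, we represent the eojeols in the stack and buffer with representations obtained from bidirectional LSTM encoding and apply them to transition-based dependency parsing, currently achieving an accuracy of UAS 89.4%; we expect performance to improve with additional features and refinement.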

A patent application filing forecasting method based on the bidirectional LSTM (양방향 LSTM기반 시계열 특허 동향 예측 연구)

  • Seungwan, Choi;Kwangsoo, Kim;Sooyeong, Kwak
    • Journal of IKEEE / v.26 no.4 / pp.545-552 / 2022
  • The number of patent application filings for a specific technology is closely related to the technology's life cycle and to future industry development in that area. Industry and governments are therefore highly interested in forecasting the number of patent application filings so that they can prepare appropriately in advance. In this paper, a new method based on the bidirectional long short-term memory (LSTM), a kind of recurrent neural network (RNN), is proposed to improve forecasting accuracy compared with related methods. Compared with the Bass model, one of the conventional diffusion modeling methods, the proposed method shows 16% higher performance on Korean patent filing data for five selected technology areas.

Expansion of Word Representation for Named Entity Recognition Based on Bidirectional LSTM CRFs (Bidirectional LSTM CRF 기반의 개체명 인식을 위한 단어 표상의 확장)

  • Yu, Hongyeon;Ko, Youngjoong
    • Journal of KIISE / v.44 no.3 / pp.306-313 / 2017
  • Named entity recognition (NER) seeks to locate and classify named entities in text into pre-defined categories such as names of persons, organizations, locations, expressions of times, etc. Recently, many state-of-the-art NER systems have been implemented with bidirectional LSTM CRFs. Deep learning models based on long short-term memory (LSTM) generally depend on word representations as input. In this paper, we propose an approach to expand word representation by using pre-trained word embedding, part of speech (POS) tag embedding, syllable embedding and named entity dictionary feature vectors. Our experiments show that the proposed approach creates useful word representations as an input of bidirectional LSTM CRFs. Our final presentation shows its efficacy to be 8.05%p higher than baseline NERs with only the pre-trained word embedding vector.

Analysis of streamflow prediction performance by various deep learning schemes

  • Le, Xuan-Hien;Lee, Giha
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.131-131 / 2021
  • Deep learning models, especially those based on long short-term memory (LSTM), have recently shown their superiority in handling time series data. This study aims to comprehensively evaluate the performance of supervised deep learning models in streamflow prediction. Six deep learning models were therefore considered: standard LSTM, standard gated recurrent unit (GRU), stacked LSTM, bidirectional LSTM (BiLSTM), feed-forward neural network (FFNN), and convolutional neural network (CNN). The Red River system, one of the largest river basins in Vietnam, was adopted as a case study. The models were designed to forecast flowrate one and two days ahead at the Son Tay hydrological station on the Red River, using series of observed flowrate data at seven hydrological stations on three major branches of the Red River system (Thao River, Da River, and Lo River) as the input data for training, validation, and testing. The comparison results indicate that the four LSTM-based models perform significantly better and more stably than the FFNN and CNN models. Moreover, LSTM-based models may reach impressive predictions even in the presence of upstream reservoirs and dams. In the case of the stacked LSTM and BiLSTM models, the added complexity is not accompanied by performance improvement: their performance is no higher than that of the two standard models (LSTM and GRU). We conclude that, for hydrological forecasting problems, simple architectures such as LSTM and GRU (with one hidden layer) are sufficient to produce highly reliable forecasts while minimizing computation time, because of the sequential nature of the data.

Stock Prediction Model based on Bidirectional LSTM Recurrent Neural Network (양방향 LSTM 순환신경망 기반 주가예측모델)

  • Joo, Il-Taeck;Choi, Seung-Ho
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.2 / pp.204-208 / 2018
  • In this paper, we propose and evaluate a time series deep learning prediction model for learning the fluctuation patterns of stock prices. Recurrent neural networks, which can store previous information in the hidden layer, are suitable for predicting stock prices, which are time series data. To maintain long-term dependencies by solving the vanishing gradient problem of the recurrent neural network, we use LSTM, which has a small memory inside the recurrent neural network. Furthermore, we propose a stock price prediction model using a bidirectional LSTM recurrent neural network, in which a hidden layer is added in the reverse direction of the data flow to overcome the recurrent neural network's tendency to learn only from the immediately preceding pattern. In the experiment, we used TensorFlow to train the proposed model with stock price and trading volume as input. To evaluate prediction performance, the root mean square error between the real and predicted stock prices was obtained. As a result, the stock price prediction model using the bidirectional LSTM recurrent neural network showed improved prediction accuracy compared with the unidirectional LSTM recurrent neural network.
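
The idea of adding a hidden layer that runs against the data flow can be sketched without any framework. The example below deliberately swaps the LSTM cell for a plain scalar tanh recurrence to stay short, so it illustrates only the bidirectional wiring, not the paper's TensorFlow model; all function names and weights are assumptions.

```python
import math

def rnn_pass(xs, w_x, w_h):
    # Simple recurrence h_t = tanh(w_x * x_t + w_h * h_{t-1}), with h_0 = 0.
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

def bidirectional_states(xs, fwd=(0.5, 0.3), bwd=(0.5, 0.3)):
    # Forward layer reads the sequence from first to last step.
    forward = rnn_pass(xs, *fwd)
    # Backward layer reads it from last to first; its outputs are
    # re-reversed so index t lines up with the forward state at t.
    backward = rnn_pass(xs[::-1], *bwd)[::-1]
    # Each time step now sees context from both directions.
    return list(zip(forward, backward))

# Hypothetical toy input sequence.
states = bidirectional_states([1.0, 2.0, 3.0])
```

At each step the first element of the pair summarizes everything up to that step and the second summarizes everything after it, which is what lets a bidirectional model use more than the immediately preceding pattern.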

Korean sentence spacing correction model using syllable and morpheme information (음절과 형태소 정보를 이용한 한국어 문장 띄어쓰기 교정 모델)

  • Choi, Jeong-Myeong;Oh, Byoung-Doo;Heo, Tak-Sung;Jeong, Yeong-Seok;Kim, Yu-Seop
    • Annual Conference on Human and Language Technology / 2020.10a / pp.141-144 / 2020
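  • In Korean, spacing is very important for the readability of a sentence and for grasping its context. In addition, using sentences with spacing errors in natural language processing can affect performance because the structure of the sentence changes. Previous studies have corrected spacing using N-gram-based statistical methods and morphological analyzers. Recently, many spacing-correction studies using deep neural networks have been conducted. Existing studies using deep neural networks built correction models by processing sentences in syllable units or morpheme units. In this study, we use both syllable and morpheme units as model input and combine the two kinds of information to solve the spacing-correction problem. The model uses a Convolutional Neural Network, which can learn local information from the syllable and morpheme sequences of a sentence, and a Bidirectional Long Short-Term Memory structure, which can learn order information in the forward and backward directions. Model performance was evaluated using syllable accuracy and eojeol (spacing unit) precision, recall, and F1 score. In the evaluation, the proposed model achieved an excellent eojeol F1 score of 96.06%.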
