Search | Korea Science

A Comparative Study of Machine Learning Algorithms Using LID-DS DataSet (LID-DS 데이터 세트를 사용한 기계학습 알고리즘 비교 연구)

Park, DaeKyeong;Ryu, KyungJoon;Shin, DongIl;Shin, DongKyoo;Park, JeongChan;Kim, JinGoog
- KIPS Transactions on Software and Data Engineering, v.10 no.3, pp.91-98, 2021
As information and communication technology develops rapidly, the security of IT infrastructure is becoming more important, while cyber attacks are becoming more advanced and sophisticated, as in the case of advanced persistent threats (APTs). Early defense against, or prediction of, increasingly sophisticated cyber attacks is extremely important, and in many cases analysis of network-based intrusion detection system (NIDS) data alone cannot prevent rapidly changing attacks. Therefore, data generated by host-based intrusion detection systems (HIDS) are now analyzed to protect against such attacks. In this paper, we conducted a comparative study of machine learning algorithms using LID-DS (Leipzig Intrusion Detection Data Set), a host-based intrusion detection data set that includes thread information, metadata, and buffer data missing from previously used data sets. The algorithms used were Decision Tree, Naive Bayes, MLP (Multi-Layer Perceptron), Logistic Regression, LSTM (Long Short-Term Memory), and RNN (Recurrent Neural Network). Accuracy, precision, recall, F1-score, and error rate were measured for evaluation. The LSTM algorithm achieved the highest accuracy.
https://doi.org/10.3745/KTSDE.2021.10.3.91
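For readers comparing classifiers in the way this entry describes, the usual evaluation indicators (accuracy, precision, recall, F1-score, and error rate) can be computed from confusion counts. The sketch below is illustrative only: the function name `binary_metrics`, the binary attack/normal labeling, and the sample labels are assumptions, not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute common classification indicators from two label lists.

    Hypothetical helper: `positive` marks the attack class by assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "error_rate": 1.0 - accuracy,
    }


# Made-up labels, purely for illustration (1 = attack, 0 = normal).
example = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

Reporting precision and recall alongside accuracy matters when attack traces are rare, since a model that labels everything as normal can still score a high accuracy.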

Fundamental Study on Algorithm Development for Prediction of Smoke Spread Distance Based on Deep Learning (딥러닝 기반의 연기 확산거리 예측을 위한 알고리즘 개발 기초연구)

Kim, Byeol;Hwang, Kwang-Il
- Journal of the Korean Society of Marine Environment & Safety, v.27 no.1, pp.22-28, 2021
This is a basic study on developing deep learning-based algorithms that detect smoke before the smoke detector operates in the event of a ship fire, analyze the detected data, and support fire suppression and evacuation by predicting the spread of smoke before it reaches remote areas. The proposed algorithms were reviewed in the following steps. First, smoke images obtained through fire simulation were applied to the YOLO (You Only Look Once) model, a deep learning-based object detection algorithm. The mean average precision (mAP) of the trained YOLO model was 98.71%, and smoke was detected at a processing speed of 9 frames per second (FPS). Second, the spread of smoke was estimated from the coordinates of the bounding boxes, which were used to extract the smoke geometry from the YOLO output. This smoke geometry was then applied to a time series prediction algorithm, long short-term memory (LSTM). Smoke spread data obtained from the bounding box coordinates between the estimated fire occurrence and 30 s were entered into the LSTM model to predict smoke spread from 31 s to 90 s in the smoke image of a fast fire obtained from fire simulation. The root mean square error between the estimated spread of smoke and its predicted value was 2.74.
https://doi.org/10.7837/kosomes.2021.27.1.022

Role of unstructured data on water surface elevation prediction with LSTM: case study on Jamsu Bridge, Korea (LSTM 기법을 활용한 수위 예측 알고리즘 개발 시 비정형자료의 역할에 관한 연구: 잠수교 사례)

Lee, Seung Yeon;Yoo, Hyung Ju;Lee, Seung Oh
- Journal of Korea Water Resources Association, v.54 no.spc1, pp.1195-1204, 2021
Recently, local torrential rains have become more frequent and severe due to abnormal climate conditions, causing a surge in casualties and property damage, including damage to infrastructure along rivers. In this study, a water surface elevation prediction algorithm was developed using LSTM (Long Short-Term Memory), a machine learning technique specialized for time series data, to estimate and prevent flooding of such facilities. The study area is Jamsu Bridge, and the study period covers June, July, and August over 6 years (2015~2020); the water surface elevation at Jamsu Bridge 3 hours ahead was predicted. The input data set is composed of the water surface elevation at Jamsu Bridge (EL.m), the discharge from Paldang Dam (m³/s), the tide level at Ganghwa Bridge (cm), and the number of tweets in Seoul. Complementary data were constructed using not only the structured data mainly used in previous research but also unstructured data obtained through a word cloud, and the role of unstructured data was examined by comparing results with and without it. When the water surface elevation at Jamsu Bridge was predicted, prediction accuracy improved, and the complementary data were found to be usable as conservative alerts to reduce casualties. The study concluded that the use of complementary data was relatively effective in providing safety and convenience for users of riverside infrastructure. In the future, more accurate water surface elevation prediction is expected through additional types of unstructured data or more detailed preprocessing of the input data.
https://doi.org/10.3741/JKWRA.2021.54.S-1.1195
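A forecast of the water level a fixed number of hours ahead, as in this entry, is commonly set up by slicing the multivariate record into input windows paired with a later target value. The sketch below shows one way that might look. It assumes hourly rows, a target stored in the first column, and a hypothetical lookback length; none of these details are given in the abstract.

```python
def make_windows(rows, lookback, horizon):
    """Pair each block of `lookback` rows with the target `horizon` steps later.

    rows: list of feature rows; column 0 is assumed to be the water level.
    With hourly rows (an assumption), horizon=3 means "3 hours ahead".
    """
    inputs, targets = [], []
    # `end` is the index just past the last row of the input window.
    for end in range(lookback, len(rows) - horizon + 1):
        inputs.append(rows[end - lookback:end])
        # The last input row is end - 1, so the target sits horizon rows later.
        targets.append(rows[end - 1 + horizon][0])
    return inputs, targets


# Made-up rows: [water level, dam discharge, tide level, tweet count].
rows = [[float(i), 100.0 + i, 50.0 - i, i % 3] for i in range(10)]
inputs, targets = make_windows(rows, lookback=4, horizon=3)
```

Each element of `inputs` has shape (lookback, features), which is the layout a recurrent model such as an LSTM usually expects for one training sample.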

A Data-driven Classifier for Motion Detection of Soldiers on the Battlefield using Recurrent Architectures and Hyperparameter Optimization (순환 아키텍쳐 및 하이퍼파라미터 최적화를 이용한 데이터 기반 군사 동작 판별 알고리즘)

Joonho Kim;Geonju Chae;Jaemin Park;Kyeong-Won Park
- Journal of Intelligence and Information Systems, v.29 no.1, pp.107-119, 2023
Technology that recognizes a soldier's motion and movement status has recently attracted considerable attention as a combination of wearable technology and artificial intelligence, and is expected to upend the paradigm of troop management. The accuracy of state determination should be kept high to ensure the expected vital functions both in training, where each individual's motion is evaluated and solutions are provided, and in combat, where overall troop management is enhanced. However, when input data are given as a time series or sequence, existing feedforward networks show clear limitations in maximizing classification performance. Since the human behavior data (3-axis accelerations and 3-axis angular velocities) handled for military motion recognition require analysis of their time-dependent characteristics, this study proposes a high-performance data-driven classifier that uses long short-term memory (LSTM) to identify the order dependence of the acquired data, learning to classify eight representative military operations (Sitting, Standing, Walking, Running, Ascending, Descending, Low Crawl, and High Crawl). Since accuracy is highly dependent on a network's learning conditions and variables, manual adjustment may be neither cost-effective nor guaranteed to give optimal results. Therefore, we optimized hyperparameters using Bayesian optimization to maximize generalization performance. As a result, the final architecture reduced the error rate by 62.56% compared to the existing network with a similar number of learnable parameters, with a final accuracy of 98.39% for various military operations.
https://doi.org/10.13088/jiis.2023.29.1.107

Towards Carbon-Neutralization: Deep Learning-Based Server Management Method for Efficient Energy Operation in Data Centers (탄소중립을 향하여: 데이터 센터에서의 효율적인 에너지 운영을 위한 딥러닝 기반 서버 관리 방안)

Sang-Gyun Ma;Jaehyun Park;Yeong-Seok Seo
- KIPS Transactions on Software and Data Engineering, v.12 no.4, pp.149-158, 2023
As data utilization has become more important, the importance of data centers is also increasing. However, data centers pose environmental and economic problems because they are massive power-consuming facilities that run 24 hours a day. Recently, studies using deep learning techniques to reduce the power used in data centers or servers, or to predict traffic, have been conducted from various perspectives. However, the amount of traffic data processed by a server is anomalous, which makes the server difficult to manage. In addition, many more studies on dynamic server management techniques are still required. Therefore, in this paper, we propose a dynamic server management technique based on Long Short-Term Memory (LSTM), which is robust in time series prediction. The proposed model allows servers to be managed more reliably and efficiently in the field than before, and reduces the power used by servers more effectively. To verify the proposed model, we collect transmission and reception traffic data from six of Wikipedia's data centers, then statistically analyze the relationships among the traffic data and run experiments. Experimental results show that the proposed model helps run servers reliably and efficiently.
https://doi.org/10.3745/KTSDE.2023.12.4.149

Comparison of the Characteristics between the Dynamical Model and the Artificial Intelligence Model of the Lorenz System (Lorenz 시스템의 역학 모델과 자료기반 인공지능 모델의 특성 비교)

YOUNG HO KIM;NAKYOUNG IM;MIN WOO KIM;JAE HEE JEONG;EUN SEO JEONG
- The Sea: JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY, v.28 no.4, pp.133-142, 2023
In this paper, we built a data-driven artificial intelligence model using RNN-LSTM (Recurrent Neural Networks-Long Short-Term Memory) to predict the Lorenz system, and examined whether this model can replace chaotic dynamical models. We confirmed that the data-driven model reflects the chaotic nature of the Lorenz system, in which a small error in the initial conditions produces fundamentally different results and the system moves around two stable poles, repeating the transition process that is characteristic of "deterministic non-periodic flow", and that it simulates the bifurcation phenomenon. We also demonstrated the advantage of adjusting integration time intervals to reduce computational resources in data-driven models. We therefore anticipate that the applicability of data-driven artificial intelligence models will expand through future research on refining such models and on data assimilation techniques for them.
https://doi.org/10.7850/jkso.2023.28.4.133
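The sensitivity to initial conditions mentioned in this entry can be reproduced with a few lines of numerical integration of the Lorenz equations. The sketch below is a generic illustration, not the authors' dynamical model: it assumes the classical parameter values (sigma = 10, rho = 28, beta = 8/3), a fourth-order Runge-Kutta step, and a hypothetical step size and perturbation.

```python
def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (classical parameters assumed)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def separation_history(start_a, start_b, dt=0.01, steps=4000):
    """Integrate two trajectories and record their distance after every step."""
    a, b = start_a, start_b
    history = []
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        history.append(sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5)
    return history


# Two starting points that differ by 1e-8 in x (an arbitrary illustrative choice).
history = separation_history((1.0, 1.0, 1.0), (1.0 + 1e-8, 1.0, 1.0))
```

Plotting `history` on a logarithmic axis would show the distance growing roughly exponentially before it saturates at the size of the attractor, which is the behavior a data-driven surrogate has to reproduce to count as chaotic.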

Prediction of Water Storage Rate for Agricultural Reservoirs Using Univariate and Multivariate LSTM Models (단변량 및 다변량 LSTM을 이용한 농업용 저수지의 저수율 예측)

Sunguk Joh;Yangwon Lee
- Korean Journal of Remote Sensing, v.39 no.5_4, pp.1125-1134, 2023
Out of the total 17,000 reservoirs in Korea, 13,600 small agricultural reservoirs do not have hydrological measurement facilities, making it difficult to predict water storage volume and operate them appropriately. This paper examined univariate and multivariate long short-term memory (LSTM) modeling to predict the storage rate of agricultural reservoirs using remote sensing and artificial intelligence. The univariate LSTM model used only water storage rate as an explanatory variable, and the multivariate LSTM model added n-day accumulative precipitation and date of year (DOY) as explanatory variables. They were trained using eight years of data (2013 to 2020) for Idong Reservoir, and the predictions of daily water storage in 2021 were validated for accuracy assessment. The univariate model showed a root-mean-square error (RMSE) of 1.04%, 2.52%, and 4.18% for the one-, three-, and five-day predictions. The multivariate model showed an RMSE of 0.98%, 1.95%, and 2.76% for the one-, three-, and five-day predictions. In addition to the time-series storage rate, DOY and daily and 5-day cumulative precipitation variables were more significant than others for the daily model, which means that the temporal range of the impacts of precipitation on the everyday water storage rate was approximately five days.
https://doi.org/10.7780/kjrs.2023.39.5.4.6
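Two ingredients of this entry lend themselves to a short sketch: the n-day accumulative precipitation feature and the RMSE used for validation. The helper names `rolling_sum` and `rmse` and the sample numbers below are hypothetical, and the partial-window handling at the start of the record is an assumption rather than the authors' stated choice.

```python
import math


def rolling_sum(daily_values, n):
    """n-day accumulative total ending on each day.

    The first n - 1 entries cover fewer than n days (partial windows, by assumption).
    """
    totals = []
    running = 0.0
    for i, value in enumerate(daily_values):
        running += value
        if i >= n:
            # Drop the day that just fell out of the n-day window.
            running -= daily_values[i - n]
        totals.append(running)
    return totals


def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up daily precipitation (mm) and storage rates (%), for illustration only.
five_day_rain = rolling_sum([0.0, 12.5, 3.0, 0.0, 40.0, 8.5, 0.0], n=5)
error = rmse([71.2, 70.8, 72.5], [71.0, 71.1, 72.0])
```

Because the storage rate is expressed in percent, an RMSE computed this way is in percentage points, which matches how the errors are reported in the abstract.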

A Study on the Health Index Based on Degradation Patterns in Time Series Data Using ProphetNet Model (ProphetNet 모델을 활용한 시계열 데이터의 열화 패턴 기반 Health Index 연구)

Sun-Ju Won;Yong Soo Kim
- Journal of Korean Society of Industrial and Systems Engineering, v.46 no.3, pp.123-138, 2023
The Fourth Industrial Revolution and sensor technology have led to increased utilization of sensor data. In our modern society, data complexity is rising, and the extraction of valuable information has become crucial with the rapid changes in information technology (IT). Recurrent neural networks (RNN) and long short-term memory (LSTM) models have shown remarkable performance in natural language processing (NLP) and time series prediction. Consequently, there is a strong expectation that models excelling in NLP will also excel in time series prediction. However, current research on Transformer models for time series prediction remains limited. Traditional RNN and LSTM models have demonstrated superior performance compared to Transformers in big data analysis. Nevertheless, with continuous advancements in Transformer models, such as GPT-2 (Generative Pre-trained Transformer 2) and ProphetNet, they have gained attention in the field of time series prediction. This study aims to evaluate the classification performance and interval prediction of remaining useful life (RUL) using an advanced Transformer model. The performance of each model will be utilized to establish a health index (HI) for cutting blades, enabling real-time monitoring of machine health. The results are expected to provide valuable insights for machine monitoring, evaluation, and management, confirming the effectiveness of advanced Transformer models in time series analysis when applied in industrial settings.
https://doi.org/10.11627/jksie.2023.46.3.123

Real-time prediction on the slurry concentration of cutter suction dredgers using an ensemble learning algorithm

Han, Shuai;Li, Mingchao;Li, Heng;Tian, Huijing;Qin, Liang;Li, Jinfeng
- International conference on construction engineering and project management, 2020.12a, pp.463-481, 2020
Cutter suction dredgers (CSDs) are widely used in dredging work such as channel excavation, wharf construction, and reef construction. During CSD operation, the main task is to control the swing speed of the cutter to keep the slurry concentration in a proper range. However, the slurry concentration cannot be monitored in real time, i.e., there is a "time-lag effect" in the log of slurry concentration, making it difficult for operators to make optimal control decisions. To address this issue, this research proposes a scheme that uses real-time monitored indicators to predict the current slurry concentration. The characteristics of the CSD monitoring data are first studied, and a set of preprocessing methods is presented. Then we put forward the concept of "index class" to select the important indices. Finally, an ensemble learning algorithm is set up to fit the relationship between the slurry concentration and the indices of the index classes. In the experiment, log data covering seven days of an actual dredging project are collected. For comparison, the Deep Neural Network (DNN), Long Short-Term Memory (LSTM), Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Bayesian Ridge algorithms are tried. The results show that our method has the best performance, with an R² of 0.886 and a mean square error (MSE) of 5.538. This research provides an effective way to predict the slurry concentration of CSDs in real time and can help to improve the stationarity and production efficiency of dredging construction.
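The two scores quoted for the slurry concentration model, R² and mean square error, follow directly from the residuals of a regression. A minimal sketch is given below. The function names and sample values are hypothetical, and the R² form used is the common coefficient of determination, which is assumed rather than confirmed to be the definition used in the paper.

```python
def mean_square_error(observed, predicted):
    """Average of the squared residuals."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up concentrations, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = (r_squared(observed, predicted), mean_square_error(observed, predicted))
```

MSE depends on the units of the target, while R² is unitless, so the two are usually reported together when models such as DNN, LSTM, SVM, RF, and GBDT are compared on the same data.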

Implementation Strategy for the Numerical Efficiency Improvement of the Multiscale Interpolation Wavelet-Galerkin Method

Seo Jeong Hun;Earmme Taemin;Jang Gang-Won;Kim Yoon Young
- Journal of Mechanical Science and Technology, v.20 no.1, pp.110-124, 2006
The multiscale wavelet-Galerkin method implemented in an adaptive manner has the advantage of obtaining accurate solutions with a substantially reduced number of interpolation points. The method is becoming popular, but its numerical efficiency still needs improvement. The objectives of this investigation are to present a new numerical scheme that improves the performance of the multiscale adaptive wavelet-Galerkin method and to give a detailed implementation procedure. Specifically, a subdomain technique suitable for multiscale methods is developed and implemented. When the standard wavelet-Galerkin method is implemented without domain subdivision, the interaction between very long scale wavelets and very short scale wavelets leads to a poorly sparse system matrix, which considerably worsens numerical efficiency for large problems. The performance of the developed strategy is checked in terms of numerical costs such as CPU time and memory size. Since the detailed implementation procedure, including preprocessing and stiffness matrix construction, is given, researchers with experience in standard finite element implementation may be able to extend the multiscale method further or use some of its features in their own applications.
