• Title/Summary/Keyword: Ensemble models

Search Result 373, Processing Time 0.027 seconds

Prediction of Ship Travel Time in Harbour using 1D-Convolutional Neural Network (1D-CNN을 이용한 항만내 선박 이동시간 예측)

  • Sang-Lok Yoo;Kwang-Il Ki;Cho-Young Jung
    • Proceedings of the Korean Institute of Navigation and Port Research Conference
    • /
    • 2022.06a
    • /
    • pp.275-276
    • /
    • 2022
  • VTS operators instruct ships to wait for entry and departure to sail in one-way to prevent ship collision accidents in ports with narrow routes. Currently, the instructions are not based on scientific and statistical data. As a result, there is a significant deviation depending on the individual capability of the VTS operators. Accordingly, this study built a 1d-convolutional neural network model by collecting ship and weather data to predict the exact travel time for ship entry/departure waiting for instructions in the port. It was confirmed that the proposed model was improved by more than 4.5% compared to other ensemble machine learning models. Through this study, it is possible to predict the time required to enter and depart a vessel in various situations, so it is expected that the VTS operators will help provide accurate information to the vessel and determine the waiting order.

  • PDF

Predictive Models for the Tourism and Accommodation Industry in the Era of Smart Tourism: Focusing on the COVID-19 Pandemic (스마트관광 시대의 관광숙박업 영업 예측 모형: 코로나19 팬더믹을 중심으로)

  • Yu Jin Jo;Cha Mi Kim;Seung Yeon Son;Mi Jin Noh
    • Smart Media Journal
    • /
    • v.12 no.8
    • /
    • pp.18-25
    • /
    • 2023
  • The COVID-19 outbreak in 2020 caused continuous damage worldwode, especially the smart tourism industry was hit directly by the blockade of sky roads and restriction of going out. At a time when overseas travel and domestic travel have decreased significantly, the number of tourist hotels that are colsed and closed due to the continued deficit is increasing. Therefore, in this study, licensing data from the Ministry of Public Administraion and Security were collected and visualized to understand the operation status of the tourism and lodging industry. The machine learning classification algorithm was applied to implement the business status prediction model of the tourist hotel, the performance of the prediction model was optimized using the ensemble algorithm, and the performance of the model was evaluated through 5-Fold cross-validation. It was predicted that the survival rate of tourist hotels would decrease somewhat, but the actual survival rate was analyzed to be no different from before COVID-19. Through the prediction of the business status of the hotel industry in this paper, it can be used as a basis for grasping the operability and development trends of the entire tourism and lodging industry.

Development of AI-based Smart Agriculture Early Warning System

  • Hyun Sim;Hyunwook Kim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.12
    • /
    • pp.67-77
    • /
    • 2023
  • This study represents an innovative research conducted in the smart farm environment, developing a deep learning-based disease and pest detection model and applying it to the Intelligent Internet of Things (IoT) platform to explore new possibilities in the implementation of digital agricultural environments. The core of the research was the integration of the latest ImageNet models such as Pseudo-Labeling, RegNet, EfficientNet, and preprocessing methods to detect various diseases and pests in complex agricultural environments with high accuracy. To this end, ensemble learning techniques were applied to maximize the accuracy and stability of the model, and the model was evaluated using various performance indicators such as mean Average Precision (mAP), precision, recall, accuracy, and box loss. Additionally, the SHAP framework was utilized to gain a deeper understanding of the model's prediction criteria, making the decision-making process more transparent. This analysis provided significant insights into how the model considers various variables to detect diseases and pests.

Students' Performance Prediction in Higher Education Using Multi-Agent Framework Based Distributed Data Mining Approach: A Review

  • M.Nazir;A.Noraziah;M.Rahmah
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.10
    • /
    • pp.135-146
    • /
    • 2023
  • An effective educational program warrants the inclusion of an innovative construction which enhances the higher education efficacy in such a way that accelerates the achievement of desired results and reduces the risk of failures. Educational Decision Support System (EDSS) has currently been a hot topic in educational systems, facilitating the pupil result monitoring and evaluation to be performed during their development. Insufficient information systems encounter trouble and hurdles in making the sufficient advantage from EDSS owing to the deficit of accuracy, incorrect analysis study of the characteristic, and inadequate database. DMTs (Data Mining Techniques) provide helpful tools in finding the models or forms of data and are extremely useful in the decision-making process. Several researchers have participated in the research involving distributed data mining with multi-agent technology. The rapid growth of network technology and IT use has led to the widespread use of distributed databases. This article explains the available data mining technology and the distributed data mining system framework. Distributed Data Mining approach is utilized for this work so that a classifier capable of predicting the success of students in the economic domain can be constructed. This research also discusses the Intelligent Knowledge Base Distributed Data Mining framework to assess the performance of the students through a mid-term exam and final-term exam employing Multi-agent system-based educational mining techniques. Using single and ensemble-based classifiers, this study intends to investigate the factors that influence student performance in higher education and construct a classification model that can predict academic achievement. We also discussed the importance of multi-agent systems and comparative machine learning approaches in EDSS development.

A Bi-directional Information Learning Method Using Reverse Playback Video for Fully Supervised Temporal Action Localization (완전지도 시간적 행동 검출에서 역재생 비디오를 이용한 양방향 정보 학습 방법)

  • Huiwon Gwon;Hyejeong Jo;Sunhee Jo;Chanho Jung
    • Journal of IKEEE
    • /
    • v.28 no.2
    • /
    • pp.145-149
    • /
    • 2024
  • Recently, research on temporal action localization has been actively conducted. In this paper, unlike existing methods, we propose two approaches for learning bidirectional information by creating reverse playback videos for fully supervised temporal action localization. One approach involves creating training data by combining reverse playback videos and forward playback videos, while the other approach involves training separate models on videos with different playback directions. Experiments were conducted on the THUMOS-14 dataset using TALLFormer. When using both reverse and forward playback videos as training data, the performance was 5.1% lower than that of the existing method. On the other hand, using a model ensemble shows a 1.9% improvement in performance.

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.83-102
    • /
    • 2021
  • The government recently announced various policies for developing big-data and artificial intelligence fields to provide a great opportunity to the public with respect to disclosure of high-quality data within public institutions. KSURE(Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea, and thus the company is strongly committed to backing export companies with various systems. Nevertheless, there are still fewer cases of realized business model based on big-data analyses. In this situation, this paper aims to develop a new business model which can be applied to an ex-ante prediction for the likelihood of the insurance accident of credit guarantee. We utilize internal data from KSURE which supports export companies in Korea and apply machine learning models. Then, we conduct performance comparison among the predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN(Deep Neural Network). For decades, many researchers have tried to find better models which can help to predict bankruptcy since the ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The development of the prediction for financial distress or bankruptcy was originated from Smith(1930), Fitzpatrick(1932), or Merwin(1942). One of the most famous models is the Altman's Z-score model(Altman, 1968) which was based on the multiple discriminant analysis. This model is widely used in both research and practice by this time. The author suggests the score model that utilizes five key financial ratios to predict the probability of bankruptcy in the next two years. Ohlson(1980) introduces logit model to complement some limitations of previous models. Furthermore, Elmer and Borowski(1988) develop and examine a rule-based, automated system which conducts the financial analysis of savings and loans. Since the 1980s, researchers in Korea have started to examine analyses on the prediction of financial distress or bankruptcy. Kim(1987) analyzes financial ratios and develops the prediction model. Also, Han et al.(1995, 1996, 1997, 2003, 2005, 2006) construct the prediction model using various techniques including artificial neural network. Yang(1996) introduces multiple discriminant analysis and logit model. Besides, Kim and Kim(2001) utilize artificial neural network techniques for ex-ante prediction of insolvent enterprises. After that, many scholars have been trying to predict financial distress or bankruptcy more precisely based on diverse models such as Random Forest or SVM. One major distinction of our research from the previous research is that we focus on examining the predicted probability of default for each sample case, not only on investigating the classification accuracy of each model for the entire sample. Most predictive models in this paper show that the level of the accuracy of classification is about 70% based on the entire sample. To be specific, LightGBM model shows the highest accuracy of 71.1% and Logit model indicates the lowest accuracy of 69%. However, we confirm that there are open to multiple interpretations. In the context of the business, we have to put more emphasis on efforts to minimize type 2 error which causes more harmful operating losses for the guaranty company. Thus, we also compare the classification accuracy by splitting predicted probability of the default into ten equal intervals. When we examine the classification accuracy for each interval, Logit model has the highest accuracy of 100% for 0~10% of the predicted probability of the default, however, Logit model has a relatively lower accuracy of 61.5% for 90~100% of the predicted probability of the default. On the other hand, Random Forest, XGBoost, LightGBM, and DNN indicate more desirable results since they indicate a higher level of accuracy for both 0~10% and 90~100% of the predicted probability of the default but have a lower level of accuracy around 50% of the predicted probability of the default. When it comes to the distribution of samples for each predicted probability of the default, both LightGBM and XGBoost models have a relatively large number of samples for both 0~10% and 90~100% of the predicted probability of the default. Although Random Forest model has an advantage with regard to the perspective of classification accuracy with small number of cases, LightGBM or XGBoost could become a more desirable model since they classify large number of cases into the two extreme intervals of the predicted probability of the default, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance. Next, Random Forest and LightGBM show good results, but logistic regression shows the worst performance. However, each predictive model has a comparative advantage in terms of various evaluation standards. For instance, Random Forest model shows almost 100% accuracy for samples which are expected to have a high level of the probability of default. Collectively, we can construct more comprehensive ensemble models which contain multiple classification machine learning models and conduct majority voting for maximizing its overall performance.

Analysis of Future Bioclimatic Zones Using Multi-climate Models (다중기후모형을 활용한 동북아시아의 미래 생물기후권역 변화분석)

  • Choi, Yuyoung;Lim, Chul-Hee;Ryu, Jieun;Jeon, Seongwoo
    • Journal of Environmental Impact Assessment
    • /
    • v.27 no.5
    • /
    • pp.489-508
    • /
    • 2018
  • As climate changes, it is necessary to predict changes in the habitat environment in order to establish more aggressive adaptation strategies. The bioclimatic classification which clusters of areas with similar habitats can provide a useful ecosystem management framework. Therefore, in this study, biological habitat environment of Northeast Asia was identified through the establishment of the bioclimatic zones, and the impac of climate change on the biological habitat was analyzed. An ISODATA clustering was used to classify Northeast Asia (NEA)into 15 bioclimatic zones, and climate change impacts were predicted by projecting the future spatial distribution of bioclimatic zones based upon an ensemble of 17 GCMs across RCP4.5 and 8.5 scenarios for 2050s, and 2070s. Results demonstrated that significant changes in bioclimatic conditions can be expected throughout the NEA by 2050s and 2070s. The overall zones moved upward, and some zones were predicted to be greatly expanded or shrunk where we suggested as regions requiring intensive management. This analysis provides the basis for understanding potential impacts of climate change on biodiversity and ecosystem. Also, this could be used more effectively to support decision making on climate change adaptation.

Improving Efficiency of Food Hygiene Surveillance System by Using Machine Learning-Based Approaches (기계학습을 이용한 식품위생점검 체계의 효율성 개선 연구)

  • Cho, Sanggoo;Cho, Seung Yong
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.53-67
    • /
    • 2020
  • This study employees a supervised learning prediction model to detect nonconformity in advance of processed food manufacturing and processing businesses. The study was conducted according to the standard procedure of machine learning, such as definition of objective function, data preprocessing and feature engineering and model selection and evaluation. The dependent variable was set as the number of supervised inspection detections over the past five years from 2014 to 2018, and the objective function was to maximize the probability of detecting the nonconforming companies. The data was preprocessed by reflecting not only basic attributes such as revenues, operating duration, number of employees, but also the inspections track records and extraneous climate data. After applying the feature variable extraction method, the machine learning algorithm was applied to the data by deriving the company's risk, item risk, environmental risk, and past violation history as feature variables that affect the determination of nonconformity. The f1-score of the decision tree, one of ensemble models, was much higher than those of other models. Based on the results of this study, it is expected that the official food control for food safety management will be enhanced and geared into the data-evidence based management as well as scientific administrative system.

Transfer Learning based DNN-SVM Hybrid Model for Breast Cancer Classification

  • Gui Rae Jo;Beomsu Baek;Young Soon Kim;Dong Hoon Lim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.11
    • /
    • pp.1-11
    • /
    • 2023
  • Breast cancer is the disease that affects women the most worldwide. Due to the development of computer technology, the efficiency of machine learning has increased, and thus plays an important role in cancer detection and diagnosis. Deep learning is a field of machine learning technology based on an artificial neural network, and its performance has been rapidly improved in recent years, and its application range is expanding. In this paper, we propose a DNN-SVM hybrid model that combines the structure of a deep neural network (DNN) based on transfer learning and a support vector machine (SVM) for breast cancer classification. The transfer learning-based proposed model is effective for small training data, has a fast learning speed, and can improve model performance by combining all the advantages of a single model, that is, DNN and SVM. To evaluate the performance of the proposed DNN-SVM Hybrid model, the performance test results with WOBC and WDBC breast cancer data provided by the UCI machine learning repository showed that the proposed model is superior to single models such as logistic regression, DNN, and SVM, and ensemble models such as random forest in various performance measures.

Crack detection in concrete using deep learning for underground facility safety inspection (지하시설물 안전점검을 위한 딥러닝 기반 콘크리트 균열 검출)

  • Eui-Ik Jeon;Impyeong Lee;Donggyou Kim
    • Journal of Korean Tunnelling and Underground Space Association
    • /
    • v.25 no.6
    • /
    • pp.555-567
    • /
    • 2023
  • The cracks in the tunnel are currently determined through visual inspections conducted by inspectors based on images acquired using tunnel imaging acquisition systems. This labor-intensive approach, relying on inspectors, has inherent limitations as it is subject to their subjective judgments. Recently research efforts have actively explored the use of deep learning to automatically detect tunnel cracks. However, most studies utilize public datasets or lack sufficient objectivity in the analysis process, making it challenging to apply them effectively in practical operations. In this study, we selected test datasets consisting of images in the same format as those obtained from the actual inspection system to perform an objective evaluation of deep learning models. Additionally, we introduced ensemble techniques to complement the strengths and weaknesses of the deep learning models, thereby improving the accuracy of crack detection. As a result, we achieved high recall rates of 80%, 88%, and 89% for cracks with sizes of 0.2 mm, 0.3 mm, and 0.5 mm, respectively, in the test images. In addition, the crack detection result of deep learning included numerous cracks that the inspector could not find. if cracks are detected with sufficient accuracy in a more objective evaluation by selecting images from other tunnels that were not used in this study, it is judged that deep learning will be able to be introduced to facility safety inspection.