• Title/Summary/Keyword: Data prediction


Development and Comparison of Data Mining-based Prediction Models of Building Fire Probability

  • Hong, Sung-gwan;Jeong, Seung Ryul
    • Journal of Internet Computing and Services / v.19 no.6 / pp.101-112 / 2018
  • Considerable manpower and budget are spent on fire prevention, yet only a small portion of the data generated in the process is used for disaster prevention activities. This study develops a data mining-based prediction model of fire occurrence probability so that these data can be used more actively for disaster prevention. To this end, variables for predicting the fire occurrence probability of various buildings were selected, and data from the construction administrative system, the national fire information system, and the Korea Fire Insurance Association were collected and integrated into a single data set. After appropriate data cleansing and preprocessing, several data mining methods, including artificial neural networks, decision trees, SVM, and naive Bayes, were used to develop prediction models of the fire occurrence probability of buildings. The most accurate of the derived models is the linear SVM model, which achieves 68.42% accuracy on the experimental data and 63.54% on the verification data, making it the best model for predicting the fire occurrence probability of buildings. Because this study develops the prediction model using only setting values within specific ranges, future studies may explore further opportunities to use setting values not covered here.

Dynamic data-base Typhoon Track Prediction (DYTRAP) (동적 데이터베이스 기반 태풍 진로 예측)

  • Lee, Yunje;Kwon, H. Joe;Joo, Dong-Chan
    • Atmosphere / v.21 no.2 / pp.209-220 / 2011
  • A new consensus algorithm for tropical cyclone track prediction has been developed. A conventional consensus is a simple average of a few fixed models that have performed well in track prediction over the past few years. In contrast, the consensus in this study is a weighted average of a few models, and the selected models may change for every individual forecast time. The models are selected as follows. The first step is to find past tropical cyclone tracks analogous to the current track. The next step is to evaluate the model performances for those past tracks. Finally, a weighted average of the selected models is taken, with more weight given to the better-performing models. The new algorithm is named DYTRAP (DYnamic data-base Typhoon tRAck Prediction) because the database is used to find the analogous past tracks and the effective models for every individual track prediction case. DYTRAP has been applied to all tropical cyclone track predictions in 2009. The results outperform those of all models as well as all the official forecasts of the typhoon centers. To prove the real usefulness of DYTRAP, it must be applied to real-time prediction, because forecasts at typhoon centers usually use 6-hour-old or 12-hour-old model guidance.
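The final weighting step of a performance-weighted consensus can be illustrated with a minimal Python sketch. This is not the DYTRAP implementation: the abstract does not state how the weights are computed, so inverse-error weighting is an assumption, and the model names, error values, and forecast positions are hypothetical. Averaging latitude and longitude directly is also a simplification that is only reasonable when the forecast positions lie close together.

```python
def inverse_error_weights(past_errors):
    """Turn each model's mean past track error (km) into a normalized weight.

    Assumption: weight is proportional to 1 / error. The source only says that
    better-performing models get more weight. Errors must be positive.
    """
    inverse = {name: 1.0 / err for name, err in past_errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_consensus(forecasts, weights):
    """Weighted average of (latitude, longitude) forecast positions."""
    lat = sum(weights[m] * forecasts[m][0] for m in weights)
    lon = sum(weights[m] * forecasts[m][1] for m in weights)
    return lat, lon


# Hypothetical mean errors of three models on analogous past tracks (km).
past_errors = {"model_a": 100.0, "model_b": 200.0, "model_c": 400.0}

# Hypothetical forecast positions for the current storm (degrees).
forecasts = {
    "model_a": (20.0, 130.0),
    "model_b": (21.0, 131.0),
    "model_c": (22.0, 132.0),
}

weights = inverse_error_weights(past_errors)
consensus = weighted_consensus(forecasts, weights)
print(weights)
print(consensus)
```

With these inputs the model with the smallest past error dominates the consensus position, which is the behavior the abstract describes. The search for analogous past tracks is a separate step and is not shown.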

Developing an User Location Prediction Model for Ubiquitous Computing based on a Spatial Information Management Technique

  • Choi, Jin-Won;Lee, Yung-Il
    • Architectural research / v.12 no.2 / pp.15-22 / 2010
  • Our prediction model is based on the development of a "Semantic Location Model." It embodies geometrical and topological information, which can increase prediction efficiency and make the prediction model easy to manipulate. Data mining is used to extract the inhabitant's location patterns generated day by day. As a result, the self-learning system will be able to semantically predict the inhabitant's location in advance. This context-aware system is a key component of the ubiquitous computing environment. First, we explain the semantic location model and data mining methods. Then the location prediction model for the ubiquitous computing system is described in detail. Finally, a prototype system is introduced to demonstrate and evaluate our prediction model.

Improving the prediction accuracy by using the number of neighbors in collaborative filtering (협력적 필터링 추천기법에서 이웃 수를 이용한 선호도 예측 정확도 향상)

  • Lee, Hee-Choon
    • Journal of the Korean Data and Information Science Society / v.20 no.3 / pp.505-514 / 2009
  • This study analyzes the relationship between the number of neighbors and prediction accuracy in the preference prediction process of a collaborative filtering system. The numbers of neighbors involved in the preference prediction process are divided into four groups. Each group shows a slight difference in preference prediction. Linear functions are suggested based on the average prediction error in each group. The results show that the accuracy of preference prediction in the suggested system can be raised by applying these linear functions of the number of neighbors.


Design of Disease Prediction Algorithm Applying Machine Learning Time Series Prediction

  • Hye-Kyeong Ko
    • International Journal of Internet, Broadcasting and Communication / v.16 no.3 / pp.321-328 / 2024
  • This paper designs a disease prediction algorithm that uses machine learning-based time series analysis to diagnose migraine in advance. The study uses patient data statistics, such as electroencephalogram activity, to design a prediction algorithm that determines the onset signals of migraine symptoms, so that patients can efficiently predict and manage their disease. The study evaluates how accurately the proposed algorithm predicts migraine and how quickly it can predict migraine onset for disease prevention purposes. A machine learning algorithm is used to analyze time series of the data indicators used for migraine identification. The algorithm takes existing patient data as input and quickly determines the symptoms that signal disease onset, so that patients' diseases can be predicted and managed efficiently. The experimental results show that the proposed prediction algorithm can accurately predict the occurrence of migraine.

Analysis Model Evaluation based on IoT Data and Machine Learning Algorithm for Prediction of Acer Mono Sap Liquid Water

  • Lee, Han Sung;Jung, Se Hoon
    • Journal of Korea Multimedia Society / v.23 no.10 / pp.1286-1295 / 2020
  • It has become increasingly difficult to predict the amount of Acer mono sap to be collected because of droughts and cold waves caused by recent climate change, and few studies have addressed the prediction of its collection volume. This study therefore proposes a Big Data prediction system based on meteorological information for the collection of Acer mono sap. The proposed system analyzes collected data and provides managers with a statistical chart of predicted values for the climate factors that affect the amount of sap to be collected, thus enabling efficient work. It was designed on Hadoop for data collection, processing, and analysis. The study also analyzed and proposed an optimal prediction model for the climate conditions that influence the collection volume of Acer mono sap by applying a multiple regression analysis model based on Hadoop and Mahout.

Comparison of prediction methods for Nonlinear Time series data with Intervention

  • Lee, Sung-Duck;Kim, Ju-Sung
    • Journal of the Korean Data and Information Science Society / v.14 no.2 / pp.265-274 / 2003
  • Time series data are influenced by external events such as holidays, strikes, oil shocks, and political change, and these events cause sudden changes in the data. We regard an observation that occurred as a result of an external event as an outlier. In general, such an event is called an intervention if its period and cause are known, and it makes it difficult for an analyst to establish a time series model. It is therefore important to analyze the types and effects of interventions. In this paper, we consider the linear time series model with intervention and compare it with nonlinear time series models such as the ARCH and GARCH models, and also with the combination prediction method introduced by Tong (1990). In a practical case study, we compare prediction power by RMSE among the linear and nonlinear time series models with intervention and the combination prediction method.
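Comparing prediction power by RMSE amounts to scoring each model's forecasts against the same held-out observations and preferring the smallest value. The Python sketch below shows only that comparison step. The observations, the forecasts, and the model labels are hypothetical and are not taken from the paper, and no time series model is fitted here.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Hypothetical held-out observations.
actual = [10.0, 12.0, 11.0, 13.0]

# Hypothetical forecasts from two candidate models (labels are illustrative).
candidates = {
    "linear_with_intervention": [10.0, 11.0, 12.0, 13.0],
    "nonlinear": [9.0, 12.0, 13.0, 14.0],
}

scores = {name: rmse(actual, forecast) for name, forecast in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print(best)
```

Because RMSE squares each error before averaging, a single large miss, such as one near an intervention point, is penalized more heavily than several small ones.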


Bayesian Typhoon Track Prediction Using Wind Vector Data

  • Han, Minkyu;Lee, Jaeyong
    • Communications for Statistical Applications and Methods / v.22 no.3 / pp.241-253 / 2015
  • In this paper we predict the track of typhoons using a Bayesian principal component regression model based on wind field data. Data are obtained at each time point, and we apply the Bayesian principal component regression model to predict the track at each time point. For the regression model, we apply a variable selection prior and two kinds of prior distribution: the normal and the Laplace distribution. We show prediction results based on the Bayesian Model Averaging (BMA) estimator and the Median Probability Model (MPM) estimator. We analyze 8 typhoons in 2006 using data obtained from the previous 6 years (2000-2005). We compare our prediction results with the moving-nest typhoon model (MTM) proposed by the Korea Meteorological Administration. We posit that it is possible to predict the track of a typhoon accurately using only a statistical model, without a dynamical model.

A Comparative Study between Stock Price Prediction Models Using Sentiment Analysis and Machine Learning Based on SNS and News Articles (SNS와 뉴스기사의 감성분석과 기계학습을 이용한 주가예측 모형 비교 연구)

  • Kim, Dongyoung;Park, Jeawon;Choi, Jaehyun
    • Journal of Information Technology Services / v.13 no.3 / pp.221-233 / 2014
  • Because public interest in the stock market has grown with economic development, many studies have tried to predict fluctuations in stock prices. Recently, many studies have used scientific and technological methods among the various forecasting methods, and the data used for such studies are becoming more diverse. In this paper we propose stock price prediction models that use sentiment analysis and machine learning based on news articles and SNS data to improve the accuracy of stock price prediction. The proposed models are generated through a four-step process consisting of data collection, sentiment dictionary construction, sentiment analysis, and machine learning. News article data were collected from economy-related newspapers, and SNS data were collected from Twitter. The sentiment dictionary was built from the news articles in the collected data and is used to perform sentiment analysis. In the machine learning phase, we generate prediction models using various classification techniques and the data produced by sentiment analysis. After generating the prediction models, we conducted 10-fold cross-validation to measure their performance. The experimental results showed an accuracy above 80% in a number of cases and an F1 score close to 0.8. This can be seen as a significant improvement over conventional research using opinion mining or data mining techniques.
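The evaluation procedure named in the abstract, 10-fold cross-validation reported as accuracy and F1 score, can be sketched in plain Python as follows. Everything apart from the procedure itself is an assumption: the sentiment scores and up/down labels are hypothetical toy data, and the midpoint-threshold classifier is a stand-in for the unspecified classification techniques used in the paper. Folds are contiguous, and shuffling and stratification are omitted for brevity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fit_threshold(scores, labels):
    """Toy classifier (assumption): midpoint between the two class means."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


# Hypothetical daily sentiment scores; label 1 means the price rose.
scores = [0.9, -0.8, 0.7, -0.6, 0.5, -0.4, 0.3, -0.2, 0.8, -0.7,
          0.6, -0.5, 0.4, -0.3, 0.2, -0.1, 0.9, -0.9, 0.1, -0.2]
labels = [1, 0] * 10

y_true, y_pred = [], []
for train_idx, test_idx in k_fold_indices(len(scores), 10):
    # Fit on the nine training folds, then predict the held-out fold.
    threshold = fit_threshold([scores[i] for i in train_idx],
                              [labels[i] for i in train_idx])
    for i in test_idx:
        y_true.append(labels[i])
        y_pred.append(1 if scores[i] > threshold else 0)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = f1_score(y_true, y_pred)
print(accuracy, f1)
```

Each sample is predicted exactly once by a model that never saw it during fitting, so the pooled accuracy and F1 estimate out-of-sample performance. The toy data are cleanly separable, so the scores here say nothing about real stock data.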

Credit Card Bad Debt Prediction Model based on Support Vector Machine (신용카드 대손회원 예측을 위한 SVM 모형)

  • Kim, Jin Woo;Jhee, Won Chul
    • Journal of Information Technology Services / v.11 no.4 / pp.233-250 / 2012
  • In this paper, credit card delinquency means the possibility that bad debt will occur in the near future in normal accounts that currently have no debt, and the problem is to predict, on a monthly basis, the occurrence of delinquency 3 months in advance. This is a typical binary classification problem, but it suffers from data imbalance: instances of the target class are very few. For effective prediction of bad debt occurrence, a Support Vector Machine (SVM) with the kernel trick is adopted, using credit card usage and payment patterns as inputs. SVM is widely accepted in the data mining community because of its prediction accuracy and its resistance to overfitting. However, SVM is known to be limited in its ability to process large-scale data. To resolve the difficulties of applying SVM to bad debt occurrence prediction, two-stage clustering is suggested as an effective data reduction method, and ensembles of SVM models are adopted to mitigate the data imbalance intrinsic to the target problem. In experiments with real-world data from one of the major domestic credit card companies, the suggested approach shows superior prediction accuracy to traditional data mining approaches that use neural networks, decision trees, or logistic regression. The SVM ensemble model learned from the T2 training set shows the best prediction results among the alternatives considered, and it is noteworthy that the performance of neural networks with T2 is better than that of SVM with T1. These results show that the suggested approach is very effective for both SVM training and the classification problem of data imbalance.