• Title/Abstract/Keywords: Data Models

Extending the Scope of Automatic Time Series Model Selection: The Package autots for R

  • Jang, Dong-Ik;Oh, Hee-Seok;Kim, Dong-Hoh
    • Communications for Statistical Applications and Methods / Vol. 18, No. 3 / pp. 319-331 / 2011
  • In this paper, we propose automatic procedures for the model selection of various univariate time series data. Automatic model selection is important, especially in data mining with a large number of time series, for example, the thousands of signals accessing a web server during a specific time period. Several methods have been proposed for automatic model selection of time series. However, most existing methods focus on linear time series models such as exponential smoothing and autoregressive integrated moving average (ARIMA) models. The key feature that distinguishes the proposed procedures from previous approaches is that they can be used for both linear time series models and nonlinear time series models such as threshold autoregressive (TAR) models and autoregressive moving average-generalized autoregressive conditional heteroscedasticity (ARMA-GARCH) models. The proposed methods select a model from among the various candidates in the prediction error sense. We also provide an R package, autots, that implements the proposed automatic model selection procedures. We illustrate these algorithms with artificial and real data and describe the implementation of the autots package for R.
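Selecting a model "in the prediction error sense" can be illustrated with a minimal Python sketch. It is not the autots API. The three toy candidates (`fit_mean`, `fit_naive`, `fit_ar1`), the holdout split, and the one-step-ahead mean squared error criterion are all assumptions made for illustration; the abstract does not specify how the prediction error is computed.

```python
import numpy as np

def fit_mean(train):
    # Always predict the training mean.
    m = float(np.mean(train))
    return lambda prev: m

def fit_naive(train):
    # Predict the previous observation (random-walk forecast).
    return lambda prev: prev

def fit_ar1(train):
    # Least-squares fit of y_t = c + phi * y_{t-1}.
    phi, c = np.polyfit(train[:-1], train[1:], 1)
    return lambda prev: c + phi * prev

# Hypothetical candidate set; a real tool would include ARIMA, TAR, etc.
candidates = {"mean": fit_mean, "naive": fit_naive, "ar1": fit_ar1}

def select_by_prediction_error(series, candidates, n_test):
    """Fit each candidate on the head of the series and pick the one with
    the smallest one-step-ahead MSE on the last n_test observations."""
    train, test = series[:-n_test], series[-n_test:]
    prev_values = series[-n_test - 1:-1]  # observation preceding each test point
    errors = {}
    for name, fit in candidates.items():
        predict = fit(train)
        preds = np.array([predict(p) for p in prev_values])
        errors[name] = float(np.mean((test - preds) ** 2))
    best = min(errors, key=errors.get)
    return best, errors

# Simulated AR(1) data, used only to exercise the function.
rng = np.random.default_rng(0)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.8 * y[t - 1] + rng.normal()

best, errors = select_by_prediction_error(y, candidates, n_test=50)
print(best, errors)
```

The same loop extends to nonlinear candidates by adding another entry to `candidates`, provided each entry returns a one-step predictor.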

Bayesian Pattern Mixture Model for Longitudinal Binary Data with Nonignorable Missingness

  • Kyoung, Yujung;Lee, Keunbaik
    • Communications for Statistical Applications and Methods / Vol. 22, No. 6 / pp. 589-598 / 2015
  • In longitudinal studies, missing data are common and require a complicated analysis. There are two popular modeling frameworks for analyzing missing data: pattern mixture models (PMMs) and selection models (SMs). We focus on the PMM and propose Bayesian pattern mixture models using generalized linear mixed models (GLMMs) for longitudinal binary data. Sensitivity analysis is used under the missing not at random assumption.

Hierarchical Bayes Analysis of Longitudinal Poisson Count Data

  • 김달호;신임희;최인순
    • Journal of the Korean Data and Information Science Society / Vol. 13, No. 2 / pp. 227-234 / 2002
  • In this paper, we consider hierarchical Bayes generalized linear models for the analysis of longitudinal count data. Specifically, we introduce hierarchical Bayes random effects models. We discuss implementation of the Bayes procedures via Markov chain Monte Carlo (MCMC) integration techniques. The hierarchical Bayes method is illustrated with a real dataset and is compared with other statistical methods.

주요 국가별 표준 도서관 RFID 데이터 모델의 비교 및 분석 (Comparison and Analysis of Library RFID Data Model for Major National Standards)

  • 최재황
    • 한국도서관정보학회지 / Vol. 40, No. 2 / pp. 87-110 / 2009
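  • The purpose of this study is to analyze and compare the library RFID data models of Denmark, Finland, the Netherlands, France, the United States, Australia, and Korea, all of which have already published a national library RFID data model. The four European countries (Denmark, the Netherlands, Finland, and France) and Korea have adopted a prescribed data model that uses fixed-length encoding, whereas the United States and Australia follow an object-based data model whose encoding is based on ISO 15962. This study is expected to serve as an important basis for discussion when the Korean library community re-establishes its RFID data model.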

Near-real time Kp forecasting methods based on neural network and support vector machine

  • 지은영;문용재;박종엽;이동훈
    • 천문학회보 / Vol. 37, No. 2 / pp. 123.1-123.1 / 2012
  • We have compared near-real time Kp forecast models based on neural network (NN) and support vector machine (SVM) algorithms. We consider four models: (1) an NN model using ACE solar wind data; (2) an SVM model using ACE solar wind data; (3) an NN model using ACE solar wind data and preliminary Kp values from US ground-based magnetometers; (4) an SVM model using the same input data as model 3. To compare these models, we estimate correlation coefficients and RMS errors between the observed Kp and the predicted Kp. We found that model 3 is better than the other models: its correlation coefficient and RMS error are 0.93 and 0.48, respectively. To evaluate the forecasts of geomagnetic storms ($Kp \geq 6$), we present contingency tables and estimate statistical parameters such as probability of detection yes (PODy), false alarm ratio (FAR), bias, and critical success index (CSI). From a comparison of these statistical parameters, we found that the SVM models (model 2 and model 4) are better than the NN models (model 1 and model 3). The PODy and CSI values of model 4 are the highest among these models (PODy: 0.57 and CSI: 0.48). From these results, we suggest that the NN models are better than the SVM models for predicting Kp and that the SVM models are better than the NN models for forecasting geomagnetic storms.
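The contingency-table statistics named above can be computed as in the following Python sketch. The abstract does not give formulas, so the standard categorical forecast verification definitions are assumed here. The function name `forecast_scores` and the example counts are hypothetical and are not the paper's data.

```python
def forecast_scores(hits, misses, false_alarms):
    """Categorical verification scores from a 2x2 contingency table.

    hits:         storm forecast and storm observed
    misses:       storm observed but not forecast
    false_alarms: storm forecast but not observed
    (Correct negatives are not needed for these four scores.)
    """
    observed_yes = hits + misses
    forecast_yes = hits + false_alarms
    return {
        "PODy": hits / observed_yes,             # probability of detection (yes)
        "FAR": false_alarms / forecast_yes,      # false alarm ratio
        "bias": forecast_yes / observed_yes,     # frequency bias
        "CSI": hits / (hits + misses + false_alarms),  # critical success index
    }

# Made-up counts for illustration only.
scores = forecast_scores(hits=8, misses=2, false_alarms=4)
print(scores)
```

A bias above 1 means storms are forecast more often than they are observed. CSI penalizes both misses and false alarms, which is why it is often reported alongside PODy for rare events.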

Analysis of streamflow prediction performance by various deep learning schemes

  • Le, Xuan-Hien;Lee, Giha
    • 한국수자원학회:학술대회논문집 / 한국수자원학회 2021 Annual Conference / pp. 131-131 / 2021
  • Deep learning models, especially those based on long short-term memory (LSTM), have recently shown their superiority in handling time series data. This study aims to comprehensively evaluate the performance of supervised deep learning models in streamflow prediction. Six deep learning models were of interest in this study: standard LSTM, standard gated recurrent unit (GRU), stacked LSTM, bidirectional LSTM (BiLSTM), feed-forward neural network (FFNN), and convolutional neural network (CNN) models. The Red River system, one of the largest river basins in Vietnam, was adopted as a case study. The deep learning models were designed to forecast flowrate one and two days ahead at the Son Tay hydrological station on the Red River, using series of observed flowrate data at seven hydrological stations on three major branches of the Red River system (Thao River, Da River, and Lo River) as the input data for training, validation, and testing. The comparison results indicate that the four LSTM-based models exhibit significantly better performance and stability than the FFNN and CNN models. Moreover, LSTM-based models may reach impressive predictions even in the presence of upstream reservoirs and dams. For the stacked LSTM and BiLSTM models, the added complexity is not accompanied by performance improvement, because their performance is not higher than that of the two standard models (LSTM and GRU). As a result, we found that in hydrological forecasting problems, simple architectures such as LSTM and GRU (with one hidden layer) are sufficient to produce highly reliable forecasts while minimizing computation time, given the sequential nature of the data.
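Framing multi-station flow records as supervised samples for one- and two-day-ahead forecasts might look like the following Python sketch. The abstract does not describe the windowing, so the function `make_windows`, the `lookback` length, and the synthetic arrays are assumptions for illustration only.

```python
import numpy as np

def make_windows(inputs, target, lookback, horizon):
    """Turn aligned daily series into supervised samples.

    inputs:   array of shape (T, n_stations), e.g. flow at upstream stations
    target:   array of shape (T,), flow at the station to be forecast
    lookback: number of past days in each input window (assumed value)
    horizon:  days ahead to forecast (1 or 2 in the study)
    Returns X with shape (N, lookback, n_stations) and y with shape (N,).
    """
    X, y = [], []
    last_start = len(target) - lookback - horizon
    for start in range(last_start + 1):
        end = start + lookback               # first index after the window
        X.append(inputs[start:end])
        y.append(target[end + horizon - 1])  # value `horizon` days after the window
    return np.array(X), np.array(y)

# Synthetic data: 10 days, 7 stations (placeholder values, not observations).
demo_inputs = np.arange(70.0).reshape(10, 7)
demo_target = np.arange(10.0)

X1, y1 = make_windows(demo_inputs, demo_target, lookback=3, horizon=1)
X2, y2 = make_windows(demo_inputs, demo_target, lookback=3, horizon=2)
print(X1.shape, y1.shape, X2.shape, y2.shape)
```

Arrays shaped `(N, lookback, n_stations)` are the usual input layout for recurrent layers such as LSTM and GRU. A feed-forward network would typically receive each window flattened to one vector.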

BEYOND LINEAR PROGRAMMING

  • Smith, Palmer W.;Phillips, J. Donal;Lucas, William H.
    • 한국경영과학회지 / Vol. 3, No. 1 / pp. 81-91 / 1978
  • Decision models are an attempt to reduce uncertainty in the decision making process. The models describe the relationships among variables and, given proper input data, generate solutions to managerial problems. These solutions may not be answers to the problems for one of two reasons. First, the data input into the model may not be consistent with the underlying assumptions of the model being used. Frequently, parameters are assumed to be deterministic when in fact they are probabilistic in nature. The second failure is that the decision maker often recognizes that the available data are not appropriate for the model being used and begins to collect the required data. By the time these data have been compiled, the solution is no longer an answer to the problem. This relates to the timeliness of decision making. The authors point out, through the use of an illustrative problem, that stochastic models are well developed and that they do not suffer from any lack of mathematical exactness. The primary problem is that generally accepted procedures for data generation are historical in nature and not relevant for probabilistic decision models. The authors advocate that management information system designers and accountants become more familiar with these decision models and the input data required for their effective implementation. This will provide these professionals with the background necessary to generate data in a form that makes it relevant and timely for the decision making process.

패널자료를 이용한 가로구간 교통사고분석 - 청주시 간선도로를 사례로 - (Traffic Accident Analysis of Link Sections Using Panel Data in the Case of Cheongju Arterial Roads)

  • 김준용;나희;박병호
    • 한국안전학회지 / Vol. 27, No. 3 / pp. 141-146 / 2012
  • This study deals with an accident model using panel data composed of time series data from 2005 through 2007 and cross-sectional data on link sections in Cheongju. Panel data are collected repeatedly over time from the same sample. The purpose of the study is to develop a traffic accident model using these panel data. To that end, the study pays particular attention to deriving the optimal models among various candidates, including TSCSREG (Time Series Cross Section Regression). The main results are as follows. First, eight panel data models that explain various effects on accidents were developed. Second, the $R^2$ values of the fixed effect models were found to be higher than those of the random effect models. Finally, variables such as the total number of crosswalks at intersections and the total number of intersections were found to be positively related to accidents.

A Comparison of Seasonal Linear Models and Seasonal ARIMA Models for Forecasting Intra-Day Call Arrivals

  • Kim, Myung-Suk
    • Communications for Statistical Applications and Methods / Vol. 18, No. 2 / pp. 237-244 / 2011
  • In the call forecasting literature, both seasonal autoregressive integrated moving average (ARIMA) type models and seasonal linear models have been widely suggested as competing models. However, their forecasting accuracy has not previously been compared side by side in a rigorous way. This study evaluates the accuracy of both the seasonal linear models and the seasonal ARIMA-type models when predicting intra-day call arrival rates using both real and simulated data. The seasonal linear models outperform the seasonal ARIMA-type models in both one-day-ahead and one-week-ahead call forecasting in our empirical study.

Analysis of AI Model Hub

  • Yo-Seob Lee
    • International Journal of Advanced Culture Technology / Vol. 11, No. 4 / pp. 442-448 / 2023
  • Artificial Intelligence (AI) technology has recently grown explosively and is being used in a variety of application fields. Accordingly, the number of AI models is rapidly increasing. AI models are adapted and developed to fit a variety of data types, tasks, and environments, and the variety and volume of models continue to grow. The need to share models and collaborate within the AI community is becoming increasingly important. Collaboration is essential for AI models to be shared and improved publicly and used in a variety of applications. Therefore, with the advancement of AI, the introduction of the Model Hub has become more important, improving the sharing, reuse, and collaboration of AI models and increasing the utilization of AI technology. In this paper, we collect data on model hubs and analyze their characteristics and the AI models they provide. The results of this research can help in developing multimodal AI models, using AI models in various fields, and building services that combine multiple AI models.