• Title/Summary/Keyword: Ordinary Least Squares regression

Exploring Spatial Patterns of Theft Crimes Using Geographically Weighted Regression

  • Yoo, Youngwoo;Baek, Taekyung;Kim, Jinsoo;Park, Soyoung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.35 no.1 / pp.31-39 / 2017
  • The goal of this study was to efficiently analyze the relationships between the number of thefts and related factors, considering the spatial patterns of theft crimes. Theft crime data for a 5-year period (2009-2013) were collected from Haeundae Police Station. The number of theft crimes was used as the dependent variable, and a logarithmic transformation was applied to it to ensure an effective statistical analysis. Related factors were selected through a literature review and divided into social, environmental, and defensive factors. Seven factors were selected as independent variables: the numbers of foreigners, aged persons, single households, companies, entertainment venues, community security centers, and CCTV (Closed-Circuit Television) systems. OLS (Ordinary Least Squares) and GWR (Geographically Weighted Regression) were used to analyze the relationship between the dependent variable and the independent variables. In the GWR results, each independent variable had regression coefficients that differed by location over the study area. The GWR model calculated local values for the variables and could explain the relationships between them more efficiently than the OLS model. Additionally, the adjusted R-squared value of the GWR model was 10% higher than that of the OLS model, and the GWR model produced an AICc (Corrected Akaike Information Criterion) value that was lower by 230, as well as lower Moran's I values. From these results, it was concluded that the GWR model was more robust in explaining the relationship between the number of thefts and the factors related to theft crime.
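The contrast between a single global OLS fit and location-specific GWR coefficients can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the single predictor, and the toy data are all assumptions made for the example, and the function names are hypothetical.

```python
import math

def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def gwr_local_fit(coords, x, y, target, bandwidth):
    """Fit at one location, down-weighting distant observations.

    Assumption: Gaussian distance-decay kernel with a fixed bandwidth.
    """
    w = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    return weighted_linear_fit(x, y, w)

# Hypothetical toy data: the slope is 1 in the west cluster and 3 in the east.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

# Global OLS is the special case where every observation has weight 1.
global_fit = weighted_linear_fit(x, y, [1.0] * len(x))
west_fit = gwr_local_fit(coords, x, y, target=(1, 0), bandwidth=2.0)
east_fit = gwr_local_fit(coords, x, y, target=(11, 0), bandwidth=2.0)
```

The global fit averages the two regimes into one slope, while the local fits recover a different slope near each cluster, which is the behavior the abstract describes as coefficients that differ by location.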

A Study on the User Satisfaction of Demand Response Transport(DRT) by Quantile Regression Analysis (분위회귀분석에 의한 수요응답형교통 이용자 만족도 분석)

  • Jang, Tae Youn;Han, Woo Jin;Kim, Jeong Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.15 no.3 / pp.118-128 / 2016
  • As rural areas have experienced population decline and aging, the service level of public transit has decreased. This study analyzes the factors affecting user satisfaction with demand response transport (DRT), an alternative to rural public transit, using quantile regression, which estimates the conditional median or other quantiles of the response variable. Jeonbuk Province tested DRT operations in Dongsang of Wanju County and Sannae of Jeongup City in 2015. In the basic statistical analysis, user satisfaction with DRT was higher in Wanju than in Jeongup. The quantile regression results show that the difference in satisfaction between the higher and lower quantiles is smaller in Wanju than in Jeongup. Also, Wanju is continuing with a second DRT test operation, as its satisfaction estimated by Ordinary Least Squares (OLS) is close to that of the higher satisfaction quantile.

Short-term Reactive Power Load Forecasting Using Multiple Time-Series Model (다중 시계열 모델을 이용한 단기 부하 무효전력 예측)

  • Lee, Hyo-Sang;Cho, Jong-Man;Park, Woo-Hyun;Kim, Jin-O
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.18 no.5 / pp.105-111 / 2004
  • This paper uses test statistics to show that active and reactive power loads have a significant positive relationship and that there are two types of relationship between them. Investigating the cross plots at every hour, we found that from 0 to 8 hours the relationships are linear, while from 9 to 23 hours they are piecewise linear with two segments. Reactive power loads were also estimated and forecast using the active power load as the explanatory variable with OLS (Ordinary Least Squares) regression. MAPE (Mean Absolute Percentage Error) is calculated for each model for one-hour-ahead forecasting.
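The MAPE criterion mentioned above can be sketched in a few lines. The function name and the load values are hypothetical and only illustrate the standard definition, which requires nonzero actual values.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Assumes every actual value is nonzero, since each error is scaled by it.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)

# Hypothetical reactive power loads and one-hour-ahead forecasts.
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
error_pct = mape(observed, predicted)  # each forecast is off by 10%
```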

Tutorial: Methodologies for sufficient dimension reduction in regression

  • Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.23 no.2 / pp.105-117 / 2016
  • In this paper, as a sequel to the first tutorial, we discuss sufficient dimension reduction methodologies used to estimate the central subspace (sliced inverse regression, sliced average variance estimation), the central mean subspace (ordinary least squares, principal Hessian direction, iterative Hessian transformation), and the central $k^{th}$-moment subspace (covariance method). Large-sample tests to determine the structural dimensions of the three target subspaces have been derived for most of the methodologies; however, a permutation test, which does not require large-sample distributions, is introduced here. The test can be applied to the methodologies discussed in the paper. Theoretical relationships among the sufficient dimension reduction methodologies are also investigated, and a real data analysis is presented for illustration. A seeded dimension reduction approach is then introduced so that the methodologies can be applied to large p, small n regressions.

Application of Logit Model in Qualitative Dependent Variables (로짓모형을 이용한 질적 종속변수의 분석)

  • Lee, Kil-Soon;Yu, Wann
    • Journal of Families and Better Life / v.10 no.1 s.19 / pp.131-138 / 1992
  • Regression analysis has become a standard statistical tool in the behavioral sciences. Because of its widespread popularity, regression has often been misused. Such is the case when the dependent variable is a qualitative measure rather than a continuous, interval measure. Regression estimates with a qualitative dependent variable do not meet the assumptions underlying regression, which can lead to serious errors in standard statistical inference. The logit model is recommended as an alternative to the regression model for qualitative dependent variables. Researchers can employ this model to measure the relationship between independent variables and qualitative dependent variables without assuming that the logit model was derived from probabilistic choice theory. Coefficients in the logit model are typically estimated by Maximum Likelihood Estimation, in contrast to the ordinary regression model, which is estimated by Least Squares Estimation. Goodness of fit in the logit model is based on the likelihood ratio statistic, and the t-statistic is used for testing the null hypothesis.
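To make the contrast with least squares concrete, the following sketch fits a one-predictor logit model by maximum likelihood and computes a likelihood ratio statistic against an intercept-only model. It is an illustration under assumptions: the Newton-Raphson solver, the fixed iteration count, the function names, and the toy data are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(a, b, x, y):
    """Bernoulli log-likelihood of the logit model P(y=1) = sigmoid(a + b*x)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(a + b * xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total

def fit_logit(x, y, iters=25):
    """Maximum likelihood via Newton-Raphson (assumes data are not separable)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        p = [sigmoid(a + b * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical binary outcome that becomes more likely as x grows.
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [0, 0, 1, 0, 1, 1]

a_hat, b_hat = fit_logit(x_demo, y_demo)

# Likelihood ratio statistic against the intercept-only model.
p_null = sum(y_demo) / len(y_demo)
ll_null = sum(math.log(p_null) if yi == 1 else math.log(1.0 - p_null) for yi in y_demo)
lr_stat = 2.0 * (log_likelihood(a_hat, b_hat, x_demo, y_demo) - ll_null)
```

Unlike a least squares fit to the 0/1 outcome, the fitted probabilities here always stay strictly between 0 and 1.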

Wage Determinants Analysis by Quantile Regression Tree

  • Chang, Young-Jae
    • Communications for Statistical Applications and Methods / v.19 no.2 / pp.293-301 / 2012
  • Quantile regression, proposed by Koenker and Bassett (1978), is a statistical technique that estimates conditional quantiles. Its advantage over ordinary least squares (OLS) regression is its robustness to large outliers. A regression tree approach has been applied to OLS problems to fit flexible models. Loh (2002) proposed the GUIDE algorithm, which has negligible selection bias and relatively low computational cost. Quantile regression can be regarded as an analogue of OLS, so it can also be applied within the GUIDE regression tree method. Chaudhuri and Loh (2002) proposed a nonparametric quantile regression method that blends key features of piecewise polynomial quantile regression and tree-structured regression based on adaptive recursive partitioning. Lee and Lee (2006) investigated wage determinants in the Korean labor market using the Korean Labor and Income Panel Study (KLIPS). Following Lee and Lee, we fit three kinds of quantile regression tree models to KLIPS data at the quantiles 0.05, 0.2, 0.5, 0.8, and 0.95. Among the three models, the multiple linear piecewise quantile regression model forms the shortest tree structure, while the piecewise constant quantile regression model generally has a deeper tree structure with more terminal nodes. Age, gender, marital status, and education appear to be determinants of the wage level across the quantiles; in addition, education experience appears as an important determinant of the wage level in the highly paid group.
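The robustness claim can be illustrated with the check (pinball) loss that quantile regression minimizes. The sketch below covers only the intercept-only case, where the minimizer is a sample quantile; the wage figures and function names are hypothetical, and this is not the GUIDE tree procedure.

```python
def pinball_loss(y, pred, tau):
    """Check loss for quantile level tau, summed over observations."""
    return sum((tau if yi >= pred else tau - 1.0) * (yi - pred) for yi in y)

def best_constant(y, tau):
    """Constant prediction minimizing the pinball loss.

    A minimizer is always attained at an observed value, so searching
    the data points is sufficient for this small illustration.
    """
    return min(y, key=lambda c: pinball_loss(y, c, tau))

# Hypothetical wages with one very large outlier.
wages = [20.0, 22.0, 25.0, 27.0, 30.0, 33.0, 400.0]

mean_wage = sum(wages) / len(wages)       # the least squares fit of a constant
median_wage = best_constant(wages, 0.5)   # the tau = 0.5 quantile fit
upper_wage = best_constant(wages, 0.8)    # a higher quantile
```

The outlier pulls the mean well above most observations, while the median fit stays in the middle of the bulk of the data.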

Exploring NDVI Gradient Varying Across Landform and Solar Intensity using GWR: a Case Study of Mt. Geumgang in North Korea (GWR을 활용한 NDVI와 지형·태양광도의 상관성 평가 : 금강산 지역을 사례로)

  • Kim, Jun Woo;Um, Jung Sup
    • Journal of Korean Society for Geospatial Information Science / v.21 no.4 / pp.73-81 / 2013
  • Ordinary least squares (OLS) regression has been the primary statistical method in previous studies of vegetation distribution patterns in relation to landform. However, this global regression cannot uncover location-specific relationships or account for spatial autocorrelation in model residuals. This study employed geographically weighted regression (GWR) to examine the spatially varying relationships between NDVI (Normalized Difference Vegetation Index) patterns and the changing trends of landform (elevation, slope) and solar intensity (insolation and duration of sunshine) at Mt. Geumgang in North Korea. The results indicated that GWR was more powerful than OLS in interpreting the relationships between NDVI patterns and landform and solar intensity, since GWR was characterized by a higher adjusted $R^2$ and reduced spatial autocorrelation in model residuals. Unlike OLS regression, GWR allowed the coefficients of explanatory variables to differ by locality by giving relatively more weight to NDVI patterns that are affected by local landform and solar factors. The strength of the regression relationships in the GWR increased significantly, showing a regression coefficient higher than 70% (0.744) on the southern ridge of the experimental area. It is anticipated that this research will help make vegetation monitoring in relation to landform and solar intensity more scientific and objective by overcoming serious constraints of past non-GWR-based approaches.

Regional Low Flow Frequency Analysis Using Bayesian Multiple Regression (Bayesian 다중회귀분석을 이용한 저수량(Low flow) 지역 빈도분석)

  • Kim, Sang-Ug;Lee, Kil-Seong
    • Journal of Korea Water Resources Association / v.41 no.3 / pp.325-340 / 2008
  • This study employs Bayesian multiple regression analysis using the ordinary least squares method for regional low flow frequency analysis. The parameter estimates from the Bayesian multiple regression analysis were compared with those from a conventional analysis using the t-distribution. In these comparisons, the mean values from the t-distribution and the Bayesian analysis at each return period are not significantly different. However, the difference between the upper and lower limits is remarkably reduced with the Bayesian multiple regression. Therefore, from the point of view of uncertainty analysis, Bayesian multiple regression analysis is more attractive than the conventional method based on a t-distribution, because the low flow sample size at the site of interest is typically insufficient to perform low flow frequency analysis. We also performed low flow prediction, including confidence intervals, at two ungauged catchments in the Nakdong River basin using the developed Bayesian multiple regression model. The Bayesian prediction proved effective for inferring the low flow characteristics at the ungauged catchments.

A Study on Change of Logistics in the region of Seoul, Incheon, Kyunggi (물류예측모형에 관한 연구 -수도권 물동량 예측을 중심으로-)

  • Roh Kyung-Ho
    • Management & Information Systems Review / v.7 / pp.427-450 / 2001
  • This research proposes a methodology for estimating logistics. The paper elucidates the main problems associated with estimation in the regression model. We review the methods for estimating the parameters in the model and introduce a modified procedure in which all models are fitted and combined to construct a combination of estimates. The resulting estimators are found to be as efficient as the maximum likelihood (ML) estimators in various cases. Our method requires more computation but has an advantage for large data sets. It also makes it possible to detect particular features in the data structure. Examples of real data are used to illustrate the properties of the estimators. The motivation for estimating the logistic regression model is the growing importance of the logistics environment today. In the first phase, we conduct an exploratory study of 9 independent variables. In the second phase, we try to find the best-fitting logistic regression model. In the third phase, we calculate the logistics estimates using the logistic regression model. The parameters of the logistic regression model were estimated using ordinary least squares regression. The standard assumptions of OLS estimation were tested. The calculated value of the F-statistic for the logistic regression model is significant at the 5% level. The logistic regression model also explains a significant amount of variance in the dependent variable. The parameter estimates of the logistic regression model, with t-statistics in parentheses, are presented in Table. The object of this paper is to find the best logistic regression model for estimating logistics with comparative accuracy.

A study on the properties of sensitivity analysis in principal component regression and latent root regression (주성분회귀와 고유값회귀에 대한 감도분석의 성질에 대한 연구)

  • Shin, Jae-Kyoung;Chang, Duk-Joon
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.321-328 / 2009
  • In regression analysis, the ordinary least squares estimates of regression coefficients become poor when the correlations among predictor variables are high. This phenomenon, called multicollinearity, causes serious problems in actual data analysis. Many methods have been proposed to overcome multicollinearity, including ridge regression, shrinkage estimators, and methods based on principal component analysis (PCA) such as principal component regression (PCR) and latent root regression (LRR). In the last decade, many statisticians have discussed sensitivity analysis (SA) in ordinary multiple regression and the same topic in PCR, LRR, and logistic principal component regression (LPCR). PCA plays an important role in those methods, and many statisticians have discussed SA in PCA and related multivariate methods. We introduce the methods of PCR and LRR. We also introduce the methods of SA in PCR and LRR and discuss their properties.
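Why high predictor correlation degrades OLS estimates can be sketched with the variance inflation factor (VIF), a standard diagnostic that is not part of the paper's own method. For two predictors with correlation $r$, the variance of each OLS coefficient is inflated by $1/(1 - r^2)$ relative to the uncorrelated case. The function names and data below are hypothetical.

```python
import math

def pearson_r(u, v):
    """Sample Pearson correlation between two equal-length sequences."""
    mu = sum(u) / len(u)
    mv = sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def vif_two_predictors(x1, x2):
    """VIF shared by both coefficients in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictors: x2_near is almost a copy of x1, x2_orth is uncorrelated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_near = [1.1, 1.9, 3.2, 3.9, 5.1]
x2_orth = [2.0, -1.0, -2.0, -1.0, 2.0]

vif_near = vif_two_predictors(x1, x2_near)
vif_orth = vif_two_predictors(x1, x2_orth)
```

A VIF near 1 means the predictors carry separate information, while a very large value signals the kind of instability that ridge regression, PCR, and LRR are designed to address.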
