• Title/Abstract/Keywords: Linear regression models

961 results found in 0.031 seconds

Estimating small area proportions with kernel logistic regressions models

  • Shim, Jooyong;Hwang, Changha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 25, No. 4
    • /
    • pp.941-949
    • /
    • 2014
  • The unit-level logistic regression model with mixed effects has been used for estimating small area proportions. It treats the spatial effects as random effects and assumes linearity between the logistic link and the covariates. However, when the relationship between the logistic link and the covariates is not linear, this assumption may lead to biased estimators of the small area proportions. In this paper, we relax the linearity assumption and propose two types of kernel-based logistic regression models for estimating small area proportions. We also demonstrate the efficiency of our proposed models using simulated data and real data.

Cumulative Sums of Residuals in GLMM and Its Implementation

  • Choi, DoYeon;Jeong, KwangMo
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 21, No. 5
    • /
    • pp.423-433
    • /
    • 2014
  • Test statistics using cumulative sums of residuals have been widely used in various regression models, including generalized linear models (GLM). Recently, Pan and Lin (2005) extended this testing procedure to generalized linear mixed models (GLMM) with random effects, where it is difficult to compute the marginal likelihood because it is expressed as an integral over the random effects distribution. The Gaussian quadrature algorithm is commonly used to approximate the marginal likelihood. Many commercial statistical packages provide an option to apply this type of goodness-of-fit test in GLMs, but programs for GLMMs are very rare. We suggest a computational algorithm that implements the testing procedure in GLMMs using a freely accessible R package, and illustrate it through practical examples.
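
The idea of a cumulative-residual diagnostic can be sketched in a few lines. The following is a minimal illustration for an ordinary straight-line fit, not the GLMM procedure of Pan and Lin (2005) and not the authors' R implementation: it omits the resampling step needed for p-values, and the function names, the 1/sqrt(n) scaling, and the toy data are assumptions made for this example.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single covariate."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cumulative_residuals(x, y):
    """Running sum of residuals, ordered by the covariate, scaled by 1/sqrt(n)."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ordered = [r for _, r in sorted(zip(x, residuals), key=lambda pair: pair[0])]
    n = len(x)
    running, path = 0.0, []
    for r in ordered:
        running += r
        path.append(running / n ** 0.5)
    return path


# Toy data (hypothetical): a quadratic response fitted with a straight line.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [xi ** 2 for xi in x]
path = cumulative_residuals(x, y)
sup_statistic = max(abs(v) for v in path)
```

A large `sup_statistic` relative to its reference distribution would suggest a misspecified functional form; obtaining that reference distribution is the part this sketch leaves out.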

Quantitative Structure Activity Relationship Prediction of Oral Bioavailabilities Using Support Vector Machine

  • Fatemi, Mohammad Hossein;Fadaei, Fatemeh
    • 대한화학회지
    • /
    • Vol. 58, No. 6
    • /
    • pp.543-552
    • /
    • 2014
  • A quantitative structure-activity relationship (QSAR) study was performed to model and predict the oral bioavailabilities of a diverse set of 216 drugs. After calculation and screening of molecular descriptors, linear and nonlinear models were developed using multiple linear regression (MLR), artificial neural network (ANN), support vector machine (SVM) and random forest (RF) techniques. Comparison of the statistical parameters of these models indicates that SVM is more suitable than the other models. The root mean square errors of the SVM model were 5.933 and 4.934 for the training and test sets, respectively. The robustness and reliability of the developed SVM model were evaluated with a leave-many-out cross-validation test, which produced the statistics $Q^2_{SVM}=0.603$ and SPRESS = 7.902. Moreover, the chemical applicability domain of the model was determined via the leverage approach. The results of this study demonstrate the applicability of the QSAR approach using SVM to the prediction of oral bioavailability of drugs.
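
For readers unfamiliar with the reported statistics, the sketch below shows one common way to compute a root mean square error and a cross-validated $Q^2$ from observed values and held-out predictions. It assumes the definition $Q^2 = 1 - \mathrm{PRESS}/SS_{total}$; the paper may use a different variant, and the numbers here are invented for illustration, not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def q_squared(observed, cv_predicted):
    """Cross-validated Q^2 = 1 - PRESS / SS_total (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total


# Hypothetical values: each prediction is assumed to come from a model
# fitted without the corresponding observation (held-out prediction).
observed = [10.0, 20.0, 30.0, 40.0]
cv_predicted = [12.0, 18.0, 33.0, 39.0]

error = rmse(observed, cv_predicted)
q2 = q_squared(observed, cv_predicted)
```

The functions only score predictions; producing the held-out predictions themselves requires refitting the model on each training split.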

Matrix Formation in Univariate and Multivariate General Linear Models

  • Arwa A. Alkhalaf
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 24, No. 4
    • /
    • pp.44-50
    • /
    • 2024
  • This paper offers an overview of matrix formation and calculation techniques within the framework of General Linear Models (GLMs). It takes a sequential approach, beginning with a detailed exploration of matrix formation and calculation methods in regression analysis and univariate analysis of variance (ANOVA). Subsequently, it extends the discussion to cover multivariate analysis of variance (MANOVA). The primary objective of this study was to provide a clear and accessible explanation of the underlying matrices that play a crucial role in GLMs. It does so by linking essentially different statistical methods through the fundamental principles and algebraic foundations that underpin GLM estimation. The insights presented here aim to assist researchers, statisticians, and data analysts in enhancing their understanding of GLMs and their practical implementation in diverse research domains. This paper contributes to a better comprehension of the matrix-based techniques that can be extended to GLMs.
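
As a concrete instance of the matrix view, the sketch below forms a design matrix with an intercept column and solves the normal equations $X^\top X \beta = X^\top y$ for a regression. It is a minimal standard-library illustration, not code from the paper; the helper names and the toy data are assumptions, and the small Gauss-Jordan solver stands in for a proper linear algebra library.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a x = b for square a by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(a[i]) + [b[i]] for i in range(n)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """Estimate beta from the normal equations (X'X) beta = X'y."""
    X = [[1.0] + list(row) for row in x_rows]  # prepend the intercept column
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)


# Toy data (hypothetical) generated from y = 1 + 2x with no noise.
x_rows = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta = ols(x_rows, y)  # intercept and slope
```

Under this formulation, a one-way ANOVA is the same computation with indicator (dummy) columns in place of the continuous covariate.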

분광분석법을 이용한 단립 쌀의 함수율 및 단백질 함량 예측모델 개발 (Development of Prediction Model for Moisture and Protein Content of Single Kernel Rice using Spectroscopy)

  • 김재민;최창현;민봉기;김종훈
    • Journal of Biosystems Engineering
    • /
    • Vol. 23, No. 1
    • /
    • pp.49-56
    • /
    • 1998
  • The objectives of this study were to develop models to predict the moisture and protein contents of a single kernel of brown rice based on a visible/NIR (near-infrared) spectroscopic technique. The reflectance spectra of rice were obtained over the wavelength range of 400 to 2,500 nm at 2 nm intervals. Multiple linear regression (MLR) and partial least squares (PLS) were used to develop the models. The MLR model using the first derivative spectra (10 nm gap) with Standard Normal Variate and Detrending (SNV and Drt.) preprocessing showed the best results for predicting the moisture content of a single kernel of brown rice. To predict the protein content of a single kernel of brown rice, the PLS model used the raw spectra with multiplicative scatter correction (MSC) preprocessing over the wavelength range of 1,100~1,500 nm.

Is it Possible to Predict the ADI of Pesticides using the QSAR Approach?

  • Kim, Jae Hyoun
    • 한국환경보건학회지
    • /
    • Vol. 38, No. 6
    • /
    • pp.550-560
    • /
    • 2012
  • Objectives: QSAR methodology was applied to explain two different sets of acceptable daily intake (ADI) data of 74 pesticides proposed by both the USEPA and WHO in terms of setting guidelines for food and drinking water. Methods: A subset of calculated descriptors was selected from Dragon® software. QSARs were then developed utilizing a statistical technique, genetic algorithm-multiple linear regression (GA-MLR). The differences in each specific model in the prediction of the ADI of the pesticides were discussed. Results: The stepwise multiple linear regression analysis resulted in a statistically significant QSAR model with five descriptors. Resultant QSAR models were robust, showing good utility across multiple classes of pesticide compounds. The applicability domain was also defined. The proposed models were robust and satisfactory. Conclusions: The QSAR model could be a feasible and effective tool for predicting ADI and for the comparison of logADIEPA to logADIWHO. The statistical results agree with the fact that USEPA focuses on more subtle endpoints than does WHO.

텐서 스플라인 모형 선택에 관한 연구 (A study on selection of tensor spline models)

  • 구자용
    • 응용통계연구
    • /
    • Vol. 5, No. 2
    • /
    • pp.181-192
    • /
    • 1992
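  • This paper considers the problem of estimating the regression function of a generalized linear model in a purely data-driven way using tensor splines. The regression function is estimated by maximum likelihood. The tensor splines used have a finite number of knots and are constrained to be linear near the boundary of the domain of the independent variables. The knots are placed at order statistics of each coordinate of the data, and their number is chosen to minimize a modified form of AIC. The estimator is illustrated through simulated examples.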

Evaluation of mathematical models for prediction of slump, compressive strength and durability of concrete with limestone powder

  • Bazrafkan, Aryan;Habibi, Alireza;Sayari, Arash
    • Advances in concrete construction
    • /
    • Vol. 10, No. 6
    • /
    • pp.463-478
    • /
    • 2020
  • Multiple mathematical models for predicting slump, compressive strength, and depth of water penetration at 28 days were developed by statistical analysis for concrete containing waste limestone powder as a partial replacement for sand, using data from the experimental program reported in this research. To obtain the experimental data, 180 cubic concrete samples with 20 different mix designs were investigated. Twenty nonlinear regression models were used to predict each of the concrete properties, namely slump, compressive strength, and water penetration depth. Evaluation of the models using numerical methods showed that the majority give acceptable predictions with high accuracy and small error rates. The 15-term regression models for slump, compressive strength, and water penetration depth agreed best with the tested concrete specimens.

국내 회전교차로의 추돌사고 모형 개발 (Developing Rear-End Collision Models of Roundabouts in Korea)

  • 박병호;백태헌
    • 한국안전학회지
    • /
    • Vol. 29, No. 6
    • /
    • pp.151-157
    • /
    • 2014
  • This study deals with rear-end collisions at roundabouts. Its purpose is to develop accident models of rear-end collisions in Korea. To that end, the study gives particular attention to developing appropriate models using Poisson, negative binomial, ZAM, and multiple linear and nonlinear regression models, together with statistical analysis tools. The main results are as follows. First, the Vuong statistics and overdispersion parameters indicate that ZIP is the most appropriate of the count data models. Second, RMSE, MPB, MAD and correlation coefficient tests show that the multiple nonlinear model is the most suitable for the rear-end collision data. Finally, independent variables such as traffic volume, ratio of heavy vehicles, number of circulatory roadway lanes, and number of crosswalks and stop lines are adopted in the optimal model.
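
Why a plain Poisson model may be inadequate for crash counts can be seen with two simple descriptive checks. The sketch below is not the Vuong test and not the authors' analysis; it computes a variance-to-mean ratio and compares the observed share of zeros with the share a Poisson distribution with the same mean would imply. The function names and the counts are hypothetical.

```python
import math


def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 under a Poisson model."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


def excess_zero_share(counts):
    """Observed share of zeros minus the Poisson-implied share exp(-mean)."""
    n = len(counts)
    mean = sum(counts) / n
    observed = sum(1 for c in counts if c == 0) / n
    return observed - math.exp(-mean)


# Hypothetical rear-end collision counts at ten roundabouts.
counts = [0, 0, 0, 0, 0, 1, 0, 2, 0, 7]

index = dispersion_index(counts)
extra_zeros = excess_zero_share(counts)
```

An index well above 1 points toward overdispersion (favoring negative binomial models), and a clearly positive `extra_zeros` points toward zero-inflated or zero-altered models; formal model comparison still requires fitting the candidates.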

Predicting the Young's modulus of frozen sand using machine learning approaches: State-of-the-art review

  • Reza Sarkhani Benemaran;Mahzad Esmaeili-Falak
    • Geomechanics and Engineering
    • /
    • Vol. 34, No. 5
    • /
    • pp.507-527
    • /
    • 2023
  • Accurate estimation of geo-mechanical parameters in Artificial Ground Freezing (AGF) is an important scientific topic in soil improvement and geotechnical engineering. One way to do this is to use classical, conventional constitutive models based on theories such as critical state theory and Hooke's law, which are time-consuming, costly, and cumbersome. Another is to apply artificial intelligence (AI) techniques to predict the parameters and behaviors of interest accurately. This study presents a comprehensive data-mining-based model for predicting the Young's modulus of frozen sand under the triaxial test. For this purpose, several single and hybrid models were considered, including additive regression, bagging, M5-Rules, M5P, random forests (RF), support vector regression (SVR), locally weighted linear (LWL), Gaussian process regression (GPR), and multi-layered perceptron neural network (MLP). Cell pressure, strain rate, temperature, time, and strain were the input variables, and the Young's modulus was the target. The results showed that all selected single and hybrid predictive models agree acceptably with the measured experimental results. In particular, the hybrid Additive Regression-Gaussian Process Regression and Bagging-Gaussian Process Regression models have the best accuracy according to the model performance assessment criteria.