• Title/Summary/Keyword: stepwise regression model

The Selection of the Suitable Site for Forest Tree(Pinus thunbergii) (임목(林木)((해송(海松)) 적지선정(適地選定)에 관한 연구(硏究))

  • Chung, Young Gwan;Park, Nam Chang;Son, Yeong Mo
    • Journal of Korean Society of Forest Science / v.82 no.4 / pp.420-430 / 1993
  • This study investigated the effects of forest environmental factors (5 items) and physico-chemical soil properties (13 items) on the growth of Pinus thunbergii stands. A total of 218 plots were sampled across the coastal districts of the whole country. In the statistical analysis, the explanatory variables were the 18 soil and environmental factors, and the response variable was the site index of Pinus thunbergii stands. The data were processed in the following order: preparation of the original data, computation of the correlation matrix by correlation analysis, calculation of partial correlation coefficients and coefficients of determination, estimation of the regression equation by stepwise regression analysis, and stepwise regression analysis on the factor scores obtained from factor analysis. The main results are summarized as follows. 1. The site index of Pinus thunbergii stands was highly correlated with effective soil depth (r=0.8668), slope percentage, organic matter, and total nitrogen. 2. According to the partial correlation coefficients, effective soil depth (r=0.6270), slope percentage (r=-0.5423), and base saturation (r=0.3278) among the environmental factors had a great effect on tree growth. 3. In the stepwise regression analysis, the factors affecting the growth of Pinus thunbergii stands were effective soil depth, slope percentage, organic matter, base saturation, soil pH, silt content, exchangeable Ca, and others. 4. The estimation equation for the site index of Pinus thunbergii stands was $Y=13.2691+0.0242\;X_2-1.2244\;X_4+0.6142\;X_5-0.3472\;X_{11}+0.0355\;X_{13}+0.1552\;X_{15}-0.1002\;X_{17}$. The coefficient of determination for the estimation model was 0.77, which was significant at the 1 percent level. 5. Factor analysis of the environmental factors yielded 6 principal components (factors), with a communality contribution of 71.1 percent. 6. In the stepwise regression analysis between the factor scores and the site index of Pinus thunbergii stands, 5 principal components affected the site index. The coefficient of determination was 85 percent, which was significant at the 1 percent level. In conclusion, stepwise regression analysis proved to be highly effective for analyzing which factors affect tree height growth in Pinus thunbergii stands. Pinus thunbergii stands should also be managed with the growth factors selected above in mind.
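
The stepwise selection described in the abstract above can be illustrated with a minimal forward-selection sketch. This is not the authors' procedure: the function names `r_squared` and `forward_stepwise`, the stopping rule (a minimum gain in R² called `min_gain`), and the synthetic data are all assumptions made for illustration. Only the sample size of 218 echoes the abstract.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    total = y - y.mean()
    return 1.0 - (resid @ resid) / (total @ total)

def forward_stepwise(X, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the column that raises R^2
    the most, and stop when the best gain is below min_gain (assumed rule)."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [(r_squared(X[:, selected + [j]], y), j) for j in remaining]
        r2, j = max(scores)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return selected, best_r2

# Synthetic stand-in data: only columns 1 and 4 actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(218, 6))
y = 13.0 + 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(scale=0.5, size=218)

selected, r2 = forward_stepwise(X, y)
print(sorted(selected), round(r2, 2))
```

Classical stepwise regression usually enters and removes variables based on F-tests or p-values; the R² gain threshold here is a simpler stand-in that keeps the sketch short.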

FACTORS AFFECTING PRODUCTIVITY ON DAIRY FARMS IN TROPICAL AND SUB-TROPICAL ENVIRONMENTS

  • Kerr, D.V.;Davison, T.M.;Cowan, R.T.;Chaseling, J.
    • Asian-Australasian Journal of Animal Sciences / v.8 no.5 / pp.505-513 / 1995
  • The major factors affecting productivity on dairy farms in Queensland, Australia, were determined using the stepwise linear regression approach. The data were obtained from a survey conducted on the total population of dairy farms in Queensland in 1987. These data were divided into six major dairying regions. The technique was applied using 12 independent variables believed by a panel of experienced research and extension personnel to exert the most influence on milk production. The regression equations were all significant (p < 0.001), with coefficients of determination ranging from 62 to 76% for equations developed using total farm milk production as the dependent variable. Three of the variables affecting total farm milk production were found to be common to all six regions: the amount of supplementary energy fed, the area set aside to irrigate winter feed, and the size of the area used for dairying. Higher-production farms appeared to be more efficient in that they consistently achieved milk production levels higher than those estimated from the regression equation for their region. Other methods of analysis, including robust regression and non-linear regression techniques, were unsuccessful in overcoming this problem and allowing development of a model appropriate for farms at all levels of production.

Factors of Predicting Difficulty of Mathematics Test Items in College Scholastic Ability Test (고등학교 수리영역 시험의 난이도 예측 요인 분석)

  • Ko, Ho-Kyoung;Yi, Hyun-Sook
    • Journal of the Korean School Mathematics Society / v.10 no.1 / pp.113-127 / 2007
  • This study explored the possibility of building a statistical model that predicts the difficulty of mathematics test items through analysis of nation-wide scholastic ability test results from the past 5 years. Multiple linear regression analysis was conducted to predict the difficulty of mathematics test items. We adopted three major areas for the independent variables: the content area, the behavior area, and the test item format area, each of which was categorized into more detailed sub-areas. For the dependent variable, the proportion of correct answers was used to represent item difficulty. Statistically significant independent variables were included in the regression model based on the stepwise selection method. Several important factors affecting the difficulty of mathematics test items were identified for each area. R-squared values for the final regression model were fairly high, implying that the regression equation can be used to predict the difficulty of test items at an acceptable level. Lastly, the regression model was cross-validated using independently collected data. We believe that this study provides basic but critical information for predicting the proportion of correct answers by showing the factors that should be considered when developing mathematics test items for the college entrance examination or high school classroom tests.

A Study on the Level of Family Adaptation to Schizophrenic Patients: An Application of the Family Resiliency Model (가족탄력 모델을 이용한 정신분열병 환자가족의 부적응에 관한 연구)

  • Lee, Eun-Hee
    • Korean Journal of Social Welfare / v.41 / pp.173-200 / 2000
  • The purpose of this study is to examine the variables that may influence the level of family adaptation to schizophrenic patients using the Family Resiliency Model. The Family Resiliency Model is the most recent extension of the family stress model. According to the Family Resiliency Model, the level of family adaptation in the face of a crisis situation is determined by a number of interacting components. The subjects were 151 family members of schizophrenic patients. The results of the research were as follows: 1) The following variables were significantly correlated with family adaptation: family income, educational level of the family, intimacy between family and patient, knowledge of schizophrenia, and recognition of the prognosis of schizophrenia. 2) The factors that compose the Family Resiliency Model were significantly correlated with the level of family adaptation. 3) Stepwise multiple regression analysis indicated that the factors predicting the level of family adaptation were family control, the quality of family communication, and support from the extended family. These findings have significant practical implications for social work intervention.

Consideration on Precedence of Crime Occurrence on Stock Price of Security Company (범죄 발생의 경비업체 주가에 대한 선행성 고찰)

  • Joo, Il-Yeob
    • Korean Security Journal / no.34 / pp.313-336 / 2013
  • The purpose of this study is to derive an optimal regression model of a security company's stock price on the occurrences of major crimes by identifying whether those occurrences precede the stock price and how the two are related. The results are as follows. First, according to cross-correlation analysis of the occurrences of major crimes (murder, robbery, rape, theft, and violence) against the security company's monthly stock price, the occurrences of murder, robbery, rape, and theft move simultaneously with the monthly stock price, whereas the occurrence of violence precedes the monthly stock price by 6 months. Second, according to multiple regression analysis (stepwise method) of the security company's monthly stock price on the occurrences of the five major crimes, the occurrences of robbery, rape, and theft explain 61.7% ($R^2$ = .617) of the monthly stock price, with murder and violence excluded from the model.
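
The idea of one series "preceding" another in a cross-correlation analysis can be sketched as follows. This is a hedged illustration on synthetic data, not the study's data or code. The helper names `pearson` and `lagged_corr`, the 120-month length, and the built-in 6-step lead are assumptions chosen to mirror the 6-month precedence reported above.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def lagged_corr(x, y, lag):
    """Correlation between x[t] and y[t + lag] for lag >= 0 (x leads y)."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])

# Synthetic monthly series in which x leads y by 6 steps (assumed setup).
rng = random.Random(1)
n, true_lag = 120, 6
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(true_lag)]
y += [0.8 * x[t - true_lag] + rng.gauss(0, 0.3) for t in range(true_lag, n)]

corrs = {lag: lagged_corr(x, y, lag) for lag in range(13)}
best_lag = max(corrs, key=corrs.get)
print(best_lag, round(corrs[best_lag], 2))
```

A peak at lag 0 would correspond to series that move simultaneously, while a peak at a positive lag suggests that the first series leads the second.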

Approximate Optimization of High-speed Train Shape and Tunnel Condition to Reduce the Micro-pressure Wave (미기압파 저감을 위한 고속전철 열차-터널 조건의 근사최적설계)

  • Kim, Jung-Hui;Lee, Jong-Soo;Kwon, Hyeok-Bin
    • Proceedings of the KSME Conference / 2004.04a / pp.1023-1028 / 2004
  • A micro-pressure wave is generated when a high-speed train enters a tunnel, and it causes explosive noise and vibration at the exit. Train speed, train-tunnel area ratio, nose slenderness, and nose shape are known to be the main influences on the generation of the micro-pressure wave. It is therefore necessary to minimize the wave by searching for optimal values of these train shape factors and tunnel conditions. In this study, a response surface model, one type of approximation model, is used to perform the optimization efficiently and to analyze the sensitivity of the design variables. Owen's randomized orthogonal array and D-optimal design are used to construct the response surface model, and stepwise regression is applied to increase the accuracy of the model. Finally, the SQP (Sequential Quadratic Programming) optimization algorithm is applied to the resulting approximation model to minimize the maximum micro-pressure wave.

Risk Prediction Using Genome-Wide Association Studies on Type 2 Diabetes

  • Choi, Sungkyoung;Bae, Sunghwan;Park, Taesung
    • Genomics & Informatics / v.14 no.4 / pp.138-148 / 2016
  • The success of genome-wide association studies (GWASs) has enabled us to improve risk assessment and provide novel genetic variants for diagnosis, prevention, and treatment. However, most variants discovered by GWASs have been reported to have very small effect sizes on complex human diseases, which has been a big hurdle in building risk prediction models. Recently, many statistical approaches based on penalized regression have been developed to solve the "large p and small n" problem. In this report, we evaluated the performance of several statistical methods for predicting a binary trait: stepwise logistic regression (SLR), least absolute shrinkage and selection operator (LASSO), and Elastic-Net (EN). We first built a prediction model by combining variable selection and prediction methods for type 2 diabetes using Affymetrix Genome-Wide Human SNP Array 5.0 from the Korean Association Resource project. We assessed the risk prediction performance using area under the receiver operating characteristic curve (AUC) for the internal and external validation datasets. In the internal validation, SLR-LASSO and SLR-EN tended to yield more accurate predictions than other combinations. During the external validation, the SLR-SLR and SLR-EN combinations achieved the highest AUC of 0.726. We propose these combinations as a potentially powerful risk prediction model for type 2 diabetes.
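
The AUC used above as the performance measure can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. A minimal sketch of that pairwise definition follows. The function name `auc` and the toy scores and labels are assumptions for illustration and are unrelated to the study's data.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly,
    counting ties as half a correct ranking."""
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Toy predicted risks: 1 = case, 0 = control.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))
```

The pairwise form is quadratic in the sample size, so a rank-based implementation would be preferable for large validation sets.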

Eutrophication of Nakdong River and Statistical Analysis of Environmental Factors (낙동강 부영양화와 수질환경요인의 통계적 분석)

  • Kim, Mi-Suk;Chung, Young-Ryun;Suh, Euy-Hoon;Song, Won-Sup
    • ALGAE / v.17 no.2 / pp.105-115 / 2002
  • The influences of various environmental factors on the eutrophication of the Nakdong River were analyzed statistically using water samples collected from 1 January 1999 to 30 September 2001 in the Namji area. The relationships between the concentration of chlorophyll a (the eutrophication index) and environmental factors were analyzed to develop a statistical model that can predict the status of eutrophication. The concentration of chlorophyll a ranged from 66.2 mg · $m^{-3}$ to 70.8 mg · $m^{-3}$ during the dry winter season, and the average concentration during the study period was 35.5 mg · $m^{-3}$. The Namji area of the Nakdong River was in the hypereutrophic stage in terms of water quality. Stephanodiscus sp. and Aulacoseira granulata var. angustissima were the dominant species during the winter to spring period and the summer to autumn period, respectively. Based on the correlation analysis and the analysis of variance between chlorophyll a concentration and environmental factors, significantly high positive relationships were found in the order BOD > pH > COD > KMnO₄ consumption > DO > conductivity > alkalinity. In contrast, significantly negative relationships were found in the order $PO_4^{3-}$-P > water level > the rate of Namgang-dam discharge > NH₃-N > the rate of Andong-dam discharge > the rate of Hapcheon-dam discharge. Factor analysis of the environmental factors affecting the concentration of chlorophyll a yielded five factors. The first factor included water level, pH, turbidity, conductivity, alkalinity, and the rate of Namgang-dam discharge. The second factor included water temperature, DO, NH₄⁺-N, and NO₃⁻-N. The third factor included KMnO₄ consumption, COD, and BOD. The fourth factor included the rate of Andong-dam discharge, the rate of Hapcheon-dam discharge, and the rate of Imha-dam discharge. The final factor included T-N, T-P, and $PO_4^{3-}$-P concentration. Using regression analysis on the factors obtained by factor analysis, we derived two statistical models that can predict the occurrence of eutrophication. The first model is the stepwise regression model whose independent variables are the factors produced by factor analysis: chl a (mg · $m^{-3}$) = 42.923 + (18.637 × factor 3) + (-17.147 × factor 1) + (-12.095 × factor 5) + (-4.828 × factor 4). The second model is the alternative stepwise regression model whose independent variables are the sums of the standardized main component variables: chl a (mg · $m^{-3}$) = 37.295 + (7.326 × Zfactor 3) + (-2.704 × Zfactor 1) + (-2.341 × Zfactor 5).
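
The two fitted equations reported above can be written as small functions to make the sign and size of each term easier to see. The coefficients are taken from the abstract. The function and argument names are hypothetical, and the inputs are assumed to be factor scores (first model) or sums of standardized variables (second model) computed as in the paper.

```python
def chl_a_factor_model(f1, f3, f4, f5):
    """First model: chlorophyll a (mg per cubic metre) from factor scores."""
    return 42.923 + 18.637 * f3 - 17.147 * f1 - 12.095 * f5 - 4.828 * f4

def chl_a_zsum_model(z1, z3, z5):
    """Second model: chlorophyll a from sums of standardized variables."""
    return 37.295 + 7.326 * z3 - 2.704 * z1 - 2.341 * z5

# With every score at zero, each model returns its intercept.
baseline = chl_a_factor_model(0.0, 0.0, 0.0, 0.0)
# Raising factor 3 (KMnO4 consumption, COD, BOD) by one unit adds 18.637.
raised = chl_a_factor_model(0.0, 1.0, 0.0, 0.0)
print(baseline, raised)
```

In both models factor 3 carries the only positive coefficient, while factors 1, 4, and 5 enter with negative signs.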

Variable selection in partial linear regression using the least angle regression (부분선형모형에서 LARS를 이용한 변수선택)

  • Seo, Han Son;Yoon, Min;Lee, Hakbae
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.937-944 / 2021
  • The problem of selecting variables is addressed in partial linear regression. Model selection for partial linear models is not easy since it involves nonparametric estimation, such as smoothing parameter selection, as well as estimation for the linear explanatory variables. In this work, several approaches for variable selection are proposed using a fast forward selection algorithm, least angle regression (LARS). The proposed procedures apply a t-test, all-possible-regressions comparison, or a stepwise selection process to the variables selected by LARS. An example based on real data and a simulation study on the performance of the suggested procedures are presented.

Model Development for Estimating Total Arsenic Contents with Chemical Properties and Extractable Heavy Metal Contents in Paddy Soils (논토양의 이화학적 특성 및 침출성 중금속 함량을 이용한 비소의 전함량 예측)

  • Lee, Jeong-Mi;Go, Woo-Ri;Kunhikrishnan, Anitha;Yoo, Ji-Hyock;Kim, Ji-Young;Kim, Doo-Ho;Kim, Won-Il
    • Korean Journal of Soil Science and Fertilizer / v.45 no.6 / pp.920-924 / 2012
  • This study was performed to estimate the total content of arsenic (As) by stepwise multiple regression analysis using chemical properties and extractable metal contents in paddy soil adjacent to abandoned mines. The soil was collected from paddies near abandoned mines. Soil pH, electrical conductivity (EC), organic matter (OM), available phosphorus ($P_2O_5$), and exchangeable cations (Ca, K, Mg, Na) were measured. Total As content and extractable metal contents were analyzed by ICP-OES. The stepwise analysis showed that the contents of extractable As, available phosphorus, extractable Cu, exchangeable K, exchangeable Na, and organic matter significantly influenced the total As content in soil (p<0.001). The multiple linear regression model was established as Log (Total-As) = 0.741 + 0.716 Log (extractable-As) - 0.734 Log (avail-$P_2O_5$) + 0.334 Log (extractable-Cu) + 0.186 Log (exchangeable-K) - 0.593 Log (exchangeable-Na) + 0.558 Log (OM). The estimated total As content was significantly correlated with the measured value in soil ($R^2$=0.84196, p<0.0001). This predictive model for estimating total As content in paddy soil can be applied to the numerous datasets that were surveyed for extractable heavy metal contents under the Soil Environmental Conservation Act before 2010.
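
As a rough illustration, the log-linear model quoted above can be turned into a prediction function. The coefficients come from the abstract, but two things are assumptions here: that "Log" means the base-10 logarithm, and that inputs are in the same units the authors used, which the abstract does not state. The function and argument names are hypothetical.

```python
import math

def predict_total_as(ext_as, avail_p2o5, ext_cu, exch_k, exch_na, om):
    """Predicted total As from the reported log-linear regression.
    Assumes base-10 logarithms and the authors' (unstated) units."""
    log_total = (
        0.741
        + 0.716 * math.log10(ext_as)
        - 0.734 * math.log10(avail_p2o5)
        + 0.334 * math.log10(ext_cu)
        + 0.186 * math.log10(exch_k)
        - 0.593 * math.log10(exch_na)
        + 0.558 * math.log10(om)
    )
    return 10 ** log_total  # back-transform from the log scale

# With every predictor equal to 1, all log terms vanish and only the
# intercept remains, so the prediction is 10 ** 0.741.
reference = predict_total_as(1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
print(round(reference, 3))
```

Because the model is linear on the log scale, each coefficient acts as an exponent on the original scale: for example, doubling extractable As multiplies the prediction by 2 raised to the power 0.716.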