• Title/Summary/Keyword: multivariate linear models


EPB-TBM performance prediction using statistical and neural intelligence methods

  • Ghodrat Barzegari;Esmaeil Sedghi;Ata Allah Nadiri
    • Geomechanics and Engineering / v.37 no.3 / pp.197-211 / 2024
  • This research studies the effect of geotechnical factors on EPB-TBM performance parameters. The modeling was performed using simple and multivariate linear regression methods, artificial neural networks (ANNs), and the Sugeno fuzzy logic (SFL) algorithm. For the ANN, 80% of the data were randomly allocated to training and 20% to network testing; for the SFL algorithm, 75% of the data were used for training and 25% for testing. The coefficient of determination ($R^2$) obtained between the observed and estimated values in this model for the thrust force and cutterhead torque was 0.19 and 0.52, respectively. The results showed that the SFL outperformed the other models in predicting the target parameters: with this method, the $R^2$ between observed and predicted values was 0.73 for thrust force and 0.63 for cutterhead torque. The sensitivity analysis shows that the internal friction angle (φ) and the standard penetration number (SPT) have the greatest impact on thrust force, while earth pressure and overburden thickness have the greatest effect on cutterhead torque.
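
To make the evaluation metric concrete, the following is a minimal sketch of how a coefficient of determination could be computed on a randomly held-out 20% of the data. It assumes $R^2 = 1 - SS_{res}/SS_{tot}$ (the abstract does not say whether this or the squared correlation was used), a plain least-squares linear model, and entirely synthetic data; the predictor values, coefficients, and the names `r_squared`, `X`, and `y` are hypothetical and not taken from the paper.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (hypothetical): three predictors, one target.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
y = X @ np.array([1.5, -0.7, 0.3]) + rng.normal(scale=1.0, size=n)

# Random 80/20 split into training and testing indices.
idx = rng.permutation(n)
n_train = int(0.8 * n)
train, test = idx[:n_train], idx[n_train:]

# Multivariate linear regression by least squares, with an intercept column.
A_train = np.column_stack([np.ones(train.size), X[train]])
A_test = np.column_stack([np.ones(test.size), X[test]])
coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)

# Score the fitted model on the held-out 20%.
r2_test = r_squared(y[test], A_test @ coef)
print(f"held-out R^2 = {r2_test:.2f}")
```

Scoring on held-out rows rather than on the training rows is what makes values such as those reported above comparable across models with different flexibility.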

On a Bayesian Estimation of Multivariate Regression Models with Constrained Coefficient Matrix

  • Kim, Hea-Jung
    • Journal of Korean Society for Quality Management / v.26 no.4 / pp.151-165 / 1998
  • Consider the linear multivariate regression model $Y=X_1B_1+X_2B_2+U$, where $\mathrm{Vec}(U) \sim N(0, \Sigma \otimes I_N)$. This paper is concerned with Bayes inference for the model when it is suspected that the elements of $B_2$ are constrained in the form of intervals. The use of the Gibbs sampler as a method for calculating Bayesian marginal posterior densities of the parameters under a generalized conjugate prior is developed. It is shown that the approach is straightforward to specify distributionally and to implement computationally, with output readily adapted for the required inference summaries. The method developed is applied to a real problem.


The Relationship between Hospital Specialization and Operational Performance: Focusing on Diseases of the Musculoskeletal System and Connective Tissue (병원의 전문화 전략과 운영성과 간의 관계: 근골격계 및 결합조직 질환을 중심으로)

  • Seo, Seul-Ki;Kim, Yang-Kyun
    • Korea Journal of Hospital Management / v.25 no.3 / pp.53-66 / 2020
  • This study investigated and compared differences in the effect of hospital specialization according to hospital size, using 2018 claims data from the Health Insurance Review and Assessment National Inpatient Sample for diseases of the musculoskeletal system and connective tissue. To this end, we used multivariate hierarchical linear models (also known as multi-level models) on two-tier data from 106,599 patients discharged from 734 hospitals after treatment for diseases of the musculoskeletal system and connective tissue. The multivariate results indicate that patients discharged from specialized hospitals with 200 beds or fewer had shorter stays and paid lower inpatient charges than those discharged from less specialized hospitals. For hospitals with 201-300 beds, however, no positive relationship was found between hospital specialization and operational performance. This finding provides limited evidence that the effect of a hospital's specialization strategy may vary with the size of the hospital. We discuss several managerial and health policy implications.

Development and Validation of Generalized Linear Regression Models to Predict Vessel Enhancement on Coronary CT Angiography

  • Masuda, Takanori;Nakaura, Takeshi;Funama, Yoshinori;Sato, Tomoyasu;Higaki, Toru;Kiguchi, Masao;Matsumoto, Yoriaki;Yamashita, Yukari;Imada, Naoyuki;Awai, Kazuo
    • Korean Journal of Radiology / v.19 no.6 / pp.1021-1030 / 2018
  • Objective: We evaluated the effect of various patient characteristics and time-density curve (TDC) factors on the test bolus-affected vessel enhancement on coronary computed tomography angiography (CCTA). We also assessed the value of generalized linear regression models (GLMs) for predicting enhancement on CCTA. Materials and Methods: We performed univariate and multivariate regression analysis to evaluate the effect of patient characteristics and to compare contrast enhancement per gram of iodine on test bolus ($\Delta HU_{TEST}$) and CCTA ($\Delta HU_{CCTA}$). We developed GLMs to predict $\Delta HU_{CCTA}$. GLMs including independent variables were validated with 6-fold cross-validation using the correlation coefficient and Bland-Altman analysis. Results: In multivariate analysis, only total body weight (TBW) and $\Delta HU_{TEST}$ maintained their independent predictive value (p < 0.001). In validation analysis, the highest correlation coefficient between $\Delta HU_{CCTA}$ and the prediction values was seen in the GLM (r = 0.75), followed by TDC (r = 0.69) and TBW (r = 0.62). The lowest Bland-Altman limit of agreement was observed with GLM-3 (mean difference, $-0.0 \pm 5.1$ Hounsfield units per gram of iodine [HU/gI]; 95% confidence interval [CI], -10.1, 10.1), followed by $\Delta HU_{CCTA}$ ($-0.0 \pm 5.9$ HU/gI; 95% CI, -11.9, 11.9) and TBW ($1.1 \pm 6.2$ HU/gI; 95% CI, -11.2, 13.4). Conclusion: We demonstrated that the patient's TBW and $\Delta HU_{TEST}$ significantly affected contrast enhancement on CCTA images and that the combined use of clinical information and test bolus results is useful for predicting aortic enhancement.
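
The validation procedure described above (6-fold cross-validation scored by a correlation coefficient and Bland-Altman analysis) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the GLM is taken to be a Gaussian model with identity link, which reduces to ordinary least squares; the limits of agreement are taken as bias ± 1.96 SD of the differences; and all data, coefficients, and names (`tbw`, `hu_test`, `hu_ccta`, `cross_val_predict_linear`) are synthetic and hypothetical.

```python
import numpy as np

def bland_altman(measured, predicted):
    """Return bias, SD of the differences, and the 95% limits of agreement."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)

def cross_val_predict_linear(X, y, k=6, seed=0):
    """Out-of-fold predictions from a least-squares fit (Gaussian GLM, identity link)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    A = np.column_stack([np.ones(len(y)), X])
    pred = np.empty(len(y))
    for held_out in folds:
        # Fit on every row outside the current fold, predict the fold itself.
        train = np.setdiff1d(np.arange(len(y)), held_out)
        coef, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        pred[held_out] = A[held_out] @ coef
    return pred

# Synthetic stand-ins for body weight and test-bolus enhancement (hypothetical values).
rng = np.random.default_rng(1)
n = 120
tbw = rng.normal(60, 10, n)
hu_test = rng.normal(20, 4, n)
hu_ccta = 30 - 0.2 * tbw + 0.8 * hu_test + rng.normal(0, 2, n)

pred = cross_val_predict_linear(np.column_stack([tbw, hu_test]), hu_ccta, k=6)
r = np.corrcoef(hu_ccta, pred)[0, 1]
bias, sd, (lo, hi) = bland_altman(hu_ccta, pred)
print(f"r = {r:.2f}, bias = {bias:.2f}, limits of agreement = ({lo:.2f}, {hi:.2f})")
```

Because every prediction comes from a model that never saw that observation, both the correlation and the limits of agreement describe out-of-sample behaviour.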

A Study on the Quantitative Evaluation of Outdoor-Recreational Function and User Satisfaction with Urban Park and Open Space (도시공원녹지에 대한 실외위락기능과 만족도의 계량적 평가에 관한 연구)

  • 박승범
    • Journal of the Korean Institute of Landscape Architecture / v.18 no.4 / pp.127-140 / 1991
  • The primary purpose of this study is to investigate the factors and variables that significantly affect user satisfaction with recreational facilities in the Taejong-Dae recreational complex, thereby establishing indices for the planning and development of urban parks and open space. To test the causal models of this research, the data were gathered by self-administered questionnaires from 967 households in Pusan City, selected by the multi-stage probability sampling method. The analysis consists primarily of two phases: the first was an exploratory factor analysis that identified the major factors involved in satisfaction with recreational activities and facilities in the Taejong-Dae recreational complex, and the second tested the fit of the causal models by employing the LISREL methodology. There are three advantages of using LISREL over other multivariate analysis methods: first, measurement error is allowed for and calculated in LISREL, whereas ignoring it risks seriously misleading coefficient estimates; second, LISREL deals with latent (unmeasured) variables; third, it makes it possible to test causal relations among variables. The factor analysis identified five factors involved in satisfaction with recreational facilities: space for repose and relaxation, active recreation facilities such as the pool and zoo, physical exercise facilities, convenience and maintenance facilities, and linear facilities for walking. The second-phase analysis tested the fit of the causal models for satisfaction with recreational facilities to the data and identified statistically significant causal linkages among overall satisfaction with the Taejong-Dae recreational complex, other endogenous factors, and exogenous variables. The overall fits of both causal models were very good. Among the endogenous factors, the facility for repose and relaxation, the linear facility for walking, the active recreation facility, and the facility for convenience and maintenance were identified as having significant effects on overall satisfaction. Exogenous variables that have significant effects on endogenous variables were also identified. These significant relationships indicate important factors and variables that should be considered in the planning and development of the recreational complex. On the basis of these significant causal relationships, implications for the planning and development of the Taejong-Dae recreational complex were suggested.


Comparison study of modeling covariance matrix for multivariate longitudinal data (다변량 경시적 자료 분석을 위한 공분산 행렬의 모형화 비교 연구)

  • Kwak, Na Young;Lee, Keunbaik
    • The Korean Journal of Applied Statistics / v.33 no.3 / pp.281-296 / 2020
  • Repeated outcomes from the same subjects are referred to as longitudinal data. Their analysis requires methods different from those used for cross-sectional data. It is important to model the covariance matrix because the correlation between the repeated outcomes must be considered when estimating the effects of covariates on the mean response. However, modeling the covariance matrix is difficult because there are many parameters to be estimated and the estimated covariance matrix must be positive definite. In this paper, we consider the analysis of multivariate longitudinal data via two methodologies for modeling the covariance matrix. Both methods describe serial correlations of multivariate longitudinal outcomes using a modified Cholesky decomposition. However, the two methods use different decompositions to explain the correlation between simultaneous responses. The first method uses enhanced linear covariance models so that the covariance matrix satisfies a positive definiteness condition; in addition, principal component analysis and the maximization-minimization algorithm (MM algorithm) are used to estimate model parameters. The second method uses variance-correlation decomposition and hypersphere decomposition to model the covariance matrix. Simulations are used to compare the performance of the two methodologies.
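
As a rough illustration of why a modified Cholesky decomposition is convenient here, the sketch below factors a covariance matrix as $T \Sigma T^{\top} = D$, with $T$ unit lower triangular and $D$ diagonal, and then shows that any unconstrained choice of the below-diagonal entries of $T$ and of the log-diagonal of $D$ maps back to a positive definite matrix. It treats a single univariate sequence only, not the multivariate block structure studied in the paper, and the function names and the AR(1) example are assumptions made for illustration.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, d) with T unit lower triangular and T @ sigma @ T.T = diag(d)."""
    L = np.linalg.cholesky(sigma)      # sigma = L @ L.T
    d_sqrt = np.diag(L)
    C = L / d_sqrt                     # divide column j by L[j, j]: unit diagonal
    T = np.linalg.inv(C)               # sigma = C diag(d) C.T, so T = C^{-1}
    return T, d_sqrt ** 2

def covariance_from_parameters(phi, log_d):
    """Build a covariance matrix from unconstrained parameters.

    phi   : strictly lower-triangular entries of -T, in row-major order
    log_d : logarithms of the innovation variances (diagonal of D)
    """
    n = len(log_d)
    T = np.eye(n)
    T[np.tril_indices(n, k=-1)] = -np.asarray(phi, dtype=float)
    T_inv = np.linalg.inv(T)
    return T_inv @ np.diag(np.exp(log_d)) @ T_inv.T

# Example: AR(1)-type covariance with correlation 0.6 and unit variances.
n, rho = 4, 0.6
sigma = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
T, d = modified_cholesky(sigma)
print(np.round(T, 3))   # the sub-diagonal carries the negated autoregressive coefficient
print(np.round(d, 3))   # innovation variances

# Arbitrary real-valued parameters still give a positive definite matrix.
rng = np.random.default_rng(0)
sigma_new = covariance_from_parameters(rng.normal(size=n * (n - 1) // 2),
                                       rng.normal(size=n))
print(np.linalg.eigvalsh(sigma_new).min() > 0)
```

The design point is that the entries of $T$ and the logarithms of $D$ can be modeled with ordinary regression-type structures, with no positive definiteness constraint to enforce during estimation.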

Continuity Simulation and Trend Analysis of Water Qualities in Incoming Flows to Lake Paldang by Log Linear Models (로그선형모델을 이용한 팔당호 유입지류 수질의 연속성 시뮬레이션과 경향 분석)

  • Na, Eun-Hye;Park, Seok-Soon
    • Korean Journal of Ecology and Environment / v.36 no.3 s.104 / pp.336-343 / 2003
  • Two types of statistical models, simple and multivariate log linear models, were studied for continuity simulation and trend analysis of water qualities in incoming flows to Lake Paldang. Water quality is a function of one independent variable (flow) in the simple log linear model, and of three variables (flow, time, and seasonal cycle) in the multivariate model. The independent variables act as surrogate variables for water quality in both models. The model coefficients were determined from monthly data. The water qualities included 5-day Biochemical Oxygen Demand ($BOD_5$), Total Nitrogen (TN), and Total Phosphorus (TP), measured from 1995 to 2000 in the South and North Branches of the Han River and in the Kyoungan Stream. The results indicated that the multivariate model agreed better with field measurements than the simple one in all attempted cases. Flow dependency, seasonality, and temporal trends of water quality were tested using the fitted coefficients of the multivariate model. The test of flow dependency indicated that BOD concentrations decreased as the water flow increased. In TN and TP concentrations, however, there were no discernible flow effects. The temporal trend analyses gave the following results: 1) no trend in BOD at any of the three upstream sites, 2) an increase in TN at the South Branch and the Kyoungan Stream, 3) a decrease in TN at the North Branch, 4) no trend in TP at the North and South Branches, and 5) an increase in TP at the Kyoungan Stream of 3 to 8% per year. The seasonality test showed significant seasonal variations in all three water qualities at the three incoming flows.
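
The two model types compared above can be sketched as least-squares fits on log-transformed concentration. The abstract does not give the functional form, so the sketch assumes a common one: $\ln C = b_0 + b_1 \ln Q + b_2 t + b_3 \sin(2\pi t) + b_4 \cos(2\pi t)$, with flow $Q$ and time $t$ in years, and the simple model keeping only the first two terms. The data are synthetic, and every coefficient value and name (`design_matrix`, `fit`, `true_b`) is hypothetical.

```python
import numpy as np

def design_matrix(flow, t_years, multivariate=True):
    """Columns: intercept, ln(flow), and optionally time plus an annual sine/cosine pair."""
    cols = [np.ones_like(flow), np.log(flow)]
    if multivariate:
        cols += [t_years, np.sin(2 * np.pi * t_years), np.cos(2 * np.pi * t_years)]
    return np.column_stack(cols)

def fit(flow, t_years, log_conc, multivariate):
    """Least-squares coefficients and residual sum of squares."""
    A = design_matrix(flow, t_years, multivariate)
    coef, *_ = np.linalg.lstsq(A, log_conc, rcond=None)
    rss = np.sum((log_conc - A @ coef) ** 2)
    return coef, rss

# Synthetic monthly record spanning six years (hypothetical values).
rng = np.random.default_rng(0)
t = np.arange(72) / 12.0
flow = np.exp(rng.normal(4.0, 0.5, t.size))
true_b = np.array([1.0, -0.3, 0.05, 0.4, 0.2])
log_c = design_matrix(flow, t) @ true_b + rng.normal(0, 0.1, t.size)

coef_simple, rss_simple = fit(flow, t, log_c, multivariate=False)
coef_multi, rss_multi = fit(flow, t, log_c, multivariate=True)

# In a log-linear model the time coefficient converts to a percent change per year.
trend_pct = (np.exp(coef_multi[2]) - 1.0) * 100.0
print(f"RSS simple = {rss_simple:.2f}, RSS multivariate = {rss_multi:.2f}")
print(f"flow exponent = {coef_multi[1]:.2f}, trend = {trend_pct:.1f}% per year")
```

Under this assumed form, a negative flow coefficient corresponds to concentration falling as flow rises, the time coefficient carries the temporal trend, and the sine and cosine pair carries seasonality, which is how the three tests named in the abstract map onto fitted coefficients.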

A Comparative Study of Estimation by Analogy using Data Mining Techniques

  • Nagpal, Geeta;Uddin, Moin;Kaur, Arvinder
    • Journal of Information Processing Systems / v.8 no.4 / pp.621-652 / 2012
  • Software estimations provide an inclusive set of directives for software project developers, project managers, and management in order to produce more realistic estimates based on deficient, uncertain, and noisy data. A range of estimation models are being explored in industry, as well as in academia for research purposes, but choosing the best model is difficult. Estimation by Analogy (EbA) is a form of case-based reasoning that uses fuzzy logic, grey system theory, machine-learning techniques, and so on for optimization. This research compares the estimation accuracy of some conventional data mining models with that of a hybrid model. The data mining models under consideration include linear regression models such as ordinary least squares and ridge regression, and nonlinear models such as neural networks, support vector machines, and multivariate adaptive regression splines. A precise and comprehensible predictive model based on the integration of grey relational analysis (GRA) and regression has been introduced and compared. Empirical results have shown that regression used with GRA gives outstanding results, indicating that the methodology has great potential and can be used as a candidate approach for software effort estimation.

Bayesian Analysis of Multivariate Threshold Animal Models Using Gibbs Sampling

  • Lee, Seung-Chun;Lee, Deukhwan
    • Journal of the Korean Statistical Society / v.31 no.2 / pp.177-198 / 2002
  • The estimation of variance components or variance ratios in linear models is an important issue in plant and animal breeding, and various methods have been devised to estimate them. However, many traits of economic importance in those fields are observed as dichotomous or polychotomous outcomes, and the usual estimation methods might not be appropriate in these cases. Recently, the threshold linear model has come to be regarded as an important tool for analyzing discrete traits, especially in animal breeding. In this note, we consider a hierarchical Bayesian method for the threshold animal model. A Gibbs sampler for making full Bayesian inferences about random effects as well as fixed effects is described for the joint analysis of discrete and continuous traits. A numerical example is provided for a model with two discrete ordered categorical traits, calving ease of calves born to heifers and calving ease of calves born to cows, and one normally distributed trait, birth weight.

Comparison of National Occupational Accident Fatality Rates using Statistical Analysis on Economic and Social Indicators (경제⋅사회지표의 다변량 통계 분석을 활용한 국가 간 산업재해 사고사망 상대수준 비교)

  • Kyunghun, Kim;Sudong, Lee
    • Journal of the Korean Society of Safety / v.37 no.6 / pp.128-135 / 2022
  • The comparative evaluation of occupational accident fatality rates (OAFRs) of different countries is complicated by differences in their level of socio-economic development. However, such evaluation is necessary to assess the national occupational safety and health system of a country. This study proposes a statistical method to compare the OAFRs of countries while taking into account differences in their level of socio-economic development. We first collected data on the socio-economic indicators and OAFRs of 11 countries over a 30-year period. Next, based on a literature survey and statistical correlation analysis, we selected the significant independent variables and built multiple linear regression models to predict OAFR. We also identified groups of countries with heterogeneous relationships between the independent variables and OAFRs, as represented by the regression models. The proposed method is demonstrated by comparing the OAFR of Korea with the OAFRs of 10 other developed countries.