• Title/Summary/Keyword: Linear regression models

The health effects of low blood lead level in oxidative stress as a marker, serum gamma-glutamyl transpeptidase level, in male steelworkers

  • Su-Yeon Lee;Yong-Jin Lee;Young-Sun Min;Eun-Chul Jang;Soon-Chan Kwon;Inho Lee
    • Annals of Occupational and Environmental Medicine / v.34 / pp.34.1-34.13 / 2022
  • Background: This study investigated the association between lead exposure and the serum gamma-glutamyl transpeptidase (γGT) level, a marker of oxidative stress, in male steelworkers. Methods: Data were collected during the workers' annual health examination in 2020. A total of 1,654 steelworkers were selected, and the adjustment variables included the workers' general, lifestyle, and occupational characteristics. The association between the blood lead level (BLL) and the serum γGT level was investigated by multiple linear and logistic regression analyses. Natural-log-transformed BLL and serum γGT values were used in the multiple linear regression analysis, and BLL tertiles were used in the logistic regression analysis. Results: The geometric means of the participants' BLL and serum γGT level were 1.36 µg/dL and 27.72 IU/L, respectively. BLLs differed by age, body mass index (BMI), smoking status, drinking status, shift work, and working period, while serum γGT levels differed by age, BMI, smoking status, drinking status, physical activity, and working period. In the multiple linear regression analysis, the association was significant in models 1, 2, and 3, with estimates of 0.326, 0.176, and 0.172 (all p < 0.001), respectively. In the multiple linear regression analysis stratified by drinking status, BMI, and age, BLLs were positively associated with serum γGT levels. In the logistic regression analysis, the odds ratio of the third BLL tertile for having an elevated serum γGT level, with the first tertile as the reference, was 2.74, 1.83, and 1.81 in models 1, 2, and 3, respectively. Conclusions: BLL was positively associated with serum γGT levels in male steelworkers even at low lead concentrations (< 5 µg/dL).
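The abstract above fits a linear regression with both BLL and serum γGT on the natural-log scale. As a rough sketch of how such a model is read, and assuming (the abstract does not state this explicitly) that the reported values 0.326, 0.176, and 0.172 are slope coefficients of ln(γGT) on ln(BLL), a slope `beta` means that multiplying BLL by some ratio multiplies the expected γGT by `ratio ** beta`. The function name `fold_change` is illustrative and does not come from the paper.

```python
def fold_change(beta: float, ratio: float) -> float:
    """Multiplicative change in y when x is multiplied by `ratio`,
    for a log-log model ln(y) = a + beta * ln(x)."""
    return ratio ** beta


# Values quoted in the abstract, ASSUMED here to be log-log slopes.
assumed_slopes = {"model 1": 0.326, "model 2": 0.176, "model 3": 0.172}

for name, beta in assumed_slopes.items():
    # Effect on the outcome of doubling the exposure under this assumption.
    print(f"{name}: doubling x multiplies y by {fold_change(beta, 2.0):.3f}")
```

Under that assumption, a slope well below 1 means that even a doubling of exposure corresponds to a modest proportional change in the outcome.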

Design and Assessment of an Ozone Potential Forecasting Model using Multi-regression Equations in Ulsan Metropolitan Area (중회귀 모형을 이용한 울산지역 오존 포텐셜 모형의 설계 및 평가)

  • Kim, Yoo-Keun;Lee, So-Young;Lim, Yun-Kyu;Song, Sang-Keun
    • Journal of Korean Society for Atmospheric Environment / v.23 no.1 / pp.14-28 / 2007
  • This study selected ozone (O3) potential factors and designed and assessed an O3 potential prediction model using multiple linear regression equations for the Ulsan area during springtime (April to June), 2000-2004. O3 potential factors were selected by analyzing the relationship between meteorological parameters and surface O3 concentrations. In addition, cluster analysis (e.g., average linkage and K-means clustering techniques) was performed to identify three major synoptic patterns (P1-P3) for the O3 potential prediction model. P1 is characterized by the presence of a low-pressure system over northeastern Korea, with Ulsan influenced by a northwesterly synoptic flow that retards sea breeze development. P2 is characterized by a weakening high-pressure system over Korea, and P3 is clearly associated with a migratory anticyclone. Stepwise linear regression was performed to develop models for predicting the highest 1-h O3 concentration in Ulsan. The model results were rather satisfactory: the high-O3 simulation accuracy for synoptic patterns P1-P3 was 79, 85, and 95%, respectively (2000-2004). The O3 potential prediction model for P1-P3 using the predicted meteorological data for 2005 showed good high-O3 prediction performance, with 78, 75, and 70%, respectively. The regression models can therefore be a useful tool for forecasting local O3 concentrations.

Comparison of Local and Global Fitting for Exercise BP Estimation Using PTT (PTT를 이용한 운동 중 혈압 예측을 위한 Local과 Global Fitting의 비교)

  • Kim, Chul-Seung;Moon, Ki-Wook;Eom, Gwang-Moon
    • The Transactions of The Korean Institute of Electrical Engineers / v.56 no.12 / pp.2265-2267 / 2007
  • The purpose of this work is to compare local and global fitting approaches when applying a regression model to PTT-BP data for the prediction of exercise blood pressure. We used linear and nonlinear regression models to represent the PTT-BP relationship during exercise. PTT-BP data were acquired both in the resting state and after cycling exercise under several load conditions. PTT was calculated as the time between the R-peak of the ECG and the peak of the differential photo-plethysmogram. To identify the regression models, we used local fitting, which used only the resting-state data, and global fitting, which used the whole range of data including exercise BP. The results showed that global fitting was superior to local fitting in terms of the coefficient of determination and the RMS (root mean square) error between the experimental and estimated BP. The nonlinear regression model with global fitting performed slightly better than the linear one, although the difference was not significant. We confirmed that a wide range of data is required for the regression model to predict exercise BP appropriately.
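The local versus global comparison described above can be sketched as follows. This is a minimal illustration and not the authors' procedure: the PTT/BP pairs are synthetic, the resting cut-off `REST_PTT_MIN` is a hypothetical value, and only the linear model is shown. The line is fitted once on resting-state points only (local) and once on all points (global), and both fits are then scored over the whole data range with RMSE and the coefficient of determination.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys, slope, intercept):
    """Root-mean-square error of the line over the given points."""
    sq = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(sq) / len(sq))


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the line over the given points."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic PTT (ms) and BP (mmHg) pairs, invented for illustration only.
ptt = [160, 180, 200, 220, 240, 245, 250, 255, 260]
bp = [60 + 14000 / p for p in ptt]
REST_PTT_MIN = 240  # hypothetical cut-off between resting and exercise samples

rest = [(p, b) for p, b in zip(ptt, bp) if p >= REST_PTT_MIN]
local_fit = fit_line([p for p, _ in rest], [b for _, b in rest])
global_fit = fit_line(ptt, bp)

# Both fits are evaluated on the full range, including exercise samples.
local_rmse = rmse(ptt, bp, *local_fit)
global_rmse = rmse(ptt, bp, *global_fit)
local_r2 = r_squared(ptt, bp, *local_fit)
global_r2 = r_squared(ptt, bp, *global_fit)
print(f"local:  RMSE={local_rmse:.2f}, R2={local_r2:.3f}")
print(f"global: RMSE={global_rmse:.2f}, R2={global_r2:.3f}")
```

Because the global least-squares line minimizes squared error over the full data set, it cannot score worse than the local line on that same set; the local line has to extrapolate from a narrow resting range.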

Study on Accident Prediction Models in Urban Railway Casualty Accidents Using Logistic Regression Analysis Model (로지스틱회귀분석 모델을 활용한 도시철도 사상사고 사고예측모형 개발에 대한 연구)

  • Jin, Soo-Bong;Lee, Jong-Woo
    • Journal of the Korean Society for Railway / v.20 no.4 / pp.482-490 / 2017
  • This study is a statistical investigation of railway accidents aimed at predicting and classifying accident severity. Linear regression models have difficulty classifying accident severity, and a logistic regression model can be used to overcome this weakness. The logistic regression model is applied to escalator (E/S) accidents at all stations on lines 5-8 of the Seoul Metro, using data mining techniques such as logistic regression analysis. The predictor variables considered for E/S accidents in urban railway stations include passenger age, drinking, overall situation, behavior, and handrail grip. In the overall accuracy analysis, the logistic regression model achieved an accuracy of 76.7%. These results confirm that the accuracy and significance level of logistic regression analysis make it a useful data mining technique for establishing an accident severity prediction model for urban railway casualty accidents.
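To illustrate why a logistic model suits a two-class severity outcome, the sketch below maps a linear predictor to a probability with the sigmoid function, thresholds it at 0.5, and computes overall accuracy as the share of correctly classified cases. The predictors, weights, bias, and cases are all hypothetical, loosely inspired by the variables named in the abstract; none of them come from the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class (here: a severe accident)."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def accuracy(rows, labels, weights, bias, threshold=0.5):
    """Fraction of rows whose thresholded prediction matches the label."""
    correct = sum(
        int(predict_proba(row, weights, bias) >= threshold) == label
        for row, label in zip(rows, labels)
    )
    return correct / len(labels)


# Hypothetical binary predictors: [elderly, had been drinking, not holding handrail]
WEIGHTS = [1.0, 1.0, 1.5]  # made-up coefficients, not fitted to real data
BIAS = -1.75               # made-up intercept

rows = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 0, 1], [1, 1, 1], [0, 1, 0]]
labels = [0, 0, 1, 1, 1, 0]  # 1 = severe, 0 = minor (toy labels)

print(f"overall accuracy: {accuracy(rows, labels, WEIGHTS, BIAS):.3f}")
```

Unlike a linear regression output, the sigmoid output is always between 0 and 1, so it can be read as a class probability before thresholding.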

Traffic Accident Density Models Reflecting the Characteristics of the Traffic Analysis Zone in Cheongju (존별 특성을 반영한 교통사고밀도 모형 - 청주시 사례를 중심으로 -)

  • Kim, Kyeong Yong;Beck, Tea Hun;Lim, Jin Kang;Park, Byung Ho
    • International Journal of Highway Engineering / v.17 no.6 / pp.75-83 / 2015
  • PURPOSES : This study deals with traffic accidents classified by traffic analysis zone. The purpose is to develop accident density models using zonal traffic and socioeconomic data. METHODS : The traffic accident density models were developed through multiple linear regression analysis. Three multiple linear models were developed. The dependent variable was traffic accident density, a measure of the relative distribution of traffic accidents. The independent variables were various traffic and socioeconomic variables. CONCLUSIONS : Three traffic accident density models were developed, and all were statistically significant. In the transportation-based model, road length, trip production volume, intersections, van ratio, and number of vehicles per person were positively associated with accident density. In the socioeconomic-based model, the residential and commercial area ratio and the transportation vulnerability ratio were found to affect accident density. In the integrated model, the major arterial road ratio, trip production volume, intersections, van ratio, commercial ratio, and number of companies were also found to be related to accident density.

Accident Models of Rotary by Age Group in Korea (국내 로터리의 연령대별 사고모형)

  • Park, Min Kyu;Park, Byung Ho
    • International Journal of Highway Engineering / v.15 no.2 / pp.121-129 / 2013
  • PURPOSES : This study deals with traffic accidents at rotaries in Korea. The objective is to develop accident models by age group based on various rotary data. METHODS : To this end, the study focuses on classifying the accident data of 17 rotaries by age, collecting data on geometric structure, traffic volume, and other factors, and developing the models using SPSS 17.0 and EXCEL. RESULTS : First, three multiple linear regression models were developed, all of which were statistically significant. The goodness-of-fit value of the model for the 30-49 age group, however, was 0.688, lower than those of the other models. Second, the most influential variables were traffic volume in the model for the under-30 age group, circulatory roadway width in the model for the 30-49 age group, and the number of approach lanes in the model for the above-50 age group. Finally, tests of the accident models using RMSE indicated that all models fit the given data. CONCLUSIONS : This study proposes installing streetlights and speed humps and widening the circulatory roadway as effective improvements for reducing accidents at rotaries.

Evaluation and Predicting PM10 Concentration Using Multiple Linear Regression and Machine Learning (다중선형회귀와 기계학습 모델을 이용한 PM10 농도 예측 및 평가)

  • Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.36 no.6_3 / pp.1711-1720 / 2020
  • Particulate matter (PM) generated artificially during the recent period of rapid industrialization and urbanization moves and disperses according to weather conditions and adversely affects the human skin and respiratory system. The purpose of this study is to predict the PM10 concentration in Seoul using meteorological factors as the input dataset for multiple linear regression (MLR), support vector machine (SVM), and random forest (RF) models, and to compare and evaluate the performance of the models. First, the PM10 concentration data obtained at 39 air quality monitoring sites (AQMS) in Seoul were divided into training and validation datasets (8:2 ratio). Nine meteorological factors (mean, maximum, and minimum temperature, precipitation, average and maximum wind speed, wind direction, yellow dust, and relative humidity), obtained by the automatic weather system (AWS), formed the input dataset of the models. The coefficients of determination (R2) between the observed PM10 concentration and that predicted by the MLR, SVM, and RF models were 0.260, 0.772, and 0.793, respectively, and the RF model best predicted the PM10 concentration. Among the AQMS used for model validation, the Gwanak-gu and Gangnam-daero AQMS are relatively close to an AWS, and the SVM and RF models were highly accurate there according to the model validations. The Jongno-gu AQMS is relatively far from the AWS, but because the PM10 concentrations of the two adjacent AQMS were used for model training, both models showed high accuracy. By contrast, the Yongsan-gu AQMS was relatively far from other AQMS and the AWS, and both models performed poorly.
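The 8:2 division into training and validation sets mentioned above could look like the sketch below. The abstract does not say how the split was drawn, so the shuffled random split, the fixed seed, and the function name `split_8_2` are assumptions made for illustration.

```python
import random


def split_8_2(items, seed=0):
    """Shuffle a copy of `items` and split it 80% / 20%.

    The random shuffle and the fixed seed are assumptions; the abstract
    only states the 8:2 ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 8) // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Placeholder record indices standing in for PM10 observations.
records = list(range(100))
train, validation = split_8_2(records)
print(len(train), len(validation))
```

Fixing the seed keeps the split reproducible, so that different models can be compared on the same validation records.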

Large Robust Designs for Generalized Linear Model

  • Kim, Young-Il;Kahng, Myung-Wook
    • Journal of the Korean Data and Information Science Society / v.10 no.2 / pp.289-298 / 1999
  • We consider a minimax approach to make a design robust to the many types of uncertainty that arise in practice when dealing with non-normal linear models. We try to build a design that protects against the worst case, i.e., one that improves the "efficiency" of the worst situation that can happen. In this paper, we deal in particular with the generalized linear model. It is well known that the generalized linear model is a universal approach, an extension of the normal linear regression model to cover other distributions. Therefore, the optimal design for the generalized linear model has properties very similar to those for the normal linear model, apart from some special characteristics. Uncertainties regarding the unknown parameters, the link function, and the model structure are discussed. We show that the suggested approach is highly efficient and useful in practice. A computer algorithm is also discussed, and a conclusion follows.

Study on the Critical Storm Duration Decision of the Rivers Basin (중소하천유역의 임계지속시간 결정에 관한 연구)

  • Ahn, Seung-Seop;Lee, Hyeo-Jung;Jung, Do-June
    • Journal of Environmental Science International / v.16 no.11 / pp.1301-1312 / 2007
  • The objective of this study is to propose a model for forecasting the critical storm duration of storm runoff in small river basins. Critical storm duration data for 582 sub-basins, taken from disaster impact assessment reports of the National Emergency Management Agency for the period from 2004 to 2007, were collected and analyzed. The stepwise multiple regression method was used to establish critical storm duration forecasting models (linear and exponential types). The multiple regression results indicated that the linear type performed better than the exponential type. The multiple linear regression analysis between the critical storm duration and five basin characteristic parameters (basin area, main stream length, average slope of main stream, shape factor, and CN) yielded a multiple correlation coefficient greater than 0.75.

A Statistical Approach to Examine the Impact of Various Meteorological Parameters on Pan Evaporation

  • Pandey, Swati;Kumar, Manoj;Chakraborty, Soubhik;Mahanti, N.C.
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.515-530 / 2009
  • Evaporation from surface water bodies is influenced by a number of meteorological parameters. The rate of evaporation is primarily controlled by incoming solar radiation, air and water temperature, wind speed, and relative humidity. In the present study, the influence of weekly meteorological variables such as air temperature, relative humidity, bright sunshine hours, wind speed, wind velocity, and rainfall on the rate of evaporation was examined using 35 years (1971-2005) of meteorological data. Statistical analysis was carried out using linear regression models. The developed regression models were tested for goodness of fit and multicollinearity, along with a normality test and a constant variance test. These regression models were subsequently validated using the observed and predicted parameter estimates with the meteorological data of the year 2005. Further, the models were checked using time-ordered residual plots to identify trends, and new standardized regression models were then developed using standardized equations. The highest significant positive correlation was observed between pan evaporation and maximum air temperature. Mean air temperature and wind velocity have a highly significant influence on pan evaporation, whereas minimum air temperature, relative humidity, and wind direction have no such significant influence.