• Title/Summary/Keyword: Multiple-Regression

Multivariate Statistical Analysis and Prediction for the Flash Points of Binary Systems Using Physical Properties of Pure Substances (순수 성분의 물성 자료를 이용한 2성분계 혼합물의 인화점에 대한 다변량 통계 분석 및 예측)

  • Lee, Bom-Sock;Kim, Sung-Young
    • Journal of the Korean Institute of Gas / v.11 no.3 / pp.13-18 / 2007
  • Multivariate statistical analysis using multiple linear regression (MLR) has been applied to analyze and predict the flash points of binary systems. Predicting the flash points of flammable substances is important for assessing fire and explosion hazards in chemical process design. In this paper, the flash points are predicted by MLR based on the physical properties of the pure substances and experimental flash point data. The regression and prediction results from MLR are compared with the values calculated by Raoult's law and the Van Laar equation.

Estimation of Annual Capacity of Small Hydro Power Using Agricultural Reservoirs (농업용저수지를 이용한 소수력의 연간발전량 추정)

  • Woo, Jae-Yeoul;Kim, Jin-Soo
    • Journal of The Korean Society of Agricultural Engineers / v.52 no.6 / pp.1-7 / 2010
  • This study investigated the effect of hydro power factors (e.g., irrigation area, watershed area, active storage, gross head) on annual generation capacity and operation ratio for agricultural reservoirs in Chungbuk Province with active storage of over 1 million $m^3$. The annual generation capacity and operation ratio were estimated using HOMWRS (Hydrological Operation Model for Water Resources System) from the last 10 years of daily hydrological data. The correlation coefficients between annual generation capacity and the hydro power factors other than gross head were high (over 0.87), but those between operation ratio and the factors were low (below 0.28). The optimum multiple regression equations for annual generation capacity were expressed as functions of watershed area, active storage, and gross head, and the simple regression equation was expressed as a function of watershed area alone. The average relative root-mean-square error (RRMSE) between observed and estimated values was smaller for the optimum multiple regression equations than for the simple regression equation, suggesting that the former are more accurate.
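
The RRMSE comparison described in this abstract can be sketched as follows. The abstract does not define RRMSE, so this sketch assumes one common convention: RMSE divided by the mean of the observed values. The function name `rrmse` and all numbers are illustrative assumptions, not data from the study.

```python
import math


def rrmse(observed, estimated):
    """Relative RMSE, assumed here to be RMSE / mean(observed)."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between observed and estimated values.
    mse = sum((o - e) ** 2 for o, e in zip(observed, estimated)) / len(observed)
    # Normalize the RMSE by the mean observed value to make it unitless.
    return math.sqrt(mse) / (sum(observed) / len(observed))


# Made-up values standing in for annual generation capacities.
observed = [100.0, 200.0, 300.0]
multiple_fit = [110.0, 190.0, 300.0]  # hypothetical multiple-regression estimates
simple_fit = [130.0, 170.0, 330.0]    # hypothetical simple-regression estimates

print("multiple:", rrmse(observed, multiple_fit))
print("simple:  ", rrmse(observed, simple_fit))
```

A smaller value for one model than another on the same observations is the sense in which the abstract calls the multiple regression equations more accurate.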

Regression Analysis Between Specific Sediments of Reservoirs and Physiographic Factors of Watersheds (유역의 지상적 요인과 저수지 비퇴사량과의 관계분석)

  • 서승덕;박흥익;천만복;윤경덕
    • Magazine of the Korean Society of Agricultural Engineers / v.30 no.4 / pp.45-61 / 1988
  • The purpose of this study is to develop regression equations between the annual specific sediment of reservoirs and the physiographic factors of watersheds. The analysis uses 122 irrigation reservoirs in Korea, excluding Cheju province, with irrigation areas of 200 ha or more. Simple regression analyses between the annual specific sediment and each of the physical characteristic factors of the reservoirs are carried out first. Then, multiple regression analyses are made between the annual specific sediment and the physical characteristic factors that showed high correlation coefficients in the simple regression analyses. The results obtained from this study are as follows: 1. The simple regression analyses show that in each province the watershed area, the length of the mainstream, and the circumferential length of the watershed have high correlation coefficients (R=0.814-0.986), and that drainage density, reservoir capacity per watershed area, drainage frequency, and basin relief have low correlation coefficients (R=0.387-0.955). 2. Multiple regression equations between the annual specific sediment of reservoirs and three major characteristic factors of watersheds, namely the watershed area, the circumferential length of the watershed, and the length of the mainstream, are proposed as given in Table 2. 3. The simple regression analyses by reservoir elevation, excluding Jeonnam province, whose characteristics differ markedly from those of the other provinces, show that watershed area, mainstream length, and circumferential length have high correlation coefficients (R=0.806-0.884) in low-elevation and intermediate-elevation reservoirs, but low correlation coefficients (R=0.639-0.739) in high-elevation reservoirs. 4. With respect to reservoir elevation, multiple regression equations between the annual specific sediment of reservoirs and the three major watershed characteristic factors with high correlation coefficients are proposed as given in Table 5.

Prediction Techniques for Difficulty Level of Hanja Using Multiple Linear Regression (다중 회귀 분석을 이용한 한자 난이도 예측 기법 연구)

  • Choi, Jeongwhan;Noh, Jiwoo;Kim, Suntae
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.6 / pp.219-225 / 2019
  • The existing method of assigning difficulty levels to Hanja characters has a problem: some of the characters it selects differ from the Sino-Korean words used in real life, and it gives no indication of how often each character is used. To solve this problem, we measure the difficulty of Hanja characters using multiple regression analysis with frequencies as features. FWS and FHU are counted from elementary textbooks. A questionnaire built from the two frequencies and the stroke count asks for the appropriate time to learn each Hanja character, and the responses are used as target variables for regression. Stepwise regression is used to select the appropriate features, and multiple linear regression is then performed. The $R^2$ score of the model was 0.1105 and the RMSE was 0.1105.
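
As a rough illustration of the final modelling step, the sketch below fits a multiple linear regression by ordinary least squares and reports $R^2$ and RMSE. It assumes NumPy is available, uses synthetic data in place of the frequency and stroke features, and omits the stepwise selection step; the helper names `fit_mlr`, `predict`, and `r2_and_rmse` are hypothetical.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a feature matrix."""
    return coef[0] + X @ coef[1:]


def r2_and_rmse(y, y_hat):
    """Coefficient of determination and root-mean-square error."""
    resid = y - y_hat
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot, float(np.sqrt(ss_res / len(y)))


# Synthetic stand-in for three features (not the study's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 0.5 - 0.3 * X[:, 0] + 0.2 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(scale=0.05, size=50)

coef = fit_mlr(X, y)
r2, rmse = r2_and_rmse(y, predict(coef, X))
print("coefficients:", coef)
print("R^2:", r2, "RMSE:", rmse)
```

Note that $R^2$ is unitless while RMSE is in the units of the target variable, so the two metrics are generally not expected to coincide.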

Relationship between Aiming Patterns and Scores in Archery Shooting

  • Quan, ChengHao;Lee, Sangmin
    • Korean Journal of Applied Biomechanics / v.26 no.4 / pp.353-360 / 2016
  • Objective: The aim of this study was to investigate the relationship between aiming patterns and scores in archery shooting. Method: Four (N = 4) elementary-level archers from middle school participated in this study. Aiming pattern was defined by averaged acceleration data measured from accelerometers attached to the body during the aiming phase in archery shooting. Stepwise multiple regression analysis was used to test whether a model incorporating aiming patterns from all nine accelerometers could predict the scores. To extract period of interest (POI) data from the raw data, a Dynamic Time Warping (DTW)-based extraction method was presented. Results: Regression models were built for all four subjects, each with different significance levels and variables. The significance levels of the regression models are 0.12%, 1.61%, 0.55%, and 0.4% respectively; the $R^2$ values of the regression models are 64.04%, 27.93%, 72.02%, and 45.62% respectively; and the maximum significance levels of parameters in the regression models are 1.26%, 4.58%, 5.1%, and 4.98% respectively. Conclusion: Our results indicated that the relationship between aiming patterns and scores was described by a regression model. Analysis of the significance levels, variables, and parameters of the regression model showed that our approach, regression analysis with DTW, is an effective way to raise scores in archery shooting.
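
The abstract does not describe how its DTW-based extraction works, so the sketch below shows only the standard dynamic-programming DTW distance that such a method would typically build on. The function name `dtw_distance`, the absolute-difference local cost, and the sample sequences are assumptions for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (absolute difference as local cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two made-up acceleration traces with the same shape but different timing.
template = [0.0, 1.0, 2.0, 1.0, 0.0]
stretched = [0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]
print(dtw_distance(template, stretched))
```

Because DTW allows samples to be repeated during alignment, two traces that differ only in timing can align at low cost, which is what makes it suitable for locating a recurring movement pattern in raw sensor data.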

A Combined Multiple Regression Trees Predictor for Screening Large Chemical Databases (대용량 화학 데이터 베이스를 선별하기위한 결합다중회귀나무 예측치)

  • 임용빈;이소영;정종희
    • The Korean Journal of Applied Statistics / v.14 no.1 / pp.91-101 / 2001
  • It has been shown that multiple trees predictors are more accurate in reducing test set error than a single tree predictor. There are two ways of generating multiple trees. One is to generate modified training sets by resampling the original training set, and then construct trees. The arcing algorithm is known to be efficient for this. The other is to randomly perturb the working split at each node, choosing from a list of best splits, which is expected to generate reasonably good trees for the original training set. We propose a new combined multiple regression trees predictor which uses the latter multiple regression tree predictor as a predictor based on a modified training set at each stage of arcing. The efficiency of these prediction methods is compared by applying them to high throughput screening of chemical compounds for biological effects.

A New Deletion Criterion of Principal Components Regression with Orientations of the Parameters

  • Lee, Won-Woo
    • Journal of the Korean Statistical Society / v.16 no.2 / pp.55-70 / 1987
  • Principal components regression is one of the substitutes for the least squares method when multicollinearity exists in the multiple linear regression model. It is observed graphically that the performance of principal components regression depends strongly on the values of the parameters. Accordingly, a new deletion criterion, which determines the principal components to be deleted from the analysis, is developed, and its usefulness is checked by simulations.

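
For readers unfamiliar with principal components regression, a minimal sketch follows. It assumes NumPy and deletes the smallest-variance components, which is the conventional rule and not the new deletion criterion this abstract proposes; the function name `pcr_fit` and the synthetic data are hypothetical.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Principal components regression keeping the n_components largest-variance components.

    Returns (intercept, beta) expressed on the original predictors.
    """
    p = X.shape[1]
    if not 1 <= n_components <= p:
        raise ValueError("n_components must be between 1 and the number of predictors")
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    Z = Xc @ V  # scores on the retained components
    gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    beta = V @ gamma  # map component coefficients back to the predictors
    return y_mean - x_mean @ beta, beta


# Synthetic data with two nearly collinear predictors (illustrative only).
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=100), rng.normal(size=100)])
y = X.sum(axis=1) + 0.1 * rng.normal(size=100)

print(pcr_fit(X, y, n_components=2))
```

Retaining all components reproduces the ordinary least squares fit; deleting components trades some bias for lower variance, which is why the choice of which components to delete matters.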

Forecasting Energy Consumption of Steel Industry Using Regression Model (회귀 모델을 활용한 철강 기업의 에너지 소비 예측)

  • Sung-Ho KANG;Hyun-Ki KIM
    • Journal of Korea Artificial Intelligence Association / v.1 no.2 / pp.21-25 / 2023
  • The purpose of this study was to compare the performance of multiple regression models in predicting the energy consumption of the steel industry. Specific independent variables were selected in consideration of the correlation among attributes such as CO2 concentration, NSM, Week Status, Day of week, and Load Type, and preprocessing was performed to address multicollinearity. In data preprocessing, we evaluated linear and nonlinear relationships between the attributes through correlation analysis. In particular, we decided to select variables with high correlation and include appropriate variables in the final model to prevent multicollinearity problems. Among the regression models trained, Boosted Decision Tree Regression showed the best predictive performance. By combining multiple decision trees, the ensemble learning in this model was able to learn complex patterns effectively while preventing overfitting. Consequently, these predictive models are expected to provide important information for improving energy efficiency and management decision-making in the steel industry. In the future, we plan to improve the performance of the model by collecting more data and extending the variables, and to consider applying the model with interactions with external factors taken into account.

A study on Estimation of NO2 concentration by Statistical model (통계모형을 이용한 NO2 농도 예측에 관한 연구)

  • Jang Nan-Sim
    • Journal of Environmental Science International / v.14 no.11 / pp.1049-1056 / 2005
  • The $NO_2$ concentration characteristics of Busan metropolitan city were analysed statistically using hourly $NO_2$ concentration data (1998-2000) collected from the city's air quality monitoring sites. Four representative regions were selected among the air quality monitoring sites of the Ministry of Environment. Concentration data for $NO_2$ and 5 air pollutants, together with data collected at AWS, were used. Both a stepwise multiple regression model and an ARIMA model were adopted to predict $NO_2$ concentrations, and their results were compared with the observed concentrations. While the ARIMA model was useful for predicting the daily variation of the concentration, it was not satisfactory for predicting either rapid or seasonal variation. The multiple regression model gave better estimates of $NO_2$ concentration than the ARIMA model.

A Cost Estimation Model for Highway Projects in Korea

  • Kim, Soo-Yong;Kim, Young-Mok;Luu, Truong-Van
    • Proceedings of the Korean Institute Of Construction Engineering and Management / 2008.11a / pp.922-925 / 2008
  • Many highway projects are under way in Korea. However, owners frequently find that the project cost exceeds the budget and are unable to identify the underlying reasons. The main purpose of this research is to develop cost models for transportation projects in Korea using multiple linear regression (MLR). The data consist of 27 completed transportation projects built from 1991 to 2001. Multiple regression analysis is used to develop a parametric cost estimating model for total budget cost per highway square meter (TBC/$m^2$). The findings of the study indicate that MLR can be applied to highway projects in Korea. There are two major contributions of this research: (1) the identification of transportation parameters as a significant cost driver for transportation costs and (2) the successful development of parametric cost estimating models for transportation projects in Korea.
