• Title/Summary/Keyword: Multi-Variable Regression Method


Optimization of cost and mechanical properties of concrete with admixtures using MARS and PSO

  • Benemaran, Reza Sarkhani; Esmaeili-Falak, Mahzad
    • Computers and Concrete / v.26 no.4 / pp.309-316 / 2020
  • The application of multi-variable adaptive regression splines (MARS) to predicting the long-term compressive strength of concrete with various admixtures is investigated in this study. Compressive strengths were obtained for concrete specimens made from 24 different mix designs using various mineral and chemical admixtures at different curing ages. First, the amounts of fly ash (FA), micro-silica (MS), water-reducing admixture (WRA), coarse and fine aggregates, cement, and water, together with specimen age and compressive strength, were defined as model inputs, and MARS was used to model the compressive strength of the concrete and to identify the parameters that most affect its estimation. Next, the equation proposed by the MARS method was optimized with a particle swarm optimization (PSO) algorithm to obtain a more efficient mix from an economic point of view. The proposed model predicted the compressive strength of concrete with various admixtures with a correlation coefficient of R=0.958 relative to the compressive strengths measured in the laboratory. The final model reduced production cost while maintaining compressive strength by simultaneously reducing the WRA and increasing the FA content and curing time. It was also found that using liquid membrane-forming compounds (LMFC), which cost less than the water spraying method (SWM) and act over a longer period with positive mechanical effects on the final concrete, yielded a product with lower cost and better mechanical properties.
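As a rough illustration of the MARS-plus-PSO workflow described above, the sketch below fits a MARS surrogate (via the third-party py-earth package, assumed installed) to synthetic mix-design data and then runs a small hand-written particle swarm to search for a low-cost mix that still meets a target strength. All column names, unit costs, and the target value are invented for illustration and are not taken from the paper.

```python
# Hypothetical sketch: fit a MARS surrogate for compressive strength, then use a
# hand-rolled PSO to search for a cheap mix that still meets a target strength.
# Column names, costs and the target value are illustrative, not from the paper.
import numpy as np
from pyearth import Earth  # py-earth package, assumed installed

rng = np.random.default_rng(0)

# Synthetic stand-in for the mix-design dataset: [FA, MS, WRA, cement, water, age]
X = rng.uniform([0, 0, 0, 250, 140, 3], [150, 30, 5, 450, 200, 90], size=(200, 6))
y = 0.08 * X[:, 3] - 0.15 * X[:, 4] + 0.2 * X[:, 0] ** 0.5 + 0.4 * np.log(X[:, 5]) \
    + rng.normal(0, 2, 200)                      # toy strength response (MPa)

mars = Earth(max_degree=2).fit(X, y)             # MARS surrogate of strength

unit_cost = np.array([0.03, 0.40, 2.50, 0.10, 0.001, 0.0])   # illustrative costs
target_strength = 40.0

def objective(mix):
    """Mix cost plus a penalty when predicted strength falls below the target."""
    strength = mars.predict(mix.reshape(1, -1))[0]
    return mix @ unit_cost + 1e3 * max(0.0, target_strength - strength)

# Minimal particle swarm optimization over the mix-design bounds.
lo = np.array([0, 0, 0, 250, 140, 3], float)
hi = np.array([150, 30, 5, 450, 200, 90], float)
n, iters, w, c1, c2 = 40, 100, 0.7, 1.5, 1.5
pos = rng.uniform(lo, hi, (n, 6))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
gbest = pbest[pbest_val.argmin()]

for _ in range(iters):
    r1, r2 = rng.random((n, 6)), rng.random((n, 6))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([objective(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()]

print("cheapest feasible mix:", gbest.round(2), "cost:", round(gbest @ unit_cost, 2))
```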

Incremental Ensemble Learning for The Combination of Multiple Models of Locally Weighted Regression Using Genetic Algorithm (유전 알고리즘을 이용한 국소가중회귀의 다중모델 결합을 위한 점진적 앙상블 학습)

  • Kim, Sang Hun; Chung, Byung Hee; Lee, Gun Ho
    • KIPS Transactions on Software and Data Engineering / v.7 no.9 / pp.351-360 / 2018
  • The LWR (Locally Weighted Regression) model is traditionally a lazy learning model: it obtains a prediction for a given input, the query point, as a short-interval regression equation learned by giving higher weights to training samples closer to that query point. We study an incremental ensemble learning approach for LWR, a form of lazy, memory-based learning. The proposed method sequentially generates LWR models over time and integrates them with a genetic algorithm to obtain a solution at a specific query point. A weakness of existing LWR approaches is that multiple LWR models can be generated depending on the indicator function and the selected data samples, and prediction quality varies with the chosen model; however, little research has addressed how to select or combine multiple LWR models. In this study, after generating an initial LWR model from an indicator function and a sample data set, we iterate an evolutionary learning process to obtain a suitable indicator function, and we assess the resulting LWR models on other sample data sets to overcome data-set bias. We adopt an eager learning strategy that gradually generates and stores LWR models as data are generated for each section. To obtain a prediction at a specific point in time, an LWR model is built from newly generated data within a predetermined interval and then combined with the existing LWR models for that section using a genetic algorithm. The proposed method shows better results than selecting among multiple LWR models by simple averaging. Its results are also compared with predictions from multiple regression analysis on real data such as hourly traffic volume in a specific area and hourly sales at a highway rest area.
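The sketch below illustrates the general idea described in the abstract, not the authors' exact procedure: locally weighted regression models are fitted on successive data windows, and a tiny genetic algorithm evolves the weights used to combine their predictions at a query point. Bandwidths, window sizes, and GA settings are assumptions.

```python
# Illustrative sketch only: a Gaussian-kernel locally weighted regression (LWR)
# predictor, plus a tiny genetic algorithm that evolves combination weights for
# several LWR models fitted on different data windows.
import numpy as np

rng = np.random.default_rng(1)

def lwr_predict(x_train, y_train, query, bandwidth=0.5):
    """Weighted least squares around a single query point (local linear model)."""
    w = np.exp(-((x_train - query) ** 2) / (2 * bandwidth ** 2))
    X = np.column_stack([np.ones_like(x_train), x_train])
    W = np.diag(w)
    beta = np.linalg.pinv(X.T @ W @ X) @ X.T @ W @ y_train
    return beta[0] + beta[1] * query

# Toy time-ordered data: three "windows" play the role of sequentially stored models.
x = np.linspace(0, 6, 300)
y = np.sin(x) + rng.normal(0, 0.15, x.size)
windows = [(x[i:i + 100], y[i:i + 100]) for i in (0, 100, 200)]

def ensemble_predict(weights, query):
    preds = [lwr_predict(xs, ys, query) for xs, ys in windows]
    w = np.abs(weights) / (np.abs(weights).sum() + 1e-12)
    return float(np.dot(w, preds))

# Genetic algorithm over combination weights, scored on held-out points.
val_x, val_y = x[::20], y[::20]
def fitness(weights):
    err = [(ensemble_predict(weights, q) - t) ** 2 for q, t in zip(val_x, val_y)]
    return -float(np.mean(err))                                  # higher is better

pop = rng.random((20, len(windows)))
for _ in range(25):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-8:]]                       # elitist selection
    children = []
    while len(children) < len(pop) - len(parents):
        a, b = parents[rng.integers(8, size=2)]
        child = np.where(rng.random(len(a)) < 0.5, a, b)         # uniform crossover
        child += rng.normal(0, 0.1, child.size) * (rng.random(child.size) < 0.2)
        children.append(np.clip(child, 0, None))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("combined prediction at x=3.0:", round(ensemble_predict(best, 3.0), 3))
```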

A Study on the Barrier-Free Space through IPA Method for the Elderly in Multi-family Housing (IPA 분석기법을 통한 공동주택의 무장애공간 인증기준 적합성 분석연구)

  • Kim, Ju-Whan; Kim, Won-Pil
    • Journal of the Korea Convergence Society / v.11 no.1 / pp.187-194 / 2020
  • As people grow older, visual and perceptual impairments and dementia can jeopardize safety and life unless the living environment provides supportive design. Such supportive space is based on the universal design concept, which offers safe and simple use regardless of gender and physical or mental limitations. The purpose of this study was to examine the appropriateness of the barrier-free (BF) standard for seniors living in apartments through IPA. Chi-square analysis found that satisfaction with BF space decreases with age and is lower among women. Regression analysis indicated that the sink was the strongest predictor of satisfaction, while the stair/elevator was the most important variable. The IPA concluded that the sink, bath, shower/locker, and alert/egress were the BF indexes most in need of improvement among the 14 elements, implying that sanitation areas for seniors require careful design.
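For readers unfamiliar with IPA, the toy sketch below shows one common way to compute it: importance as each item's correlation with overall satisfaction, performance as its mean score, followed by the standard quadrant rule. The item names and data are stand-ins for the paper's 14 barrier-free elements, not the study's survey.

```python
# Toy importance-performance analysis (IPA): importance is each item's correlation
# with overall satisfaction, performance is its mean score; items are illustrative.
import numpy as np

rng = np.random.default_rng(2)
items = ["sink", "bath", "shower/locker", "alert/egress", "stair/elevator"]
scores = rng.integers(1, 6, size=(200, len(items))).astype(float)   # 5-point scale
overall = scores.mean(axis=1) + rng.normal(0, 0.3, 200)

performance = scores.mean(axis=0)
importance = np.array([np.corrcoef(scores[:, i], overall)[0, 1]
                       for i in range(len(items))])

# Quadrant rule: high importance + low performance = "concentrate here".
for name, imp, perf in zip(items, importance, performance):
    quadrant = ("concentrate here" if imp >= importance.mean() and perf < performance.mean()
                else "keep up the good work" if imp >= importance.mean()
                else "low priority" if perf < performance.mean()
                else "possible overkill")
    print(f"{name:15s} importance={imp:.2f} performance={perf:.2f} -> {quadrant}")
```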

The Variation of Water Temperature and Turbidity of Stream Flows entering Imha Reservoir (임하호 유입지천의 수온과 탁도 변화)

  • Kim, Woo-Gu; Jung, Kwan-Soo; Yi, Yong-Kon
    • Korean Journal of Ecology and Environment / v.39 no.1 s.115 / pp.13-20 / 2006
  • The changing patterns of water temperature and turbidity in streams entering Imha Reservoir were studied. Turbidity variation near the intake tower of Imha Reservoir was investigated in relation to the variation of water temperature and turbidity in the streams. Water temperature was estimated using a multiple regression method with air temperature and dew point as independent variables. Peak turbidity was estimated using a non-linear regression method with rainfall intensity as the independent variable. Although additional independent variables representing watershed characteristics appear to be needed to improve estimation accuracy, the methodology used in this study can be applied to estimate water temperature and peak turbidity in other streams.
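A minimal sketch of the two regressions named in the abstract, on synthetic data: ordinary least squares for water temperature on air temperature and dew point, and a nonlinear power-law fit of peak turbidity on rainfall intensity. The functional form and coefficients are assumptions, not the study's.

```python
# (1) multiple linear regression of water temperature on air temperature and dew
# point; (2) nonlinear regression of peak turbidity on rainfall intensity.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)

# (1) multiple linear regression via least squares on synthetic data
air = rng.uniform(-5, 30, 120)
dew = air - rng.uniform(2, 8, 120)
water_temp = 2.0 + 0.55 * air + 0.25 * dew + rng.normal(0, 0.8, 120)
A = np.column_stack([np.ones_like(air), air, dew])
coef, *_ = np.linalg.lstsq(A, water_temp, rcond=None)
print("water temp model: T_w = %.2f + %.2f*T_air + %.2f*T_dew" % tuple(coef))

# (2) nonlinear regression for peak turbidity vs rainfall intensity (assumed power law)
rain = rng.uniform(1, 80, 80)                    # mm/hr
peak_ntu = 15.0 * rain ** 1.3 + rng.normal(0, 50, 80)

def power_law(x, a, b):
    return a * x ** b

(a, b), _ = curve_fit(power_law, rain, peak_ntu, p0=[1.0, 1.0])
print("peak turbidity model: NTU = %.2f * I^%.2f" % (a, b))
```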

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae; Kang, Jungseok
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.1-32 / 2018
  • Corporate defaults have a ripple effect on the local and national economy, in addition to affecting stakeholders such as managers, employees, creditors, and investors of the bankrupt companies. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a single default prediction model rather than developing a variety of corporate default models; as a result, even large 'chaebol' corporations went bankrupt. Even afterwards, analyses of corporate defaults focused on specific variables, and when the government restructured companies immediately after the global financial crisis, it concentrated on a few main variables such as the debt ratio. A multifaceted study of corporate default prediction models is essential to protect diverse interests and to avoid situations like the Lehman Brothers case of the global financial crisis, in which everything collapses in a single moment. The key variables driving corporate defaults change over time: comparing Beaver's (1967, 1968) and Altman's (1968) analyses with Deakin's (1972) study shows that the major factors affecting corporate failure have shifted, and Grice (2001) likewise found changes in the importance of the predictive variables in Zmijewski's (1984) and Ohlson's (1980) models. However, past studies use static models, and most do not consider changes that occur over time. To construct consistent prediction models, it is therefore necessary to compensate for this time-dependent bias with a time-series algorithm that reflects dynamic change. Motivated by the global financial crisis, which had a significant impact on Korea, this study uses ten years of annual corporate data from 2000 to 2009. The data are divided into training, validation, and test sets covering seven, two, and one year, respectively. To build a bankruptcy model that remains consistent as time changes, we first train a deep learning time-series model on the data before the financial crisis (2000-2006). Parameter tuning of the existing models and of the deep learning time-series algorithm is performed on validation data covering the financial crisis period (2007-2008); the resulting model shows patterns similar to the training data and strong predictive power. Each bankruptcy prediction model is then re-estimated on the combined training and validation data (2000-2008) using the optimal parameters found in validation. Finally, the corporate default prediction models trained over these nine years are evaluated and compared on the test data (2009), demonstrating the usefulness of the deep learning time-series approach. In addition, by adding Lasso regression to the existing variable-selection methods (multiple discriminant analysis and the logit model), we show that the deep learning time-series model based on these three variable sets is useful for robust corporate default prediction. The definition of bankruptcy is the same as in Lee (2015). Independent variables include financial information such as the financial ratios used in previous studies, and multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable groups.
The multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time-series machine learning algorithms, and deep learning time-series algorithms are compared. Corporate data present the limitations of nonlinear variables, multicollinearity, and lack of data. The logit model handles nonlinearity, the Lasso regression model addresses the multicollinearity problem, and the deep learning time-series algorithm, combined with a variable-data generation method, compensates for the lack of data. Big-data technology is moving from simple human analysis to automated AI analysis and, eventually, to intertwined AI applications. Although research on corporate default prediction models using time-series algorithms is still at an early stage, the deep learning algorithm builds default prediction models much faster than regression analysis and is more effective in predictive power. With the Fourth Industrial Revolution, the Korean government and governments overseas are working hard to integrate such systems into everyday life, yet deep learning time-series research for the financial industry remains insufficient. As an initial study of deep learning time-series analysis of corporate defaults, this work is intended to serve as comparative reference material for non-specialists who begin to combine financial data with deep learning time-series algorithms.
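The sketch below mirrors the general setup described above under stated assumptions: synthetic firm-year panels are turned into fixed-length sequences of financial ratios, split into three periods in the spirit of the 2000-2006 / 2007-2008 / 2009 division, and fed to a small Keras LSTM that outputs a default probability. Features, sample sizes, and hyperparameters are invented for illustration.

```python
# Rough sketch: sequences of financial ratios -> LSTM -> default probability,
# with train/validation/test sets standing in for the paper's temporal split.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(4)
n_firms, seq_len, n_feat = 500, 3, 8          # 3-year windows of 8 ratios each

def make_split(n):
    X = rng.normal(0, 1, (n, seq_len, n_feat))
    # toy rule: a high leverage-like feature in the last year raises default risk
    p = 1 / (1 + np.exp(-(X[:, -1, 0] * 1.5 - 1.0)))
    y = (rng.random(n) < p).astype("float32")
    return X.astype("float32"), y

X_train, y_train = make_split(n_firms)        # stands in for 2000-2006
X_val,   y_val   = make_split(150)            # stands in for 2007-2008
X_test,  y_test  = make_split(80)             # stands in for 2009

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(seq_len, n_feat)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=10, batch_size=32, verbose=0)
print("test AUC:", model.evaluate(X_test, y_test, verbose=0)[1])
```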

Variable Selection for Multi-Purpose Multivariate Data Analysis (다목적 다변량 자료분석을 위한 변수선택)

  • Huh, Myung-Hoe; Lim, Yong-Bin; Lee, Yong-Goo
    • The Korean Journal of Applied Statistics / v.21 no.1 / pp.141-149 / 2008
  • We now frequently analyze multivariate data with a quite large number of variables. In such data sets, virtually duplicated variables may coexist even though they are conceptually distinguishable. Duplicated variables can cause problems such as the distortion of principal axes in principal component analysis and factor analysis, and the distortion of the distances between observations that serve as input to cluster analysis. In supervised learning and regression analysis, duplicated explanatory variables also often make fitted models unstable. Since real data analyses often serve multiple purposes, it is necessary to reduce the number of variables to a parsimonious level. The aim of this paper is to propose a practical algorithm for selecting a subset of variables from a given set of p input variables, using the criterion of minimizing the trace of the partial variances of the unselected variables left unexplained by the selected variables. The usefulness of the proposed method is demonstrated in visualizing the relationship between selected and unselected variables, in building a predictive model with a very large number of independent variables, and in reducing the number of variables and purging or merging categories in categorical data.
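The sketch below gives one greedy, forward-selection reading of the stated criterion: at each step it adds the variable that most reduces the trace of the partial (conditional) covariance of the still-unselected variables given the selected ones, computed as a Schur complement. It is an illustrative interpretation, not the authors' exact algorithm.

```python
# Greedy variable selection by minimizing the trace of Cov(unselected | selected).
import numpy as np

def residual_trace(S, selected, remaining):
    """Trace of Cov(remaining | selected) via the Schur complement."""
    if not selected:
        return float(np.trace(S[np.ix_(remaining, remaining)]))
    A = S[np.ix_(selected, selected)]
    C = S[np.ix_(remaining, selected)]
    R = S[np.ix_(remaining, remaining)] - C @ np.linalg.pinv(A) @ C.T
    return float(np.trace(R))

def select_variables(X, k):
    S = np.cov(X, rowvar=False)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best = min(remaining,
                   key=lambda j: residual_trace(S, selected + [j],
                                                [r for r in remaining if r != j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Demo: variables 3 and 4 nearly duplicate variables 0 and 1, so the criterion
# should avoid keeping both copies.
rng = np.random.default_rng(5)
base = rng.normal(0, 1, (300, 3))
X = np.column_stack([base, base[:, 0] + 0.05 * rng.normal(size=300),
                     base[:, 1] + 0.05 * rng.normal(size=300)])
print("selected variable indices:", select_variables(X, 3))
```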

Steel Plate Faults Diagnosis with S-MTS (S-MTS를 이용한 강판의 표면 결함 진단)

  • Kim, Joon-Young; Cha, Jae-Min; Shin, Junguk; Yeom, Choongsub
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.47-67 / 2017
  • Steel plate faults are one of the important factors affecting the quality and price of steel plates. To date, many steelmakers have relied on visual inspection based on an inspector's intuition or experience: the inspector checks for faults by looking at the surface of the plates. However, the accuracy of this method is so low that it can lead to judgment errors of more than 30%, so an accurate steel plate fault diagnosis system has long been required in the industry. To meet this need, this study proposes a new steel plate fault diagnosis system using Simultaneous MTS (S-MTS), an advanced Mahalanobis Taguchi System (MTS) algorithm, to classify various surface defects of steel plates. MTS has generally been used for binary classification problems in various fields, but it has not been used for multi-class classification because of its low accuracy in that setting, since only one Mahalanobis space is established in MTS. In contrast, S-MTS is suitable for multi-class classification because it establishes an individual Mahalanobis space for each class; 'simultaneous' refers to comparing the Mahalanobis distances at the same time. The proposed diagnosis system was developed in four main stages. In the first stage, after the reference groups and related variables are defined, steel plate fault data are collected and used to establish an individual Mahalanobis space for each reference group and to construct the full measurement scale. In the second stage, the Mahalanobis distances of the test groups are calculated from the established Mahalanobis spaces of the reference groups, and the appropriateness of the spaces is verified by examining the separability of the Mahalanobis distances. In the third stage, orthogonal arrays and dynamic-type signal-to-noise (SN) ratios are applied for variable optimization, and the overall SN ratio gain is derived from the SN ratio and SN ratio gain; if the overall SN ratio gain is negative, the variable should be removed, while a variable with a positive gain may be worth keeping. Finally, in the fourth stage, the measurement scale composed of the selected useful variables is reconstructed, and an experimental test verifies the multi-class classification ability and yields the classification accuracy; if the accuracy is acceptable, the diagnosis system can be used for future applications. This study also compares the accuracy of the proposed system with that of other popular classification algorithms, including Decision Tree, Multi-Layer Perceptron Neural Network (MLPNN), Logistic Regression (LR), Support Vector Machine (SVM), Tree Bagger Random Forest, Grid Search (GS), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO). The steel plate fault dataset used in the study is taken from the University of California, Irvine (UCI) machine learning repository. The proposed S-MTS-based diagnosis system achieves 90.79% classification accuracy, 6-27% higher than MLPNN, LR, GS, GA, and PSO. Given that the accuracy of commercial systems is only about 75-80%, the proposed system has sufficient classification performance to be applied in industry.
In addition, the proposed system can reduce the number of measurement sensors installed in the field thanks to the variable optimization process. These results show that the proposed system not only performs well at steel plate fault diagnosis but can also reduce operation and maintenance costs. In future work, it will be applied in the field to validate its actual effectiveness, and the accuracy will be improved based on those results.
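A much-simplified sketch of the S-MTS classification idea on synthetic data: one Mahalanobis space (mean vector and inverse covariance) is built per class from its reference samples, and a test sample is assigned to the class with the smallest Mahalanobis distance. The orthogonal-array / SN-ratio variable-screening stage described above is omitted here.

```python
# Per-class Mahalanobis spaces, minimum-distance classification (simplified S-MTS idea).
import numpy as np

rng = np.random.default_rng(6)

def build_space(samples):
    mean = samples.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(samples, rowvar=False))
    return mean, cov_inv

def mahalanobis(x, mean, cov_inv):
    d = x - mean
    return float(d @ cov_inv @ d)

# Synthetic stand-in for three fault classes with 6 surface-measurement features.
classes = {c: rng.normal(loc=c * 2.0, scale=1.0, size=(80, 6)) for c in range(3)}
spaces = {c: build_space(s) for c, s in classes.items()}

def classify(x):
    return min(spaces, key=lambda c: mahalanobis(x, *spaces[c]))

test = rng.normal(loc=2.0, scale=1.0, size=6)       # drawn near class 1's centre
print("predicted class:", classify(test))
```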

Exposure Assessment and Asbestosis Pulmonum among Inhabitants near Abandoned Asbestos Mines Using Deposited Dust (폐석면광산 주변 지역의 주택 침적먼지의 석면 검출과 석면폐증의 관련성)

  • Ahn, Hoki; Yang, Wonho; Hwangbo, Young; Lee, Yong Jin
    • Journal of Environmental Health Sciences / v.41 no.6 / pp.369-379 / 2015
  • Objectives: The lack of reliable information on environmental pollution and health impacts related to asbestos contamination from abandoned mines has drawn attention to the need for a community health study. This study evaluated asbestos-related health symptoms among residents near abandoned asbestos mines located in the Chungcheong Provinces. In addition, an exposure assessment for asbestos is needed even though the exposure occurred in the past. Methods: Past exposure to asbestos among inhabitants near the abandoned asbestos mines was estimated from surface samples of dust deposited inside and outside residences. A total of 54 participants were divided into two groups, with (34 cases) and without (20 controls) asbestos-related disease. Surface sampling of deposited dust was carried out inside and outside residences, collecting 105 samples. The deposited dust was analyzed by polarized light microscopy (PLM) and scanning electron microscopy with energy-dispersive X-ray spectrometry (SEM-EDX) to detect asbestos. The elemental composition of the asbestos-containing dust was then analyzed by SEM-EDX to assess the contribution of sources such as the abandoned mines, slate, and soil. Results: Among the 105 samples, asbestos was detected by PLM at 29 (27.6%) sampling points and by SEM at 56 (48.6%) sampling points. Inside residences, asbestos was detected by PLM at four sampling points and by SEM at 12 sampling points; indoor detection may be due to air exchange between indoors and outdoors and indicates long-term exposure. The asbestos detection rate outside residences was higher in the case group than in the control group, which suggests that the case group had higher asbestos exposure and that exposure has continued in the control group as well as the case group. Conclusion: Past residential asbestos exposure may be associated with asbestosis among local residents near abandoned asbestos mines. Odds ratios for asbestos detection outside residences were calculated by logistic regression analysis; the odds ratio between asbestos detection (by PLM) and asbestosis pulmonum was 3.36 (95% CI 0.90-12.53, p=0.072), adjusting for age, sex, smoking status, and work history with multi-variable logistic regression.
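The adjusted odds ratio reported above is the kind of quantity the snippet below computes on synthetic data (made larger than the study's 54 participants so the toy fit stays stable): a logistic regression of case status on detection, age, sex, and smoking, with the odds ratio taken as exp(coefficient) using statsmodels. The covariate effects are invented.

```python
# Adjusted odds ratio from a multi-variable logistic regression on synthetic data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 200
detected = rng.integers(0, 2, n)
age = rng.integers(55, 85, n).astype(float)
sex = rng.integers(0, 2, n)
smoker = rng.integers(0, 2, n)
logit_p = -2.0 + 1.2 * detected + 0.03 * (age - 70) + 0.2 * smoker
case = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

X = sm.add_constant(np.column_stack([detected, age, sex, smoker]))
fit = sm.Logit(case, X).fit(disp=0)

or_detected = np.exp(fit.params[1])                 # coefficient on "detected"
ci_low, ci_high = np.exp(fit.conf_int()[1])
print(f"adjusted OR for detection: {or_detected:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```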

The Impact of Medical Utilization on Subjective Health and Happiness Index and Quality of Life according to the Economic Level of the Elderly (노인의 경제적 수준에 따른 의료이용이 주관적 건강수준과 행복감 지수 및 삶의 질에 미치는 영향)

  • So, Kwon-Seob; Hwang, Hye-Jeong; Kim, Eun-Mi
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.3 / pp.544-552 / 2019
  • The purpose of this study was to identify concrete measures to improve the subjective health level, happiness, and quality of life of the elderly according to economic level, and to propose corresponding social and policy alternatives. Using the Community Health Survey (Indicator Bank) _v09, 63,929 elderly people aged 65 or older were analyzed; frequency analysis, chi-square tests, and independent t-tests were used to compare medical utilization, subjective health level, happiness, and quality of life by economic level. Multivariate logistic regression was performed with subjective health level as the dependent variable, and multiple linear regression was used to identify the factors affecting happiness and quality of life. The results are as follows. Medical-aid recipients used medical services less than non-recipients and tended to have lower education levels, to be women aged 75 or older, and to report less stress. Current or past recipients were more likely to report poor subjective health than non-recipients, while non-recipients showed higher happiness and quality of life. Accordingly, alternatives are needed to increase opportunities for medical utilization among recipients, with particular attention to women and to elderly people over 75. These findings are expected to serve as basic data for effectively improving the health promotion, happiness, and quality of life of low-income elderly people.
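As a generic illustration of the two model types named in the abstract (not the study's actual analysis), the snippet below fits a logistic regression with poor subjective health as the outcome and an OLS regression for quality of life, both on invented recipient-status, age, and sex variables.

```python
# Toy logistic regression (subjective health) and OLS (quality of life) with
# invented stand-in variables for the Community Health Survey fields.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(8)
n = 1000
df = pd.DataFrame({
    "recipient": rng.integers(0, 2, n),              # 1 = medical-aid recipient
    "age75": rng.integers(0, 2, n),                  # 1 = aged 75 or older
    "female": rng.integers(0, 2, n),
})
df["poor_health"] = (rng.random(n) <
                     1 / (1 + np.exp(-(-1.0 + 0.8 * df.recipient + 0.4 * df.age75)))).astype(int)
df["qol"] = 0.75 - 0.08 * df.recipient - 0.05 * df.age75 + rng.normal(0, 0.1, n)

logit_fit = smf.logit("poor_health ~ recipient + age75 + female", data=df).fit(disp=0)
ols_fit = smf.ols("qol ~ recipient + age75 + female", data=df).fit()
print(logit_fit.params.round(3))
print(ols_fit.params.round(3))
```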

Development of Demand Forecasting Model for Public Bicycles in Seoul Using GRU (GRU 기법을 활용한 서울시 공공자전거 수요예측 모델 개발)

  • Lee, Seung-Woon; Kwahk, Kee-Young
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.1-25 / 2022
  • After the first confirmed Covid-19 case in Korea in January 2020, interest in personal transportation such as public bicycles, rather than public transportation such as buses and subways, increased, and demand for 'Ddareungi', the public bicycle service operated by the Seoul Metropolitan Government, also grew. In this study, a GRU (Gated Recurrent Unit) demand prediction model was developed based on hourly public bicycle rental histories in Seoul (2019-2021). The usefulness of the GRU approach was verified on the rental history around Exit 1 of Yeouido, Yeongdeungpo-gu, Seoul, and it was compared with multiple linear regression models and recurrent neural network models under the same conditions. In addition to weather factors, the Seoul living population was used as an input variable when developing the model. MAE and RMSE were used as performance indicators, demonstrating the usefulness of the proposed GRU model. The proposed GRU model showed higher prediction accuracy than the traditional multiple linear regression model and than the recently highlighted LSTM and Conv-LSTM models, and it was also faster than the LSTM and Conv-LSTM models. By predicting demand for public bicycles in Seoul more quickly and accurately, this study can help address the bicycle relocation problem in the future.
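A minimal GRU sketch on synthetic hourly rental counts, loosely following the setup above: sliding 24-hour windows predict the next hour's demand, and MAE and RMSE are reported as in the abstract. The window length, layer sizes, and the synthetic series are assumptions, not the paper's configuration.

```python
# GRU demand-forecasting sketch: sliding windows of a synthetic hourly series.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(9)
hours = np.arange(24 * 90)                                   # ~3 months of hours
demand = 50 + 30 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, hours.size)

win = 24
X = np.stack([demand[i:i + win] for i in range(len(demand) - win)])[..., None]
y = demand[win:]
split = int(0.8 * len(X))
X_train, y_train, X_test, y_test = X[:split], y[:split], X[split:], y[split:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(win, 1)),
    tf.keras.layers.GRU(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse",
              metrics=[tf.keras.metrics.MeanAbsoluteError(name="mae"),
                       tf.keras.metrics.RootMeanSquaredError(name="rmse")])
model.fit(X_train, y_train, epochs=5, batch_size=64, verbose=0)
_, mae, rmse = model.evaluate(X_test, y_test, verbose=0)
print(f"test MAE={mae:.2f}, RMSE={rmse:.2f}")
```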