• Title/Summary/Keyword: Linear Regression Algorithm


Bootstrap Estimation for GEE Models (일반화추정방정식(GEE)에 대한 부스트랩의 적용)

  • Park, Chong-Sun;Jeon, Yong-Moon
    • The Korean Journal of Applied Statistics / v.24 no.1 / pp.207-216 / 2011
  • The bootstrap is a resampling technique for estimating parameters or evaluating an estimate. It has been used to estimate parameters in the linear model (LM) and the generalized linear model (GLM). In this paper, we explore the possibility of applying the three bootstrap schemes most widely used in LM and GLM (bootstrapping residuals, bootstrapping pairs, and bootstrapping an estimating equation) to the generalized estimating equation (GEE) algorithm for modelling repeatedly measured regression data. We compared the three bootstrap methods with the coefficient and standard error estimates of GEE models on one simulated and one real data set. Overall, the estimates obtained from the bootstrap methods are quite comparable, except that the estimates from bootstrapping pairs differ somewhat from the others. We conjecture that this behavior of bootstrapping pairs comes from the inconsistency of those estimates. However, a more thorough simulation study is needed before generalizing, since these results come from only two small data sets.
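As a rough illustration of two of the resampling schemes named in this abstract, the sketch below applies bootstrapping pairs and bootstrapping residuals to an ordinary simple linear regression, not to a GEE fit. The synthetic data, the function names, and the choice of 500 replicates are all assumptions made for illustration and do not come from the paper.

```python
import random
import statistics

def ols(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def bootstrap_pairs(x, y, n_boot, rng):
    """Resample (x, y) pairs with replacement and refit each time."""
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:  # degenerate resample: slope is undefined
            continue
        slopes.append(ols(xs, ys)[1])
    return slopes

def bootstrap_residuals(x, y, n_boot, rng):
    """Keep x fixed, resample residuals of the original fit, and refit."""
    a, b = ols(x, y)
    fitted = [a + b * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(n_boot):
        y_star = [fi + rng.choice(residuals) for fi in fitted]
        slopes.append(ols(x, y_star)[1])
    return slopes

# Synthetic data (hypothetical): y = 2 + 0.5 x + Gaussian noise.
rng = random.Random(0)
x = [float(i) for i in range(30)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

slopes_pairs = bootstrap_pairs(x, y, 500, rng)
slopes_resid = bootstrap_residuals(x, y, 500, rng)
se_pairs = statistics.stdev(slopes_pairs)
se_resid = statistics.stdev(slopes_resid)
print(f"slope SE (pairs): {se_pairs:.4f}, (residuals): {se_resid:.4f}")
```

The standard deviation of the replicated slopes serves as the bootstrap standard error. For repeated-measures data of the kind GEE targets, resampling would normally be done per subject rather than per observation, which this sketch does not attempt.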

Prediction Models to Control Pro-chlorination in Water Treatment Plant (정수장 후염소 공정제어를 위한 예측모델 개발)

  • Shin, Gang-Wook;Lee, Kyung-Hyuk
    • Journal of Korean Society of Water and Wastewater / v.22 no.2 / pp.213-218 / 2008
  • Prediction models for post-chlorination require detailed information on reaction time and on chlorine dosage adjusted for flow rate, as well as on environmental conditions such as turbidity, temperature, and pH. To operate the post-chlorination process effectively, the correlations between the inlet and outlet of the clear well were investigated in order to develop prediction models of chlorine dosage. Correlations between environmental conditions, including turbidity, and chlorine dosage were investigated to predict residual chlorine at the outlet of the clear well. A linear regression model and an autoregressive model were developed for the post-chlorination process, which is subject to a time delay caused by detention in the clear well tank. The autoregressive model shows correlations of 0.915 to 0.995. Consequently, the autoregressive model developed in this study would be applicable to real-time control of the post-chlorination process. Because it accounts for the time delay and the multiple parameters of the control system, it would contribute to water treatment automation when applied in the process control algorithm.
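The abstract does not give the order or inputs of its autoregressive model, so the following is only a minimal sketch of the general idea: a first-order autoregressive model, y[t] = c + phi * y[t-1], fitted by least squares to a synthetic residual-chlorine series. The series, the coefficient values, and the function names are hypothetical.

```python
import random

def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mp, mc = sum(prev) / n, sum(curr) / n
    sxx = sum((p - mp) ** 2 for p in prev)
    sxy = sum((p - mp) * (q - mc) for p, q in zip(prev, curr))
    phi = sxy / sxx
    return mc - phi * mp, phi

def forecast(last_value, c, phi, steps):
    """Iterate the fitted recursion forward for the given number of steps."""
    out = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        out.append(value)
    return out

# Synthetic outlet residual-chlorine series (made-up values, arbitrary units).
rng = random.Random(1)
true_c, true_phi = 0.15, 0.7
chlorine = [0.5]
for _ in range(400):
    chlorine.append(true_c + true_phi * chlorine[-1] + rng.gauss(0.0, 0.02))

c_hat, phi_hat = fit_ar1(chlorine)
preds = forecast(chlorine[-1], c_hat, phi_hat, 3)
print(f"c = {c_hat:.3f}, phi = {phi_hat:.3f}, next 3 steps = {preds}")
```

A model meant for control would likely add lagged inputs such as dosage, flow rate, and turbidity, which requires a multivariate least-squares fit rather than the single-predictor formula used here.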

Implementation of Smart Ventilation Control System using IoT and Machine Learning (IoT와 기계학습을 이용한 스마트 환풍기 제어 시스템 구현)

  • Lee, Hui-Eun;Choi, Jin-ku
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.283-287 / 2020
  • In this paper, we implemented an IoT-based control system for ventilation. The system can be switched on and off, and its current status monitored, through a smartphone app. We applied linear regression, a machine learning algorithm. The system autonomously collects temperature and humidity data in the home and diagnoses the system status. The proposed control method can improve energy efficiency, and it is expected to be useful for both energy saving and convenience.

Modeling of Photovoltaic Power Systems using Clustering Algorithm and Modular Networks (군집화 알고리즘 및 모듈라 네트워크를 이용한 태양광 발전 시스템 모델링)

  • Lee, Chang-Sung;Ji, Pyeong-Shik
    • The Transactions of the Korean Institute of Electrical Engineers P / v.65 no.2 / pp.108-113 / 2016
  • Real-world problems usually show nonlinear and multivariate characteristics, so it is difficult to establish concrete mathematical models for them. Data-driven modeling techniques are therefore commonly used in such cases. The most widely adopted are regression models and intelligent models such as neural networks. Regression models have the drawback of lower performance when the relationship between input and output data is strongly nonlinear. Intelligent models have been shown to be superior to linear models because they can effectively estimate the desired output for both linear and nonlinear problems. This paper proposes a method for modeling daily photovoltaic power systems using modular networks based on the ELM (Extreme Learning Machine). Rather than a single model, the proposed method uses sub-models obtained by fuzzy clustering, each implemented by an ELM. To show the effectiveness of the proposed method, we performed various experiments on a dataset acquired during 2014 at a real plant.

Shadow Economy, Corruption and Economic Growth: An Analysis of BRICS Countries

  • NGUYEN, Diep Van;DUONG, My Tien Ha
    • The Journal of Asian Finance, Economics and Business / v.8 no.4 / pp.665-672 / 2021
  • The paper examines the impact of the shadow economy and corruption, along with public expenditure, trade openness, foreign direct investment (FDI), inflation, and tax revenue, on the economic growth of the BRICS countries. Data were collected from the World Bank, Transparency International, and the Heritage Foundation over the 1991-2017 period. Bayesian linear regression is used to examine whether the shadow economy, corruption, and the other indicators affect the economic growth of the countries studied. The paper applies the normal prior suggested by Lemoine (2019), and the posterior distribution is simulated with the Markov chain Monte Carlo (MCMC) technique using the Gibbs sampling algorithm. The results indicate that public expenditure and trade openness can enhance the BRICS countries' economic growth, with positive-impact probabilities of 75.69% and 67.11%, respectively. FDI, inflation, and tax revenue also positively affect growth, though the probability of a positive effect is ambiguous, ranging from 51.13% to 56.36%. The major finding is that the shadow economy and control of corruption have a positive effect on the economic growth of the BRICS countries. However, the posterior probabilities for these two factors are only 62.23% and 65.25%, respectively, which suggests that the probability of a positive effect is not high.
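To make the "probability of a positive effect" in this abstract concrete, here is a minimal sketch of a Gibbs sampler for a one-predictor Bayesian linear regression. It assumes a generic zero-mean normal prior on each coefficient and an inverse-gamma prior on the noise variance; the specific prior of Lemoine (2019), the paper's data, and its model specification are not reproduced. All names, hyperparameters, and the synthetic data are hypothetical.

```python
import random

def gibbs_linear(x, y, n_iter, burn_in, rng,
                 prior_sd=10.0, ig_shape=2.0, ig_rate=1.0):
    """Gibbs sampler for y = b0 + b1*x + noise.

    Priors (assumed for illustration): b0, b1 ~ Normal(0, prior_sd^2),
    noise variance ~ Inverse-Gamma(ig_shape, ig_rate).
    Returns the post-burn-in draws of the slope b1.
    """
    n = len(x)
    design = [[1.0, xi] for xi in x]
    beta = [0.0, 0.0]
    sigma2 = 1.0
    draws = []
    for it in range(n_iter):
        # Update each coefficient from its univariate normal conditional.
        for j in range(2):
            other = 1 - j
            resid = [yi - beta[other] * row[other]
                     for yi, row in zip(y, design)]
            precision = (sum(row[j] ** 2 for row in design) / sigma2
                         + 1.0 / prior_sd ** 2)
            mean = (sum(row[j] * r for row, r in zip(design, resid))
                    / sigma2 / precision)
            beta[j] = rng.gauss(mean, precision ** -0.5)
        # Update the noise variance from its inverse-gamma conditional.
        ssr = sum((yi - beta[0] - beta[1] * xi) ** 2 for xi, yi in zip(x, y))
        shape = ig_shape + n / 2.0
        rate = ig_rate + ssr / 2.0
        # gammavariate takes (shape, scale), so scale = 1 / rate.
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if it >= burn_in:
            draws.append(beta[1])
    return draws

# Synthetic data (made up) with a clearly positive slope.
rng = random.Random(42)
x = [rng.uniform(-1.0, 1.0) for _ in range(50)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]

draws = gibbs_linear(x, y, n_iter=2000, burn_in=500, rng=rng)
prob_positive = sum(d > 0 for d in draws) / len(draws)
print(f"P(slope > 0 | data) is approximately {prob_positive:.3f}")
```

The posterior probability of a positive effect is simply the share of retained draws in which the coefficient is above zero. Updating one coefficient at a time keeps the sketch free of matrix algebra; a block update of all coefficients from a multivariate normal is the more common choice when there are many predictors.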

Business Intelligence Design for Strategic Decision Making for Small and Midium-size E-Commerce Sellers: Focusing on Promotion Strategy (중소 전자상거래 판매상의 전략적 의사결정을 위한 비즈니스 인텔리전스 설계: 프로모션 전략을 중심으로)

  • Seung-Joo Lee;Young-Hyun Lee;Jin-Hyun Lee;Kang-Hyun Lee;Kwang-Sup Shin
    • The Journal of Bigdata / v.8 no.2 / pp.201-222 / 2023
  • As platform-based e-commerce grows, many small and medium-sized sellers have tried to develop more effective strategies to maximize profit. To increase profitability, it is important to make strategic decisions about the range of promotion, the discount rate, and the product categories. This research aims to develop a business intelligence application that helps sellers on e-commerce platforms make better decisions. Deciding whether or not to run a promotion requires predicting how much sales will increase after the promotion. In this research, we applied various machine learning algorithms: MLP (Multi-Layer Perceptron), Gradient Boosting Regression, Random Forest, and Linear Regression. Because of the complexity of the data structure and the distinctive characteristics of the product categories, Random Forest and MLP showed the best performance. The proposed approach appears applicable to supporting small and medium-sized sellers in reacting to market changes and making reasonable decisions based on data rather than on their own experience.

Estimating excess post-exercise oxygen consumption using multiple linear regression in healthy Korean adults: a pilot study

  • Jung, Won-Sang;Park, Hun-Young;Kim, Sung-Woo;Kim, Jisu;Hwang, Hyejung;Lim, Kiwon
    • Korean Journal of Exercise Nutrition / v.25 no.1 / pp.35-41 / 2021
  • [Purpose] This pilot study aimed to develop a regression model to estimate the excess post-exercise oxygen consumption (EPOC) of Korean adults using various easy-to-measure independent variables. [Methods] The EPOC and the independent variables for its estimation (e.g., sex, age, height, weight, body mass index, fat-free mass [FFM], fat mass, % body fat, and heart rate_sum [HR_sum]) were measured in 75 healthy adults (31 males, 44 females). Statistical analysis was performed to develop an EPOC estimation regression model using the stepwise regression method. [Results] We confirmed that FFM and HR_sum were important variables in the EPOC regression models of the various exercise types. The explanatory power and mean standard error of estimate (SEE) for each exercise type were as follows: for the continuous exercise (CEx) model, R² was 86.3%, adjusted R² was 85.9%, and the mean SEE was 11.73 kcal; for the interval exercise (IEx) model, R² was 83.1%, adjusted R² was 82.6%, and the mean SEE was 13.68 kcal; for the accumulation of short-duration exercise (AEx) model, R² was 91.3%, adjusted R² was 91.0%, and the mean SEE was 27.71 kcal. There was no significant difference between the EPOC measured using a metabolic gas analyzer and the predicted EPOC for any exercise type. [Conclusion] This pilot study developed regression models to estimate EPOC in healthy Korean adults. The regression models were as follows: CEx = -37.128 + 1.003 × (FFM) + 0.016 × (HR_sum), IEx = -49.265 + 1.442 × (FFM) + 0.013 × (HR_sum), and AEx = -100.942 + 2.209 × (FFM) + 0.020 × (HR_sum).
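The three regression equations reported in this abstract can be evaluated directly. The sketch below copies the published coefficients; the function name, the dictionary layout, and the example input values are hypothetical, and the abstract does not state the units of FFM or HR_sum, so the inputs here are placeholders rather than realistic measurements.

```python
# Coefficients copied from the abstract: (intercept, FFM weight, HR_sum weight).
EPOC_MODELS = {
    "CEx": (-37.128, 1.003, 0.016),   # continuous exercise
    "IEx": (-49.265, 1.442, 0.013),   # interval exercise
    "AEx": (-100.942, 2.209, 0.020),  # accumulation of short-duration exercise
}

def estimate_epoc(exercise_type, ffm, hr_sum):
    """Evaluate the reported linear model for one exercise type."""
    intercept, w_ffm, w_hr = EPOC_MODELS[exercise_type]
    return intercept + w_ffm * ffm + w_hr * hr_sum

# Placeholder inputs chosen only to exercise the formula.
example = {name: estimate_epoc(name, ffm=50.0, hr_sum=5000.0)
           for name in EPOC_MODELS}
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because the models were fitted on 75 healthy Korean adults in a pilot study, estimates for inputs outside the measured range should be treated with caution.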

Estimation of VOCs Affecting a Used Car Air Conditioning Smell via PLSR (부분최소자승법을 이용한 중고차 에어컨냄새 원인물질 추정)

  • You, Hanmin;Lee, Taehee;Sung, Kiwoo
    • Transactions of the Korean Society of Automotive Engineers / v.21 no.6 / pp.175-182 / 2013
  • Lately, customers place a high value on emotional satisfaction, and as a result odor issues have become a matter of concern. Typical cases are the odor of vehicle interior materials and of the air conditioner. In particular, regarding air-conditioner odor, customers have strongly claimed defects with provocative comments: "It smells like something rotten," "It smells like a foot odor," "It stinks like a rag." Generally, it is known that mold on the evaporator core of the air-conditioning system decays and produces VOCs, which cause the odor. In this study, a partial least squares regression (PLSR) model is applied to predict the strength of the odor and to select the important VOCs that affect car air-conditioning smell. The PLS method is essentially a multilinear regression algorithm that can handle correlated inputs and limited data. The number of latent variables is determined by the point at which the mean absolute deviations of the VOC data stabilize. Multiple linear regression is also carried out to confirm the validity of the PLS method.

Sequential prediction of TBM penetration rate using a gradient boosted regression tree during tunneling

  • Lee, Hang-Lo;Song, Ki-Il;Qi, Chongchong;Kim, Kyoung-Yul
    • Geomechanics and Engineering / v.29 no.5 / pp.523-533 / 2022
  • Several prediction models for the penetration rate (PR) of tunnel boring machines (TBMs) have focused on application at the design stage. In the construction stage, however, the expected PR and its trends change during tunneling owing to TBM excavation skills and the gap between the investigated and actual geological conditions. Monitoring the PR during tunneling is crucial for rescheduling the excavation plan in real time. This study proposes a sequential prediction method applicable in the construction stage. Geological and TBM operating data were collected from the Gunpo cable tunnel in Korea and preprocessed through normalization and augmentation. The results show that sequential prediction with a unit prediction distance (UPD) of 1 ring achieves R²≥0.79, whereas one-step prediction achieves R²≤0.30. As the modeling algorithm, a gradient boosted regression tree (GBRT) outperformed least-squares linear regression in the sequential prediction method. For practical use, a simple equation relating R² and UPD is proposed. As UPD increases, R² decreases exponentially; in particular, the UPD at R²=0.60 is calculated as 28 rings using the equation. Such an interval will provide enough time for decision-making. The UPD can be adjusted depending on the project and the R² value targeted by the operator. Therefore, the calculation process for the equation between R² and UPD is also described.

Dynamical Polynomial Regression Prefetcher for DRAM-PCM Hybrid Main Memory (DRAM-PCM 하이브리드 메인 메모리에 대한 동적 다항식 회귀 프리페처)

  • Zhang, Mengzhao;Kim, Jung-Geun;Kim, Shin-Dug
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.20-23 / 2020
  • This research designs an effective prefetching method for DRAM-PCM hybrid main memory systems, especially those used for big data applications and massive-scale computing environments. Conventional prefetchers perform well with regular memory access patterns. However, workloads such as graph processing show extremely irregular memory access characteristics and thus cannot be prefetched accurately. Therefore, this research proposes an efficient dynamical prefetching algorithm based on regression. We have designed an intelligent prefetch engine that can identify the characteristics of memory access sequences. Based on those characteristics, it performs regular, linear regression, or polynomial regression predictive analysis and dynamically determines the number of pages to prefetch. In addition, we present a DRAM-PCM hybrid memory structure that can reduce the energy cost and solve the thermal problem of the conventional DRAM memory system. Experimental results show that performance increased by 40% compared with the conventional DRAM memory structure.
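The abstract does not describe how the prefetch engine is built, so the following is only a sketch of the linear-regression case under stated assumptions: a line is fitted to the recent page numbers, the fit quality (R²) decides whether to prefetch at all, and the number of prefetched pages scales with R². The threshold, the maximum degree, and every function name are hypothetical, and the polynomial case is not covered.

```python
def fit_line(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns
    (intercept, slope, r_squared)."""
    n = len(values)
    mx = (n - 1) / 2.0
    my = sum(values) / n
    sxx = sum((i - mx) ** 2 for i in range(n))
    sxy = sum((i - mx) * (v - my) for i, v in enumerate(values))
    slope = sxy / sxx
    intercept = my - slope * mx
    syy = sum((v - my) ** 2 for v in values)
    sse = sum((v - (intercept + slope * i)) ** 2
              for i, v in enumerate(values))
    r2 = 1.0 if syy == 0 else 1.0 - sse / syy
    return intercept, slope, r2

def pages_to_prefetch(history, max_degree=4, r2_threshold=0.9):
    """Predict upcoming page numbers from recent accesses.

    Returns an empty list when the pattern is too irregular for a
    linear fit (assumed policy, not taken from the paper).
    """
    if len(history) < 3:
        return []
    intercept, slope, r2 = fit_line(history)
    if r2 < r2_threshold:
        return []
    # Better fits earn a deeper prefetch, capped at max_degree.
    degree = max(1, round(max_degree * r2))
    n = len(history)
    return [round(intercept + slope * (n + k)) for k in range(degree)]

regular = pages_to_prefetch([100, 102, 104, 106, 108])
irregular = pages_to_prefetch([5, 900, 17, 342, 8])
print(regular, irregular)
```

A strided sequence yields a near-perfect fit and a full-depth prefetch, while a scattered sequence like those produced by graph workloads falls below the threshold and triggers no prefetch in this sketch.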