• Title/Summary/Keyword: Ridge regression

A Study on the Application of Measurement Data Using Machine Learning Regression Models

  • Yun-Seok Seo;Young-Gon Kim
    • International journal of advanced smart convergence / v.12 no.2 / pp.47-55 / 2023
  • The automotive industry is undergoing a paradigm shift due to the convergence of IT and rapid digital transformation. Various components, including embedded structures and systems with complex architectures that incorporate IC semiconductors, are being integrated and modularized. As a result, vehicle defects have increased significantly, raising expectations for the quality of automotive parts. As more data accumulates, there is an active effort to go beyond traditional reliability analysis methods and apply machine learning models to the accumulated big data. However, there are still few cases where machine learning is used in product development to identify the factors behind performance and durability defects and to feed the findings back into the design to improve product quality. In this paper, we applied a prediction algorithm to the defects of automotive door devices equipped with automatic responsive sensors, which are commonly installed in recent electric and hydrogen vehicles. To do so, we selected test items, built a measurement emulation system for data acquisition, and conducted comparative evaluations by applying different machine learning algorithms to the measured data. The $R^2$ scores were as follows: ordinary multiple regression 0.96, Ridge regression 0.95, Lasso regression 0.89, Elastic regression 0.91.
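
The kind of comparison reported above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the data are hypothetical toy values, the helper names `r2_score` and `fit_line` are invented here, and only ordinary least squares and ridge are shown, for a single predictor. When scored on the same data used for fitting, ordinary least squares can never score lower than a penalized fit.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y, alpha=0.0):
    """Fit y = a + b*x; alpha > 0 adds a ridge penalty on the slope b only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)  # alpha = 0 gives ordinary least squares
    return my - slope * mx, slope


# Hypothetical toy measurements, not data from the paper.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]

scores = {}
for name, alpha in [("ols", 0.0), ("ridge", 2.0)]:
    a, b = fit_line(x, y, alpha)
    scores[name] = r2_score(y, [a + b * xi for xi in x])
    print(name, round(scores[name], 4))
```

A fair comparison between models would score them on held-out data rather than on the training set used here.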

Multicollinearity in Regression Analysis and a Name-Preserving Data Transformation Technique (回歸分析에 있어서의 多共線性과 名稱을 保全시키는 資料變換 技法)

  • 兪浣
    • Journal of the Korean Statistical Society / v.8 no.2 / pp.109-116 / 1979
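  • When a demand or supply model is built to study the substitution effect between two variables, the names of the variables involved become important. As alternatives for handling the multicollinearity problem that commonly arises with actual observed data, the ridge regression technique and the factor analytic technique are introduced using a linear regression line as an example, and the original data are transformed so that the coefficients obtained from these techniques can be explained as OLS estimates. Because ridge regression and factor analysis generally cannot be used when the actual demand and supply models are nonlinear, these methods are described as data transformation methods, so that ridge regression or factor analysis can also be applied to the multicollinearity problem in nonlinear models.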
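
One well-known way to express ridge regression as OLS on transformed data is row augmentation: append sqrt(lambda) times the identity to the design matrix and zeros to the response. The sketch below checks that identity for two predictors without an intercept. It illustrates the general idea only; the abstract does not specify the paper's own transformation, and the data and helper names (`normal_equations`, `solve_2x2`) are hypothetical.

```python
import math


def normal_equations(X, y, lam=0.0):
    """Return (X'X + lam*I) and X'y for a two-column design matrix X."""
    a11 = sum(r[0] * r[0] for r in X) + lam
    a12 = sum(r[0] * r[1] for r in X)
    a22 = sum(r[1] * r[1] for r in X) + lam
    b1 = sum(r[0] * yi for r, yi in zip(X, y))
    b2 = sum(r[1] * yi for r, yi in zip(X, y))
    return (a11, a12, a22), (b1, b2)


def solve_2x2(A, b):
    """Solve a symmetric 2x2 linear system by Cramer's rule."""
    a11, a12, a22 = A
    det = a11 * a22 - a12 * a12
    return ((b[0] * a22 - a12 * b[1]) / det, (a11 * b[1] - a12 * b[0]) / det)


# Hypothetical, nearly collinear predictors (second column tracks the first).
X = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]]
y = [2.0, 4.1, 6.1, 8.0]
lam = 0.5

# Ridge estimate from the penalized normal equations.
ridge_coef = solve_2x2(*normal_equations(X, y, lam))

# Transformed data: append sqrt(lam) * I to X and zeros to y, then run plain OLS.
root = math.sqrt(lam)
X_aug = X + [[root, 0.0], [0.0, root]]
y_aug = y + [0.0, 0.0]
ols_on_aug = solve_2x2(*normal_equations(X_aug, y_aug, 0.0))

print(ridge_coef)
print(ols_on_aug)
```

Because the transformed problem is an ordinary least-squares problem, the original variable names stay attached to the coefficients.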

The Procedure of Finding Operating Conditions Minimizing Quality Loss and Case Study (공정 데이터를 이용한 조업 조건 결정 절차와 사례연구)

  • Jeong Il Gyo;Jeon Chi Hyeok
    • Proceedings of the Korean Operations and Management Science Society Conference / 2003.05a / pp.76-80 / 2003
  • A procedure for finding operating conditions that minimize quality loss is proposed with a real industry example. The procedure consists of two major parts: the selection of process variables critical to the response and the determination of operating conditions. The coefficients of ridge regression and the scores of partial least squares are applied to select important process variables. A functional approach and a non-functional approach are used to find proper operating conditions for the important process variables.

A study on the properties of sensitivity analysis in principal component regression and latent root regression (주성분회귀와 고유값회귀에 대한 감도분석의 성질에 대한 연구)

  • Shin, Jae-Kyoung;Chang, Duk-Joon
    • Journal of the Korean Data and Information Science Society / v.20 no.2 / pp.321-328 / 2009
  • In regression analysis, the ordinary least squares estimates of regression coefficients become poor when the correlations among predictor variables are high. This phenomenon, called multicollinearity, causes serious problems in actual data analysis. To overcome multicollinearity, many methods have been proposed, including ridge regression, shrinkage estimators, and methods based on principal component analysis (PCA) such as principal component regression (PCR) and latent root regression (LRR). In the last decade, many statisticians have discussed sensitivity analysis (SA) in ordinary multiple regression and the same topic in PCR, LRR and logistic principal component regression (LPCR). In those methods PCA plays an important role. Many statisticians have also discussed SA in PCA and related multivariate methods. We introduce the methods of PCR and LRR. We also introduce the methods of SA in PCR and LRR, and discuss the properties of SA in PCR and LRR.
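
To make the effect of high predictor correlation concrete, the sketch below computes the variance inflation factor (VIF), a standard diagnostic that the abstract itself does not mention. With exactly two predictors, the VIF reduces to 1 / (1 - r^2), where r is their Pearson correlation. The data and function names are hypothetical.

```python
def pearson_r(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u)
    sv = sum((b - mv) ** 2 for b in v)
    return cov / (su * sv) ** 0.5


def vif_two_predictors(x1, x2):
    """With exactly two predictors, both share VIF = 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors, not data from the paper.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_collinear = [1.1, 1.9, 3.2, 3.9, 5.1]      # almost a copy of x1
x2_unrelated = [2.0, -1.0, -2.0, -1.0, 2.0]   # uncorrelated with x1

vif_high = vif_two_predictors(x1, x2_collinear)
vif_low = vif_two_predictors(x1, x2_unrelated)
print(round(vif_high, 1), round(vif_low, 1))
```

A VIF of 1 means the coefficient variance is not inflated at all; large values indicate that OLS estimates become unstable, which is the situation ridge regression, PCR, and LRR are designed to address.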

Sentiment Analysis for Public Opinion in the Social Network Service (SNS 기반 여론 감성 분석)

  • HA, Sang Hyun;ROH, Tae Hyup
    • The Journal of the Convergence on Culture Technology / v.6 no.1 / pp.111-120 / 2020
  • As an application of big data and artificial intelligence techniques, this study proposes an atypical language-based sentiment opinion poll methodology, unlike conventional opinion poll methodology. As an alternative to sentiment classification models based on existing statistical analysis, real-time Twitter data related to parliamentary elections were collected, and empirical analyses of the polarity and intensity of public opinion were performed using attribute-based sensitivity analysis. To classify the polarity of words used on individual SNS, the polarity of new Twitter data was estimated using trained Lasso and Ridge regression models, while extracting the independent variables that most strongly affect the polarity variable. A social network analysis of users' relationships with friends on SNS suggested a way to identify peer group sensitivity. Based on what voters expressed on social media, political opinion sensitivity analysis was used to predict party approval ratings and to measure the accuracy of the predictive model's polarity analysis, confirming the applicability of the sensitivity analysis methodology in the political field.
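
Lasso can act as a variable extractor where ridge cannot, and the special case of an orthonormal design shows why. Under the conventions 1/2·||y - Xb||² + λ·||b||₁ for lasso and 1/2·||y - Xb||² + (λ/2)·||b||² for ridge, lasso soft-thresholds each OLS coefficient while ridge only rescales it. The sketch below applies both rules. The word coefficients are hypothetical and are not taken from the study, and real text features are rarely orthonormal.

```python
def soft_threshold(b, lam):
    """Lasso solution for one coefficient under an orthonormal design."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


# Hypothetical OLS polarity coefficients for four words.
ols = {"great": 2.0, "awful": -1.5, "the": 0.1, "today": -0.2}
lam = 0.5

lasso = {w: soft_threshold(b, lam) for w, b in ols.items()}
ridge = {w: b / (1.0 + lam) for w, b in ols.items()}

# Lasso keeps only words whose coefficient survives the threshold.
selected = [w for w, b in lasso.items() if b != 0.0]
print(selected)
print(ridge)
```

Ridge shrinks every coefficient toward zero but leaves all of them nonzero, so only the lasso result can be read directly as a list of selected variables.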

A Study on Time Series Cross-Validation Techniques for Enhancing the Accuracy of Reservoir Water Level Prediction Using Automated Machine Learning TPOT (자동기계학습 TPOT 기반 저수위 예측 정확도 향상을 위한 시계열 교차검증 기법 연구)

  • Bae, Joo-Hyun;Park, Woon-Ji;Lee, Seoro;Park, Tae-Seon;Park, Sang-Bin;Kim, Jonggun;Lim, Kyoung-Jae
    • Journal of The Korean Society of Agricultural Engineers / v.66 no.1 / pp.1-13 / 2024
  • This study assessed the efficacy of improving the accuracy of reservoir water level prediction models by employing automated machine learning models and efficient cross-validation methods for time-series data. Considering the inherent complexity and non-linearity of time-series data related to reservoir water levels, we proposed an optimized approach for model selection and training. The performance of twelve models was evaluated for the Obong Reservoir in Gangneung, Gangwon Province, using TPOT (Tree-based Pipeline Optimization Tool) and four cross-validation methods, which led to the determination of the optimal pipeline model. The pipeline model consisting of Extra Tree, Stacking Ridge Regression, and Simple Ridge Regression showed outstanding predictive performance for both training and test data, with $R^2$ (coefficient of determination) and NSE (Nash-Sutcliffe Efficiency) both exceeding 0.93. For predictions of water levels 12 hours ahead, the pipeline model selected through time-series split cross-validation accurately captured the change pattern of the time-series water level data during the test period, with an NSE exceeding 0.99. The methodology proposed in this study is expected to contribute greatly to the efficient generation of reservoir water level predictions in regions with high rainfall variability.
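
A time-series split differs from ordinary k-fold cross-validation in that training data always precede test data. The sketch below generates expanding-window splits and evaluates the NSE formula. It is a standalone illustration under stated assumptions: the function names are invented here, the water levels are hypothetical, and neither TPOT nor the study's actual split configuration is reproduced.

```python
def time_series_splits(n_samples, n_splits):
    """Yield (train, test) index lists; every training index precedes every test index."""
    fold = n_samples // (n_splits + 1)
    for k in range(n_splits):
        train_end = n_samples - (n_splits - k) * fold
        yield list(range(train_end)), list(range(train_end, train_end + fold))


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


# Hypothetical hourly water levels (not data from the study).
levels = [10.0, 10.2, 10.5, 10.4, 10.1, 9.9, 10.0, 10.3, 10.6, 10.8, 10.7, 10.5]

splits = list(time_series_splits(len(levels), n_splits=3))
for train, test in splits:
    print("train", train[0], "to", train[-1], "| test", test[0], "to", test[-1])
```

An NSE of 1 indicates a perfect match, while a model that always predicts the observed mean scores 0.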

Negative Ion Generation Index according to Altitude in the Autumn of Pine Forest in Gyeongju Namsan (경주 남산 소나무림의 가을철 해발고도별 음이온 발생지수)

  • Kim, Jeong Ho;Yoon, Ji Hun;Lee, Sang Hoon;Choi, Won Jun;Yoon, Yong Han
    • Korean Journal of Environment and Ecology / v.32 no.4 / pp.413-424 / 2018
  • The study analyzed the effects of topographic structure and altitude on the generation of negative ions in the mountainous park of Mt. Namsan in Gyeongju. Temperature was higher on the ridge ($9.82^{\circ}C$) than in the valley ($8.44^{\circ}C$); relative humidity was higher in the valley (59.01 %) than on the ridge (58.64 %); solar radiation was higher on the ridge ($34.40\,W/m^2$) than in the valley ($14.69\,W/m^2$); wind speed was higher on the ridge (0.63 m/s) than in the valley (0.37 m/s); and negative ion generation was higher in the valley ($636.81\,ea/cm^3$) than on the ridge ($580.04\,ea/cm^3$). In the valley, the correlation with altitude was verified for temperature, relative humidity, solar radiation, and negative ion generation. Relative humidity, solar radiation, and negative ion generation were positively correlated with altitude, while temperature was negatively correlated. On the ridge, the correlation with altitude was verified for temperature, relative humidity, wind speed, solar radiation, and negative ion generation. Relative humidity, solar radiation, and negative ion generation were positively correlated with altitude, while temperature and wind speed were negatively correlated. The regression analysis gave the prediction equations y=-0.006x+9.663 (x=altitude, y=temperature) in the valley and y=-0.009x+11.595 (x=altitude, y=temperature) on the ridge for temperature, y=0.027x+53.561 (x=altitude, y=relative humidity) in the valley and y=0.008x+56.646 (x=altitude, y=relative humidity) on the ridge for relative humidity, and y=0.027x+53.561 (x=altitude, y=negative ion generation) in the valley and y=0.008x+56.646 (x=altitude, y=negative ion generation) on the ridge for negative ion generation.
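
As a worked example, the reported temperature equations can be evaluated at a chosen altitude. The coefficients below are copied from the abstract; the altitude of 200 is a hypothetical input, and the altitude unit is assumed to be metres because the abstract does not state it.

```python
# (slope, intercept) of temperature in deg C versus altitude, copied from the abstract.
TEMPERATURE_MODELS = {
    "valley": (-0.006, 9.663),
    "ridge": (-0.009, 11.595),
}


def predict_temperature(site, altitude):
    """Evaluate y = slope * x + intercept for the given site."""
    slope, intercept = TEMPERATURE_MODELS[site]
    return slope * altitude + intercept


altitude = 200.0  # hypothetical altitude, assumed to be in metres
for site in TEMPERATURE_MODELS:
    print(site, round(predict_temperature(site, altitude), 3))
```

Both slopes are negative, consistent with the negative correlation between temperature and altitude described in the abstract.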

Prediction of Residual Resistance Coefficient of Low-Speed Full Ships Using Hull Form Variables and Machine Learning Approaches (선형변수 기계학습 기법을 활용한 저속비대선의 잉여저항계수 추정)

  • Kim, Yoo-Chul;Yang, Kyung-Kyu;Kim, Myung-Soo;Lee, Young-Yeon;Kim, Kwang-Soo
    • Journal of the Society of Naval Architects of Korea / v.57 no.6 / pp.312-321 / 2020
  • In this study, machine learning techniques were applied to predict the residual resistance coefficient (Cr) of low-speed full ships. The machine learning methods used were Ridge regression, support vector regression, random forest, a neural network, and an ensemble of these models. Nineteen hull form variables were used as inputs. The hull form variables and Cr data obtained from 139 hull forms in the KRISO database were used in the analysis. 80 % of the data were used to train the models and the rest for validation. Some non-linear models showed overfitting, and the ensemble model gave better results than the others.
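
The 80/20 split of 139 samples and a simple ensemble can be sketched as follows. Two assumptions are made loudly here: the split is a seeded random shuffle, and the ensemble is a plain average of the individual models' predictions. The abstract states neither detail, and the prediction values are hypothetical.

```python
import random


def train_validation_split(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices reproducibly and split them into two groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


def ensemble_average(predictions_per_model):
    """Average the predictions of several models sample by sample (assumed scheme)."""
    return [sum(p) / len(p) for p in zip(*predictions_per_model)]


train_idx, val_idx = train_validation_split(139)
print(len(train_idx), len(val_idx))  # 111 28

# Hypothetical Cr predictions from three models for two hull forms.
blended = ensemble_average([[1.0, 2.0], [1.2, 2.2], [0.8, 1.8]])
print(blended)
```

With 139 hull forms, an 80 % share rounds down to 111 training samples and leaves 28 for validation.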

Application of Regularized Linear Regression Models Using Public Domain data for Cycle Life Prediction of Commercial Lithium-Ion Batteries (상업용 리튬 배터리의 수명 예측을 위한 고속대량충방전 데이터 정규화 선형회귀모델의 적용)

  • KIM, JANG-GOON;LEE, JONG-SOOK
    • Transactions of the Korean hydrogen and new energy society / v.32 no.6 / pp.592-611 / 2021
  • This study examines a rarely available high-throughput cycling data set of 124 commercial lithium iron phosphate/graphite cells cycled under fast-charging conditions, with widely varying cycle lives ranging from 150 to 2,300 cycles, including in-cycle temperature and per-cycle IR measurements. We wrote our own Python code, which reproduced the various data plots and machine learning approaches for cycle life prediction using early cycles, as well as further details not presented in the article and the supplementary information. In particular, we applied regularized ridge, lasso, and elastic net linear regression models using features extracted from capacity fade curves, discharge voltage curves, and other data such as internal resistance and cell can temperature. We found that, because costly and lengthy battery testing limits the quantity and quality of the data, careful hyperparameter tuning may be required and model features need to be extracted based on domain knowledge.
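
For reference, the three regularized models differ only in the penalty added to the least-squares loss. The formulation below is one common textbook convention, with symbols introduced here for illustration ($X$ the feature matrix, $y$ the response, $\beta$ the coefficients, $\lambda$, $\lambda_1$, $\lambda_2 \ge 0$ the tuning hyperparameters). Software packages often scale the terms differently, so this is not necessarily the parameterization used in the paper.

```latex
\begin{aligned}
\text{Ridge:}       \quad & \hat\beta = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 \\
\text{Lasso:}       \quad & \hat\beta = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \\
\text{Elastic net:} \quad & \hat\beta = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 + \lambda_2 \lVert \beta \rVert_2^2
\end{aligned}
```

The elastic net reduces to lasso when $\lambda_2 = 0$ and to ridge when $\lambda_1 = 0$, which is why the hyperparameters need careful tuning on small data sets.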

Assessing Landscape Impacts of Apartment Complex on Suburban Hilly Openspace; Multilateral Approach by Analysis of Physical Landscape Variables and Eye Fixation Movements (도시주변 능선녹지를 배경으로 하는 아파트 경관의 시각적 영향 - 물리적 경관변수 및 와시점분석에 의한 다각적 접근-)

  • Choi, Yun;Cho, Tong-Buhm
    • Journal of the Korean Institute of Landscape Architecture / v.22 no.2 / pp.81-103 / 1994
  • In recent years, the visual character of the natural open space and greenbelt surrounding urban landscapes has changed with the sprawl of residential areas and high-rise residential buildings. Because these natural areas, which form the backdrop to residential areas, are sloped mountains in many cities, they are easily seen from a distance, and it is important to preserve them as a visual infrastructure of the urban landscape. The purposes of this study are to extract the factors for evaluating landscape impact in these areas, to clarify the physical landscape variables representing these factors, and to infer the visual-perceptual relationships between image and landscape variables. As a result, three conceptual factors were extracted by semantic differential evaluation of 18 classified landscape slides, and three regression models were established from the factor scores of the landscapes and the physical variables measured in photographs. On the basis of these relationships, visual-perceptual characteristics were discussed by analyzing eye-movement recording data for each landscape. The factors "spatial unfolding of backdropped hilly greenspace", "horizontal sequence of residential buildings", and "landscape complexity" proved to be important. The variables "skyline of mountainous ridge" and "visual edge of building structure" proved important in both the regression models and the eye fixation movements.
