• Title/Summary/Keyword: Multi-Variable Regression Method


Analysis of multi-center bladder cancer survival data using variable-selection method of multi-level frailty models

  • 김보현;하일도;이동환
    • Journal of the Korean Data and Information Science Society / Vol. 27, No. 2 / pp.499-510 / 2016
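  • Selecting appropriate variables is very important in regression models for survival analysis. In this paper, we introduce a procedure for penalized variable selection in multi-level frailty models, based on the "frailtyHL" R package (Ha et al., 2012). Model estimation is based on the penalized hierarchical likelihood, and three penalty functions (LASSO, SCAD, and HL) are considered. To illustrate the method, we use data from a multi-country, multi-center clinical trial conducted by the EORTC (European Organization for Research and Treatment of Cancer) in Belgium to compare the results of the three variable-selection methods and discuss their relative advantages and disadvantages. In particular, the data analysis shows that the SCAD and HL methods select important variables better than LASSO.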
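
As a rough illustration of how two of the penalty functions named in this abstract differ, the sketch below evaluates the standard textbook LASSO and SCAD penalties for a single coefficient. It is not taken from the paper or from the frailtyHL package; the function names and the default `a = 3.7` (a common choice in the SCAD literature) are assumptions, and the HL penalty is omitted because the abstract does not define it.

```python
def lasso_penalty(beta: float, lam: float) -> float:
    """LASSO (L1) penalty: grows linearly with |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty (textbook form, requires a > 2).

    Matches the LASSO penalty near zero, then bends over and becomes
    constant, so large coefficients are not penalized more and more.
    """
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    # Illustrative values only: small, medium, and large coefficients.
    for beta in (0.5, 2.0, 10.0):
        print(beta, lasso_penalty(beta, 1.0), scad_penalty(beta, 1.0))
```

Because the SCAD penalty levels off for large coefficients, it shrinks strong effects less than LASSO does, which is one commonly cited reason it can retain important variables more reliably.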

Fault Detection & SPC of Batch Process using Multi-way Regression Method

  • 우경섭;이창준;한경훈;고재욱;윤인섭
    • Korean Chemical Engineering Research / Vol. 45, No. 1 / pp.32-38 / 2007
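  • We applied statistical process control techniques to batch processes and implemented a system that diagnoses the process state more quickly and easily from ordinary batch process data. Using data from a semiconductor etch process, a representative batch process, and from a semi-batch styrene-butadiene rubber production process, we built models that identify the relationship between the process variables and the process state. We then constructed statistical process control charts from the model outputs and detected faults by analyzing the process trend over time. The multi-way batch data were unfolded into two-way data, and multivariate regression methods such as Support Vector Regression and Partial Least Squares were used for modeling. We also used error charts and variable contribution charts to compute the magnitude and type of each fault and the contribution of each variable to the faulty data. The results confirmed good performance: the system can diagnose not only whether and when a fault occurred but also its magnitude and cause.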
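
The unfolding step mentioned in this abstract can be sketched as follows. The abstract does not say which unfolding direction the authors used, so this example assumes batch-wise unfolding (one row per batch, with all time points and variables laid out side by side); the function name and the sample numbers are made up for illustration.

```python
from typing import List

Batch = List[List[float]]  # one batch: rows are time points, columns are variables


def unfold_batchwise(data: List[Batch]) -> List[List[float]]:
    """Turn (batch x time x variable) data into a (batch x time*variable) matrix.

    Each batch's time-by-variable table is flattened into a single row, so a
    two-way method such as PLS or SVR can treat one batch as one observation.
    """
    return [[value for time_slice in batch for value in time_slice]
            for batch in data]


if __name__ == "__main__":
    # Two batches, three time points, two variables (illustrative values only).
    data = [
        [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
        [[4.0, 40.0], [5.0, 50.0], [6.0, 60.0]],
    ]
    unfolded = unfold_batchwise(data)
    print(len(unfolded), len(unfolded[0]))  # rows = batches, columns = time * variables
```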

MP-Lasso chart: a multi-level polar chart for visualizing group Lasso analysis of genomic data

  • Min Song;Minhyuk Lee;Taesung Park;Mira Park
    • Genomics & Informatics / Vol. 20, No. 4 / pp.48.1-48.7 / 2022
  • Penalized regression has been widely used in genome-wide association studies for joint analyses to find genetic associations. Among penalized regression models, the least absolute shrinkage and selection operator (Lasso) method effectively removes some coefficients from the model by shrinking them to zero. To handle group structures, such as genes and pathways, several modified Lasso penalties have been proposed, including group Lasso and sparse group Lasso. Group Lasso ensures sparsity at the level of pre-defined groups, eliminating unimportant groups. Sparse group Lasso performs group selection as in group Lasso, but also performs individual selection as in Lasso. While these sparse methods are useful in high-dimensional genetic studies, interpreting the results with many groups and coefficients is not straightforward. Lasso's results are often expressed as trace plots of regression coefficients. However, few studies have explored the systematic visualization of group information. In this study, we propose a multi-level polar Lasso (MP-Lasso) chart, which can effectively represent the results from group Lasso and sparse group Lasso analyses. An R package to draw MP-Lasso charts was developed. Through a real-world genetic data application, we demonstrated that our MP-Lasso chart package effectively visualizes the results of Lasso, group Lasso, and sparse group Lasso.
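
To make concrete how the Lasso "removes some coefficients from the model by shrinking them to zero," the sketch below applies the soft-thresholding operator. It assumes the special case of an orthonormal design matrix with the objective (1/2)||y - Xb||^2 + lam * ||b||_1, where the Lasso solution is the soft-thresholded least-squares estimate; the function name and the sample coefficients are illustrative and are not from the MP-Lasso package.

```python
from typing import List


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_coefs: List[float], lam: float) -> List[float]:
    """Lasso coefficients under the orthonormal-design assumption stated above."""
    return [soft_threshold(b, lam) for b in ols_coefs]


if __name__ == "__main__":
    # Illustrative least-squares coefficients; the small ones are zeroed out.
    print(lasso_orthonormal([3.0, -0.5, 0.2, -2.0], lam=1.0))
```

Group Lasso applies the same idea to the norm of a whole coefficient group rather than to each coefficient separately, which is why entire groups drop out together.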

Latent Variable Fit to Interlaboratory Studies

  • Jeon, Gyeongbae
    • Communications for Statistical Applications and Methods / Vol. 7, No. 3 / pp.885-897 / 2000
  • The use of an unweighted mean and of separate tests is part of the current practice for analyzing interlaboratory studies, and we hope to improve on this method. Using maximum likelihood (ML), we fit a rather intricate, multi-parameter measurement model with the material's true value as a latent variable, in a situation where quite serviceable regression and ANOVA calculations have already been developed. The model fit leads both to a weighted estimate of the overall mean and to tests for equality of means, slopes, and variances. Maximum likelihood tests for differences among variances pose a challenge in that the likelihood can easily become unbounded. Thus the major objective becomes to provide a useful test of variance equality.


Long-Term Maximum Power Demand Forecasting in Consideration of Dry Bulb Temperature

  • 고희석;정재길
    • 대한전기학회논문지 / Vol. 34, No. 10 / pp.389-398 / 1985
  • The maximum power demand in Korea has recently come under the strong influence of electric cooling and air-conditioning demand, which is sensitive to weather conditions. This paper presents a technique and algorithm for forecasting long-term maximum power demand that considers the characteristics of electric power and a weather variable. Using a weather load model and recent statistical data on power demand, annual maximum power demand is separated by multiple regression analysis into two parts: a base load component, which is little affected by weather, and a weather-sensitive load component. We derive growth trend regression equations and coefficients for the two components, and the maximum power demand of each forecast year is obtained as the sum of the two components. The weather variable is the coincident dry bulb temperature at the time the daily maximum power demand occurs. For the growth trend regression equations, we choose an exponential trend curve for the base load component and a real quadratic curve for the weather-sensitive load component. The validity of the proposed forecasting technique and algorithm is demonstrated by a case study of the present Korean power system.


Correlation between Welding Parameters and Detaching Drop Size using Regression

  • 최상균;한창우;이상룡;이영문
    • Journal of Welding and Joining / Vol. 20, No. 1 / pp.83-90 / 2002
  • Metal transfer in gas metal arc (GMA) welding is a complex phenomenon affected by many welding-condition parameters and material properties. In this research, correlation equations relating the welding conditions to the detaching droplet size and detaching velocity in GMA welding were studied via regression analysis of numerical results obtained with the volume-of-fluid (VOF) method. Welding parameters and material properties were grouped into three dimensionless numbers, and the detaching droplet size was expressed as a function of them. Second-order and exponential multi-variable correlation forms were assumed, and the coefficients of these equations were calculated for the globular and spray modes as well as for the entire range of transfer modes. When applied to available experimental data, the correlation equations show good agreement.

Subset selection in multiple linear regression: An improved Tabu search

  • Bae, Jaegug;Kim, Jung-Tae;Kim, Jae-Hwan
    • Journal of Advanced Marine Engineering and Technology / Vol. 40, No. 2 / pp.138-145 / 2016
  • This paper proposes an improved tabu search method for subset selection in multiple linear regression models. Variable selection is a vital combinatorial optimization problem in multivariate statistics. Selecting the optimal subset of variables is necessary in order to construct a reliable multiple linear regression model. Its applications range widely, from machine learning, time-series prediction, and multi-class classification to noise detection. Because the problem is NP-complete, finding the optimal solution becomes more difficult as the number of variables increases. Two typical metaheuristic methods have been developed to tackle the problem: the tabu search algorithm and a hybrid genetic and simulated annealing algorithm. However, both methods have shortcomings. The tabu search method requires a large amount of computing time, and the hybrid algorithm produces a less accurate solution. To overcome these shortcomings, we propose an improved tabu search algorithm that reduces the number of neighborhood moves and adopts an effective move search strategy. To evaluate the performance of the proposed method, comparative studies are performed on small data sets from the literature and on large simulated data sets. Computational results show that the proposed method outperforms the two metaheuristic methods in terms of computing time and solution quality.
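
For orientation, a plain tabu search for fixed-size subset selection might look like the sketch below. It is a generic textbook-style version, not the improved algorithm of this paper, whose move strategy the abstract does not detail. The swap neighborhood, the residual-sum-of-squares criterion, the tenure value, and all function names are assumptions made for illustration.

```python
import random
from typing import List, Set, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(X: List[List[float]], y: List[float], subset: Set[int]) -> float:
    """Residual sum of squares of a least-squares fit on the chosen columns plus an intercept."""
    rows = [[1.0] + [row[j] for j in sorted(subset)] for row in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def tabu_search(X: List[List[float]], y: List[float], size: int,
                iters: int = 50, tenure: int = 3, seed: int = 0) -> Tuple[List[int], float]:
    """Search for a `size`-variable subset with small RSS using swap moves."""
    rng = random.Random(seed)
    p = len(X[0])
    current = set(rng.sample(range(p), size))
    best, best_rss = set(current), rss(X, y, current)
    tabu = {}  # variable index -> last iteration at which it is still tabu
    for it in range(iters):
        candidates = []
        for out in current:
            for inn in set(range(p)) - current:
                trial = (current - {out}) | {inn}
                score = rss(X, y, trial)
                is_tabu = tabu.get(out, -1) >= it or tabu.get(inn, -1) >= it
                # Aspiration rule: a tabu move is allowed if it beats the best so far.
                if not is_tabu or score < best_rss:
                    candidates.append((score, out, inn, trial))
        if not candidates:
            break
        # Take the best admissible move even if it worsens the current solution.
        score, out, inn, current = min(candidates, key=lambda c: c[0])
        tabu[out] = tabu[inn] = it + tenure
        if score < best_rss:
            best, best_rss = set(current), score
    return sorted(best), best_rss


if __name__ == "__main__":
    # Synthetic, noise-free data: y depends only on columns 1 and 3.
    rng = random.Random(1)
    X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]
    y = [2 * row[1] - 3 * row[3] for row in X]
    print(tabu_search(X, y, size=2))
```

Every swap is evaluated with a full least-squares refit, which illustrates why the abstract describes plain tabu search as computationally expensive and why reducing the number of neighborhood moves matters.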

Development of suspended solid concentration measurement technique based on multi-spectral satellite imagery in Nakdong River using machine learning model

  • 권시윤;서일원;백동해
    • 한국수자원학회논문집 / Vol. 54, No. 2 / pp.121-133 / 2021
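  • Suspended solids in rivers mainly enter from the watershed or are generated within the river itself, and they are an important water quality factor because their deposition can cause mid- to long-term water pollution. However, conventional measurement of suspended solids is point-based, so it is labor-intensive and makes it difficult to acquire large amounts of data. In this study, we therefore developed a remote-sensing-based technique for measuring suspended solid concentration over the entire Nakdong River using data from the Sentinel-2 satellite, which provides high-resolution multi-spectral imagery. To overcome the limitations of existing remote-sensing-based regression equations and to reflect the regional characteristics of the whole Nakdong River, the technique uses a Support Vector Regression (SVR) machine learning model that considers spectral bands of various wavelengths and band ratios. The optimal combination of these input variables was derived using Recursive Feature Elimination (RFE) and the SVR weight coefficient of each variable. The 705 nm band, which lies in the red-edge region, was found to be the most important spectral band, and a comparison of the final SVR model with regression equations from previous studies showed that it gave the most accurate measurements. Because the SVR model is based on the optimal band combination obtained through RFE, it is expected to reduce the dependence on individual variables seen in existing regression equations built on a single band or band ratio, while providing a more accurate spatial distribution of suspended solid concentration.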

An Improved Method for Monitoring of Soil Moisture Using NOAA-AVHRR Data

  • Fu, June;Pang, Zhiguo;Xiao, Qianguang
    • 대한원격탐사학회:학술대회논문집 / 대한원격탐사학회 2003 Proceedings of ACRS 2003 ISRS / pp.195-197 / 2003
  • Soil moisture is a crucial variable in hydrology, meteorology, and plant science research. Adequate soil moisture is essential for plant growth, and both excesses and deficits of soil moisture must be considered in agricultural practice. Several remote sensing methods are already used for monitoring soil moisture, such as thermal inertia, the vegetation water-supplying index, the crop water stress index, and multi-factor regression. This paper discusses an improved method based on thermal inertia. We first analyzed the problems of monitoring soil moisture with satellites and then put forward a simplified method that directly uses land surface temperature differences to measure soil moisture. We also took the influence of vegetation into account by incorporating NDVI into the model. The method was applied to a study of soil moisture in Heilongjiang Province, China, and the experiments led us to conclude that the model can markedly increase the precision of soil moisture monitoring.
