• Title/Abstract/Keywords: Partial Least Square Method

Search results: 208 items (processing time 0.026 s)

A modified partial least squares regression for the analysis of gene expression data with survival information

  • Lee, So-Yoon;Huh, Myung-Hoe;Park, Mira
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 25, No. 5
    • /
    • pp.1151-1160
    • /
    • 2014
  • In DNA microarray studies, the number of genes far exceeds the number of samples, and the gene expression measures are highly correlated. Partial least squares regression (PLSR) is a popular method for dimension reduction and has been shown in several studies to be useful for classifying microarray data. In this study, we suggest a modified version of PLSR for analyzing gene expression data with survival information. The method is designed as a new gene selection method using PLSR with an iterative procedure for imputing censored survival times. The mean square error of prediction criterion is used to determine the dimension of the model. To visualize the data, a plot of variables superimposed with samples is used. The method is applied to two microarray data sets, both containing survival times. The results show that the proposed method works well for interpreting gene expression microarray data.

Chlorophyll-a Forecasting using PLS Based c-Fuzzy Model Tree

  • 이대종;박상영;정남정;이혜근;박진일;전명근
    • 한국지능시스템학회논문지
    • /
    • Vol. 16, No. 6
    • /
    • pp.777-784
    • /
    • 2006
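  • This paper proposes a prediction model for chlorophyll-a concentration that applies partial least squares (PLS) and a c-fuzzy model tree. The proposed method first sets the center vectors computed by fuzzy clustering over all input attributes, then forms internal nodes using the membership degrees between each center vector and the input attributes, and builds a local model by applying PLS at each internal node. As the node-splitting criterion, a branch is made when the error computed from the model built at the parent node is larger than the error computed at the child nodes. In the final stage, an arbitrary input is compared with the cluster centers computed at the leaf nodes, and the local model belonging to the cluster with the highest membership degree is selected to predict the output. To demonstrate the merit of the proposed method, experiments on water-quality data showed improved performance over the conventional model tree approach.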

A Development of Statistical Model for Pavement Response Model

  • 이문섭;박희문;김부일;허태영
    • 한국산업정보학회논문지
    • /
    • Vol. 17, No. 5
    • /
    • pp.89-96
    • /
    • 2012
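  • To build a pavement response model, this paper introduces partial least squares regression as a new methodology and applies it to actual FWD test data. The empirical analysis shows that partial least squares regression offers a way to resolve the multicollinearity problem that arose in the ordinary multiple regression model, and it has the further advantage that the model can be built from raw data rather than transformed data.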
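
The multicollinearity point in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it fits a one-component PLS1 regression in plain Python on hypothetical data in which the second predictor is exactly twice the first, a case where ordinary least squares has no unique coefficient vector. The function names and the data are assumptions made for illustration.

```python
import math

def pls1_one_component(X, y):
    """Fit a one-component PLS1 model on column-centered data.

    Returns (x_means, y_mean, w, q); a prediction for x is
    y_mean + q * sum(w[j] * (x[j] - x_means[j])).
    """
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores t = Xc w, then regress y on the single score.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_means, y_mean, w, q

def predict(model, x):
    x_means, y_mean, w, q = model
    return y_mean + q * sum(w[j] * (x[j] - x_means[j]) for j in range(len(x)))

# Hypothetical data: column 2 = 2 * column 1 (perfect collinearity),
# and y = 5 * column 1 exactly.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [5.0, 10.0, 15.0, 20.0]
model = pls1_one_component(X, y)
prediction = predict(model, [5.0, 10.0])
```

Because PLS projects the predictors onto a single latent score before regressing, the collinear columns do not need to be separated, and the fit is obtained directly from the raw (only mean-centered) data.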

FCM for the Multi-Scale Problems

  • 김도완;김용식
    • 한국전산구조공학회:학술대회논문집
    • /
    • 한국전산구조공학회 2002년도 가을 학술발표회 논문집
    • /
    • pp.599-603
    • /
    • 2002
  • We propose a new meshfree method called the fast moving least square reproducing kernel collocation method (FCM). The methodology combines the fast moving least square reproducing kernel (FMLSRK) approximation with the point collocation scheme. Using point collocation makes the method truly meshfree. In this paper, FCM is shown to be a good method, at least for computing numerical solutions of second-order elliptic partial differential equations with geometric singularities or geometric multi-scales. To treat such problems, we use the concept of a variable dilation parameter.

A Study on Measurement of Blood Pressure by Partial Least Square Method

  • 김용주;남은혜;최창현;김종덕
    • Journal of Biosystems Engineering
    • /
    • Vol. 33, No. 6
    • /
    • pp.438-445
    • /
    • 2008
  • The purpose of this study was to develop a blood pressure measurement model based on the PLS (partial least square) method. The measurement system for blood pressure signals consisted of a pressure sensor, va interface and embedded module. A mercury sphygmomanometer was connected to the measurement system through a 3-way stopcock and used as the blood pressure reference. The blood pressure signals of 20 subjects were measured, and the tests were repeated 5 times per subject. A total of 100 measurements were divided into a calibration set and a prediction set. PLS models were developed to determine the systolic and diastolic blood pressures. The PLS models were evaluated by the standard methods of the British Hypertension Society (BHS) protocol and the American Association for the Advancement of Medical Instrumentation (AAMI). The results of the PLS models were compared with those of the MAA (maximum amplitude algorithm). The blood pressures measured with the PLS method were highly correlated with those measured with a mercury sphygmomanometer for both the systolic ($R^2=0.85$) and the diastolic blood pressure ($R^2=0.84$). The results showed that the PLS models were effective tools for blood pressure measurement with high accuracy and satisfied the standards of the BHS protocol and the AAMI.

A Method for Screening Product Design Variables for Building A Usability Model : Genetic Algorithm Approach

  • 양희철;한성호
    • 대한인간공학회지
    • /
    • Vol. 20, No. 1
    • /
    • pp.45-62
    • /
    • 2001
  • This study suggests a genetic algorithm-based partial least squares (GA-based PLS) method to select the design variables for building a usability model. The GA-based PLS uses a genetic algorithm to minimize the root-mean-squared error of a partial least squares regression model. A multiple linear regression method is applied to build a usability model that contains the variables selected by the GA-based PLS. The performance of the usability model turned out to be generally better than that of previous usability models using other variable selection methods such as expert rating, principal component analysis, cluster analysis, and partial least squares. Furthermore, the model performance was drastically improved by supplementing the usability model with the category-type variables selected by the GA-based PLS. It is recommended that the GA-based PLS be applied to variable selection when developing a usability model.

The Development of a Fault Diagnosis Model based on the Parameter Estimations of Partial Least Square Models

  • 이광오;이창준
    • 한국안전학회지
    • /
    • Vol. 34, No. 4
    • /
    • pp.59-67
    • /
    • 2019
  • Since it is hard to construct process models based on prior process knowledge, various statistical approaches have been employed to build fault diagnosis models. However, the crucial drawback of these approaches is that the solutions may vary with the fault magnitude, even when the same fault occurs. In this study, a parameter monitoring approach is suggested. When a fault occurs in a chemical process, it changes the process model, so monitoring the parameters of process models can provide an efficient fault diagnosis model. A few important variables are selected, and their predictive models are constructed by the partial least square (PLS) method. The Euclidean norms of the parameters of the PLS models are estimated, and fault diagnosis is performed by comparing them with the parameters of PLS models built under normal operating conditions. To improve the monitoring performance, a cumulative summation (CUSUM) control chart is employed, and the changes in model parameters are recorded to identify the type of an unknown fault. To verify the efficacy of the proposed model, it is tested on the Tennessee Eastman (TE) process; the model can be easily applied to other complex processes.
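
The monitoring idea described in this abstract (track the Euclidean norm of model parameters and feed it to a CUSUM chart) can be sketched as follows. This is an illustrative simplification, not the paper's procedure: it uses a standard one-sided upper CUSUM, and the coefficient vectors, `target`, `slack`, and `threshold` values are hypothetical.

```python
import math

def euclidean_norm(v):
    """Euclidean (L2) norm of a parameter vector."""
    return math.sqrt(sum(x * x for x in v))

def cusum_alarm_index(values, target, slack, threshold):
    """One-sided upper CUSUM.

    Accumulates deviations above target + slack and returns the index
    of the first sample at which the sum exceeds threshold, or None.
    """
    s = 0.0
    for i, v in enumerate(values):
        s = max(0.0, s + (v - target - slack))
        if s > threshold:
            return i
    return None

# Hypothetical regression coefficients re-estimated over successive
# windows; the last two windows mimic a fault that changes the model.
coeffs = [[3.0, 4.0], [3.0, 4.0], [3.0, 4.0], [6.0, 8.0], [6.0, 8.0]]
norms = [euclidean_norm(c) for c in coeffs]

# target is the norm under normal operation (assumed known here).
alarm = cusum_alarm_index(norms, target=5.0, slack=0.5, threshold=8.0)
```

In this toy run the norm jumps from 5 to 10 at the fourth window, and the cumulative sum crosses the threshold one window later. A real application would choose the slack and threshold from the variability observed under normal operating conditions.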

Nondestructive Quantification of Intact Ambroxol Tablet using Near-infrared Spectroscopy

  • 임현량;우영아;김도형;김효진;강신정;최현철;최한곤
    • 약학회지
    • /
    • Vol. 48, No. 1
    • /
    • pp.60-64
    • /
    • 2004
  • Near-infrared (NIR) spectroscopy was used to determine rapidly and nondestructively the content of ambroxol in intact ambroxol tablets containing 30 mg (12.5% m/m nominal concentration) by collecting NIR spectra in the range 1100-1750 nm. The laboratory-made samples had 10.3∼15.9% m/m nominal ambroxol concentration. The measurements were made by reflection using a fiber-optic probe, and calibration was carried out by partial least square regression (PLSR) with autoscaling. Model validation was performed by randomly splitting the data set into a calibration set and a validation set (7 samples for calibration and 5 samples for validation). The developed NIR method gave results comparable to the known values of tablets from a laboratory manufacturing process, with the standard error of calibration (SEC) and standard error of prediction (SEP) being 0.49% and 0.49% m/m, respectively. The method showed good accuracy and repeatability. NIR spectroscopic determination in intact tablets suggests potential use in real-time monitoring of a running production process.
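
Two steps named in this abstract, autoscaling and the reporting of an error on a held-out validation set, can be sketched in plain Python. The numbers below are hypothetical, and the error is computed as a plain root-mean-square error; published SEC and SEP definitions often apply bias or degrees-of-freedom corrections, so this is only an approximation of those statistics.

```python
import math

def column_stats(X):
    """Per-column mean and sample standard deviation (n - 1 divisor)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (n - 1))
        for j in range(p)
    ]
    return means, stds

def autoscale(X, means, stds):
    """Center each column and scale it to unit variance.

    Assumes no column is constant (a zero std would divide by zero).
    """
    return [[(row[j] - means[j]) / stds[j] for j in range(len(row))] for row in X]

def rms_error(predicted, reference):
    """Root-mean-square difference between predictions and references."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)

# Hypothetical two-wavelength "spectra" for calibration and validation.
X_cal = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_val = [[4.0, 40.0]]

# Scaling parameters come from the calibration set only and are then
# reused for the validation set.
means, stds = column_stats(X_cal)
X_cal_scaled = autoscale(X_cal, means, stds)
X_val_scaled = autoscale(X_val, means, stds)

# Hypothetical predicted vs. known concentrations (% m/m).
error = rms_error([12.0, 13.0], [12.5, 12.5])
```

Reusing the calibration means and standard deviations on the validation samples keeps the validation error an honest estimate, since no information from the held-out samples enters the preprocessing.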

Multivariate Procedure for Variable Selection and Classification of High Dimensional Heterogeneous Data

  • Mehmood, Tahir;Rasheed, Zahid
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 22, No. 6
    • /
    • pp.575-587
    • /
    • 2015
  • Developments in data collection techniques produce high dimensional data sets, where discrimination is an important and commonly encountered problem that is crucial to resolve when the high dimensional data are heterogeneous (a non-common variance-covariance structure across classes). An example is the classification of microbial habitat preferences based on codon/bi-codon usage. Habitat preference is important for studying evolutionary genetic relationships and may help industry produce specific enzymes. Most classification procedures assume homogeneity (a common variance-covariance structure for all classes), which is not guaranteed in most high dimensional data sets. We introduce regularized elimination in partial least square coupled with QDA (rePLS-QDA) for parsimonious variable selection and classification of high dimensional heterogeneous data sets, based on the recently introduced regularized elimination for variable selection in partial least square (rePLS) and the heterogeneous classification procedure quadratic discriminant analysis (QDA). The proposed and existing methods are compared on a simulated data set; in addition, the proposed procedure is used to classify microbial habitat preferences by their codon/bi-codon usage. Five bacterial habitats (Aquatic, Host Associated, Multiple, Specialized and Terrestrial) are modeled. The classification accuracy for each habitat is satisfactory and ranges from 89.1% to 100% on test data. Interesting codon/bi-codon usages and their mutual interactions that influence the respective habitat preferences are identified. The proposed method also produced results that concur with known biological characteristics, which will help researchers better understand the divergence of species.

Discrimination of Alismatis Rhizoma According to Geographical Origins using Near Infrared Spectroscopy

  • 이동영;김승현;김효진;성상현
    • 생약학회지
    • /
    • Vol. 44, No. 4
    • /
    • pp.344-349
    • /
    • 2013
  • Near infrared spectroscopy (NIRS) combined with multivariate analysis was used to discriminate the geographical origin of Alisma orientale from Korea (n=94) and China (n=72). Two-thirds of the samples were selected randomly for the training set, and one-third for the test set. The second derivative was used for pretreatment of the NIR spectra. Partial least square discriminant analysis (PLS-DA) models correctly discriminated 100% of the Korean and Chinese A. orientale samples. These results demonstrate the potential use of NIR spectroscopy combined with multivariate analysis as a rapid and accurate method to discriminate A. orientale according to its geographical origin.
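
The PLS-DA step mentioned in this abstract can be sketched in its simplest two-class form: code the classes as +1 and -1, fit a PLS regression to that coding, and classify by the sign of the prediction. The sketch below uses a single PLS component, plain Python, and hypothetical two-feature data; the function names and the sign threshold are illustrative assumptions, not the authors' model.

```python
import math

def fit_plsda(X, labels, positive):
    """One-component PLS-DA: PLS1 regression on a +1 / -1 class coding."""
    n, p = len(X), len(X[0])
    y = [1.0 if lab == positive else -1.0 for lab in labels]
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with the coding.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores, then regression of the class coding on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_means, y_mean, w, q

def classify(model, x, positive, negative):
    """Assign the class by the sign of the predicted coding."""
    x_means, y_mean, w, q = model
    score = y_mean + q * sum(w[j] * (x[j] - x_means[j]) for j in range(len(x)))
    return positive if score > 0 else negative

# Hypothetical pretreated features for samples of two origins.
X_train = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3],
           [0.2, 1.0], [0.1, 1.1], [0.3, 0.9]]
labels = ["Korea", "Korea", "Korea", "China", "China", "China"]

model = fit_plsda(X_train, labels, positive="Korea")
first = classify(model, [1.1, 0.2], "Korea", "China")
second = classify(model, [0.2, 1.2], "Korea", "China")
```

With real spectra, more components and a held-out test set (as in the two-thirds / one-third split described above) would be needed to judge the classifier.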