• Title/Abstract/Keywords: statistical analysis.

Search results: 17,816 items; processing time: 0.036 seconds

Classification of Microarray Gene Expression Data by MultiBlock Dimension Reduction

  • Oh, Mi-Ra;Kim, Seo-Young;Kim, Kyung-Sook;Baek, Jang-Sun;Son, Young-Sook
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 13, No. 3
    • /
    • pp.567-576
    • /
    • 2006
  • In this paper, we applied multiblock dimension reduction methods to the classification of tumors based on microarray gene expression data. The procedure involves clustering selected genes, multiblock dimension reduction, and classification using linear discriminant analysis and quadratic discriminant analysis.

Asymptotic Test for Dimensionality in Probabilistic Principal Component Analysis with Missing Values

  • Park, Chong-sun
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 11, No. 1
    • /
    • pp.49-58
    • /
    • 2004
  • In this talk we proposed an asymptotic test for dimensionality in the latent variable model for probabilistic principal component analysis with values missing at random. The proposed algorithm is a sequential likelihood ratio test for an appropriate Normal latent variable model for principal component analysis. A modified EM algorithm is used to find the MLE of the model parameters. Results from simulations and real data sets give promising evidence that the proposed method is useful for finding the necessary number of components in principal component analysis with values missing at random.

Development of Discriminant Analysis System by Graphical User Interface of Visual Basic

  • Lee, Yong-Kyun;Shin, Young-Jae;Cha, Kyung-Joon
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 18, No. 2
    • /
    • pp.447-456
    • /
    • 2007
  • Recently, multivariate statistical analysis has been used to extract meaningful information from various data. In this paper, we develop a multivariate statistical analysis system that combines Fisher discriminant analysis, logistic regression, neural networks, and decision trees, using Visual Basic 6.0.

Recent Developments in Discriminant Analysis from an Information Geometric Point of View

  • Eguchi, Shinto;Copas, John B.
    • Journal of the Korean Statistical Society
    • /
    • Vol. 30, No. 2
    • /
    • pp.247-263
    • /
    • 2001
  • This paper concerns the problem of classification based on training data. A framework of information geometry is given to elucidate the characteristics of discriminant functions, including logistic discrimination and AdaBoost. We discuss a class of loss functions from a unified viewpoint.

Detecting Multiple Outliers Using the Gaps of Order Statistics

  • Kim, Hyun Chul
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 2, No. 2
    • /
    • pp.184-197
    • /
    • 1995
  • An objective, one-step procedure for detecting multiple outliers is proposed, based on the gaps of the order statistics. The procedure can serve as a routine outlier detection method in a statistical analysis computer program. It is applied to several examples, including the data selected by Kitagawa.

Optimal Designs for Multivariate Nonparametric Kernel Regression with Binary Data

  • Park, Dong-Ryeon
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 2, No. 2
    • /
    • pp.243-248
    • /
    • 1995
  • The problem of optimal design for a nonparametric regression with binary data is considered. The aim of the statistical analysis is the estimation of a quantal response surface in two dimensions. Bias, variance and IMSE of kernel estimates are derived. The optimal design density with respect to asymptotic IMSE is constructed.

Change Analysis with the Sample Fourier Coefficients

  • Jaehee Kim
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 3, No. 1
    • /
    • pp.207-217
    • /
    • 1996
  • The problem of detecting change with independent data is considered. The asymptotic distribution of the sample change process based on the sample Fourier coefficients is shown to be a Brownian bridge process. We suggest using dynamic statistics, such as a sample Brownian bridge, and graphs as statistical animation. Graphs, including change PP plots, are given by way of illustration with simulated data.

Statistical bioinformatics for gene expression data

  • Lee, Jae-K.
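    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2001, 2nd International Symposium on Bioinformatics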
    • /
    • pp.103-127
    • /
    • 2001
  • Gene expression studies require statistical experimental designs and validation before laboratory confirmation. Various clustering approaches, such as hierarchical clustering, K-means, and SOM, are commonly used for unsupervised learning in gene expression data. Several classification methods, such as gene voting, SVM, or discriminant analysis, are used for supervised learning, where well-defined response classification is possible. Estimating gene-condition interaction effects requires advanced, computationally intensive statistical approaches.

Assessment of Water Quality in the Lower Reaches Namhan River by using Statistical Analysis and Water Quality Index (WQI)

  • 조용철;최현미;류인구;김상훈;신동석;유순주
    • Journal of Korean Society on Water Environment
    • /
    • Vol. 37, No. 2
    • /
    • pp.114-127
    • /
    • 2021
  • Water pollution in the lower reaches of the Namhan River is worsening because of drought and reduced water quantity caused by climate change, and this in turn affects the water quality of Paldang Lake. Accordingly, this study used a water quality index (WQI) and statistical analysis to identify the characteristics of the water quality in the lower reaches of the Namhan River, the main causes of water pollution, and the tributaries that need priority management. Ten items (WT, pH, EC, DO, BOD, COD, SS, T-N, T-P, and TOC) were used as the water quality factors for the statistical analysis, and the matrix of data was set as 324 × 10·1. The correlation analysis demonstrated a strong, statistically significant correlation between chemical oxygen demand (COD) and T-P (r=0.700, p<0.01). Furthermore, principal component analysis (PCA) revealed that the main factors affecting the change in water quality were T-P and organic substances introduced into the water by rainfall. Based on the Mann-Kendall test, a statistically significant increase in pH was observed in SH-1, DL, SH-2, CM, and BH, along with an increase in WQI in SH-2 and SM. BH was identified as a tributary that needs priority management in the lower reaches of the Namhan River, with a "Somewhat poor" (IV) grade in T-P, "Fair" grade in WQI, and "Marginal" grade in summer.
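
The abstract above relies on the Mann-Kendall test to detect monotonic trends in pH and WQI. As a rough illustration only, the following Python sketch implements the standard two-sided Mann-Kendall test with the usual tie correction and normal approximation. The function name `mann_kendall` and the sample series are hypothetical; they are not taken from the paper, and the authors' actual software and settings are not stated in the abstract.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Two-sided Mann-Kendall trend test (normal approximation).

    Returns (S, Z, p): the S statistic, the continuity-corrected
    Z score, and the two-sided p-value. Hypothetical helper, not
    the paper's implementation.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S counts concordant minus discordant pairs (i < j).
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S, reduced by a correction term for tied groups.
    tie_term = sum(t * (t - 1) * (2 * t + 5)
                   for t in Counter(values).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if var_s == 0:
        # All observations identical: no evidence of a trend.
        return s, 0.0, 1.0

    # Continuity correction: shrink |S| by one before standardizing.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


if __name__ == "__main__":
    # Made-up annual pH values, for illustration only.
    ph_series = [7.1, 7.2, 7.2, 7.4, 7.5, 7.5, 7.7, 7.8, 7.9, 8.0]
    s, z, p = mann_kendall(ph_series)
    print(f"S={s}, Z={z:.3f}, p={p:.4f}")
```

A positive Z with a small p-value suggests an increasing trend, which is how a statement such as "a statistically significant increase in pH" would typically be supported. Seasonal data are often analyzed with a seasonal variant of the test instead, and the abstract does not say which variant was used.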

Use of Multivariate Statistical Approaches for Decoding Chemical Evolution of Groundwater near Underground Storage Caverns

  • 이정훈
    • Journal of the Korean Earth Science Society
    • /
    • Vol. 35, No. 4
    • /
    • pp.225-236
    • /
    • 2014
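  • Multivariate statistical techniques have been widely used to analyze and interpret hydrogeochemical data. In this study, correspondence analysis and principal component analysis were applied together to examine the characteristics of groundwater affected by anthropogenic activities. The objective of this study is to compute the speciation of the chemical constituents of groundwater using WATEQ4F within the NETPATH program and to extract geochemical information from the results using multivariate statistical techniques. The study area is an LPG storage facility in Ulsan, in the southeastern part of the Korean Peninsula. Groundwater with a hyperalkaline composition, as observed at other storage facilities, was found in the study area. Such high-pH groundwater resulting from anthropogenic influence can alter the speciation of Al and induce the precipitation of carbonates. This study focused on how the hydrogeochemical characteristics and phases change under two anthropogenic factors that affect groundwater in the study area (cleaning activity and the influence of cement). A comparison of previous findings with the results of the two statistical analyses shows that principal component analysis and correspondence analysis based on geochemical information can serve as a basis for hydrogeochemical studies.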