• Title/Summary/Keyword: multivariate data analysis


Short-term Wind Farm Power Forecasting Using Multivariate Analysis to Improve Wind Power Efficiency (풍력발전 설비 효율화를 위한 다변량 분석을 이용한 풍력발전단지 단기 출력 예측 방법)

  • Wi, Young-Min
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.29 no.7
    • /
    • pp.54-61
    • /
    • 2015
  • This paper presents a short-term wind farm power forecasting method using multivariate analysis and time series. Based on factor analysis, the proposed method constructs new independent variables composed from raw independent variables such as wind speed, ramp rate, and wind power. The newly created variables are used in the time series model for forecasting wind farm power. To demonstrate the improved accuracy, the proposed method is compared with the persistence model, commonly used as a reference in wind power forecasting, using data from Jeju Island. The results of case studies are presented to show the effectiveness of the proposed forecasting method.
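
The persistence model named in the abstract as the reference forecast can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `persistence_forecast` and `mean_absolute_error` and the power values are hypothetical and made up for the example.

```python
import numpy as np

def persistence_forecast(series, horizon=1):
    """Predict the value at time t + horizon as the value observed at time t."""
    series = np.asarray(series, dtype=float)
    return series[:-horizon]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between observations and forecasts."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

# Made-up hourly wind farm output in MW (illustrative only, not Jeju Island data).
power = [10.0, 12.0, 11.0, 15.0, 14.0]
pred = persistence_forecast(power, horizon=1)  # forecasts for hours 1 to 4
actual = power[1:]                             # observations for hours 1 to 4
mae = mean_absolute_error(actual, pred)
```

A candidate forecasting method would typically be judged by whether its error on the same hours is lower than this baseline error.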

Multivariate Analysis of the Prognosis of 37 Chondrosarcoma Patients

  • Yang, Zheng-Ming;Tao, Hui-Min;Ye, Zhao-Ming;Li, Wei-Xu;Lin, Nong;Yang, Di-Sheng
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.13 no.4
    • /
    • pp.1171-1176
    • /
    • 2012
  • Objective: The current study aimed to screen for possible factors that affect the prognosis of chondrosarcoma. Methods: Thirty-seven cases were selected and analyzed statistically. The patients received surgical treatment at our hospital between December 2005 and March 2008. All of them had complete follow-up data. The survival rates were calculated by univariate analysis using the Kaplan-Meier method and tested by the log-rank test. $\chi^2$ or Fisher exact tests were carried out for the enumeration data. The indexes found significant in univariate analysis were then analyzed by multivariate analysis using the Cox regression model. Based on the literature, the factors of gender, age, disease course, tumor location, Enneking grade, surgical approach, distant metastasis and local recurrence were examined. Results: Univariate analysis showed that there were significant differences in Enneking grade, surgical approach and distant metastasis related to the patients' 3-year survival rate after surgery (P<0.001). No significant difference was found in gender, age, disease course, tumor location or local recurrence (P>0.05). Multivariate analysis showed that Enneking grade (P=0.007) and surgical approach (P=0.010) were independent factors affecting the prognosis of chondrosarcoma, but distant metastasis was not (P=0.942). Conclusion: Enneking grade, surgical approach and distant metastasis are risk factors for the prognosis of chondrosarcoma, among which the former two are independent factors.
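
The Kaplan-Meier method mentioned in the Methods can be sketched as a product-limit estimate over the distinct event times. The code below is a minimal illustration under stated assumptions: the function name `kaplan_meier` is hypothetical, and the follow-up times and event flags are made up rather than taken from the study.

```python
import numpy as np

def kaplan_meier(times, events):
    """Return (distinct event times, survival probabilities).

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if the patient was censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)                    # still under observation at t
        deaths = np.sum((times == t) & (events == 1))   # events occurring exactly at t
        s *= 1.0 - deaths / at_risk                     # product-limit update
        survival.append(s)
    return event_times, np.array(survival)

# Made-up follow-up data in months (not from the study).
t, s = kaplan_meier([6, 12, 12, 20, 30, 36], [1, 1, 0, 1, 0, 0])
```

Censored patients contribute to the at-risk count until their last follow-up but never trigger a drop in the curve.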

Use of Multivariate Statistical Approaches for Decoding Chemical Evolution of Groundwater near Underground Storage Caverns (다변량통계기법을 이용한 지하저장시설 주변의 지하수질 변동에 관한 연구)

  • Lee, Jeonghoon
    • Journal of the Korean earth science society
    • /
    • v.35 no.4
    • /
    • pp.225-236
    • /
    • 2014
  • Multivariate statistical analyses have been extensively applied to hydrochemical measurements to analyze and interpret the data. This study examines anthropogenic factors obtained by applying correspondence analysis (CA) and principal component analysis (PCA) to a hydrogeochemical data set. The goal was to synthesize the hydrogeochemical information using these multivariate statistical techniques by incorporating hydrogeochemical speciation results calculated by the commonly used program WATEQ4F, which is included in NETPATH. The selected case study was a set of LPG underground storage caverns located in southeastern Korea. The highly alkaline groundwaters in this study area are an analogue for the repository system. High pH, speciation of Al and possible precipitation of calcite characterize these groundwaters. Available groundwater quality monitoring data were used to confirm these statistical models. The present study focused on understanding the hydrogeochemical attributes and establishing the changes of phase when two anthropogenic effects (i.e., disinfection activity and cement pore water) were introduced in the study area. Comparisons between the two statistical results presented and the findings of previous investigations highlight the descriptive capabilities of PCA using the calculated saturation index and of CA as exploratory tools in hydrogeochemical research.

A Study on Forest Land Classification Using Multivariate Statistical Methods : A Case Study at Mt. Kwanak (다변수통계방법을 이용한 산지분류에 관한 연구)

  • 정순오
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.13 no.1
    • /
    • pp.43-66
    • /
    • 1985
  • Korea needs proper and rational public policies on the conservation and use of forest land and other natural resources because of the accelerating expansion of national land development in recent years. Unfortunately, there is no systematic planning system to support these needs. Generally, forest land use planning needs suitability analysis based on an efficient land classification system. The goal of this study was to classify forest land using multivariate statistical methods. A case study was carried out in the winter of 1983 on a mountainous area higher than 100 m above sea level located at Mt. Kwanak in Anyang-city, Kyung-gi-do (province). The study area covered 19.80 km$^2$ and was divided into 1,383 Operational Taxonomic Units (OTUs) by a 120 m $\times$ 120 m grid. Fourteen descriptors were identified and quantified for each OTU from existing national land data: elevation, slope, aspect, terrain form, geologic material, surface soil permeability, topsoil type, depth of the solum, soil acidity, forest cover type, stand size class, stand age class, stand density class, and simple forest soil capability class. For this study, a FORTRAN IV program was written for the input and output of map data, and the computer statistics packages SPSS and BMD were used to perform the multivariate statistical analysis. The fourteen variables were analyzed to investigate the characteristics of their frequency distributions and to estimate the correlation coefficients among them. Principal component analysis was executed to find the dimensions of forest land characteristics, and factor scores were used to draw proper samples of OTUs throughout the study area. In order to develop the classes of forest land classification based on 102 surrogates, cluster and discriminant analyses of the principal descriptor variable matrix were undertaken.
Results obtained through a series of multivariate statistical analyses were as follows: 1) Principal component analysis proved to be a useful tool for data selection and for the identification of principal descriptor variables, which represented the characteristics of forest land and facilitated the selection of samples.


Complex sample design effects and inference for Korea National Health and Nutrition Examination Survey data (국민건강영양조사 자료의 복합표본설계효과와 통계적 추론)

  • Chung, Chin-Eun
    • Journal of Nutrition and Health
    • /
    • v.45 no.6
    • /
    • pp.600-612
    • /
    • 2012
  • Nutritional researchers worldwide are using large-scale sample survey methods to study nutritional health epidemiology and services utilization in general, non-clinical populations. This article provides a review of important statistical methods and software that apply to descriptive and multivariate analysis of data collected in sample surveys, such as national health and nutrition examination surveys. A comparative data analysis of the Korea National Health and Nutrition Examination Survey (KNHANES) was used to illustrate analytical procedures and design effects for survey estimates of population statistics, model parameters, and test statistics. This article focuses on the following points: approaches to analyzing sample survey data, the software tools available to perform these analyses, and the correct survey analysis methods that are important to the interpretation of survey data. It addresses the question of how to approach the analysis of complex sample survey data. The latest developments in software tools for the analysis of complex sample survey data are covered, and empirical examples are presented that illustrate the impact of survey sample design effects on the parameter estimates, test statistics, and significance probabilities (p values) for univariate and multivariate analyses.
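
One simple way to get a feel for a design effect is Kish's approximation for unequal weighting, $n \sum w_i^2 / (\sum w_i)^2$. The sketch below is an assumption-laden illustration only: it ignores the clustering and stratification that a full complex-sample analysis would account for, the function name `kish_design_effect` is hypothetical, and the weights are made up rather than taken from KNHANES.

```python
import numpy as np

def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weights only.

    Clustering and stratification are NOT reflected here; a full
    complex-sample analysis would need the survey's design variables.
    """
    w = np.asarray(weights, dtype=float)
    n = w.size
    return float(n * np.sum(w ** 2) / np.sum(w) ** 2)

# Made-up sampling weights (illustrative only).
weights = [1.0, 1.0, 2.0, 4.0]
deff = kish_design_effect(weights)
effective_n = len(weights) / deff  # effective sample size under this approximation
```

With equal weights the value is 1, and it grows as the weights become more variable, which shrinks the effective sample size.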

Multivariate Analysis for the Case When the Dimension Is Large Compared to the Sample Size (Invited Paper)

  • Fujikoshi, Yasunori
    • Journal of the Korean Statistical Society
    • /
    • v.33 no.1
    • /
    • pp.1-24
    • /
    • 2004
  • This paper is concerned with statistical methods for multivariate data when the number $p$ of variables is large compared to the sample size $n$. Such data typically appear in the analysis of DNA microarrays, curve data, financial data, etc. However, there is little statistical theory for high-dimensional data. On the other hand, there are some asymptotic results under the assumption that both $n$ and $p$ tend to $\infty$ in some ratio $p/n \rightarrow c$. The results suggest that the new asymptotic results are more useful and insightful than the classical large-sample asymptotics. The main purpose of this paper is to review some asymptotic results for high-dimensional statistics as well as for classical statistics under a high-dimensional asymptotic framework.
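
A small numerical sketch shows one reason classical methods struggle when $p$ is large relative to $n$: the sample covariance matrix of $n$ centered observations has rank at most $n - 1$, so it cannot be inverted when $p \geq n$. The sizes and random data below are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the example is repeatable
n, p = 10, 50                    # assumed sizes: fewer observations than variables
X = rng.standard_normal((n, p))  # made-up data, one row per observation

S = np.cov(X, rowvar=False)      # p x p sample covariance (rows are centered)
rank = np.linalg.matrix_rank(S)  # at most n - 1 because of the centering

print(f"covariance shape: {S.shape}, rank: {rank}")
```

Any procedure that requires the inverse of the sample covariance matrix therefore needs modification in this regime.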

Multi-sensor data-based anomaly detection and diagnosis of a pumped storage hydropower plant

  • Sojin Shin;Cheolgyu Hyun;Seongpil Cho;Phill-Seung Lee
    • Structural Engineering and Mechanics
    • /
    • v.88 no.6
    • /
    • pp.569-581
    • /
    • 2023
  • This paper introduces a system to detect and diagnose anomalies in pumped storage hydropower plants. We collect data from various types of sensors, including those monitoring temperature, vibration, and power. The data are classified according to the operation modes (pump and turbine operation modes) and normalized to remove the influence of the external environment. To detect anomalies and diagnose their types, we adopt a multivariate normal distribution analysis by learning the distribution of the normal data. The feasibility of the proposed system is evaluated using actual monitoring data of a pumped storage hydropower plant. The proposed system can be used to implement condition monitoring systems for other plants through modifications.
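
Anomaly detection based on a multivariate normal distribution learned from normal data can be sketched with the squared Mahalanobis distance. This is a minimal illustration and not the authors' system: the function names, the two simulated sensor channels, and the chi-square cutoff are all assumptions made for the example.

```python
import numpy as np

def fit_normal(X):
    """Estimate the mean and inverse covariance from normal-operation data."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    return mean, np.linalg.inv(cov)

def mahalanobis_sq(x, mean, cov_inv):
    """Squared Mahalanobis distance of one observation from the fitted mean."""
    d = np.asarray(x, dtype=float) - mean
    return float(d @ cov_inv @ d)

# Simulated normal-operation data: two made-up channels (temperature, vibration).
rng = np.random.default_rng(0)
normal_data = rng.normal(loc=[60.0, 2.0], scale=[1.0, 0.1], size=(500, 2))
mean, cov_inv = fit_normal(normal_data)

# Assumed cutoff: approximate 99.9% quantile of chi-square with 2 degrees of freedom.
threshold = 13.82

score_ok = mahalanobis_sq([60.0, 2.0], mean, cov_inv)   # typical reading
score_bad = mahalanobis_sq([70.0, 2.0], mean, cov_inv)  # temperature far from normal
is_anomaly = score_bad > threshold
```

In practice the cutoff would be tuned to the acceptable false-alarm rate, and separate models could be fitted per operation mode, as the abstract describes for pump and turbine modes.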

Multivariate Statistical Analysis Approach to Predict the Reactor Properties and the Product Quality of a Direct Esterification Reactor for PET Synthesis (다변량 통계분석법을 이용한 PET 중합공정 중 직접 에스테르화 반응기의 거동 및 생산제품 예측)

  • Kim Sung Young;Chung Chang Bock;Choi Soo Hyoung;Lee Bomsock
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.11 no.6
    • /
    • pp.550-557
    • /
    • 2005
  • Multivariate statistical analysis methods, namely multiple linear regression (MLR) and partial least squares (PLS), have been applied to predict the reactor properties and the product quality of a direct esterification reactor for polyethylene terephthalate (PET) synthesis. The two multivariable analysis methods were applied to a data set that includes the flow rate of water vapor, the flow rate of EG vapor, the concentration of acid end groups of the product, and other operating conditions such as temperature, pressure, reaction time and feed monomer mole ratio. Their regression and prediction abilities have also been compared. The prediction results are critically compared, in terms of reliability, with actual plant data and with results from other mathematical models. This paper shows that the PLS approach can be used for reasonably accurate prediction of the product quality of a direct esterification reactor in the PET synthesis process.

A Study on the Estimation of Coefficients K and n Using Multivariate Data Analysis (다변량 통계기법을 이용한 K및 n의 산정에 관한 연구)

  • 백용진;최재성;배동명;김경진
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.13 no.8
    • /
    • pp.583-590
    • /
    • 2003
  • To estimate in advance the vibration level of the ground next to a dwelling, a multivariate statistical analysis was performed on experimental data acquired from a variety of construction sites, and a new estimation model for the values of K and n that can be applied in the diagnosis of damage is proposed. The results may be summarized as follows. First, $K_{95}$ and n showed high correlation at $P \leq 0.05$. In particular, the correlation coefficients with $W_{max}$ and S were higher for $K_{95}$ than for n, indicating that $K_{95}$ is generally associated with source conditions. Second, the factor analysis identified two major sources in each fraction. These sources accounted for at least 73% of the variance of $K_{95}$. Third, a multiple regression model for the estimation of $K_{95}$ was developed from Fac1, which depends on the source conditions, and Fac2, which depends on the transmission conditions. The n value can be determined from its correlation with $K_{95}$.

A multivariate latent class profile analysis for longitudinal data with a latent group variable

  • Lee, Jung Wun;Chung, Hwan
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.1
    • /
    • pp.15-35
    • /
    • 2020
  • In behavioral research, significant attention has been paid to the stage-sequential process for multiple latent class variables. We now explore the stage-sequential process of multiple latent class variables using multivariate latent class profile analysis (MLCPA). A latent profile variable, representing the stage-sequential process in MLCPA, is formed by a set of repeatedly measured categorical response variables. This paper proposes an extended MLCPA in order to explain the association between the latent profile variable and the latent group variable in the form of a two-dimensional contingency table. We applied the extended MLCPA to the National Longitudinal Survey on Youth 1997 (NLSY97) data to investigate the association between the developmental progression of depression and substance use behaviors among adolescents who experienced authoritarian parental styles in their youth.