• Title/Abstract/Keywords: sets of variables

522 search results (processing time: 0.024 s)

AGRICULTURAL DROUGHT RISK ASSESSMENT USING REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEM

  • Narongrit, Chada;Yeesoonsang, Seesai
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 2003, Proceedings of ACRS 2003 ISRS
    • /
    • pp.991-993
    • /
    • 2003
  • Four sets of environmental variables covering meteorology, hydrology and physiography were analyzed to generate a spatial drought risk index for Phitsanulok province, Thailand. K-means and discriminant analyses were applied to the selected drought variables to group each spatial variable set into 4 classes. Based on group statistics, the 4 classes were then recoded as no risk, low risk, moderate risk, and high risk. The regression coefficients between the recoded classes and the selected environmental variables were then applied as spatial variable weights on the thematic datasets in GIS spatial analysis. The results showed that the meteorological variable received the highest drought weighting score among the variables.

Variable Arrangement for Data Visualization

  • Huh, Moon Yul;Song, Kwang Ryeol
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 8, No. 3
    • /
    • pp.643-650
    • /
    • 2001
  • Classical plots such as scatterplot matrices and parallel coordinates are valuable tools for data visualization. These tools are used extensively in modern data mining software to explore the inherent data structure and hence to visually classify or cluster the database into appropriate groups. However, the interpretation of these plots is very sensitive to the arrangement of the variables. In this work, we introduce two methods for arranging the variables for data visualization. The first method is based on the work of Wegman (1999) and arranges the variables using the minimum distance among all pairwise permutations of the variables. The second method uses the idea of principal components. We investigate the effectiveness of these methods with parallel coordinates using real data sets, and show that each of the two proposed methods has its own strengths in different respects.
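
As a rough illustration of the first idea (ordering variables so that neighbouring axes are "close"), the sketch below brute-forces every ordering of a small set of variables and keeps the one with the smallest total distance between neighbours. The distance `1 - |r|` (with `r` the Pearson correlation), the toy data and the function names are assumptions made for this example; they are not taken from the paper or from Wegman (1999).

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_order(columns):
    """Return the variable order minimising the summed adjacent distance.

    `columns` maps variable name -> list of values. The distance between
    two variables is assumed to be 1 - |correlation| (illustrative choice).
    Brute force over all orderings: only practical for a few variables.
    """
    names = sorted(columns)
    dist = {
        (a, b): 1.0 - abs(pearson(columns[a], columns[b]))
        for a in names for b in names if a != b
    }

    def cost(order):
        return sum(dist[(order[i], order[i + 1])] for i in range(len(order) - 1))

    return min(permutations(names), key=cost)


# Hypothetical data: "a" and "c" are almost perfectly correlated.
data = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 1, 4, 3, 5],
    "c": [1.1, 2.0, 2.9, 4.2, 5.0],
    "d": [5, 3, 4, 1, 2],
}
order = best_order(data)
print(order)
```

With these toy values the strongly correlated pair ends up on adjacent axes. For more than about ten variables the factorial search becomes impractical and a heuristic ordering would be needed.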

Canonical Correlation of 3D Visual Fatigue between Subjective and Physiological Measures

  • Won, Myeung Ju;Park, Sang In;Whang, Mincheol
    • Journal of the Ergonomics Society of Korea
    • /
    • Vol. 31, No. 6
    • /
    • pp.785-791
    • /
    • 2012
  • Objective: The aim of this study was to investigate the correlation between 3D visual fatigue and physiological measures using canonical correlation analysis, which enables categorical correlation. Background: Few studies have investigated the physiological mechanism underlying the visual fatigue caused by processing 3D information, which may overload the cognitive mechanism. Moreover, even those studies lack validation in terms of the correlation between physiological variables and visual fatigue. Method: 9 female and 6 male subjects with a mean age of $22.53 \pm 2.55$ voluntarily participated in this experiment. All participants were asked to report how they felt about their health state after viewing 3D content. In addition, Low & Hybrid measurement tests (Event Related Potential, Steady-state Visual Evoked Potential) for evaluating cognitive fatigue were performed before and after viewing 3D content. The physiological signals were measured together with the subjective fatigue evaluation before and after watching the 3D content. To examine categorical correlation, all measures were grouped into three sets: the Visual Fatigue set (response time, subjective evaluation), the Autonomic Nervous System set (PPG frequency, PPG amplitude, HF/LF ratio), and the Central Nervous System set (ERP amplitude P4, O1, O2; ERP latency P4, O1, O2; SSVEP S/N ratio P4, O1, O2). Canonical correlation analysis was then conducted on the three variable sets. Results: The results showed a significant correlation between visual fatigue and physiological measures. However, different visual fatigue variables were highly correlated with the HF/LF ratio and with ERP latency (O2), respectively. Conclusion: Response time was highly correlated with ERP latency (O2), while the subjective evaluation was highly correlated with the HF/LF ratio. Application: This study may provide the most significant variables for the quantitative evaluation of visual fatigue, using HF/LF ratio and ERP latency, based on human performance and subjective fatigue.

Splitting Decision Tree Nodes with Multiple Target Variables

  • 김성준
    • Korean Institute of Intelligent Systems: Conference Proceedings
    • /
    • Korea Fuzzy Logic and Intelligent Systems Society, Proceedings of the 2003 Spring Conference
    • /
    • pp.243-246
    • /
    • 2003
  • Data mining is the process of discovering patterns useful for decision making in large amounts of data. It has recently received much attention in a wide range of business and engineering fields. Classifying a group into subgroups is one of the most important subjects in data mining. Tree-based methods, known as decision trees, provide an efficient way of finding classification models. The primary concern in tree learning is to minimize node impurity, which is evaluated using a target variable in the data set. However, there are situations where multiple target variables should be taken into account, such as manufacturing process monitoring, marketing science, and clinical and health analysis. The purpose of this article is to present several methods for measuring node impurity that are applicable to data sets with multiple target variables. Numerical examples are given for illustration, with discussion.
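
One simple way to extend node impurity to several target variables is to average a single-target impurity over the targets. The sketch below does this with the Gini index and scores a candidate split by the weighted impurity decrease. Equal weighting of the targets, the toy labels and the function names are assumptions chosen for illustration; the abstract does not say which measures the paper itself proposes.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of one categorical target."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def multi_gini(rows):
    """Mean Gini impurity over all target columns.

    `rows` is a list of tuples, one tuple of target values per sample.
    Equal weighting of targets is an illustrative assumption.
    """
    targets = list(zip(*rows))
    return sum(gini(t) for t in targets) / len(targets)


def split_gain(parent, left, right):
    """Impurity decrease achieved by splitting `parent` into `left`/`right`."""
    n = len(parent)
    child = (len(left) * multi_gini(left) + len(right) * multi_gini(right)) / n
    return multi_gini(parent) - child


# Hypothetical node with two targets per sample: (defect, grade).
parent = [("y", "A"), ("y", "A"), ("n", "B"), ("n", "B")]
good_gain = split_gain(parent, parent[:2], parent[2:])
poor_gain = split_gain(parent, [parent[0], parent[2]], [parent[1], parent[3]])
print(good_gain, poor_gain)
```

A split that separates both targets cleanly gets a larger gain than one that leaves every child as mixed as the parent, which is the behaviour a multi-target splitting criterion needs.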

Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes

  • Park, Chanwoo;Jiang, Nan;Park, Taesung
    • Genomics & Informatics
    • /
    • Vol. 17, No. 4
    • /
    • pp.47.1-47.12
    • /
    • 2019
  • The achievements of genome-wide association studies have suggested ways to predict diseases, such as type 2 diabetes (T2D), using single-nucleotide polymorphisms (SNPs). Most T2D risk prediction models have used SNPs in combination with demographic variables. However, it is difficult to evaluate the pure additive contribution of genetic variants to classically used demographic models. Since prediction models include some heritable traits, such as body mass index, the contribution of SNPs using unmatched case-control samples may be underestimated. In this article, we propose a method that uses propensity score matching to avoid underestimation by matching case and control samples, thereby determining the pure additive contribution of SNPs. To illustrate the proposed propensity score matching method, we used SNP data from the Korea Association Resources project and reported SNPs from the genome-wide association study catalog. We selected various SNP sets via stepwise logistic regression (SLR), least absolute shrinkage and selection operator (LASSO), and the elastic-net (EN) algorithm. Using these SNP sets, we made predictions using SLR, LASSO, and EN as logistic regression modeling techniques. The accuracy of the predictions was compared in terms of area under the receiver operating characteristic curve (AUC). The contribution of SNPs to T2D was evaluated by the difference in the AUC between models using only demographic variables and models that included the SNPs. The largest difference among our models showed that the AUC of the model using genetic variants with demographic variables could be 0.107 higher than that of the corresponding model using only demographic variables.
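
The comparison described in this abstract comes down to computing the AUC for two sets of predicted risks on the same samples and taking the difference. A minimal sketch using the rank-based (Mann-Whitney) form of the AUC follows; the labels and scores are made-up toy values, not data from the study, and the propensity score matching step is not shown.

```python
def auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as one half.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical): 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
demographic_only = [0.6, 0.4, 0.7, 0.5, 0.3, 0.65]
with_snps = [0.7, 0.6, 0.8, 0.5, 0.3, 0.65]

# Contribution of the added predictors, measured as the AUC difference.
delta_auc = auc(labels, with_snps) - auc(labels, demographic_only)
print(round(delta_auc, 3))
```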

A Technique to Improve the Fit of Linear Regression Models for Successive Sets of Data

  • Park, Sung H.
    • Journal of the Korean Statistical Society
    • /
    • Vol. 5, No. 1
    • /
    • pp.19-28
    • /
    • 1976
  • In empirical studies that fit a multiple linear regression model to successive cross-sectional data observed on the same set of independent variables over several time periods, one often faces the problem of a poor $R^2$, the multiple coefficient of determination, which provides a standard measure of how well a specified regression line fits the sample data.
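
For reference, $R^2$ can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below fits a least-squares line to each of several cross-sections observed on the same predictor values and reports $R^2$ per period. The data are invented for illustration, and the single-predictor setup is a simplification of the multiple regression discussed in the paper; the paper's improvement technique itself is not shown.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1


def r_squared(x, y):
    """Coefficient of determination of the fitted line: 1 - SSE / SST."""
    b0, b1 = fit_line(x, y)
    my = sum(y) / len(y)
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1.0 - sse / sst


# Hypothetical cross-sections observed on the same predictor values.
x = [1, 2, 3, 4, 5]
periods = {
    "t1": [2, 4, 6, 8, 10],  # exactly linear in x
    "t2": [2, 5, 5, 9, 9],   # noisier relationship
}
r2 = {name: r_squared(x, y) for name, y in periods.items()}
print(r2)
```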

Nonlinear Canonical Correlation Analysis for Paralysis Disease Data

  • Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 15, No. 3
    • /
    • pp.515-521
    • /
    • 2004
  • Categorical data are mostly found in oriental medical research. The nonlinear canonical correlation analysis does not assume an interval level of measurement. In this paper, we apply nonlinear canonical correlation analysis to quantification and explain how similar sets of variables are to one another for paralysis disease data.

An Application of Canonical Analysis on the Distribution of Lichens in Mt. Duckyuoo

  • Park, Seung Tai
    • The Korean Journal of Ecology
    • /
    • Vol. 9, No. 3
    • /
    • pp.135-147
    • /
    • 1986
  • Simplifying complex data and finding trends in them, where a relationship is assumed between predictor variables and object variables, is one of the primary objectives of ecological research. This study aimed to apply canonical analysis, consisting of canonical correlation analysis and canonical variate analysis, to lichen vegetation and several environmental variables: elevation, height above ground, exposure side and cover values. Data were collected from the Duckyoo National Park in August 1985. Lichen species were ranked by equivocation information theory with cover values. Canonical correlation analysis was applied to one data set containing both environmental variables and lichen family. In order to make two sets of data matrix, the scale of position vector ordination was calculated from the vector scalar product for lichen species. Canonical variate analysis was applied to rearranged data made by interval class coding of the environmental variables. The sharpness values were calculated from the frequencies of contingency tables, and the dispersion profiles of each species in classes of environmental variables were designed to extract component values based on the decomposition of expected frequencies in the contingency table. The canonical correlation analysis gave a first canonical correlation of 0.815 (89%) and a second of 0.083 (11%). The significance test showed that the hypothesis of joint mutuality of canonical correlation is accepted (P>0.05). The relation between the canonical scores of vegetation variables and those of environmental variables indicated a linear tendency.

A study on the prediction of optimized injection molding conditions and the feature selection using the Artificial Neural Network (ANN)

  • 양동철;김종선
    • Design & Manufacturing
    • /
    • Vol. 16, No. 3
    • /
    • pp.50-57
    • /
    • 2022
  • The quality of products made by injection molding is strongly influenced by the process variables of the injection molding machine set by the engineer. Because the processing conditions have a complex impact on product quality, and given the stochastic nature of the manufacturing process, it is very difficult to predict the quality of an injection molded product. The artificial neural network (ANN) is recognized as capable of mapping the intricate relationship between input and output variables very accurately, so many studies use ANNs to predict the relationship between product results and process variables. However, when only a small number of data sets is available, the predictive performance and robustness of an ANN model can be reduced by having too many input variables. In the present study, an ANN model was developed that predicts the length of the injection molded product for multiple combinations of process variables. The accuracy of the ANN models was compared between 8 process variables and the 4 important process inputs determined by feature selection. The comparison verified that the performance of the ANN model increased when only the 4 important variables were used.

A Link between Perceived and Produced Vowel Spaces of Korean Learners of English

  • 양병곤
    • Phonetics and Speech Sciences
    • /
    • Vol. 6, No. 3
    • /
    • pp.81-89
    • /
    • 2014
  • Korean learners of English tend to have difficulty perceiving and producing English vowels. The purpose of this study is to examine a link between the perceived and produced vowel spaces of Korean learners of English. Sixteen Korean male and female participants perceived two sets of English synthetic vowels on a computer monitor and rated their naturalness. The same participants produced English vowels in a carrier sentence with high and low pitch variation in a clear speaking mode. The author compared the perceived and produced vowel spaces in terms of the pitch and gender variables. First, the results showed that the perceived vowel spaces were not significantly different for either variable. Korean learners perceived the vowels similarly. They differentiated neither the tense-lax vowel pairs nor the low vowels. Secondly, the produced vowel spaces of the male and female groups showed a 25% difference, which may have come from physiological differences in vocal tract length. Thirdly, the comparison of the perceived and produced vowel spaces revealed that although the vowel space patterns of the Korean male and female learners appeared similar, which may suggest a relative link between perception and production, statistical differences existed in some vowels because of the acoustical properties of the synthetic vowels, which may suggest an independent link. The author concluded that any comparison between the perceived and produced vowel spaces of nonnative speakers should be made cautiously. Further studies would be desirable to examine how Koreans perceive different sets of synthetic vowels.