• Title/Abstract/Keywords: CLASSIFICATION FACTOR

Search results: 1,374 items; processing time: 0.03 s

Dimension reduction for high-dimensional data via mixtures of common factor analyzers - an application to tumor classification

  • Baek, Jang-Sun
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 19, No. 3
    • /
    • pp.751-759
    • /
    • 2008
  • Mixtures of factor analyzers (MFA) are useful for modeling the distribution of high-dimensional data in a much lower-dimensional space where the number of observations is very large relative to their dimension. Mixtures of common factor analyzers (MCFA) can further reduce the number of parameters in the specification of the component covariance matrices when the number of classes is not small. Moreover, the factor scores of MCFA can be displayed in a low-dimensional space to distinguish the groups. We propose the factor scores of MCFA as new low-dimensional features for the classification of high-dimensional data. Compared with conventional dimension reduction methods such as principal component analysis (PCA) and canonical covariates (CV), the proposed factor scores gave higher correct classification rates on three real data sets when used in parametric and nonparametric classifiers.

도시와 농촌 지역 구분 기준 연구 (A Study on the Classification Criteria Between Urban and Rural Area)

  • 강대구
    • 농촌지도와개발
    • /
    • Vol. 16, No. 3
    • /
    • pp.557-586
    • /
    • 2009
  • The objective of this study is to identify criteria for distinguishing urban from rural areas and to classify all regions of Korea accordingly. To this end, related literature and statistical yearbooks were reviewed to find criteria and analyze data. From the literature review, indicators were selected in terms of rurality and urbanity, and the corresponding data were gathered from statistical yearbooks. Factor analysis was then used to find the first and second factors for classifying regions. Six factors were found: city surroundings (36%), non-farmer household population ratio (28.1%), cultivated acreage (12.48%), agricultural production surroundings (12.40%), change in the number of farm families (5.58%), and rise and fall in the number of households (5.54%). The rurality factors were cultivated acreage, agricultural production surroundings, change in the number of farm families, and rise and fall in the number of households; the urbanity factors were city surroundings and non-farmer household population ratio. Based on the loadings on the first and second factors, a four-type regional classification was derived.

3차원 인체 형상 데이터를 이용한 목밑둘레 유형화 연구 - 20대 여성을 중심으로 - (A Study on the Classification of Neck-Base Circumference by Three-Dimensional Automatic Measurements of the Human Body - With the Focus on Women in their 20's -)

  • 조신현;석혜정
    • 복식
    • /
    • Vol. 58, No. 6
    • /
    • pp.35-41
    • /
    • 2008
  • The purpose of this study was to analyze and classify the neck-base circumference shapes of women in their twenties using three-dimensional automatic measurement data of the human body, and thereby to understand the neck-base circumference shape of each type. The findings are as follows: 1. A comparison of the three-dimensional body measurement items relating to the neck-base circumference of women in their twenties showed that the cervicale-center-anterior neck radius had the largest individual difference of all items. 2. The factor analysis conducted to extract the factors constituting the neck-base circumference yielded the shape of the cervicale (factor 1), the shape of the neck section (factor 2), the thickness of the neck (factor 3), the shape of the anterior neck (factor 4), and the shape of the side neck (factor 5). 3. The classification of the neck-base circumference shapes resulted in three types: Type 1 was a reverse triangle hanging forward, Type 2 was a circle, and Type 3 was an oval open to the sides.

공통요인분석자혼합모형의 요인점수를 이용한 일반화가법모형 기반 신용평가 (A credit classification method based on generalized additive models using factor scores of mixtures of common factor analyzers)

  • 임수열;백장선
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 23, No. 2
    • /
    • pp.235-245
    • /
    • 2012
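  • Logistic discriminant analysis is a statistical technique widely used in finance. In credit evaluation it is popular for its ease of interpretation and good discriminatory power, but it is limited in explaining nonlinear relationships between the explanatory variables and the dependent variable. Generalized additive models retain the advantages of the logistic discriminant model while also capturing nonlinear relationships between the dependent and explanatory variables. However, when the number of continuous explanatory variables is very large, both methods face the problem of selecting the variables that are significant for the model. This study therefore proposes a credit classification method in which a large number of continuous explanatory variables are reduced in dimension by a mixture of common factor analyzers, and the resulting small number of factor scores are used as new continuous explanatory variables in a generalized additive model. In a comparison of correct classification rates on real financial data among the logistic discriminant model, the generalized additive model, and the proposed method, the proposed method showed better classification performance.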

Factor-analysis based questionnaire categorization method for reliability improvement of evaluation of working conditions in construction enterprises

  • Lin, Jeng-Wen;Shen, Pu Fun
    • Structural Engineering and Mechanics
    • /
    • Vol. 51, No. 6
    • /
    • pp.973-988
    • /
    • 2014
  • This paper presents a factor-analysis based questionnaire categorization method to improve the reliability of the evaluation of working conditions, without affecting the completeness of the questionnaire, in both Taiwanese and Chinese construction enterprises for structural engineering applications. The proposed approach springs from AI applications and expert systems in structural engineering. Questions with a similar response pattern are grouped into, or categorized as, one factor. Questions that form a single factor usually have higher reliability than the entire questionnaire, especially when the questionnaire is complex and inconsistent. By classifying questions based on the meanings of the words used in them and on the response scores, reliability could be increased. The principle for classification was that 90% of the questions in the same classified group must satisfy the proposed classification rule; in the results, the lowest rate was 92%. The results show that the question classification method could improve the reliability of the questionnaires for at least 0.7. Compared with the question deletion method using SPSS, 75% of the remaining questions matched the results obtained by applying the classification method.

실무적 적용 관점에서 신뢰성 분포의 유형화 모형의 고찰 (Review of Classification Models for Reliability Distributions from the Perspective of Practical Implementation)

  • 최성운
    • 대한안전경영과학회지
    • /
    • Vol. 13, No. 1
    • /
    • pp.195-202
    • /
    • 2011
  • The study interprets three classification models, based on the Bath-Tub Failure Rate (BTFR), the Extreme Value Distribution (EVD), and the Conjugate Bayesian Distribution (CBD). The classification model based on BTFR is analyzed in terms of three failure patterns (decreasing, constant, or increasing), which support systematic management strategies for reliability over time. The distribution model based on BTFR is identified using an individual factor in each of three corresponding cases. First, when the shape parameter is used, the distribution is analyzed with the component or part number as the factor. When the scale parameter is used, it is analyzed with time precision as the factor. When the location parameter is used, it is analyzed with guarantee time as the factor. The classification model based on EVD is divided into long-tailed, medium-tailed, and short-tailed distributions according to the length of the right tail, depending on the asymptotic reliability property that reflects the skewness and kurtosis of the distribution curve. Furthermore, the classification model based on CBD relies on the conjugate relations among the prior function, likelihood function, and posterior function, for dimension reduction and easy tractability in Bayesian posterior updating.

탐색적 요인분석을 이용한 도로특성분류에 관한 연구 (A Study on Road Characteristic Classification using Exploratory Factor Analysis)

  • 조준한;김성호;노정현
    • 대한교통학회지
    • /
    • Vol. 26, No. 3
    • /
    • pp.53-66
    • /
    • 2008
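  • This study establishes a concept of road characteristic classification from a new perspective, in order to complement the existing functional classification of roads and to identify the traffic characteristics of the typified road sections. Road characteristic classification is expected to serve as important reference material for design, policy making, and guideline preparation across the transportation field, including transportation planning and traffic operations management. For the classification, 12 explanatory variables were derived from permanent traffic count station data on national highways, and exploratory factor analysis was performed to examine the latent structure and multicollinearity through the intercorrelations among these variables and to extract factor scores. The study closely examined the decision problems encountered at each step of exploratory factor analysis and presented an appropriate evaluation criterion for each issue to reach an overall conclusion. A comparison of 10 scenarios to determine the appropriate explanatory variables and number of factors showed that the case including all 12 originally proposed variables performed best and that four factors were most appropriate. The results are expected to provide objective input data for subsequent analyses (cluster analysis, regression analysis, discriminant analysis, etc.), leading to more accurate findings.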

뇌성마비 아동의 신체기능이 완수동기에 미치는 영향 (The Effect of Motor Ability in Children with Cerebral Palsy on Mastery Motivation)

  • 이나정;오태영
    • The Journal of Korean Physical Therapy
    • /
    • Vol. 26, No. 5
    • /
    • pp.315-323
    • /
    • 2014
  • Purpose: This study was conducted to investigate the effect of motor ability on mastery motivation in children with cerebral palsy. Methods: Sixty children with cerebral palsy (5~12 years) and their parents participated in the study. Data on general characteristics and disability condition, the Gross Motor Functional Classification System, the Manual Ability Classification System, and the Dimensions of Mastery Questionnaire were collected. Independent t-tests and ANOVA were used to analyze differences in Dimensions of Mastery Questionnaire scores according to general and disability condition, Gross Motor Functional Classification System, and Manual Ability Classification System. Linear regression analysis was performed to determine the effects of the Gross Motor Functional Classification System and the Manual Ability Classification System on the Dimensions of Mastery Questionnaire. SPSS win. 22.0 was used, with Tukey's test for post hoc analysis; the level of statistical significance was 0.05. Results: The Dimensions of Mastery Questionnaire score differed significantly according to gender, region, type, disability rating, Gross Motor Functional Classification System, and Manual Ability Classification System (p<0.05). The Gross Motor Functional Classification System and the Manual Ability Classification System were significant factors affecting the Dimensions of Mastery Questionnaire score (p<0.05). Conclusion: These results suggest that the motor ability of children with cerebral palsy is an important factor affecting the Dimensions of Mastery Questionnaire score.

요인 및 군집분석을 이용한 지상 라이다 자료의 분류 (Classification of Terrestrial LiDAR Data Using Factor and Cluster Analysis)

  • 최승필;조지현;김열;김준성
    • 대한공간정보학회지
    • /
    • Vol. 19, No. 4
    • /
    • pp.139-144
    • /
    • 2011
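  • This study proposes a classification method for terrestrial LiDAR data that uses the color information (R, G, B) and the reflectance intensity information (I) obtained from the data simultaneously, analyzing their interrelationships with statistical classification techniques. To this end, factors that maximize variance were first extracted from the variables R, G, B, and I, and the factor matrix between the principal factors and each variable was computed. Although the factor matrix presents the raw data in reduced form, it does not make clear which variables are strongly related to which factors. The Varimax method, an orthogonal rotation, was therefore used to obtain a rotated factor matrix, from which factor scores were computed. Cluster analysis was then performed on these factor scores using K-means, a non-hierarchical clustering method, and the classification accuracy of the terrestrial LiDAR data was evaluated.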
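
The final clustering step described in this abstract can be illustrated with a minimal sketch. It covers only K-means (Lloyd's algorithm) applied to factor scores that are assumed to be already computed; the factor extraction and Varimax rotation are not shown. The function name `kmeans`, the seeding rule (first `k` points), and the score values are hypothetical illustrations, not taken from the paper.

```python
import math


def kmeans(points, k, n_iter=20):
    """Plain K-means (Lloyd's algorithm); the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical factor scores (factor 1, factor 2) for six LiDAR points;
# these numbers are made up for illustration only.
scores = [(-1.2, 0.1), (1.1, -0.2), (-0.9, 0.3),
          (1.3, 0.0), (-1.0, -0.1), (0.9, 0.2)]
labels, centroids = kmeans(scores, k=2)
print(labels)
```

Seeding from the first `k` points keeps the sketch deterministic; practical implementations usually use randomized or k-means++ initialization instead.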

SMOTE와 분류 기법을 활용한 산사태 위험 지역 결정 방법 (Method for Assessing Landslide Susceptibility Using SMOTE and Classification Algorithms)

  • 윤형구
    • 한국지반공학회논문집
    • /
    • Vol. 39, No. 6
    • /
    • pp.5-12
    • /
    • 2023
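  • Investigating and designating landslide-prone areas in advance is necessary to reduce extensive damage. The purpose of this study is to present a methodology for classifying the factor of safety of a target ground using classification algorithms from machine learning. The high risk area (HRA) model was applied to landslide-prone areas, and risk areas were determined from eight geotechnical properties. Four classification algorithms were used: decision tree (DT), K-Nearest Neighbor (KNN), logistic regression (LR), and random forest (RF), and the classification accuracy of the eight geotechnical properties was calculated for factors of safety in the range 1.2~2.0. Accuracy was reliably high for factors of safety in the range 1.2~1.7 but relatively low in the remaining range of 1.8~2.0. To overcome this, the synthetic minority over-sampling technique (SMOTE) algorithm was applied to augment the amount of data; applying the classification algorithms to the augmented data increased accuracy in the 1.8~2.0 range by about 250% on average. The results show that the SMOTE algorithm improved the accuracy of the classification algorithms by increasing the amount of data, and the approach is considered applicable to improving accuracy in other fields.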
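
The over-sampling idea behind SMOTE, as mentioned in this abstract, can be sketched in a few lines: each synthetic sample is placed at a random position on the segment between a minority-class sample and one of its k nearest minority-class neighbours. The sketch below is a simplified, standard-library illustration under stated assumptions; the function name `smote`, its parameters, and the toy data are hypothetical and do not reproduce the paper's data or implementation.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: two made-up features per sample, not from the paper.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two existing minority samples, the augmented data stay inside the region already occupied by the minority class; the classifier is then trained on the original and synthetic samples together.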