• Title/Summary/Keyword: multivariate statistical techniques (다변량통계기법)

Search Result 132, Processing Time 0.028 seconds

Performance of PCA Algorithm for Multivariate Data Analysis (다변량 데이터 분석을 위한 PCA 알고리즘 구현)

  • Gim, GwiSuk;Shon, Ho Sun;Ryu, Keun Ho;Lee, YoungSung
    • Proceedings of the Korea Information Processing Society Conference / 2013.11a / pp.1264-1266 / 2013
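  • We implemented the PCA algorithm, one of the dimensionality reduction techniques most commonly used in multivariate data analysis, and compared its results with those of an existing statistical analysis program. In an experiment on the breast cancer data provided by UCI, both programs produced the same principal components, and the eigenvalues and variances also matched. PCA can therefore be applied to solve dimensionality reduction problems and analyze data without a commercial statistical package.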
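The kind of check described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes PCA is computed by eigendecomposition of the sample covariance matrix, uses synthetic data in place of the UCI breast cancer data, and uses an SVD-based computation as a stand-in for the reference statistical program. The function names `pca_eig` and `pca_svd` are hypothetical.

```python
import numpy as np

def pca_eig(X):
    """PCA via eigendecomposition of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)                   # center each variable
    cov = np.cov(Xc, rowvar=False)            # (p, p) covariance, ddof=1
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()       # proportion of variance per component
    scores = Xc @ eigvecs                     # data projected onto the components
    return eigvals, eigvecs, explained, scores

def pca_svd(X):
    """Independent cross-check: PCA via SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    return s**2 / (X.shape[0] - 1), vt.T

# Synthetic correlated data (assumption: stands in for the real data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

eigvals, eigvecs, explained, scores = pca_eig(X)
svd_vals, svd_vecs = pca_svd(X)

# Eigenvectors are only defined up to sign, so compare absolute values.
same_values = np.allclose(eigvals, svd_vals)
same_vectors = np.allclose(np.abs(eigvecs), np.abs(svd_vecs), atol=1e-6)
```

Comparing absolute values of the loadings matters because two correct implementations can return the same component with opposite signs.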

Multivariate Region Growing Method with Image Segments (영상분할단위 기반의 다변량 영역확장기법)

  • 이종열
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 2004.03a / pp.273-278 / 2004
  • Feature identification is one of the largest issues in high-spatial-resolution satellite imagery. A popular approach to feature identification is image segmentation, which produces image segments that are more likely to correspond to features of interest. Here, a combination of edge extraction and region growing is proposed to improve the result of image segmentation. In the initial step, an image was segmented by an edge detection method. The segments were assigned IDs, and the polygon topology of the segments was built. Based on the topology, each segment was tested for similarity with its adjacent segments using multivariate analysis. Segments with similar spectral characteristics were merged into a region. The test application shows that segments belonging to individual large, spectrally homogeneous structures, such as buildings and roads, were merged into shapes that more closely resemble those structures.
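The merge step described above can be sketched as follows. The abstract does not say which multivariate similarity test was used, so this sketch assumes a plain Euclidean distance between per-segment mean band vectors with a fixed threshold. It also makes a single pass over the adjacency list and compares the original segment means, without recomputing region means after each merge. The function name `merge_segments` and the sample values are hypothetical.

```python
import numpy as np

def merge_segments(means, adjacency, threshold):
    """Merge adjacent segments whose mean spectral vectors are close.

    means:     dict mapping segment ID -> sequence of per-band mean values
    adjacency: iterable of (id_a, id_b) pairs taken from the polygon topology
    threshold: maximum Euclidean distance for two segments to be merged
               (assumption: stands in for the paper's multivariate test)
    Returns a dict mapping segment ID -> region label (smallest ID in region).
    """
    parent = {sid: sid for sid in means}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in adjacency:
        dist = np.linalg.norm(np.asarray(means[a], float) - np.asarray(means[b], float))
        if dist < threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)   # keep the smaller ID as label
    return {sid: find(sid) for sid in means}

# Five segments in a chain; values are made-up band means.
means = {
    1: [100, 110, 90],
    2: [102, 108, 93],
    3: [40, 60, 200],
    4: [42, 61, 198],
    5: [101, 109, 91],
}
adjacency = [(1, 2), (2, 3), (3, 4), (4, 5)]
labels = merge_segments(means, adjacency, threshold=10.0)
# labels == {1: 1, 2: 1, 3: 3, 4: 3, 5: 5}
```

Segment 5 is spectrally close to segment 1 but stays separate because the two are not adjacent, which is the point of building the topology before testing similarity.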

Geostatistical Integration of Ground Survey Data and Secondary Data for Geological Thematic Mapping (지질 주제도 작성을 위한 지표 조사 자료와 부가 자료의 지구통계학적 통합)

  • Park, No-Wook;Jang, Dong-Ho;Chi, Kwang-Hoon
    • Korean Journal of Remote Sensing / v.22 no.6 / pp.581-593 / 2006
  • Various geological thematic maps have been generated by interpolating sparsely sampled ground survey data, and geostatistical kriging, which can account for spatial correlation between neighboring data, has been widely used. This paper applies multivariate geostatistical algorithms to integrate secondary information with sparsely sampled ground survey data for geological thematic mapping. Of the several multivariate geostatistical algorithms available, simple kriging with local means and kriging with an external drift are applied. Two case studies, spatial mapping of groundwater level and of grain size, were carried out to illustrate the effectiveness of these algorithms. A digital elevation model and IKONOS remote sensing imagery were used as secondary information in the two case studies. The two multivariate geostatistical algorithms, which account for both the spatial correlation of neighboring data and the secondary data, showed smaller prediction errors and more local variation than ordinary kriging and linear regression. The benefit of applying the multivariate geostatistical algorithms, however, depends on the sampling density, the magnitude of the correlation between primary and secondary data, and the spatial correlation of the primary data. As a result, in the grain size experiment, where the effects of those factors were dominant, the benefit of using the secondary data was smaller than in the groundwater level experiment.
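Of the two algorithms named above, simple kriging with local means is easy to sketch. The following is only an illustration under stated assumptions: the local mean comes from a linear regression of the primary variable on a single secondary variable, the regression residuals are interpolated by simple kriging with a known mean of zero, and an exponential covariance model with a hand-picked sill and range replaces a fitted variogram. The function names, parameters, and synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def exp_cov(h, sill, corr_range):
    """Exponential covariance model (practical-range convention, no nugget)."""
    return sill * np.exp(-3.0 * h / corr_range)

def sk_local_means(xy, z, sec, xy0, sec0, sill, corr_range):
    """Simple kriging with local means at one target location.

    xy, z, sec: sample coordinates (n, 2), primary values (n,), secondary values (n,)
    xy0, sec0:  target coordinates (2,) and the secondary value at the target
    """
    xy, z, sec = (np.asarray(a, dtype=float) for a in (xy, z, sec))
    # 1. Local means from a linear regression of primary on secondary data.
    A = np.column_stack([np.ones_like(sec), sec])
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    resid = z - A @ coef
    mean0 = coef[0] + coef[1] * sec0
    # 2. Simple kriging of the residuals (known mean zero): solve C w = c0.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    d0 = np.linalg.norm(xy - np.asarray(xy0, dtype=float), axis=1)
    weights = np.linalg.solve(exp_cov(d, sill, corr_range),
                              exp_cov(d0, sill, corr_range))
    # 3. Estimate = local mean + kriged residual.
    return mean0 + weights @ resid

# Synthetic example (assumption: 'sec' plays the role of e.g. elevation).
rng = np.random.default_rng(1)
xy = rng.uniform(0.0, 100.0, size=(20, 2))
sec = rng.normal(50.0, 10.0, size=20)
z = 2.0 + 0.1 * sec + rng.normal(0.0, 0.5, size=20)

estimate = sk_local_means(xy, z, sec, xy0=[50.0, 50.0], sec0=55.0,
                          sill=0.25, corr_range=30.0)
at_sample = sk_local_means(xy, z, sec, xy[3], sec[3], sill=0.25, corr_range=30.0)
far_away = sk_local_means(xy, z, sec, [1e6, 1e6], 55.0, sill=0.25, corr_range=30.0)
```

Two properties make useful sanity checks. Without a nugget, the estimate at a sample location reproduces the sample value. Far from all samples, the kriging weights vanish and the estimate falls back to the regression prediction, which is why the benefit of the secondary data depends on how strongly it correlates with the primary data.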

Analysis of Accident Statistics Using Multivariate Analysis Techniques (다변량 분석기법을 이용한 재해통계 분석)

  • 고병인;임현교
    • Proceedings of the Korean Institute of Industrial Safety Conference / 1999.06a / pp.133-136 / 1999
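  • In Korea, industrial accident statistics are compiled only from accidents recognized as occupational among the medical care claims submitted by injured workers, and causal analysis of industrial accidents is limited to simple frequencies of accident type, causative agent, managerial cause, unsafe act, and unsafe condition. This is a consequence of focusing on reducing the number of accidents. It is one reason effective safety management is not being carried out, it falls short of that purpose, and it limits the identification of the root causes of accidents. (abridged)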

Vector at Risk and alternative Value at Risk (Vector at Risk와 대안적인 VaR)

  • Hong, C.S.;Han, S.J.;Lee, G.P.
    • The Korean Journal of Applied Statistics / v.29 no.4 / pp.689-697 / 2016
  • The most useful method for financial market risk management may be Value at Risk (VaR), which statistically estimates the maximum loss amount. VaR is used as a risk measure for a single industry. Many real cases estimate VaRs for many industries or for nationwide industries; consequently, it is necessary to estimate the VaR for multivariate distributions when a specific portfolio is established. In this paper, a multivariate quantile vector is proposed to estimate VaR for a multivariate distribution, and the Vector at Risk for a multivariate space is defined based on the quantile vector. When a weight vector for a specific portfolio is given, one point in the Vector at Risk can be found as the best VaR, which is called the alternative VaR. The alternative VaR proposed in this work is compared with the VaR of Morgan using bivariate and trivariate examples; in addition, some properties of the alternative VaR are explored.
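For orientation, the benchmark the abstract calls the VaR of Morgan is commonly understood as the variance-covariance (delta-normal) VaR popularized by J.P. Morgan's RiskMetrics. The sketch below assumes that reading, with zero-mean, jointly normal returns. It does not implement the paper's quantile vector or alternative VaR, whose definitions are not given in the abstract. The function name `parametric_var` and the numbers are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def parametric_var(weights, cov, alpha=0.05, value=1.0):
    """Variance-covariance VaR: value * z_(1-alpha) * sqrt(w' Sigma w).

    Assumes zero-mean, jointly normal returns; returns a positive loss amount.
    """
    w = np.asarray(weights, dtype=float)
    sigma_p = np.sqrt(w @ np.asarray(cov, dtype=float) @ w)   # portfolio std dev
    z = NormalDist().inv_cdf(1.0 - alpha)                     # standard normal quantile
    return value * z * sigma_p

# Bivariate example: return std devs of 2% and 3%, correlation 0.5 (made-up values).
sd = np.array([0.02, 0.03])
corr = np.array([[1.0, 0.5],
                 [0.5, 1.0]])
cov = np.outer(sd, sd) * corr
w = np.array([0.6, 0.4])

portfolio_var = parametric_var(w, cov, alpha=0.05)
# Weighted sum of the stand-alone VaRs, ignoring diversification.
undiversified = sum(w[i] * parametric_var([1.0], [[cov[i, i]]]) for i in range(2))
```

With a correlation below 1, `portfolio_var` is smaller than `undiversified`. That shows why the joint distribution, and not only the marginal quantiles, matters when a VaR is attached to a weight vector.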

A Study on the Use of Inferential Statistics in Library and Information Science Research (국내외 문헌정보학분야 연구에서 추론통계 사용에 관한 연구)

  • Ro, Jung-Soon
    • Journal of the Korean Society for Library and Information Science / v.40 no.1 / pp.119-138 / 2006
  • This study analyzed the use of statistics in 1,768 research articles published in 2001-2004 in 4 Korean and 6 English core journals in the field of library and information science. Korean journals made significantly less use of descriptive and inferential statistics. Of the 663 inferential statistics used in 345 of the 1,768 articles, the most frequently used inferential technique was multivariate analysis. There was a significant difference in the inferential methods used between Korean and English journals, and also between traditional library science journals and information science journals.

Data-based On-line Diagnosis Using Multivariate Statistical Techniques (다변량 통계기법을 활용한 데이터기반 실시간 진단)

  • Cho, Hyun-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.1 / pp.538-543 / 2016
  • To ensure good product quality and plant safety, on-line monitoring and diagnosis schemes must be implemented for industrial processes. Combined with monitoring systems, reliable diagnosis schemes seek to find the assignable causes, among the process variables, of faults or special events in processes. This study deals with the real-time diagnosis of complicated industrial processes through the intelligent use of multivariate statistical techniques. The presented diagnosis scheme consists of a classification-based diagnosis using nonlinear representation and filtering of process data. A case study based on simulation data was conducted, and diagnosis results were obtained using different diagnosis schemes. In addition, the choice of future estimation methods was evaluated. The results showed that the presented scheme outperformed the other schemes.

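Academic Growth of Physical Education and Sport Science: A History of the Application of Statistical Methods (체육,스포츠과학 분야의 학문적 성장: 통계적 방법 적용의 역사)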

  • Gang, Sang-Jo
    • Proceedings of the Korean Statistical Society Conference / 2002.11a / pp.43-49 / 2002
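  • This article gives a historical overview of the emergence of various statistical methods in physical education and exercise science research. By comparing when these methods appeared in Korea with when they appeared in the United States, it aims to confirm the academic growth of Korean physical education research and to identify problems that arose in applying the methods. To this end, papers related to statistical methods were analyzed, and the academic backgrounds and teaching careers of professors in charge of measurement and evaluation were surveyed to determine how much importance statistical methods can be given in the physical education field. When the statistical methods applied in the earliest formative period, before physical education had established itself as an academic discipline, are compared with those in the American physical education journal (RQES), the same methods were applied about 30 years apart. This gap narrowed rapidly once the advanced multivariate statistical techniques that began to appear in the United States in the 1980s started to be applied in the 1990s, and the methods now appear at the same time. Despite the arrival of advanced statistical techniques, however, a considerable number of papers are reported without checking whether the basic assumptions required to apply them are satisfied. Professors whose academic and educational backgrounds are far from statistics account for 47%, which considerably constrains the content and scope of what is taught. In addition, because a system of expert review has not been established, it is difficult to evaluate the appropriateness of the statistical techniques applied.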

Establishment of rapid discrimination system of leguminous plants at metabolic level using FT-IR spectroscopy with multivariate analysis (FT-IR 스펙트럼 기반 다변량통계분석기법에 의한 두과작물의 대사체 수준 식별체계 확립)

  • Song, Seung-Yeob;Ha, Tae-Joung;Jang, Ki-Chang;Kim, In-Jung;Kim, Suk-Weon
    • Journal of Plant Biotechnology / v.39 no.3 / pp.121-126 / 2012
  • To determine whether FT-IR spectroscopy of whole cell extracts, combined with multivariate analysis, can be used to discriminate major leguminous plants at the metabolic level, seed extracts of six leguminous plants were subjected to Fourier transform infrared spectroscopy (FT-IR). FT-IR spectral data from the seed extracts were analyzed by principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and hierarchical clustering analysis (HCA). PCA could not fully discriminate the six leguminous plants, whereas PLS-DA discriminated them successfully. The hierarchical dendrogram based on PLS-DA separated the six leguminous plants into four branches. The first branch consisted of all three Vigna species: Vigna radiata var. radiata, Vigna angularis var. angularis, and Vigna unguiculata subsp. unguiculata. Pisum sativum var. sativum, Glycine max L., and Phaseolus vulgaris var. vulgaris were each clustered into a separate branch. The overall results showed that the metabolic discrimination system was in accordance with known phylogenetic taxonomy. We therefore suggest that the hierarchical dendrogram based on PLS-DA of FT-IR spectral data from seed extracts represents the most probable chemotaxonomic relationship among the six leguminous plants.

A Study on the Estimation of Coefficients K and n Using Multivariate Data Analysis (다변량 통계기법을 이용한 K및 n의 산정에 관한 연구)

  • 백용진;최재성;배동명;김경진
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.13 no.8 / pp.583-590 / 2003
  • To estimate in advance the vibration level of the ground next to a dwelling, a multivariate statistical analysis was performed on experimental data acquired from a variety of construction sites, and a new model was proposed for estimating the values of K and n that can be applied in damage diagnosis. The results may be summarized as follows. First, $K_{95}$ and n showed high correlation at $P \leq 0.05$. In particular, the correlation coefficients with $W_{max}$ and S were higher for $K_{95}$ than for n, indicating that $K_{95}$ is generally associated with source conditions. Second, the factor analysis identified two major sources in each fraction. These sources accounted for at least 73% of the variance of $K_{95}$. Third, a multiple regression model for estimating $K_{95}$ was developed from Fac1, which depends on the source conditions, and Fac2, which depends on the transmission conditions. The value of n can be determined from its correlation with $K_{95}$.