Binary classification on compositional data

  • Joo, Jae Yun (Department of Statistics, Hankuk University of Foreign Studies)
  • Lee, Seokho (Department of Statistics, Hankuk University of Foreign Studies)
  • Received : 2020.10.12
  • Accepted : 2020.12.22
  • Published : 2021.01.31

Abstract

Because compositional data are bounded and subject to a sum constraint, they are often transformed by a logratio transformation, and the transformed data are then passed to traditional binary classification or discriminant analysis. However, applying traditional multivariate approaches directly to the transformed data may be problematic, because the class distributions are not Gaussian and the Bayes decision boundaries are not polynomial in the transformed space. In this study, we propose applying flexible classification approaches to the transformed data for compositional data classification. Empirical studies using synthetic and real examples demonstrate that flexible approaches outperform traditional multivariate classification and discriminant analysis.
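
The abstract does not state which logratio transformation is used, so the following is only a minimal sketch of two common choices, the centered logratio (clr) and the additive logratio (alr). The function names `closure`, `clr`, and `alr` and the example composition are illustrative assumptions, not taken from the paper, and all parts are assumed to be strictly positive (zeros would need separate handling).

```python
import math

def closure(parts):
    """Rescale positive parts so they sum to 1 (a point on the simplex)."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centered logratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def alr(parts):
    """Additive logratio: log of each part relative to the last part."""
    reference = parts[-1]
    return [math.log(p / reference) for p in parts[:-1]]

# Hypothetical 3-part composition, assumed strictly positive.
x = closure([2.0, 3.0, 5.0])
z_clr = clr(x)  # 3 coordinates that sum to zero
z_alr = alr(x)  # 2 unconstrained coordinates
```

The transformed coordinates are unbounded real vectors, so they can be given to any standard classifier; which classifier is appropriate is the question the paper addresses.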

References

  1. Aitchison J (1986). The Statistical Analysis of Compositional Data, Monographs on Statistics and Applied Probability, Chapman & Hall, London.
  2. Egozcue JJ, Pawlowsky-Glahn V, Mateu-Figueras G, and Barcelo-Vidal C (2003). Isometric logratio transformations for compositional data analysis, Mathematical Geology, 35, 279-300. https://doi.org/10.1023/A:1023818214614
  3. Otero N, Tolosana-Delgado R, Soler A, Pawlowsky-Glahn V, and Canals A (2005). Relative vs. absolute statistical analysis of compositions: a comparative study of surface waters of a Mediterranean river, Water Research, 39, 1404-1414. https://doi.org/10.1016/j.watres.2005.01.012
  4. Pawlowsky-Glahn V and Egozcue JJ (2001). Geometric approach to statistical analysis on the simplex, Stochastic Environmental Research and Risk Assessment (SERRA), 15, 384-398. https://doi.org/10.1007/s004770100077
  5. Pawlowsky-Glahn V, Egozcue JJ, and Tolosana-Delgado R (2015). Modeling and Analysis of Compositional Data, John Wiley & Sons, Hoboken.