Local linear regression analysis for interval-valued data

  • Jang, Jungteak (Department of Statistics, Hankuk University of Foreign Studies)
  • Kang, Kee-Hoon (Department of Statistics, Hankuk University of Foreign Studies)
  • Received : 2020.01.24
  • Accepted : 2020.03.01
  • Published : 2020.05.31

Abstract

Interval-valued data, a type of symbolic data, record each observation as an interval rather than as a single value. Such data also arise frequently when large databases are aggregated into a more manageable form. Various regression methods for interval-valued data have been proposed relatively recently. In this paper, we introduce a nonparametric regression model based on a kernel function and a nonlinear regression model for interval-valued data. We also propose applying local linear regression, a nonparametric method, to interval-valued data. Simulations are conducted for each of the methods presented in this paper under several distributions of the center point and the range. The results confirm that, under various conditions, the proposed local linear estimator performs better than the other methods.
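
As a rough illustration of the idea, the sketch below applies a local linear smoother separately to the centers and the half-ranges of the intervals and then recombines the two fits into a predicted interval. This is not necessarily the estimator proposed in the paper: the center-and-range decomposition, the Gaussian kernel, the fixed bandwidths `h_c` and `h_r`, and the function names are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def local_linear(x, y, x0, h):
    """Local linear estimate of E[y | x = x0] with bandwidth h.

    Solves the kernel-weighted least squares problem for a line centered
    at x0 and returns its intercept, using the closed-form solution.
    """
    d = x - x0
    w = gaussian_kernel(d / h)
    s0, s1, s2 = w.sum(), (w * d).sum(), (w * d**2).sum()
    t0, t1 = (w * y).sum(), (w * d * y).sum()
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1**2)


def predict_interval(x_lo, x_hi, y_lo, y_hi, x0_lo, x0_hi, h_c, h_r):
    """Predict the response interval for a new predictor interval.

    Hypothetical helper: centers are regressed on centers and half-ranges
    on half-ranges, each with its own bandwidth.
    """
    xc, xr = (x_lo + x_hi) / 2.0, (x_hi - x_lo) / 2.0
    yc, yr = (y_lo + y_hi) / 2.0, (y_hi - y_lo) / 2.0
    c_hat = local_linear(xc, yc, (x0_lo + x0_hi) / 2.0, h_c)
    # Clip at zero so the lower bound never exceeds the upper bound.
    r_hat = max(local_linear(xr, yr, (x0_hi - x0_lo) / 2.0, h_r), 0.0)
    return c_hat - r_hat, c_hat + r_hat
```

In practice the bandwidths would be chosen from the data, for example with a plug-in or cross-validation selector, rather than fixed by hand as assumed here.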
