Evaluation of Multivariate Stream Data Reduction Techniques

(Korean title: 다변량 스트림 데이터 축소 기법 평가)

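  • Hun Jo Jeong (Department of Computer and Information Science, Hanseo University)
  • Seong Bo Seo
  • Kyung Joo Choi (School of Electrical and Computer Engineering, Chungbuk National University)
  • Jeong Seok Park (School of Electrical, Electronic and Information Engineering, Chungju National University)
  • Keun Ho Ryu (School of Electrical and Computer Engineering, Chungbuk National University)
  • Published: 2006.12.31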


Although sensor networks differ in user requests and data characteristics from one application area to another, existing research on the stream data transmission problem focuses on improving the performance of individual methods rather than on the inherent characteristics of stream data. In this paper, we introduce a hierarchical or distributed sensor network architecture and data model, and then evaluate multivariate data reduction methods suited to user requirements and data features, so that reduction methods can be applied selectively. To assess the relative performance of the multivariate data reduction methods, we used conventional techniques, namely Wavelet, HCL (Hierarchical Clustering), Sampling, and SVD (Singular Value Decomposition), on experimental data sets comprising multivariate time series, synthetic data, and robot execution failure data. The experimental results show that the SVD and Sampling methods are superior to Wavelet and HCL with respect to relative error ratio and execution time. In particular, because the relative error ratio of each data reduction method varies with the data characteristics, choosing the reduction method selectively gives good performance on the experimental data sets. The findings reported in this paper can serve as a useful guideline for the design and construction of sensor network applications that involve multivariate stream data.
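As a rough illustration of the kind of comparison described above, the following sketch reduces a small multivariate series with two of the named techniques (Sampling and a Haar wavelet) and picks the one with the lower relative error. It is not the paper's implementation: the error metric (Euclidean norm of the residual divided by the norm of the original), the function names, the toy readings, and the reduction parameters are all assumptions made for this example, and SVD and HCL are left out to keep it short.

```python
import math

def relative_error(original, approx):
    """Assumed metric: ||original - approx||_2 / ||original||_2 over all values."""
    num = sum((o - a) ** 2 for ro, ra in zip(original, approx) for o, a in zip(ro, ra))
    den = sum(o ** 2 for ro in original for o in ro)
    return math.sqrt(num / den)

def reduce_by_sampling(rows, step):
    """Keep every `step`-th row; rebuild the others by repeating the last kept row."""
    return [list(rows[(i // step) * step]) for i in range(len(rows))]

def haar_forward(values):
    """Unnormalized Haar transform (pairwise averages and half-differences).
    The input length must be a power of two."""
    coeffs = list(values)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:n] = avg + diff
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward level by level, from coarsest to finest."""
    values = list(coeffs)
    n = 1
    while n < len(values):
        merged = []
        for a, d in zip(values[:n], values[n:2 * n]):
            merged += [a + d, a - d]
        values[:2 * n] = merged
        n *= 2
    return values

def reduce_by_wavelet(rows, keep):
    """Per variable, keep the `keep` largest-magnitude Haar coefficients and
    zero the rest (a simple heuristic, not an error-optimal selection)."""
    rebuilt = []
    for col in zip(*rows):
        coeffs = haar_forward(col)
        ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
        kept = set(ranked[:keep])
        rebuilt.append(haar_inverse([c if i in kept else 0.0 for i, c in enumerate(coeffs)]))
    return [list(r) for r in zip(*rebuilt)]

def pick_method(rows, candidates):
    """Selective reduction: evaluate every candidate and return the lowest-error one."""
    errors = {name: relative_error(rows, fn(rows)) for name, fn in candidates.items()}
    return min(errors, key=errors.get), errors

# Hypothetical readings: 8 time steps x 2 variables (a smooth ramp and a step signal).
readings = [[20.0, 1.0], [20.5, 1.0], [21.0, 1.0], [21.5, 1.0],
            [22.0, 5.0], [22.5, 5.0], [23.0, 5.0], [23.5, 5.0]]

# Both candidates retain roughly a quarter of the data (an arbitrary choice).
candidates = {
    "sampling": lambda rows: reduce_by_sampling(rows, 4),
    "wavelet": lambda rows: reduce_by_wavelet(rows, 2),
}

best, errors = pick_method(readings, candidates)
print(best, errors)
```

Which method wins depends on the shape of the data, which is the point the abstract makes about selecting the reduction method per data set; a fuller comparison would also time each method and add SVD and clustering candidates.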

