A New Parameter Estimation Method for a Zipf-like Distribution for Geospatial Data Access

  • Li, Rui (State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University)
  • Feng, Wei (State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University)
  • Wang, Hao (State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University)
  • Wu, Huayi (State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University)
  • Received : 2013.03.29
  • Accepted : 2013.06.25
  • Published : 2014.02.01

Abstract

Many reports have shown that the access pattern for geospatial tiles follows Zipf's law and that its parameter $\alpha$ characterizes the access behavior. However, tile popularity varies over time and space, and the value of $\alpha$ changes with it. We construct a mathematical model that simulates user access behavior by studying the attributes of frequently visited tile objects, and we use it to select parameter estimation algorithms. Because the commonly used least squares (LS) method cannot obtain an exact value of $\alpha$ and does not fit the data for frequently visited tiles well, we present a new approach. When $\alpha$ is close to 1, it uses a moment method of estimation to obtain the value of $\alpha$. When $\alpha$ is further from 1, it uses the associated cache hit ratio for tile access and an LS method based on a critical cache size. The reduction in estimation error is presented and discussed in the section on experimental results. The new method provides a more accurate estimate of $\alpha$ than earlier methods and promises more effective prediction of requests for frequently accessed tiles, supporting better caching and load balancing.
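For orientation, the conventional LS baseline mentioned in the abstract can be sketched as a straight-line fit in log-log space. This is not the authors' new estimator. It assumes the usual Zipf-like form $p_i = C / i^{\alpha}$ for the tile of popularity rank $i$, and the function names (`zipf_like_pmf`, `estimate_alpha_ls`, `hit_ratio`) are hypothetical names chosen for this illustration, as is the idealized cache that always holds the most popular tiles.

```python
import numpy as np


def zipf_like_pmf(n_objects, alpha):
    """Access probabilities p_i = C / i**alpha for ranks i = 1..n_objects.

    Assumed Zipf-like form; C is the normalizing constant.
    """
    ranks = np.arange(1, n_objects + 1)
    weights = ranks ** (-float(alpha))
    return weights / weights.sum()


def estimate_alpha_ls(frequencies):
    """Baseline LS estimate: fit log(frequency) against log(rank).

    Under p_i = C / i**alpha the log-log plot is a line with slope -alpha.
    """
    freqs = np.sort(np.asarray(frequencies, dtype=float))[::-1]
    freqs = freqs[freqs > 0]  # log is undefined for unvisited tiles
    ranks = np.arange(1, freqs.size + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return -slope


def hit_ratio(pmf, cache_size):
    """Hit ratio of an idealized cache holding the cache_size most popular tiles."""
    return float(np.sort(pmf)[::-1][:cache_size].sum())


if __name__ == "__main__":
    pmf = zipf_like_pmf(1000, 0.8)
    print("LS estimate of alpha:", estimate_alpha_ls(pmf))
    print("hit ratio, 100 cached tiles:", hit_ratio(pmf, 100))
```

On noise-free probabilities the line fit recovers $\alpha$ essentially exactly. With sampled request counts, the sparsely visited tail usually distorts the fit, which is the kind of error the abstract attributes to the LS method.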
