An Improved Combined Content-similarity Approach for Optimizing Web Query Disambiguation

  • Kamal, Shahid (Faculty of Computing, Universiti Teknologi Malaysia (UTM))
  • Ibrahim, Roliana (Faculty of Computing, Universiti Teknologi Malaysia (UTM))
  • Ghani, Imran (Faculty of Computing, Universiti Teknologi Malaysia (UTM))
  • Received: 2015.09.04
  • Accepted: 2015.11.17
  • Published: 2015.12.31

Abstract

Web search engines face uncertainty when users submit ambiguous queries and expect accurate results. Such queries make up a significant fraction of what search engines receive and pose real challenges to them. Moreover, web search has prompted researchers to consider context, in particular the location perspective, when handling queries. Our proposed disambiguation approach is designed to improve the user experience by combining location relevance, as context, with document relevance. The premise is that giving the user a comprehensive location perspective on a topic is more informative than retrieving a result that contains only temporal or context information. From a user perspective, the capacity to use this information in a location-aware manner is potentially useful for several tasks, including query understanding and location-based clustering. To carry out the approach, we developed a Java-based prototype that derives contextual information from web results for queries taken from well-known datasets. Within those results, queries are further classified so that the search can be performed broadly. After results are presented to users and they make their selections, feedback is recorded implicitly to improve web search based on contextual information. In the experiments, our approach achieved precision of 75%, accuracy of 73%, recall of 81%, and F-measure of 78% when compared with the generic temporal evaluation approach, and precision of 86%, accuracy of 71%, recall of 67%, and F-measure of 75% when compared with the web document clustering approach.
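The reported F-measure values can be sanity-checked against the reported precision and recall. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_measure` and the dictionary keys are illustrative only and do not come from the paper's prototype.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "f-measure" is F1; the paper may define it differently.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision, recall, and F-measure as reported in the abstract for the two comparisons.
# The keys are hypothetical labels chosen for this example.
reported = {
    "temporal_evaluation": {"precision": 0.75, "recall": 0.81, "f_measure": 0.78},
    "document_clustering": {"precision": 0.86, "recall": 0.67, "f_measure": 0.75},
}

for name, m in reported.items():
    f1 = f_measure(m["precision"], m["recall"])
    print(f"{name}: computed F1 = {f1:.4f}, reported = {m['f_measure']:.2f}")
```

Under this assumption both computed values round to the reported figures (78% and 75%). Accuracy cannot be checked the same way, because it also depends on the number of true negatives, which the abstract does not give.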
