Design and Development of a Multimodal Biomedical Information Retrieval System

  • Received : 2012.02.18
  • Accepted : 2012.05.27
  • Published : 2012.06.30


The search for relevant and actionable information is key to achieving clinical and research goals in biomedicine. Biomedical information exists in different forms: as text and illustrations in journal articles and other documents, as images stored in databases, and as patient cases in electronic health records. This paper presents ways to move beyond conventional text-based searching of these resources by combining text and visual features in search queries and document representation. A combination of techniques and tools from natural language processing, information retrieval, and content-based image retrieval allows the development of building blocks for advanced information services. Such services enable searching by textual as well as visual queries, and retrieving documents enriched by relevant images, charts, and other illustrations from the journal literature, patient records, and image databases.
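One way to picture the idea of combining text and visual features is a simple weighted ("late") fusion of two similarity scores. The sketch below is illustrative only and is not taken from the paper: the feature vectors, the function names (`cosine`, `fused_score`, `rank`), and the `text_weight` parameter are all assumptions, and the system described here may combine modalities quite differently.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def fused_score(query, doc, text_weight=0.5):
    """Weighted sum of text and image similarity (hypothetical fusion rule)."""
    text_sim = cosine(query["text"], doc["text"])
    image_sim = cosine(query["image"], doc["image"])
    return text_weight * text_sim + (1.0 - text_weight) * image_sim

def rank(query, docs, text_weight=0.5):
    """Return document names ordered from highest to lowest fused score."""
    return sorted(
        docs,
        key=lambda name: fused_score(query, docs[name], text_weight),
        reverse=True,
    )

# Toy data: each item has a made-up text vector and a made-up image vector.
docs = {
    "a": {"text": [1.0, 0.0], "image": [0.0, 1.0]},
    "b": {"text": [0.0, 1.0], "image": [1.0, 0.0]},
}
query = {"text": [1.0, 0.0], "image": [1.0, 0.0]}

# With more weight on text, the textually similar document ranks first;
# with more weight on the image, the visually similar one does.
text_heavy = rank(query, docs, text_weight=0.8)
image_heavy = rank(query, docs, text_weight=0.2)
```

In this toy setup the weight controls which modality dominates the ranking; a real system would derive the vectors from indexed text and extracted image descriptors rather than hand-written numbers.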


