MTReadable: Arabic Readability Corpus for Medical Tests Information

  • Alahmdi, Dimah (Faculty of Computer and Information Technology, King Abdulaziz University)
  • Alghamdi, Athir Saeed (Faculty of Computer and Information Technology, King Abdulaziz University)
  • Almuallim, Neda'a (Faculty of Computer and Information Technology, King Abdulaziz University)
  • Alarifi, Suaad (Faculty of Computer and Information Technology, King Abdulaziz University)
  • Received : 2021.05.05
  • Published : 2021.05.30

Abstract

Medical tests are a very important part of the health monitoring process. They are performed for various reasons, such as diagnosing diseases and determining the effectiveness of medications. Patients should therefore be able to read and understand the medical test information and results available online in order to make proper decisions regarding their health. However, people vary in their educational levels and health backgrounds, which has long made it a challenge in the health domain to provide such information in a format that the majority of people can easily read. This paper describes the MTReadable corpus, which was constructed for evaluating the readability of online medical test information. It covers 32 basic periodic check-up tests with over 36k words. The test information is annotated and labelled with one of three readability levels (easy, neutral, or difficult) by three non-specialist native Arabic speakers. This paper contributes to the Arabic health research community an investigation of the readability level of online medical tests, and it is intended to serve as a baseline for work on more complex online health reports and information.
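The abstract does not state how the three annotators' labels were combined or how their agreement was measured. As a minimal sketch only, the following Python shows one common way to handle such three-way labels: a majority vote per text and pairwise Cohen's kappa between annotators. The annotator names, the sample labels, and the helper functions `majority_label` and `cohen_kappa` are all hypothetical and are not taken from the MTReadable corpus.

```python
from collections import Counter
from itertools import combinations

# The three readability levels named in the abstract.
LEVELS = ("easy", "neutral", "difficult")


def majority_label(labels):
    """Return the most frequent label, or None if the top count is tied."""
    (top, count), *rest = Counter(labels).most_common()
    if rest and rest[0][1] == count:
        return None
    return top


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators who labelled the same items."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's own label distribution.
    expected = sum(counts_a[lvl] * counts_b[lvl] for lvl in LEVELS) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six texts (illustrative data, not from the corpus).
annotators = {
    "A": ["easy", "easy", "neutral", "difficult", "neutral", "easy"],
    "B": ["easy", "neutral", "neutral", "difficult", "difficult", "easy"],
    "C": ["easy", "easy", "difficult", "difficult", "neutral", "neutral"],
}

# One aggregated label per text, taken across the three annotators.
gold = [majority_label(votes) for votes in zip(*annotators.values())]

# Agreement for every pair of annotators.
pairwise = {
    (p, q): cohen_kappa(annotators[p], annotators[q])
    for p, q in combinations(annotators, 2)
}

if __name__ == "__main__":
    print(gold)
    for pair, kappa in pairwise.items():
        print(pair, round(kappa, 3))
```

With three annotators and three levels, a full three-way split has no majority, so the sketch returns `None` for that text and leaves the resolution policy open.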
