Minimum Reporting Items for Clear Evaluation of Accuracy Reports of Large Language Models in Healthcare (MI-CLEAR-LLM)

  • Seong Ho Park (Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center)
  • Chong Hyun Suh (Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center)
  • Jeong Hyun Lee (Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine)
  • Charles E. Kahn Jr. (Department of Radiology and Institute for Biomedical Informatics, University of Pennsylvania)
  • Linda Moy (Department of Radiology, New York University Grossman School of Medicine)
  • Submitted: 2024.08.27
  • Reviewed: 2024.08.27
  • Published: 2024.10.01

Abstract

Keywords

References

  1. Bhayana R. Chatbots and large language models in radiology: a practical primer for clinical and research applications. Radiology 2024;310:e232756
  2. Jung KH. Uncover this tech term: foundation model. Korean J Radiol 2023;24:1038-1041
  3. Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med 2023;29:1930-1940
  4. Mesko B, Topol EJ. The imperative for regulatory oversight of large language models (or generative AI) in healthcare. NPJ Digit Med 2023;6:120
  5. Li R, Kumar A, Chen JH. How chatbots and large language model artificial intelligence systems will reshape modern medicine: fountain of creativity or Pandora's box? JAMA Intern Med 2023;183:596-597
  6. CHART Collaborative. Protocol for the development of the chatbot assessment reporting tool (CHART) for clinical advice. BMJ Open 2024;14:e081155
  7. Park SH, Suh CH. Reporting guidelines for artificial intelligence studies in healthcare (for both conventional and large language models): what's new in 2024. Korean J Radiol 2024;25:687-690
  8. Kaddour J, Harris J, Mozes M, Bradley H, Raileanu R, McHardy R. Challenges and applications of large language models [accessed on August 27, 2024]. Available at: https://doi.org/10.48550/arXiv.2307.10169
  9. Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks [accessed on August 26, 2024]. Available at: https://proceedings.neurips.cc/paper/2020/file/6b493230205f780e1bc26945df7481e5-Paper.pdf
  10. Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, et al. Transformers: state-of-the-art natural language processing [accessed on August 26, 2024]. Available at: https://doi.org/10.18653/v1/2020.emnlp-demos.6
  11. Kim W. Seeing the unseen: advancing generative AI research in radiology. Radiology 2024;311:e240935
  12. Lee JH, Shin J. How to optimize prompting for large language models in clinical research. Korean J Radiol 2024;25:869-873
  13. Gu K, Lee JH, Shin J, Hwang JA, Min JH, Jeong WK, et al. Using GPT-4 for LI-RADS feature extraction and categorization with multilingual free-text reports. Liver Int 2024;44:1578-1587
  14. Sahoo SS, Plasek JM, Xu H, Uzuner O, Cohen T, Yetisgen M, et al. Large language models for biomedicine: foundations, opportunities, challenges, and best practices. J Am Med Inform Assoc 2024;31:2114-2124
  15. Gallifant J, Afshar M, Ameen S, Aphinyanaphongs Y, Chen S, Cacciamani G, et al. The TRIPOD-LLM statement: a targeted guideline for reporting large language models use. medRxiv [Preprint]. 2024 [accessed on August 26, 2024]. Available at: https://doi.org/10.1101/2024.07.24.24310930