1. Wirawan R. Benchmarking Large Language Models on Diagnostic Inference Tasks in Medical Texts. ATCAEP [Internet]. 2024 Sep. 7 [cited 2025 Sep. 21];14(9):15-31. Available from: https://heilarchive.com/index.php/ATCAEP/article/view/2024-SEP-07