Justin Xu
PhD Candidate
Justin joined the Big Data Institute in 2023 for his PhD in Machine Learning. His research aims to leverage AI to decipher clinical data and enhance healthcare. Specifically, his current doctoral work focuses on the generation and evaluation of infectious disease consultations with clinical advice. Justin is co-advised by David Eyre, Sarah Walker, and David Clifton. He is principally funded by Oxford University Press via the Clarendon Fund Scholarship.
In 2024, Justin visited Stanford University as a Canadian Fulbrighter and joined the Center for Artificial Intelligence in Medicine & Imaging (AIMI). Under Curtis Langlotz, he is developing multimodal generative AI in radiology, including interpretive LLM-based metrics for clinical report generation and vision-language systems capable of understanding temporal relationships in medical images.
Prior to Oxford, Justin worked with Alistair Johnson at the Hospital for Sick Children in Canada. During this time, he used the MIMIC-IV dataset and deployed an NLP-based clinical terminology annotation dashboard to support multi-site analyses of EHRs. Additionally, with Matthew McDermott, he developed the “Automatic Cohort Extraction System (ACES)” for reproducible machine learning over event-stream data and contributed to the “MEDS Decentralized Extensible Validation (MEDS-DEV)” benchmark for medical time series representation learning. Justin was trained as an engineer and holds a BASc in Engineering Science from the University of Toronto.
Recent publications
A benchmark of expert-level academic questions to assess AI capabilities
Journal article
Phan L. et al, (2026), Nature, 649, 1139 - 1146
RadEval: A framework for radiology text evaluation
Conference paper
Xu J. et al, (2025), Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 546 - 557
Automated Structured Radiology Report Generation
Conference paper
Delbrouck J-B. et al, (2025), Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 26813 - 26829
Tree-of-Quote Prompting Improves Factuality and Attribution in Multi-Hop and Medical Reasoning
Conference paper
Xu J. et al, (2025), Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 5605 - 5622
