RT Journal Article
SR Electronic
T1 A Framework for Evaluating the Efficacy of Foundation Embedding Models in Healthcare
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2024.04.17.24305983
DO 10.1101/2024.04.17.24305983
A1 Xu, Sonnet
A1 Gui, Haiwen
A1 Rotemberg, Veronica
A1 Wang, Tongzhou
A1 Chen, Yiqun T.
A1 Daneshjou, Roxana
YR 2024
UL http://medrxiv.org/content/early/2024/04/19/2024.04.17.24305983.abstract
AB Recent interest has surged in building large-scale foundation models for medical applications. In this paper, we propose a general framework for evaluating the efficacy of these foundation models in medicine, suggesting that they should be assessed across three dimensions: general performance, bias/fairness, and the influence of confounders. Utilizing Google’s recently released dermatology embedding model and lesion diagnostics as examples, we demonstrate that: 1) dermatology foundation models surpass state-of-the-art classification accuracy; 2) general-purpose CLIP models encode features informative for medical applications and should be more broadly considered as a baseline; 3) skin tone is a key differentiator for performance, and the potential bias associated with it needs to be quantified, monitored, and communicated; and 4) image quality significantly impacts model performance, necessitating that evaluation results across different datasets control for this variable. 
Our findings provide a nuanced view of the utility and limitations of large-scale foundation models for medical AI. Competing Interest Statement: RD is a consultant to Enspectra Health, VisualDx, DWA Healthcare Communications Group, Genentech Inc, Frazier Healthcare Partners, and L'Oreal France, an investigator at UCB, and on the advisory board for MDalgorithms Inc and Revea. Funding Statement: Stanford MedScholars to HG and Stanford Data Science Fellowship to YTC. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study used openly available human data that were originally located at https://stanfordaimi.azurewebsites.net/datasets/35866158-8196-48d8-87bf-50dca81df965 and https://challenge.isic-archive.com/data/. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes. All code and results reproduced in the present study will be released on GitHub and are available upon reasonable request to the authors before then.