For best results with Artificial Intelligence in healthcare, leverage and augment, rather than outsource

https://doi.org/10.29060/TAPS.2026-11-2/TT006

Craig S. Webster
Centre for Medical and Health Sciences Education,
School of Medicine, University of Auckland,
Auckland, New Zealand

Large Language Models (LLMs), such as ChatGPT (OpenAI), are a prominent form of Artificial Intelligence (AI) currently capturing the imagination of millions of people around the world. Although not conscious, LLMs comprise artificial neural networks trained on vast datasets of human language. This training allows the LLM to identify statistical patterns in language, and to generate coherent and contextually appropriate responses to a wide range of prompts, yielding impressive conversational abilities. ChatGPT has even achieved a passing score on the United States Medical Licensing Examination (Gilson et al., 2023).

In healthcare, LLMs may assist with tasks such as transcribing medical notes, offering clinical decision support, or generating teaching materials. However, the statistical patterns and rules learnt by the LLM during training are entirely dependent on the training data, and so are not equivalent to known causal mechanisms of disease or evidence-based medicine (Webster, 2025). Hence the introduction of LLMs to healthcare without careful consideration carries substantial risk.

LLMs can generate plausible but incorrect or misleading information, known as hallucinations, which could have dangerous consequences in a medical context. Because LLMs are trained on historical and internet-based texts, they may inherit and even amplify existing societal biases, including those based on race, gender, or socioeconomic status (Webster & Jowsey, 2025; Webster et al., 2022). The use of LLMs in healthcare may therefore perpetuate or worsen health disparities.

In a high-stakes field like healthcare, accountability and trust are crucial, but LLMs are not morally or legally accountable agents. Misplaced trust in an AI-generated recommendation without proper oversight could result in patient harm, and entering clinical details into an LLM also carries the risk of inappropriate disclosure of confidential patient information. Unlike regulated clinical decision support systems, LLMs lack transparent mechanisms for verifying the validity of their output.

However, LLMs excel at summarising large domains of knowledge and, if used in conjunction with appropriate human oversight, can save considerable time in many teaching and research activities (Topol, 2019). For example, in research, LLMs make excellent sounding boards for the development of new ideas or hypotheses, and in teaching they can very quickly generate useful patient vignettes, as the sketch below illustrates.
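As a concrete illustration, the following minimal sketch shows one way an educator might draft such a vignette programmatically. It assumes the OpenAI Python SDK; the model name and prompt wording are illustrative assumptions rather than recommendations, and the output is treated strictly as a draft for clinician review.

```python
# A minimal sketch of drafting a teaching vignette with an LLM, assuming
# the OpenAI Python SDK (pip install openai) with an API key available in
# the OPENAI_API_KEY environment variable. The model name and prompt are
# illustrative assumptions, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Write a short, entirely fictional patient vignette for teaching: "
    "a 58-year-old presenting with acute chest pain. Include history, "
    "examination findings and vital signs. Do not use real patient data."
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; substitute whichever model you use
    messages=[{"role": "user", "content": prompt}],
)

draft = response.choices[0].message.content

# The draft is raw material only: a clinician must review it for factual
# accuracy and bias before it is used in teaching.
print(draft)
```

Note that the prompt requests entirely fictional details and the generated text is reviewed before use; both steps reflect the human oversight argued for here, and no real patient information should ever be entered into such a prompt.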

We have a strong tendency to anthropomorphise LLMs and to see them as all-knowing. However, an understanding of the risks inherent in this technology is critical. LLMs are a powerful new tool, and like any other technology introduced into healthcare, clinicians must carefully consider the risks, and use LLMs in a way that leverages and augments clinical knowledge and skill, rather than outsourcing tasks to LLMs in an uncritical way.

References

Gilson, A., Safranek, C. W., Huang, T., Socrates, V., Chi, L., Taylor, R. A., & Chartash, D. (2023). How does ChatGPT perform on the United States Medical Licensing Examination (USMLE)? The implications of large language models for medical education and knowledge assessment. JMIR Medical Education, 9, e45312. https://doi.org/10.2196/45312

Topol, E. J. (2019). High-performance medicine: The convergence of human and artificial intelligence. Nature Medicine, 25, 44-56. https://doi.org/10.1038/s41591-018-0300-7

Webster, C. S. (2025). Natural and artificial intelligence – The psychotechnical agenda of the 21st century. Journal of Psychology and AI, 1(1), 2491445. https://doi.org/10.1080/29974100.2025.2491445

Webster, C. S., & Jowsey, T. (2025). Artificial intelligence must operate ethically in health care and not be prone to racist or sexist biases. Anesthesia and Analgesia, 140(5), 1099-1104. https://doi.org/10.1213/ane.0000000000007140

Webster, C. S., Taylor, S., Thomas, C., & Weller, J. M. (2022). Social bias, discrimination and inequity in healthcare: Mechanisms, implications and recommendations. BJA Education, 22(4), 131-137. https://doi.org/10.1016/j.bjae.2021.11.011
