A recent study highlights the potential dangers of integrating Large Language Models in healthcare systems without addressing inherent biases.
A recent study published on npj Digital Medicine (Nature) highlights the potential dangers of integrating Large Language Models (LLMs) in healthcare systems without addressing inherent biases. At a time when artificial intelligence (AI) is increasingly getting embedded in healthcare, the study emphasises the potential for these models to perpetuate harmful, debunked, racist ideas in the medical field.
The primary objective of the study was to discern whether four commercially available LLMs — Bard, Chat GPT, Claude, GPT-4 — could propagate harmful, inaccurate race-based content across eight different scenarios historically associated with race-based medicine or widespread misconceptions surrounding race. The examination of LLMs was conducted using nine distinct questions derived from discussions among physician experts and prior research on race-based medical misconceptions held by medical trainees.
Despite the advances in medical understanding and ethics, the study reveals that LLMs showcased instances of promoting race-based medicine or racist tropes. The inconsistency in responses when asked the same question repeatedly further highlights the unpredictability and potential risk of these AI models.
One of the remarkable aspects of the study is its revelation of how LLMs, while almost universally recognising race as a social construct, faltered significantly when navigating questions regarding kidney function and lung capacity — areas historically marred by race-based medical practices. Some models went as far as to attempt justifying race-based medicine with false assertions, like suggesting differences in muscle mass among Black individuals which would affect creatinine levels.
The study is a gentle reminder of the potential dangers of AI integration within healthcare systems. As some of these LLMs are already being connected to electronic health record systems, the risk of perpetuating harmful, debunked, racist ideas are not merely theoretical but a serious concern.
The paper also highlights a bigger issue: We really need to fix the unfairness in these big computer models, especially when we want to use them in important places like hospitals. It’s important to share the data used to teach these models and have good checks to make sure they don’t spread wrong or unfair information in medical environments.
It's time to take a hard look at human biases and re-examine the data we feed into new technologies.