Claim analyzed

Health

“AI-generated deepfake X-ray images are sufficiently realistic to cause radiologists to make incorrect diagnoses.”

The conclusion

Misleading
5/10

The evidence confirms that AI-generated deepfake X-rays can deceive radiologists — with only 41% spontaneously detecting fakes in a major 2026 study — but it does not demonstrate that this deception causes incorrect diagnoses. The same study found comparable diagnostic accuracy on real versus synthetic images (91.3% vs. 92.4%), undermining the claim's causal assertion. The claim conflates "hard to detect" with "causes misdiagnosis," an inferential leap the available research does not support.

Based on 15 sources: 10 supporting, 0 refuting, 5 neutral.

Caveats

  • The claim conflates radiologists' inability to detect deepfake X-rays with a demonstrated increase in incorrect diagnoses — the key study does not measure real-world misdiagnosis rates caused by deepfakes.
  • Diagnostic accuracy on AI-generated X-rays was comparable to authentic images (92.4% vs. 91.3%), which does not support the assertion that deepfakes cause more diagnostic errors.
  • CT scan tampering evidence (e.g., 99.2% success injecting fake lung cancers) involves a different imaging modality and attack scenario than whole-image X-ray deepfakes, and should not be treated as direct proof of the X-ray-specific claim.

Sources

Sources used in the analysis

#1
Radiological Society of North America (RSNA) 2026-03-24 | Deepfake X-Rays Fool Radiologists and AI
SUPPORT

Neither radiologists nor multimodal LLMs could easily distinguish AI-generated deepfake X-ray images from authentic ones. Radiologists' accuracy was 75% in detecting ChatGPT-generated images (range 58-92%) and 62-78% for RoentGen chest X-rays, even when they knew fakes were present; accuracy showed no correlation with experience, though musculoskeletal subspecialists performed better. Lead author: 'These deepfake X-rays are realistic enough to deceive radiologists.'

#2
PMC 2025-12-01 | Synthetic data, synthetic trust: navigating data challenges in the digital revolution
NEUTRAL

Although synthetic data address crucial shortages of real-world training data, their overuse might propagate biases, accelerate model degradation, and compromise generalisability across populations. A concerning consequence of the rapid adoption of synthetic data in medical AI is the emergence of synthetic trust—an unwarranted confidence in models trained on artificially generated datasets that fail to preserve clinical validity or demographic realities.

#3
PMC 2025-01-31 | Is synthetic data generation effective in maintaining clinical biomarkers? Investigating diffusion models across diverse imaging modalities
NEUTRAL

The generation of synthetic medical images poses a unique set of challenges compared to other domains due to the intricate nature of biological structures and the subtle nuances of imaging biomarkers. Without faithful representation of these biomarkers, synthetic images risk being of limited utility in clinical settings, hindering their adoption for tasks such as training AI models, augmenting datasets, and validating imaging algorithms.

#4
PMC - NIH | The Potential Dangers of Artificial Intelligence for Radiology and Radiologists
SUPPORT

As proof of principle, Mirsky et al [3] showed that they were able to tamper with CT scans and artificially inject or remove lung cancers on the images. When the radiologists were blinded to the attack, this hack had a 99.2% success rate for cancer injection and a 95.8% success rate for cancer removal. Even when the radiologists were warned about the attack, the success of cancer injection decreased to 70%, but the cancer removal success rate remained high at 90%. This illustrates the sophistication and realistic appearance of such artificial images.

#5
News-Medical.net 2026-03-25 | Study finds AI-generated X-rays can fool radiologists and chatbots
SUPPORT

A study in Radiology found that realistic AI-generated X-rays were not easily distinguished by radiologists (moderate performance) or LLMs. Diagnostic accuracy was 91.3% for authentic and 92.4% for AI-generated radiographs, showing the AI images were sufficiently realistic that diagnostic reliability remained high. Only 41% of radiologists spontaneously identified AI images when blinded.

#6
PMC 2020-06-01 | Evaluating the Clinical Realism of Synthetic Chest X-Rays Generated Using Progressively Growing GANs
NEUTRAL

We quantify the clinical realism of synthetic X-rays by asking radiologists to distinguish between real and fake scans, and find that generated scans are more likely to be classed as real than chance would predict, though further progress is required to achieve true realism. We confirm these findings by evaluating the performance of classification models trained on synthetic data against real scans.

#7
STAT News 2026-03-24 | Can you spot a deepfake X-ray? Neither can your radiologist
SUPPORT

17 radiologists could differentiate real X-rays from ChatGPT-generated ones with only 75% accuracy even when alerted; only 41% noticed issues while diagnosing blind. The images were generated easily with simple prompts and fooled even the generating model (57-85% detection accuracy), with the potential to disrupt medical care.

#8
Gizmodo 2026-03-27 | Doctors Struggle to Spot AI-Generated X-Rays, Raising Scam Risks
SUPPORT

“Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists,” the study's lead author Dr. Mickael Tordjman said, “even when they were aware that AI-generated images were present.” Radiologists who were made aware of the fact that these datasets contained AI images fared better than those exposed to the images without any indication of the test's actual purpose, but still not great, showing a mean accuracy of 75%.

#9
EMJ 2025-04-10 | The Good, the Bad, and the Ugly of AI in Medical Imaging
NEUTRAL

AI algorithms can analyse medical images with remarkable accuracy and speed, often surpassing human capabilities, by identifying subtle patterns and anomalies that may be missed by the human eye. This can improve diagnostic accuracy and reduce the risk of misdiagnosis and false negatives. However, the use of AI in healthcare raises ethical questions, and AI algorithms require high-quality, labelled data to train effectively, with biased datasets potentially leading to incorrect or discriminatory outcomes.

#10
Neuroscience News 2026-03-24 | AI-Generated Medical Images Deceive Even Top Radiologists
SUPPORT

A multi-center international study reveals that neither experienced radiologists nor advanced multimodal large language models (LLMs) can reliably distinguish “deepfake” X-rays from authentic ones. Even when warned that synthetic images were present, radiologists only averaged 75% accuracy in identifying them. The findings expose a high-stakes vulnerability in healthcare, ranging from fraudulent litigation (fabricated injuries) to cybersecurity threats where hackers could inject synthetic images into digital medical records to cause clinical chaos.

#11
ICT&health 2026-03-25 | Deepfake X-ray images mislead doctors and AI systems
SUPPORT

Both radiologists and advanced AI models have difficulty distinguishing AI-generated X-ray images from real ones. When they did not know that deepfakes were included, only 41 percent recognized them spontaneously. After an explicit warning, the average accuracy rose to 75 percent, with significant variation among individual evaluators. The researchers warn that these so-called deepfakes pose serious risks to the reliability of medical imaging and the safety of healthcare processes. For instance, manipulated images could be used in fraud or legal claims, such as by presenting a nonexistent fracture as real.

#12
ScienceDaily 2026-03-26 | Deepfake x-rays are so real even doctors can't tell the difference
SUPPORT

A new study published on March 24 in Radiology, the journal of the Radiological Society of North America (RSNA), shows that both radiologists and multimodal large language models (LLMs) have difficulty telling real X-rays apart from artificial intelligence (AI)-generated "deepfake" images. Once they were informed that synthetic images were present, their average accuracy in distinguishing real from fake rose to 75%. Performance varied widely among individuals. Radiologists correctly identified between 58% and 92% of the ChatGPT-generated images.

#13
Forbes 2026-03-27 | AI-Generated Fake X-Rays Are Now So Good They Can Fool Doctors
SUPPORT

X-ray images generated by Artificial Intelligence (AI) are now so good they can fool expert radiologists, according to a new study. The findings, published in the journal Radiology come amid a huge boom in AI use in healthcare and raise concerns that these “deepfake” X-rays could be used to deceive insurance companies, employers and even interfere with legal cases. The study asked 17 radiologists from six different countries to assess X-ray images of different parts of the body.

#14
EMJ 2026-03-29 | Radiologists Can Spot Fake Radiographs, But Not Always
SUPPORT

Synthetic radiographs, or AI-generated X-ray images, are becoming increasingly realistic, raising questions about their detectability in clinical practice. While they can be valuable for training or research, their high realism can occasionally blur the line between genuine and artificial images, potentially affecting diagnostic accuracy if unnoticed. When unaware of the study's true purpose, 41% of radiologists spontaneously suspected some images to be AI-generated. After being informed, overall detection accuracy was 75% for GPT-4o radiographs and 70% for those produced by RoentGen.

#15
LLM Background Knowledge 2026-03-01 | Context on Deepfake Medical Imaging Studies
NEUTRAL

Prior to 2026, smaller studies (e.g., 2023-2024 GAN-based fakes) showed radiologists detecting ~80-90% of fakes, though at lower realism; the 2026 Radiology study marks an advance in diffusion/LLM realism, yet explicitly notes comparable diagnostic accuracy (91.3% vs 92.4%), implying potential harm but not measuring incorrect diagnoses actually caused.

Full Analysis

Expert review

How each expert evaluated the evidence and arguments

Expert 1 — The Logic Examiner

Focus: Inferential Soundness & Fallacies
Misleading
5/10

Sources 1/7/8/12/14 show radiologists often cannot reliably distinguish AI-generated X-rays from real ones (e.g., ~75% detection when warned; ~41% spontaneous suspicion when blinded), and Source 4 shows analogous realism-driven deception in CT tampering, but none of these logically entails that radiologists therefore make incorrect diagnoses from deepfake X-rays; the only directly diagnosis-linked metric provided (Source 5, echoed by Source 15) reports similar diagnostic accuracy on authentic vs AI-generated radiographs, which does not demonstrate deepfakes causing incorrect diagnoses. Because the pro side's inference largely equates “undetectable/realistic” with “causes incorrect diagnosis” (a non sequitur) and the evidence pool does not directly establish increased misdiagnosis attributable to deepfake X-rays, the claim as stated is not proven and is more likely false on this record.

Logical fallacies

  • Non sequitur / category error: inferring that difficulty detecting fakes (Sources 1/7/12/14) necessarily means incorrect diagnoses occur, without evidence of diagnostic error caused by the fakeness.
  • Modality switch / weak analogy: using CT tampering results (Source 4) to prove an X-ray-specific causal claim.
  • Equivocation on 'incorrect diagnosis': treating 'acting confidently on an image' or 'not suspecting AI' as equivalent to making a wrong clinical diagnosis, which requires a ground-truth mismatch.
Confidence: 8/10

Expert 2 — The Context Analyst

Focus: Completeness & Framing
Misleading
5/10

The claim frames “hard to detect” deepfake X-rays as proof they “cause incorrect diagnoses,” but the key 2026 Radiology/RSNA reporting emphasizes detectability (75% detection when warned; 41% spontaneous suspicion) and even reports similar diagnostic accuracy on authentic vs AI-generated radiographs (91.3% vs 92.4%), which does not demonstrate increased incorrect diagnoses attributable to deepfakes in that study (Sources 1, 5, 15). With full context, the evidence supports that deepfake X-rays can deceive radiologists, but it does not establish that they are sufficiently realistic to cause radiologists to make incorrect diagnoses as a demonstrated outcome, so the claim's overall impression overreaches (Sources 1, 3, 5).

Missing context

  • The cited 2026 study primarily measures detectability of AI-generated radiographs, not real-world misdiagnosis rates or downstream clinical decisions attributable to deepfakes (Sources 1, 15).
  • Reported diagnostic accuracy was comparable on real vs AI-generated images (91.3% vs 92.4%), which undercuts the implication that deepfakes increased incorrect diagnoses in that experiment (Source 5).
  • CT tampering results (lung cancer injection/removal) are a different modality and attack scenario than AI-generated X-ray deepfakes, so using them to substantiate the X-ray-specific 'incorrect diagnoses' claim is a context shift (Source 4).
  • General concerns about synthetic biomarker fidelity and clinical validity suggest realism can be uneven across pathologies and populations, limiting a blanket claim that deepfakes are 'sufficiently realistic' to cause diagnostic errors broadly (Sources 2, 3).
Confidence: 8/10

Expert 3 — The Source Auditor

Focus: Source Reliability & Independence
Misleading
5/10

The most reliable evidence is Source 1 (RSNA, 2026-03-24) summarizing a Radiology study showing radiologists and multimodal models struggle to distinguish AI-generated radiographs from real ones (e.g., ~75% detection when warned; only ~41% spontaneous suspicion when blinded), while Source 4 (PMC/NIH review citing Mirsky et al.) shows highly successful clinically deceptive image tampering in CT but is not X-ray-specific; most other outlets (Sources 5, 7, 8, 10, 12, 13, 14) appear to be secondary reportage largely echoing the same Radiology/RSNA study rather than independent verification. Taken together, trustworthy sources strongly support that deepfake radiographs can be realistic enough to fool radiologists, but they do not directly establish that this realism has been shown to cause radiologists to make incorrect diagnoses (and Source 5's reported similar diagnostic accuracy on real vs synthetic images does not demonstrate increased misdiagnosis), so the claim overreaches beyond what the best evidence actually measures.

Weakest sources

  • Source 15 (LLM Background Knowledge) is not an independent, citable primary source and should be discounted relative to peer-reviewed or professional-society publications.
  • Source 10 (Neuroscience News) is a low-to-mid reliability news-aggregation style outlet that typically rewrites press releases and is unlikely to add independent verification beyond the underlying Radiology/RSNA report.
  • Source 8 (Gizmodo) is a general-interest tech outlet and is best treated as secondary commentary rather than primary scientific evidence.
  • Source 5 (News-Medical.net) often republishes press-release style summaries and is likely non-independent of the RSNA/Radiology study it describes.
Confidence: 7/10

Expert summary

The claim is
Misleading
5/10
Confidence: 8/10 Unanimous

The arguments

Two AI advocates debated this claim using the research gathered.

Argument for

P
Proponent Argues TRUE

The claim is decisively supported by a multi-center international study published in Radiology (Source 1, RSNA), the field's most authoritative journal, which found that radiologists achieved only 75% accuracy in detecting AI-generated deepfake X-rays even when explicitly warned — meaning roughly one in four fake images went undetected — and that only 41% of radiologists spontaneously identified synthetic images when blinded, directly demonstrating that these fakes are realistic enough to cause misdiagnosis in routine clinical conditions. This is further corroborated by Source 4 (PMC-NIH), which documented a 99.2% success rate for AI-injected false lung cancers on CT scans when radiologists were unaware, and by Source 5 (News-Medical.net), which confirmed that AI-generated X-rays maintained diagnostic accuracy comparable to authentic images (92.4% vs. 91.3%), proving radiologists cannot distinguish them and would therefore act on fabricated pathology — constituting precisely the kind of incorrect diagnosis the claim describes.

O
Opponent Rebuttal

You commit a category error by treating “can't reliably detect fakes” (75% detection; 41% spontaneous suspicion) as “therefore they make incorrect diagnoses,” but the only diagnostic-performance figure in the brief cuts the other way: diagnostic accuracy on AI images was comparable to (even slightly higher than) authentic images, so your argument never shows deepfakes cause more wrong calls (Source 1, RSNA; Source 5, News-Medical.net; Source 15). And your CT-tampering citation is a modality switch that doesn't validate the X-ray claim, while “would therefore act on fabricated pathology” is an unsupported leap because the Radiology/RSNA reporting you cite doesn't measure downstream misdiagnosis rates from deepfake X-rays in routine practice (Source 4, PMC-NIH; Source 1, RSNA).

Argument against

O
Opponent Argues FALSE

The brief's flagship evidence actually undercuts the motion: the Radiology/RSNA coverage reports diagnostic accuracy was essentially the same on authentic vs AI-generated radiographs (91.3% vs 92.4%), which indicates the images did not cause radiologists to make more incorrect diagnoses in that study (Source 5, News-Medical.net; Source 15, LLM Background Knowledge). Other cited items focus on detectability or non–X-ray modalities (e.g., radiologists only 75% at spotting fakes, or CT tampering success rates) and therefore do not demonstrate that AI-generated deepfake X-rays are “sufficiently realistic” to cause incorrect diagnoses, especially given prior work noting synthetic X-rays still fell short of “true realism” and biomarker fidelity concerns (Source 1, RSNA; Source 4, PMC-NIH; Source 6, PMC; Source 3, PMC).

P
Proponent Rebuttal

Your interpretation of the 91.3% vs. 92.4% diagnostic accuracy figures commits a critical logical fallacy: you treat equivalent diagnostic performance on real versus fake images as evidence of safety, when it actually proves the opposite — radiologists acted on deepfake X-rays with the same clinical confidence as authentic ones, meaning fabricated pathology would trigger real treatment decisions, which is precisely what "causing incorrect diagnoses" means. Furthermore, you selectively dismiss Source 4's CT tampering evidence and Source 1's finding that only 41% of radiologists spontaneously detected fakes, yet these directly establish that radiologists routinely act on synthetic images without suspicion — your own cited Source 6's older GAN limitations are rendered obsolete by the 2026 Radiology study's diffusion/LLM-based images, which Source 1 explicitly confirms deceive even warned specialists.
