Lunit INSIGHT MMG: Evidence Review of ScreenTrustCAD and AI-STREAM

Correcting the Record: No NEJM 2024 RCT Exists for Lunit INSIGHT MMG

Radiologists and health system researchers repeatedly search for a "NEJM 2024 RCT" evaluating Lunit INSIGHT MMG in breast cancer screening. That publication does not exist. No randomized controlled trial of Lunit INSIGHT MMG has appeared in the New England Journal of Medicine or its affiliated journals, and none had been identified anywhere in the primary literature as of mid-2026.

The confusion likely stems from two sources. First, the MASAI trial, the only published full RCT in breast cancer screening AI, appeared in The Lancet in January 2026 and attracted wide attention. It used Transpara AI (ScreenPoint Medical), not Lunit. Second, Lunit INSIGHT MMG does have substantial prospective evidence, published in high-impact journals and covering large populations, but neither of its two prospective studies is an RCT, and neither appeared in NEJM.

The actual Lunit evidence base comprises the ScreenTrustCAD prospective paired-reader study (Lancet Digital Health, 2023, n=55,581) and the AI-STREAM preliminary analysis (Nature Communications, 2025, n=24,543). Both are prospective, multi-reader designs with meaningful clinical findings — but they are not RCTs, and their generalizability is bounded by single-country settings and specific equipment configurations. The remainder of this article covers what those studies actually show.

Clinical Problem: Workforce Shortages, Missed Cancers, and the Case for AI in Mammography Screening

Figure: Radiologist at a PACS workstation reviewing a mammogram with an AI heatmap overlay and abnormality score panel. Lunit INSIGHT MMG generates per-lesion abnormality scores and heatmap overlays that radiologists review alongside standard mammographic views. The radiologist retains final diagnostic authority.

Organized breast cancer screening programs face a structural tension. The clinical case for double reading — having two radiologists independently review each mammogram before consensus — is supported by evidence showing it increases cancer detection rates compared with single reading. But double reading is resource-intensive, and radiologist workforce shortages are acute in many health systems. Programs that cannot sustain double reading default to single reading, often by general radiologists rather than breast imaging specialists.

This creates several compounding problems that AI-assisted screening is designed to address:

Interval cancers — tumors diagnosed between scheduled screens — remain a persistent quality indicator. Reducing interval cancer rates requires either more sensitive reading or earlier detection at the screening visit itself.
Performance variability between breast imaging specialists and general radiologists is well documented. General radiologists reading lower volumes of mammograms tend to show lower cancer detection rates and higher recall rates than specialist readers.
Workload pressure on specialist radiologists reduces reading time per case and may affect detection sensitivity, particularly for subtle findings.
Double-reading programs, while clinically valuable, require approximately twice the radiologist time per case — a resource most screening programs cannot sustain as screening volumes grow.

AI-CAD systems like Lunit INSIGHT MMG are positioned to address these gaps by replacing one radiologist in a double-read workflow, augmenting a single reader's performance, or eventually triaging cases by risk level. Whether they actually deliver on that positioning is an evidence question, not a marketing one.

What Is Lunit INSIGHT MMG: AI Approach, Output, and Regulatory Status

Lunit INSIGHT MMG is a deep learning computer-aided detection and diagnosis (CADe/x) system for 2D digital screening mammography. It analyzes standard mammographic views and generates a per-lesion abnormality score on a 0–100 scale, with higher scores indicating greater likelihood of malignancy. The system also produces lesion localization heatmaps that overlay on the mammographic image, allowing radiologists to identify which regions the algorithm has flagged and at what confidence level.

The system is designed to integrate with existing PACS infrastructure and radiology viewers. It supports both single-read and double-read workflow configurations, functioning as a concurrent second-reader or as a CAD overlay during standard reading sessions.
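As a rough illustration of how a per-lesion score on the 0–100 scale could be turned into an exam-level flag, the sketch below marks an exam as positive when its highest lesion score meets a chosen threshold. The max-score rule, the function name, and the example scores are assumptions made for illustration and are not Lunit's documented logic. The default threshold of 10 mirrors the operating point reported for the AI-STREAM study.

```python
from typing import Sequence


def exam_is_flagged(lesion_scores: Sequence[float], threshold: float = 10.0) -> bool:
    """Return True if any per-lesion score meets the threshold.

    Assumption: the exam-level decision is driven by the highest
    per-lesion score. This is an illustrative rule only, not the
    vendor's documented behaviour.
    """
    for score in lesion_scores:
        if not 0 <= score <= 100:
            raise ValueError(f"score outside the 0-100 scale: {score}")
    # An exam with no marked lesions is treated as scoring 0.
    return max(lesion_scores, default=0.0) >= threshold


# Hypothetical scores for three exams
print(exam_is_flagged([3.2, 7.9]))   # every lesion below the threshold
print(exam_is_flagged([12.5, 4.0]))  # one lesion at or above the threshold
print(exam_is_flagged([]))           # no lesions marked
```

Any real deployment would take its operating point from local calibration rather than from a fixed default like the one used here.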

Lunit INSIGHT MMG regulatory authorizations and confirmed national program deployments as of Q2 2026.
Authorization	Status	Notes
FDA 510(k)	Cleared — K211678 (2021)	CADe/x for screening mammography
CE Mark	Confirmed	European conformity for medical device use
Health Canada	Authorized (2022)	Canadian market authorization
Australia (BreastScreen NSW)	State program deployment	Operational in the New South Wales state screening program
Sweden	National program deployment	Operational in organized screening
Iceland, Singapore, Saudi Arabia, Qatar	National program deployment	Confirmed as of Q2 2026

Evidence Quality Overview: What Study Types Support This Product

As of mid-2026, the published evidence base for Lunit INSIGHT MMG includes two prospective studies, one large retrospective simulation study, and one ongoing real-world evaluation trial. No randomized controlled trial specific to this product has been published.

Published and ongoing evidence for Lunit INSIGHT MMG as of Q2 2026. Prospective studies are non-RCT designs; no mortality endpoint data are available.
Study	Design	N	Setting	Publication	Status
ScreenTrustCAD	Prospective paired-reader non-inferiority	55,581	Single Swedish center, double-read program	Lancet Digital Health, 2023	Published — full results
AI-STREAM	Prospective multicenter single-read cohort	24,543	Six Korean academic hospitals	Nature Communications, 2025	Preliminary analysis — final results pending post-2026
Danish Population Simulation	Retrospective simulation	249,402	Danish national mammography database	Radiology AI, 2024	Published — simulation only
NCT06232070	Real-world evaluation trial	Not disclosed	Ongoing	Not yet published	Active — no results available

The two prospective studies differ in workflow context: ScreenTrustCAD examined AI in a double-read program (replacing one of two radiologists), while AI-STREAM examined AI as a CAD assistant in a single-read program. These are not equivalent deployment scenarios, and findings from one do not directly transfer to the other.

ScreenTrustCAD Trial (Lancet Digital Health, 2023): Double-Read Setting Evidence

ScreenTrustCAD is the foundational prospective study for Lunit INSIGHT MMG in double-read screening programs. The trial enrolled 55,581 women attending a population-based mammography screening program in Stockholm, Sweden, using Philips mammography equipment. Radiologists participating in the study had a median of 17 years of breast imaging experience — a notably experienced cohort relative to many real-world screening programs.

The study design was a prospective paired-reader non-inferiority trial. Each mammogram was read under three conditions: standard double reading by two radiologists, double reading by one radiologist plus Lunit INSIGHT MMG, and standalone AI reading. The primary outcome was cancer detection rate (CDR); recall rate was a key secondary outcome.

ScreenTrustCAD key findings: cancer detection and recall rates by reading configuration (n=55,581, Stockholm, Lancet Digital Health 2023).
Reading Configuration	Cancers Detected	CDR (per 1,000)	Recall Rate	Workload vs. Standard
Two radiologists (standard double reading)	250	~4.50	2.93%	Baseline
One radiologist + Lunit AI	261	~4.70	2.80%	~50% reduction in radiologist reads
Standalone Lunit AI	Non-inferior to 2-radiologist reading	—	Not reported as superior	~100% radiologist reads eliminated (not viable)

The primary finding was that one radiologist plus Lunit AI was not only non-inferior but statistically superior to two-radiologist double reading in cancer detection rate: 261 versus 250 screen-detected cancers, a relative proportion of 1.04 (95% CI 1.00–1.09; p=0.017 for superiority). The recall rate was also about 4% lower in relative terms (2.80% vs. 2.93%). In a population of 100,000 screened women, replacing one radiologist with AI would eliminate approximately 100,000 radiologist reads while adding approximately 1,562 consensus discussions.
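The point estimates in this paragraph follow from simple arithmetic on the reported counts, and the sketch below reproduces them. It covers point estimates only: the confidence interval and p-value come from the study's paired analysis and are not recomputed here, and the 1,562 additional consensus discussions is a study-reported figure that is not derived in this code. Variable names are illustrative.

```python
SCREENED = 55_581          # women in ScreenTrustCAD
CANCERS_TWO_RADS = 250     # standard double reading
CANCERS_RAD_PLUS_AI = 261  # one radiologist plus AI


def per_1000(events: int, screened: int) -> float:
    """Events per 1,000 screened women."""
    return 1000 * events / screened


cdr_standard = per_1000(CANCERS_TWO_RADS, SCREENED)
cdr_ai = per_1000(CANCERS_RAD_PLUS_AI, SCREENED)
relative_proportion = CANCERS_RAD_PLUS_AI / CANCERS_TWO_RADS

# Relative change in recall rate, from the reported percentages
recall_standard, recall_ai = 2.93, 2.80
relative_recall_change = (recall_ai - recall_standard) / recall_standard

# Workload: double reading needs two radiologist reads per woman,
# while the radiologist-plus-AI arm needs one.
population = 100_000
reads_saved = 2 * population - 1 * population

print(f"CDR, two radiologists:   {cdr_standard:.2f} per 1,000")
print(f"CDR, radiologist + AI:   {cdr_ai:.2f} per 1,000")
print(f"Relative proportion:     {relative_proportion:.2f}")
print(f"Relative recall change:  {relative_recall_change:+.1%}")
print(f"Radiologist reads saved: {reads_saved:,}")
```

Both detection rates share the same denominator because the design is paired: every mammogram was assessed under each reading configuration.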

Standalone AI performance was non-inferior to two-radiologist double reading in CDR but was not superior. The authors noted that standalone AI raises unresolved questions around medical-legal responsibility, public acceptability, and the absence of a human clinical decision-maker — factors that preclude standalone deployment in current regulatory and clinical frameworks.

AI-STREAM Preliminary Analysis (Nature Communications, 2025): Single-Read Setting Evidence

AI-STREAM is the first large-scale, multicenter prospective study of AI-CAD in a single-read mammography setting. The preliminary analysis included 24,543 women across six Korean academic hospitals, with Lunit INSIGHT MMG version 1.1.7.1 deployed at a positivity threshold of ≥10. The study compared cancer detection rates and recall rates with and without AI-CAD assistance for breast radiologists reading in their standard single-read workflow.

The primary prospective finding for breast radiologists was a 13.8% higher cancer detection rate with AI-CAD (5.70 per 1,000 screened) compared with reading without AI (5.01 per 1,000; p<0.001). Critically, this improvement was achieved without a statistically significant change in recall rate (p=0.564) — a finding that addresses the central concern that AI assistance would increase false positives and unnecessary callbacks.

AI-STREAM preliminary analysis key findings by reader group and condition (n=24,543, six Korean academic hospitals, Nature Communications 2025). General radiologist and standalone AI data require separate interpretation — see notes below.
Reader Group / Condition	CDR (per 1,000)	Change vs. No AI	Recall Rate Change	Data Type
Breast radiologists — no AI	5.01	Baseline	Baseline	Prospective primary outcome
Breast radiologists — with Lunit AI-CAD	5.70	+13.8% (p<0.001)	No significant change (p=0.564)	Prospective primary outcome
General radiologists — no AI (simulation)	~4.76	Baseline	~6.31%	Retrospective simulation sub-study
General radiologists — with Lunit AI-CAD (simulation)	~6.02	+26.4%	~6.89% (significant increase)	Retrospective simulation sub-study
Standalone Lunit AI	5.21	Non-inferior to specialists (p=0.752)	Significantly higher than specialists	Prospective comparison arm

The standalone AI arm showed CDR non-inferior to breast specialist radiologists (5.21 vs. 5.01 per 1,000; p=0.752), but with significantly higher recall rates than specialists. This pattern — adequate cancer detection but excess recalls — is consistent with ScreenTrustCAD's standalone findings and reinforces that standalone AI is not currently viable as a sole reader in organized screening programs.
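The relative changes quoted for AI-STREAM can be checked against the rounded rates in the table above, as in the sketch below. Because the inputs are rounded, the last digit can differ slightly from the published percentages, which were presumably computed from unrounded data. The per-100,000 scaling is an illustrative extrapolation and not a result reported by the study.

```python
def relative_change_pct(baseline: float, with_ai: float) -> float:
    """Percentage change of with_ai relative to baseline."""
    return 100 * (with_ai - baseline) / baseline


# CDR per 1,000 screened, as listed in the table (rounded values)
specialist_change = relative_change_pct(5.01, 5.70)
generalist_change = relative_change_pct(4.76, 6.02)

# Illustrative scaling to a hypothetical population of 100,000 women
extra_cancers_per_100k = (5.70 - 5.01) * 100  # specialists, prospective arm

# Generalist recall rates (percent) from the simulation sub-study
extra_recalls_per_100k = (6.89 - 6.31) / 100 * 100_000

print(f"Breast radiologists CDR change:  {specialist_change:+.1f}%")
print(f"General radiologists CDR change: {generalist_change:+.1f}%")
print(f"Extra cancers per 100,000 (specialists): {extra_cancers_per_100k:.0f}")
print(f"Extra recalls per 100,000 (generalists): {extra_recalls_per_100k:.0f}")
```

The generalist figures come from a retrospective simulation, so any scaled number built on them carries that limitation.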

Real-World Deployment Evidence: Danish Population-Wide Simulation

Figure: Workflow diagram of two mammography screening pathways: single-read with an AI assistant, and double-read with AI replacing one radiologist. Where AI sits in the screening workflow (first-reader replacement, second-reader replacement, or triage tool) meaningfully affects both accuracy and workload outcomes. The Danish simulation study quantified these differences across 249,402 mammograms.

A retrospective simulation study published in Radiology: Artificial Intelligence (2024) applied Lunit INSIGHT MMG v1.1.7.1 to 249,402 Danish mammograms to model how different AI integration positions within the double-read workflow would affect accuracy and workload compared with standard two-radiologist double reading.

Three AI-integrated configurations were modeled:

Danish population-wide retrospective simulation: AI integration position and its effect on screening accuracy (n=249,402, Radiology AI 2024).
Configuration	Description	Workload Reduction	Key Accuracy Finding
AIfirst	AI replaces the first radiologist; second radiologist reviews all cases plus AI output	~49%	No significant difference in CDR, sensitivity, or specificity vs. standard double reading; higher arbitration rate
AIsecond	AI replaces the second radiologist; first radiologist reviews all cases, AI provides second opinion	~49%	Recall rate reduced, but sensitivity decreased by 1.58% (p<0.001) — a meaningful accuracy penalty
AItriage	AI triages cases to single or double reading based on risk score; high-risk cases get two readers	~50%	Higher CDR, sensitivity, and PPV than standard double reading

Known Limitations Across the Evidence Base

The Lunit INSIGHT MMG evidence base is among the strongest for any commercial mammography AI system as of mid-2026, but it carries specific limitations that clinicians and administrators must weigh before deployment decisions.

Single-country generalizability: ScreenTrustCAD was conducted at a single Swedish center using Philips mammography equipment. AI-STREAM was conducted at six Korean academic hospitals. Neither study has been replicated across diverse healthcare systems, equipment vendors, or screening program structures.
Radiologist experience dependency: ScreenTrustCAD radiologists had a median 17 years of breast imaging experience — a highly specialized cohort. AI performance in the context of less experienced readers may differ from what ScreenTrustCAD demonstrated.
Threshold calibration uncertainty: ScreenTrustCAD aimed for a 2% CDR increase but observed 4–6%. Retrospective calibration does not reliably predict prospective operating points. Ongoing calibration in clinical use is likely necessary.
Automation bias — experience-dependent risk: A Korean Journal of Radiology editorial (June 2026) synthesizing MASAI and AI-STREAM findings noted that experienced radiologists in MASAI showed no automation bias (specificity unchanged), while general radiologists in AI-STREAM showed a statistically significant recall rate increase — suggesting that automation bias risk is reader-experience-dependent and higher in lower-volume or less specialized settings.
Overdiagnosis concern — increased DCIS detection: Both ScreenTrustCAD and AI-STREAM showed increased detection of ductal carcinoma in situ (DCIS) with AI assistance. Increased DCIS detection raises unresolved questions about overdiagnosis — the detection and treatment of cancers that would not have caused clinical harm during a patient's lifetime. Long-term follow-up data sufficient to address this question are not yet available.
No published mortality endpoint: Neither ScreenTrustCAD nor AI-STREAM has reported breast cancer mortality data. CDR and recall rate are process measures, not mortality outcomes. Whether improved CDR with AI translates to reduced mortality from breast cancer remains undemonstrated for this product.
AI-STREAM is a preliminary analysis: Final AI-STREAM results with 2-year follow-up and National Cancer Registry linkage are expected after 2026. The preliminary findings should not be treated as the study's definitive conclusions.

Clinical Implications by Setting: How Workflow Design and Reader Experience Shape AI Value

The evidence does not support a single universal recommendation for how Lunit INSIGHT MMG should be deployed. The clinical implications differ substantially based on screening program structure, reader experience, and where AI is positioned in the reading workflow.

Clinical implications of Lunit INSIGHT MMG by deployment scenario, based on published prospective evidence as of Q2 2026.
Deployment Scenario	Evidence Source	Expected Benefit	Key Risk to Monitor
Double-read program: AI replaces second radiologist	ScreenTrustCAD (Lancet Digital Health 2023)	Maintained or superior CDR, reduced recall rate, ~50% workload reduction	Threshold calibration drift; arbitration rate increase
Single-read program: breast specialist with AI-CAD	AI-STREAM prospective arm (Nature Communications 2025)	13.8% higher CDR, no significant recall rate change	DCIS overdetection; final AI-STREAM results pending
Single-read program: general radiologist with AI-CAD	AI-STREAM simulation sub-study (Nature Communications 2025)	26.4% CDR increase (simulation)	Significant recall rate elevation; automation bias risk; simulation data only — not prospectively validated
Standalone AI as sole reader	ScreenTrustCAD, AI-STREAM	CDR non-inferior to specialists in both studies	Significantly higher recall rates; unresolved medical-legal responsibility; not viable as sole reader in current frameworks

For programs deploying AI alongside general radiologists, the KJR 2026 editorial recommends educational programs for correct AI use and systematic post-deployment monitoring of recall rates. The recall rate elevation observed in the AI-STREAM general radiologist simulation — from 6.31% to 6.89% — may reflect automation bias: readers accepting AI flags without sufficient independent assessment. This risk is not theoretical; it appears to be reader-experience-dependent and requires active management.

Deployment Stage and Ongoing Research

Lunit INSIGHT MMG is in broad clinical use across multiple organized screening programs as of Q2 2026, including BreastScreen NSW in Australia (a state-level program), organized screening programs in Sweden and Iceland, and national programs in Singapore, Saudi Arabia, and Qatar. The product holds FDA 510(k) clearance (K211678), CE mark, and Health Canada authorization, and is reported to be deployed across more than 4,800 medical institutions globally.

A real-world evaluation trial (ClinicalTrials.gov: NCT06232070) is active but has not published results. No interim findings, enrollment targets, or primary endpoints from this trial were available at the time of writing.

Final AI-STREAM results — including 2-year follow-up data linked to the National Cancer Registry — are expected after 2026. Those results will be critical for assessing whether the CDR improvements observed in the preliminary analysis translate to clinically meaningful outcomes, and whether the DCIS overdetection signal resolves into a net benefit or raises persistent overdiagnosis concerns.

Evidence gaps that remain before mortality-endpoint conclusions can be drawn: no published randomized controlled trial specific to Lunit INSIGHT MMG; no long-term follow-up data resolving the DCIS overdetection question; no prospective validation of general radiologist performance with AI-CAD (current data are simulation-only); no published data from the NCT06232070 real-world evaluation.
The MASAI trial (The Lancet, January 2026) — using Transpara AI, not Lunit — remains the only published full RCT in breast cancer screening AI and provides the only available RCT-level evidence that AI-supported screening can reduce interval cancer rates. That evidence does not transfer directly to Lunit INSIGHT MMG without product-specific RCT data.
Institutions deploying Lunit INSIGHT MMG in organized screening programs should implement prospective performance monitoring from the outset, with attention to CDR, recall rate, DCIS detection proportion, and reader-level automation bias indicators — particularly where general radiologists are the primary readers.
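A minimal sketch of what such monitoring could compute from periodic audit counts is shown below. The `AuditPeriod` schema, the example counts, and the 10% tolerance are all hypothetical placeholders chosen for illustration. They are not taken from any cited study and are not a clinical recommendation. The drift check is a crude before-and-after comparison, not a statistical test.

```python
from dataclasses import dataclass


@dataclass
class AuditPeriod:
    """Counts for one reader or site over one audit period (hypothetical schema)."""
    screened: int
    recalled: int
    cancers_detected: int  # invasive cancers plus DCIS
    dcis_detected: int

    def cdr_per_1000(self) -> float:
        return 1000 * self.cancers_detected / self.screened

    def recall_rate_pct(self) -> float:
        return 100 * self.recalled / self.screened

    def dcis_proportion_pct(self) -> float:
        if self.cancers_detected == 0:
            return 0.0
        return 100 * self.dcis_detected / self.cancers_detected


def recall_drift_flag(before: AuditPeriod, after: AuditPeriod,
                      max_relative_increase_pct: float = 10.0) -> bool:
    """Flag a relative rise in recall rate above a locally chosen tolerance."""
    base = before.recall_rate_pct()
    if base == 0:
        return after.recall_rate_pct() > 0
    change = 100 * (after.recall_rate_pct() - base) / base
    return change > max_relative_increase_pct


# Made-up counts for one reader before and after AI deployment
pre = AuditPeriod(screened=20_000, recalled=1_200, cancers_detected=95, dcis_detected=18)
post = AuditPeriod(screened=20_000, recalled=1_400, cancers_detected=112, dcis_detected=27)

for label, period in (("pre-AI", pre), ("post-AI", post)):
    print(f"{label}: CDR {period.cdr_per_1000():.2f}/1,000, "
          f"recall {period.recall_rate_pct():.1f}%, "
          f"DCIS share {period.dcis_proportion_pct():.1f}%")
print("Recall drift flagged:", recall_drift_flag(pre, post))
```

Computing these figures per reader rather than per site is one way to surface the reader-level differences discussed above, although small per-reader volumes make the estimates noisy.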
