The phrase "AI for health" covers tools that behave very differently in practice. A radiology AI that flags suspected pulmonary emboli on CT operates under different evidence standards, regulatory requirements, and failure modes than an LLM-based clinical documentation tool or a sepsis prediction model embedded in an ICU EHR. Treating them as a single category obscures the decisions that actually matter.
This page organizes the current state of AI across five specialties where FDA-cleared devices and peer-reviewed literature are concentrated enough to support structured comparison. The goal is to help readers locate relevant device records, evidence appraisals, and regulatory context without having to reconstruct the landscape from scratch.
Where FDA-Cleared AI Is Concentrated
As of Q2 2026, radiology accounts for the largest share of FDA-authorized AI/ML-enabled medical devices by a significant margin. Cardiology and pathology follow at a distance. Gastroenterology and primary care have smaller but growing authorization counts, with primary care's AI tools skewing toward administrative and documentation functions rather than diagnostic SaMD.
| Specialty | Authorization Concentration | Primary Use Categories | Dominant Pathway |
|---|---|---|---|
| Radiology | Highest — majority of total FDA AI/ML device authorizations | Triage, detection, measurement, workflow prioritization | 510(k) |
| Cardiology | High — second largest concentration | ECG interpretation, arrhythmia detection, imaging analysis | 510(k) |
| Pathology | Moderate — growing rapidly with digital pathology adoption | Slide analysis, cancer grading, cell counting | 510(k) and De Novo |
| Gastroenterology | Lower — concentrated in endoscopy AI | Polyp detection during colonoscopy | 510(k) and De Novo |
| Primary Care | Lower — skews toward administrative AI | Documentation (AI scribe), risk stratification, prior auth | 510(k) for clinical tools; many admin tools are not SaMD |
Radiology
Radiology is the most saturated specialty for AI deployment — and also the one with the most documented performance heterogeneity. Chest X-ray AI, CT triage tools, and mammography CAD systems have accumulated the longest post-market track records. The evidence base is real but uneven: retrospective studies vastly outnumber prospective ones, and external validation on demographically diverse populations remains the exception rather than the rule.
What's Cleared and What It Does
The bulk of cleared radiology AI falls into three functional categories: detection and flagging (e.g., intracranial hemorrhage, pulmonary embolism, pneumothorax on CT), measurement and quantification (e.g., nodule sizing, organ volume), and workflow triage (prioritizing worklist order based on suspected findings). Most of these are cleared as 510(k) devices using predicate comparisons to earlier CAD tools.
A smaller number of tools — particularly in mammography screening and chest CT lung nodule management — have gone through De Novo authorization, establishing new device classifications. These tend to have more detailed performance data in the FDA submission record.
Evidence Quality and Known Gaps
- Most published performance studies use retrospective, single-institution datasets. Performance figures from these studies often don't replicate in multi-site prospective deployments.
- External validation — testing on a dataset entirely separate from the training and tuning set — is present in a minority of published radiology AI studies.
- Demographic composition of training datasets is inconsistently reported. Known gaps include underrepresentation of darker skin tones in dermatology-adjacent imaging, and limited data from lower-resource healthcare settings outside the U.S. and Europe.
- Model drift — performance degradation as scanner hardware, imaging protocols, or patient populations shift — has been documented in deployed radiology AI but is rarely reported systematically post-market.
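The drift problem in the last point can be made concrete with a minimal monitoring sketch. It assumes a site already collects confirmed true-positive and false-negative counts per review window; the function names, the window labels, the baseline of 0.92, and the 0.05 tolerance are all hypothetical values chosen for illustration, not figures from any deployed tool.

```python
def sensitivity(true_positives, false_negatives):
    """Share of confirmed positive cases that the tool flagged."""
    return true_positives / (true_positives + false_negatives)


def flag_drift(windows, baseline, tolerance=0.05):
    """Return labels of windows whose sensitivity fell more than
    `tolerance` below the validation baseline.

    `windows` is a list of (label, true_positives, false_negatives).
    """
    flagged = []
    for label, tp, fn in windows:
        if baseline - sensitivity(tp, fn) > tolerance:
            flagged.append(label)
    return flagged


# Hypothetical quarterly counts after a scanner or protocol change.
windows = [
    ("Q1", 46, 4),
    ("Q2", 45, 5),
    ("Q3", 41, 9),
    ("Q4", 40, 10),
]
flagged = flag_drift(windows, baseline=0.92)
print(flagged)
```

A fixed tolerance is the simplest possible rule. With small per-window case counts, a real monitoring program would likely need confidence intervals or a control-chart method to separate drift from sampling noise.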
Active Clinical Trials
Several prospective trials are evaluating radiology AI in real workflow conditions rather than on retrospective image sets. NCT05046886 (AI-assisted chest X-ray reading in emergency settings) and NCT04936776 (AI triage for CT pulmonary angiography) represent the type of prospective design the field needs more of. Other radiology AI trials can be found on ClinicalTrials.gov by searching the intervention term "artificial intelligence" and filtering by radiology condition terms.
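The search described above can also be scripted. The sketch below only builds a query URL and makes no network call. It assumes the ClinicalTrials.gov API v2 `studies` endpoint and its `query.intr`, `query.cond`, and `pageSize` parameters; check these against the current API documentation before relying on them. The helper name `build_trial_search_url` and the example condition term are illustrative.

```python
from urllib.parse import urlencode

# Assumed base URL of the ClinicalTrials.gov API v2 studies endpoint.
API_BASE = "https://clinicaltrials.gov/api/v2/studies"


def build_trial_search_url(intervention, condition, page_size=20):
    """Build a search URL for trials matching an intervention and a condition.

    Parameter names are assumptions about the v2 API, not confirmed
    by this page.
    """
    params = {
        "query.intr": intervention,
        "query.cond": condition,
        "pageSize": page_size,
    }
    return f"{API_BASE}?{urlencode(params)}"


url = build_trial_search_url("artificial intelligence", "pulmonary embolism")
print(url)
```

Swapping the condition term (for example, to "pneumothorax" or "intracranial hemorrhage") narrows the search to other radiology findings.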
Cardiology
Cardiology AI has two distinct clusters: ECG-based tools and imaging-based tools. These have different evidence profiles, different regulatory histories, and different deployment realities.
ECG AI
AI algorithms applied to 12-lead ECG data represent one of the more clinically mature applications in the field. Several tools have demonstrated prospective validation across large, multi-site datasets. The Mayo Clinic-developed ECG AI work — detecting conditions like low ejection fraction, atrial fibrillation, and hypertrophic cardiomyopathy from ECG waveforms — has been among the most rigorously published. Importantly, some of this work has moved from publication to prospective randomized trials, which remains rare in healthcare AI.
Consumer-grade ECG AI (single-lead, wearable) has a different evidence profile. FDA clearance exists for atrial fibrillation detection in consumer devices, but the performance data in real-world populations — particularly older adults with comorbidities, who are the highest-risk group — is thinner than the device clearance record suggests.
Cardiac Imaging AI
AI tools for echocardiography (automated chamber measurement, ejection fraction calculation) and cardiac CT (coronary artery calcium scoring, stenosis detection) are cleared and in deployment at major health systems. The echocardiography AI space has seen particularly rapid adoption because it addresses a real workflow bottleneck: manual measurement of cardiac function is time-consuming and has known inter-reader variability. Automated measurement tools can reduce that variability — but the degree to which they improve clinical outcomes, rather than just workflow efficiency, is less established.
Pathology
Digital pathology AI is at an inflection point. The underlying infrastructure requirement — whole-slide imaging scanners and the digital workflow to support them — limited adoption for years. As more pathology labs complete the transition to digital workflows, AI tools for slide analysis are moving from research settings into clinical deployment.
FDA-cleared pathology AI tools cover applications including prostate cancer grading (Gleason scoring assistance), breast cancer detection on core needle biopsy, and mitotic figure counting. Some of these have gone through De Novo authorization rather than 510(k), which means they established new regulatory classifications and required more detailed performance characterization in their submissions.
Unresolved Questions in Pathology AI
- Staining variability: AI models trained on slides from one lab's staining protocol can show degraded performance on slides from another lab. This is a known generalization problem with limited standardized solutions.
- Rare cancer types: Most validated pathology AI tools target high-prevalence cancers. Performance on rare histologic subtypes is largely uncharacterized.
- Pathologist-AI interaction: Studies show that when AI provides a concurrent read, pathologist behavior changes — sometimes improving accuracy, sometimes anchoring to the AI output even when it's wrong. The net clinical effect depends heavily on how the tool is integrated into the workflow.
- Regulatory scope: Some pathology AI tools are positioned as "decision support" rather than diagnostic devices, which affects both the FDA pathway and the evidentiary standard expected.
Gastroenterology
Gastroenterology AI is narrower in scope than radiology or cardiology, but the concentration in one application — polyp detection during colonoscopy — has produced some of the most rigorous clinical trial data in the entire healthcare AI field.
Computer-aided detection (CADe) tools for colonoscopy have been evaluated in multiple randomized controlled trials, with several reporting statistically significant improvements in adenoma detection rate (ADR) compared to unassisted colonoscopy. This is a harder evidentiary bar than most healthcare AI studies attempt. The RCT results are not uniformly positive — some trials show ADR improvements while others show no significant difference — and the clinical significance of detecting additional small adenomas remains an open debate among gastroenterologists.
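For readers less familiar with how these trials are scored: ADR is the share of examined patients in whom at least one adenoma is found, and a two-arm trial compares that proportion between the CADe and control arms. The sketch below runs a standard pooled two-proportion z-test on invented arm counts. The numbers are hypothetical and do not come from any trial discussed here.

```python
import math


def adr(patients_with_adenoma, patients_examined):
    """Adenoma detection rate: patients with >=1 adenoma / patients examined."""
    return patients_with_adenoma / patients_examined


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test.

    Returns (difference in proportions, z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1 - p2, z, p_value


# Hypothetical arms: 180/500 with CADe vs. 150/500 without.
diff, z, p_value = two_proportion_z_test(180, 500, 150, 500)
print(f"ADR difference: {diff:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

A p-value below 0.05 in a sketch like this says nothing about clinical significance. If the additional adenomas are mostly small, the open debate noted above still applies.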
Beyond Polyp Detection
AI applications in upper GI endoscopy (Barrett's esophagus surveillance, gastric cancer detection) are earlier in their regulatory and evidence trajectory. Several tools have CE marking in Europe; FDA authorizations in this sub-area are more limited. Capsule endoscopy AI, the automated reading of small bowel capsule studies, addresses a genuine workflow bottleneck (a single capsule study can generate 50,000+ frames) and has received FDA clearance, though prospective outcome data remain limited.
Primary Care
Primary care AI doesn't fit neatly into the diagnostic imaging paradigm that dominates the other specialties on this page. The applications are more heterogeneous: risk stratification models embedded in EHRs, AI-generated clinical documentation (AI scribes), chronic disease management tools, and administrative automation for prior authorization and scheduling.
AI Scribes and Ambient Documentation
Ambient AI documentation tools — which listen to a clinical encounter and generate a draft note — have achieved the fastest adoption rate of any AI category in primary care. Several major health systems have deployed them at scale. The driver is not clinical efficacy in the traditional sense but physician burnout: documentation burden is a well-documented contributor to burnout, and tools that reduce it have strong adoption incentives regardless of whether they've been evaluated in RCTs.
Most ambient documentation tools are not regulated as SaMD because they generate draft notes for physician review rather than making autonomous clinical decisions. This means they don't appear in FDA device records — which creates a regulatory gap. Hallucination risk (AI-generated text that sounds plausible but contains factual errors) is a real concern in this category. A note that misattributes a medication, documents a finding that wasn't discussed, or fabricates a patient statement could reach the permanent medical record if the reviewing physician doesn't catch it.
Risk Stratification in EHRs
Predictive models embedded in EHR platforms, which flag patients at risk for deterioration, readmission, or specific conditions, have been deployed widely but studied inconsistently. The Epic Deterioration Index and similar tools have real-world deployment data, but independent prospective validation is limited. Several published studies have documented algorithmic bias in commercial risk stratification tools. A widely cited 2019 study in Science found that a commercial algorithm used across health systems systematically underestimated the care needs of Black patients relative to equally sick white patients, because it used healthcare cost as a proxy for health need.
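The cost-as-proxy mechanism is easy to reproduce in a toy setting. The sketch below assumes two groups with identical distributions of health need, where group B generates less cost per unit of need (for example, because of access barriers). Every number, the group labels, and the helper names are invented for illustration and are not taken from the Science study.

```python
# Hypothetical cost generated per unit of health need in each group.
COST_PER_NEED_UNIT = {"A": 1000, "B": 650}

# Both groups have the same needs (1..10); only the cost differs.
patients = [
    {"group": group, "need": need, "cost": need * COST_PER_NEED_UNIT[group]}
    for group in ("A", "B")
    for need in range(1, 11)
]


def select_top(patients, key, k):
    """Pick the k patients ranked highest on `key` for extra care."""
    return sorted(patients, key=lambda p: p[key], reverse=True)[:k]


def count_by_group(selected):
    counts = {"A": 0, "B": 0}
    for patient in selected:
        counts[patient["group"]] += 1
    return counts


by_cost = count_by_group(select_top(patients, "cost", 6))
by_need = count_by_group(select_top(patients, "need", 6))
print("Selected by cost:", by_cost)
print("Selected by need:", by_need)
```

Ranking on need selects the two groups equally, while ranking on cost fills nearly every slot from group A even though both groups are equally sick. The model never sees group membership, so the disparity comes entirely from the choice of target variable.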
Cross-Specialty Bias and Equity Concerns
Algorithmic bias in healthcare AI is not a hypothetical risk — it has been documented in deployed tools across multiple specialties. The mechanisms vary: training data that underrepresents certain populations, proxy variables that encode historical inequities, and performance metrics that don't disaggregate results by race, sex, age, or socioeconomic status.
| Specialty | Documented Bias Concern | Mechanism | Evidence Status |
|---|---|---|---|
| Radiology | Lower sensitivity for certain findings in non-white patients on chest X-ray AI | Training dataset skewed toward academic medical center populations | Published retrospective analyses; limited prospective confirmation |
| Cardiology | ECG AI trained on predominantly white, male populations; performance gaps in women and non-white patients for some conditions | Historical ECG dataset composition | Published; flagged in multiple validation studies |
| Pathology | Staining protocol variation disproportionately affects labs in lower-resource settings | Model brittleness to domain shift; not a demographic bias per se | Documented in multi-site validation studies |
| Gastroenterology | Polyp detection ADR improvements less consistent in lower-volume endoscopists; limited data on diverse patient populations | Most RCTs conducted in high-volume academic centers | Partially documented; ongoing trials addressing this |
| Primary Care | Risk stratification models underestimate care needs of Black patients; documented in commercial EHR-embedded tools | Cost as proxy for need; historical utilization data encodes access inequity | Published peer-reviewed evidence (Obermeyer et al., Science 2019) |
One structural problem: most FDA submissions do not require disaggregated performance data by race, ethnicity, or sex as a condition of clearance. Some submissions include it voluntarily; many don't. This means the FDA device record often cannot answer the question of whether a tool performs equally across patient populations.
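What disaggregated reporting involves can be shown in a few lines. The sketch below computes sensitivity and specificity per subgroup from labeled predictions. The record layout, the `disaggregate` function, and the sample counts are hypothetical and are not drawn from any FDA submission.

```python
from collections import defaultdict


def disaggregate(records):
    """Compute per-group sensitivity and specificity.

    Each record is (group, truth, prediction) with boolean truth and
    prediction. A metric is None when its denominator is zero.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for group, truth, pred in records:
        c = counts[group]
        if truth and pred:
            c["tp"] += 1
        elif truth:
            c["fn"] += 1
        elif pred:
            c["fp"] += 1
        else:
            c["tn"] += 1

    result = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        result[group] = {
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
            "n": positives + negatives,
        }
    return result


# Hypothetical validation set with two subgroups of equal size.
records = (
    [("A", True, True)] * 9 + [("A", True, False)] * 1
    + [("A", False, False)] * 18 + [("A", False, True)] * 2
    + [("B", True, True)] * 6 + [("B", True, False)] * 4
    + [("B", False, False)] * 18 + [("B", False, True)] * 2
)
by_group = disaggregate(records)
for group, metrics in sorted(by_group.items()):
    print(group, metrics)
```

In this invented data the pooled sensitivity is 15/20 = 0.75, which hides the fact that group A sits at 0.9 and group B at 0.6. A submission that reports only the pooled figure cannot reveal that gap.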
How to Use This Landscape Page
This page is a navigation layer, not a standalone reference. The structured information lives in the linked records.
- To verify whether a specific tool is FDA-cleared and for what indication, go to the FDA AI Device Records section and filter by specialty.
- To evaluate the quality of evidence behind a specific tool or application, see the Evidence Appraisals section, which covers study design, dataset composition, and limitations.
- To track how FDA regulatory policy on AI/ML devices has changed over time, the Regulatory Tracker maintains a chronological record of guidance documents and policy actions.
- For real-world deployment accounts — how tools actually perform outside controlled study conditions — see Clinical Deployment Reports.