
Why This Guidance Is the Operational Capstone of FDA's AI Regulatory Buildout
On January 6, 2025, FDA's Center for Devices and Radiological Health (CDRH), along with CBER and CDER, issued draft guidance titled "Artificial Intelligence-Enabled Device Software Functions: Lifecycle Management and Marketing Submission Recommendations" under docket FDA-2024-D-4488. It is the first single FDA document to provide lifecycle-spanning recommendations for all AI-enabled device software functions — from device description and labeling requirements in premarket submissions through postmarket performance monitoring plans. Nothing FDA had previously issued came close to this operational scope.
The practical significance is straightforward: by early 2026, FDA had authorized over 1,350 AI-enabled devices through established premarket pathways — roughly double the number from 2022. Yet the regulatory framework governing how manufacturers document, validate, and monitor those devices had remained fragmented across multiple guidance documents, review-level expectations, and informal feedback accumulated through individual submissions. This guidance consolidates those expectations into one document.
The guidance was long anticipated. It appeared on FDA's fiscal year 2024 "A list" — the agency's highest-priority guidance publications — before being deferred to the FY2025 list when that deadline passed. Its arrival in January 2025 was therefore not a surprise in direction, only in timing. What it delivers is a submission blueprint: nine documentation areas, explicit bias and transparency mandates, a reconciliation of terminology conflicts between the AI engineering community and FDA regulatory usage, and a framework for postmarket monitoring that begins at the time of initial submission.
Scope and Applicability: What Is an AI-DSF and Who Must Comply
The guidance centers on a defined unit of regulation: the AI-enabled Device Software Function (AI-DSF). An AI-DSF is a device software function that implements one or more AI models to achieve its intended purpose. This definition is precise and consequential — it ties the regulatory requirements to the functional role of AI within the device, not to the device category or submission pathway.
The guidance applies across all major submission pathways: 510(k), De Novo, PMA, Humanitarian Device Exemption (HDE), Biologics License Application (BLA), and Investigational Device Exemption (IDE). This breadth makes it the widest-scope AI device guidance document FDA has issued. A manufacturer submitting a low-risk AI-assisted imaging tool under 510(k) and a manufacturer seeking PMA approval for an AI-driven diagnostic system are both within scope.
While the guidance is most immediately relevant to machine learning applications — particularly deep learning and neural networks, which represent the large majority of currently authorized AI devices — its language explicitly opens the door to generative AI and large language model applications. FDA has not yet authorized any AI-DSF using generative AI or LLMs as of Q2 2026, but the guidance's framing anticipates future submissions in this space.
| Submission Pathway | Applies to AI-DSF Guidance | Notes on Performance Monitoring Plans |
|---|---|---|
| 510(k) | Yes | Performance monitoring plans recommended but generally not required |
| De Novo | Yes | Performance monitoring plans may be required as a special control |
| PMA | Yes | Performance monitoring plans may be required |
| HDE | Yes | Same lifecycle expectations apply |
| BLA | Yes | CBER-specific applications within scope |
| IDE | Yes | Applies during investigational phase |
The guidance is designed to complement, not replace, existing software device guidance — including the June 2023 premarket software guidance and the August 2025 final guidance on Predetermined Change Control Plans (PCCPs). Manufacturers should read the AI-DSF draft guidance as an overlay that specifies what AI-specific documentation is required within the framework those existing documents establish.
The Nine Documentation Areas: What FDA Expects for AI-DSFs Specifically
The structural core of the guidance is its nine documentation areas. Each represents a section of a marketing submission where AI-specific content is expected — content that goes beyond what standard software device submissions require. The nine areas are: Device Description, User Interface, Labeling, Risk Assessment, Data Management, Model Description and Development, Validation, Device Performance Monitoring, and Cybersecurity.

1. Device Description
The device description must explicitly state that the device uses AI and describe the AI model's role in achieving the device's intended purpose. This is not merely a technical disclosure — it establishes the regulatory framing for every subsequent documentation area. Submissions that describe AI functionality only implicitly, or that bury it within broader software architecture descriptions, will not satisfy this requirement.
2. User Interface
The user interface section carries more regulatory weight in an AI-DSF submission than in a standard software device submission. FDA links UI design directly to risk control — the way the interface presents AI outputs, communicates uncertainty, and supports appropriate clinician interpretation is treated as a mechanism for mitigating the risks that AI model errors create. Human factors validation for AI-DSFs therefore needs to address how users interact with model outputs specifically, not just with the device interface generally.
3. Labeling
Labeling is where the guidance introduces one of its most practically significant new requirements: the model card. Appendix E of the guidance provides a template for model card content. Appendix F provides a sample 510(k) summary that includes a model card, giving manufacturers a concrete reference for how this information should appear in a public submission summary. The model card is the transparency mechanism — it communicates the model's intended use, performance characteristics, known limitations, and subgroup performance data in a standardized format accessible to both reviewers and, ultimately, users.
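As a rough illustration of building model card content during development, the sketch below captures the four content types named above (intended use, performance characteristics, known limitations, subgroup performance) in a small data structure and reports which sections are still empty. The class name, field names, and sample values are hypothetical placeholders; they are not taken from the Appendix E template, which should be consulted directly for the actual headings.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class ModelCard:
    """Hypothetical container; field names are illustrative, not FDA's."""
    intended_use: str
    performance: dict            # aggregate metrics
    known_limitations: list
    subgroup_performance: dict = field(default_factory=dict)

    def missing_sections(self):
        """Return the names of sections that are still empty."""
        return [name for name, value in asdict(self).items() if not value]


# Placeholder values for illustration only, not real device data.
card = ModelCard(
    intended_use="Example only: flag suspected findings for radiologist review",
    performance={"sensitivity": 0.91, "specificity": 0.88},
    known_limitations=["Not evaluated on pediatric patients"],
)

print(card.missing_sections())
print(json.dumps(asdict(card), indent=2))
```

Here the completeness check reports the subgroup performance section as missing, which is the kind of gap that is cheaper to catch during development than during submission assembly.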
4. Risk Assessment
Risk assessment for AI-DSFs must explicitly link user interface design to risk controls. This reflects FDA's recognition that many AI-related harms in clinical settings arise not from model failure alone but from how clinicians interpret and act on model outputs. The risk assessment should therefore trace how UI design decisions — such as how confidence scores are displayed or how override mechanisms are presented — function as controls against identified risks.
5. Data Management
Data management is among the most detailed sections in the guidance. FDA requires strict segregation of training, tuning, and validation datasets — these must be maintained as distinct sets with documented governance to prevent data leakage between phases. The guidance also requires a representativeness assessment: manufacturers must demonstrate that the data used to develop and validate the model adequately represents the intended use population, including assessment of non-U.S. data sources where relevant.
This representativeness requirement is the mechanism through which demographic subgroup coverage becomes a data governance issue, not just a post-hoc analysis question. If training data does not adequately represent the intended use population across age, sex, race, and ethnicity, that gap must be identified and addressed — not merely acknowledged.
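One common way to enforce dataset segregation is to assign whole patients, rather than individual images or records, to a single partition, so that no patient contributes data to more than one set. The sketch below is a minimal example under that assumption. The function names, the 70/15/15 proportions, and the record layout are hypothetical; the guidance does not prescribe a splitting method.

```python
import hashlib


def assign_partition(patient_id: str, train=0.70, tune=0.15) -> str:
    """Deterministically map a patient to exactly one partition."""
    digest = hashlib.sha256(patient_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    if bucket < train:
        return "training"
    if bucket < train + tune:
        return "tuning"
    return "validation"


def split_records(records):
    """Group records by partition, keyed on patient rather than on image."""
    partitions = {"training": [], "tuning": [], "validation": []}
    for record in records:
        partitions[assign_partition(record["patient_id"])].append(record)
    return partitions


def leaked_patients(partitions):
    """Return patient IDs that appear in more than one partition."""
    seen, leaked = {}, set()
    for name, rows in partitions.items():
        for row in rows:
            pid = row["patient_id"]
            if seen.setdefault(pid, name) != name:
                leaked.add(pid)
    return leaked


# Synthetic records: 50 patients with 4 images each.
records = [
    {"patient_id": f"P{i % 50:03d}", "image": f"img_{i}.dcm"}
    for i in range(200)
]
partitions = split_records(records)
print({name: len(rows) for name, rows in partitions.items()})
print("leaked patients:", leaked_patients(partitions))
```

Hashing the patient identifier keeps the assignment stable when new data arrives, and the `leaked_patients` check can be rerun as documented evidence that the partitions remain disjoint. Site-level or time-based splits may be more appropriate for some devices.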
6. Model Description and Development
This section requires documentation of the model architecture, decision thresholds, and calibration methodology. For deep learning models, this means explaining architectural choices — not as a theoretical exercise, but as documentation that allows reviewers to understand how the model produces its outputs and what assumptions underlie its performance claims. Threshold selection and calibration documentation are particularly important for diagnostic AI tools where operating point choices directly affect clinical sensitivity and specificity tradeoffs.
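To make the operating point tradeoff concrete, the sketch below computes sensitivity and specificity for the same set of model scores at three candidate thresholds. The scores and labels are synthetic and the function name is hypothetical. It assumes binary labels with at least one positive and one negative case.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Treat scores >= threshold as positive calls; labels are 1 or 0."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic example: four diseased cases followed by four healthy cases.
scores = [0.95, 0.80, 0.65, 0.40, 0.70, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

for threshold in (0.25, 0.50, 0.75):
    sens, spec = sensitivity_specificity(scores, labels, threshold)
    print(f"threshold={threshold:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Raising the threshold trades sensitivity for specificity on this toy data, which is why a submission benefits from stating the chosen threshold together with the clinical rationale for it.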
7. Validation
Validation is its own documentation area, and its content requirements depend on the terminology distinction covered in the dedicated section below. Within the submission, validation must demonstrate that the device consistently fulfills its intended purpose, using "validation" in the sense defined under 21 CFR 820.3(z). The validation dataset must be strictly separated from training and tuning data. Performance must be reported across clinically relevant subgroups, and the validation study design must be appropriate to the intended use population.
8. Device Performance Monitoring
This is arguably the most structurally novel requirement in the guidance. FDA explicitly recognizes that AI-enabled devices are uniquely susceptible to performance degradation over time because AI models can be particularly sensitive to changes in data inputs — a phenomenon commonly called data drift or model drift. In response, the guidance introduces an expectation that manufacturers describe their postmarket performance monitoring plan as part of the initial premarket submission.
This is a meaningful shift in FDA's regulatory posture: clearance is no longer treated as a one-time event followed by independent postmarket operation. Manufacturers are expected to commit, at the time of submission, to how they will monitor device performance in real-world deployment and what thresholds will trigger review or intervention. As noted in the pathway table above, performance monitoring plans are generally not required for 510(k) submissions but may be required as special controls for De Novo and as conditions of approval for PMA.
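A monitoring plan needs at least one measurable drift signal and a threshold that triggers review. The sketch below uses the Population Stability Index (PSI) to compare the distribution of model output scores in deployment against a reference distribution. PSI is an assumption chosen for illustration; the guidance does not name a metric. The 0.25 alert threshold is a hypothetical placeholder that a manufacturer would need to justify for its own device.

```python
import bisect
import math


def bin_fractions(values, edges):
    """Fraction of values per bin; out-of-range values go to the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect.bisect_right(edges, v) - 1
        counts[max(0, min(i, len(counts) - 1))] += 1
    return [c / len(values) for c in counts]


def population_stability_index(reference, current, edges, eps=1e-6):
    """PSI = sum((cur - ref) * ln(cur / ref)) over bins."""
    total = 0.0
    for r, c in zip(bin_fractions(reference, edges), bin_fractions(current, edges)):
        r, c = max(r, eps), max(c, eps)  # avoid log(0) for empty bins
        total += (c - r) * math.log(c / r)
    return total


ALERT_THRESHOLD = 0.25  # hypothetical trigger for internal review
edges = [0.0, 0.25, 0.5, 0.75, 1.0]

# Synthetic score distributions: reference period vs. deployed period.
reference = [0.1] * 25 + [0.3] * 25 + [0.6] * 25 + [0.9] * 25
deployed = [0.1] * 10 + [0.3] * 15 + [0.6] * 25 + [0.9] * 50

psi = population_stability_index(reference, deployed, edges)
print(f"PSI={psi:.3f} review_needed={psi > ALERT_THRESHOLD}")
```

A score-distribution check like this only detects input or output shift. It does not measure clinical accuracy, so a credible plan would likely pair it with periodic performance checks against confirmed outcomes where those are available.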
9. Cybersecurity
The cybersecurity section extends standard device cybersecurity requirements to address vulnerabilities specific to AI systems. Manufacturers must include AI-specific elements in the cybersecurity risk management report covering: data poisoning, model inversion and model stealing, model evasion, data leakage, overfitting, model bias, and performance drift. These are not theoretical concerns — they represent documented attack vectors and failure modes for deployed AI systems that standard software cybersecurity frameworks do not fully address.
Bias Mitigation and Transparency: How the Guidance Operationalizes These Mandates
Prior FDA guidance on AI bias had largely been aspirational — acknowledging the problem and encouraging good practice without specifying what documentation was required. The January 2025 draft guidance moves bias control from aspiration to operational requirement by tying it to data representativeness throughout the product lifecycle.
The Federal Register notice accompanying the guidance states the explicit goal: ensuring that device benefits extend to all relevant demographic groups, with age, sex, race, and ethnicity called out specifically. This is not a post-hoc reporting requirement. It is embedded in data management (representativeness of training data), validation (subgroup performance testing), and labeling (model card disclosure of subgroup performance).
The practical gap this addresses is significant. A review of FDA-approved AI/ML devices found that only 46.1% provided detailed performance studies, just 1.9% linked a scientific publication with safety and efficacy data, and approximately 9% conducted a prospective study for postmarket surveillance. The guidance's formalized requirements are a direct response to this transparency deficit in the existing device landscape.
- Data Management: Training datasets must be assessed for representativeness across the intended use population, including demographic subgroups. Gaps must be documented and addressed, not merely noted.
- Validation: Performance must be reported across clinically relevant subgroups — age, sex, race, and ethnicity — not only as aggregate metrics. Subgroup performance gaps must be disclosed.
- Labeling / Model Card: The model card (Appendix E template) must communicate known performance limitations and subgroup-specific performance data in a format accessible to clinical users.
- User Interface: UI design must support appropriate interpretation of AI outputs, including communication of uncertainty and known limitations — functioning as a risk control mechanism for bias-related errors.
- Public Submission Summary: Appendix F shows how model card content should appear in the 510(k) summary, making subgroup performance data publicly accessible through FDA's submission database.
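The subgroup reporting expectation described in the list above can be sketched as a per-subgroup metric computation plus a simple gap check. The example below computes sensitivity by subgroup and flags any subgroup that falls more than a chosen margin below the overall figure. The case records are synthetic, and the function names and the 0.05 margin are hypothetical; the guidance does not set a numeric tolerance.

```python
from collections import defaultdict


def sensitivity_by_subgroup(cases, attribute):
    """Sensitivity per subgroup, computed over truly positive cases only."""
    tallies = defaultdict(lambda: {"tp": 0, "fn": 0})
    for case in cases:
        if case["truth"] != 1:
            continue
        key = "tp" if case["predicted"] == 1 else "fn"
        tallies[case[attribute]][key] += 1
    return {g: t["tp"] / (t["tp"] + t["fn"]) for g, t in tallies.items()}


def flag_gaps(by_group, overall, max_gap=0.05):
    """Subgroups whose metric is more than max_gap below the overall value."""
    return sorted(g for g, v in by_group.items() if overall - v > max_gap)


# Synthetic positives: 10 per group, 9 detected in F and 7 detected in M.
cases = (
    [{"sex": "F", "truth": 1, "predicted": 1}] * 9
    + [{"sex": "F", "truth": 1, "predicted": 0}] * 1
    + [{"sex": "M", "truth": 1, "predicted": 1}] * 7
    + [{"sex": "M", "truth": 1, "predicted": 0}] * 3
)

by_sex = sensitivity_by_subgroup(cases, "sex")
overall = sum(c["predicted"] for c in cases) / len(cases)
print(by_sex, "overall:", overall, "flagged:", flag_gaps(by_sex, overall))
```

Point estimates on small subgroups are noisy, so a real analysis would normally report confidence intervals and subgroup sample sizes alongside each figure.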
The Validation Terminology Gap: Why AI Teams and FDA Are Not Speaking the Same Language
One of the most practically important — and most frequently mishandled — issues in AI device submissions is the word "validation." The guidance explicitly addresses this terminology divergence, and submission teams need to understand it precisely.
In AI and machine learning engineering, "validation" commonly refers to evaluating model performance during development: selecting hyperparameters, comparing architectures, and tuning the model against a held-out dataset. That held-out dataset is commonly called the "validation set" in the ML training pipeline.
FDA defines validation differently. Under 21 CFR 820.3(z), validation means "confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use can be consistently fulfilled." This is a post-development confirmation activity — not a model selection or tuning activity.
| Term | AI/ML Engineering Meaning | FDA Regulatory Meaning (21 CFR 820.3(z)) |
|---|---|---|
| Validation | Model performance evaluation during development; hyperparameter tuning; architecture selection using a held-out dataset | Confirmation by examination and objective evidence that requirements for a specific intended use can be consistently fulfilled — a post-development activity |
| Validation dataset | A dataset used within the training pipeline to evaluate candidate models and guide tuning and model selection | An independent dataset used to confirm fitness for intended use, which must be strictly segregated from training and tuning data |
| Verification | Checking that the model implementation matches its specification | Confirmation that design outputs meet design input requirements — a distinct step from validation |
The practical consequence: model tuning or training activities should not appear in an FDA submission as part of the validation process. If a submission describes model selection, hyperparameter optimization, or architecture comparison as "validation," FDA reviewers are likely to misread it, which may trigger requests for clarification or additional information that delay review.
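A first-pass terminology check on draft submission text can be automated with a crude heuristic: flag any sentence that uses a form of "validation" alongside vocabulary typical of model development. The sketch below is one such heuristic. The keyword list and function name are hypothetical, and the output is a list of candidates for human review, not a determination that a sentence is wrong.

```python
import re

# Development-phase vocabulary; extend for your own documents.
ML_CONTEXT = re.compile(
    r"hyperparameter|architecture|tuning|tuned|epoch|early stopping|model selection",
    re.IGNORECASE,
)
VALIDATION = re.compile(r"\bvalidat\w*", re.IGNORECASE)


def flag_sentences(text):
    """Return sentences that mix 'validation' wording with ML development terms."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if VALIDATION.search(s) and ML_CONTEXT.search(s)]


draft = (
    "Hyperparameters were selected on the validation set. "
    "Clinical validation used an independent multi-site dataset. "
    "The tuning set contained 2,000 studies."
)

for sentence in flag_sentences(draft):
    print("REVIEW:", sentence)
```

In this sample only the first sentence is flagged, because it uses "validation" in the engineering sense. The second sentence uses the term in the regulatory sense and passes.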
How the January 2025 Draft Fits with the PCCP Final Guidance and QMSR
The January 2025 draft guidance does not stand alone. It is designed to work in tandem with two other major regulatory instruments that together define the current framework for AI device governance in the United States.
The first is the Predetermined Change Control Plan (PCCP) final guidance (docket FDA-2022-D-2628). Originally finalized in December 2024, this guidance was subsequently updated and is now published in its August 2025 final form. The PCCP guidance governs how manufacturers can pre-authorize planned modifications to AI-enabled devices — describing the modifications they anticipate making, the methodology for developing and validating those modifications, and an assessment of their impact — so that those changes can be implemented post-clearance without requiring a new submission for each update.
Together, the January 2025 AI-DSF draft guidance and the August 2025 PCCP final guidance form the near-complete regulatory architecture for adaptive AI devices. The AI-DSF guidance governs what must be documented in the initial submission and what postmarket monitoring commitments must be made upfront. The PCCP guidance governs how planned post-clearance modifications are pre-authorized. A manufacturer developing an AI device that will be updated over time needs both documents.
- AI-DSF draft guidance (FDA-2024-D-4488): Governs initial submission content across nine documentation areas, including the postmarket monitoring plan that must appear in the initial submission.
- PCCP final guidance, August 2025 (FDA-2022-D-2628): Governs pre-authorized planned modifications — the three-part structure of modification description, modification protocol, and impact assessment — allowing adaptive updates without new submissions.
- QMSR (Quality Management System Regulation): Effective February 2, 2026, the QMSR aligns FDA's quality system requirements with ISO 13485:2016. Its design control, data management, and adverse event reporting requirements interact directly with the documentation expectations in the AI-DSF draft guidance.
The QMSR's February 2026 effective date is particularly relevant for manufacturers currently preparing submissions. Quality system processes — design controls, data management governance, risk management plans, labeling controls — must now comply with QMSR requirements, and those same processes are the organizational infrastructure that produces the documentation the AI-DSF guidance requires. A manufacturer whose quality system is not yet aligned with QMSR will face compounding compliance gaps when preparing an AI-DSF submission.
What Manufacturers and Health Systems Need to Do Now
The guidance is non-binding in draft form. But "non-binding" does not mean "not operative." FDA reviewers are applying these expectations in submissions now, and manufacturers preparing submissions over the next 12 to 18 months should treat the nine documentation areas as the current standard for what a complete AI-DSF submission looks like.
For Device Manufacturers and Regulatory Affairs Teams
- Audit existing submission templates against the nine documentation areas. Identify which sections are absent or insufficiently AI-specific. Device Description, Data Management, and Device Performance Monitoring are the areas most likely to require new content rather than adaptation of existing software documentation.
- Establish demographic subgroup testing protocols before validation begins. Retrospectively adding subgroup analyses after primary validation is complete is technically possible but operationally difficult and may not satisfy FDA's expectation that representativeness is addressed throughout the development process.
- Adopt FDA's validation terminology in all submission documentation now. Conduct a terminology audit of existing submission templates, protocols, and SOPs. Replace AI engineering uses of "validation" (model tuning, architecture selection, hyperparameter optimization) with accurate terms before those documents enter a submission.
- Prepare model cards aligned with Appendix E even before the guidance is finalized. The model card structure FDA has proposed is specific enough to serve as a template. Building model card preparation into the development process — rather than treating it as a post-development documentation task — reduces submission preparation time and improves the quality of the output.
- Develop postmarket performance monitoring plans as part of the submission design, not as an afterthought. For De Novo and PMA submissions, these plans may be required. For 510(k) submissions, they are recommended but generally not required. Defining monitoring metrics, drift detection thresholds, and intervention triggers requires input from clinical, engineering, and regulatory teams and takes time to develop credibly.
- Verify QMSR alignment before finalizing any AI-DSF submission. The QMSR took effect February 2, 2026. Quality system documentation referenced in a submission must reflect QMSR-compliant processes.
For Health System Procurement Teams
The guidance creates a new set of evaluable procurement criteria that health systems should be incorporating into vendor assessments now.
- Request model cards from vendors. If a vendor has submitted under the AI-DSF framework, they should have a model card. If they cannot provide one, ask why. Absence of a model card is an informative signal about the vendor's transparency practices and submission quality.
- Ask for subgroup performance data. The guidance requires demographic subgroup testing. Vendors should be able to provide performance breakdowns by age, sex, race, and ethnicity for the intended use population. Aggregate performance metrics alone are insufficient for assessing equity and safety across your patient population.
- Evaluate postmarket monitoring commitments. Ask vendors what their postmarket performance monitoring plan covers, what metrics they track, and what thresholds trigger review or notification to health system customers. A vendor with a credible monitoring plan has made a regulatory commitment to lifecycle accountability.
- Confirm PCCP status for adaptive AI tools. If you are evaluating an AI device that will be updated over time, ask whether the vendor has an FDA-authorized PCCP. A PCCP means planned updates are pre-authorized — the absence of one means each significant update may require a new submission, creating deployment uncertainty.
Open Questions: Finalization Status, Generative AI Coverage, and What Remains Unresolved
The January 2025 draft guidance advances the regulatory architecture for AI devices further than any prior single document. But several significant questions remain open as of Q2 2026.
Finalization Status
The guidance remains in draft form. FDA accepted public comments through April 7, 2025, and held a public webinar on February 18, 2025. Comments addressed four specific questions: alignment with the AI lifecycle, adequacy for generative AI, the approach to performance monitoring as risk mitigation, and what information about AI devices should be conveyed to users. The finalization timeline has not been publicly announced. Readers should verify current status directly with the FDA guidance document page before making compliance decisions based on this analysis.
Generative AI and Large Language Models
The guidance's language opens the door to generative AI submissions, and FDA's Digital Health Advisory Committee has identified the total product lifecycle (TPLC) approach as the foundation for regulating GenAI-enabled medical devices. However, the TPLC framework presents genuine technical challenges when applied to LLMs. Unlike conventional predictive models, which typically return the same output for the same input, LLM outputs can vary from run to run, and in practice some variation can persist even when model temperature is set to zero. This complicates the validation framework the guidance describes, which presupposes that a model can be tested against defined performance specifications.
FDA has not yet authorized any AI-DSF using generative AI or LLMs. The regulatory path for such submissions remains genuinely unresolved, and the January 2025 guidance does not provide a complete answer. Manufacturers developing LLM-based clinical tools should monitor FDA's Digital Health Center of Excellence communications closely, as additional guidance specific to generative AI applications is likely to be needed before a credible submission pathway exists.
Postmarket Monitoring: Recommended vs. Required
The guidance recommends postmarket performance monitoring plans for all submissions but does not uniformly require them. For 510(k) submissions, plans are recommended but generally not required. For De Novo and PMA submissions, they may be required as special controls or conditions of approval. This creates an asymmetry: manufacturers submitting under 510(k) may face less formal monitoring accountability than those pursuing De Novo or PMA pathways, even for devices with comparable clinical risk profiles. Whether finalization of the guidance will tighten this requirement across pathways remains to be seen.
- Finalization timeline: Not announced. The guidance remains in draft as of Q2 2026. Public comment closed April 7, 2025.
- Generative AI pathway: Not yet resolved. No GenAI or LLM-based AI-DSF has been authorized. The non-deterministic output challenge for LLMs is not addressed in the current draft.
- Postmarket monitoring binding status: Recommended for all pathways; may be required for De Novo (as a special control) and PMA (as a condition of approval). 510(k) monitoring plans remain generally non-mandatory.
- Ninth documentation area: Secondary sources differ on whether the ninth area is "Cybersecurity" alone or includes a separate "Public Submission Summary" section. Manufacturers should verify section headings directly against the FDA guidance PDF before structuring their submission.