Clinical Evaluation for SaMD and AI-Based Medical Devices: Applying MDR Principles to Algorithms

The EU Medical Device Regulation (2017/745) has reinforced and expanded clinical evaluation expectations for Software as a Medical Device (SaMD) and AI-based technologies. While MDR principles apply equally to software and hardware, manufacturers of software can no longer rely solely on simplified performance metrics. Clinical evaluation must instead be anchored in robust clinical evidence: software verification and validation combined with clinical validation that demonstrates clinical performance, benefit–risk acceptability and evidence sufficiency across the device lifecycle.

Notified Bodies increasingly scrutinise how manufacturers establish and maintain clinical performance, benefit–risk acceptability and lifecycle evidence sufficiency for SaMD and AI-based devices. Effective application of MDR principles to algorithm-based technologies therefore requires a clear understanding of regulatory expectations and a structured, lifecycle-oriented approach to clinical evidence generation and maintenance.

MDR Clinical Evaluation Principles Applied to Software

Under MDR, clinical evaluation must demonstrate that a device achieves its intended medical purpose and maintains an acceptable benefit–risk profile. For software, this process is set out in MDCG 2020-1 and aligns with the internationally recognized IMDRF guidance on Software as a Medical Device (SaMD) clinical evaluation, which structures the assessment around valid clinical association, analytical validation and clinical validation. Clinical evaluation must therefore focus on how software outputs support clinically meaningful decisions rather than on technical performance alone.

Notified Bodies expect manufacturers to articulate how software functions translate into diagnostic, therapeutic or decision-support benefits for the intended users. This includes a clear definition of intended use, target population, clinical context and user interaction. Ambiguity in these areas often leads to challenges during conformity assessment.

Defining Clinical Performance for Algorithms

One of the most critical challenges in SaMD clinical evaluation is defining and measuring clinical performance. Unlike traditional devices, software performance cannot be assessed solely through physical characteristics or bench testing.

Notified Bodies expect clinical performance to be demonstrated using appropriate endpoints that reflect real-world clinical use. This may include diagnostic accuracy, sensitivity and specificity, clinical decision impact or patient outcome measures, depending on the intended purpose of the software.
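
By way of illustration, the sketch below shows one way such endpoints might be computed from a validation dataset, with Wilson score confidence intervals around the point estimates. The function names and data are hypothetical; a real study would follow a pre-specified statistical analysis plan.

```python
# Illustrative sketch: computing diagnostic accuracy endpoints for a
# binary-output SaMD against a clinical reference standard. Names and
# data are hypothetical; real studies require a pre-specified protocol.
from math import sqrt

def wilson_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score confidence interval for a proportion."""
    if total == 0:
        return (0.0, 0.0)
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    margin = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return (centre - margin, centre + margin)

def diagnostic_performance(predictions: list[int], reference: list[int]) -> dict:
    """Sensitivity and specificity versus the clinical reference standard."""
    tp = sum(p == 1 and r == 1 for p, r in zip(predictions, reference))
    fn = sum(p == 0 and r == 1 for p, r in zip(predictions, reference))
    tn = sum(p == 0 and r == 0 for p, r in zip(predictions, reference))
    fp = sum(p == 1 and r == 0 for p, r in zip(predictions, reference))
    return {
        "sensitivity": (tp / (tp + fn), wilson_ci(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_ci(tn, tn + fp)),
    }
```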

Performance metrics must be justified within the context of current clinical practice and state of the art. This reinforces the importance of defensible clinical and performance evaluation approaches, where software validation is grounded in clinical evidence rather than technical benchmarks alone.

Clinical Evidence Sources for SaMD and AI Devices

Clinical evidence for SaMD may be generated from multiple sources including clinical investigations, literature, retrospective data analyses, real-world evidence and post-market data. Notified Bodies assess whether the chosen evidence sources are appropriate for the software’s risk class, novelty and clinical claims.

Where literature is used, manufacturers must demonstrate that published data is directly applicable to the specific software version, algorithm logic and intended use. This requirement often necessitates robust medical device literature search protocols and critical reviews to ensure relevance and reproducibility of the findings.

For AI-based devices, access to training and validation datasets, as well as transparency around algorithm development, plays a significant role in clinical evaluation defensibility.
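
For illustration, the sketch below captures the kind of dataset summary record a Clinical Evaluation Report could reference when documenting dataset provenance and independence. All field names and values are hypothetical.

```python
# Illustrative sketch: a dataset summary record supporting algorithm
# transparency in the clinical evaluation. Field names and values are
# hypothetical examples of characteristics reviewers typically probe.
from dataclasses import dataclass

@dataclass(frozen=True)
class DatasetSummary:
    name: str
    role: str                # "training" | "tuning" | "validation"
    n_cases: int
    sites: int               # number of contributing clinical sites
    population: str          # demographics / inclusion criteria summary
    reference_standard: str  # how ground truth was established
    independent_of_training: bool

holdout = DatasetSummary(
    name="EU multi-site hold-out", role="validation", n_cases=1200,
    sites=5, population="adults 18-85, screening cohort",
    reference_standard="histopathology", independent_of_training=True,
)
```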

State of the Art and Algorithm Benchmarking

Defining state of the art (SOTA) is particularly complex for SaMD and AI technologies due to rapid innovation and evolving clinical practice. Notified Bodies expect SOTA to reflect not only comparable software solutions, but also alternative clinical pathways and non-software interventions.

Benchmarking algorithm performance against SOTA requires careful justification of comparator selection and performance thresholds. Weak or outdated SOTA definitions can undermine claims of clinical benefit and raise questions about evidence sufficiency.
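
As a simple illustration, the sketch below encodes a common acceptance rule: the lower bound of the confidence interval for the device metric, not merely the point estimate, must exceed a performance goal derived from the SOTA review. The goal and values shown are hypothetical and would need clinical justification in the Clinical Evaluation Report.

```python
# Illustrative sketch: testing a device metric against a pre-specified,
# SOTA-derived performance goal. The goal value and acceptance rule are
# hypothetical; real thresholds must be clinically justified in the CER.
def meets_performance_goal(ci_lower: float, sota_goal: float) -> bool:
    """Acceptance rule: the lower confidence bound, not the point
    estimate, must exceed the SOTA-derived goal."""
    return ci_lower > sota_goal

# Example: device sensitivity 0.93 with a 95% CI lower bound of 0.88,
# compared against a goal of 0.85 documented in the benchmarking rationale.
print(meets_performance_goal(ci_lower=0.88, sota_goal=0.85))  # True
```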

These challenges reinforce the importance of maintaining SOTA definitions as living elements within the Clinical Evaluation Report, aligned with evolving clinical standards and technological capabilities.

Article 61 and Evidence Sufficiency for SaMD

Demonstrating sufficient clinical evidence under Article 61 is a critical requirement for SaMD and AI-based devices. Notified Bodies assess whether available clinical data adequately supports safety and performance claims, considering the software’s level of autonomy, clinical impact and risk profile.

For adaptive or continuously learning algorithms, manufacturers must explain how evidence sufficiency is maintained over time, particularly when software updates may alter performance characteristics. This expectation is consistent with MDCG 2020-6 guidance on sufficient clinical evidence, which emphasizes proportionality and lifecycle-based evidence assessment.
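
One way this can be operationalised, sketched below under illustrative assumptions, is a regression gate that compares a retrained model's results on a locked test set against the baseline values underpinning the current clinical evidence; any regression beyond a pre-defined tolerance triggers a clinical-evaluation impact review. The thresholds and metric names are hypothetical.

```python
# Illustrative sketch: a regression gate comparing a retrained model's
# locked-test-set metrics against the baseline that supports the current
# clinical evidence. Baseline values and tolerance are hypothetical.
BASELINE = {"sensitivity": 0.92, "specificity": 0.88}
TOLERANCE = 0.02  # maximum acceptable drop before an impact review

def evidence_impact_review_needed(new_metrics: dict[str, float]) -> list[str]:
    """Return the metrics that regressed beyond tolerance, any of which
    triggers a clinical-evaluation impact assessment for the update."""
    return [metric for metric, base in BASELINE.items()
            if new_metrics.get(metric, 0.0) < base - TOLERANCE]

flagged = evidence_impact_review_needed({"sensitivity": 0.89, "specificity": 0.90})
print(flagged)  # ['sensitivity'] -> existing evidence alone no longer suffices
```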

Managing Software Updates and Lifecycle Changes

Software updates represent one of the most significant regulatory challenges for AI-based medical devices, as Notified Bodies expect manufacturers to rigorously assess whether modifications impact clinical performance, safety or benefit–risk acceptability.

Under MDCG 2020-3, a structured change management process is essential to ensure that updates with a significant clinical impact, such as model retraining, are documented within the clinical evaluation and supported by appropriate evidence. Integrating these activities within broader medical device lifecycle management practices helps maintain traceability and regulatory confidence.
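
As an illustration of such traceability, the sketch below models a change-assessment record that links each update to a documented significance decision and the resulting evidence actions. The field names and the simplified decision rule are hypothetical and do not reproduce the MDCG 2020-3 criteria verbatim.

```python
# Illustrative sketch: a structured change-assessment record so each
# software update leaves a traceable trail from change description to
# significance decision and evidence actions. Fields are hypothetical.
from dataclasses import dataclass, field

@dataclass
class ChangeAssessment:
    change_id: str
    description: str                      # e.g. "model retrained on new data"
    affects_intended_use: bool
    affects_clinical_performance: bool
    significant: bool = field(init=False)
    evidence_actions: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Simplified decision rule; the real assessment follows the
        # MDCG 2020-3 flowcharts and the manufacturer's QMS procedure.
        self.significant = (self.affects_intended_use
                            or self.affects_clinical_performance)

change = ChangeAssessment(
    change_id="CHG-042", description="model retrained on expanded dataset",
    affects_intended_use=False, affects_clinical_performance=True,
)
if change.significant:
    change.evidence_actions.append("update clinical evaluation and re-verify")
```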

Post-market surveillance (PMS) and post-market clinical follow-up (PMCF) play a vital role in monitoring real-world software performance and identifying emerging risks.

Post-Market Evidence for SaMD and AI Devices

Under MDR, post-market evidence is a core component of clinical evaluation for SaMD. Notified Bodies expect PMS systems to capture performance trends, user feedback and real-world outcomes associated with software to ensure the algorithm remains safe and effective in clinical practice.
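
For illustration, the sketch below flags an adverse trend when the rolling mean of a real-world performance indicator falls below a pre-defined PMS threshold. The indicator, window and threshold are all hypothetical.

```python
# Illustrative sketch: flagging adverse trends in a real-world
# performance indicator during periodic PMS review. The indicator,
# rolling window and alert threshold are hypothetical.
from statistics import mean

def trend_alert(period_values: list[float], window: int = 3,
                floor: float = 0.85) -> bool:
    """Alert when the rolling mean of the indicator drops below the
    pre-defined PMS threshold."""
    if len(period_values) < window:
        return False
    return mean(period_values[-window:]) < floor

# Quarterly agreement rate between software output and confirmed diagnosis
quarterly = [0.91, 0.90, 0.87, 0.84, 0.82]
print(trend_alert(quarterly))  # True -> trigger PMS investigation / PMCF input
```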

PMCF activities, as outlined in MDCG 2020-7, are necessary to address residual uncertainties, particularly for AI systems deployed in complex or high-risk clinical settings. These expectations are consistent with broader regulatory perspectives from the European Medicines Agency on digital health and artificial intelligence, which highlight the importance of lifecycle oversight for adaptive technologies.

Conclusion: Applying MDR Clinical Evaluation Principles to Algorithms

While MDR principles apply consistently across medical device technologies, SaMD and AI-based devices require careful adaptation of traditional clinical evaluation methodologies. Notified Bodies expect manufacturers to demonstrate clear clinical relevance, defensible performance validation and ongoing evidence sufficiency throughout the software lifecycle.

Manufacturers that approach clinical evaluation for SaMD as a continuous, evidence-driven process integrated with post-market surveillance and change management are better positioned to meet regulatory expectations and sustain compliance. Applying MDR principles thoughtfully to algorithms is essential not only for regulatory approval, but for maintaining trust in software-driven clinical decision-making.

How Freyr Supports Clinical Evaluation for SaMD and AI Devices

Clinical evaluation for SaMD and AI-based devices requires specialised expertise at the intersection of clinical science, regulatory strategy and software development. Freyr supports manufacturers in developing defensible Clinical Evaluation Reports, defining appropriate clinical performance endpoints and aligning evidence strategies with MDR and international guidance.

Freyr’s experts assist with literature-based evaluations, clinical evidence gap assessments, post-market evidence integration and Notified Body interactions for software-driven technologies. For support with SaMD or AI clinical evaluation, CER development or evidence strategy under EU MDR, speak to a Freyr expert to discuss your regulatory challenges.
