Conformy
Technical Documentation (Annex IV)

EXAMPLE — this document shows the structure and quality of a generated draft, based on the fictional AI system “DermaScan AI”.


Technical documentation per EU AI Act — Article 11 + Annex IV

Table of Contents

  • 1. General description of the AI system
  • 2. Design and development process
  • 3. Data and data governance
  • 4. Testing and validation
  • 5. Risk management system
  • 6. Human oversight measures
  • 7. Transparency and information to users
  • 8. Accuracy, robustness and cybersecurity
  • 9. Post-market monitoring plan

1. General description of the AI system

Article 11 + Annex IV, Section 1

1.1 Purpose of the AI System

DermaScan AI is a medical device AI system designed to analyze dermoscopic images of skin lesions and provide classification support for clinical decision-making. The system's primary purpose is to assist general practitioners in the early detection of skin cancer, with particular focus on melanoma identification.

The AI system processes dermoscopic images through automated analysis algorithms and assigns each lesion to one of two primary risk categories, benign or malignant, derived from the multi-class analysis described in section 2.2. Upon completion of the image analysis, the system generates a quantitative risk score ranging from 0 to 100, where higher scores indicate increased likelihood of malignancy. This numerical assessment is accompanied by a clinical recommendation advising whether the patient should be referred to a specialist for further evaluation and potential biopsy procedures.

The intended clinical workflow involves general practitioners capturing or uploading dermoscopic images of suspicious skin lesions into the DermaScan AI system. The system then applies its trained algorithms to analyze morphological features, color patterns, and other relevant characteristics within the lesion imagery. The resulting classification output, risk score, and referral recommendation are presented to the healthcare provider to support their clinical decision-making process regarding patient care pathways.

This AI system is specifically designed to operate as a clinical decision support tool rather than as a replacement for medical judgment. The system's recommendations are intended to augment the diagnostic capabilities of general practitioners, particularly those who may have limited specialized training in dermatology, thereby potentially improving early detection rates of skin cancer in primary care settings.

The scope of the system's purpose encompasses screening and triage functions within the broader melanoma detection workflow, supporting healthcare providers in identifying patients who would benefit from specialist referral while potentially reducing unnecessary referrals for clearly benign lesions.

1.2 Provider

The provider of this AI system is MedVision AB, a corporation established under Swedish law and headquartered in Stockholm, Sweden. In accordance with Article 11 of the EU AI Act and Annex IV, Section 1, MedVision AB assumes the responsibilities and obligations of a provider as defined under the regulation, including ensuring compliance with applicable requirements for high-risk AI systems throughout the system's lifecycle.

MedVision AB maintains its principal place of business at its Stockholm headquarters, from which it oversees the development, deployment, and ongoing maintenance of the AI system. As the provider, MedVision AB is responsible for the design, development, and placing on the market of the AI system, ensuring that it meets all applicable conformity requirements before being made available to users within the European Union.

The company's role as provider encompasses the establishment and maintenance of the quality management system, risk management procedures, technical documentation, and post-market monitoring activities as required under the EU AI Act. MedVision AB also maintains ultimate responsibility for ensuring the AI system's compliance with applicable harmonized standards and conformity assessment procedures.

1.3 Version

The current version of the AI system is v2.1.0. This version identifier follows semantic versioning conventions and represents the specific iteration of the system that is subject to the requirements under Article 11 and Annex IV, Section 1 of the EU AI Act.

The version designation encompasses all core algorithmic components, training datasets, model parameters, and associated software modules that constitute the complete AI system implementation. Version v2.1.0 includes any updates, modifications, or improvements made to previous iterations while maintaining compatibility with the documented intended purpose and operational specifications.

This version information serves as a critical reference point for regulatory compliance, ensuring traceability of the specific AI system configuration that has undergone conformity assessment procedures and meets the applicable requirements for high-risk AI systems as defined under the EU AI Act.

1.4 Interaction with other software/hardware

The AI system operates within a complex healthcare technology ecosystem, requiring integration with multiple software and hardware components to deliver clinical decision support functionality.

1.4.1 Electronic Health Record Integration

The system integrates with regional electronic health record (EHR) systems through standardized healthcare interoperability protocols. Specifically, the system connects to Cosmic and Millennium EHR platforms via Fast Healthcare Interoperability Resources (FHIR) API interfaces. This integration enables bidirectional data exchange, allowing the system to access relevant patient information and return diagnostic results directly to the patient's electronic health record as clinical decision support recommendations.

The FHIR API integration ensures compliance with healthcare data exchange standards and maintains data integrity throughout the diagnostic workflow. Results generated by the AI system are written back to the patient's health record in a structured format that healthcare professionals can readily interpret and act upon.
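
For illustration, the sketch below shows how such a structured write-back could be expressed as a FHIR R4 DiagnosticReport posted over the REST interface. The endpoint URL, authentication token, and field selection are assumptions for illustration, not the documented integration contract with Cosmic or Millennium.

```python
# Illustrative sketch: write an AI finding back to the EHR as a FHIR R4
# DiagnosticReport. Server URL, token, and identifiers are hypothetical.
import requests

FHIR_BASE = "https://ehr.example.org/fhir"  # assumed endpoint

def post_diagnostic_report(patient_id: str, risk_score: int,
                           conclusion: str, token: str) -> str:
    report = {
        "resourceType": "DiagnosticReport",
        "status": "final",
        "code": {"text": "Dermoscopic AI lesion assessment"},
        "subject": {"reference": f"Patient/{patient_id}"},
        "conclusion": f"{conclusion} (risk score {risk_score}/100)",
    }
    resp = requests.post(
        f"{FHIR_BASE}/DiagnosticReport",
        json=report,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/fhir+json"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["id"]  # server-assigned resource id
```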

1.4.2 Image Acquisition Hardware

The system processes dermoscopic images captured through specialized medical imaging hardware. Standard dermoscopy equipment, such as the DermLite DL5 dermatoscope, serves as the primary image acquisition device. These devices connect to tablets or computer workstations that interface with the AI system, enabling healthcare professionals to capture high-quality dermoscopic images suitable for automated analysis.

The image acquisition workflow requires the dermoscopic camera to be properly calibrated and connected to a computing device capable of transmitting images to the AI system for processing. This hardware configuration ensures that captured images meet the technical specifications necessary for accurate AI-based diagnostic analysis.

1.4.3 Image Storage and Management

The system interfaces with internal Picture Archiving and Communication System (PACS) servers for comprehensive image storage and management. This integration ensures that all processed dermoscopic images are systematically archived according to medical imaging standards and regulatory requirements. The PACS integration maintains traceability of diagnostic images and supports audit capabilities required for medical device compliance.

The connection to PACS infrastructure enables long-term storage, retrieval, and management of dermoscopic images processed by the AI system, supporting both immediate clinical needs and retrospective analysis requirements.

1.5 Delivery Model

The AI system is delivered as a cloud-based Software as a Service (SaaS) solution. Under this delivery model, the system operates entirely within the provider's cloud infrastructure, with users accessing functionality through web-based interfaces and API endpoints without requiring local installation or maintenance of the underlying AI components.

The SaaS deployment architecture ensures that all core AI processing, model inference, and data handling operations are executed on provider-managed cloud servers. Users interact with the system through secure web applications and programmatic interfaces, while the provider maintains full control over system updates, security patches, and infrastructure scaling. This centralized approach enables consistent performance monitoring, compliance oversight, and rapid deployment of improvements across the entire user base.

The cloud-based delivery model facilitates compliance with the technical documentation requirements outlined in Article 11 and Annex IV, Section 1 of the EU AI Act, as it provides centralized logging, audit trails, and standardized operational procedures that support regulatory oversight and risk management activities.

1.6 Hardware Requirements

The AI system operates within a distributed hardware architecture that encompasses client-side devices for data acquisition and interaction, as well as cloud-based infrastructure for computational processing. This configuration ensures compliance with the technical specifications outlined in Annex IV, Section 1 of the EU AI Act regarding hardware and software interaction requirements.

1.6.1 Client-Side Hardware Requirements

The system requires client-side hardware consisting of two primary components for proper operation. Users must have access to a tablet or computer equipped with a modern web browser capable of supporting the system's user interface and data transmission protocols. Additionally, dermatoscopic image acquisition requires a dermatoscope fitted with a camera adapter; the DermLite DL5 serves as the reference device, and equivalent devices meeting comparable technical standards are also supported.

The web browser must support contemporary HTML5, CSS3, and JavaScript implementations to ensure proper rendering of the user interface and secure data transmission. The dermatoscope hardware specification ensures standardized image quality and metadata capture necessary for the AI system's analytical processes.

1.6.2 Server Infrastructure Requirements

The computational backend of the AI system operates on a GPU cluster infrastructure hosted within Microsoft Azure's Sweden Central region. The server configuration utilizes NVIDIA A10 graphics processing units, which provide the necessary computational capacity for the AI model's inference operations and data processing requirements.

This cloud-based infrastructure arrangement ensures scalable computational resources while maintaining data processing within the European Economic Area, supporting compliance with applicable data protection regulations and the geographical requirements that may apply to high-risk AI systems under the AI Act framework.

1.7 Intended Users

The AI system is designed for use by physicians as the primary intended user category. These medical professionals represent qualified healthcare practitioners who possess the necessary medical training, clinical expertise, and professional credentials required to operate the system within their scope of medical practice.

The physician user base encompasses medical doctors who have completed formal medical education, hold valid medical licenses in their respective jurisdictions, and maintain current professional standing within the healthcare system. These users are expected to possess the clinical knowledge and decision-making capabilities necessary to interpret the AI system's outputs within the context of patient care and medical diagnosis or treatment protocols.

The system's design acknowledges that physician users operate within regulated healthcare environments and are bound by professional medical standards, ethical guidelines, and clinical protocols. The user interface and system functionality are tailored to integrate with established medical workflows and support evidence-based clinical decision-making processes that physicians employ in their professional practice.

As required under Article 11 and Annex IV, Section 1 of the EU AI Act, this user specification ensures that the system's intended purpose aligns with the professional capabilities and regulatory context of the designated user group, supporting appropriate deployment and responsible use within healthcare settings.

1.8 Affected Persons

The AI system directly affects patients who are the primary subjects of the system's processing and decision-making capabilities. Patients represent the core population whose data, health conditions, and treatment outcomes are processed by the system to generate outputs that may influence their healthcare delivery.

As affected persons under Article 11 and Annex IV, Section 1 of the EU AI Act, patients are individuals whose rights, health, safety, or fundamental freedoms may be materially impacted by the system's operation. The system processes patient-related information and generates outputs that may be used in clinical decision-making processes, treatment planning, diagnostic support, or other healthcare interventions that directly concern patient welfare.

The patient population affected by this system encompasses individuals seeking healthcare services within the system's operational scope, including those undergoing diagnostic procedures, treatment monitoring, or other clinical processes where the AI system's outputs may inform healthcare providers' decisions or recommendations.

2. Design and development process

Annex IV, Section 2

2.1 AI/ML Technique

The AI system employs a combination of deep learning neural networks and supervised machine learning techniques as its core artificial intelligence methodology. This hybrid approach leverages the representational power of deep neural architectures while utilizing supervised learning paradigms to ensure reliable and predictable system behavior through labeled training data.

The deep learning component utilizes multi-layered neural networks capable of automatically learning hierarchical feature representations from input data. These neural architectures are designed to progressively extract increasingly complex patterns and abstractions through their successive layers, enabling the system to capture intricate relationships within the data that would be challenging to encode through traditional rule-based approaches.

The supervised learning methodology provides the foundational training framework, wherein the neural networks are trained using labeled datasets containing input-output pairs. This approach ensures that the system learns to map inputs to desired outputs based on ground truth labels, facilitating measurable performance evaluation and enabling systematic optimization of model parameters through backpropagation and gradient descent algorithms.
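
As a minimal sketch of this training framework, the loop below pairs labelled batches with a cross-entropy objective and updates parameters via backpropagation and gradient descent. The model, data loader, optimizer, and device are placeholders, not the system's actual training configuration.

```python
# Generic supervised training loop: labelled input-output pairs,
# cross-entropy loss, backpropagation, gradient-descent updates.
import torch
import torch.nn as nn

def train_one_epoch(model: nn.Module, loader, optimizer, device: str = "cuda") -> None:
    criterion = nn.CrossEntropyLoss()
    model.train()
    for images, labels in loader:          # ground-truth labelled batches
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)  # compare predictions to labels
        loss.backward()                          # backpropagation
        optimizer.step()                         # parameter update
```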

The integration of these techniques allows the system to combine the automated feature learning capabilities of deep neural networks with the structured, goal-oriented learning process characteristic of supervised learning. This combination provides both the flexibility to handle complex, high-dimensional data and the reliability associated with training on verified labeled examples.

The specific neural network architectures employed, detailed training procedures, computational requirements, and performance optimization strategies are elaborated in subsequent sections of this documentation in accordance with Annex IV, Section 2 requirements for comprehensive system design specification.

2.2 Decision logic

The AI system implements a structured decision-making process designed to support dermatological diagnosis through automated image analysis and risk stratification. The system's decision logic follows a sequential pipeline architecture that processes dermoscopic images to generate diagnostic recommendations with associated confidence measures.

2.2.1 Input processing and preprocessing

The decision logic initiates when a physician uploads a dermoscopic image through the web interface. The system applies a standardised preprocessing pipeline to ensure consistent input characteristics across all analyses. Images undergo automatic resizing to a fixed resolution of 299×299 pixels, which corresponds to the input requirements of the underlying neural network architecture. The preprocessing stage incorporates colour normalisation procedures to compensate for variations in imaging equipment and lighting conditions. Additionally, the system performs automated quality assessment checks, evaluating image sharpness and lighting adequacy to determine whether the input meets the minimum quality standards required for reliable analysis.
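
A minimal sketch of this preprocessing stage is shown below; the normalisation scheme and sharpness threshold are illustrative assumptions rather than the system's calibrated values.

```python
# Sketch of the documented pipeline: resize to 299x299, channel-wise colour
# normalisation, and a simple sharpness check as the quality gate.
import numpy as np
from PIL import Image

TARGET_SIZE = (299, 299)
SHARPNESS_THRESHOLD = 100.0  # assumed cutoff for the quality check

def preprocess(path: str) -> np.ndarray:
    img = Image.open(path).convert("RGB").resize(TARGET_SIZE, Image.BILINEAR)
    arr = np.asarray(img, dtype=np.float32) / 255.0
    # Per-channel normalisation to compensate for lighting/equipment variation.
    return (arr - arr.mean(axis=(0, 1))) / (arr.std(axis=(0, 1)) + 1e-6)

def is_sharp_enough(path: str) -> bool:
    # Variance of a discrete Laplacian as a crude focus measure.
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float32)
    lap = (np.roll(gray, 1, 0) + np.roll(gray, -1, 0)
           + np.roll(gray, 1, 1) + np.roll(gray, -1, 1) - 4 * gray)
    return float(lap.var()) >= SHARPNESS_THRESHOLD
```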

2.2.2 Feature extraction and classification architecture

The core decision logic employs an EfficientNet-B4 convolutional neural network model for automated feature extraction from the preprocessed dermoscopic images. This architecture has been selected for its demonstrated performance in medical image classification tasks while maintaining computational efficiency. The extracted features are processed through a classification head equipped with a softmax activation function, which generates probability distributions across seven distinct diagnostic categories: melanoma, basal cell carcinoma, squamous cell carcinoma, benign nevus, dermatofibroma, vascular lesion, and actinic keratosis.
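
For illustration, the sketch below instantiates such a classifier in PyTorch: an EfficientNet-B4 backbone whose final layer is replaced with a seven-way head, with a softmax over the outputs. The weight initialisation shown (ImageNet) is an assumption.

```python
# EfficientNet-B4 backbone with a seven-way classification head and softmax.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 7  # melanoma, BCC, SCC, benign nevus, dermatofibroma,
                 # vascular lesion, actinic keratosis

model = models.efficientnet_b4(weights=models.EfficientNet_B4_Weights.DEFAULT)
in_features = model.classifier[1].in_features  # 1792 for EfficientNet-B4
model.classifier[1] = nn.Linear(in_features, NUM_CLASSES)
model.eval()

@torch.no_grad()
def classify(batch: torch.Tensor) -> torch.Tensor:
    """batch: (N, 3, H, W) preprocessed images -> (N, 7) class probabilities."""
    return torch.softmax(model(batch), dim=1)
```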

2.2.3 Risk stratification and recommendation generation

The system's decision logic incorporates a risk scoring mechanism that transforms the raw classification probabilities into a standardised risk score ranging from 0 to 100. This score is calculated based on the malignancy probabilities derived from the classification output, with higher scores indicating increased likelihood of malignant conditions requiring urgent clinical attention. The risk stratification algorithm categorises cases into three distinct recommendation tiers: green status for scores below 30, indicating low concern; yellow status for scores between 30 and 70, suggesting moderate concern requiring clinical evaluation; and red status for scores exceeding 70, indicating high concern necessitating immediate clinical assessment.
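
A minimal sketch of this stratification logic follows. Which diagnostic categories count toward the malignancy score, and the inclusivity of the tier boundaries, are assumptions for illustration.

```python
# Aggregate malignancy-class probabilities into a 0-100 score and map it
# to the three recommendation tiers. The malignant grouping is assumed.
MALIGNANT_CLASSES = {"melanoma", "basal cell carcinoma", "squamous cell carcinoma"}

def risk_score(probabilities: dict[str, float]) -> int:
    """probabilities: class name -> softmax probability (sums to 1)."""
    return round(100 * sum(p for c, p in probabilities.items()
                           if c in MALIGNANT_CLASSES))

def recommendation_tier(score: int) -> str:
    if score < 30:
        return "green"   # low concern
    if score <= 70:      # boundary inclusivity assumed
        return "yellow"  # moderate concern, clinical evaluation advised
    return "red"         # high concern, immediate clinical assessment
```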

2.2.4 Output generation and presentation

The final stage of the decision logic generates comprehensive output for physician review. The system presents the three most likely diagnostic categories based on the classification probabilities, accompanied by their respective confidence levels. Each recommendation tier includes specific action suggestions aligned with clinical best practices for the corresponding risk level. This structured output format ensures that physicians receive both quantitative assessment data and qualitative guidance to support their clinical decision-making process while maintaining clear visibility into the system's confidence in its recommendations.

2.3 Technical Architecture

The high-risk AI system implements a distributed microservices architecture designed to ensure scalable, reliable, and maintainable operation in compliance with the technical requirements specified in Annex IV, Section 2 of the EU AI Act.

2.3.1 System Architecture Overview

The technical architecture consists of four primary components operating in a cloud-native environment on Microsoft Azure infrastructure:

Frontend Layer: The user interface is implemented using React, a JavaScript library for building interactive web applications. This component handles user interactions, data input validation, and presentation of AI system outputs to end users.

Backend Services: The application logic is built on Python using the FastAPI framework, which provides high-performance API endpoints for handling requests, implementing business logic, and orchestrating communication between system components. FastAPI's automatic API documentation and validation capabilities support system transparency and maintenance requirements.

AI Model Serving: The core machine learning functionality utilizes the PyTorch framework with an EfficientNet-B4 architecture served through TorchServe, the PyTorch ecosystem's model serving solution. This configuration enables scalable inference while providing the model versioning and monitoring capabilities essential for high-risk AI systems.

Infrastructure: The entire system operates on Azure Kubernetes Service (AKS), providing container orchestration, automatic scaling, and high availability features necessary for production deployment of high-risk AI systems.

2.3.2 Data Flow Architecture

The system implements a structured data flow designed to ensure data integrity and traceability:

Data storage is managed through two specialized components: Azure Blob Storage serves as the primary repository for image data and related multimedia content, while PostgreSQL database maintains structured data, user records, system logs, and audit trails required for compliance monitoring.

Request processing follows a defined pathway: user inputs are received through the React frontend, validated and processed by the FastAPI backend, and submitted to the TorchServe inference engine for AI model evaluation; results are returned through the same pathway, with comprehensive logging at each stage.
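
For illustration, a minimal version of this pathway could look as follows. The service host names and model name are assumed; TorchServe's standard prediction endpoint is used.

```python
# FastAPI endpoint that accepts an uploaded image and forwards it to a
# TorchServe prediction endpoint. Host and model names are illustrative.
import httpx
from fastapi import FastAPI, UploadFile

app = FastAPI()
TORCHSERVE_URL = "http://torchserve:8080/predictions/dermascan"  # assumed

@app.post("/analyze")
async def analyze(image: UploadFile):
    payload = await image.read()
    async with httpx.AsyncClient(timeout=30) as client:
        resp = await client.post(TORCHSERVE_URL, content=payload)
    resp.raise_for_status()
    return resp.json()  # class probabilities from the model server
```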

2.3.3 Key Architectural Design Choices

The selection of EfficientNet-B4 as the core neural network architecture reflects a balance between computational efficiency and model accuracy appropriate for the system's intended use case. This convolutional neural network architecture provides state-of-the-art image classification performance while maintaining reasonable computational requirements for real-time inference.

The containerized deployment on Azure Kubernetes Service enables horizontal scaling capabilities essential for managing variable computational loads, while providing the isolation and resource management necessary for maintaining system stability and security in a high-risk AI application context.

The separation of concerns between frontend presentation, backend logic, model inference, and data storage components supports system maintainability and enables independent scaling and updates of individual system components without disrupting overall system operation.

2.4 Third-party components

The AI system incorporates open source software components as part of its technical architecture. These third-party components have been selected to support the system's functionality while maintaining compliance with the requirements established under Annex IV, Section 2 of the EU AI Act.

The integration of open source components follows established software engineering practices for component evaluation, integration testing, and ongoing maintenance. Each incorporated component undergoes assessment for compatibility with the system's design specifications and overall architecture.

[NEEDS COMPLETION: specific identification of open source components, their functions, versions, licensing terms, and integration methodology]

[NEEDS COMPLETION: documentation of due diligence processes for third-party component selection and security assessment]

[NEEDS COMPLETION: maintenance and update procedures for third-party dependencies]

2.5 Key design choices

The development team adopted a pragmatic approach to system design, selecting architectural patterns, algorithms, and technical solutions based on empirical performance evaluation and demonstrated effectiveness for the specific use case requirements. This methodology ensured that design decisions were grounded in measurable outcomes rather than theoretical preferences.

The key design choices encompassed several critical dimensions of system architecture and functionality. The selection criteria prioritized solutions that demonstrated superior performance in controlled testing environments, exhibited robust behavior under varying operational conditions, and maintained consistency with the system's intended purpose and risk profile.

[NEEDS COMPLETION: specific design choices, rationale for each choice, alternative options considered, technical justification for selected approaches, and how choices align with system requirements and risk mitigation objectives]

2.6 Computational resources

The AI system utilizes Graphics Processing Unit (GPU) technology as its primary computational infrastructure for both training and inference operations. This hardware selection aligns with the system's processing requirements and ensures adequate computational capacity for the AI model's operation.

The GPU-based architecture provides the necessary parallel processing capabilities required for the AI system's computational workload. This hardware configuration supports the system's operational requirements while maintaining compliance with the technical specifications outlined in the system design.

[NEEDS COMPLETION: specific GPU specifications, models, memory requirements, and distinction between training and inference computational needs]

3. Data and data governance

Annex IV, Section 2 + Article 10

3.1 Training Data Description

The AI system's training dataset comprises 185,000 dermoscopic images systematically distributed across seven distinct diagnostic categories. This comprehensive dataset forms the foundation for the system's machine learning algorithms and directly impacts the system's diagnostic capabilities and performance characteristics.

3.1.1 Dataset Composition and Sources

The training data originates from three primary sources, each contributing to the dataset's clinical validity and geographic diversity:

  • International Skin Imaging Collaboration (ISIC) Archive: Approximately 120,000 images representing the largest component of the training dataset, providing international scope and standardized imaging protocols
  • Karolinska University Hospital: Approximately 40,000 histopathologically verified images contributing to the dataset's clinical rigor and Northern European population representation
  • Sahlgrenska University Hospital: Approximately 25,000 histopathologically verified images further enhancing the clinical validation and regional diversity of the training data

3.1.2 Ground Truth Verification

All 185,000 images in the training dataset possess histopathologically verified diagnoses, establishing a robust ground truth foundation essential for supervised learning. This histopathological verification represents the gold standard in dermatological diagnosis, where tissue samples are examined under microscopy by qualified pathologists to confirm the precise nature of skin lesions. The inclusion of only histopathologically verified cases ensures that the training labels are based on definitive diagnostic evidence rather than clinical assessment alone.

3.1.3 Diagnostic Category Distribution

The training data is systematically distributed across seven diagnostic categories, providing comprehensive coverage of the clinical conditions the AI system is designed to identify. This categorical distribution enables the system to learn distinguishing features across different dermatological conditions and supports the development of multi-class classification capabilities.

The dataset's substantial size of 185,000 images provides sufficient statistical power for training deep learning algorithms while supporting robust validation and testing procedures. The combination of international and clinical sources ensures both broad applicability and clinical relevance of the training data, aligning with the requirements for high-quality datasets as specified in Article 10 and Annex IV, Section 2 of the EU AI Act.

3.2 Data collection methodology

The AI system utilizes a dual-source data collection approach comprising both open/public datasets and proprietary data collection activities conducted by the provider organization.

3.2.1 Open and public data sources

The system incorporates datasets that are publicly available and accessible without proprietary restrictions. These open data sources are selected based on their relevance to the system's intended purpose, data quality standards, and compliance with applicable licensing terms. The provider maintains documentation of all open data sources utilized, including their origins, licensing conditions, and any limitations on use or redistribution.

Public datasets are evaluated for completeness, accuracy, and representativeness before integration into the system's training, validation, or testing datasets. The provider implements systematic procedures to verify the authenticity and integrity of open data sources, including validation of data provenance and assessment of potential quality issues inherent to publicly available datasets.

3.2.2 Proprietary data collection

The provider conducts independent data collection activities to supplement open data sources and address specific requirements of the AI system. These collection activities are designed and implemented in accordance with applicable data protection regulations and industry best practices for data quality and integrity.

Proprietary data collection follows established protocols that ensure systematic and consistent data acquisition processes. The provider maintains detailed records of collection methodologies, including the criteria for data selection, collection instruments or tools employed, and quality assurance measures implemented during the collection process.

3.2.3 Collection governance and oversight

Both open data utilization and proprietary data collection activities are subject to governance frameworks established pursuant to Article 10 of the EU AI Act. These frameworks ensure that data collection methods align with the system's intended purpose and maintain appropriate standards for data quality, bias mitigation, and regulatory compliance as specified in Annex IV, Section 2.

The provider implements systematic review processes to evaluate the continued suitability of data collection methods and to identify opportunities for improvement in data quality, representativeness, and compliance with evolving regulatory requirements.

3.3 Data preparation

The AI system's datasets underwent comprehensive data preparation procedures to ensure quality, consistency, and regulatory compliance in accordance with Article 10 of the EU AI Act and the requirements specified in Annex IV, Section 2.

3.3.1 Data cleaning procedures

Data cleaning operations were implemented to address data quality issues and ensure the integrity of the training, validation, and testing datasets. The cleaning process involved the systematic identification and remediation of inconsistencies, errors, and anomalies within the collected data.

[NEEDS COMPLETION: specific data cleaning methods, procedures for handling missing values, outlier detection and treatment, duplicate removal processes, and quality assurance measures]

3.3.2 Data preprocessing and transformation

[NEEDS COMPLETION: detailed description of data preprocessing steps including normalization, standardization, feature engineering, data formatting, and any transformations applied to prepare data for model training]

3.3.3 Data validation and quality control

[NEEDS COMPLETION: validation procedures to verify data quality post-cleaning, quality metrics applied, validation criteria, and processes for ensuring data integrity throughout the preparation phase]

3.3.4 Bias identification and mitigation during preparation

As required under Article 10, the data preparation phase incorporated systematic procedures for identifying and addressing potential sources of bias that could affect the AI system's performance or lead to discriminatory outcomes.

[NEEDS COMPLETION: specific bias detection methods used during data preparation, identified bias sources, mitigation strategies implemented, and documentation of bias remediation measures]

3.4 Annotation and Labelling

The AI system relies on training datasets where all images have undergone verified label assignment. This labelling process forms a critical component of the data preparation methodology required under Annex IV, Section 2 of the EU AI Act, ensuring that the system's learning foundation meets the accuracy and reliability standards necessary for high-risk AI applications.

3.4.1 Label Verification Process

The verification of image labels constitutes a systematic quality assurance mechanism designed to validate the accuracy and consistency of annotations applied to training data. This verification process ensures compliance with Article 10's requirements for appropriate data governance measures, particularly regarding the reliability and representativeness of datasets used for training purposes.

The verified labelling approach encompasses validation protocols that confirm each image annotation corresponds accurately to the depicted content, thereby minimizing the risk of mislabelled training examples that could compromise system performance or introduce unwanted biases into the learning process.

3.4.2 Annotation Standards and Consistency

The labelling framework maintains consistent annotation standards across the entire image dataset, ensuring uniformity in how visual content is categorized and described. This standardization supports the system's ability to learn meaningful patterns while reducing variability that could arise from inconsistent labelling practices.

The verification process validates that labels adhere to predefined taxonomies and classification schemes, contributing to the overall data quality objectives outlined in the AI Act's data governance requirements.

3.5 Bias identification and mitigation measures

The Provider has implemented systematic bias identification and mitigation procedures as required under Article 10 of the AI Act and detailed in Annex IV, Section 2. A comprehensive bias assessment has been performed across all datasets utilized in the development and operation of the AI system.

3.5.1 Bias assessment methodology

The bias evaluation encompasses systematic examination of training, validation, and testing datasets to identify potential sources of algorithmic bias that could lead to discriminatory outcomes or unfair treatment of specific population groups. The assessment methodology addresses both statistical bias in data representation and potential societal biases that may be embedded within the dataset characteristics.

3.5.2 Bias detection procedures

The bias identification process includes analysis of data distribution patterns, demographic representation, and potential correlations between protected characteristics and target variables. Statistical measures are applied to quantify representation gaps and identify systematic patterns that could result in biased system performance across different user groups or use cases.

3.5.3 Mitigation measures implemented

Following bias identification, appropriate mitigation strategies have been deployed to address detected biases and minimize their impact on system performance. These measures are integrated into the data governance framework established pursuant to Article 10, ensuring ongoing monitoring and correction of bias-related issues throughout the system lifecycle.

[NEEDS COMPLETION: specific bias types identified, detailed mitigation techniques applied, quantitative bias metrics, and ongoing monitoring procedures]

3.6 Bias handling

The Provider has conducted bias assessment procedures for the AI system's datasets in accordance with Article 10(2)(f) of the EU AI Act and the requirements set forth in Annex IV, Section 2(c). The bias evaluation process has been completed and the assessment indicates that the datasets are suitable for the intended purpose without significant bias concerns.

3.6.1 Bias identification methodology

The Provider employed systematic bias detection methods to evaluate potential sources of bias across all training, validation, and testing datasets. The assessment examined both statistical bias patterns and potential discriminatory outcomes that could affect the AI system's performance across different demographic groups and use cases.

3.6.2 Assessment results

Following comprehensive evaluation, the bias assessment concluded that the datasets demonstrate acceptable bias levels that do not compromise the AI system's intended functionality or create discriminatory impacts. The datasets were determined to be representative of the target population and use cases for which the AI system is designed.

3.6.3 Ongoing monitoring measures

The Provider has established procedures for continuous bias monitoring throughout the AI system's lifecycle, ensuring that any emerging bias patterns are detected and addressed promptly. These measures form part of the broader data governance framework established pursuant to Article 10 of the EU AI Act.

[NEEDS COMPLETION: specific bias detection methods used, detailed results of bias assessment, concrete mitigation measures implemented, and ongoing monitoring procedures]

3.7 Personal data

The AI system processes personal data that has been anonymised prior to use in training, validation, and testing datasets. This approach ensures compliance with data protection requirements while enabling effective system development and operation.

3.7.1 Anonymisation process

All personal data undergoes a comprehensive anonymisation process before incorporation into the system's datasets. This process involves the removal or transformation of all directly and indirectly identifying information to prevent re-identification of individuals, in accordance with recognised anonymisation standards and best practices.
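
By way of illustration, one step of such a process, re-encoding images so that embedded acquisition metadata is not carried over and assigning a random study identifier, might look as follows. The broader removal of indirect identifiers and the validation of irreversibility are described in the surrounding text.

```python
# Re-encode an image without embedded metadata (EXIF, acquisition details)
# and name it with a random study ID instead of any identifying filename.
import uuid
from PIL import Image

def anonymise_image(src_path: str, out_dir: str) -> str:
    img = Image.open(src_path).convert("RGB")
    out_path = f"{out_dir}/{uuid.uuid4().hex}.png"
    img.save(out_path)  # saved without the source file's metadata
    return out_path
```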

3.7.2 Legal basis and compliance framework

Given that the data has been effectively anonymised, it no longer constitutes personal data within the meaning of the General Data Protection Regulation (GDPR). The anonymisation process eliminates the need for specific legal bases under Article 6 GDPR, as anonymised data falls outside the scope of data protection legislation.

3.7.3 Data governance measures

In accordance with Article 10 of the EU AI Act, the following data governance measures have been implemented:

The anonymisation process is subject to rigorous validation to ensure its effectiveness and irreversibility. Regular assessments verify that the anonymised datasets cannot be linked back to identifiable individuals through any reasonably available means or techniques.

Quality assurance procedures ensure that the anonymisation process does not compromise the integrity or representativeness of the data for training purposes. Statistical properties and distributions essential for model performance are preserved while eliminating identifying characteristics.

Documentation of the anonymisation methodology, including technical specifications and validation results, is maintained as part of the system's data governance framework as required under Annex IV, Section 2 of the EU AI Act.

3.8 Dataset size

[NEEDS COMPLETION: specific quantitative information about dataset sizes including number of samples, data volume, and breakdown by training/validation/testing sets]

4. Testing and validation

Annex IV, Section 2(e)–(g) + Article 9(5)–(7)

4.1 Testing Methods

The AI system undergoes comprehensive testing and validation procedures in accordance with Article 9(5)–(7) and Annex IV, Section 2(e)–(g) of the EU AI Act. The testing methodology encompasses multiple complementary approaches designed to ensure robust performance evaluation and risk mitigation across diverse operational scenarios.

4.1.1 Held-Out Test Data Evaluation

The primary testing approach employs held-out test datasets that are completely separate from training and validation data. These datasets are representative of the target population and use cases, ensuring that performance metrics reflect real-world deployment conditions. The held-out test data is stratified to maintain proportional representation across relevant demographic and operational subgroups, enabling performance assessment per subgroup as required under Annex IV, Section 2(f).
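
A sketch of how such a stratified held-out split can be produced is shown below; the split fraction is an illustrative choice.

```python
# Carve out a held-out test set whose class proportions match the full
# dataset; stratification on subgroup labels would follow the same pattern.
import numpy as np
from sklearn.model_selection import train_test_split

def make_holdout_split(X: np.ndarray, y: np.ndarray, test_fraction: float = 0.15):
    """Return (X_dev, X_test, y_dev, y_test) with class proportions preserved."""
    return train_test_split(X, y, test_size=test_fraction,
                            stratify=y, random_state=42)
```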

4.1.2 Cross-Validation Procedures

Cross-validation techniques are implemented to assess model stability and generalization capabilities. The methodology involves systematic partitioning of available data into multiple folds, with iterative training and testing cycles that provide robust estimates of model performance variance. This approach helps identify potential overfitting and ensures consistent performance across different data subsets.
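
The following sketch illustrates such a procedure using stratified folds, so that each partition preserves the diagnostic-class proportions; the fold count and evaluation callback are illustrative.

```python
# Stratified k-fold cross-validation: iterative train/test cycles whose
# score spread estimates performance variance across data subsets.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate(train_and_eval, X: np.ndarray, y: np.ndarray, n_splits: int = 5):
    """train_and_eval(X_tr, y_tr, X_te, y_te) -> score; returns (mean, std)."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    scores = [train_and_eval(X[tr], y[tr], X[te], y[te])
              for tr, te in skf.split(X, y)]
    return float(np.mean(scores)), float(np.std(scores))
```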

4.1.3 Manual Output Review

Systematic manual review processes are conducted by qualified personnel to evaluate output quality, appropriateness, and alignment with intended functionality. These reviews involve detailed examination of system outputs across representative use cases, with particular attention to edge cases and potentially problematic scenarios. The manual review process includes documentation of reviewer qualifications, review criteria, and findings in accordance with the test log requirements under Annex IV, Section 2(g).

4.1.4 Stress Testing and Edge Case Analysis

Comprehensive stress testing procedures evaluate system performance under challenging conditions and edge cases that may not be well-represented in standard test datasets. This includes testing with adversarial inputs, boundary conditions, and scenarios designed to probe system limitations. The stress testing methodology systematically explores potential failure modes and assesses system robustness under operational stress.

4.1.5 External Review and Audit Procedures

Independent external review and audit processes provide objective validation of system performance and compliance with technical requirements. These reviews are conducted by qualified third parties with relevant expertise, ensuring impartial assessment of system capabilities and limitations. The external review process includes evaluation of testing methodologies, validation of performance claims, and assessment of risk mitigation measures.

4.1.6 User Testing Integration

User testing procedures involve evaluation of system performance in realistic operational contexts with representative end users. This testing approach captures human-AI interaction dynamics and identifies usability issues that may not be apparent through automated testing alone. User testing encompasses both controlled laboratory conditions and field deployment scenarios where feasible and appropriate.

The comprehensive testing methodology ensures compliance with Article 9(6) requirements for appropriate testing procedures that are suitable to the AI system's intended purpose, and supports the identification of foreseeable risks and limitations as mandated under Article 9(7) and Annex IV, Section 2(f).

4.2 Performance metrics

The AI system's performance evaluation is conducted using a comprehensive set of metrics specifically designed to ensure regulatory compliance under Article 9(5)–(7) and address the critical safety requirements for medical diagnostic applications as specified in Annex IV, Section 2(e)–(g).

4.2.1 Primary diagnostic performance metrics

The system employs sensitivity (recall) as the primary safety metric, with particular emphasis on melanoma detection where a minimum threshold of 95% sensitivity is required. This stringent requirement reflects the critical nature of early melanoma detection and the severe consequences of false negative diagnoses. The sensitivity metric is calculated and reported separately for each diagnostic category to ensure comprehensive coverage across all skin lesion types within the system's scope.

Specificity measurements are implemented per diagnostic category to evaluate the system's ability to correctly identify non-malignant cases and minimize unnecessary patient anxiety and healthcare resource utilization. The specificity metrics provide essential insight into the system's precision in distinguishing between different pathological conditions.

The Area Under the Receiver Operating Characteristic curve (AUC-ROC) serves as the primary metric for binary malignant/benign classification performance. This metric provides a comprehensive assessment of the system's discriminative capability across all decision thresholds and enables comparison with established clinical benchmarks.

F1 scores are calculated and monitored for each diagnostic category, providing a balanced assessment that considers both precision and recall. This metric is particularly valuable for evaluating performance across categories with varying prevalence rates in the validation datasets.
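
For illustration, the metrics named above could be computed as follows; the integer class encoding and the malignant/benign grouping of labels are assumptions.

```python
# Per-category sensitivity and specificity (one-vs-rest), per-category F1,
# and AUC-ROC for the binary malignant/benign task.
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

def per_class_sensitivity_specificity(y_true: np.ndarray, y_pred: np.ndarray,
                                      n_classes: int = 7) -> dict:
    out = {}
    for c in range(n_classes):
        tp = int(np.sum((y_pred == c) & (y_true == c)))
        fn = int(np.sum((y_pred != c) & (y_true == c)))
        tn = int(np.sum((y_pred != c) & (y_true != c)))
        fp = int(np.sum((y_pred == c) & (y_true != c)))
        out[c] = {"sensitivity": tp / max(tp + fn, 1),
                  "specificity": tn / max(tn + fp, 1)}
    return out

def headline_metrics(y_true, y_pred, y_malignant_true, malignancy_prob) -> dict:
    return {
        "f1_per_class": f1_score(y_true, y_pred, average=None).tolist(),
        "auc_roc_malignant": roc_auc_score(y_malignant_true, malignancy_prob),
    }
```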

4.2.2 Operational performance metrics

Beyond diagnostic accuracy, the system monitors average inference time to ensure clinical workflow compatibility and patient experience optimization. This metric is critical for practical deployment in clinical settings where timely diagnosis is essential.

Image quality reject rate is tracked as a key operational metric, measuring the system's ability to identify and appropriately handle images that do not meet the required quality standards for reliable analysis. This metric directly supports the system's reliability and helps prevent erroneous diagnoses based on inadequate input data.

Confidence score calibration is evaluated using the Brier score, which measures how closely the system's predicted probabilities match observed outcomes. This calibration metric is essential for enabling healthcare professionals to appropriately interpret and act upon the system's diagnostic recommendations.
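
For the binary malignant/benign view, this calibration check reduces to a single computation, sketched below with illustrative variable names.

```python
# Brier score of the predicted malignancy probability against the binary
# malignant/benign ground truth; lower values indicate better calibration.
from sklearn.metrics import brier_score_loss

def calibration_brier(y_malignant_true, malignancy_prob) -> float:
    return brier_score_loss(y_malignant_true, malignancy_prob)
```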

4.2.3 Regulatory compliance framework

These performance metrics collectively address the testing and validation requirements specified in Annex IV, Section 2(e)–(g), ensuring that the system's capabilities and limitations are thoroughly characterized. The metrics framework enables systematic evaluation of the system's performance across different patient populations and clinical scenarios, supporting the identification of potential biases or performance variations that could impact patient safety.

The comprehensive metric suite facilitates ongoing monitoring and validation as required under Article 9(7), enabling continuous assessment of system performance throughout its operational lifecycle and supporting timely identification of any degradation in diagnostic capability.

4.3 Test Results

The AI system has demonstrated satisfactory performance across the testing framework established in accordance with Article 9(5)–(7) of the EU AI Act and Annex IV, Section 2(e)–(g). The comprehensive testing program yielded positive outcomes that validate the system's operational readiness and compliance with applicable performance standards.

The overall assessment indicates that the system meets the established performance thresholds across primary evaluation metrics. The testing results demonstrate consistent behavior within acceptable parameters, confirming that the system operates as intended under normal operating conditions. Performance indicators align with the predetermined benchmarks established during the validation strategy development phase.

Testing outcomes have been systematically documented in accordance with regulatory requirements, with each test execution recorded alongside relevant metadata including execution timestamps, responsible personnel, and detailed result specifications. The test logs maintain full traceability of the validation process and provide comprehensive evidence of the system's performance characteristics.

[NEEDS COMPLETION: specific performance metrics, numerical results, subgroup analysis, identified limitations, and detailed test documentation]

4.4 Testing per subgroup

The AI system has undergone comprehensive subgroup-specific testing to ensure equitable performance across different demographic segments and use case scenarios, in accordance with Article 9(6) of the EU AI Act and Annex IV, Section 2(f).

4.4.1 Subgroup identification and stratification

Prior to conducting subgroup testing, relevant demographic and functional subgroups were identified based on the intended use of the AI system and potential risk factors for differential performance. The testing methodology incorporated stratified sampling approaches to ensure adequate representation of each identified subgroup within the test datasets.

4.4.2 Performance evaluation across subgroups

Testing was conducted systematically across all relevant subgroups to identify potential performance disparities or bias patterns. The evaluation process measured key performance indicators including accuracy, precision, recall, and other relevant metrics for each subgroup independently. This approach enables the detection of differential performance that could lead to discriminatory outcomes or reduced effectiveness for specific population segments.
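
A sketch of such an independent per-subgroup evaluation follows; the column names and metric selection are illustrative assumptions.

```python
# Compute key metrics independently for each demographic stratum.
import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score

def metrics_by_subgroup(df: pd.DataFrame, group_col: str) -> pd.DataFrame:
    """df needs columns 'y_true' and 'y_pred' plus the subgroup column."""
    rows = []
    for group, sub in df.groupby(group_col):
        rows.append({
            group_col: group,
            "n": len(sub),
            "accuracy": accuracy_score(sub.y_true, sub.y_pred),
            "precision": precision_score(sub.y_true, sub.y_pred,
                                         average="macro", zero_division=0),
            "recall": recall_score(sub.y_true, sub.y_pred,
                                   average="macro", zero_division=0),
        })
    return pd.DataFrame(rows)
```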

4.4.3 Bias detection and mitigation validation

The subgroup testing protocol incorporated specific procedures to detect and evaluate potential algorithmic bias across different demographic categories. Performance metrics were compared across subgroups to identify statistically significant differences that could indicate discriminatory behavior or reduced system reliability for particular user populations.

4.4.4 Documentation and monitoring framework

All subgroup testing activities have been documented with comprehensive test logs recording the date of testing, responsible personnel, test parameters, and detailed results for each subgroup evaluated. This documentation framework supports ongoing monitoring requirements and provides the foundation for post-market surveillance of subgroup performance as mandated by Article 9(7) of the EU AI Act.

The subgroup testing results inform the overall risk assessment and contribute to the establishment of appropriate risk management measures to address any identified performance variations across different user populations.

4.5 Identified limitations

[NEEDS COMPLETION: specific limitations of the AI system must be identified and documented, including technical constraints, performance boundaries, operational restrictions, and scenarios where the system may not perform adequately]

The identification and documentation of system limitations constitutes a critical component of the testing and validation process under Article 9(6) of the EU AI Act and Annex IV, Section 2(f). This analysis must encompass both technical limitations inherent to the AI system's design and architecture, as well as operational constraints that may affect system performance in real-world deployment scenarios.

The limitation analysis shall address performance boundaries across different operational conditions, including edge cases and scenarios where the system approaches the limits of its training data distribution. This includes documentation of accuracy degradation patterns, response time constraints under varying computational loads, and reliability thresholds across different input categories or user demographics.

Furthermore, the analysis must identify foreseeable risks arising from these limitations, particularly those that could impact the safety, fundamental rights, or well-being of natural persons. Such risk identification shall be conducted in accordance with Article 9(7), ensuring that limitations are evaluated not only from a technical performance perspective but also considering their potential societal and individual impacts.

The documentation shall include specific quantitative metrics where measurable, defining clear boundaries beyond which system performance may become unreliable or unsafe, thereby enabling appropriate risk mitigation measures and user guidance for safe and effective system deployment.

4.6 Test logs

The AI system maintains comprehensive test logs in accordance with Article 9(7) and Annex IV, Section 2(g) of the EU AI Act. These logs provide complete traceability and documentation of all testing activities conducted throughout the system's development and validation phases.

The test logging framework captures essential information for each testing session, including the date and time of execution, identification of the responsible testing personnel, specific test scenarios executed, and detailed results obtained. Each log entry is structured to provide clear accountability and enable thorough review of testing procedures and outcomes.

Test logs are maintained in a standardized format that facilitates regulatory review and internal quality assurance processes. The logging system automatically captures metadata associated with each test run, including system configuration parameters, data set versions used, and environmental conditions during testing. This comprehensive approach ensures that all testing activities can be reproduced and verified as required under the regulatory framework.
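
For illustration, a structured log entry capturing this metadata might be recorded as follows; the field names and JSON Lines format are assumptions rather than the system's documented schema.

```python
# Append-only structured test log: timestamp, responsible personnel,
# system configuration, dataset version, and results per test run.
import json
from datetime import datetime, timezone

def log_test_run(test_id: str, operator: str, dataset_version: str,
                 config: dict, results: dict,
                 path: str = "test_log.jsonl") -> None:
    entry = {
        "test_id": test_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "operator": operator,
        "dataset_version": dataset_version,
        "system_config": config,
        "results": results,
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")  # JSON Lines audit trail
```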

The responsibility for test log maintenance is clearly assigned to designated testing personnel, with appropriate oversight mechanisms to ensure completeness and accuracy of recorded information. Access controls and audit trails are implemented to maintain the integrity of test documentation throughout the system lifecycle.

[NEEDS COMPLETION: specific details about test log format, retention periods, responsible personnel identification, and examples of actual log entries]

5. Risk management system

Article 9 + Annex IV, Section 5

5.1 Identified risks

The risk management system has systematically identified and analyzed known and foreseeable risks associated with the AI system's operation, as required under Article 9(2)(a) of the EU AI Act. This identification process encompasses both intended use scenarios and reasonably foreseeable misuse patterns, considering the system's deployment in clinical dermatological diagnostics.

The following categories of risks have been identified through comprehensive analysis:

Clinical Decision-Making Risks: The primary clinical risk identified is false negative melanoma detection, which could result in delayed diagnosis and potentially serious consequences for patient outcomes. This risk directly impacts the fundamental safety objective of the system. Conversely, false positive results present risks of unnecessary invasive procedures, including unwarranted biopsies, and consequent patient anxiety and healthcare resource misallocation.

Human-AI Interaction Risks: Automation bias represents a significant behavioral risk wherein physicians may develop over-reliance on the system's outputs, potentially bypassing critical clinical judgment and standard diagnostic protocols. This misuse pattern could undermine the intended role of the system as a diagnostic support tool rather than a replacement for clinical expertise.

Data Security and Privacy Risks: The processing of sensitive patient dermatological images creates exposure to data breach incidents, with potential unauthorized access to or disclosure of protected health information. Such incidents would constitute violations of patient privacy and could result in regulatory non-compliance.

Equity and Fairness Risks: Performance disparities across different patient populations, particularly concerning skin tone variations, have been identified as creating risks of unequal care delivery. Reduced diagnostic accuracy for patients with darker skin tones could perpetuate existing healthcare disparities and compromise the system's fundamental principle of equitable treatment.

Technical System Risks: Model drift presents a technical risk whereby the system's performance may degrade over time due to changes in input data characteristics, such as variations in dermatoscope hardware models or imaging parameters. This degradation could occur without immediate detection, potentially compromising diagnostic reliability.

Operational Continuity Risks: System availability failures pose operational risks that could create diagnostic bottlenecks in clinical workflows, potentially delaying patient care and disrupting established clinical processes. Such disruptions may force healthcare providers to revert to less efficient diagnostic methods or delay consultations.

Each identified risk has been assessed considering its potential impact on patient safety, clinical effectiveness, and regulatory compliance requirements. The risk identification process incorporates ongoing monitoring mechanisms to detect emerging risks that may arise from system updates, changing clinical practices, or evolving regulatory requirements, ensuring comprehensive coverage as mandated by Annex IV, Section 5 of the EU AI Act.

5.2 Misuse scenarios

The risk management system includes identification and analysis of reasonably foreseeable misuse scenarios in accordance with Article 9(2)(b) of the EU AI Act. Misuse encompasses any use of the AI system in ways that are contrary to its intended purpose, outside its specified operating conditions, or that could lead to risks not adequately addressed in the original system design.

[NEEDS COMPLETION: specific misuse scenarios, risk levels, likelihood assessments, and corresponding mitigation measures must be identified and documented]

The analysis of misuse scenarios considers both intentional and unintentional deviations from the system's intended use, including unauthorized modifications, deployment in unsuitable environments, use by untrained operators, and application to use cases outside the defined scope. Each identified scenario is evaluated for its potential impact on fundamental rights, safety, and system performance, with particular attention to vulnerable populations as required under Article 9(9).

5.3 Risk mitigation measures

The Provider has implemented comprehensive risk mitigation measures as part of the risk management system established pursuant to Article 9 of the EU AI Act. These measures are designed to eliminate or reduce identified risks to an acceptable level while ensuring continued compliance with applicable requirements.

The risk mitigation strategy encompasses both preventive and corrective measures that address the full spectrum of identified risks, including those arising from reasonably foreseeable misuse of the AI system. The measures are implemented through a systematic approach that prioritizes risk elimination where technically feasible, followed by risk reduction strategies for residual risks that cannot be completely eliminated.

The mitigation measures are continuously monitored and updated based on ongoing risk assessment activities, user feedback, and emerging technical developments. This ensures that the risk profile of the AI system remains within acceptable bounds throughout its operational lifecycle.

Special attention has been given to risks that may disproportionately impact children and other vulnerable groups, in accordance with Article 9(9) of the EU AI Act. The mitigation measures include specific safeguards and protective mechanisms tailored to address the unique vulnerabilities of these populations.

All implemented risk mitigation measures are documented and maintained as part of the technical documentation required under Annex IV, Section 5, ensuring full traceability and compliance with regulatory requirements.

[NEEDS COMPLETION: specific details of the risk mitigation measures taken, including technical implementations, operational procedures, and protective mechanisms]

5.4 Residual risks

The risk management system maintains ongoing identification and evaluation of residual risks that remain after the implementation of risk mitigation measures in accordance with Article 9(2) of the EU AI Act. These residual risks represent the remaining exposure following the application of all reasonably practicable risk elimination and reduction measures.

5.4.1 Residual risk assessment

Following the implementation of primary risk mitigation measures as documented in section 5.3, a comprehensive assessment of remaining risks has been conducted. This assessment evaluates the effectiveness of implemented controls and identifies areas where complete risk elimination has not been achieved despite the application of available mitigation strategies.

The residual risk evaluation process considers both the likelihood and severity of potential adverse outcomes that could still occur during normal operation and under reasonably foreseeable misuse scenarios. Special attention is given to risks that may disproportionately affect children and vulnerable groups as required under Article 9(9).

5.4.2 Acceptability determination

All identified residual risks undergo formal acceptability assessment against predetermined criteria established within the risk management framework. The acceptability threshold considers the benefits provided by the AI system, the availability of alternative solutions, and the state of the art in risk mitigation for similar systems.

5.4.3 Ongoing risk mitigation activities

[NEEDS COMPLETION: Specific details about the ongoing risk mitigation activities, including timelines, responsible parties, methods being employed, and measurable objectives for residual risk reduction]

The provider maintains active programs to further reduce identified residual risks through continued research, development, and implementation of enhanced mitigation measures. These activities are integrated into the system's lifecycle management processes and are subject to regular review and update as part of the quality management system requirements under Annex IV, Section 5.

5.5 Impact on vulnerable groups

The risk management system incorporates specific provisions to address the potential impact of the AI system on vulnerable groups, in accordance with Article 9(9) of the EU AI Act. The system recognizes that certain populations may be disproportionately affected by AI system outputs or decisions and requires enhanced protective measures.

5.5.1 Vulnerable group identification

The risk management process systematically identifies vulnerable groups that may be affected by the AI system's operation. This identification encompasses populations that may be more susceptible to harm due to age, disability, social or economic circumstances, or other factors that may create heightened vulnerability in the context of the specific AI application.

5.5.2 Enhanced risk assessment procedures

For identified vulnerable groups, the risk management system implements enhanced assessment procedures that evaluate both direct and indirect impacts. These procedures consider the specific characteristics and needs of vulnerable populations, examining how the AI system's decisions or outputs may disproportionately affect these groups compared to the general population.

5.5.3 Protective measures and safeguards

The system incorporates additional safeguards and protective measures specifically designed to mitigate risks to vulnerable groups. These measures are integrated into both the technical implementation and operational procedures, ensuring that the heightened protection requirements are maintained throughout the system lifecycle.

5.5.4 Monitoring and review

The risk management system includes ongoing monitoring mechanisms to detect any emerging impacts on vulnerable groups during operational deployment. Regular review processes evaluate the effectiveness of protective measures and identify any necessary adjustments to maintain appropriate protection levels as required under the regulatory framework.

5.6 Protection of vulnerable groups

The risk management system incorporates specific considerations for the protection of vulnerable groups, including children, as required under Article 9(9) of the EU AI Act. The organization acknowledges the heightened duty of care when AI systems may impact individuals who may be at particular risk of harm due to their circumstances, characteristics, or inherent vulnerabilities.

The risk assessment process includes dedicated evaluation criteria for identifying potential disparate impacts on vulnerable populations. This encompasses consideration of how the AI system's outputs, decisions, or recommendations might affect groups such as children, elderly individuals, persons with disabilities, individuals from minority backgrounds, or those in precarious socioeconomic situations differently than the general population.

Risk mitigation measures under development focus on implementing additional safeguards where vulnerable groups may be affected by the system's operation. These protective measures are designed to ensure that the AI system does not inadvertently discriminate against or cause disproportionate harm to individuals within these populations.

The ongoing development of vulnerability-specific protection protocols considers both direct impacts from system outputs and indirect consequences that may arise from the system's deployment in relevant contexts. This includes evaluation of potential compounding effects where vulnerable individuals may face multiple risk factors simultaneously.

[NEEDS COMPLETION: specific identification of vulnerable groups relevant to this AI system, detailed risk assessment methodologies for vulnerable populations, concrete protective measures and safeguards, monitoring procedures for vulnerable group impacts, and stakeholder engagement processes with representatives of vulnerable communities]

6. Human oversight measures

Article 14 + Annex IV, Section 2(d)

6.1 Oversight mechanism

The AI system implements a comprehensive human oversight framework designed to ensure continuous human control throughout the system's operational lifecycle, in accordance with Article 14 of the EU AI Act and the requirements specified in Annex IV, Section 2(d).

6.1.1 Real-time monitoring interface

The primary oversight mechanism consists of a dedicated dashboard that provides real-time visualization of all system analyses and outputs. This interface presents critical information including analytical results, associated confidence levels, and system-generated warnings to the overseeing healthcare professionals. The dashboard design ensures that human operators maintain full situational awareness of the AI system's processing activities and can assess the reliability of generated outputs at all times.

6.1.2 Decision support framework

The system operates under a strict decision-support paradigm where autonomous decision-making is explicitly prevented. All diagnostic analyses are presented to the attending physician in the form of risk scores accompanied by the three most probable diagnoses. This approach ensures that healthcare professionals retain ultimate decision-making authority while benefiting from the AI system's analytical capabilities. The presentation format enables medical practitioners to understand the system's reasoning process and make informed clinical judgments based on both the AI-generated insights and their professional expertise.

6.1.3 Comprehensive audit trail

All system outputs and analyses are automatically logged within the integrated health record system, creating a complete audit trail of AI-assisted clinical decisions. This logging mechanism captures not only the AI-generated recommendations but also the human decisions made in response to these recommendations, ensuring full traceability of the decision-making process and enabling retrospective analysis of system performance and human oversight effectiveness.
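
For illustration, the sketch below shows the kind of record such an audit trail could capture, pairing the AI recommendation with the clinician's response; all field names are hypothetical.

```python
# Hypothetical shape of an audit-trail record pairing the AI recommendation
# with the clinician's response; field names are illustrative assumptions.
import json
from datetime import datetime, timezone


def audit_record(case_id: str, risk_score: int, ai_recommendation: str,
                 clinician_id: str, clinician_action: str) -> str:
    """Serialize one AI-assisted decision for the health-record log."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "case_id": case_id,
        "ai_risk_score": risk_score,             # 0-100 scale
        "ai_recommendation": ai_recommendation,  # e.g. "refer_to_specialist"
        "clinician_id": clinician_id,
        "clinician_action": clinician_action,    # accepted / modified / overridden / rejected
    }
    return json.dumps(entry)
```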

6.1.4 Administrative oversight and reporting

The oversight framework includes a hierarchical reporting structure whereby the region's chief medical officer maintains administrative oversight through quarterly analytical reports. These reports provide comprehensive statistics on system usage patterns, decision outcomes, and instances where healthcare professionals have overridden or disregarded AI recommendations. This reporting mechanism enables systematic monitoring of the human-AI interaction patterns and identification of potential areas for improvement in the oversight process.

The quarterly reporting system specifically tracks override frequency and correlates these instances with patient outcomes, providing essential data for assessing both system performance and the effectiveness of human oversight measures. This data supports continuous improvement of the oversight framework and helps identify potential issues related to automation bias or inappropriate reliance on AI-generated recommendations.
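
As an illustrative sketch, the override-rate statistic described above could be derived from the audit log roughly as follows; the column names are assumptions carried over from the example record above.

```python
# Sketch of the quarterly override-rate aggregation, assuming a table of
# logged decisions with "timestamp" and "clinician_action" columns
# (names are illustrative).
import pandas as pd


def quarterly_override_rate(decisions: pd.DataFrame) -> pd.Series:
    """Share of AI recommendations overridden or rejected, per quarter."""
    decisions = decisions.assign(
        quarter=pd.to_datetime(decisions["timestamp"]).dt.to_period("Q"),
        overridden=decisions["clinician_action"].isin(["overridden", "rejected"]),
    )
    return decisions.groupby("quarter")["overridden"].mean()
```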

6.2 Ability to intervene

The AI system is designed with comprehensive human intervention capabilities in accordance with Article 14(4)(d) of the EU AI Act, which requires that human oversight persons are able to disregard, override, or reverse the outputs of the high-risk AI system.

The system implements a mandatory human review process whereby every decision generated by the AI system is subject to human verification before any action is taken based on that decision. This intervention mechanism ensures that qualified human oversight personnel maintain full control over the system's operational outcomes and can prevent inappropriate or potentially harmful actions from being executed.

6.2.1 Human review process

The human intervention capability operates through a structured review workflow where designated human oversight personnel examine each AI-generated decision or recommendation. During this review process, the human operator has the authority to:

  • Accept the AI system's decision and proceed with the recommended action
  • Modify the decision parameters before implementation
  • Override the AI decision entirely and substitute an alternative course of action
  • Reject the decision completely and prevent any action from being taken

This review mechanism is implemented as a mandatory step in the decision-making workflow, ensuring that no AI-generated output bypasses human scrutiny and approval.
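
The following minimal sketch models the four review outcomes listed above as an explicit gate in the workflow; the function and names are illustrative assumptions rather than the deployed implementation.

```python
# Minimal sketch of the mandatory review gate: no AI output proceeds
# without an explicit human decision. Names are illustrative assumptions.
from enum import Enum


class ReviewDecision(Enum):
    ACCEPT = "accept"      # proceed with the AI recommendation
    MODIFY = "modify"      # adjust parameters before implementation
    OVERRIDE = "override"  # substitute an alternative course of action
    REJECT = "reject"      # take no action on this output


def apply_review(ai_recommendation: str, decision: ReviewDecision,
                 alternative: str | None = None) -> str | None:
    """Return the action to execute; None means no action is taken."""
    if decision is ReviewDecision.ACCEPT:
        return ai_recommendation
    if decision is ReviewDecision.MODIFY:
        return alternative or ai_recommendation
    if decision is ReviewDecision.OVERRIDE:
        assert alternative is not None, "override requires an alternative"
        return alternative
    return None  # REJECT: the AI output is discarded entirely
```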

6.2.2 Override capabilities

The system architecture provides human oversight personnel with comprehensive override capabilities that allow them to intervene at any stage of the decision-making process. These capabilities are designed to be intuitive and immediately accessible, enabling rapid intervention when necessary to prevent adverse outcomes or correct erroneous AI decisions.

The intervention mechanism is proportionate to the risk level of the AI system as required by Article 14(2), providing appropriate safeguards while maintaining operational efficiency through the systematic human review process.

6.3 Responsible persons

The Provider has designated a responsible person to fulfill the human oversight obligations pursuant to Article 14 of the EU AI Act and Annex IV, Section 2(d). This designation ensures that appropriate human oversight is maintained throughout the AI system's lifecycle and that oversight measures are proportionate to the system's risk level.

[NEEDS COMPLETION: identification of the specific responsible person, their role, qualifications, and authority within the organization]

The responsible person shall possess the necessary competencies to understand the AI system's capabilities, limitations, and potential risks in accordance with Article 14(4)(a). This includes technical knowledge of the system's functioning, awareness of its intended purpose and operational context, and understanding of the potential consequences of its decisions or recommendations.

The oversight framework establishes clear procedures for the responsible person to exercise their authority, including the ability to disregard, override, or reverse the AI system's outputs when necessary as required by Article 14(4)(d). These intervention capabilities are designed to prevent potential harm and ensure that human judgment remains paramount in critical decision-making processes.

To mitigate automation bias as specified in Article 14(4)(b), the responsible person receives appropriate training and support to maintain critical assessment capabilities when interacting with the AI system's outputs. This includes regular briefings on the system's performance, limitations, and any identified risks or biases.

[NEEDS COMPLETION: detailed description of the responsible person's specific duties, reporting structure, training requirements, and procedures for exercising oversight authority]

6.4 Training

The provider implements comprehensive training programs to ensure that individuals responsible for human oversight possess the necessary competencies to effectively monitor and control the AI system in accordance with Article 14 of the EU AI Act and Annex IV, Section 2(d).

6.4.1 Training program structure

The training program is designed to address the specific requirements for human oversight throughout the AI system lifecycle. The training encompasses both initial qualification and ongoing competency development to maintain effective oversight capabilities as the system evolves and operational contexts change.

6.4.2 Core training components

The training curriculum addresses the fundamental aspects of human oversight as mandated by Article 14(4), focusing on developing the necessary skills and knowledge to:

  • Understand the AI system's capabilities, limitations, and operating conditions to enable informed oversight decisions
  • Recognize and mitigate automation bias through structured decision-making processes and critical evaluation techniques
  • Effectively utilize override and intervention mechanisms when system decisions require human judgment or correction
  • Apply proportionate oversight measures that correspond to the identified risk levels and operational contexts

6.4.3 Training content and methodology

The training program incorporates practical exercises and scenario-based learning to ensure oversight personnel can effectively implement their responsibilities in real-world situations. This includes hands-on experience with system interfaces, decision override procedures, and risk assessment protocols relevant to the specific high-risk AI system deployment.

[NEEDS COMPLETION: specific training duration, frequency, assessment methods, and qualification requirements]

7. Transparency and information to users

Article 13 + Annex IV, Section 3

7.1 Information to users

The AI system is designed and implemented with transparency measures to ensure users receive adequate information regarding the system's operation, capabilities, and limitations in accordance with Article 13 of the EU AI Act. The provider establishes comprehensive information disclosure procedures to meet the transparency requirements specified in Annex IV, Section 3.

The system incorporates transparency by design principles, ensuring that users are provided with sufficient information to understand the AI system's decision-making processes and outputs. This includes clear disclosure that the system involves artificial intelligence technology and that decisions or recommendations are generated through AI processing.

Instructions for use are provided to users containing detailed information about the system's intended purpose, operational parameters, and performance characteristics. These instructions specify the system's capabilities and clearly communicate any limitations or constraints that may affect its performance or reliability. Users receive guidance on proper interpretation of the system's outputs, including contextual information necessary for understanding the significance and reliability of results.

The information provided to users encompasses performance metrics relevant to the system's intended use, including accuracy rates, error margins, and reliability indicators where applicable. Risk information is communicated to users, highlighting potential limitations, edge cases, or scenarios where the system's performance may be degraded or unreliable.

The transparency measures ensure that users can make informed decisions regarding the system's outputs and understand the appropriate level of human oversight or intervention that may be required. All information is presented in clear, understandable language appropriate for the intended user base, avoiding technical jargon where possible while maintaining the necessary detail for proper system operation.

[NEEDS COMPLETION: specific details about the information disclosure methods, content of instructions for use, performance metrics provided, and risk communication procedures]

7.2 Communicated limitations

The organization implements a systematic approach to inform users about the limitations of the AI system, in accordance with Article 13 of the EU AI Act and the transparency requirements set forth in Annex IV, Section 3. This communication strategy ensures users have a clear understanding of the system's operational boundaries and potential constraints that may affect performance or decision-making outcomes.

7.2.1 Limitation identification and documentation

The system's limitations are comprehensively identified through systematic analysis of technical constraints, performance boundaries, and operational conditions where the system may exhibit reduced accuracy or reliability. These limitations encompass both technical restrictions inherent to the AI model architecture and contextual limitations related to specific use cases or environmental conditions.

7.2.2 User communication mechanisms

Information regarding system limitations is communicated to users through multiple channels designed to ensure accessibility and comprehension. The primary communication vehicle is the instructions for use documentation, which provides detailed descriptions of identified limitations alongside their potential impact on system performance. This information is presented in clear, non-technical language that enables users to make informed decisions about system deployment and use.

7.2.3 Integration with transparency obligations

The communication of limitations forms an integral component of the broader transparency framework required under Article 13. Users receive explicit notification that they are interacting with an AI system, accompanied by specific guidance on how identified limitations may affect the interpretation and reliability of system outputs. This approach ensures compliance with the requirement for sufficient transparency to enable users to interpret system outputs appropriately and use the system in accordance with its intended purpose.

[NEEDS COMPLETION: specific details about what limitations are communicated and the exact methods/channels used for communication]

7.3 AI disclosure

The AI system is designed to provide clear and transparent disclosure to users that they are interacting with an artificial intelligence system and that decisions or outputs involve AI processing, in accordance with Article 13(1) of the EU AI Act and the transparency requirements set forth in Annex IV, Section 3.

The system implements comprehensive AI disclosure mechanisms that ensure users are informed at all relevant interaction points that artificial intelligence is involved in generating outputs, making recommendations, or supporting decision-making processes. This disclosure is presented in clear, understandable language that is accessible to the intended user base and is prominently displayed within the user interface.

The AI disclosure framework encompasses several key elements to meet regulatory requirements. Users receive explicit notification that they are interacting with an AI system through visible indicators, textual notifications, or interface elements that clearly identify AI involvement. The disclosure includes information about the nature of AI processing being performed, whether the system is providing automated decisions, recommendations, or analytical outputs.

The transparency measures extend beyond basic disclosure to include contextual information that helps users understand when and how AI is being utilized. This includes clear indication of which system outputs are AI-generated versus human-generated, and notification when AI processing contributes to decisions that may significantly impact users or their interests.

The system's instructions for use, as required under Article 13(3), incorporate detailed explanations of the AI disclosure mechanisms and guide users on how to interpret and understand the AI involvement notifications. These instructions provide users with the necessary context to make informed decisions about their interactions with the system and to properly interpret AI-generated outputs within their intended use context.

7.4 Interpretation of output

The AI system provides comprehensive output interpretation mechanisms designed to ensure healthcare professionals can effectively understand and act upon the system's diagnostic recommendations in accordance with Article 13 of the EU AI Act. The output presentation follows a multi-layered approach that combines quantitative risk assessment, diagnostic probabilities, visual explanations, and clinical guidance.

7.4.1 Risk score presentation

The system generates a numerical risk score ranging from 0 to 100, providing a standardized metric for assessing the likelihood of the identified condition. This score is accompanied by intuitive color coding that enables rapid visual assessment:

  • Green coding indicates lower risk scores (typically 0-33)
  • Yellow coding represents moderate risk scores (typically 34-66)
  • Red coding denotes higher risk scores (typically 67-100)

This color-coded system facilitates immediate recognition of urgency levels while maintaining precise numerical accuracy for clinical decision-making.
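
A minimal sketch of this banding logic is shown below; the cutoffs mirror the "typical" ranges quoted above and are illustrative rather than fixed product thresholds.

```python
# Sketch of the traffic-light banding described above; the cutoffs follow
# the "typical" ranges quoted in the text and are illustrative only.
def risk_color(score: int) -> str:
    """Map a 0-100 malignancy risk score to its display color band."""
    if not 0 <= score <= 100:
        raise ValueError(f"risk score out of range: {score}")
    if score <= 33:
        return "green"   # lower risk
    if score <= 66:
        return "yellow"  # moderate risk
    return "red"         # higher risk
```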

7.4.2 Diagnostic confidence display

The system presents the three most probable diagnoses ranked by confidence level, with each diagnosis accompanied by its corresponding percentage confidence score. This probabilistic approach provides healthcare professionals with a comprehensive view of diagnostic possibilities rather than a single deterministic output, enabling more informed clinical judgment and consideration of differential diagnoses.
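
For illustration, the sketch below shows how the three highest-confidence diagnoses could be extracted from a model's class-probability vector; the example labels and values are hypothetical.

```python
# Sketch of the top-3 presentation: rank class probabilities and keep the
# three highest-confidence diagnoses. Labels and values are hypothetical.
import numpy as np


def top_three(probabilities: np.ndarray,
              labels: list[str]) -> list[tuple[str, float]]:
    """Return the three most probable diagnoses with percentage confidence."""
    order = np.argsort(probabilities)[::-1][:3]
    return [(labels[i], round(float(probabilities[i]) * 100, 1)) for i in order]


probs = np.array([0.62, 0.21, 0.09, 0.05, 0.03])
labels = ["melanoma", "benign nevus", "seborrheic keratosis",
          "basal cell carcinoma", "dermatofibroma"]
print(top_three(probs, labels))
# [('melanoma', 62.0), ('benign nevus', 21.0), ('seborrheic keratosis', 9.0)]
```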

7.4.3 Visual explanation mechanism

To enhance transparency and interpretability as required under Article 13(1), the system incorporates Gradient-weighted Class Activation Mapping (Grad-CAM) visualization. This feature generates heat maps overlaid on the original medical images, highlighting the specific anatomical regions or image features that most significantly influenced the AI model's diagnostic decision. These visualizations enable healthcare professionals to:

  • Verify that the AI system focused on clinically relevant anatomical structures
  • Identify potential areas of concern that may warrant additional examination
  • Cross-reference AI findings with their own clinical observations
  • Detect possible algorithmic biases or errors in feature detection
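
Grad-CAM is a published explanation technique; the following PyTorch sketch outlines how such a heat map is typically computed for a convolutional classifier. The model and layer handles are assumptions for the example, not the system's actual architecture.

```python
# Minimal Grad-CAM sketch (hypothetical model/layer handles; illustrative only).
import torch
import torch.nn.functional as F


def grad_cam(model, image, target_layer, class_idx):
    """Return a heat map highlighting regions driving the class_idx score."""
    activations, gradients = {}, {}

    def fwd_hook(_, __, output):
        activations["value"] = output

    def bwd_hook(_, __, grad_output):
        gradients["value"] = grad_output[0]

    h1 = target_layer.register_forward_hook(fwd_hook)
    h2 = target_layer.register_full_backward_hook(bwd_hook)

    model.eval()
    scores = model(image.unsqueeze(0))  # shape: (1, num_classes)
    model.zero_grad()
    scores[0, class_idx].backward()

    h1.remove()
    h2.remove()

    # Weight each feature map by its average gradient, then combine.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[1:], mode="bilinear",
                        align_corners=False)
    return (cam / (cam.max() + 1e-8)).squeeze()  # normalized to [0, 1]
```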

7.4.4 Clinical recommendation framework

Each system output includes text-based clinical recommendations tailored to the specific diagnostic findings and risk assessment. These recommendations provide actionable guidance while maintaining appropriate clinical boundaries, ensuring healthcare professionals receive decision support rather than prescriptive instructions.

7.4.5 AI disclosure and decision support notification

In compliance with transparency requirements under Annex IV, Section 3, the system prominently displays clear notification that all outputs constitute AI-generated decision support. Healthcare professionals are explicitly informed that:

  • The system provides diagnostic assistance rather than definitive medical diagnosis
  • Clinical judgment and professional expertise remain essential for patient care decisions
  • The AI output should be considered alongside other clinical information and patient history
  • Final diagnostic and treatment decisions rest with the qualified healthcare professional

This notification framework ensures that users maintain appropriate awareness of the AI system's role as a supportive tool while preserving the primacy of human clinical judgment in healthcare decision-making.

8. Accuracy, robustness and cybersecurity

Article 15 + Annex IV, Sections 3–4

8.1 Performance levels

[NEEDS COMPLETION: specific performance metrics, accuracy thresholds, measurement methodologies, and quantitative performance indicators required under Article 15(1) and Annex IV]

8.2 Handling of performance degradation

The system implements continuous monitoring protocols to detect and respond to performance degradation in accordance with Article 15(4) of the EU AI Act, which requires high-risk AI systems to be resilient to errors, faults, and unexpected situations.

8.2.1 Performance monitoring framework

The provider monitors system performance through automated tracking mechanisms that assess key performance indicators against established baselines. The monitoring system operates continuously during system operation and evaluates performance metrics at regular intervals to identify deviations from expected performance levels.

8.2.2 Degradation detection protocols

The monitoring framework employs threshold-based detection algorithms that trigger alerts when performance metrics fall below predetermined acceptable levels. These thresholds are calibrated based on the system's declared performance characteristics as specified in Annex IV, Section 3, ensuring that any significant degradation is promptly identified.
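
As an illustrative sketch under assumed baseline and tolerance values, threshold-based detection of this kind could compare a rolling accuracy estimate against the declared performance level:

```python
# Sketch of threshold-based degradation detection: compare a rolling
# performance metric against a declared baseline. The baseline, tolerance,
# and window size are illustrative assumptions, not declared product figures.
from collections import deque


class DegradationMonitor:
    def __init__(self, baseline: float, tolerance: float, window: int = 200):
        self.baseline = baseline    # declared performance level
        self.tolerance = tolerance  # acceptable absolute drop
        self.scores = deque(maxlen=window)

    def record(self, correct: bool) -> bool:
        """Log one outcome; return True if an alert should be raised."""
        self.scores.append(1.0 if correct else 0.0)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough data for a stable estimate
        rolling_accuracy = sum(self.scores) / len(self.scores)
        return rolling_accuracy < self.baseline - self.tolerance


monitor = DegradationMonitor(baseline=0.91, tolerance=0.05)
```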

8.2.3 Response mechanisms

Upon detection of performance degradation, the system initiates graduated response procedures designed to maintain operational integrity while addressing the underlying causes of reduced performance. These mechanisms are designed to ensure continued compliance with the robustness requirements set forth in Article 15(4).

[NEEDS COMPLETION: specific monitoring metrics, degradation thresholds, automated response procedures, escalation protocols, and backup systems or fail-safe mechanisms]

8.3 Handling of erroneous input

The AI system incorporates error handling mechanisms designed to maintain operational integrity when encountering erroneous, inconsistent, or unexpected input data. In accordance with Article 15(4) of the EU AI Act, which requires high-risk AI systems to demonstrate appropriate resilience to errors and faults, the system implements structured approaches to identify, process, and respond to input anomalies.

The error handling framework operates through detection protocols that continuously monitor incoming data streams for deviations from expected input specifications. When erroneous input is identified, the system activates predefined response procedures to prevent the propagation of errors through the processing pipeline and maintain system stability.
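
The sketch below illustrates the kind of pre-inference checks such a detection protocol might run; the expected image specification (RGB, 8-bit, minimum resolution) is an assumption made for the example.

```python
# Sketch of pre-inference input validation; the expected image
# specification is an illustrative assumption about the dermoscopic
# input format, not the system's documented specification.
import numpy as np


class InvalidInputError(ValueError):
    """Raised when an image fails validation and must not be analyzed."""


def validate_image(image: np.ndarray) -> None:
    if image.ndim != 3 or image.shape[2] != 3:
        raise InvalidInputError("expected an RGB image (H, W, 3)")
    if min(image.shape[:2]) < 224:
        raise InvalidInputError("image resolution below supported minimum")
    if image.dtype != np.uint8:
        raise InvalidInputError(f"unsupported pixel type: {image.dtype}")
    if image.std() < 1.0:
        raise InvalidInputError("image appears blank or corrupted")
```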

These error handling capabilities form an integral component of the system's robustness measures as required under Annex IV, Section 3, ensuring that operational performance remains within acceptable parameters even when confronted with data quality issues or input corruption that may occur in real-world deployment scenarios.

[NEEDS COMPLETION: specific error detection methods, response procedures, fallback mechanisms, logging protocols, and testing validation of error handling capabilities]

8.4 Cybersecurity measures

The AI system implements comprehensive cybersecurity measures in accordance with Article 15(5) of the EU AI Act and the technical requirements set forth in Annex IV, Section 4. These measures are designed to protect the system against deliberate manipulation, unauthorized access, and various forms of cyber attacks that could compromise the integrity, availability, or confidentiality of the AI system and its outputs.

The cybersecurity framework encompasses multiple layers of protection to ensure resilient operation under potential threat scenarios. The system incorporates security controls at the infrastructure, application, and data levels to maintain operational integrity and prevent malicious interference with the AI model's decision-making processes.

[NEEDS COMPLETION: specific technical cybersecurity measures, threat assessment methodology, incident response procedures, and security monitoring capabilities implemented to protect against manipulation and cyber attacks]

8.5 Adversarial testing

The AI system has not undergone adversarial testing as part of its validation and verification process. This approach reflects the provider's assessment that the system's operational context, risk profile, and deployed security measures do not necessitate formal adversarial testing procedures at this time.

The decision to forgo adversarial testing is based on the system's design characteristics and deployment environment, where traditional robustness measures and standard cybersecurity protocols are deemed sufficient to meet the requirements under Article 15 of the EU AI Act. The provider has implemented alternative validation approaches that focus on comprehensive functional testing, input validation mechanisms, and standard security assessments that align with the system's specific risk categorization.

In lieu of adversarial testing, the system relies on robust input validation, error handling procedures, and monitoring systems to detect and respond to potential anomalous inputs or operational irregularities. These measures are designed to ensure system resilience and maintain performance standards within the defined operational parameters, consistent with the robustness requirements outlined in Annex IV, Section 3.

The provider maintains the capability to implement adversarial testing should future risk assessments, regulatory guidance, or operational experience indicate that such testing becomes necessary for continued compliance with Article 15 requirements regarding system robustness and cybersecurity protection.

9. Post-market monitoring plan

Article 72 + Annex IV, Section 9

9.1 Monitoring plan

The organization has established a comprehensive post-market monitoring plan in accordance with Article 72 of the EU AI Act and the requirements specified in Annex IV, Section 9. This plan encompasses systematic procedures for the ongoing surveillance of the AI system's performance and safety throughout its operational lifecycle.

The monitoring plan incorporates active collection and analysis of performance data as mandated by Article 72(2), ensuring continuous assessment of the system's behavior in real-world deployment conditions. The plan establishes structured methodologies for gathering quantitative and qualitative data on system performance, including accuracy metrics, error rates, and operational reliability indicators.

A dedicated feedback system has been implemented to facilitate communication from users and affected persons, providing channels for reporting concerns, anomalies, or adverse effects associated with the AI system's operation. This feedback mechanism serves as a critical component of the monitoring infrastructure, enabling the detection of issues that may not be apparent through automated monitoring alone.

The plan includes specific procedures for incident reporting, particularly for serious incidents that may pose risks to health, safety, or fundamental rights. These procedures define clear escalation pathways, documentation requirements, and timelines for reporting to relevant authorities as required under the regulatory framework.

Systematic procedures for system updates and retraining are integrated into the monitoring plan, establishing protocols for when and how modifications to the AI system should be implemented based on monitoring findings. These procedures ensure that any necessary adjustments to maintain system performance and compliance are executed in a controlled and documented manner.

The monitoring plan leverages automatic logging capabilities as specified in Article 12, utilizing these logs to provide comprehensive traceability of system operations and to support the analysis of system behavior over time.

[NEEDS COMPLETION: specific details of monitoring methodologies, data collection intervals, performance thresholds, incident classification criteria, and responsible personnel]

9.2 Feedback collection

The Provider has established a feedback collection system to comply with the post-market monitoring requirements under Article 72 of the EU AI Act and Annex IV, Section 9. This system enables the systematic gathering of information from users and affected persons regarding the AI system's performance and any issues encountered during operation.

The feedback collection mechanism encompasses multiple channels through which users and affected persons can communicate their experiences, observations, and concerns related to the AI system's functionality. This approach ensures comprehensive coverage of potential performance issues and enables the Provider to maintain awareness of the system's real-world behavior across diverse deployment scenarios.

The collected feedback serves as a critical component of the overall post-market monitoring plan, providing qualitative insights that complement the quantitative performance data collected through automated logging systems. This information directly supports the Provider's obligation to continuously monitor the AI system's performance and identify any degradation or unexpected behavior patterns that may emerge during operational use.

[NEEDS COMPLETION: specific feedback collection channels, procedures for processing feedback, timeframes for response, integration with incident reporting system, and documentation requirements]

9.3 Incident handling

The provider has established incident handling procedures to address operational issues and serious incidents that may arise during the deployment and use of the AI system. These procedures are designed to ensure prompt identification, assessment, and remediation of incidents in accordance with Article 72 of the EU AI Act and the requirements set forth in Annex IV, Section 9.

The incident handling framework encompasses systematic approaches for incident detection, classification, response coordination, and resolution tracking. This includes procedures for both technical incidents that may affect system performance and serious incidents that could impact safety, fundamental rights, or other protected interests as defined under the AI Act.

[NEEDS COMPLETION: specific incident detection mechanisms, classification criteria, response procedures, escalation protocols, timeline requirements, notification procedures for serious incidents, documentation requirements, and coordination with relevant authorities]

9.4 Update procedures

The provider has established update procedures as part of the post-market monitoring plan in accordance with Article 72 of the EU AI Act. The system undergoes regular updates to maintain performance, address identified issues, and incorporate improvements based on post-market monitoring data.

The update procedures encompass both reactive updates triggered by specific incidents or performance degradations identified through monitoring activities, and proactive updates implemented as part of scheduled maintenance cycles. These procedures ensure that the AI system continues to meet its intended performance levels and maintains compliance with applicable requirements throughout its operational lifecycle.

Updates are initiated following analysis of data collected through the post-market monitoring system, including performance metrics, user feedback, and incident reports. The update process includes validation procedures to ensure that modifications do not adversely affect system performance or introduce new risks that could compromise the safety and fundamental rights protections of the AI system.

[NEEDS COMPLETION: specific update frequency schedule, update validation procedures, rollback mechanisms, and documentation requirements for updates]

9.5 Automatic logging

The system implements comprehensive automatic logging capabilities in accordance with Article 12 of the EU AI Act, forming a critical component of the post-market monitoring framework established under Article 72. All system operations, decisions, and interactions are automatically captured and stored without requiring manual intervention or user configuration.

The automatic logging system captures the complete operational history of the AI system, including input data characteristics, processing parameters, decision pathways, output generation, and system performance metrics. This comprehensive data collection enables continuous monitoring of system behavior and facilitates the detection of performance degradation, bias manifestation, or other issues that may arise during deployment.

The logging mechanism operates transparently to end users while maintaining complete audit trails for all system activities. Log data is structured to support both real-time monitoring and historical analysis, enabling proactive identification of trends or patterns that may indicate emerging risks or performance issues. The system automatically timestamps all logged events and maintains data integrity through cryptographic mechanisms to ensure the reliability of logged information for regulatory compliance and incident investigation purposes.
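
One conventional way to realize the cryptographic integrity property described above is a hash-chained log, sketched below for illustration; this is an assumed mechanism, not a statement of the system's actual implementation.

```python
# Sketch of tamper-evident logging: chain each entry to the previous one
# with a SHA-256 digest. An illustrative mechanism only.
import hashlib
import json
from datetime import datetime, timezone


class HashChainedLog:
    def __init__(self):
        self.entries: list[dict] = []
        self._last_digest = "0" * 64  # genesis value

    def append(self, event: dict) -> None:
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "prev_hash": self._last_digest,
        }
        payload = json.dumps(entry, sort_keys=True).encode()
        entry["hash"] = hashlib.sha256(payload).hexdigest()
        self._last_digest = entry["hash"]
        self.entries.append(entry)

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks every later hash."""
        prev = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev:
                return False
            payload = json.dumps(body, sort_keys=True).encode()
            if hashlib.sha256(payload).hexdigest() != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```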

Storage and retention of automatically logged data follows data protection requirements while ensuring sufficient historical depth for meaningful trend analysis and compliance with post-market monitoring obligations under Article 72. The logged data serves as the foundational dataset for the broader post-market monitoring activities, feeding into performance analysis, risk assessment updates, and incident response procedures as required by Annex IV, Section 9.