ICO Consultation on Generative AI and Data Protection

The ICO's consultation on generative AI clarifies how data protection law applies to these technologies, addressing lawful basis, purpose limitation, accuracy, individual rights and controllership.

22 April 2025

On 15 January 2024, the Information Commissioner's Office ("ICO") launched a comprehensive consultation on the application of UK data protection laws to generative AI. This consultation aimed to address uncertainties regarding the UK GDPR and the Data Protection Act 2018 in the context of generative AI.

Through this process, the ICO engaged with a diverse range of stakeholders, including AI developers, industry bodies, and civil society, to gather insights and evidence on five key areas:

  • The lawful basis for web scraping to train generative AI models.
  • Purpose limitation in the generative AI lifecycle.
  • Accuracy of training data and model outputs.
  • Engineering individual rights into generative AI models.
  • Allocating controllership across the generative AI supply chain.

On 12 December 2024, the ICO published its findings, which are pivotal in shaping the regulatory landscape for generative AI, ensuring that these technologies are developed and deployed in a manner that respects individuals' data protection rights.

Set out below is a summary of the key findings from the ICO's consultation:

Key Findings

1. Lawful Basis for Web Scraping:

  • The ICO initially identified legitimate interests as the primary lawful basis for using web-scraped data in training generative AI models, given the impracticality of other bases like consent. Respondents from the technology sector argued that web scraping is essential due to the vast data requirements for AI training. However, creative industry representatives challenged this, suggesting alternative methods such as licensing data directly from publishers. Concerns were also raised about the unauthorised use of copyright-protected data and the lack of transparency in web scraping practices.

  • The ICO reaffirmed that legitimate interests remain the primary lawful basis but emphasised the need for developers to demonstrate necessity and pass the balancing test. The ICO highlighted the importance of transparency and encouraged developers to explore alternative data collection methods. The ICO also noted the risks associated with invisible processing and the challenges developers may face in meeting the legitimate interests balancing test without sufficient transparency measures.

2. Purpose Limitation:

  • The ICO stressed the importance of defining explicit and specific purposes for processing personal data in the generative AI lifecycle. Generative AI developers expressed difficulty in specifying purposes due to the open-ended nature of AI applications. Civil society and creative industries argued that broad purposes like "developing a model" are insufficient and called for greater transparency and documentation of processing purposes.

  • The ICO maintained its position on purpose limitation, emphasising the need for clear and specific purposes to ensure compliance with data protection laws. The ICO acknowledged the challenges posed by open-ended uses but stressed the importance of transparency and documentation. The ICO also recognised the need for guidance on how developers can demonstrate sufficiently detailed and specific purposes when training generative AI.

3. Accuracy of Training Data and Model Outputs:

  • The ICO underscored the need for developers to ensure training data accuracy and communicate the statistical accuracy of model outputs. Respondents agreed on the importance of accuracy but differed on responsibility. Developers claimed verifying data accuracy is challenging due to the lack of 'ground truth.' Creative industries emphasised the need for high-quality data for factual outputs, suggesting independent audits and technical measures like labelling and watermarking.

  • The ICO reiterated the link between training data accuracy and model output accuracy, emphasising the need for transparency about data quality. While acknowledging limitations in verifying data accuracy, the ICO supported measures like labelling and watermarking to communicate accuracy and reliability. The ICO also highlighted the importance of clear communication between developers, deployers, and end-users to ensure the degree of statistical accuracy is proportionate to the model's final application.

4. Engineering Individual Rights:

  • The ICO called for clear processes to enable individuals to exercise their information rights. Creative industries argued that generative AI development often disregards information rights, while developers claimed facilitating rights is challenging once data is integrated into models. Civil society emphasised the unlawfulness of non-compliant models and the need for retraining on compliant data.

  • The ICO emphasised the importance of data protection by design and the need for mechanisms to fulfil information rights requests. The ICO expressed concern over the lack of practical measures to enable rights exercise and called for improved transparency and innovative solutions. The ICO also stressed that organisations must have processes in place to enable and record people exercising their information rights, and that data protection by design is a legal requirement.

5. Allocating Controllership:

  • The ICO outlined that controllership should reflect actual control and influence over processing activities. Technology sector respondents argued that developers lack sufficient control at the deployment phase to be considered controllers. Some acknowledged joint controllership in certain scenarios but preferred clear contracts to define roles and liabilities.

  • The ICO maintained its position on joint controllership, clarifying that it does not imply equal responsibility for all processing activities. The ICO emphasised the importance of fact-based assessments to determine controllership and encouraged further engagement to provide practical examples. The ICO also highlighted that joint controllership agreements should clearly set out the responsibilities of each party, ensuring accountability and compliance with data protection laws.

Tackling Misconceptions

Throughout the consultation process, the ICO identified several misconceptions related to generative AI and data protection. These misconceptions often stemmed from misunderstandings about the scope and application of data protection laws in the context of AI technologies. The ICO provided clarity on each of these issues, ensuring that stakeholders have a clear understanding of their obligations under the law:

Incidental Processing of Personal Data

Misconception

Incidental processing is exempt from data protection laws.

ICO Clarification

Data protection laws apply to all personal data processing, intentional or incidental.

Common Practice and Reasonable Expectations

Misconception

Common practices align with reasonable expectations.

ICO Clarification

Common practice does not guarantee compliance with individuals' expectations and rights.

Personally Identifiable Information (PII) vs. Personal Data

Misconception

Compliance obligations apply only to personally identifiable information (PII).

ICO Clarification

Personal data includes a broader range of information than PII.

Reliance on Search Engine Case Law

Misconception

Search engine case law applies to generative AI.

ICO Clarification

Generative AI differs from search engines; developers must enable rights exercise.

Data Protection Implications of AI Models

Misconception

AI models do not store personal data.

ICO Clarification

AI models can contain personal data, requiring compliance.

Scope of ICO's Remit

Misconception

ICO can guide compliance beyond data protection.

ICO Clarification

ICO's remit is limited to data protection and information law.

AI Exemption to Data Protection Law

Misconception

Generative AI is exempt from data protection law.

ICO Clarification

No exemptions exist; compliance is required from the outset.

By addressing these misunderstandings, the ICO aims to ensure that organisations develop and deploy AI technologies responsibly, with due regard for individuals' rights and freedoms.

Next Steps

The ICO will update its guidance on AI and data protection to reflect the consultation findings and forthcoming data protection law changes. A joint statement with the Competition and Markets Authority (CMA) will further explore the interplay of data protection, competition, and consumer law in the context of generative AI.

Conclusion

The ICO's consultation provides important insights into the application of data protection laws to generative AI. Organisations involved in developing and deploying generative AI models must ensure compliance with data protection principles, particularly regarding transparency, lawful basis, and individual rights. The ICO's forthcoming guidance updates will offer further clarity and support for navigating these complex regulatory landscapes.

This document (and any information accessed through links in this document) is provided for information purposes only and does not constitute legal advice. Professional legal advice should be obtained before taking or refraining from any action as a result of the contents of this document.