

Overview of EDPB Opinion 28/2024 on Personal Data in AI models

The European Data Protection Board has issued a guidance note on how to approach the regulation of artificial intelligence models. This was done at the request of the Irish Data Protection Commission, which asked the EDPB to issue an opinion on matters of general application pursuant to Article 64(2) of the GDPR.

The Commission asked four specific questions:

  1. When and how an AI model can be considered anonymous;
  2. How controllers can demonstrate the appropriateness of legitimate interest as a legal basis in the development phase;
  3. How controllers can demonstrate the appropriateness of legitimate interest as a legal basis in the deployment phase; and
  4. What the consequences of the unlawful processing of personal data in the development phase of an AI model are for the subsequent processing or operation of that model.

The executive summary of the report answers each question in turn. On the first question, the opinion adopts a case-by-case approach: whether an AI model is anonymous cannot be decided in the abstract. Instead, each model must be examined individually to see whether it meets the test set out in the opinion before it can be considered anonymous.

The test has two parts, and a model must satisfy both of them.

Both of the following must be insignificant, taking into account all the means reasonably likely to be used by the controller or another person:

  1. the likelihood of direct (including probabilistic) extraction of personal data regarding individuals whose personal data were used to develop the model, and
  2. the likelihood of obtaining, intentionally or not, such personal data from queries to the model.

The second and third questions dealt with legitimate interest as a legal basis for processing. The opinion largely restates the established tests for legitimate interest processing and goes into some detail on them, but it does raise some AI-specific elements.

It is highly specific about the risks to fundamental rights that may emerge in either the development or the deployment phase of AI models, and about the role of data subjects’ reasonable expectations, which are to be taken into account as part of the balancing test.

In particular, when that balancing test is being carried out, the question is whether data subjects could reasonably expect that their personal data would be processed in this way, given the nature of the AI model and the means by which it processes data.

So, data controllers have to ask: where the data came from, how and from what sources it was collected, whether people knew it was going to be used in this way, what the potential future uses of the model might be, and whether the data subjects are even aware that their personal data is online at all.

All of these pose enormous difficulties for the current practice of scraping the web and processing personal data to train models without people’s knowledge that their personal data would be used in this way, let alone their expectation of it.

Finally, we come to the core question, which is in fact the fourth question posed by the Data Protection Commission.

This addresses what the consequences are if an AI model is trained unlawfully with personal data, and what regulatory authorities should do about it. Here, there is a significant gap between the executive summary’s description of the consequences and the actual report.

The executive summary notes that regulatory authorities across the EU have a great deal of discretion in applying the rules. The text itself, however, makes clear that this discretion lies in the application of consequences, and it goes on to say what those consequences are expected to be.

They could include:

  1. issuing a fine,
  2. imposing a limitation on the processing,
  3. erasing the data that was processed unlawfully, and, if that is not possible,
  4. erasing the entire data set used to develop the AI model, and/or
  5. erasing the AI model itself.

This is a fairly significant series of powers given to the regulatory authorities, and the opinion recognizes them as appropriate in these circumstances.

The consideration then breaks into three different scenarios, depending on the division of labour between the creation of the AI model and its deployment, and the opinion assesses each one.

Although it is clear that each scenario can involve unlawful behaviour, the opinion is careful not to say that any particular regulatory outcome is required, stressing instead that it is a matter for individual regulators to conduct a case-by-case analysis, taking account of the specific circumstances of each case.

What is clear is that, amongst those options, there is risk both for the parties who process the data to create the model and for the parties who then take the model and deploy it.

The power of regulatory authorities to act on the creation of AI models, including those which claim to be anonymized, is confirmed. Regulation of this sort can only be done with access to granular information about the inputs, processes and outputs of those models.

As a corollary, this Opinion is a charter for breaking open the black boxes of AI models and the companies who develop them.