Data protection considerations related to the development of AI models
Reading time: 5 minutes
Artificial intelligence (“AI”) is a rapidly evolving family of technologies that contributes to a wide range of economic, environmental, and social benefits across all sectors and social activities. By improving predictive accuracy, optimizing operational processes and the allocation of resources, and enabling the personalization of digital solutions available to individuals and organizations, the use of AI can confer a decisive competitive advantage on businesses while also delivering beneficial social and environmental outcomes.
The use of artificial intelligence, alongside its potential benefits, is also associated with certain risks. In order to mitigate these risks, Regulation (EU) 2024/1689 of the European Parliament and of the Council on artificial intelligence (“AI Act”) has been adopted, several provisions of which have already entered into force. At the same time, the development of many AI models involves the use of personal data, which raises the question of how the AI Act affects data processing activities related to AI systems.
The relationship between the AI Act and the GDPR
The AI Act makes it clear that it does not amend the application of existing EU rules on the processing of personal data, including the requirements set out in the GDPR. Accordingly, organizations falling within the scope of the AI Act must, in the course of their data processing activities, comply fully with the provisions of the GDPR.
Through the enforcement of the right to the protection of personal data, the GDPR also supports the effective exercise of other fundamental rights, including, inter alia, freedom of thought and expression, the right to information and education, and the freedom to conduct a business. On this basis, it can be concluded that the GDPR establishes a legal framework that facilitates responsible innovation, including the responsible development and deployment of AI-related technologies.
Data protection considerations in relation to the development of AI models
In connection with the development of AI models, the European Data Protection Board (“EDPB”) adopted a standalone opinion on data protection aspects arising in relation to the processing of personal data in the context of artificial intelligence models (“Opinion”).
The Opinion examines how personal data may be used in the development of AI models and highlights the issues requiring particular attention when placing on the market AI systems developed using personal data.
Lifecycle of AI Models
The EDPB divides the lifecycle of AI models into two stages, emphasizing that data processing may occur in either of them. The first stage covers the processes preceding the deployment of the model (including, for example, its creation, development, training, and fine-tuning). The second stage relates to the deployment phase, encompassing the use of the model following its development.
Existence of a legal basis for data processing by data controllers
One of the cornerstones of data protection regulation is that personal data may only be processed where a specific legal basis exists. The Opinion reiterates the general expectation that data controllers must determine the appropriate legal basis for their processing activities.
However, the EDPB found that, as a general rule, an AI model developer may rely on legitimate interest as a legal basis, provided that the existence of such legitimate interest is duly substantiated. For this purpose, a three-step test – already familiar to those with experience in data protection compliance practice – serves to properly assess whether a legitimate interest genuinely exists.
The EDPB emphasizes that the balancing test must take into account whether the data subjects can reasonably expect their personal data to be used. The Opinion is significant in this regard because it sets out several criteria intended to assist data protection authorities in assessing the “reasonably foreseeable” criterion.
The Opinion also recalls that, where the interests, rights, and freedoms of data subjects appear to override the legitimate interests of the data controller or of a third party, the controller is not necessarily precluded from processing: it may implement mitigating measures to limit such adverse effects. These may include, for example, pseudonymization, or measures aimed at masking personal data or replacing them with fictitious personal data within the training dataset. The introduction of appropriate data protection measures may thus render the processing lawful.
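To make the pseudonymization measure mentioned above more concrete, the sketch below replaces direct identifiers in a training record with salted hashes. The record layout, field names, and salt handling are illustrative assumptions for this article, not techniques prescribed by the Opinion; real deployments would need a documented key-management and re-identification risk assessment.

```python
import hashlib

def pseudonymize(record, salt):
    """Replace direct identifiers with truncated salted hashes so the record
    can no longer be attributed to a person without the separately kept salt."""
    out = dict(record)
    for field in ("name", "email"):  # illustrative list of direct identifiers
        digest = hashlib.sha256((salt + out[field]).encode("utf-8")).hexdigest()
        out[field] = digest[:16]  # truncated hash serves as the pseudonym
    return out

# Illustrative training record containing personal data
record = {"name": "Anna Kovacs", "email": "anna@example.com", "age": 34}
pseudo = pseudonymize(record, salt="stored-separately")

print(pseudo["name"])  # a 16-character hex pseudonym, not the original name
print(pseudo["age"])   # non-identifying attributes remain usable for training
```

Because the same salt yields the same pseudonym, records belonging to one person stay linkable within the dataset while the identity itself is held apart, which is the property that distinguishes pseudonymization from full anonymization.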
Anonymity
The GDPR classifies as personal data any information relating to an identified or identifiable natural person, whether directly or indirectly. According to the EDPB's position, in the context of AI model development, personal data may only be used where they are properly anonymized, such that even in the event of a potential reverse engineering of the model, the identification of data subjects is not possible. With regard to anonymization, the EDPB emphasizes that the competent data protection authorities must assess, on a case-by-case basis, whether the organization developing the AI model has complied with this requirement. The body also sets out several recommended techniques that may be suitable for preserving anonymity (e.g. preventing or limiting the extraction of personal data used for training purposes).
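One simple way to limit the extraction of personal data used for training is to scrub obvious identifiers from the corpus before the model ever sees them, so there is nothing verbatim to memorize and regurgitate. The sketch below is a minimal illustration under stated assumptions: the regular expressions and placeholder tokens are our own examples, not EDPB-mandated patterns, and names or other context-dependent identifiers would require more advanced detection than shown here.

```python
import re

# Illustrative patterns for common direct identifiers in free text
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d \-]{7,}\d"),
}

def mask_identifiers(text):
    """Replace matched identifiers with placeholder tokens before the text
    enters a training dataset."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

sample = "Contact Anna at anna@example.com or +36 30 123 4567."
print(mask_identifiers(sample))
# The email address and phone number are replaced by [EMAIL] and [PHONE];
# the bare first name is NOT caught by these simple patterns.
```

Filtering of this kind addresses only one facet of the EDPB's anonymity expectation; whether the resulting model truly resists re-identification remains a case-by-case assessment, as the Opinion stresses.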
Summary
The EU body emphasizes in its Opinion that compliance with data protection requirements governing the processing of personal data must be ensured throughout both the development and deployment of AI models. It is evident that the expansion of AI and its associated risks are being treated as an enforcement priority by regulators, and numerous further regulatory guidelines from authorities can therefore be expected in the near future.
Photo source: pexels.com, Tara Winstead

