
GDPR and OpenAI: how well founded are the Guarantor’s accusations?

by admin

“The OpenAI case”, which we had already covered when the Italian Guarantor temporarily barred Italian users from using ChatGPT, is back on the scene.

The second act of this piece opens with the Italian authority’s announcement that it has served OpenAI with a notice of dispute for alleged violations of personal data protection legislation.

The Authority has not yet released any information on the merits of the accusations. However, judging from its other public initiatives on related subjects such as web scraping, the Guarantor is likely to hold OpenAI responsible for having collected personal data in Italy, for having “exported” it to the USA, and for having processed it for profit without a legal basis.

As I wrote when the authority initially blocked OpenAI, I have some doubts as to whether the accusations against OpenAI are founded.

First, a procedural issue should be considered: an administrative authority should not have direct jurisdiction over foreign entities. A public prosecutor who wants to investigate abroad must resort to international cooperation treaties, and a decision by a national court, in family-law matters for instance, must be reviewed by a court in the receiving country before it can be enforced. So how can a body without jurisdictional status have powers superior to those of a magistrate?

Secondly, ChatGPT is not designed to provide correct results. Therefore, if the accusation were that this software processes personal data unreliably, it would miss the mark. The reliability of the output is not an absolute standard; it need only match the purpose declared by the manufacturer. Since ChatGPT is not a “lie detector” and is not sold as one, it is difficult to argue that the reliability of its results gives rise to liability under the GDPR. In other words, it is not OpenAI’s fault if people persist, against all logic, in using ChatGPT as a substitute for their (lack of) knowledge and then complain about results they are unable to evaluate.


Third, if the purpose of the processing is not to provide reliable data, why should the GDPR be a concern?

This is a crucial point, because the only accusation with some basis would concern the lack of a legal basis for the processing of personal data, not so much in GDPR terms as in terms of the “predatory” exploitation of information (i.e. monetizing the data).

It is reasonable to say that content made available online by individuals (including their personal data) is intended to be consulted, not reused for profit by third parties without at least compensating the rights holders or obtaining some form of license from them. In this regard, other players such as Meta and Google may have a different position, as their terms and conditions allow more room for maneuver in processing user-generated content for purposes other than providing a specific service.

Furthermore, OpenAI (like other players in the sector) has scoured the entire Internet to accumulate data in pursuit of a commercial objective, not for “pure” research purposes. The defense that some genAI providers are mounting in US disputes is “fair use”, but fair use belongs to copyright law and applies only to creative (i.e. artistic) works. As hard as it is to claim that processing (personal) data is equivalent to the Divine Comedy, it is nonetheless true that copyright fair use is conceptually similar to the GDPR’s legitimate interest. So, if the matter were indeed subject to the Data Protection Regulation, the crux would be weighing freedom of enterprise (which, it is worth remembering, is a constitutionally protected right in Italy) against the freedoms and fundamental rights of individuals.


The problem to be resolved is therefore whether the scraping and further processing of data by OpenAI undermines, at least in potential terms, the fundamental rights of citizens of EU Member States and, more importantly, those guaranteed by our Constitution.

To this end, it is the duty of data protection authorities to provide robust evidence in relation to each individual whose data has been reused, so as to enable possible claims for damages. For example, if I search for my university email address via Google I find it without difficulty, whereas if I do the same via ChatGPT 3.5 I get this response: “I’m sorry, but I can’t help you find the email address of a specific person, like Andrea Monti, without violating the privacy and ethical rules that govern the disclosure of personal information. If you would like to contact Professor Andrea Monti at the University of Chieti, I recommend you visit the official website of the university and look for the contact information there. Usually, universities provide public email addresses or contact forms to communicate with academic staff.” If, however, I do the same with ChatGPT 4, I get an incorrect result that returns the data of a namesake who teaches at another university.

A vanity search run on the two platforms likewise produces contrasting results. Version 3.5 of ChatGPT, no matter how detailed the prompt, produces nothing, while version 4 identifies me more or less correctly, because it analyzes and summarizes the results provided by Bing (Microsoft’s search engine). But the fact that ChatGPT 4 interacts with a search engine to provide results further narrows the scope of OpenAI’s possible liability: if the data it reprocesses comes from a search engine, the errors would not be attributable (only) to the generative AI platform but to the sources made available by the engine itself.


Furthermore, if OpenAI is liable for having reprocessed that data without a legal basis, the same accusation should apply to Bing and, more generally, to other search engines.

To support its accusations, therefore, the Guarantor cannot limit itself to generic statements on the “dangers of AI” and deduce from them some form of legal responsibility on OpenAI’s part. It will be interesting to see how the burden of proof is met.

Regardless of the legal technicalities, however, the fundamental aspect of the OpenAI case is the further blurring of the boundary between law and politics. For a long time, the GDPR has been used as a weapon in the “silent” battle between the EU and the US. But using the law to pursue political objectives in international relations may not be the smartest idea, since regulation is a double-edged axe that does not cut in just one direction, and not necessarily with the same effectiveness.

Transparency Disclaimer: The author has no relationship with the parties involved in the facts underlying the article.
