Artificial intelligence and copyright: why the OpenAI and Microsoft lawsuit and the AI Act are connected

After the rumors of last August, when news emerged that the New York Times had blocked OpenAI’s web crawler to prevent the platform from using the publication’s content to train its artificial intelligence models, a lawsuit by the US publisher was in the air. It was duly announced just after Christmas.

According to the NYT’s lawyers, OpenAI and Microsoft have used the NYT’s work to create artificial intelligence products that compete with it, threatening the publisher’s ability to provide a news service. The defendants’ generative artificial intelligence (“GenAI”) tools are based on large language models (“LLMs”) that were built by copying and using millions of news articles, in-depth investigations, opinion pieces, reviews, how-to guides and other works copyrighted by the NYT. Although the platforms were trained on large-scale copying from many sources, they placed particular emphasis on NYT content when creating their LLMs, revealing a preference for the publisher’s content. Through Microsoft’s Bing Chat (recently rebranded “Copilot”) and OpenAI’s ChatGPT, the lawsuit states, the defendants seek to take advantage of the NYT’s massive investment in its journalism by using it to build substitute products.

The American publisher added that it had tried to negotiate a contractual solution with the platforms, under which its content would be used under license and for remuneration. The response it received was that the training of ChatGPT-type systems falls within the scope of the “fair use” exception (i.e. a free use permitted under American law, for example for educational purposes). The NYT disputes this, since the use made of its content is purely commercial.

In the court documents available for review, the NYT shows entire paragraphs from hundreds of articles that are reproduced in full in response to prompts entered by users of Bing and ChatGPT.

It is therefore not just a question of using material created by the NYT for training, but of complete copies of articles that would form the basis of the very structure of the platforms’ LLMs. The publisher’s lawyers insist on this point, stressing that this is a massive and deliberate violation by OpenAI and Microsoft, given the presence of articles taken verbatim from the newspaper’s website without any authorization or prior verification.

The offending conduct is not limited to making millions of unauthorized copies of NYT content for training. It extends to distributing and making available content that was originally accessible only behind a paywall, in direct competition with the NYT itself on the news market.

The news of the legal action in the USA comes a few days after the trilogue in Brussels, where the Parliament, the EU Council and the Commission reached a hard-won agreement on the text of the first regulation in the world that seeks to govern generative artificial intelligence. Among the most controversial points, on which a compromise was found after many hours of negotiation, was precisely the one on transparency and record-keeping obligations for the platforms’ general-purpose AI (GPAI) models, i.e. the obligation to disclose to rights holders which works have been used for training, giving copyright holders the possibility to opt out under the Copyright Directive or to ask to negotiate a license for their exclusive rights. It is no coincidence that the platforms lobbied heavily to avoid these obligations, which confirms their awareness of having already made extensive use of all this content to train their systems.

The lawsuit brought by the NYT, like others, including the one brought by Universal Music against Anthropic, a start-up backed by investments from Google and Amazon and sued for having illegally reproduced thousands of copyrighted song lyrics, will have to set a fundamental precedent for the future of the content industry. It is a great opportunity for the industry to grow and prosper in the face of the unknown posed by the largest theft of copyrighted works in the history of humanity.

*CEO FIMI, Italian Music Industry Federation
