UK begins review of training AI models on copyrighted content

On December 9, OpenAI made its Sora artificial intelligence video generation model publicly available in the United States and other countries.

Cfoto | Future Publishing | Getty Images

The UK is drawing up measures to regulate the use of copyrighted content by technology companies to train their artificial intelligence models.

The UK government launched a consultation on Tuesday aimed at providing clarity for both the creative industries and AI developers about how intellectual property is acquired and then used by AI firms for training purposes.

Some artists and publishers are unhappy with the way their content is freely scraped by companies like OpenAI and Google to train their large language models – AI models trained on massive amounts of data to generate human-like responses.

Large language models are the foundational technology behind today’s generative AI systems, including OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude.

Last year, The New York Times filed a lawsuit against Microsoft and OpenAI, accusing the companies of infringing its copyright and misusing intellectual property to train large language models.

In response, OpenAI dismissed the NYT’s allegations, saying that using open web data to train AI models should be considered “fair use” and that it provides rights holders with an “opt-out” “because it’s the right thing to do.”

Separately, image distribution platform Getty Images sued another generative AI company, Stability AI, in the UK, accusing it of scraping millions of images from its websites without consent to train its Stable Diffusion AI model. Stability AI has contested the lawsuit, pointing out that the training and development of its model took place outside the UK.

Proposals under consideration

First, the consultation will consider making an exception to copyright law for AI training for commercial purposes, while still allowing rights holders to reserve their rights so that they can control the use of their content.

Second, the consultation proposes measures to help creators license their content and be compensated for its use by AI model makers, and to give AI developers clarity about what material can be used to train their models.

The government said more work needs to be done by both the creative industries and technology companies to ensure that all rights reservation and transparency standards and requirements are effective, accessible and widely used.

The government is also considering proposals that would require AI model makers to be more transparent about their model training datasets and how they are sourced, so that rights holders can understand when and how their content was used to train AI.

That could prove controversial. Tech companies aren’t particularly forthcoming about the data that powers their coveted algorithms or how they train them, given the commercial sensitivity involved in disclosing those secrets to potential competitors.

Previously, the government under former Prime Minister Rishi Sunak tried to agree on a voluntary code of conduct for AI copyright.

AI copyright rules: UK versus US

In a recent interview with CNBC, the boss of app development software company Appian said he believes the UK is well placed to be a “world leader on this issue”.

“The UK has stood up and said it prioritizes personal intellectual property rights,” Appian CEO Matt Calkins told CNBC. He cited the Data Protection Act 2018 as an example of how closely tied the UK is to intellectual property rights.

The UK also doesn’t face “the same overwhelming lobbying attack from domestic AI leaders as the US,” Calkins added – meaning it may not be as inclined to give in to pressure from tech giants as politicians in the US.

“In the US, anyone who writes a law about AI will hear from Amazon, Oracle, Microsoft or Google before that bill even passes,” Calkins said.

“This is a powerful force preventing anyone from writing sensible laws or protecting the rights of individuals whose intellectual property is being taken over en masse by these major AI players.”

The issue of possible copyright infringement by AI companies is becoming increasingly important as technology companies turn to a more “multimodal” form of AI – that is, AI systems that can understand and generate content in the form of images and videos as well as text.

Last week, OpenAI made its AI video generation model Sora publicly available in the US and “most countries around the world.” The tool allows a user to enter a desired scene and create a high-resolution video clip.
