Key Takeaways
The Daily Mail is preparing to sue Google for copyright infringement.
Google reportedly used Daily Mail articles without permission to train Bard.
The dataset reportedly consisted of about one million articles, roughly 75% of them sourced from the Daily Mail and roughly 25% from CNN.
The Daily Mail, a major newspaper publisher in the United Kingdom, is preparing to file a lawsuit against Google.
The planned suit concerns Google's use of articles and other content from the Daily Mail website without first getting permission.
Google used the Daily Mail articles to train its AI program, Bard. Like ChatGPT, Bard is a chatbot that can understand and respond to human questions or instructions.
To train Bard, Google used almost 1 million articles from news websites. Around 75% of those articles came from the Daily Mail, and about 25% came from CNN.
The reason Google may have chosen these two websites is that they both summarize their articles using bullet points at the start.
Google likely had Bard practice by showing it the summaries and asking it to fill in the missing parts from the full article text.
Daily Mail and General Trust (DMGT), the publishing house owned by Lord Rothermere, is said to be consulting lawyers to prepare a copyright lawsuit against Google over the use of its articles without consent.
Tensions Around the Use of Online Content for AI
This situation highlights tensions between news publishers and tech companies like Google around the use of online content.
Publishers are worried that having their content used without compensation could negatively impact their business.
In the past, search engines simply indexed website content so that it could appear in search results. Training an AI system, by contrast, requires ingesting and learning from that content. Many publishers feel they should be paid if their proprietary material is used in this way.
Separate Class Action Lawsuit in the US
Apart from the Daily Mail case, Google also faces a proposed class action lawsuit in the United States over similar issues. A class action allows a group of people to sue as a single entity.
In this US case, the plaintiffs include authors and others who claim Google illegally used their copyrighted content and personal data to train AI systems.
The plaintiffs are seeking up to $5 billion from Google if the company is found liable.
Major news companies and writers are taking Google to court. They accuse Google of improperly taking and using their proprietary data to develop AI technologies, without getting consent or providing compensation.
The lawsuits could result in Google having to pay large penalties if the claims are proven true in court.
Google denies the claims. Its general counsel, Halimah DeLaine Prado, addressed them in a statement.
She said American law allows companies to use public information from the internet to create new, beneficial things. Because Google used publicly available information, she believes the company did nothing wrong.