Key Takeaways
The Daily Mail is preparing to file a lawsuit against Google for copyright infringement.
Google allegedly utilized hundreds of thousands of articles from the Daily Mail without permission to train Bard.
The dataset used to train Bard reportedly consisted of one million articles, with approximately 75% sourced from the Daily Mail and 25% from CNN.
Google faces the prospect of new copyright infringement action from the owner of the Daily Mail, a prominent newspaper.
Google allegedly made unlawful use of hundreds of thousands of Daily Mail articles to train Bard, its competitor to ChatGPT.
The publishing house, Daily Mail and General Trust (DMGT), owned by Lord Rothermere, is said to have sought legal advice and is preparing to file a lawsuit against Google in the near future.
Why Does it Matter?
The rise of generative AI, which relies on large language models trained on web content, has transformed the relationship between search engines and content creators.
Previously, search engines granted website operators full control over the crawling and indexing of their content.
However, the development of AI models like Bard has altered this landscape, raising concerns among content creators and publishers who fear potential negative consequences.
Google is accused of building a dataset to train Bard.
This dataset allegedly consists of one million articles sourced from news publishers, and it is claimed that Google obtained these articles without seeking permission or notifying the publishers.
The accusation further specifies that approximately 75% of the articles were obtained from the Daily Mail, while the remaining 25% were sourced from CNN's website.
The Daily Mail and CNN were reportedly chosen as sources because both publishers open their stories with bullet-point summaries.
Google supposedly tested Bard by presenting these bullet-point summaries with portions of text removed and asking the chatbot to fill in the gaps using the main body of the article.
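The fill-in-the-gaps test described in the reports resembles what is commonly called a cloze task. The reports do not say how Google built or scored such a test, so the following is only a minimal sketch of the general idea, under stated assumptions: the function names, the `[BLANK]` marker, the exact-match scoring, and the sample article text are all hypothetical and invented for illustration.

```python
import re

MASK = "[BLANK]"  # hypothetical placeholder for the removed text


def mask_word(summary: str, target: str, mask: str = MASK) -> str:
    """Replace the first whole-word occurrence of `target` in `summary` with `mask`."""
    pattern = r"\b" + re.escape(target) + r"\b"
    # A function replacement avoids any escape processing of the mask string.
    return re.sub(pattern, lambda _match: mask, summary, count=1)


def build_cloze_item(article_body: str, summary: str, target: str) -> dict:
    """Bundle the article body, the masked summary, and the expected answer."""
    question = mask_word(summary, target)
    if question == summary:
        raise ValueError("target must appear as a whole word in the summary")
    return {"context": article_body, "question": question, "answer": target}


def is_correct(item: dict, prediction: str) -> bool:
    """Score a prediction by case-insensitive exact match (an assumed metric)."""
    return prediction.strip().lower() == item["answer"].lower()


# Invented example text, not taken from any real article.
body = "The council voted on Tuesday to approve the new bridge after months of debate."
summary = "Council approves new bridge after months of debate"

item = build_cloze_item(body, summary, "bridge")
print(item["question"])  # Council approves new [BLANK] after months of debate
```

In a setup like this, a model would receive the article body together with the masked summary, and its answer would be compared with the removed text.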
Google, DMGT, and CNN have all declined to comment on these reports.
Other Allegations Against Google
In addition to the potential lawsuit by DMGT, Google is also facing other legal troubles.
Specifically, eight plaintiffs in the United States, including a best-selling author from Texas, have accused Google of acting illegally.
They claim that Google has unlawfully used copyrighted content and improperly utilized personal information to train its AI products.
The plaintiffs have filed a proposed class action lawsuit in San Francisco.
By doing so, they seek to represent a larger group of individuals who may have been affected by Google's actions. They are requesting compensation of up to $5 billion if the court finds Google liable.
Put simply, alongside the potential lawsuit from DMGT, Google faces a separate legal battle in which multiple plaintiffs accuse the company of copyright infringement and misuse of personal data for AI training.
The outcome of these legal proceedings will determine whether Google will be held accountable and potentially face significant financial consequences.