Publishers are blocking the Internet Archive for fear AI scrapers can use it as a workaround

The Internet Archive has often been a valuable resource for journalists, whether it's finding records of deleted tweets or providing academic texts for background research. However, the advent of AI has created a new tension between the parties. A few major publications have begun blocking the nonprofit digital library's access to their content based on concerns that AI companies' bots are using the Internet Archive's collections to indirectly scrape their articles.

"A lot of these AI businesses are looking for readily available, structured databases of content," Robert Hahn, head of business affairs and licensing for The Guardian, told Nieman Lab. "The Internet Archive’s API would have been an obvious place to plug their own machines into and suck out the IP."

The New York Times took a similar step. "We are blocking the Internet Archive's bot from accessing the Times because the Wayback Machine provides unfettered access to Times content, including by AI companies, without authorization," a representative from the newspaper confirmed to Nieman Lab. Subscription-focused publication the Financial Times and social forum Reddit have also made moves to selectively block how the Internet Archive catalogs their material.

Many publishers have attempted to sue AI businesses for how they access content used to train large language models. To name a few just from the realm of journalism:

The New York Times sued OpenAI and Microsoft
The Center for Investigative Reporting sued OpenAI and Microsoft
The Wall Street Journal and New York Post sued Perplexity
A group of publishers including The Atlantic, The Guardian and Politico sued Cohere
Penske Media sued Google
The New York Times and the Chicago Tribune sued Perplexity

Other media outlets have sought financial deals before offering up their libraries as training material, although those arrangements seem to provide compensation to the publishing companies rather than the writers.
And that's not even delving into the copyright and piracy issues also being fought against AI tools by other creative fields, from fiction writers to visual artists to musicians. The whole Nieman Lab story is well worth a read for anyone who has been following any of these creative industries’ responses to artificial intelligence.

This article originally appeared on Engadget at https://www.engadget.com/ai/publishers-are-blocking-the-internet-archive-for-fear-ai-scrapers-can-use-it-as-a-workaround-204001754.html?src=rss

We have discovered similar tools to what you are looking for. Check out our suggestions for similar AI tools.

Private Internet Access VPN review: Both more and less than a budget VPN

I came into this review thinking of Private Internet Access (PIA) as one of the better VPNs. It's in the Kape Technologies portfolio, along with the top-tier ExpressVPN and the generally [...]

Match Score: 116.88

engadget
Internet Archive is now an official US government document library

The US Senate has granted the Internet Archive federal depository status, making it officially part [...]

Match Score: 111.78

Sony and other music labels settle copyright lawsuit against the Internet Archive

In 2023, Sony Music Entertainment, Universal Music Group and a handful of other music labels [...]

Match Score: 105.91

Anna's Archive told to pay Spotify and record labels $322 million over unprecedented music scraping

The open-source library and search engine Anna’s Archive has been ordered to pay Spotify and three of the world’s largest music labels $322 million in damages after it [...]

Match Score: 101.83

Cloudflare experiment will block AI bot scrapers unless they pay a fee

Cloudflare has rolled out a couple of new measures meant to keep AI bot crawlers at bay. To start with, every new domain customer that signs up with the company to manage their website traffi [...]

Match Score: 70.29

blogspot
Most Frequently Asked Questions About Affiliate Marketing

Match Score: 68.32

Threads users still barely click links

Two years in, Threads is starting to look more and more like the most viable challenger to X. It passed 350 million monthly users earlier this year and Mark Zuckerberg has predicted it could [...]

Match Score: 65.70

Reddit is restricting its availability to the Internet Archive's Wayback Machine

The Internet Archive's Wayback Machine is the latest victim of Reddit's crackdown on data access. The company has begun to place new restrictions on what the archive site will [...]

Match Score: 55.48

The Internet Archive modernizes its GeoCities GIF search engine

The Internet Archive made it easier to search for '90s-era GIFs. [...]

Match Score: 52.17