The Internet Archive’s Wayback Machine, long considered one of the most important tools for preserving the internet’s history, has seen a sharp decline in archived webpages since mid-May 2025.
According to a new report from Nieman Lab, the number of homepage snapshots from 100 major news websites fell by 87 percent between May 17 and October 1, raising concerns about gaps in the public historical record.
From January 1 to May 15, the Wayback Machine captured 1.2 million snapshots from those 100 news sites’ homepages. In the following five months, only 148,628 snapshots were archived. CNN’s homepage alone, for example, was archived 34,524 times in the first period but only 1,903 times afterward, a drop of roughly 94 percent.
The decline is particularly troubling because many of the affected sites are news organizations, which often serve as primary historical sources for future researchers.
Internet Archive Cites Technical Breakdown
Mark Graham, director of the Wayback Machine, told Nieman Lab that “a breakdown in some specific archiving projects in May” led to reduced archiving activity. He added that some missing snapshots may still be added later, explaining that the index structure for certain archives “has not yet been built.”
However, Nieman Lab noted that a five-month indexing delay is unusual. Graham attributed the slowdown to “various operational reasons,” including resource allocation issues, but did not provide further details.
The Internet Archive’s recent financial disclosures suggest growing pressure on the nonprofit. In 2023, it reported $32.7 million in expenses against $23 million in revenue, a shortfall of roughly $9.7 million that highlights the resource-intensive nature of web crawling and data storage.
The organization also suffered a major data breach in October 2024, which took both the main site and the Wayback Machine offline for several weeks. Recovery from the incident may have contributed to ongoing technical challenges.
The Wayback Machine, which archives roughly 500 million webpages daily, has long played a crucial role in preserving digital journalism. With many print newspapers no longer serving as the main historical record, a slowdown in digital archiving could result in significant gaps in news preservation.
Earlier this year, U.S. Senator Alex Padilla of California designated the Internet Archive as part of a national network of more than 1,000 libraries responsible for archiving government documents for public access. The new role underscores both the institution’s importance and the potential consequences of its reduced archiving capacity.