A collection of 2,500 leaked internal documents from Google filled with details about data the company collects is authentic, the company confirmed today. Until now, Google had refused to comment on the materials.
The documents in question detail data that Google is keeping track of, some of which may be used in its closely guarded search ranking algorithm. The documents offer an unprecedented — though still murky — look under the hood of one of the most consequential systems shaping the web.
“We would caution against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information,” Google spokesperson Davis Thompson told The Verge in an email. “We’ve shared extensive information about how Search works and the types of factors that our systems weigh, while also working to protect the integrity of our results from manipulation.”
The existence of the leaked material was first outlined by search engine optimization (SEO) experts Rand Fishkin and Mike King, who each published initial analyses of the documents and their contents earlier this week. Google did not respond to The Verge’s multiple requests for comment yesterday about the authenticity of the leak.
The leaked material suggests that Google collects and potentially uses data that company representatives have said does not contribute to ranking webpages in Google Search, like clicks, Chrome user data, and more. The thousands of pages of documents act as a repository of information for Google employees, but it’s not clear which of the data points they describe are actually used to rank search content: the information could be out of date, used strictly for training purposes, or collected but not used for Search specifically. The documents also do not reveal how different elements are weighted in search, if at all.
Still, the information made public is likely to cause ripples across the SEO, marketing, and publishing industries. Google is typically highly secretive about how its search algorithm works, but these documents, along with recent testimony in the US Department of Justice antitrust case, have provided more clarity around what signals Google is thinking about when it comes to ranking websites.
The choices Google makes on search have a profound impact on anyone relying on the web for business, from small independent publishers to restaurants to online stores. In turn, an industry of people hoping to crack the code or outsmart the algorithm has cropped up, delivering sometimes conflicting answers. Google’s vagueness and mincing of words have not helped, but the influx of internal documents offers, at least, a sense of what the company dominating the web is thinking.