On May 27, 2024, an anonymous source shared a treasure trove of leaked Google Search API documents with Rand Fishkin, co-founder of Moz and CEO of SparkToro. The leak is extensive, covering ranking factors, algorithm updates, and proprietary search technologies. Within days, Google confirmed the leak’s authenticity (albeit indirectly).
Here are 4 takeaways based on what we know so far.
1. What Actually Happened?
To understand what happened, we first have to talk about how Google employees work on Search.
Google Search is built on hundreds of systems that feed input into the ranking algorithm. Authorized Google Search developers interact with those systems directly through internal APIs; testing modifications, combining systems, and experimenting with new ranking factors are all done through them. To keep these APIs usable and organized, Google maintains extensive documentation (more than 2,500 pages, according to the leak!) covering the APIs and how they contribute to the algorithm. This kind of documentation appears to exist for every Google product, and it is kept extremely secure. Usually.
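To make the shape of the leak concrete, here’s a minimal sketch, in Python rather than whatever internal format Google actually uses, of what a documented bundle of ranking signals might look like. Every module, attribute, and weight below is invented for illustration and is not taken from the leak.

```python
from dataclasses import dataclass


@dataclass
class PerDocumentSignals:
    """Hypothetical bundle of per-document ranking signals.

    The leaked documentation describes thousands of named attributes
    spread across many modules; this invented example only illustrates
    the general shape of such a bundle.
    """
    site_quality: float   # invented site-level quality estimate
    host_age_days: int    # invented host-age attribute
    spam_score: float     # invented spam-classifier output


def experimental_score(signals: PerDocumentSignals) -> float:
    # Invented weighting, purely to show how a developer might combine
    # documented attributes while experimenting with ranking factors.
    return signals.site_quality * (1.0 - signals.spam_score)
```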
According to the leaker, who has come forward as Erfan Azimi of EA Eagle Digital, the GitHub repository for this documentation was made public (likely by accident) around March 2024 and was copied to GitHub clone sites during that time. iPullRank has a detailed rundown of the timeline, along with many of the most technical details.
I’ve seen articles that claimed to reveal how Google works before, but this one felt different, mostly because Rand Fishkin is involved. I’ve been following Rand for years; when I got started in 2013, Moz’s Whiteboard Friday videos, where he’d break down SEO strategy, were required viewing at the local SEO agency where I interned. In 2017, I went to my first marketing conference (MozCon), where he not only headlined with a talk about a future without Google traffic (prescient), but wowed at the afterparty by identifying whiskeys on smell and taste alone.
While his new venture at SparkToro is a detour from SEO, I and many others still regard him as one of the most trusted voices on how Google really works.
2. How Has the SEO Community Reacted?
SEO practitioners have traditionally fallen into three groups: white-hat, black-hat, and grey-hat (the space between).
White-hat SEO refers to optimization that strictly hews to Google’s published Webmaster Guidelines for Search and to statements by Google officials like John Mueller, Sundar Pichai, and, once upon a time, Matt Cutts. Statements like “content is king,” “we don’t look at clicks,” and “we don’t whitelist websites” have become part of the canon for white-hat practitioners.
There’s a whole thing about how Google extending a friendly hand to the SEO industry was a deal with the devil, but I’m ready to clock out for the weekend after this publishes, so we’ll save that one for another day.
Black-hat is the opposite: do whatever it takes to rank, whether or not Google (or your audience) likes it. The worst websites come from this camp, including recent examples like obituary spam and, more broadly, the entire affiliate marketing industry.
Grey-hat is where we live: Google’s guidelines are great when they’re (1) proven to work and (2) designed to help real people use websites better. We’ve also long known that many of the signals Google denied over the years were obviously being used for ranking, as experiments from practitioners like AJ Kohn and agencies like Moz have shown.
All that’s to say… if you’ve been operating in any shade darker than Google’s guidelines, this leak is deeply vindicating. It validates the advice I’ve been giving clients for ten years, and it undercuts Google’s strategy of building bridges with SEO practitioners (also a good thing!).
If you were an ardent follower of the Webmaster Guidelines, I’m sorry; you are probably feeling embarrassed right now.
3. Context: Google’s Quixotic AI Quest of 2024
This leak arrives right on the heels of Pizza Glue: that is, the rollout of AI Overviews on search result pages, whose answers have been funny at best and harmful to newborns at worst. In the face of criticism from many parties, website owners in particular, Google’s response has been “it’s not broken, you are.” Wild, right?
To be fair, we’re not afraid of using AI; we’re focused on using it deliberately, in ways that maximize our creativity and strategic insight. Like, for instance, experimental cheese adhesion.
For two decades, SEOs have asked about ranking factors that Google long denied using. But with evidence now in plain sight, like the revelation that Penguin applies its penalty as a binary value, it’s become clear that Google hasn’t been completely transparent.
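For readers outside SEO, “binary” means a site is either demoted in full or not at all, with no sliding scale. Here’s a toy contrast between the two models; the function names, constant, and math are invented for illustration, not taken from the leak.

```python
# Toy contrast between a binary penalty and a graduated one.
# All names and numbers here are invented for illustration.

PENALTY_MULTIPLIER = 0.1  # invented demotion factor


def binary_penalty(base_score: float, penalized: bool) -> float:
    """Binary: the demotion either applies in full or not at all."""
    return base_score * (PENALTY_MULTIPLIER if penalized else 1.0)


def graduated_penalty(base_score: float, spam_probability: float) -> float:
    """Graduated: the score degrades smoothly with a spam estimate."""
    return base_score * (1.0 - spam_probability)
```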
4. Key Takeaways for Site Owners
If you’re working with an agency that has its head on straight, not much is changing. We’re still parsing some of the technical details, and experiments will come over the next few months to tease out previously unknown ranking factors. Here and now, these are our top recommendations:
- Focus on user engagement: The documents show that direct user engagement signals, like clicks, time on page / time on site, and dwell time (gathered in part via anonymized Chrome user data), are hugely important.
- Optimize for the results page: The leak contains extensive metrics on how users interact with SERPs. Quick clicks on results and backtracking (retreating from the first clicked result after several seconds) appear highly important in determining the order of results.
- Whitelists are real for protected categories: Several categories have severely restricted results for health and safety reasons, especially around COVID and elections. Attempting to rank for these queries is futile; that isn’t wrong of Google, but it potentially takes large swaths of searches off the table in some industries.
- Clicks = quality: Google sorts links into three tiers, with quality determined by click metrics. Links from pages that few users ever visit are stuck in the low-quality index and contribute nothing to results; generating clicks may be necessary to earn visibility for pages buried deep in a site’s link structure. (See the sketch after this list for a toy model.)
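To make that last point concrete, here’s a minimal sketch of click-based link tiering under the three-tier reading of the documents. The thresholds, names, and multipliers are all invented; the leak describes the existence of tiers, not their cutoffs or weights.

```python
from dataclasses import dataclass
from enum import Enum


class LinkTier(Enum):
    LOW = 0     # rarely clicked: links pass no value
    MEDIUM = 1
    HIGH = 2    # frequently clicked: links pass full value


@dataclass
class LinkingPage:
    url: str
    monthly_clicks: int  # illustrative engagement metric


def classify_link_tier(page: LinkingPage) -> LinkTier:
    """Assign a tier from click volume. Cutoffs are made up."""
    if page.monthly_clicks < 10:
        return LinkTier.LOW
    if page.monthly_clicks < 500:
        return LinkTier.MEDIUM
    return LinkTier.HIGH


def link_value(page: LinkingPage, base_value: float = 1.0) -> float:
    """Links from the low tier contribute nothing, per the leaked docs."""
    tier = classify_link_tier(page)
    if tier is LinkTier.LOW:
        return 0.0
    return base_value * {LinkTier.MEDIUM: 0.5, LinkTier.HIGH: 1.0}[tier]


if __name__ == "__main__":
    for page in [
        LinkingPage("example.com/buried-archive-page", monthly_clicks=3),
        LinkingPage("example.com/popular-guide", monthly_clicks=2000),
    ]:
        print(page.url, classify_link_tier(page).name, link_value(page))
```

The takeaway is the shape of the incentive, not the numbers: a link only counts if the page carrying it actually gets traffic.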
Conclusion
It’s still early; much more will be said about these documents, and we still don’t know how closely they reflect today’s algorithm or how these factors are weighted. We’re staying on top of it, watching what happens with our clients and what the industry discovers as these ranking factors are put to the test. Existing clients will be kept up to date on the latest recommendations from our team.
We’re focused on how to make sites better for actual humans; part of that is optimizing for Google, and another part is hooking people into other channels that aren’t dominated by an AI-obsessed executive team hyping a bot that can’t count to four. We’re in it for the long haul, whether Google continues to be searchable or not.