What Google's Own Documentation Says About Ranking (vs What SEOs Think)

Google's public documentation and its internal systems tell two different stories about ranking.

Google maintains a public guide to its ranking systems. It names every active system, explains what each one does, and tells you which ones have been retired. This is the closest thing to an official ranking factors document that exists anywhere.

Most SEOs have not read it carefully. I'd go further: a lot of the industry's operating assumptions directly contradict what's written in it.

For years, SEO knowledge has accumulated through a mix of Google's public statements, third-party experiments, conference talks, and guesswork that eventually hardened into conventional wisdom. Some of that wisdom is solid. Some of it is based on things Google explicitly says are not true. And some of it falls into an uncomfortable middle category, where Google's public documentation says one thing but their own leaked internal systems do something different entirely.

In May 2024, a bot accidentally pushed internal Google Content Warehouse API documentation to a public GitHub repository. The dump included 2,596 modules and over 14,000 attributes describing how Google's ranking systems work internally. As we covered in our breakdown of how Google ranking actually works, the leak confirmed some things, contradicted others, and made the whole "just follow Google's guidelines" advice feel a bit naive.

This is a close read of what Google's documentation actually says about ranking, where the industry's understanding diverges from it, and what you should probably do differently because of the gap.

Google Talks About Systems, Not Factors. That Distinction Is Doing a Lot of Work.

Here's something easy to miss. Google's ranking systems guide doesn't list "ranking factors" at all. It lists systems. BERT, MUM, RankBrain, the helpful content system (which was folded into the core ranking algorithm in 2024 and no longer operates independently), the reviews system, the link analysis system, the page experience system. Each system evaluates signals differently, and none of them work in isolation.

The SEO industry, meanwhile, has spent two decades talking about "200 ranking factors" as if there's a spreadsheet somewhere inside Google with 200 rows and a weight column. That framing came from a 2009 Matt Cutts comment that was probably an approximation even then. The API leak revealed over 14,000 attributes. That's not 200 factors with nuance. That's a fundamentally different architecture than most SEO frameworks assume.

The distinction matters because it changes how you should think about optimization. If you believe in a checklist of factors, you optimize each one individually. Title tag, check. Meta description, check. Word count, check. But if Google is running interconnected systems that evaluate patterns and intent across entire pages and sites, then optimizing for isolated signals is a bit like studying for a test by memorizing the answer key from a different exam.

Google's documentation is pretty clear on this. John Mueller has said publicly that word count is not a ranking factor. He's said domain age "helps nothing." He's said Google doesn't count links at the domain level. These aren't ambiguous statements. They're direct denials. And yet, most SEO audit tools still flag pages for being under 1,500 words. Domain authority is still the default metric in half the industry's pitch decks. The gap between what Google documents and what the industry practices is wider than most people realize.

Five Things the Industry Gets Confidently Wrong

Domain authority is not a Google metric. This one is surprisingly persistent. Domain Authority is a score invented by Moz. Domain Rating is a score invented by Ahrefs. Google has no equivalent. Mueller has said "we don't use domain authority" multiple times, going back to at least 2019. The API leak did reveal something called "siteAuthority," but its exact function and weight remain unclear, and Google's public position hasn't changed. If your link building strategy is built around chasing DA scores, you're optimizing for a metric that one company made up and another company says it doesn't use.

Bounce rate is not a ranking signal. Google has said they do not use Google Analytics data for ranking purposes. Mueller has confirmed this directly. The logic makes sense too: not every site has Analytics installed, and Google would be creating an incentive to game a metric they control both sides of. Now, dwell time and click satisfaction are part of ranking through NavBoost (more on that in a moment). But that's different from bounce rate. The users who pogo-stick back to search results are sending a signal. The ones who bounce to their inbox aren't necessarily sending anything at all.
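
To keep the two ideas separate, here's the distinction sketched in code. Everything in it is illustrative: the field names, the session shape, and the 30-second threshold are my own assumptions, not anything Google documents.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    pages_viewed: int
    seconds_on_site: float
    exit_destination: str  # "serp", "closed_tab", "email", ...

def is_bounce(v: Visit) -> bool:
    # Analytics-style bounce: a single-page session, regardless of
    # where the visitor went afterward. Google says it doesn't use this.
    return v.pages_viewed == 1

def is_pogo_stick(v: Visit) -> bool:
    # What a NavBoost-style system could actually observe: a quick
    # return to the search results page. The 30s threshold is invented.
    return v.exit_destination == "serp" and v.seconds_on_site < 30

# Someone who reads one article for ten minutes, then leaves for their
# inbox, "bounced" without sending search any negative signal.
print(is_bounce(Visit(1, 600.0, "email")))      # True
print(is_pogo_stick(Visit(1, 600.0, "email")))  # False
```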

Meta descriptions don't affect rankings. Google has confirmed this repeatedly. They do affect click-through rate on the SERP, which indirectly matters a lot (again, NavBoost). But the meta description tag itself carries zero ranking weight. Google frequently rewrites them anyway, pulling text from the page that better matches the query. If you're spending hours perfecting meta descriptions for ranking purposes, you're working on a cosmetic layer, not a structural one.

Keyword density hasn't mattered since 2011. Matt Cutts told SEOs to stop worrying about it over a decade ago. Google's natural language processing systems (BERT, MUM) understand meaning and context, not keyword frequency. A page that uses a phrase once in the right context can outrank a page that uses it fifteen times. And yet, tools still show keyword density percentages like they mean something.
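
For what it's worth, here is the entire computation those tools are selling, with invented example strings. It's a word-frequency ratio, and nothing in it can tell whether the page answers the query. That's the point: a semantic system can match the second page to the query even though the phrase never appears in it.

```python
import re

def keyword_density(text: str, phrase: str) -> float:
    """What density checkers compute: phrase occurrences / total words."""
    words = re.findall(r"\w+", text.lower())
    hits = len(re.findall(re.escape(phrase.lower()), text.lower()))
    return hits / len(words) if words else 0.0

stuffed = ("Best running shoes. Our running shoes are running shoes "
           "for runners who need running shoes.")
natural = ("A good pair of trainers should cushion your stride and "
           "survive a few hundred miles.")

print(keyword_density(stuffed, "running shoes"))  # ~0.27
print(keyword_density(natural, "running shoes"))  # 0.0
```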

E-E-A-T is not a direct ranking factor. This is the subtlest one. Experience, Expertise, Authoritativeness, and Trustworthiness is a framework from Google's Quality Rater Guidelines, which human evaluators use to assess search quality. It informs how Google thinks about quality in the abstract, and it probably influences how they train their ranking systems over time. But there is no "E-E-A-T score" that your page receives. Mueller has confirmed this. The mistake is treating it as a technical factor you can optimize with author bios and credential pages, when it's more of a design philosophy that Google's systems try to approximate through other signals.

The API Leak Confirmed What Google Denied for Years

Here's where the documentation story gets uncomfortable. For years, Google representatives said clicks were not a ranking signal. The 2024 API leak told a very different story.

The leak referenced a system called NavBoost 84 times across six dedicated modules. Sworn testimony from Google executives during the DOJ antitrust trial described NavBoost as one of Google's most important ranking signals. It uses a rolling 13-month window of aggregated click data from Chrome to refine search results. The documentation referenced specific metrics: "goodClicks," "badClicks," and "lastLongestClicks."
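
To make those metric names concrete, here's a minimal sketch of what classification along those lines could look like. Only the attribute names come from the leaked documentation; the data shape, the thresholds, and the logic are pure assumption on my part, since the leak describes attributes, not algorithms.

```python
from dataclasses import dataclass

@dataclass
class Click:
    dwell_seconds: float    # time on the result before any return to the SERP
    returned_to_serp: bool  # did the user come back and pick another result?
    was_last_click: bool    # final click of the query session?

def classify(click: Click) -> str:
    # Hypothetical mapping onto the leaked attribute names. A short
    # dwell followed by a return to the SERP reads as dissatisfaction;
    # a long final click reads as satisfaction. The 30s and 120s
    # thresholds are invented for illustration.
    if click.returned_to_serp and click.dwell_seconds < 30:
        return "badClicks"
    if click.was_last_click and click.dwell_seconds > 120:
        return "lastLongestClicks"
    return "goodClicks"

print(classify(Click(12.0, True, False)))   # badClicks
print(classify(Click(300.0, False, True)))  # lastLongestClicks
```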

So when Google's public documentation focuses on content quality and relevance, it's telling a true but incomplete story. Content quality matters. But user behavior data, collected at a scale that no third-party tool can replicate, also matters enormously. The systems that handle this data (NavBoost on the collection side, and a system internally called CRAPS that turns click data into ranking adjustments) were among the most heavily referenced in the entire leak.

The leak also revealed a "siteAuthority" attribute, contradicting years of Google saying they don't use site-level authority scores. The "hostAge" attribute tracked domain age, despite Mueller saying it "helps nothing." Now, Google's official response was that the leaked documents were "out of context" and potentially outdated. That's possible. But the disconnect between the public documentation and the internal systems is real, and pretending it doesn't exist isn't a strategy. It's just wishful compliance.

Google responded to the leak by updating some of their documentation in the months that followed. The 2MB indexing limit, for instance, was formally acknowledged. But large parts of the gap remain unaddressed.

What You Should Do With This Information

I realize this could feel paralyzing. If Google's documentation is incomplete and the industry's conventional wisdom is partly wrong, what do you actually optimize for?

A few things seem pretty clear from triangulating the documentation, the leak, and what's been confirmed through testing.

Click satisfaction is probably the single most actionable signal. If NavBoost is as important as Google's own executives testified, then the pages that keep users from returning to the SERP have a structural advantage. That means your title and description need to set accurate expectations, and your content needs to fulfill them within the first scroll. Check your Search Console data: pages with high impressions but low CTR are the first to fix. Pages that win the click but can't hold the visitor matter too, though you'll need your own analytics to spot them, since Search Console doesn't report pogo-sticking.
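
If you want to run that first triage programmatically, the Search Console API exposes the same data as the UI. A minimal sketch, assuming a service account with read access to the property; the credentials file, the property URL, the date range, and the 1,000-impression / 2%-CTR cutoffs are placeholders to adjust:

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Assumes a service account that's been added as a user on the property.
creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
gsc = build("searchconsole", "v1", credentials=creds)

response = gsc.searchanalytics().query(
    siteUrl="sc-domain:example.com",  # placeholder property
    body={
        "startDate": "2026-01-01",
        "endDate": "2026-03-31",
        "dimensions": ["page"],
        "rowLimit": 5000,
    },
).execute()

# Flag pages that earn impressions but lose the click: more likely a
# title/description mismatch than a content problem.
for row in response.get("rows", []):
    (page,) = row["keys"]
    if row["impressions"] > 1000 and row["ctr"] < 0.02:
        print(f"{page}: {row['impressions']:.0f} impressions, CTR {row['ctr']:.1%}")
```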

Stop optimizing for metrics Google says they don't use. Domain authority, keyword density, bounce rate. These are comfortable because they're measurable. But measurability and importance are different things. Redirect the time you spend chasing DA 50+ backlinks toward building content that actually answers the query better than the current top results.

Read the actual ranking systems guide. It takes about 20 minutes. You'll find that some systems you've been optimizing for (like the helpful content system) no longer exist as standalone systems, and others you've been ignoring (like the reviews system or how core updates actually roll out in stages) deserve more attention than they're getting.

Treat E-E-A-T as a content philosophy, not a checklist. Adding an author bio with credentials to a mediocre page doesn't make Google trust it more. Writing from genuine experience on a topic that the rest of your site demonstrates authority on, over time, with real sources and original analysis, probably does. The difference is subtle and it's slow. That's precisely why it works.

The Documentation Gap Is the Game Now

There's an irony in all of this. Google publishes more documentation about how Search works than any other search engine. They have the ranking systems guide, the Search Quality Evaluator Guidelines, the Search Central blog, years of Mueller office hours on YouTube. And still, the gap between what's documented and what's actually happening inside the algorithm is wide enough to build a consulting industry in.

From what I've seen, the SEOs who do best in 2026 aren't the ones who memorize the documentation or the ones who ignore it. They're the ones who read it, understand what it actually claims (and what it deliberately leaves out), cross-reference it against the API leak disclosures, and then build their strategy around the overlap. The documentation tells you what Google wants to reward. The leak tells you what the systems actually measure. The overlap between those two is where rankings actually happen.

The uncomfortable truth is that you probably need to hold both versions of reality in your head at the same time. Google's guidelines are worth following because they describe the direction the algorithm is trying to move. The leak data is worth studying because it describes where the algorithm actually is right now. Optimizing for only one of them leaves you half-prepared.

I don't think that gets simpler anytime soon. If anything, as AI systems take on more of the ranking workload, the gap between documented intent and actual behavior will probably grow wider. The documentation will keep describing ideals. The systems will keep making tradeoffs. The job of SEO in 2026 is reading both, honestly, and working in the space between them.

By Notice Me Senpai Editorial