How Google Ranking Works: 14,000 Signals, 3 That Matter
Every SEO guide on the internet wants to explain how Google ranking works. Most of them list 200 factors, slap a "complete guide" label on it, and call it a day. The problem is that treating all signals equally produces the same thing as treating none of them seriously. You end up checking boxes that don't move rankings while ignoring the mechanisms that do.
In May 2024, Google's internal API documentation leaked. Over 14,000 ranking features were exposed. A few months before that, Google engineers testified under oath in the DOJ antitrust trial and contradicted years of public statements about how search works. Between the leak and the trial testimony, we now have a clearer picture of how Google actually ranks pages than at any point in the last decade.
This is a practitioner's guide. Not a checklist. Not 200 factors with equal weighting. The goal is to give you a working mental model of how pages get ranked, where AI fits into the system now, and where to spend your time if you want organic traffic to actually go up.
Google Doesn't Rank Pages the Way Most Guides Describe
Most SEO content treats ranking as a single step: Google looks at your page, evaluates it against some factors, and assigns a position. That's not how it works.
Google's ranking pipeline has three distinct stages, and understanding which stage you're failing at changes what you should fix.
Stage 1: Crawling and Indexing. Google discovers your page (via links, sitemaps, or direct submission), crawls it, and decides whether to index it. If your page isn't indexed, it doesn't exist in Google's world. Roughly 90% of the pages on the internet are never indexed. This stage is binary: you're in or you're out.
Stage 2: Retrieval. When someone searches, Google doesn't evaluate every indexed page. It first retrieves a candidate set of pages that are potentially relevant to the query. This is the step most SEO practitioners don't think about, and it's where many pages lose before the real competition starts. Retrieval uses relatively cheap signals: keyword matching, entity recognition, topic relevance. If your page doesn't make it into the retrieval set, it doesn't matter how good your content is. It never reaches the ranking stage.
Stage 3: Ranking. The candidate set (usually hundreds to thousands of pages) gets ranked by deeper quality signals: link authority, user engagement, content depth, freshness, E-E-A-T proxies. This is where the real competition happens.
The practical implication: if your page isn't showing up at all for a query, you probably have a retrieval problem (content doesn't match the topic or intent well enough), not a ranking problem. If you're on page 2 or 3, you have a ranking problem. These require different fixes. Most teams treat both situations identically, which is why most optimization work doesn't produce results.
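The retrieve-then-rank shape is easier to internalize as a toy model. Everything below is an illustrative sketch: the signal names, weights, and pages are invented for the example, not anything from Google's actual systems.

```python
# Toy sketch of a two-stage retrieve-then-rank pipeline.
# Signal names and weights are illustrative inventions, not Google's.

def retrieve(query_terms, index):
    """Stage 2: cheap relevance filter. A page that doesn't match
    the topic never reaches the ranking stage at all."""
    return [page for page in index if query_terms & page["topics"]]

def rank(candidates):
    """Stage 3: expensive quality scoring, applied only to the
    candidate set that survived retrieval."""
    return sorted(candidates,
                  key=lambda p: 0.4 * p["link_authority"]
                              + 0.4 * p["engagement"]
                              + 0.2 * p["freshness"],
                  reverse=True)

index = [
    {"url": "/guide", "topics": {"seo", "ranking"}, "link_authority": 0.7, "engagement": 0.9, "freshness": 0.5},
    {"url": "/tool",  "topics": {"email"},          "link_authority": 0.9, "engagement": 0.9, "freshness": 0.9},
    {"url": "/blog",  "topics": {"seo"},            "link_authority": 0.3, "engagement": 0.4, "freshness": 0.8},
]

candidates = retrieve({"seo"}, index)  # /tool never makes the set: a retrieval problem
results = rank(candidates)             # /guide beats /blog on quality: a ranking problem
print([p["url"] for p in results])     # ['/guide', '/blog']
```

Note that /tool has the strongest quality scores of the three pages and still finishes nowhere, because it never entered the candidate set. That's the distinction between a retrieval problem and a ranking problem in miniature.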
What the Leaked API Docs and Antitrust Trial Actually Confirmed
The 2024 Google Content Warehouse API leak wasn't a hack. It was internal documentation accidentally published to Google's own GitHub. It sat there for about six weeks before anyone noticed. The documents described over 14,000 ranking features across Google's search systems.
Before the leak, during the DOJ v. Google antitrust trial in 2023, engineers testified under oath about systems they'd previously refused to discuss publicly. The combination gave the SEO industry its first real look under the hood.
Here's what was confirmed, and why it matters.
NavBoost is real, and it's one of Google's most important ranking signals. NavBoost uses a rolling 13-month window of aggregated user click data to refine search results. It tracks "good clicks" (the user clicks and stays), "bad clicks" (the user clicks and immediately bounces), and "last longest clicks" (the final result a user selects and stays with, meaning their search was satisfied). Google had publicly denied for years that click data influenced rankings. The antitrust testimony confirmed that it does, and that engineers internally treat NavBoost as one of the strongest signals they have.
Site authority exists. The leaked documents contain a metric called "siteAuthority," which measures the overall authority of a domain for specific topics. Google representatives had repeatedly denied that domain-level authority was a ranking factor. The internal documentation says otherwise.
Chrome data feeds into the system. User behavior data from Chrome browsers contributes to ranking signals. This was another thing Google had publicly downplayed for years.
PageRank is very much alive. The leaked API docs show that PageRank, including a homepage-specific variant, is still actively used. Google had been telling SEOs for years that PageRank was "one of hundreds of signals" and implying it was diminishing in importance. Internally, it's still a foundation of the link authority system.
The uncomfortable takeaway here is that Google's public communications about how ranking works have been, at best, strategically incomplete. Many of the official statements about core updates and ranking factors need to be read with this context in mind. The "200 ranking factors" listicles that populate the internet are built on a mix of confirmed signals, Google's own marketing, and speculation. The real system is both simpler in its priorities and more complex in its execution than the industry has been teaching.
Get NMS in your inbox. We break down marketing news into specific actions, benchmarks, and the occasional opinion that might get us in trouble. Subscribe free.
The Three Signals That Move Rankings More Than Everything Else
After reading through the leaked documentation, the antitrust testimony, and years of correlation studies, I think the ranking system comes down to three signal categories that matter disproportionately. Everything else is either a retrieval signal (important but different), a minor tiebreaker, or cargo-cult behavior.
1. Content Relevance and Intent Match
This is primarily a retrieval signal, but it's the gate you have to pass through before anything else matters. Google's systems use NLP, entity recognition, and Gemini-based query interpretation to understand what a searcher wants and match it to pages that cover the topic comprehensively.
The practical version: if the top 5 results for your keyword are all "how-to" guides and your page is a product comparison, you have an intent mismatch. No amount of link building will fix that. Check what's ranking, match the format and depth, then compete on quality.
2. Link Authority and Site Reputation
Backlinks remain the strongest off-page signal, but the quality bar has shifted. The leaked documents show Google evaluating link context, source authority, and topical relevance rather than just counting links. One editorial link from a relevant, authoritative site in your industry is worth more than 50 directory listings or guest post swaps.
The siteAuthority signal from the leak suggests Google evaluates your entire domain's reputation, not just individual pages. This means a strong site with mediocre content on a specific page will often outrank a weak site with excellent content on the same topic. It's not fair, but understanding the mechanism helps you decide whether to invest in content improvements or link acquisition for a given keyword.
3. User Engagement and Satisfaction (NavBoost)
This is arguably the most underestimated ranking signal, probably because Google spent years publicly denying it existed. NavBoost tracks whether users click on your result and stay, click and bounce back immediately, or make your result the last click in their search session.
From what I've seen in practice, pages that have a high "last longest click" rate tend to climb in rankings over time, even without additional link building. Pages that consistently generate pogo-sticking (users clicking back to search results within seconds) tend to drop. The implication is pretty direct: if your page doesn't actually satisfy the search query, Google will eventually figure that out, regardless of your backlink profile or technical SEO.
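The click taxonomy is concrete enough to sketch as a toy classifier. The labels come from the leaked documents; the 10-second dwell threshold is my guess for illustration, since the leak names the click types but not the exact cutoffs.

```python
# Toy classification of one search session's clicks in NavBoost's terms.
# The 10-second dwell cutoff is an illustrative assumption, not a
# documented value.

def classify_clicks(session):
    """session: list of (url, dwell_seconds) in click order."""
    labeled = []
    for i, (url, dwell) in enumerate(session):
        if dwell < 10:
            label = "bad_click"           # pogo-stick back to results
        elif i == len(session) - 1:
            label = "last_longest_click"  # search ended satisfied here
        else:
            label = "good_click"          # stayed, but kept searching
        labeled.append((url, label))
    return labeled

session = [("/thin-page", 4), ("/decent-page", 45), ("/winner", 300)]
print(classify_clicks(session))
# [('/thin-page', 'bad_click'), ('/decent-page', 'good_click'),
#  ('/winner', 'last_longest_click')]
```

The page you want to be is /winner: the last click of the session, with the user staying put. Aggregated over 13 months of sessions, that's the pattern the system rewards.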
If you're spending time on anything outside these three categories before you've nailed them, you're probably working on the wrong things.
Where AI Fits Into How Google Ranking Works Now
AI Overviews are the most visible change to Google Search in years, appearing in roughly 48% of all search queries as of early 2026. But the way most SEO content frames them is misleading.
AI Overviews are a presentation layer, not a ranking system. They don't replace the organic results. They sit on top of them. The organic results underneath are still ranked by the same systems described above. AI Overviews are generated by Gemini, which selects information from sources it deems trustworthy, clear, and authoritative. Those sources usually (but not always) overlap with what ranks well organically.
The source selection process is different from organic ranking. Google's AI Overviews don't simply pull from the #1 organic result. Gemini synthesizes information across multiple sources, favoring pages with well-structured content, clear entity definitions, and consistent factual alignment with other authoritative sources. A page can rank in the top 10 organically but still get ignored by the AI Overview if its content isn't structured in a way that's easy for AI to extract clean answers from.
The CTR impact is real and uneven. Studies from early 2026 show organic CTR dropping anywhere from 15% to 61% on queries where AI Overviews appear. But the impact varies dramatically by vertical. E-commerce queries trigger AI Overviews only about 4% of the time. B2B technology queries see them roughly 70% of the time. If you're in an informational content space, the impact is significant. If you're selling products, it's mostly background noise right now.
Being cited is the new visibility play. One finding worth paying attention to: being cited within an AI Overview increases brand clicks by about 35%. So while the overall click volume to organic results is declining on queries with AI Overviews, there's a new form of visibility that has measurable value.
The practical takeaway: you can't "optimize for AI Overviews" the same way you optimize for organic ranking. But you can increase your chances of being cited by writing content that's clearly structured, factually specific, and easy for an AI system to extract answers from. Use clear H2s that mirror common questions. Include specific numbers and definitions. Cite your sources. The llms.txt approach that many sites tried hasn't shown measurable impact, so focus on the content fundamentals instead.
E-E-A-T Is a Quality Framework, Not a Ranking Score
I think the SEO industry's fixation on E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) has become counterproductive. Not because E-E-A-T doesn't matter. It does. But because "optimize for E-E-A-T" has become a consultancy upsell that produces a lot of author bio pages and not much ranking improvement.
Here's the distinction that actually matters: E-E-A-T is a framework Google gives to its human quality raters. It's the lens through which those raters evaluate whether search results are good. It is not a direct ranking signal with a score attached to it. Google has been explicit about this. The quality rater guidelines inform the direction of algorithm development. They don't feed directly into the algorithm as a numerical input.
The signals that approximate E-E-A-T in the actual ranking systems are more specific:
- Author and entity recognition. Google's Knowledge Graph can associate content with known entities (people, organizations). Content from recognized entities in a topic area seems to get a trust boost. This is closer to what people mean when they say "authoritativeness."
- Link context and source reputation. The type of sites linking to you, and the context of those links, signals expertise. An editorial mention from Search Engine Journal carries different weight than a link from a random blog.
- Content accuracy and sourcing. Pages that cite primary sources, include specific data, and demonstrate actual knowledge of a topic perform better in quality evaluations. The rater guidelines specifically flag content that's vague or unsourced.
- Site reputation signals. Reviews, mentions, brand searches, and overall domain trust contribute to the "trustworthiness" component.
What you should actually do: write content where the expertise is obvious from the content itself (specific data, real experience, cited sources), not from a bio box that claims the author is a "thought leader." Build authority by being cited by other authoritative sites. Make sure your claims are verifiable. The bio page and "About the Author" section don't hurt, but they're not the mechanism. The mechanism is whether the content itself demonstrates knowledge that a content mill couldn't produce.
The Ranking Factors That Don't Move the Needle
This is the cargo-cult section. These are things the SEO industry obsesses over that either don't affect ranking or affect it so marginally that your time is better spent elsewhere.
Meta descriptions. Google has confirmed these are not a ranking factor. They can affect click-through rate (which feeds into NavBoost, which does affect ranking), but the meta description text itself isn't used for ranking. Write them for humans, not algorithms.
Exact keyword density. Google's NLP systems understand topics and entities. Whether your keyword appears 3 times or 7 times in a 2,000-word article makes no measurable difference. Use the keyword naturally. Forcing it creates worse content, which hurts the signals that actually matter.
Social media signals. Google has said repeatedly that social signals aren't a ranking factor, and the leaked documents don't contradict this. Social sharing can drive traffic and generate backlinks, which do affect ranking. But the social signal itself is noise.
Domain age. Older domains don't rank higher because they're older. The correlation exists because older sites tend to have more backlinks and content. The age itself isn't the mechanism. The leaked documents do show Google tracking domain registration dates, but age alone isn't what makes the difference.
URL structure and keyword URLs. Whether your URL has the keyword in it, how long it is, or whether you use hyphens vs. underscores makes no meaningful difference to ranking. Clean URLs are good for usability. They're not an SEO lever.
Duplicate content "penalties." There is no penalty for duplicate content. If Google finds duplicate pages, it picks one as the canonical and ignores the others. Your rankings don't get punished. This myth has probably caused more unnecessary panic than almost any other belief in SEO.
The pattern is useful to notice: most of the things SEO practitioners spend time "optimizing" are either not ranking signals at all, or so low in Google's priority stack that they function as tiebreakers at best. The gap between the top three signal categories and everything else is enormous.
The 30-Minute Audit That Tells You Where to Focus
If you manage SEO for a site and want to know where your effort should go, here's what I'd check first. This takes about 30 minutes in Search Console and covers most of the 80/20.
Step 1: Check your indexing coverage. In Google Search Console, go to Pages, then Indexing. If more than 30% of your pages aren't indexed, you have a crawling or indexing problem that needs to be solved before anything else. Common culprits: thin content, orphan pages with no internal links pointing to them, or crawl budget issues on large sites.
Step 2: Identify retrieval vs. ranking problems. Pull your query report from Search Console. Filter for queries with more than 100 impressions. If your average position is 15+ (page 2 or beyond), you probably have a relevance or intent problem. Check what's ranking in positions 1-5 for those queries and compare the content format and depth to yours. If your average position is 4-10, you have a ranking problem, which means authority and engagement are your levers.
Step 3: Find your engagement outliers. Sort your queries by CTR ascending. Pages with high impressions but low CTR are getting retrieved by Google but not clicked by searchers. They might have weak titles or misleading meta descriptions, or they may simply not match what the searcher expected. Fix the title and snippet first. It's the cheapest test you can run.
Step 4: Check your backlink baseline. Use Ahrefs, Semrush, or Google Search Console's Links report. Compare the number and quality of referring domains to the top 3 results for your target keywords. If they have 10x your referring domains from authoritative sources, content improvements alone probably won't close the gap. You need a link acquisition strategy running alongside content work.
Step 5: Run one honest content quality check. Pick your most important page. Read it critically. Does it answer the searcher's query better than the current #1 result? Not "as well as." Better. If not, you've found your next content brief.
Five steps. Half an hour. You now know which of the three core signals (relevance, authority, engagement) is your weakest, and where to focus for the next quarter.
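The triage in Steps 2 and 3 is easy to script once you've exported the query report. A minimal sketch, assuming the rows have already been parsed into dicts; the field names and thresholds are mine, lifted straight from the steps above, not anything from Search Console's own tooling.

```python
# Sort Search Console queries into the audit's three buckets.
# Thresholds (100 impressions, position 15, 1% CTR) mirror the
# steps above; they're starting points, not magic numbers.

def triage(rows, min_impressions=100):
    buckets = {"retrieval_problem": [], "ranking_problem": [], "ctr_problem": []}
    for r in rows:
        if r["impressions"] < min_impressions:
            continue  # not enough data to diagnose
        if r["position"] >= 15:
            buckets["retrieval_problem"].append(r["query"])  # relevance/intent mismatch
        elif r["position"] <= 10 and r["ctr"] < 0.01:
            buckets["ctr_problem"].append(r["query"])        # retrieved, rarely clicked
        elif 4 <= r["position"] <= 10:
            buckets["ranking_problem"].append(r["query"])    # authority/engagement gap
    return buckets

rows = [
    {"query": "how google ranking works", "impressions": 5400, "ctr": 0.021, "position": 7.2},
    {"query": "navboost explained",       "impressions": 900,  "ctr": 0.004, "position": 6.1},
    {"query": "seo audit template",       "impressions": 2100, "ctr": 0.001, "position": 22.4},
    {"query": "rare long tail query",     "impressions": 40,   "ctr": 0.05,  "position": 3.0},
]
print(triage(rows))
```

Queries sitting between positions 10 and 15, or already in the top 3, fall through this sketch deliberately: the gray zone deserves a case-by-case look, and the winners don't need triage.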
FAQ: How Google Ranking Works
How many ranking factors does Google actually use?
The 2024 API leak revealed over 14,000 features in Google's ranking systems. But "ranking factor" is misleading here. Most of these are technical attributes used in the retrieval and processing pipeline, not factors that directly influence your position on page 1. The number of signals that meaningfully differentiate page 1 from page 2 is much smaller, probably in the range of 20-30, with three categories (relevance, authority, engagement) doing most of the heavy lifting.
Does Google use AI to rank pages?
Google uses AI (specifically Gemini) for query interpretation and for generating AI Overviews. But the core ranking systems are a mix of traditional information retrieval, machine learning models, and specific systems like NavBoost. AI Overviews are a presentation layer on top of the organic results, not a replacement for the ranking pipeline.
Are backlinks still important for Google ranking in 2026?
Yes. The leaked API documentation confirms that PageRank and link-based authority signals are still foundational. What's changed is that link quality and contextual relevance matter more than raw quantity. One authoritative, topically relevant link is worth more than dozens of low-quality links from unrelated sites.
Does click-through rate affect Google ranking?
Yes. The NavBoost system, confirmed through the antitrust trial testimony and the API leak, uses a 13-month rolling window of user click data to adjust rankings. Pages that consistently generate satisfied user interactions (the "last longest click") tend to improve over time.
How do AI Overviews affect my website's traffic?
AI Overviews appear on roughly 48% of search queries as of early 2026 and reduce organic CTR by 15-61% depending on the query type. Informational queries are most affected. However, being cited within an AI Overview can increase brand clicks by about 35%. The impact is highly dependent on your industry and query mix.
By Notice Me Senpai Editorial
Want this kind of analysis daily? Notice Me Senpai covers marketing news with specific actions, real benchmarks, and opinions we actually stand behind. Subscribe free.
The 200-factor listicle is comfortable because it gives you 200 things to work on. The problem is that roughly 197 of them are distractions from the three that determine where your page shows up. Relevance gets you into the candidate set. Authority separates you from the competition. And user engagement, the signal Google denied using for over a decade, is the feedback loop that tells the system whether your page actually deserved the click. Everything else is tiebreakers and noise.