300,000 Domains Tested llms.txt. It Does Nothing for AI Visibility.

SE Ranking tested 300,000 domains and found llms.txt added nothing but noise to AI citation models.

The premise behind llms.txt was straightforward: give AI models a structured file that explains what your site is about, and they'll cite you more often. It's a clean idea. Intuitive, even. It arrived in the SEO world with the kind of momentum that usually means everyone is recommending it before anyone has tested it. Multiple SEO tool vendors added llms.txt generators. Conference talks covered implementation. Agency audit templates included it as a line item.

And based on the largest study to date, that premise appears to be wrong.

SE Ranking analyzed nearly 300,000 domains to test whether having an llms.txt file correlated with higher citation rates across major LLMs. They used both statistical correlation tests and an XGBoost machine learning model. The finding wasn't subtle: no correlation. Removing the llms.txt variable from their model actually improved its prediction accuracy.

The file wasn't neutral. It was noise.

That result should probably settle the debate, but knowing the SEO industry, it won't. So here's what the data actually says and, more usefully, where teams should be putting their time.

What SE Ranking actually measured

The study examined domain-level citation frequency: how often a given domain appeared in LLM-generated responses. This is the metric that matters for AI visibility. Not whether the LLM "read" your file, not whether it crawled your site, but whether it cited you when answering user queries.

SE Ranking's analysis found llms.txt adoption at 10.13% of the domains studied. That's roughly 1 in 10 sites. Between sites that adopted it and sites that didn't, citation rates were statistically identical.
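To make the statistical claim concrete, here's a minimal simulation, entirely on synthetic data, borrowing only the study's reported 10.13% adoption figure. If citation rates are drawn from the same distribution whether or not a site has the file, the point-biserial correlation between the binary flag and the citation metric comes out indistinguishable from zero:

```python
import random
from math import sqrt

random.seed(42)

# Simulate 300,000 domains: ~10.13% have llms.txt (the study's figure),
# but citation rates are drawn from the same distribution either way.
N = 300_000
has_llms_txt = [1 if random.random() < 0.1013 else 0 for _ in range(N)]
citation_rate = [random.gauss(5.0, 2.0) for _ in range(N)]  # arbitrary scale

def pearson(x, y):
    """Plain Pearson correlation (point-biserial when x is binary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

r = pearson(has_llms_txt, citation_rate)
print(f"point-biserial r = {r:.4f}")  # hovers near zero
```

A correlation this close to zero on a sample this large is exactly the "no relationship" result the study reported.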

The XGBoost model provided the more interesting detail. Machine learning models don't just measure correlation. They measure feature importance: how much each variable contributes to the prediction. llms.txt didn't just fail to contribute. Removing it made the model more accurate. In machine learning terms, the variable was introducing noise rather than signal.
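SE Ranking used XGBoost, and a full gradient-boosting setup is beyond a snippet, but the underlying effect, an uninformative feature degrading held-out accuracy, can be sketched with a toy 1-nearest-neighbor classifier. Everything here is synthetic; the noise feature simply stands in for the llms.txt flag:

```python
import random

random.seed(0)

def make_data(n):
    # The label depends only on x1; x2 is pure noise
    # (a stand-in for the llms.txt flag).
    rows = []
    for _ in range(n):
        x1, x2 = random.random(), random.random()
        y = 1 if x1 > 0.5 else 0
        rows.append((x1, x2, y))
    return rows

train, test = make_data(60), make_data(400)

def knn_predict(train, x, use_noise):
    """1-nearest-neighbor by squared distance, with or without x2."""
    def dist(row):
        d = (row[0] - x[0]) ** 2
        if use_noise:
            d += (row[1] - x[1]) ** 2
        return d
    return min(train, key=dist)[2]

def accuracy(use_noise):
    hits = sum(knn_predict(train, (x1, x2), use_noise) == y
               for x1, x2, y in test)
    return hits / len(test)

acc_with = accuracy(use_noise=True)
acc_without = accuracy(use_noise=False)
# The noise feature distorts the distance metric, so dropping it
# typically raises held-out accuracy.
print(f"with noise feature:    {acc_with:.3f}")
print(f"without noise feature: {acc_without:.3f}")
```

The mechanism differs from gradient boosting in the details, but the lesson is the same: a feature that carries no signal doesn't just sit there harmlessly inside a model, it can actively degrade predictions.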

Search Engine Journal's coverage of the study reached the same conclusion: no clear effect on AI citations.

The Fortune 500 already knew

A separate study from ProGEO found that just 7.4% of Fortune 500 companies have implemented llms.txt. That's lower than the 10.13% average across all 300,000 domains, which is worth sitting with for a second.

The companies with the biggest data teams, the most at stake, and presumably the most internal research on AI visibility looked at llms.txt and largely passed. Mid-tier sites adopted at higher rates (10.54%) than the highest-traffic domains (8.27%). The pattern suggests that the less data you have about what actually drives AI citations, the more likely you are to try a file-based shortcut.

It's a bit like leaving your resume on the front lawn because a recruiter might drive by. Low effort, technically harmless, and based on a hope rather than a mechanism.

I'm not saying Fortune 500 SEO teams have perfect judgment. They don't. But when the organizations with the most resources and the most internal data are less likely to adopt something than smaller sites, that's a signal worth paying attention to.

Where LLMs actually look

If llms.txt doesn't drive citations, what does? A PEEC AI citation study we analyzed recently found that LLMs pull citations disproportionately from a handful of platforms: Reddit, YouTube, LinkedIn, and a small set of authoritative publications. This pattern held across multiple LLMs and query categories.

The logic, from what I can piece together, seems to be that LLMs favor content that already demonstrates social proof, discussion, and engagement. A Reddit thread where actual practitioners debate a topic carries more weight than a well-structured page sitting quietly on your domain. A YouTube video with thousands of views and comments signals relevance in a way that a text file in your root directory simply can't replicate.

This connects to something we covered about how you can rank first on Google for everything and still be invisible to ChatGPT. Traditional SEO and AI visibility are diverging, and they seem to be doing it faster than most teams realize. The ranking factors that work for Google's index are not the same signals LLMs use to decide what to cite.

And that gap is going to keep widening. Google's own AI Mode, which has reached 75 million daily active users, is now surfacing loyalty program data and member pricing from Merchant Center across 14 countries. The first-party data you feed Google's commerce surfaces matters more for AI visibility than any file you put in your root directory.

The opportunity cost is the real problem

The issue with llms.txt has never been that it's harmful. It takes maybe 10 minutes to set up. It won't break your site. The problem is the narrative that grew around it.

When the SEO industry embraces a tactic, it doesn't just become a recommendation. It becomes a checklist item, a conference talk, an agency upsell, an audit finding. Teams start prioritizing it. Roadmaps get reshuffled. And when the tactic turns out to have zero measurable impact, all of that effort was displacement. Time and attention that could have gone toward strategies that actually produce results.

My guess is that llms.txt follows the trajectory of meta keywords. Remember those? The meta keywords tag was technically harmless, easy to implement, and remained on SEO checklists for years after Google confirmed it had no ranking effect. I think llms.txt is headed to the same place by Q4 2026.

And to be fair, this is a pattern the SEO industry keeps repeating. Most SEO best practices are, honestly, cargo-cult behavior. Only a handful of things actually move rankings, and even fewer move AI citations. The value is in identifying those few things and going hard on them, not in checking every box someone posted on LinkedIn.

If not llms.txt, then what?

If you're currently spending time on llms.txt or similar file-based AI optimization approaches, here's what I'd suggest as a reallocation:

First, audit which platforms LLMs actually cite for your target keywords. Ask ChatGPT, Perplexity, and Gemini your most important product and category queries. See who gets cited. That tells you where you need to be.
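Once you've collected the cited sources for your key queries (gathering them from each assistant is the manual part), tallying which domains dominate is trivial. The queries and URLs below are purely hypothetical placeholders:

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical audit data: for each query you care about, the list of
# source URLs cited in the AI-generated answer.
cited_sources = {
    "best crm for startups": [
        "https://www.reddit.com/r/startups/comments/abc123",
        "https://www.g2.com/categories/crm",
        "https://example-vendor.com/blog/crm-guide",
    ],
    "crm pricing comparison": [
        "https://www.reddit.com/r/sales/comments/def456",
        "https://www.youtube.com/watch?v=xyz",
    ],
}

# Count citations per domain, normalizing away the "www." prefix.
domain_counts = Counter(
    urlparse(url).hostname.removeprefix("www.")
    for urls in cited_sources.values()
    for url in urls
)

for domain, n in domain_counts.most_common():
    print(f"{n:2d}  {domain}")
```

The domains at the top of that tally are where your content needs to exist.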

Second, build a content presence on the platforms that show up. For most B2B and SaaS companies, that's Reddit and LinkedIn. For consumer brands, add YouTube. Don't just post and leave. Engage in the discussions. LLMs seem to favor content with active engagement around it.

Third, if you're in e-commerce, prioritize first-party data connections with Google. The Merchant Center loyalty expansion to 14 countries and AI Mode surfaces is a concrete example of where first-party data drives visibility in AI-powered results. Set that up this week if you haven't already.

And fourth, track your AI referral traffic separately from organic. Gemini and Perplexity traffic behaves differently from Google organic in terms of bounce rate, time on page, and conversion patterns. You need different benchmarks for AI-referred visitors.

The data from SE Ranking's 300K-domain study is about as clear as it gets. llms.txt has zero measurable impact on AI citations. The question for your team isn't whether to keep the file (it's fine, leave it). It's whether you're willing to reallocate the strategic attention it consumed toward the platforms and data connections where AI visibility is actually being decided.

Somewhere right now a consultant is building an llms.txt strategy deck. The data says it does nothing. Budget accordingly.