Chrome Silently Put Gemini Nano on 500M Devices. Your Site Can Use It for Free.

Chrome silently parks Gemini Nano in OptGuideOnDeviceModel. The 4 GB file sits on roughly 500 million devices and is accessible to any site through the Prompt API.

Google Chrome has been silently downloading a 4 GB Gemini Nano model into a directory called OptGuideOnDeviceModel since 2024, affecting an estimated 500 million devices on Windows, macOS, and Linux. Privacy researcher Alexander Hanff filed the forensic report on May 4, 2026. The undocumented side effect: any website you build can call that model through Chrome's Prompt API at zero inference cost.

The 4 GB file is real, and it is sitting on roughly half a billion devices

Hanff documented the install on April 24, 2026 at 16:38:54 CEST. By 16:53:22, Chrome had pulled down weights.bin version 2025.8.8.1141 and parked it in the user profile. Total wall time: 14 minutes 28 seconds. The file reinstalls automatically if you delete it. Hanff confirmed the behavior on Windows 11, macOS on Apple Silicon, and Ubuntu, which means the affected population is essentially every Chrome user with eligible hardware, roughly 15% of installs by his mid-band estimate.

Google's response, given to Gizmodo, is that Gemini Nano "powers important security capabilities like scam detection and developer APIs without sending your data to the cloud." Both halves of that sentence are true. They are also doing very different work. Scam detection is a Google product. Developer APIs are a distribution channel. The 4 GB ships either way.

The cloud AI Mode does not actually use any of those weights

Here is the tell most coverage missed. The "AI Mode" pill that sits next to Chrome's address bar, the surface that prompted the whole "Google is putting AI in your browser" news cycle, routes to Google's servers. Hanff's investigation confirms it. The local 4 GB only powers a handful of buried features: Help Me Write in textarea right-click menus, tab-group naming, smart paste, and page summarization.

So why ship 4 GB of weights to power features most people never click? Because the model is not really for those features. It is for the Prompt API, the Writer API, the Rewriter API, the Summarizer API, the Translator API, and the Language Detector API, all of which any website can call through Chrome's built-in AI surface. Google seeded the developer ecosystem by pre-installing the inference engine on the user's machine. The end-user features were the cover story.

What your site can actually do with this for free

Per Chrome's developer documentation, Nano is suited for short-form text tasks: summarization, classification, rewriting, structured extraction from short text, lightweight chat, tag generation, proofreading. It is not suited for long-document QA, code generation, or anything reasoning-heavy. Treat it as a fast, free small model on the user's device, not a GPT-4 substitute.

Concretely, that gives a marketer five places to use it without writing a check to OpenAI:

  • Lead form intent classification. Pipe the message field through Nano, classify the lead as sales, support, spam, or billing, then route accordingly. Zero API cost.
  • On-page personalization. Classify the article the user is reading, rewrite the CTA to match. The classification call never leaves the device.
  • Comment moderation. Toxicity and spam pre-filter before the comment hits your queue.
  • Search box query rewriting. Rephrase a vague query into a tighter one before it hits your existing search.
  • Tag generation for UGC. Auto-tag user uploads on the client side.
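
The first item on that list, lead form intent classification, might be sketched roughly as follows. This is a minimal sketch, not Chrome's API: the TextModel interface is a hypothetical stand-in for whatever prompt session your Chrome version exposes, and the prompt wording, the 1000-character cap, and the function names are illustrative assumptions.

```typescript
// Labels taken from the lead-form example in the list above.
const LEAD_LABELS = ["sales", "support", "spam", "billing"] as const;
type LeadLabel = (typeof LEAD_LABELS)[number];

// Hypothetical minimal shape of a prompt session (an assumption, not the
// browser's real type): text in, text reply out.
interface TextModel {
  prompt(input: string): Promise<string>;
}

// Keep the prompt short and constrained so a small model has little room to ramble.
function buildLeadPrompt(message: string): string {
  return [
    `Classify the message as exactly one of: ${LEAD_LABELS.join(", ")}.`,
    "Reply with the label only.",
    `Message: ${message.trim().slice(0, 1000)}`, // 1000-char cap is an arbitrary choice
  ].join("\n");
}

// Small models often add punctuation or extra words. Accept the known label
// that appears earliest in the reply; return null when none is present.
function parseLeadLabel(reply: string): LeadLabel | null {
  const lower = reply.toLowerCase();
  let best: LeadLabel | null = null;
  let bestIndex = Infinity;
  for (const label of LEAD_LABELS) {
    const index = lower.indexOf(label);
    if (index >= 0 && index < bestIndex) {
      best = label;
      bestIndex = index;
    }
  }
  return best;
}

async function classifyLead(message: string, model: TextModel): Promise<LeadLabel | null> {
  return parseLeadLabel(await model.prompt(buildLeadPrompt(message)));
}
```

A null result is the signal to route the lead the way you do today, so a confused model never blocks a form submission.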

The inference is local, which means it works offline, carries no per-call cost, and never sends the user's text through your servers. That last bit matters more than people think. Pre-existing privacy policies that promised "we do not share form contents with AI vendors" suddenly become technically correct again.

The cost math the cloud-LLM line items have not caught up to

A site that sends 1 million short classification calls a month to GPT-4o-mini would pay roughly $30 to $50 in inference, depending on prompt length and structured-output tax. On Nano, the same calls cost you nothing, because the user's CPU and battery are paying. Scale that up to a CMS with 10 million classifications and the savings stop being a rounding error. OpenAI's recent CPC pivot for ChatGPT Ads hinted at the same shift in pricing pressure from the other direction.
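
The $30 to $50 range can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the token counts per call and the per-million-token prices are assumptions I picked for the example, not figures from Hanff's report or from a current price sheet, so swap in your own numbers.

```typescript
// All inputs are assumptions for illustration. Replace them with your own
// traffic numbers and the provider's current list prices.
interface CostInputs {
  callsPerMonth: number;
  inputTokensPerCall: number;
  outputTokensPerCall: number;
  usdPerMillionInputTokens: number;
  usdPerMillionOutputTokens: number;
}

// Monthly cloud bill = input tokens * input price + output tokens * output price.
function monthlyCloudCostUsd(c: CostInputs): number {
  const inputTokens = c.callsPerMonth * c.inputTokensPerCall;
  const outputTokens = c.callsPerMonth * c.outputTokensPerCall;
  return (
    (inputTokens / 1e6) * c.usdPerMillionInputTokens +
    (outputTokens / 1e6) * c.usdPerMillionOutputTokens
  );
}

// Only the share of calls served on-device is free; the rest still go to the cloud.
function monthlySavingsUsd(c: CostInputs, localShare: number): number {
  return monthlyCloudCostUsd(c) * localShare;
}

const shortClassifierWorkload: CostInputs = {
  callsPerMonth: 1e6,
  inputTokensPerCall: 200, // assumed: short prompt plus the form message
  outputTokensPerCall: 20, // assumed: a label and a little padding
  usdPerMillionInputTokens: 0.15, // assumed price
  usdPerMillionOutputTokens: 0.6, // assumed price
};

const cloudBill = monthlyCloudCostUsd(shortClassifierWorkload); // about $42
const savedAtFifteenPercent = monthlySavingsUsd(shortClassifierWorkload, 0.15); // about $6.30
```

Under these assumptions the bill lands inside the quoted range, and the savings scale linearly with both call volume and the share of visitors whose browsers can run the model.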

The catch: the model only runs on eligible hardware. Hanff's investigation puts eligible devices at roughly 15% of installs by his mid-band estimate, which works out to about 500 million globally. You cannot replace your cloud LLM with this. You can layer it as a free first attempt, fall back to cloud when the API reports the model is unavailable, and pocket the difference.
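
The layering described above, local first and cloud as the fallback, could look something like the sketch below. Everything here is a hypothetical wrapper: Classifier, isLocalAvailable, and classifyLocalFirst are names invented for the example, and you would plug your own on-device and cloud calls in behind them.

```typescript
// Hypothetical wrapper type: both the on-device model and the cloud model
// are hidden behind the same text-in, label-out signature.
type Classifier = (text: string) => Promise<string>;

interface RoutedResult {
  label: string;
  source: "local" | "cloud"; // recorded so you can measure the free share later
}

async function classifyLocalFirst(
  text: string,
  isLocalAvailable: () => Promise<boolean>,
  local: Classifier,
  cloud: Classifier,
): Promise<RoutedResult> {
  try {
    if (await isLocalAvailable()) {
      return { label: await local(text), source: "local" };
    }
  } catch {
    // A failed availability check or local call is treated like "unavailable":
    // fall through to the paid path instead of surfacing an error to the user.
  }
  return { label: await cloud(text), source: "cloud" };
}
```

Returning the source alongside the label is a deliberate choice: it is the cheapest way to find out what share of your traffic the free path is actually serving.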

I think most B2B SaaS marketing teams will overcomplicate this. The right starter project is the cheapest one: a one-line classifier on the contact form. If it works, expand. If 15% of leads get pre-classified for free, that is real money over a year.

How the API actually works (and the origin-trial gotcha)

Feature detection is a check for 'ai' in self or, depending on your Chrome channel, 'aiOriginTrial' in self. The Prompt API is currently behind an origin trial, which means production sites need to register for a token and embed it in a meta tag. Without the token, the API silently fails. Skip the registration step and your code will work in your dev console and break in prod.
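
Because the entry point has moved around, one defensive approach is to probe a short list of candidate global names rather than hard-coding one. This is a sketch under assumptions: 'ai' and 'aiOriginTrial' are the names mentioned above, 'LanguageModel' is my assumption based on more recent built-in AI documentation, and detectPromptApi is a hypothetical helper, so verify all three against the docs for the Chrome version you target.

```typescript
// Candidate entry points, newest guess first. "ai" and "aiOriginTrial" come
// from the text above; "LanguageModel" is an assumption to verify in the docs.
const CANDIDATE_GLOBALS = ["LanguageModel", "ai", "aiOriginTrial"];

// Returns the first candidate name present on the given scope, or null.
// The scope is passed in so the check is testable outside a browser.
function detectPromptApi(scope: object): string | null {
  for (const name of CANDIDATE_GLOBALS) {
    if (name in scope) {
      return name;
    }
  }
  return null;
}
```

In a page you would call detectPromptApi(globalThis). A null result means the visitor goes straight to your fallback path, which is also what you will see in production if the origin-trial token is missing.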

The sequence is: check availability, create a session, call prompt or promptStreaming, destroy the session when done. Sessions hold context, which means you can do short multi-turn flows. The memory budget is small, so keep prompts under a few hundred tokens.
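
That sequence can be sketched against a pair of small interfaces. The shapes below are assumptions modeled on the steps just described, including the guess that availability resolves to the string "available" when the model is ready. Real method names and return values may differ by Chrome version, and runTurns is a hypothetical helper.

```typescript
// Assumed shapes, not the browser's real types. Check the current built-in
// AI docs before relying on any of these names.
interface PromptSession {
  prompt(input: string): Promise<string>;
  destroy(): void;
}

interface PromptApi {
  availability(): Promise<string>; // assumed to resolve to "available" when ready
  create(): Promise<PromptSession>;
}

// Runs a short multi-turn flow on one session and always releases it.
// Returns null when the model is not ready, so the caller can fall back.
async function runTurns(api: PromptApi, turns: string[]): Promise<string[] | null> {
  if ((await api.availability()) !== "available") {
    return null;
  }
  const session = await api.create();
  try {
    const replies: string[] = [];
    for (const turn of turns) {
      // Turns run sequentially because the session carries context forward.
      replies.push(await session.prompt(turn));
    }
    return replies;
  } finally {
    session.destroy(); // release the session even if a prompt call throws
  }
}
```

The try/finally is the part worth copying: a session that is never destroyed keeps holding memory on a device that has little to spare.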

Quick warning. The model identifier and namespace have changed twice since the 2024 origin trial. Anything you wrote a year ago needs a re-read against the current built-in AI docs before you ship. Two of the four code snippets I cross-checked online were already deprecated.

The privacy memo your CMO is going to forward you next week

Hanff's legal argument has three parts. ePrivacy Directive Article 5(3) requires consent before storing information on a user's device. GDPR Article 5(1) requires transparency about what is being processed. GDPR Article 25 requires data protection by design and by default, which includes data minimization. Chrome did none of those visibly before the install. The post is being shared in EU policy circles, and someone in your legal team is going to ask whether your site is using on-device AI without consent. Have an answer ready.

The user-side mitigation matters too. Google shipped a Chrome settings toggle in February 2026 that disables the model. Three other ways to kill the install: turn off Chrome AI flags at chrome://flags, push an enterprise policy, or stop using Chrome. Manual deletion does not work; the file re-downloads when eligibility criteria are next met.

If you run any feature that depends on window.ai, log availability checks. The disable toggle is going to flip on for some chunk of EU users this quarter, and you want to know when your fallback rate spikes.
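
Logging those checks does not need much machinery. A minimal sketch, assuming you record one boolean per availability check and ship the rate to whatever analytics you already run; AvailabilityLog is a hypothetical name, not part of any browser API.

```typescript
// Hypothetical in-memory tally of availability checks. In practice you would
// flush these counts to your existing analytics on an interval.
class AvailabilityLog {
  private total = 0;
  private fallbacks = 0;

  // Call once per availability check with the result.
  record(localAvailable: boolean): void {
    this.total += 1;
    if (!localAvailable) {
      this.fallbacks += 1;
    }
  }

  // Share of checks that had to fall back; 0 when nothing has been recorded.
  fallbackRate(): number {
    return this.total === 0 ? 0 : this.fallbacks / this.total;
  }
}
```

Tracked per region and per week, a jump in this rate is the early signal that users are flipping the disable toggle.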

Why I think this changes the next round of browser ad tech

The honest read here is that Google has just made on-device inference a default. Once 500 million browsers can run a small LLM with no per-call cost, the product surfaces that get built on top will not look like API integrations. They will look like browser extensions, content scripts, and embedded widgets that run inference where the user is. Including, eventually, ad tech. A bidder that can classify the page's intent on the device, in 50 milliseconds, without round-tripping to a cloud, has a different cost structure than one that cannot.

I do not think most marketers need to act on this in the next 30 days. The Prompt API is still origin-trial gated, and nothing about your current campaign mix is broken by ignoring it. But if you have a roadmap item that says "personalization by AI," the default assumption that personalization needs a server-side LLM no longer holds for everything. It can run on the visitor's machine, for free, and your only cost is the fallback path for the 85% of devices that do not yet qualify.

That is a different product than the one most teams are scoping.

By Notice Me Senpai Editorial