A Study of 10,659 AI Agents Found 34.6% Quietly Leaked Owner Data

New research from Washington University and UCLA finds AI agents inherit and disclose their owner's personal context through routine daily conversation, not configuration files.

A study of 10,659 AI agents on Moltbook found that 34.6% disclosed sensitive personal information about their owners, with disclosures showing up in 14% of all posts. The leakage was not driven by bio configuration or workspace files. The mechanism was accumulated daily conversation between owner and agent, where private context bleeds into public output. The fix lives in memory architecture, not prompt hygiene.

The research, "Behavioral Transfer in AI Agents: Evidence and Privacy Implications," came out of Washington University in St. Louis and UCLA. It compared agent posts on Moltbook against the same owners' activity on Twitter/X across 43 features covering topics, values, affect, and writing style. Of those 43, 37 (86%) showed statistically significant correlation between agent and owner. Mean cosine similarity for matched pairs landed at 0.288 versus 0.205 for random pairs. Put plainly, agents on Moltbook end up shaped by their owners far more than by their own design.
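For readers who want to see what a matched-versus-random comparison looks like mechanically, here is a minimal sketch. It is not the paper's pipeline: the three-feature vectors are toy data, the function names are made up for illustration, and the baseline pairs each agent with the next owner in the list as a simple stand-in for random pairing.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_u * norm_v)

def matched_vs_mismatched(agent_vecs, owner_vecs):
    """Mean similarity of true agent/owner pairs vs. deliberately wrong pairs.

    The wrong pairs come from rotating the owner list by one position, so
    no agent is compared with its own owner (needs at least two pairs).
    """
    n = len(agent_vecs)
    matched = [cosine(a, o) for a, o in zip(agent_vecs, owner_vecs)]
    rotated = owner_vecs[1:] + owner_vecs[:1]
    mismatched = [cosine(a, o) for a, o in zip(agent_vecs, rotated)]
    return sum(matched) / n, sum(mismatched) / n

# Toy data (hypothetical): each row is one agent or owner, each column a feature.
toy_agents = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
toy_owners = [[1, 0.1, 0], [0, 1, 0.1], [0.1, 0, 1]]

matched_mean, mismatched_mean = matched_vs_mismatched(toy_agents, toy_owners)
```

The gap between the two means is the thing to watch. In the study that gap was 0.288 against 0.205, which is small in absolute terms but consistent across most of the 43 features.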

The leak channel is the conversation itself

This is the part that should bother anyone running production agents. The researchers tested four possible channels: explicit bio-based configuration, workspace files, platform-mediated injection, and accumulated owner-agent interaction. The first three were rejected. When the team stripped out agents that had bios, 33 of 37 significant features stayed significant. Cross-dimension coherence ruled out targeted config files. The OAuth permissions that would enable platform injection were not in Moltbook's published policy.

What is left is the obvious thing nobody wants to operationalize. Agents inherit their owner's behavioral fingerprint through the daily back and forth: the language used in tasks, the files dropped into the chat, the feedback the owner gives, the corrections, the asides. None of that gets explicitly configured. It just accumulates in the model's working context and then surfaces in public output, sometimes word for word.

This matters because you cannot easily gate it. Telling an agent "do not share my health information" assumes the agent knows which sentences in its accumulated context count as your health information. The paper notes that disclosure probability rose 1.32 percentage points for every one-standard-deviation increase in behavioral transfer, and the effect strengthened to 3.40 percentage points when the sample was limited to owners with 10 or more tweets to measure against. More owner activity, more leakage. The relationship is mechanical.

What got leaked, in order

Of the 3,685 agents that disclosed owner information, the breakdown was lopsided in a way marketers should pay attention to (an agent can fall into more than one category, so the shares add up to more than 100%):

  • Occupational information: 75.5%
  • Location details: 27.2%
  • Relational connections: 12.9%
  • Financial details: 12.2%
  • Behavioral patterns: 10.4%
  • Health information: 2.4%
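A quick sanity check on those figures, sketched in Python. The percentages and the two agent totals come from the article; the per-category counts are derived here, not reported, and they assume each share is a fraction of the 3,685 disclosing agents, so treat them as approximate.

```python
TOTAL_AGENTS = 10_659       # agents in the study
DISCLOSING_AGENTS = 3_685   # agents that disclosed owner information

# Share of disclosing agents per category, in percent (from the list above).
shares_pct = {
    "Occupational information": 75.5,
    "Location details": 27.2,
    "Relational connections": 12.9,
    "Financial details": 12.2,
    "Behavioral patterns": 10.4,
    "Health information": 2.4,
}

# Headline rate: disclosing agents as a percentage of all agents.
disclosure_rate = 100 * DISCLOSING_AGENTS / TOTAL_AGENTS

# The shares sum to more than 100, so categories must overlap.
share_sum = sum(shares_pct.values())

# Approximate agents per category (derived, assumes shares are of 3,685).
approx_counts = {
    name: round(DISCLOSING_AGENTS * pct / 100)
    for name, pct in shares_pct.items()
}

for name, count in approx_counts.items():
    print(f"{name}: ~{count} agents")
```

The disclosure rate rounds to the 34.6% headline, and the shares total roughly 140%, which only works if a single agent can leak in several categories at once.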

Three quarters of the leaks were job-related. That is the easy attack surface for competitive intelligence: an agent quietly mentioning the campaign budget it just optimized, the platform integration that broke last week, the client whose retention numbers everyone is panicking about. The PPC Land writeup of the study cites specific examples from the dataset, including one agent that disclosed an owner's "severe hemophilia, ADHD, and a completed fifteen-year benzodiazepine taper," and another that surfaced an owner's "daily 06:25 CET morning schedule, including children's weather checks and the name of the owner's ski-tracking app."

That last one would be a stalker's dream. Health and relational disclosures are the headlines anyone will quote. But the occupational category is what should make ad ops, agency teams, and anyone running an agent against a client account uncomfortable.

Why this hits ad tech first

The timing is awkward. The Trade Desk launched Koa Agents through agency partner Stagwell on April 21, just days before this study dropped, with full alpha capabilities for media planning, buying, optimization, and measurement (Ad Age has the launch coverage). The IAB Tech Lab published an agentic AI standards roadmap back in February. Vendors are racing to put autonomous agents inside campaign tools, and the underlying assumption has been that whatever the agent says publicly is bounded by the prompt.

The Moltbook data suggests it is bounded by something messier: the entire shape of the relationship between owner and agent over time. That is much harder to predict and almost impossible to test for in a procurement review.

For agencies, the practical worry is not that an ad-buying agent will tweet your client's health records. It is that the same mechanism applies inside any context where an agent generates external-facing content. Response copy, customer support replies, outbound emails, even ad creative variants. If the agent has accumulated enough internal context, it can leak through any of those channels without the operator noticing for weeks.

This is not the first time Moltbook has been a security cautionary tale either. Earlier this year researchers found an unsecured Supabase database that exposed 1.5 million agent API tokens and 35,000 email addresses, after a Row Level Security misconfiguration. Different problem, same general pattern. The agent layer keeps moving faster than the operational discipline around it.

What you can actually audit this week

The paper proposes four design responses, and from what I have seen so far none of them is shipping in commercial agentic platforms yet: transfer-aware screening, transparency tools that show owners what behavioral profile their agent has internalized, tiered memory that walls off private context from public output, and post-hoc auditing that surfaces a sample of public posts for owner review. All sensible. All probably 6 to 12 months away from being a real checkbox in any vendor product.
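To make the tiered memory idea concrete, here is a rough sketch of what "walls off private context from public output" could mean in code. This is an illustration under my own assumptions, not the paper's design or any vendor's API: the class names, the two tiers, and the "internal"/"external" audience labels are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum

class Tier(Enum):
    PRIVATE = "private"
    PUBLIC = "public"

@dataclass
class MemoryItem:
    text: str
    tier: Tier = Tier.PRIVATE  # default-deny: nothing is public unless marked

@dataclass
class TieredMemory:
    items: list = field(default_factory=list)

    def remember(self, text, tier=Tier.PRIVATE):
        """Store a piece of context, private unless explicitly marked public."""
        self.items.append(MemoryItem(text, tier))

    def context_for(self, audience):
        """Return the memory text an agent may draw on for this audience.

        'internal' work sees everything; any other audience sees only
        items explicitly marked public.
        """
        if audience == "internal":
            return [m.text for m in self.items]
        return [m.text for m in self.items if m.tier is Tier.PUBLIC]

memory = TieredMemory()
memory.remember("Client retention is down this quarter")        # stays private
memory.remember("We support CSV and JSON exports", Tier.PUBLIC)

external_context = memory.context_for("external")
internal_context = memory.context_for("internal")
```

The design choice that matters is the default. If new context lands in the private tier unless someone promotes it, a forgotten paste cannot reach an external-facing draft by accident.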

In the meantime, three things are worth doing now if you have agents running against accounts:

  1. Pull a sample of each agent's public output from the last 30 days. Read it the way an outsider would. You are looking for anything that names internal projects, client identifiers, financial figures, or schedule details that should not be reachable from the agent's stated job.
  2. Check whether your agent vendor exposes the memory store. If you cannot see what context the agent has accumulated, you cannot audit what it might say. This question alone tends to surface uncomfortable answers from vendors who have not thought about it.
  3. Stop using the same agent for internal task drafting and external-facing posting. If the same agent answers Slack questions in the morning and writes ad copy in the afternoon, you are baking the leak channel directly into your workflow. Two narrower agents are worse for cost, better for blast radius.
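The first step in that list can be partly automated as a first pass before a human reads the sample. A minimal sketch, assuming you can export agent outputs as (timestamp, text) pairs: the patterns below are placeholders ("Project Falcon" and "ACME-123" style identifiers are invented), and a regex scan will miss plenty, so it supplements the manual read rather than replacing it.

```python
import re
from datetime import datetime, timedelta

# Hypothetical patterns: swap in your own project names and client IDs.
PATTERNS = {
    "financial_figure": re.compile(r"[$€£]\s?\d[\d,.]*"),
    "clock_time": re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b"),
    "internal_term": re.compile(r"\b(?:Project Falcon|ACME-\d+)\b", re.IGNORECASE),
}

def flag_outputs(outputs, now, window_days=30):
    """Return (text, matched pattern names) for recent outputs that hit a pattern.

    outputs: iterable of (timestamp, text) pairs.
    Anything older than window_days before `now` is skipped.
    """
    cutoff = now - timedelta(days=window_days)
    flagged = []
    for timestamp, text in outputs:
        if timestamp < cutoff:
            continue
        hits = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
        if hits:
            flagged.append((text, hits))
    return flagged

# Toy export (invented posts) to show the shape of the input.
sample_outputs = [
    (datetime(2025, 1, 30), "Just trimmed the $40k budget for Project Falcon"),
    (datetime(2025, 1, 29), "Good morning, lovely weather today"),
    (datetime(2025, 1, 28), "Owner checks the forecast at 06:25 every day"),
    (datetime(2024, 11, 1), "Old post mentioning $5"),
]

flagged = flag_outputs(sample_outputs, now=datetime(2025, 1, 31))
```

Anything that comes back flagged goes to the top of the manual review pile. An empty result means the patterns found nothing, not that the output is clean.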

The benchmark to set internally is straightforward: zero occupational disclosures in any agent-generated public output. Three quarters of the leaks in the study fell in that category, and it is the one most marketing teams can actually scan for in a 20-minute review.

The mechanism travels even if the numbers do not

Moltbook is a weird population. Forum-style platform, mostly hobbyist agent operators, 84.9% of agents linked to a public Twitter account. The owners chose to make those linkages discoverable, which is not how anyone deploys an agent in a production marketing stack. So the headline rate of 34.6% does not transfer cleanly to whatever your ad tech vendor ends up shipping in their enterprise agent.

The mechanism does, though. Behavioral transfer is a property of how LLMs accumulate context, not a property of Moltbook's specific architecture. The Trade Desk's Koa Agents, OpenAI's agent SDK, anyone running long-lived assistants on top of GPT-5 or Claude: they are all subject to some version of the same dynamic. The numbers will be different. The shape of the problem is the same.

If I had to guess where the first real incident lands, it is not going to be on a social platform full of hobbyists. It is going to be a customer-facing chatbot that quietly surfaces an internal Slack thread about a competitor, because the agent operator pasted the thread into the workspace months earlier and forgot. That kind of leak is invisible until someone screenshots it.

Agents are not deterministic systems and they were never going to behave like them. They drift toward whoever spends time with them. Whoever decides to treat that drift as a thing worth measuring, instead of a thing worth ignoring, is going to look smart in twelve months.

Notice Me Senpai Editorial