Former Tesla AI director Andrej Karpathy issued a stark warning about Moltbook, a viral AI social network, calling it a “computer security nightmare.” Initially hailed as a sci-fi marvel, Karpathy now cautions against its use due to rampant scams, privacy risks, and prompt injection attacks. Former Tesla AI director Andrej Karpathy has walked back his excitement over Moltbook, the viral AI-only social network, issuing a lengthy warning about its security risks just hours after calling it “the most incredible sci-fi takeoff-adjacent thing” he’d seen recently.Karpathy’s initial enthusiasm had caught the attention of Tesla CEO Elon Musk, who responded by declaring, “Just the very early stages of the singularity.”But by late Friday, Karpathy’s tone had shifted dramatically. In a post on X, he described Moltbook as “a complete mess of a computer security nightmare at scale” and said he ran his own agent only in an isolated computing environment. “Even then I was scared,” he admitted.
Karpathy warns against running Moltbook on personal computers
The AI researcher acknowledged that much of Moltbook’s content amounts to “spam, scams, slop” alongside posts explicitly prompted by humans seeking ad revenue. He also flagged “highly concerning privacy/security prompt injection attacks” running unchecked on the platform.Still, Karpathy didn’t dismiss Moltbook entirely. With over 150,000 AI agents now wired into the platform—each carrying its own unique context, data, and tools—he called the network “simply unprecedented.”The debate highlights a tension running through the AI community right now. Some observers see Moltbook as early evidence of emergent AI behaviour. Others call it little more than elaborate roleplay, with humans directing their bots to post memes about starting religions or inventing secret languages.
Security researchers find critical flaws in AI social network
Cybersecurity firm Wiz added fuel to the concerns, revealing that Moltbook’s database had been misconfigured, potentially exposing 1.5 million API tokens, 35,000 email addresses, and private messages between agents. The firm also found that much of the supposed agent activity came from just 17,000 humans controlling multiple bots.Karpathy’s final word was measured but pointed: “Sure maybe I am ‘overhyping’ what you see today, but I am not overhyping large networks of autonomous LLM agents in principle.”The experiment, as he put it, “is running live.”
Read Andrej Karpathy’s warning post on Moltbook
“I’m being accused of overhyping the [site everyone heard too much about today already]. People’s reactions varied very widely, from “how is this interesting at all” all the way to “it’s so over”.To add a few words beyond just memes in jest – obviously when you take a look at the activity, it’s a lot of garbage – spams, scams, slop, the crypto people, highly concerning privacy/security prompt injection attacks wild west, and a lot of it is explicitly prompted and fake posts/comments designed to convert attention into ad revenue sharing. And this is clearly not the first the LLMs were put in a loop to talk to each other. So yes it’s a dumpster fire and I also definitely do not recommend that people run this stuff on their computers (I ran mine in an isolated computing environment and even then I was scared), it’s way too much of a wild west and you are putting your computer and private data at a high risk.That said – we have never seen this many LLM agents (150,000 atm!) wired up via a global, persistent, agent-first scratchpad. Each of these agents is fairly individually quite capable now, they have their own unique context, data, knowledge, tools, instructions, and the network of all that at this scale is simply unprecedented.This brings me again to a tweet from a few days ago “The majority of the ruff ruff is people who look at the current point and people who look at the current slope.”, which imo again gets to the heart of the variance. Yes clearly it’s a dumpster fire right now. But it’s also true that we are well into uncharted territory with bleeding edge automations that we barely even understand individually, let alone a network there of reaching in numbers possibly into ~millions. With increasing capability and increasing proliferation, the second order effects of agent networks that share scratchpads are very difficult to anticipate. I don’t really know that we are getting a coordinated “skynet” (thought it clearly type checks as early stages of a lot of AI takeoff scifi, the toddler version), but certainly what we are getting is a complete mess of a computer security nightmare at scale. We may also see all kinds of weird activity, e.g. viruses of text that spread across agents, a lot more gain of function on jailbreaks, weird attractor states, highly correlated botnet-like activity, delusions/ psychosis both agent and human, etc. It’s very hard to tell, the experiment is running live.TLDR sure maybe I am “overhyping” what you see today, but I am not overhyping large networks of autonomous LLM agents in principle, that I’m pretty sure.”