Anthropic Is Now Training on Elon's Supercluster

The tweet was about twelve words long. Tom Brown posted it from his account. Anthropic is expanding to Colossus2, he wrote. Will use GB200.

If you know who Tom Brown is, you paused. If you don’t: he was first author on the GPT-3 paper in 2020, the paper that started the modern LLM gold rush. He later left OpenAI and became one of Anthropic’s co-founders. And he’s announcing that Anthropic is training on Elon Musk’s supercluster.

That’s the story. Twelve words. Three years of industry drama compressed into a single tweet.

[Figure: Anthropic, labeled 'safety-first lab', on the left, connected by a rental link to xAI's Colossus2, with 100k+ Blackwells, on the right. A 'fees' arrow points right; a 'weights?' arrow points left.]

What Colossus is

Colossus is xAI’s data center in Memphis, Tennessee. When xAI stood it up in 2024, it was briefly the world’s largest GPU cluster - roughly 100,000 H100s, assembled in what Musk described as record time. The speed was partly achieved by using mobile natural gas turbines for power, some of which, according to the Southern Environmental Law Center, operated without proper air permits. Memphis residents who live near the site had opinions about this. Those opinions went mostly unaddressed.

Colossus2 is the upgrade - NVIDIA GB200 NVLink racks, Blackwell architecture. A meaningful step up from H100 in both throughput and memory bandwidth.

The strange part

Here is what the HN submitter aurareturn noted in their first comment, pulling together signals that had been accumulating for a while: xAI gave the entire Colossus1 to Anthropic. They also let Cursor train a model on Colossus2. And now they’re giving Anthropic compute on Colossus2 as well.

That’s not a partnership. That’s a data center company.

xAI launched in 2023 to go heads-up with OpenAI in the frontier model race. Grok is their model. Colossus was built to train it. Less than two years after Colossus came online, the same infrastructure is hosting training runs for competitors.

The comment from zitterbewegung put a specific prognosis on it: “Seems like either Grok is being shut down or it will be ‘powered by anthropic’ soon.” Another user, try-working, went further: “xAI cannot train models. Anthropic cannot do inference. The roles of these two companies have already been decided.”

That’s a sharp read, even if speculative. What’s less speculative: something changed at xAI.

[Figure: Left, xAI in 2023 as a 'frontier lab' competing with OpenAI. Right, xAI in 2026 as a 'compute landlord' hosting Anthropic and Cursor training runs.]

The trust question nobody has an answer to

The thread’s most uncomfortable question came from multiple angles. stevefan1999 asked it plainly: “Aren’t Anthropic afraid of Elon siphoning the model weights out from the network buses?” virgildotcodes asked the longer version: if someone owns the data center, could they observe token streams, exfiltrate model weights, and use them to build competing models?

Nobody in the thread had a satisfying answer. Presumably Anthropic has contractual protections. Whether those protections hold against a datacenter owner who is also a frontier AI competitor is a different kind of question.

There’s also a simpler angle. chinathrow put it this way: “I use Claude daily but I do not want that my spend is going towards Elon.”

This is the tradeoff that’s hard to price. Claude has been the model of choice for users who specifically want to be somewhere other than the OpenAI orbit. Choices like this make that positioning complicated.

What the GB200 does

The NVIDIA GB200 (Blackwell) is a significant architectural step from H100. The GB200 NVL72 rack combines 36 Grace CPUs and 72 Blackwell GPUs in a single unit, with all 72 GPUs sharing one NVLink domain instead of the 8 in a typical H100 server. NVIDIA quotes 1.8 TB/s of NVLink bandwidth per GPU, roughly 2× the H100's 900 GB/s. The rack's headline number, about 1.4 exaFLOPS, is for sparse FP4; the FP8 figure NVIDIA publishes is about 720 petaFLOPS per rack, also with sparsity.

For large model training, the practical effect is being able to run larger parallel configurations with less inter-node synchronization overhead. What took Colossus1 a hundred thousand H100s, Colossus2 can potentially do with meaningfully fewer GPUs, or it can do larger things with the same GPU count.
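To put a rough number on "meaningfully fewer", here is a back-of-envelope sketch. It assumes NVIDIA's published peak FP8 figures with sparsity (about 3,958 TFLOPS per H100 SXM and about 720 PFLOPS per GB200 NVL72 rack) and compares peak arithmetic only. Real training throughput depends on utilization, memory, and interconnect, so treat the result as an order-of-magnitude estimate, not a claim about how either cluster is actually configured. The function name is made up for illustration.

```python
# Assumed vendor peak FP8 figures (with sparsity), in TFLOPS.
# These are datasheet numbers, not measured training throughput.
H100_FP8_TFLOPS = 3_958          # per H100 SXM GPU
NVL72_FP8_TFLOPS = 720_000       # per GB200 NVL72 rack (720 PFLOPS)
GPUS_PER_RACK = 72


def blackwell_equivalent(h100_count: int) -> tuple[int, int]:
    """Return (GPUs, racks) of GB200 NVL72 matching the peak FP8 of h100_count H100s."""
    total_tflops = h100_count * H100_FP8_TFLOPS
    per_gpu_tflops = NVL72_FP8_TFLOPS // GPUS_PER_RACK   # 10,000 TFLOPS per GPU
    # Ceiling division on integers, so partial GPUs and racks round up.
    gpus = -(-total_tflops // per_gpu_tflops)
    racks = -(-total_tflops // NVL72_FP8_TFLOPS)
    return gpus, racks


gpus, racks = blackwell_equivalent(100_000)
print(f"{gpus} Blackwell GPUs, about {racks} NVL72 racks")
# prints: 39580 Blackwell GPUs, about 550 NVL72 racks
```

On these assumptions, the peak FP8 of a hundred thousand H100s fits in roughly 40,000 Blackwell GPUs, a ratio of about 2.5 to 1.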

Anthropic’s next model release will be shaped, in part, by this. The compute is real.

The reading I keep returning to

ReptileMan left one comment: “War makes strange bedfellows.”

Four words. The whole situation.

Anthropic and xAI are ideologically distant. Anthropic was founded partly as a response to what its founders considered insufficient safety focus at OpenAI. Elon Musk has been publicly combative with Anthropic’s backers and the broader AI safety community. Their visions for how AI development should go are not aligned.

And yet here is the deal: Anthropic trains on Colossus, xAI collects the fee, GB200s train Claude. The market found an equilibrium that no mission statement would have predicted.

The question alienreborn asked is the one that lingers: “Why is xAI giving up their advantage?” Training compute is a competitive moat. You don’t rent it to rivals unless you think the moat either isn’t working or isn’t worth defending.

Maybe the model race looks different from inside xAI than it does from outside. Maybe the numbers on Grok’s market share made the decision obvious. Maybe Musk decided infrastructure is the business and models are the product someone else builds on top.

Or maybe it’s simpler than all of that. Building the world’s largest GPU cluster is expensive. Having it run hot while a competitor pays for training time is less expensive.

Discussion on Hacker News · Source: twitter.com · Submitted by aurareturn