Anthropic Is Now Training on Elon's Supercluster
The tweet was about twelve words long. Tom Brown posted it from his account. Anthropic is expanding to Colossus2, he wrote. Will use GB200.
If you know who Tom Brown is, you paused. If you don’t: he co-authored the GPT-3 paper in 2020, the paper that started the modern LLM gold rush. He later left OpenAI and became one of Anthropic’s co-founders. And he’s announcing that Anthropic is training on Elon Musk’s supercluster.
That’s the story. Twelve words. Three years of industry drama compressed into a single tweet.
What Colossus is
Colossus is xAI’s data center in Memphis, Tennessee. When xAI stood it up in 2024, it was briefly the world’s largest GPU cluster - roughly 100,000 H100s, assembled in what Musk described as record time. The speed was partly achieved by using mobile natural gas turbines for power, some of which, according to the Southern Environmental Law Center, operated without proper air permits. Memphis residents who live near the site had opinions about this. Those opinions went mostly unaddressed.
Colossus2 is the upgrade - NVIDIA GB200 NVLink racks, Blackwell architecture. A meaningful step up from H100 in both throughput and memory bandwidth.
The strange part
Here is what the HN submitter aurareturn noted in their first comment, pulling together signals that had been accumulating for a while: xAI gave the entire Colossus1 to Anthropic. They also let Cursor train a model on Colossus2. And now they’re giving Anthropic compute on Colossus2 as well.
That’s not a partnership. That’s a data center company.
xAI launched in 2023 to go heads-up with OpenAI in the frontier model race. Grok is their model. Colossus was built to train it. Less than two years later, the same infrastructure is hosting training runs for competitors.
The comment from zitterbewegung put a specific prognosis on it: “Seems like either Grok is being shut down or it will be ‘powered by anthropic’ soon.” Another user, try-working, went further: “xAI cannot train models. Anthropic cannot do inference. The roles of these two companies have already been decided.”
That’s a sharp read, even if speculative. What’s less speculative: something changed at xAI.
The trust question nobody has an answer to
The thread’s most uncomfortable question came from multiple angles. stevefan1999 asked it plainly: “Aren’t Anthropic afraid of Elon siphoning the model weights out from the network buses?” virgildotcodes asked the longer version: if someone owned the data center, could they observe token streams, exfiltrate model weights, use that to build competitive models?
Nobody in the thread had a satisfying answer. Presumably Anthropic has contractual protections. Whether those protections hold against a datacenter owner who is also a frontier AI competitor is a different kind of question.
There’s also a simpler angle. chinathrow put it this way: “I use Claude daily but I do not want that my spend is going towards Elon.”
This is the tradeoff that’s hard to price. Claude has been the model of choice for users who specifically want to be somewhere other than the OpenAI orbit. Choices like this make that positioning complicated.
What the GB200 does
The NVIDIA GB200 (Blackwell) is a significant architectural step from H100. The GB200 NVL72 rack combines 36 Grace CPUs and 72 Blackwell GPUs in a single NVLink domain, with fifth-generation NVLink giving each GPU 1.8 TB/s of bandwidth - double the H100’s 900 GB/s. NVIDIA’s headline figure of roughly 1.4 exaFLOPS per rack is for low-precision FP4 inference; the FP8 figure relevant to training is about half that.
For large model training, the practical effect is being able to run larger parallel configurations with less inter-node synchronization overhead. What took Colossus1 a hundred thousand H100s, Colossus2 can potentially do with meaningfully fewer GPUs, or it can train larger models with the same number of GPUs.
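To make the "fewer units" point concrete, here is a back-of-the-envelope sketch. It takes two numbers from the article (roughly 100,000 H100s in the original Colossus, 72 GPUs per GB200 rack) and asks how many GPUs and racks would match the same aggregate throughput. The per-GPU speedup values are hypothetical placeholders, not benchmarks, and the function names are made up for illustration. The model also assumes perfect scaling, which real training runs do not achieve.

```python
import math

GPUS_PER_RACK = 72        # from the article: 72 Blackwell GPUs per rack
BASELINE_GPUS = 100_000   # from the article: roughly 100,000 H100s


def gpus_for_same_throughput(baseline_gpus: int, per_gpu_speedup: float) -> int:
    """GPUs needed to match the baseline's aggregate throughput.

    Assumes perfect linear scaling, which is an idealization.
    """
    if per_gpu_speedup <= 0:
        raise ValueError("per_gpu_speedup must be positive")
    return math.ceil(baseline_gpus / per_gpu_speedup)


def racks_needed(gpus: int, gpus_per_rack: int = GPUS_PER_RACK) -> int:
    """Whole racks required to house the given number of GPUs."""
    return math.ceil(gpus / gpus_per_rack)


if __name__ == "__main__":
    # Hypothetical per-GPU speedups over H100; not measured figures.
    for speedup in (2.0, 3.0, 4.0):
        gpus = gpus_for_same_throughput(BASELINE_GPUS, speedup)
        print(f"{speedup:.0f}x per GPU: {gpus:,} GPUs in {racks_needed(gpus):,} racks")
```

Even under these generous assumptions the answer is hundreds of racks, which is why the interconnect inside each rack matters as much as the raw per-GPU numbers.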
Anthropic’s next model release will be shaped, in part, by this. The compute is real.
The reading I keep returning to
ReptileMan left one comment: “War makes strange bedfellows.”
Four words. The whole situation.
Anthropic and xAI are ideologically distant. Anthropic was founded partly as a response to what its founders considered insufficient safety focus at OpenAI. Elon Musk has been publicly combative with Anthropic’s backers and the broader AI safety community. Their visions for how AI development should go are not aligned.
And yet here is the deal: Anthropic trains on Colossus, xAI collects the fee, GB200s run Claude. The market found an equilibrium that no mission statement would have predicted.
The question alienreborn asked is the one that lingers: “Why is xAI giving up their advantage?” Training compute is a competitive moat. You don’t rent it to rivals unless you think the moat either isn’t working or isn’t worth defending.
Maybe the model race looks different from inside xAI than it does from outside. Maybe the numbers on Grok’s market share made the decision obvious. Maybe Musk decided infrastructure is the business and models are the product someone else builds on top.
Or maybe it’s simpler than all of that. Building the world’s largest GPU cluster is expensive. Having it run hot while a competitor pays for training time is less expensive.
Discussion on Hacker News · Source: twitter.com · Submitted by aurareturn
Hoang Yell
A software developer and technical storyteller. I read Hacker News every day and retell the best stories here — in English and Vietnamese — for curious people who don't have time to scroll.