Suleyman's Bet: There Is No Wall, Only People Who Can't Read Exponents
- Bonca | Lab

Mustafa Suleyman picked April 8, 2026 to tell the skeptics they're wrong. Again. In an MIT Technology Review op-ed, the Microsoft AI CEO laid out the numbers: training compute for frontier models has grown roughly a trillion-fold since 2010, from about 10¹⁴ flops to north of 10²⁶. Every time someone predicts a wall - data, energy, Moore's Law - the curve keeps bending up.
That's the thesis. Everything else is plumbing.
And the plumbing is where the piece actually gets interesting. Suleyman isn't just waving the scaling flag. He's arguing that the people calling a plateau are reading one variable - transistor density - and missing four others compounding beside it.
The calculator room
His metaphor: imagine AI training as a warehouse full of people mashing calculators. For years, adding compute meant adding more bodies, most of them idle half the time. What changed isn't just faster calculators. It's that every calculator now fires in sync.
Three hardware shifts are doing the work. Nvidia's raw chip throughput jumped from 312 teraflops in 2020 to around 2,500 today - roughly 8x in six years. High-bandwidth memory stopped starving those chips. And NVLink plus InfiniBand stitched hundreds of thousands of GPUs into something that behaves like one machine. Microsoft's own Maia 200 chip, Suleyman claims, delivers about 30% better performance per dollar than its predecessor.
The kicker is what happens when you multiply these together. A language model that took 167 minutes to train on eight GPUs in 2020 now trains in under four. Fifty times faster. Moore's Law alone would have predicted five.
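The multiplication is easy to sanity-check. A minimal sketch using the article's figures; note that attributing the residual entirely to memory bandwidth and interconnect is my illustrative assumption, not a breakdown Suleyman gives:

```python
# Sanity-check the training-speedup arithmetic from the op-ed.
# All input figures come from the article itself.

minutes_2020 = 167    # language-model training run on eight GPUs, 2020
minutes_now = 3.34    # same run today ("under four minutes")

total_speedup = minutes_2020 / minutes_now     # ~50x overall
chip_speedup = 2500 / 312                      # raw chip throughput, ~8x

# Whatever faster chips don't explain must come from the rest of the
# system: HBM feeding the chips, NVLink/InfiniBand tying them together.
# (Assumption: treating the residual as one "system" factor.)
system_speedup = total_speedup / chip_speedup  # ~6x

print(f"total {total_speedup:.0f}x = chips {chip_speedup:.1f}x "
      f"* system {system_speedup:.1f}x")
```

Read that way, the chips account for less than a fifth of the gain on a log scale; the rest is the warehouse learning to fire in sync.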
The software half nobody talks about
Hardware is only half the story, and Suleyman knows the harder sell is the other half. The compute needed to hit a given performance benchmark is halving roughly every eight months - more than twice Moore's pace. Better architectures, better training recipes, better data curation. Software efficiency is quietly doing as much work as the silicon.
Stack these curves and you get his headline claim: effective compute could grow another 1,000x by the end of 2028. Frontier labs are expanding capacity around 4x annually. Global AI-relevant compute is on track to hit 100 million H100-equivalents by 2027. By 2030, he suggests, the industry could be bringing 200 gigawatts online every year - roughly equivalent to the peak electricity demand of the UK, France, Germany and Italy combined.
Read that number twice. It's the part that should give anyone pause.
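The 1,000x figure is just the two curves compounding. A minimal sketch, where the roughly three-year horizon (early 2026 to end of 2028) is my assumption about the window Suleyman means:

```python
# Rough compounding check on the "another 1,000x by end of 2028" claim.
# Growth rates are the article's; the horizon is an assumption.

years = 3.0  # early 2026 -> end of 2028, approximately

# Hardware: frontier labs expanding capacity ~4x per year.
hardware_growth = 4 ** years

# Software: compute needed for a given benchmark halves every 8 months,
# which is equivalent to effective compute doubling every 8 months
# on fixed hardware.
software_growth = 2 ** (years * 12 / 8)

effective = hardware_growth * software_growth
print(f"hardware {hardware_growth:.0f}x * software {software_growth:.0f}x "
      f"= {effective:.0f}x effective compute")
```

The product lands in the low thousands - the same order as the headline claim, which is the point: neither curve alone gets there, but multiplied they do.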
What the compute buys
Suleyman's prize isn't a smarter chatbot. It's agents. Systems that write code for days, run projects for months, negotiate contracts, manage logistics. Semi-autonomous workers, in his framing - the transition from answering questions to doing jobs. He's been drumbeating this since his Inflection days, but now he has Microsoft's $100 billion cluster plans and a declared superintelligence lab behind the claim.
Notice what he doesn't say. The word AGI doesn't appear. Two years ago Suleyman was calling AGI a "conceptual horizon." Now it's "near human-level agents." Italian commentator Stefano Quintarelli caught the rhetorical downshift - a quieter framing for a moment when the hype needs to justify the capex, not promise the singularity.
The part he glosses
Power. Suleyman names it as the binding constraint and moves on. But 200 gigawatts a year is not a footnote - it's a fight with grid operators, regulators, water authorities, and local communities in every jurisdiction where these campuses land. And there's a second friction he doesn't touch: the audience. Recent polling cited alongside the piece shows Gen Z hostility to AI rising from 22% to 31% in a year. The infrastructure is compounding faster than the social license to run it.
So the real question isn't whether the compute curve bends. Suleyman has the receipts on that. It's whether the grid, the permitting process, and the public stay patient long enough for 2028 to arrive on his schedule. What happens to the exponential when it runs into a substation that takes seven years to build?
Sources: MIT Technology Review, Suleyman op-ed (April 8, 2026); Startup Fortune; DigitalToday; Brightcast; Quintarelli blog.