For three years, Microsoft's artificial intelligence story has been inseparable from OpenAI. The partnership, cemented by a cumulative investment exceeding $13 billion, gave Microsoft early access to the most advanced AI models on the planet, catapulting its Copilot products into the enterprise mainstream and adding hundreds of billions of dollars to its market capitalization. To the outside world, Microsoft's AI strategy was OpenAI.

Mustafa Suleyman wants to change that narrative. In an exclusive sit-down interview with VentureBeat at Microsoft Build 2026, the CEO of Microsoft AI disclosed that a contractual change with OpenAI roughly six months ago granted his division the formal authority to pursue what he openly calls "superintelligence," using Microsoft's own researchers, its own data pipelines, and its own custom silicon.

"We were only sort of set free from our contract with OpenAI about six months ago to formally pursue superintelligence," Suleyman said. "So this is very early days."

The comment, delivered matter-of-factly backstage at the Fort Mason Center here, offers the clearest signal yet of a strategic inflection point unfolding inside the world's most valuable public company. Microsoft is not abandoning OpenAI. But it is building something alongside it and, eventually, something that could stand entirely on its own.

Microsoft's first in-house model family signals a new level of AI ambition

The most tangible evidence of that shift arrived the same day. Microsoft announced a family of seven new AI models developed entirely in-house by its AI Superintelligence Team, spanning reasoning, code generation, image creation, transcription, and voice synthesis. The models, branded under the "MAI" family name, are Microsoft's most ambitious first-party AI release to date.
The flagship, MAI-Thinking-1, is a 35-billion-active-parameter reasoning model that Microsoft says matches leading models in its weight class on key software engineering benchmarks and demonstrates advanced mathematical reasoning. Suleyman emphasized one point repeatedly: the model was trained from scratch on clean, commercially licensed data, without distillation from third-party frontier models. That is a direct, if unstated, contrast to the widespread industry practice of using outputs from competitors' systems to train cheaper alternatives.

"We train our reasoning models from scratch," Suleyman wrote in a blog post accompanying the announcement. "We don't distill from other labs and we don't rely on unlicensed or opaque data."

The rest of the family fills out a multimodal portfolio designed for enterprise deployment: MAI-Code-1-Flash, a lightweight coding model built specifically for GitHub Copilot and VS Code; MAI-Image-2.5, which supports both text-to-image and image editing; MAI-Transcribe-1.5, which Microsoft claims is the most accurate transcription model available, operating across 43 languages; and MAI-Voice-2, a multilingual speech-generation system. All of the models ship through Microsoft Foundry, the company's model-hosting and deployment infrastructure, and for the first time, developers can tune model weights themselves through third-party platforms including OpenRouter, Fireworks, and Baseten.

But Suleyman made clear in the interview that the seven models are a proof of concept, not a finished product. The real project is the lab itself.

"Our job is to make sure that when we look out to 2030 and beyond, we have the capacity not just to buy models from third parties, but to build the absolute frontier, the best models in the world," he said. "That's a long transition."
What "set free" from OpenAI actually means for Microsoft's AI future

To understand what Suleyman means by "set free," you need to understand the unusual contractual architecture that has governed Microsoft's AI efforts for years.

When Microsoft invested billions into OpenAI beginning in 2019, the partnership came with a specific arrangement: OpenAI would build the frontier models, and Microsoft would serve as the exclusive cloud provider, integrating those models into its products and reselling them through Azure. The deal gave Microsoft extraordinary commercial leverage (access to the world's most advanced AI without having to build it), but it also created a dependency. Microsoft was explicitly barred from pursuing its own AGI research, and the agreement even capped how large a model the company could train, restricting it from building systems beyond a certain training-compute threshold measured in FLOPs.

That arrangement was formally renegotiated. As Fortune and Axios reported in November, a revised deal with OpenAI removed those restrictions, clearing the way for Suleyman to launch the MAI Superintelligence Team and pursue what he calls "humanist superintelligence." The result, in Suleyman's telling at the time, was a "best-of-both environment, where we're free to pursue our own superintelligence and also work closely with them."

By the time he sat down with VentureBeat at Build 2026, roughly six months had passed since that self-sufficiency effort formally began. Microsoft had already started shipping in-house models, including MAI-Image-2-Efficient, a lighter-weight image generation model released in April, but the seven MAI models announced at Build are the team's most ambitious release yet: a full multimodal family spanning reasoning, code, image generation, transcription, and voice.

Even so, Suleyman does not view the shift as a rupture with OpenAI. He described Microsoft's current position as one of abundance, not scarcity.

"There's no immediate urgent need to fill a gap in three months' time or six months' time," he said. "We have OpenAI, we have Anthropic, we have thousands of models inside Foundry. So there's already a huge amount of optionality available to us."

The framing is telling. Microsoft's push into first-party frontier models is not born out of a crisis in the OpenAI relationship but out of a strategic calculation: as AI becomes the most consequential technology layer in enterprise computing, the company cannot afford to depend entirely on partners for the foundational capability.

"Over the next five years, we have to be able to produce state-of-the-art frontier-scale models," Suleyman said. "That's our mission."

Suleyman says the shift from chatbots to autonomous AI agents has already begun

If the seven MAI models represent the technical ambition, a new capability called Frontier Tuning represents the commercial logic.

Announced alongside the models at Build, Frontier Tuning allows enterprise customers to customize MAI models using their own proprietary data, workflows, and domain terminology, all within their own secure compliance boundary. The system uses reinforcement learning environments (what Microsoft calls "training gyms for AI") that let agents learn directly from real workplace tasks without affecting production systems.

The results Microsoft shared are striking. An MAI model tuned for Excel reportedly matches GPT 5.4 performance while operating at up to ten times greater efficiency. Early enterprise adopters are seeing similar gains: when tuned for one unnamed organization's exacting standards, the MAI model achieved the highest win rate of any model tested at roughly one-tenth the cost.

Suleyman framed Frontier Tuning as part of a broader evolutionary stage: a move from intelligence to action. "We've basically moved beyond just conversation," he told VentureBeat. "Now we're moving to action."
He introduced a new framework for thinking about that progression: the shift from IQ (factual intelligence) to EQ (emotional intelligence, or the ability to follow tone and style instructions) to what he calls AQ, the "Actions Quotient." Future AI agents, in Suleyman's telling, won't just answer questions. They will log into enterprise software, navigate complex multi-application workflows, and execute tasks across Excel, Word, Teams, Jira, Adobe InDesign, and customer relationship management systems, just as a human employee would.

"You should be able to show up on day one and almost provision credentials to a new AI agent," he said. "The model needs to be able to move across all of these different environments, and that's actually the great strength of Microsoft."

The Build 2026 announcements bore this out in concrete product terms. Microsoft Scout, the company's first "Autopilot" agent, operates as an always-on background assistant built on the open-source OpenClaw technology. It runs with its own governed identity inside Microsoft Entra, so its actions are auditable and attributable. Windows 365 for Agents gives AI agents their own managed Cloud PCs, allowing them to interact directly with applications and browsers inside enterprise environments. And the Foundry platform received major updates, including hosted agents with sub-100-millisecond cold starts, a new Microsoft Agent Framework, and one-click publishing to Teams and Microsoft 365 Copilot.

Why Microsoft believes enterprise data is the next AI training frontier

Suleyman also articulated why he believes Microsoft's position is uniquely defensible, and the argument has less to do with model architecture than with where work actually happens.

"We've sort of hoovered up all of the obvious pools of training data," he said, referring to the industry's early scramble to ingest the open web. "In the next phase, we actually want to be able to give these agents to companies to train on their specific tasks with the data that they have inside of their own big workflows."

The claim is subtle but consequential. The first wave of generative AI was trained on publicly available text: books, websites, Reddit posts, code repositories. That data is now largely exhausted, and its use is increasingly contested in court. The next wave, Suleyman argues, will be trained on enterprise-specific data: the internal workflows, decision traces, and institutional knowledge that define how real organizations operate.

Microsoft, which serves 493 of the Fortune 500 through Azure according to Suleyman, is already embedded inside those workflows through Microsoft 365, Teams, Dynamics 365, and the broader Azure ecosystem. Frontier Tuning is the mechanism that converts that positional advantage into model performance.

"People underappreciate that that's going to be the next domain," Suleyman said.

The early partner list for Frontier Tuning reflects the ambition: Mayo Clinic, where Microsoft is co-creating a frontier AI model for healthcare using de-identified clinical data; EY, which is tuning a tax-advisory agent for deployment to 75,000 professionals globally; Land O'Lakes, where Frontier Tuning delivered what the company's product development scientist called "meaningful improvements in grounded outputs and style compliance"; and Pearson, which is using tuned models to provide learning-science-aligned feedback in its Communication Coach product.

The Mayo Clinic partnership may be the most significant. Microsoft and Mayo Clinic are collaborating to build a healthcare-specific frontier model that combines Mayo's clinical expertise and longitudinal patient insights with Microsoft's AI capabilities. The model will be owned by Mayo Clinic and deployed first within Mayo's own environment before being made available to other organizations through Foundry.
Microsoft's custom AI chips and GPU buying spree reveal the scale of its compute advantage

None of this works without an industrial-scale compute infrastructure, and Suleyman was unusually candid about the hardware economics underlying Microsoft's strategy.

"We are the largest buyer of GPUs on the planet," he said. "We're the largest buyer of GB200s and GB300s in the world."

Microsoft will continue purchasing Nvidia accelerators "for many, many years to come," Suleyman said. But the company is simultaneously building its own custom silicon. Maia 200, Microsoft's second-generation AI accelerator, is already running in production across data centers in Iowa and Arizona, with deployments planned for Italy, Australia, and South Korea. According to Microsoft, Maia 200 delivers the best tokens-per-dollar-per-watt in the company's fleet.

Suleyman put a finer point on the economics in the interview: Maia 200 is 30 percent more cost-efficient than Nvidia's GB200, he said. And when Microsoft co-optimizes its own MAI models to run natively on Maia silicon, the company sees an additional 1.4x improvement in performance per watt.

"It is going to be cheaper in years to come to build on MAI models with Maia 200 and Maia 300 inside of Azure," he said.

That claim, if it holds at scale, has profound implications for the competitive landscape. It means Microsoft is not merely buying its way to AI dominance through Nvidia; it is building a vertically integrated stack in which its own models, running on its own chips, inside its own cloud, tuned on its customers' own data, could offer performance and cost characteristics that no competitor can replicate.

Suleyman rejects the idea that AI models are becoming commodities

Suleyman also pushed back sharply against one of the most popular narratives in Silicon Valley: that AI models are rapidly commoditizing.

"A lot of people are saying models are commoditizing," he said. "I don't think that's true."
His argument hinges on what he calls "quality tokens": the proposition that the composition, curation, licensing, and deduplication of training data matter at least as much as raw scale. Microsoft's new MAI models, he said, were trained on a pre-training mix composed of approximately 50 percent high-quality code, with the remainder drawn from commercially licensed and carefully curated sources. The result, he argued, is a distinct "lineage" of models optimized for coding, reasoning, and agentic behavior, fundamentally different from models optimized for consumer chat, cultural content, or multilingual breadth.

"We're going to see very distinct lineages that reflect different training objectives of different companies," he said. "Quality tokens matter more than just brute-force scale."

This is a strategically important argument for Microsoft to make. If models are commodities, if any lab can match the frontier within months using cheaper compute and distilled training data, then the model layer becomes a race to the bottom, and Microsoft's billions in compute investment offer no durable advantage. But if model quality is a function of data discipline, research depth, and institutional patience, then the lab-building approach Suleyman is pursuing becomes a genuine competitive moat.

He used a specific metaphor to describe that approach, one borrowed from optimization theory: the "hill-climbing machine." The phrase describes a system that continuously improves, cycle after cycle, by applying more compute, better data, and sharper evaluation.

"The goal here is to build what we think of as a hill-climbing machine," he wrote in his blog post. "An organization that can continuously improve, cycle after cycle."

The metaphor is revealing because it describes a process, not a destination. Suleyman is not promising that Microsoft will build the world's best model next quarter. He is arguing that Microsoft is building the system (the research culture, the data pipelines, the silicon co-optimization, the evaluation infrastructure) that will produce progressively better models over years.

Inside Microsoft's five-year plan to become a self-sufficient AI superpower

The strategic picture that emerges from Suleyman's comments, and from the full scope of the Build 2026 announcements, is of a company preparing for a future in which AI capability is not rented from a partner but generated internally, at scale, across every layer of the stack.

Microsoft still needs OpenAI. The partnership continues to power Copilot, Azure AI services, and ChatGPT's infrastructure. Suleyman acknowledged as much, describing Microsoft's portfolio of model providers as a source of strength, not a problem to be solved.

But the direction of travel is unmistakable. With its own frontier models, its own custom silicon, its own reinforcement learning environments for enterprise tuning, and its own autonomous agent infrastructure, Microsoft is constructing a parallel path, one that, by 2030, could make the company a fully self-sufficient frontier AI lab embedded inside the world's largest enterprise software platform.

"Our ultimate goal is what we call Humanist Superintelligence," Suleyman wrote in his blog post. "That means advanced AI systems designed to serve people and organizations, not replace them."

Whether that goal is achievable, or even clearly definable, remains one of the great open questions in technology. And Suleyman expressed more confidence than caution when asked about the trajectory of progress.

"I really think we're at the tip of the iceberg," he said. "The models are so much more powerful than we know how to extract intelligence from them."

But confidence and execution are different things. Building a frontier lab is not an announcement; it is a decade-long commitment that requires retaining elite researchers, maintaining scientific rigor under commercial pressure, and producing results that justify the staggering capital expenditure. Google learned this with DeepMind (which Suleyman himself co-founded in 2010, before joining Microsoft), and even that lab, widely regarded as one of the best in the world, spent years navigating the tension between pure research and product delivery.

Suleyman seemed aware of the contradiction. "If you rush it, you'll screw it up," he said. The sticker on his laptop reads: "Patience and urgency."

It is a paradox that Microsoft now has five years, and several hundred billion dollars, to resolve.
Original source: VentureBeat
