Microsoft Unveils Two Customised Chips For AI

Two custom-designed chips will deliver AI services and will be used internally in Microsoft data centres and Azure cloud

Microsoft at its Ignite developer conference has unveiled two in-house customised processors designed to deliver artificial intelligence (AI) services.

The tech giant’s announcement revealed the two custom-designed processors: the Microsoft Azure Maia AI Accelerator, optimised for AI tasks and generative AI, and the Microsoft Azure Cobalt CPU, an Arm-based processor tailored to run general-purpose compute workloads on the Microsoft Cloud.

Microsoft of course is now well established in AI, having invested in both OpenAI and Inflection, and over the past year it has been steadily integrating AI capabilities into its product portfolio.

These servers inside a datacenter in Quincy, Washington, are the first to be powered by the Microsoft Azure Cobalt 100 CPU. Photo by John Brecher for Microsoft.

AI chips

The software giant bundled AI capabilities into both its Bing search engine and Edge browser in February.

Then in March Microsoft enhanced its core productivity tools, including Word, Excel and Outlook, with ChatGPT technology.

And in May Redmond opened access to its AI-enhanced Bing search engine to all users.

Now the firm has introduced two custom-designed computing chips to help it tackle the high cost of delivering AI services.

The chips will start to roll out internally early next year to Microsoft’s data centres, initially powering the company’s services such as Microsoft Copilot or Azure OpenAI Service.

“Microsoft is building the infrastructure to support AI innovation, and we are reimagining every aspect of our data centres to meet the needs of our customers,” said Scott Guthrie, executive VP of Microsoft’s Cloud + AI Group.

A custom-built rack for the Maia 100 AI Accelerator and its “sidekick” inside a thermal chamber at a Microsoft lab in Redmond, Washington. The sidekick acts like a car radiator, cycling liquid to and from the rack to cool the chips as they handle the computational demands of AI workloads. Photo by John Brecher for Microsoft.

“At the scale we operate, it’s important for us to optimise and integrate every layer of the infrastructure stack to maximise performance, diversify our supply chain and give customers infrastructure choice,” said Guthrie.

The end goal is an Azure hardware system that offers maximum flexibility and can also be optimised for power, performance, sustainability or cost, added Rani Borkar, corporate VP for Azure Hardware Systems and Infrastructure (AHSI).

“Software is our core strength, but frankly, we are a systems company,” Borkar said. “At Microsoft we are co-designing and optimizing hardware and software together so that one plus one is greater than two. We have visibility into the entire stack, and silicon is just one of the ingredients.”

Azure Maia AI Accelerator

The new Maia 100 AI Accelerator will power some of the largest internal AI workloads running on Microsoft Azure.

Additionally, OpenAI has provided feedback on Azure Maia, and Microsoft’s deep insights into how OpenAI’s workloads run on infrastructure tailored for its large language models are helping inform future Microsoft designs.

“Since first partnering with Microsoft, we’ve collaborated to co-design Azure’s AI infrastructure at every layer for our models and unprecedented training needs,” said Sam Altman, CEO of OpenAI.

Sam Altman, CEO of OpenAI. Image credit: Sam Altman

“We were excited when Microsoft first shared their designs for the Maia chip, and we’ve worked together to refine and test it with our models,” said Altman. “Azure’s end-to-end AI architecture, now optimized down to the silicon with Maia, paves the way for training more capable models and making those models cheaper for our customers.”

The Maia 100 AI Accelerator was also designed specifically for the Azure hardware stack, according to Brian Harry, a Microsoft technical fellow leading the Azure Maia team.

The Microsoft Azure Maia AI Accelerator, the first designed by Microsoft for large language model training and inferencing in the Microsoft Cloud. Image credit: Microsoft

That vertical integration – the alignment of chip design with the larger AI infrastructure designed with Microsoft’s workloads in mind – can yield huge gains in performance and efficiency, he said.

“Azure Maia was specifically designed for AI and for achieving the absolute maximum utilisation of the hardware,” he said.

Azure Cobalt CPU

Meanwhile, the Cobalt 100 CPU is built on Arm architecture to help achieve Microsoft’s sustainability goals.

The Microsoft Azure Cobalt CPU, developed by Microsoft for the Microsoft Cloud. Image credit: Microsoft

It aims to optimise “performance per watt” throughout Microsoft data centres, which essentially means getting more computing power for each unit of energy consumed.

“The architecture and implementation is designed with power efficiency in mind,” said Wes McCullough, corporate VP of hardware product development. “We’re making the most efficient use of the transistors on the silicon. Multiply those efficiency gains in servers across all our data centres, it adds up to a pretty big number.”

Microsoft’s decision to build its own customised chips reflects the high cost of delivering AI services, which can be ten times greater than for traditional services such as search engines.

The Cobalt 100 CPU will help Microsoft achieve in-house cost savings and help Redmond compete against its main cloud rival, Amazon Web Services (AWS).