OpenAI eyes $10B datacenter buildout as it loosens Microsoft ties
Large storage vendor opportunity looms
ChatGPT developer OpenAI is reportedly considering building its own datacenters to hold its training data.
A paywalled report in The Information says OpenAI is looking at spending up to $10 billion on storage hardware and software to store around five exabytes of data needed to train its large language models (LLMs).
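Those two figures imply an eye-catching unit cost. A quick back-of-the-envelope check, using only the reported numbers (up to $10 billion, roughly five exabytes) and decimal units, might look like this:

```python
# Back-of-the-envelope check on the reported figures: an "up to $10B"
# budget for roughly five exabytes implies the unit costs below.
# These are illustrative numbers from the report, not vendor pricing.
budget_usd = 10e9                  # reported "up to $10 billion"
capacity_eb = 5                    # reported ~5 exabytes
capacity_tb = capacity_eb * 1e6    # 1 EB = 1,000,000 TB (decimal units)

cost_per_tb = budget_usd / capacity_tb
cost_per_gb = cost_per_tb / 1000

print(f"${cost_per_tb:,.0f}/TB, ${cost_per_gb:.2f}/GB")
# → $2,000/TB, $2.00/GB
```

At roughly $2/GB, the budget would cover far more than raw drive cost, which suggests it also includes controllers, networking, software, and facility spend.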
This storage datacenter would be located near Abilene in Texas and link to the first of the $500 billion Stargate joint-venture datacenters planned by OpenAI, Oracle, SoftBank, and the MGX investment fund.
OpenAI would also triple its overall datacenter capacity in 2025, increasing its compute power eightfold from the end of 2024.
So far, OpenAI has not had its own datacenters, relying instead on Microsoft’s Azure infrastructure. Microsoft has invested $13 billion in OpenAI, and the two had an exclusive partnership, with Microsoft integrating OpenAI’s ChatGPT into its Copilot AI implementation.
In January, their relationship was modified: the pair agreed that Microsoft would have the first shot at providing new datacenter capacity but could decline, and that OpenAI could step outside Azure for its GPU services.
Since then, OpenAI has started using GPU servers from Oracle and CoreWeave, and Microsoft has been looking at LLMs from xAI, Meta, and DeepSeek.
Lately, Microsoft seems to be shrinking its AI datacenter needs. In February, the Windows giant ended leases with at least two US datacenter operators, sized at several hundred megawatts. Investment bank TD Cowen reported that Microsoft ended negotiations to lease 2 GW of datacenter capacity in the US and Europe, and deferred and canceled other leases, “largely driven by the decision to not support incremental OpenAI training workloads.”
TD Cowen thinks Microsoft may have a “potential oversupply position” in AI datacenters. Microsoft said it may strategically pace or adjust its infrastructure in some areas, but its investment plans remain on track.
One interpretation is that Microsoft believes less computationally intensive AI models, such as DeepSeek’s, mean it doesn’t need so much new AI datacenter capacity in the near and mid-term, hence the lease terminations and the loosening of its OpenAI relationship.
OpenAI is also lessening its dependence on Microsoft and Azure, leading to its private datacenter storage idea. If this is more than a negotiating tactic to put pressure on Microsoft, it represents a fantastic opportunity for an AI datacenter storage supplier. A front-runner could be VAST, given its success supplying GPU server cloud infrastructure to CoreWeave, Lambda, and others, and its storage sales to xAI’s Colossus system. DDN could point to its strong Nvidia relationship and xAI presence as well. Other companies in the frame are Pure Storage, with its FlashArray//EXA product, and WEKA.