Saturday, November 23, 2024

Getting infrastructure right for generative AI

Facts, it has been said, are stubborn things. For generative AI, a stubborn fact is that it consumes very large quantities of compute cycles, data storage, network bandwidth, electrical power, and air conditioning. As CIOs respond to corporate mandates to “just do something” with genAI, many are launching cloud-based or on-premises initiatives. But while the payback promised by many genAI projects is nebulous, the costs of the infrastructure to run them are finite and, too often, unacceptably high.

Infrastructure-intensive or not, generative AI is on the march. According to IDC, genAI workloads are projected to grow from 7.8% of the overall AI server market in 2022 to 36% in 2027. In storage, the curve is similar, with growth from 5.7% of AI storage in 2022 to 30.5% in 2027. IDC research finds that roughly half of worldwide genAI expenditures in 2024 will go toward digital infrastructure. IDC also projects that the worldwide infrastructure market (server and storage) for all kinds of AI will roughly double, from $28.1 billion in 2022 to $57 billion in 2027.
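
For readers who want to put those projections on an annual footing, here is a minimal Python sketch of the arithmetic. The dollar and percentage figures are the IDC numbers quoted above; the `cagr` helper and the variable names are illustrative only, and the calculation assumes smooth compounding over the five years, which IDC's forecast does not necessarily imply.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, returned as a fraction (0.10 == 10%)."""
    return (end / start) ** (1 / years) - 1


YEARS = 2027 - 2022  # five-year forecast window

# Worldwide AI server and storage market, in billions of US dollars (IDC).
market_cagr = cagr(28.1, 57.0, YEARS)

# The genAI figures are shares of a market, not dollar amounts,
# so the honest comparison is the change in percentage points.
server_share_gain = 36.0 - 7.8   # genAI share of the AI server market
storage_share_gain = 30.5 - 5.7  # genAI share of AI storage

print(f"AI infrastructure market growth: {market_cagr:.1%} per year")
print(f"genAI share of AI servers: +{server_share_gain:.1f} points")
print(f"genAI share of AI storage: +{storage_share_gain:.1f} points")
```

Under that smooth-compounding assumption, a doubling over five years works out to growth of about 15% a year for the overall AI infrastructure market, while genAI's slice of that market expands by roughly 28 points in servers and 25 points in storage.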

But the sheer quantity of infrastructure needed to process genAI’s large language models (LLMs), along with power and cooling requirements, is fast becoming unsustainable.
