Monday, December 23, 2024

Will the cost of scaling infrastructure limit AI’s potential?


AI delivers innovation at a pace the world has never experienced. There is a caveat, however: the resources required to store and process data in the age of AI could exceed what is available.

The industry has been grappling with the challenge of applying AI at scale in different ways for some time. As large language models (LLMs) have grown, so have their training and inference requirements. Added to that are concerns about the availability of GPU AI accelerators, as demand has outpaced expectations.

The race is now on to scale AI workloads while controlling infrastructure costs. Both conventional infrastructure providers and an emerging wave of alternative providers are working to process AI workloads faster while reducing costs, energy consumption and environmental impact, in order to meet the rapidly growing needs of enterprises.

“We see many complexities that will come with the scaling of AI,” Daniel Newman, CEO at The Futurum Group, told VentureBeat. “Some with more immediate effect and others that will likely have a substantial impact down the line.”



Newman’s concerns involve the availability of power as well as the actual long-term impact on business growth and productivity.

Is quantum computing a solution for AI scaling?

While one solution to the power issue is to build more power generation capacity, there are many other options. Among them is integrating non-traditional computing platforms, such as quantum computing.

“Current AI systems are still being explored at a rapid pace and their progress can be limited by factors such as energy consumption, long processing times, and high compute power demands,” Jamie Garcia, director of Quantum Algorithms and Partnerships at IBM, told VentureBeat. “As quantum computing advances in scale, quality, and speed to open new and classically inaccessible computational spaces, it could hold the potential to help AI process certain types of data.”

Garcia noted that IBM has a very clear path to scaling quantum systems in a way that will deliver both scientific and business value to users. As quantum computers scale, he said they will have increasing capabilities to process incredibly complicated datasets. 

“This gives them the natural potential to accelerate AI applications that require generating complex correlations in data, such as uncovering patterns that could reduce the training time of LLMs,” Garcia said. “This could benefit applications across a wide range of industries, including healthcare and life sciences, finance, logistics and materials science.”

AI scaling in the cloud is under control (for now)

AI scaling, much like any other type of technology scaling, depends on infrastructure.

“You can’t do anything else unless you go up from the infrastructure stack,” Paul Roberts, director of Strategic Accounts at AWS, told VentureBeat.

Roberts noted that gen AI exploded in late 2022, when ChatGPT first went public. While it might not have been clear in 2022 where the technology was headed, he said that in 2024 AWS has a firm grip on the problem. AWS in particular has invested significantly in infrastructure, partnerships and development to help enable and support AI at scale.

Roberts suggests that AI scaling is in some respects a continuation of the technological progress that enabled the rise of cloud computing.

“Where we are today I think we have the tooling, the infrastructure and directionally I don’t see this as a hype cycle,” Roberts said. “I think this is just a continued evolution on the path, perhaps starting from when mobile devices really became truly smart, but today we’re now building these models on the path to AGI, where we’re going to be augmenting human capabilities in the future.”

AI scaling isn’t just about training, it’s also about inference

Kirk Bresniker, Hewlett Packard Labs Chief Architect, HPE Fellow/VP, has numerous concerns about the current trajectory of AI scaling.

Bresniker sees a potential risk of a “hard ceiling” on AI advancement if those concerns are left unchecked. Given what it takes to train a leading LLM today, he expects that if current processes remain the same, training a single model by the end of the decade would require more resources than the IT industry can likely support.

“We are heading towards a very, very hard ceiling if we continue current course and speed,” Bresniker told VentureBeat. “That’s frightening because we have other computational goals we need to achieve as a species other than to train one model one time.”

The resources required to train increasingly bigger LLMs aren’t the only issue. Bresniker noted that after an LLM is created, inference is run on it continuously, and when that is running 24 hours a day, 7 days a week, the energy consumption is massive.

“What’s going to kill the polar bears is inference,” Bresniker said.

How deductive reasoning might help with AI scaling

According to Bresniker, one potential way to improve AI scaling is to include deductive reasoning capabilities, in addition to the current focus on inductive reasoning.

Bresniker argues that deductive reasoning could potentially be more energy-efficient than current inductive reasoning approaches, which require assembling massive amounts of information and then analyzing it to find a pattern. In contrast, deductive reasoning takes a logic-based approach to infer conclusions. Bresniker noted that deductive reasoning is another faculty that humans have that isn’t yet really present in AI. He doesn’t think deductive reasoning should entirely replace inductive reasoning, but rather that it should be used as a complementary approach.

“Adding that second capability means we’re attacking a problem in the right way,” Bresniker said. “It’s as simple as the right tool for the right job.”
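
To make the inductive/deductive distinction concrete, here is a minimal toy sketch in Python. It is not Bresniker's or HPE's method; the rule format, function names and sample data are all hypothetical, chosen only for illustration. The inductive function estimates a threshold from labeled examples, while the deductive function derives new facts from explicit rules by forward chaining.

```python
from typing import FrozenSet, List, Set, Tuple

# Hypothetical rule format for this sketch: (set of premises, conclusion).
Rule = Tuple[FrozenSet[str], str]


def induce_threshold(examples: List[Tuple[float, bool]]) -> float:
    """Inductive step: infer a decision threshold from labeled data.

    Returns the midpoint between the largest negative example and the
    smallest positive example. Assumes the data is cleanly separable.
    """
    largest_negative = max(x for x, label in examples if not label)
    smallest_positive = min(x for x, label in examples if label)
    return (largest_negative + smallest_positive) / 2


def deduce(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Deductive step: forward chaining over explicit rules.

    Repeatedly applies any rule whose premises are all known facts
    until no new conclusions can be added.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Induction needs examples; its answer depends on the data it has seen.
samples = [(1.0, False), (2.0, False), (4.0, True), (5.0, True)]
threshold = induce_threshold(samples)

# Deduction needs only rules and starting facts, with no training data.
rules = [
    (frozenset({"socrates_is_human"}), "socrates_is_mortal"),
    (frozenset({"socrates_is_mortal", "mortals_die"}), "socrates_dies"),
]
derived = deduce({"socrates_is_human", "mortals_die"}, rules)
```

The toy does not measure energy use. It only shows that the deductive path reaches its conclusion from a handful of rules, whereas the inductive path's result is only as good as the examples supplied.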

Learn more about the challenges and opportunities for scaling AI at VentureBeat Transform next week. Among the speakers to address this topic at VB Transform are Kirk Bresniker, Hewlett Packard Labs Chief Architect, HPE Fellow/VP; Jamie Garcia, Director of Quantum Algorithms and Partnerships, IBM; and Paul Roberts, Director of Strategic Accounts, AWS.
