Robert Triggs / Android Authority
If you’ve been weighing up a new laptop purchase, you’ll no doubt have spotted that laptops increasingly boast NPU capabilities that sound an awful lot like the hardware we’ve seen in the very best smartphones for a number of years now. The driving factor is the push for laptops to catch up with mobile AI capabilities, imbuing them with advanced AI features, like Microsoft’s Copilot, that can run securely on-device without needing an internet connection. So here’s everything you need to know about NPUs, why your next laptop might have one, and whether or not you should buy one.
What is an NPU?
NPU is an acronym for Neural Processing Unit. NPUs are dedicated to running mathematical functions associated with neural network/machine learning/AI tasks. While these can be standalone chips, they are increasingly integrated directly into a system-on-chip (SoC) alongside more familiar CPU and GPU components.
NPUs are dedicated to accelerating machine learning, aka AI tasks.
NPUs come in various shapes and sizes and are often called something slightly different depending on the chip designer. You’ll already find different models scattered across the smartphone landscape. Qualcomm has Hexagon inside its Snapdragon processors, Google has its TPUs for both cloud and its mobile Tensor chips, and Samsung has its own implementation for Exynos.
The idea is now taking off in the laptop and PC space, too. For instance, there’s the Neural Engine inside the latest Apple M4, Qualcomm’s Hexagon features in the Snapdragon X Elite platform, and AMD and Intel have begun integrating NPUs into their latest chipsets. While not quite the same, NVIDIA’s GPUs blur the lines, given their impressive number-crunching capabilities. NPUs are increasingly everywhere.
Why do gadgets need an NPU?
As we mentioned, NPUs are purpose-built to handle machine learning workloads (along with other math-heavy tasks). In layman’s terms, an NPU is a very useful, perhaps even essential, component for running AI on-device rather than in the cloud. As you’ve no doubt spotted, AI seems to be everywhere these days, and incorporating support directly into products is a key step in that journey.
A lot of today’s AI processing is done in the cloud, but this isn’t ideal for several reasons. First is latency and network requirements; you can’t access tools when offline or might have to wait for long processing times during peak hours. Sending data over the internet is also less secure, which is a very important factor when using AI that has access to your personal information, such as Microsoft’s Recall.
Put simply, running on-device is preferable. However, AI tasks are very compute-heavy and don’t run well on traditional hardware. You might have noticed this if you’ve tried to generate images via Stable Diffusion on your laptop. It can be painfully slow for more advanced tasks, although CPUs can run a number of “simpler” AI tasks just fine.
NPUs enable AI tasks to run on-device, without the need for an internet connection.
The solution is to adopt dedicated hardware to speed up these advanced tasks. You can read more about what NPUs do later in this article, but the TLDR is that they run AI tasks faster and more efficiently than your CPU can alone. Their performance is often quoted in trillions of operations per second (TOPS), but this isn’t a hugely useful metric because it doesn’t tell you exactly what each operation is doing. Instead, it’s often better to look for figures that tell you how long it takes to process tokens for large models.
Talking of TOPS, smartphone and early laptop NPUs are rated in the tens of TOPS. Broadly speaking, this means they can accelerate basic AI tasks, such as camera object detection to apply bokeh blur or summarize text. If you want to run a large language model or use generative AI to produce media quickly, you’ll want a more powerful accelerator/GPU in the hundreds or thousands of TOPS range.
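To see why a TOPS figure alone says little about real-world speed, here’s a rough back-of-the-envelope sketch. Every number in it (chip TOPS, memory bandwidth, model size) is an illustrative assumption, not a measurement of any real product:

```python
# Why TOPS alone is a poor guide to LLM token throughput.
# All figures below are illustrative assumptions.

def compute_bound_tokens_per_s(tops, params_billions):
    """Peak decode rate if raw math were the limit.
    Generating one token costs roughly 2 operations per weight
    (one multiply + one add)."""
    return (tops * 1e12) / (2 * params_billions * 1e9)

def memory_bound_tokens_per_s(bandwidth_gb_s, params_billions, bytes_per_param=1):
    """Decode rate if memory were the limit: each generated token
    must stream all of the model's weights from memory once."""
    return (bandwidth_gb_s * 1e9) / (params_billions * 1e9 * bytes_per_param)

# Hypothetical 40 TOPS NPU with 100 GB/s memory running a
# 7-billion-parameter model quantized to 1 byte per weight:
print(round(compute_bound_tokens_per_s(40, 7)))   # ~2857 tokens/s on paper
print(round(memory_bound_tokens_per_s(100, 7)))   # ~14 tokens/s: bandwidth wins
```

In this sketch the memory-bound estimate is two orders of magnitude below the compute-bound one, which is why token-per-second figures are more informative than a headline TOPS rating.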
Is an NPU different from a CPU?
A neural processing unit is quite different from a central processing unit due to the type of workload it is designed to run. A typical CPU in your laptop or smartphone is fairly general-purpose to cater to a wide range of applications, supporting broad instruction sets (functions it can perform), various ways to cache and recall functions (to speed up repeating loops), and big out-of-order execution windows (so they can keep doing things instead of waiting).
However, machine learning workloads are different and don’t need quite so much flexibility. They’re much more math-heavy for a start, often requiring repetitive, computationally expensive instructions like matrix multiplication and very quick access to large pools of memory. They also often operate on unusual data formats, such as sixteen-, eight-, or even four-bit integers. By comparison, your typical CPU is built around 64-bit integer and floating-point math (often with additional instructions added on).
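To make those small integer formats concrete, here’s a minimal sketch of symmetric INT8 quantization, the kind of reduced-precision math NPUs are built around. It’s plain Python for clarity; real hardware runs this across huge matrices in parallel, and the values are made up for illustration:

```python
# Sketch of symmetric INT8 quantization: floats are mapped onto
# small integers, math runs in the integer domain, and the result
# is rescaled back. Values are illustrative only.

def quantize_int8(values):
    """Map floats onto the signed 8-bit range [-127, 127]."""
    scale = max(abs(v) for v in values) / 127
    return [round(v / scale) for v in values], scale

weights = [0.82, -1.30, 0.05, 0.61]
inputs  = [1.10,  0.40, -2.20, 0.90]

qw, w_scale = quantize_int8(weights)
qx, x_scale = quantize_int8(inputs)

# Integer multiply-accumulate, then rescale to floating point.
int_dot = sum(a * b for a, b in zip(qw, qx))
approx  = int_dot * w_scale * x_scale

exact = sum(a * b for a, b in zip(weights, inputs))
print(f"float: {exact:.4f}  int8 approx: {approx:.4f}")
```

The integer result lands close to the full-precision one while needing only 8-bit multipliers, which is exactly the accuracy-for-efficiency trade-off the next section describes.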
An NPU is faster and more power efficient at running AI tasks compared to a CPU.
Building an NPU dedicated to massively parallel execution of these specific functions results in faster performance and less power wasted on idle features that are not helpful for the task at hand. However, not all NPUs are equal. Even outside of their sheer number-crunching capabilities, they can be built to support different integer types and operations, meaning that some NPUs are better at running certain models. Some smartphone NPUs, for example, run on INT8 or even INT4 formats to save on power consumption, but you’ll obtain better accuracy from a more capable but power-hungry FP16 model. If you need really advanced compute, dedicated GPUs and external accelerators are still more powerful and format-diverse than integrated NPUs.
As a backup, CPUs can run machine-learning tasks but are often much slower. Modern CPUs from Arm, Apple, Intel, and AMD support the necessary mathematical instructions and some of the smaller quantization levels. Their bottleneck is often just how many of these functions they can run in parallel and how quickly they can move data in and out of memory, which is what NPUs are specifically designed to do.
Should I buy a laptop with an NPU?
While far from essential, especially if you don’t care about the AI trend, NPUs are required for some of the latest features you’ll find in the mobile and PC space.
Microsoft’s Copilot Plus, for example, specifies an NPU with 40 TOPS of performance as its minimum requirement, which you’ll need to use Windows Recall. Unfortunately, Intel’s Meteor Lake and AMD’s Ryzen 8000 chips found in current laptops (at the time of writing) don’t meet that requirement. However, AMD’s newly announced Strix Point Ryzen chips are compatible. You won’t have to wait long for an x64 alternative to Arm-based Snapdragon X Elite laptops, as Strix Point-powered laptops are expected in H1 2024.
Popular PC-class tools like Audacity, DaVinci Resolve, Zoom, and many others are increasingly experimenting with more demanding on-device AI capabilities. While not essential for core workloads, these features are becoming increasingly popular, and AI capabilities should factor into your next purchase if you’re regularly using these tools.
Copilot Plus will only be supported on laptops with a sufficiently powerful NPU.
When it comes to smartphones, features and capabilities vary more widely by brand. For instance, Samsung’s Galaxy AI only runs on its powerful flagship Galaxy S handsets; it hasn’t brought features like chat assist or interpreter to the more affordable Galaxy A55, likely because that phone lacks the necessary processing power. That said, some of Samsung’s features run in the cloud too, though these seemingly aren’t subsidized by its more affordable handsets. Google is similarly inconsistent: you’ll find the very best of its AI extras, such as Video Boost, on the Pixel 8 Pro, yet the Pixel 8 and even the affordable 8a run many of the same AI tools.
Ultimately, AI is here, and NPUs are the key to enjoying on-device features that can’t run on older hardware. That said, we’re still in the early days of AI workloads, especially in the laptop space. Software requirements and hardware capabilities will only grow in the coming years. In that sense, waiting until the dust settles before leaping in won’t hurt.