
▲MCUs with added NPUs can perform advanced AI functions such as segmenting and positioning fast-moving objects, estimating poses, classifying objects, and recognizing voices.
ST's Neural-ART Accelerator™ integrates real-time AI capabilities
Significantly reduces power consumption and minimizes latency, drawing attention from industry and medical institutions
“NPU-equipped MCUs significantly expand what MCUs can do, enabling complex AI tasks that were previously impossible outside the cloud and opening new possibilities for industrial sites, consumer electronics, and smart-city infrastructure. They are gaining attention as a core edge AI solution that guarantees data privacy and service stability while reducing power and network costs.”
According to a white paper titled “Edge AI Innovation: The Power of Neural Processing Units in Modern Microcontrollers” recently published by STMicroelectronics (ST), MCUs are accelerating edge AI innovation through the inclusion of NPUs.
Artificial intelligence (AI) is a technology that learns data, recognizes patterns, and makes predictions. It is being adopted in various fields such as smartphones, wearables, autonomous vehicles, industrial automation, and medical diagnosis.
However, the network latency and bandwidth burden incurred when data is transmitted to cloud servers for processing have been identified as obstacles to expanding AI at the edge.
Edge AI, which moves AI computation to power-constrained local devices, addresses these issues, simultaneously improving IoT devices' response speed, privacy protection, and energy efficiency.
Traditional CPUs are suited to sequential processing and GPUs to parallel computation, but AI workloads demand frequent memory access and large numbers of multiply-accumulate (MAC) operations.
NPUs deliver low-latency, high-efficiency parallel processing by arranging many specialized cores optimized for convolutional neural network operations.
They significantly outperform CPUs and GPUs in TOPS per watt (tera-operations per second per watt), opening new possibilities for edge devices.
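To see why AI workloads are dominated by MAC operations, a rough count for a single convolutional layer is illustrative. The layer shape below is hypothetical, chosen only to show the scale an NPU's parallel MAC arrays must absorb:

```python
# Illustrative count of multiply-accumulate (MAC) operations in one
# convolutional layer -- the operation NPUs are built to parallelize.
# The layer dimensions are hypothetical, not taken from any ST benchmark.

def conv2d_macs(out_h, out_w, out_ch, in_ch, k_h, k_w):
    """Each output element needs in_ch * k_h * k_w MACs."""
    return out_h * out_w * out_ch * in_ch * k_h * k_w

# A single mid-sized layer: 112x112 output, 64 filters, 32 input channels, 3x3 kernel
macs = conv2d_macs(112, 112, 64, 32, 3, 3)
print(f"{macs:,} MACs")  # over 200 million MACs for ONE layer
```

A full network stacks dozens of such layers, which is why per-frame inference on a sequential CPU quickly becomes infeasible.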
Against this backdrop, ST announced the STM32 microcontroller (MCU) family with an integrated neural processing unit (NPU), the Neural-ART Accelerator™, in November 2024.
This product line targets edge devices, aiming to sharply reduce power consumption and minimize latency by running AI inference on-device.

▲Power efficiency of various hardware architectures
This product is attracting attention in the industrial, medical, and smart city sectors, which are seeking to reduce cloud dependency and implement real-time AI capabilities.
At the heart of the Neural-ART Accelerator are a reconfigurable stream processing engine and 8-bit/16-bit fixed-point MAC (multiply-accumulate) units.
It maximizes the computational efficiency of AI models by adjusting computational precision and the number of active cores as needed.
Integrated into the STM32 MCU as an IP block, it can process complex image, voice, and sensor data analysis even in environments with limited power budgets.
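The arithmetic pattern behind those fixed-point MAC units can be sketched in a few lines. This is a minimal illustration of the generic scheme (int8 operands, products summed in a wide integer accumulator, result rescaled to float); the scale values and accumulator width are common conventions, not ST-documented specifics:

```python
# Minimal sketch of 8-bit fixed-point multiply-accumulate. Scales here are
# arbitrary illustrative values; real toolchains derive them from the model.

def quantize(x, scale):
    """Map a float to int8 with saturation."""
    q = round(x / scale)
    return max(-128, min(127, q))

def int8_dot(xs, ws, x_scale, w_scale):
    """Quantize inputs and weights to int8, accumulate products in a wide
    integer accumulator (hardware typically uses 32 bits), rescale to float."""
    acc = 0
    for x, w in zip(xs, ws):
        acc += quantize(x, x_scale) * quantize(w, w_scale)
    return acc * x_scale * w_scale

xs = [0.5, -1.0, 0.25]
ws = [1.0, 0.5, -2.0]
approx = int8_dot(xs, ws, x_scale=0.01, w_scale=0.02)
exact = sum(x * w for x, w in zip(xs, ws))
print(approx, exact)  # both close to -0.5
```

Because the inner loop is pure integer arithmetic, it maps directly onto small, power-efficient hardware units, which is where the NPU's efficiency advantage comes from.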
▲Microcontrollers equipped with NPU accelerators open up a new range of embedded AI possibilities.
Developers can optimize and quantize models built with popular AI frameworks and formats such as Keras, TensorFlow, and ONNX, then convert them into code accelerated by the Neural-ART Accelerator via the STM32Cube.AI desktop application or the ST Edge AI Developer Cloud platform.
The platform automatically maps operators and generates optimized code, cutting development time by eliminating cumbersome manual optimization.
In performance benchmarks, the Neural-ART Accelerator improved MobileNet v1 throughput by up to 120x over a Cortex-M55 core alone, and Tiny YOLO v2 by 134x. In a human-detection example using YOLOv8, it achieved 26 fps at a 1 GHz clock, demonstrating its ability to handle real-time image processing.
For smart-city cameras, Tiny YOLO v2 supports vehicle and pedestrian recognition at 18 fps, significantly enhancing traffic-monitoring efficiency.
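The reported frame rates translate directly into per-frame latency budgets, which is the figure that matters for real-time pipelines. A small sketch of the conversion, using the article's benchmark numbers:

```python
# Per-frame latency implied by the benchmarked frame rates.
# The fps figures come from the article; the conversion is plain arithmetic.

def ms_per_frame(fps):
    return 1000.0 / fps

for model, fps in [("YOLOv8 human detection", 26), ("Tiny YOLO v2 traffic", 18)]:
    print(f"{model}: {fps} fps -> {ms_per_frame(fps):.1f} ms per frame")
```

At 26 fps the device has roughly 38 ms to capture, infer, and post-process each frame, comfortably inside typical real-time monitoring requirements.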

▲Measured using the Neural-ART Accelerator Gen1 with four convolutional arrays at 1 GHz.
According to ABI Research, MCU shipments for edge AI are expected to reach approximately 1.8 billion units by 2030.
Numerous new applications, including industrial predictive maintenance, visitor analytics in smart retail, healthcare monitoring, and agricultural robots, are expected to leverage NPU-based MCU technology.
ST plans to further expand its Neural-ART Accelerator family in the future to enhance in-memory computing, improve energy efficiency, and support a wider range of operators. The goal is to lead the next-generation AIoT market by focusing on developing hardware and toolchains for implementing high-performance, low-power edge AI.
Meanwhile, ST will present "ST Edge AI Solutions Based on the STM32N6" at the "2025 e4ds Tech Day" event, held at the ST Center on September 9. Attendees can explore the STM32 MCU family with the integrated Neural-ART Accelerator™ through the presentation and a demo booth operated by ST. Registration for the event is available at https://www.e4ds.com/conference/techday/.