ARM is making changes to the ARMv8 instruction set to boost the performance of artificial intelligence and machine learning by up to 50 times.
The new DynamIQ technology is a shift in multicore microarchitecture for the industry and the foundation for future ARM Cortex-A processors. The new cluster technology will allow up to eight completely different cores to be used in a big.LITTLE style. ARM promises that the technology will redefine the multi-core experience across a greater range of devices from edge to cloud across a secure, common platform. DynamIQ technology will be pervasive in cars, homes, and of course our smartphones, as well as other connected devices where machine learning is applied to the zettabytes of data they generate.
DynamIQ big.LITTLE carries on the 'right processor for the right task' approach and enables configurations of big and LITTLE processors on a single compute cluster which were previously not possible. For example, 1+3 or 1+7 DynamIQ big.LITTLE configurations with substantially more granular and optimal control are now possible. This offers the flexibility to design SoCs with right-sized, heterogeneous compute that delivers AI performance on the device itself.
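The 'right processor for the right task' idea can be illustrated with a minimal sketch. The code below models a hypothetical 1+7 cluster and a naive assignment rule that sends light tasks to LITTLE cores and heavy tasks to the big core; the core names, capacity values, and selection heuristic are invented for illustration and are not ARM's actual scheduling logic.

```python
# Illustrative sketch only: a hypothetical 1+7 DynamIQ big.LITTLE cluster
# with a naive "right processor for the right task" assignment rule.
# Core names and capacity numbers are invented for this example.
from dataclasses import dataclass

@dataclass
class Core:
    name: str
    capacity: int  # relative performance (bigger = faster, hungrier)

def build_cluster(big: int, little: int) -> list[Core]:
    """Build a single DynamIQ-style cluster mixing big and LITTLE cores."""
    return ([Core(f"big{i}", capacity=1024) for i in range(big)] +
            [Core(f"little{i}", capacity=512) for i in range(little)])

def pick_core(cluster: list[Core], demand: int) -> Core:
    """Pick the smallest core that still meets the task's demand,
    falling back to the fastest core for very heavy tasks."""
    candidates = [c for c in cluster if c.capacity >= demand]
    if candidates:
        return min(candidates, key=lambda c: c.capacity)
    return max(cluster, key=lambda c: c.capacity)

cluster = build_cluster(big=1, little=7)     # a 1+7 configuration
print(pick_core(cluster, demand=300).name)   # light task -> a LITTLE core
print(pick_core(cluster, demand=900).name)   # heavy task -> the big core
```

In a real system this decision is made by the OS scheduler using per-core capacity and load information; the sketch only captures the intent of matching task demand to core size.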
ARM claims that DynamIQ-based systems will deliver up to a 50x boost in AI performance over the next three to five years compared with current systems. For additional AI performance, there is improved access to accelerators via a dedicated low-latency port, resulting in up to 10x quicker response. This improved data transfer speed, alongside higher data bandwidth, bolsters the overall throughput of a Computer Vision (CV) or machine learning (ML) system.
Additionally, safety-related application performance is improved with shorter latency in decision making and actuation across AI scenarios such as Advanced Driver Assistance Systems (ADAS) for autonomous vehicles.
DynamIQ technology also improves energy efficiency by incorporating intelligent power features within the cluster that help extract every ounce of performance.
DynamIQ supports multiple, configurable, performance domains within a single cluster. These domains, consisting of single or multiple ARM CPUs, can scale in performance and power with finer granularity than previous quad-core clusters. This means more fine-tuned ability to stay within the thermal envelope, which results in longer periods of sustained performance.
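The interaction between per-domain scaling and the thermal envelope can be sketched in a few lines. The model below is purely illustrative, with invented domain names, per-step power costs, and a crude "step down the hungriest domain" rule; it is not ARM's power-management algorithm, only the shape of the idea.

```python
# Hedged sketch: multiple performance domains in one cluster, each scaled
# independently so the cluster stays within a shared power budget.
# Domain names, power numbers, and the scaling rule are invented here.

# Power draw per performance step, in watts (illustrative values)
DOMAINS = {"big": 0.9, "little": 0.3}

def scale_domains(levels: dict[str, int], budget: float) -> dict[str, int]:
    """Step down the most power-hungry domain until total power fits
    the budget -- a crude stand-in for finer-grained thermal control."""
    levels = dict(levels)

    def total(lv: dict[str, int]) -> float:
        return sum(DOMAINS[d] * step for d, step in lv.items())

    while total(levels) > budget:
        # reduce the domain currently costing the most power
        d = max(levels, key=lambda d: DOMAINS[d] * levels[d])
        if levels[d] == 0:
            break
        levels[d] -= 1
    return levels

# Both domains start at level 5; the big domain is throttled first
print(scale_domains({"big": 5, "little": 5}, budget=4.0))
```

Because each domain is adjusted separately, the LITTLE cores can keep running at full level while only the big domain is throttled, which is the kind of finer-grained control that extends sustained performance.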
Autonomous CPU memory power management, a way to intelligently adapt the amount of local memory available to the CPUs depending on the type of application running, is another key feature in the technology. Applications that demand a high amount of compute performance, such as Augmented Reality (AR), will have the maximum amount of local memory at their disposal, while lighter applications, such as music streaming, will have a scaled-back amount of local memory, saving CPU memory power.
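The idea behind this feature can be sketched as follows. The workload classes, demand fractions, and portion counts below are all invented for illustration; the real mechanism operates in hardware, not in software like this.

```python
# Hedged sketch of autonomous CPU memory power management: power down
# portions of local memory when the running workload does not need them.
# Workload classes and portion counts are invented for this example.

LOCAL_MEMORY_PORTIONS = 8  # total power-gateable chunks of local memory

# Invented demand fractions per workload class
DEMAND = {
    "augmented_reality": 1.0,   # compute-heavy: keep everything powered
    "web_browsing": 0.5,
    "music_streaming": 0.25,    # light: most local memory can sleep
}

def active_portions(workload: str) -> int:
    """Return how many memory portions to keep powered for a workload."""
    return max(1, round(LOCAL_MEMORY_PORTIONS * DEMAND[workload]))

print(active_portions("augmented_reality"))  # heavy -> all 8 portions on
print(active_portions("music_streaming"))    # light -> only 2 portions on
```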
This will be available as an extension to ARMv8 and in new Cortex-A processor cores later this year. Partners such as NVIDIA have already modified ARM processor cores for AI applications.