ARM announced four new cores for mainstream smartphones and digital TVs: two Mali GPUs and the video and display cores that accompany them.
ARM's new Mali Multimedia Suite features the Mali-G52 and Mali-G31 GPUs, the Mali-D51 display processor and the Mali-V52 video processor, aimed expressly at the DTV and mainstream mobile markets.
Mali-G52 brings a new execution engine design that delivers greater performance in a smaller area, supporting use cases such as machine learning (ML) and augmented reality (AR) in mainstream devices. Mali-G31 is ARM's smallest GPU to date to support the latest graphics APIs and complex user interfaces for DTV.
Even on mainstream phones, we expect the camera to detect faces automatically in order to focus accurately, and we expect to be able to search the photo gallery for pictures of our cat; you may not have realised that both are facilitated by ML. Arm recently launched Project Trillium, which includes a dedicated ML processor, to address these workloads. In mainstream and cost-sensitive devices, however, it often isn't viable to dedicate sufficient silicon to such a processor, and for smaller workloads the GPU or CPU is often the right choice.
Mali-G52 features a redesigned execution engine, the part of the processor that handles the arithmetic, to deliver the higher performance needed for ML within a small silicon budget. It has twice the number of lanes per execution engine yet is only 22% bigger than its predecessor, which doubles the compute performance available for complex content. Mali-G52 also introduces Int8 dot product support. On-device ML inference makes extensive use of general matrix multiplication, but it often doesn't need FP16 or FP32 precision. In many cases Int8 is just as effective and much more efficient, and with the ability to handle four Int8 operations per cycle in each lane of each execution engine, Mali-G52 achieves nearly 4 times the ML performance of its predecessor on current image detection and other ML benchmark tests.
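To make the Int8 dot product idea concrete, here is a minimal Python sketch of how a quantized dot product can be evaluated four element pairs at a time, with the result accumulated in a wider integer. The function names, the sample vectors and the shared scale factor are all hypothetical and chosen for illustration; this models the arithmetic only and is not a description of the Mali-G52 instruction set or driver API.

```python
def quantize_int8(values, scale):
    """Map floats to signed 8-bit integers using one scale factor (assumed scheme)."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def dot4_int8(a4, b4, acc=0):
    """4-way Int8 dot product accumulated into a wider integer.

    Models one lane consuming four Int8 pairs in a single step.
    """
    assert len(a4) == len(b4) == 4
    assert all(-128 <= x <= 127 for x in list(a4) + list(b4))
    return acc + sum(x * y for x, y in zip(a4, b4))


def dot_int8(a, b):
    """Dot product of two Int8 vectors, processed four elements per step."""
    assert len(a) == len(b) and len(a) % 4 == 0
    acc = 0
    for i in range(0, len(a), 4):
        acc = dot4_int8(a[i:i + 4], b[i:i + 4], acc)
    return acc


# Hypothetical weights and activations, already in the range [-1, 1].
weights = [0.5, -0.25, 0.75, 1.0, -1.0, 0.125, 0.0, 0.375]
inputs = [1.0, 0.5, -0.5, 0.25, 0.75, -1.0, 0.5, 0.125]
scale = 1.0 / 127  # assumed shared scale for both vectors

qw = quantize_int8(weights, scale)
qx = quantize_int8(inputs, scale)

approx = dot_int8(qw, qx) * scale * scale            # dequantized Int8 result
exact = sum(w * x for w, x in zip(weights, inputs))  # floating-point reference
steps = len(qw) // 4                                 # four-wide steps needed for 8 pairs

print(f"Int8 result {approx:.4f} vs float result {exact:.4f} in {steps} steps")
```

The Int8 result lands close to the floating-point reference while each step handles four multiplications, which is the intuition behind the precision and efficiency trade-off described above.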
AR is fast becoming the alternative reality tech of choice, and it's vital to support it, along with other forms such as 360 video, in mainstream devices. This is where Mali-G52's energy efficiency, combined with the smart task allocation of the DynamIQ CPU architecture, really comes into its own. The combination recommended for this generation of mainstream mobile solutions is a DynamIQ configuration of one Cortex-A75 and seven Cortex-A55s. The smaller cores handle a multitude of less taxing tasks simultaneously, leaving the big and powerful Cortex-A75 to focus on the areas where its full performance is necessary. Add in the Mali-G52, which achieves 30% better performance density than its predecessor, and highly complex content is suddenly achievable. Much like DynamIQ, Mali-G52 offers scalable implementation options so that each device gets exactly the balance of performance and efficiency it requires.
- 30% more performance density: Mali-G52 utilizes wider execution engines with up to 8 pipelines compared to the four of its predecessor to provide greater graphics performance in the same silicon area
- 15% better energy efficiency reduces the power consumption and thermal output of your device and supports longer game time, even for battery-draining technologies like AR
- 3.6x the ML performance of the previous generation product ensures next-gen ML use cases are supported across all tiers of device
Designed with developers in mind, Mali-G31 is Arm's latest ultra-efficient GPU, and the first ultra-efficient GPU built on the Bifrost architecture. Mali-G31 is the smallest processor to support not only OpenGL ES 3.2 but also Vulkan, the more recent Khronos API.
- The first Ultra-Efficient GPU to be built on the Bifrost architecture
- Arm's smallest processor to support OpenGL ES 3.2 and the latest generation Vulkan API
- 20% smaller, with 20% better performance density than its Bifrost predecessor, Mali-G51, saving silicon area and delivering exceptional energy efficiency
The Mali-D51 takes many of the benefits of 2017's Premium display processor, Mali-D71. It is the first mainstream display processor to be built on the Komeda architecture, and achieves:
- 30% power saving across the entire system compared to its predecessor
- Double the scene complexity, supporting the 8 full layers as per the Mali-D71
- 50% better memory latency for seamless and highly efficient content casting
- Support for any resolution up to 2048x4096 pixels at 60fps, across the same number of layers as the Mali-D71
Optimized to work with the other IP in the Mali Multimedia Suite, Mali-D51 even brings HDR to the mainstream when combined with Assertive Display 5, and provides system-wide memory management efficiencies in collaboration with CoreLink MMU-600.
More and more of our favorite content is being produced in higher resolutions than ever before, meaning 4K content is quickly becoming an essential requirement. The Mali-V52 provides:
- Scalable from 1-4 cores
- 20% better upload quality than its predecessor, providing clearer, crisper video
- 38% smaller silicon area
- Double the decode performance, enabling 4K content on all mainstream devices: in the same area you can perform 4K60 decode or 4K30 encode. A single core of Mali-V61 could decode 1080p60, whereas a single core of Mali-V52 can support 4K30 or 1080p120, because the core has been designed to run twice as fast for the main codecs such as HEVC, H.264 and VP9.
- Supports 10-bit HDR content
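The per-core decode figures above can be sanity-checked with simple pixel-rate arithmetic. The sketch below assumes that "4K" means 3840x2160 and "1080p" means 1920x1080, and that throughput scales with pixels per second; neither assumption is stated in the text, and the helper name `pixel_rate` is made up for this example.

```python
# Assumed frame sizes: the article does not spell out the exact pixel dimensions.
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def pixel_rate(mode, fps):
    """Pixels per second for a given resolution name and frame rate."""
    width, height = RESOLUTIONS[mode]
    return width * height * fps


v61_core = pixel_rate("1080p", 60)            # single Mali-V61 core, per the text
v52_core_4k30 = pixel_rate("4K", 30)          # single Mali-V52 core, option 1
v52_core_1080p120 = pixel_rate("1080p", 120)  # single Mali-V52 core, option 2

# 4K30 and 1080p120 move the same number of pixels per second,
# and both are twice the 1080p60 rate.
speedup = v52_core_4k30 / v61_core

print(v61_core, v52_core_4k30, v52_core_1080p120, speedup)
```

Under these assumptions, 4K30 and 1080p120 are the same pixel rate, and both are exactly double 1080p60, which is consistent with the "double the decode performance" claim.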
ARM said that 159 licensees have shipped a total of 1.2 billion Mali GPU cores to date. The cores are currently used in half of all handsets and 80% of digital TVs, it said.
ARM's Mali leads the mobile GPU space with a 48% share, with design wins in handsets, tablets and TVs as well as some IoT and automotive systems, according to Jon Peddie Research. Qualcomm's Snapdragon with its Adreno GPUs follows at 25%, and Imagination Technologies, which used to lead the sector, is now third at 12%.
Mali GPU specifications (first value column: Mali-G31; second value column: Mali-G52):
|Frequency||650 MHz||850 MHz|
|Pixel/Texturing Throughput||1.3 Gpix/s||6.8 Gpix/s|
|Technology||28nm HPM||16nm|
|API Support||OpenGL ES 1.1, 2.0, 3.1, 3.2; OpenCL 1.1, 1.2, 2.0 Full Profile|
|Bus Interface||AMBA 4|
|L2 Cache||Configurable 32kB-512kB|
|Memory System||Virtual Memory|
|Multi-Core Scaling||1 uni-pixel core or dual-pixel core||1 to 4 dual-pixel cores|
|Adaptive Scalable Texture Compression (ASTC)||Low dynamic range (LDR) and high dynamic range (HDR). Supports both 2D and 3D images.|
|Arm Frame Buffer Compression (AFBC)||Version 1.2; 4x4 pixel block size|
|Transaction Elimination||16x16 pixel block size|
|Smart Composition||16x16 pixel block size|
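The pixel throughput figures in the GPU rows above appear to follow from frequency times core count times pixels per clock. The sketch below checks that, assuming a dual-pixel core writes two pixels per clock (inferred from the name, not stated in the table) and that the first value column uses one dual-pixel core while the second uses the maximum of four. The function name is hypothetical.

```python
def pixel_throughput(freq_mhz, cores, pixels_per_clock_per_core):
    """Peak pixels per second = clock rate x cores x pixels per clock per core."""
    return freq_mhz * 1_000_000 * cores * pixels_per_clock_per_core


# Assumption: a "dual-pixel" core produces 2 pixels per clock.
first_column = pixel_throughput(650, cores=1, pixels_per_clock_per_core=2)
second_column = pixel_throughput(850, cores=4, pixels_per_clock_per_core=2)

print(first_column / 1e9, "Gpix/s")   # compare with 1.3 Gpix/s in the table
print(second_column / 1e9, "Gpix/s")  # compare with 6.8 Gpix/s in the table
```

Under these assumptions the results match the quoted 1.3 Gpix/s and 6.8 Gpix/s, which suggests the headline figures are peak rates for the largest configuration at the listed frequency.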
Mali-D51 display processor specifications:
|Implementation||Dual Display output. Compatible with all major display standards, including HDMI, MIPI, VESA, CEA-861, ITU-R|
|Compatibility||Display solution with Assertive Display 5 and MMU-600. Optimized to work with Arm Mali GPUs and Video Processors.|
|Bandwidth Reduction||AFBC1.2 support enables system-wide bandwidth reduction of up to 50%|
|Bus Interface||AMBA 4 AXI|
|Security||Content is hardware protected right to the glass. The Arm Mali-D51 comes with a TrustZone secure layer for secure payment and is compatible with Arm TrustZone Ready Client 2, GlobalPlatform Trusted User Interface and TrustZone Media Protection|
|Composition||Max of eight alpha-blended layers (2 can be video layers). Mixed HDR/SDR composition|
|Scaling||Scaling on the display processor reduces CPU/GPU power consumption. The Mali-D51 DPU can scale up to two layers in parallel. High quality up-scaling (up to 64x) and down-scaling (up to 6x) in any ratio|
|Resolution||Up to 2048 pixels wide with native 10-bit display output support|
|Rotation||90°, 180° and 270° rotation is supported along with highly configurable cropping options. In-line rotation of up to 8x AFBC layers.|
|Picture Quality||With color enhancements, single pixel accuracy for smooth window transitions and edge sharpening|
|Input/Output||Wide range of input and output formats supported.|
Mali-V52 video processor specifications:
||For encode and decode: VP9 Profile 2 (10-bit) and Profile 0 (8-bit), HEVC Main 10 and Main, H.264 Hi10P/HP/MP/BP, VP8, JPEG. Decode only: MPEG4, MPEG2, VC-1/WMV, Real, H.263, AVS+/AVS|
||Driver and video streaming infrastructure based on OpenMAX, which runs on the host CPU.|
||AMBA4 AXI||Compatible with a wide range of bus interconnect and peripheral IP|
||MMU||Built-in Memory Management Unit (MMU) to support virtual memory|
|Performance||1080p60 to 4K120||Scalable from one to four cores with multiple performance points.|