SPRADA8 May 2023 AM68A, TDA4VL-Q1
The AM68A is a dual-core Arm® Cortex®-A72 microprocessor, designed as a high-performance, highly integrated device that provides significant processing power, image and video processing, and graphics capability. Compared with the AM62A(2), which is designed for applications with one or two cameras, the AM68A enables real-time processing of four to eight 2MP cameras with improved AI performance. Figure 2-1 shows the multiple subsystems that make up the heterogeneous architecture of the AM68A.
Deep learning inference efficiency is crucial to the performance of an edge AI system. As the application note Performance and efficiency benchmarking with TDA4 Edge AI processors shows, MMA-based deep learning inference is 60% more efficient than GPU-based inference in terms of FPS or TOPS. The TI Model Zoo(3) also provides a large collection of DNN models optimized for the C7xMMA across various computer vision tasks, including popular image classification, 2D and 3D object detection, semantic segmentation, and 6D pose estimation models. Table 2-1 shows the 8-bit fixed-point inference performance on the AM68A for several models in the TI Model Zoo.
Task | Model | Image Resolution | Frame Rate (fps) | Accuracy (%) |
---|---|---|---|---|
Classification | mobileNetV2-tv | 224 × 224 | 500 | 70.27(1) |
Object detection | ssdLite-mobDet-DSP-coco | 320 × 320 | 218 | 34.64(2) |
Object detection | yolox-nano-lite-mmdet-coco | 416 × 416 | 268 | 18.96(2) |
Semantic segmentation | deeplabv3lite-mobv2-cocoseq21 | 512 × 512 | 120 | 55.47(3) |
Semantic segmentation | deeplabv3lite-regnetx800mf-cocoseq21 | 512 × 512 | 58 | 60.62(3) |
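
To put the frame rates in Table 2-1 in context, the following minimal Python sketch converts each rate into a per-frame inference time and into a rough count of camera streams that one model could serve. It rests on assumptions that are not in the table: each camera delivers 30 fps, the model runs alone and is time-multiplexed across cameras, and pre-processing, post-processing, and memory bandwidth are ignored. The function names are hypothetical and the results are only an upper-bound estimate.

```python
# Frame rates (fps) copied from Table 2-1: 8-bit fixed-point inference on AM68A.
TABLE_2_1_FPS = {
    "mobileNetV2-tv": 500,
    "ssdLite-mobDet-DSP-coco": 218,
    "yolox-nano-lite-mmdet-coco": 268,
    "deeplabv3lite-mobv2-cocoseq21": 120,
    "deeplabv3lite-regnetx800mf-cocoseq21": 58,
}


def latency_ms(fps: float) -> float:
    """Average inference time per frame, in milliseconds."""
    return 1000.0 / fps


def max_streams(fps: float, camera_fps: float = 30.0) -> int:
    """Whole number of camera streams one model could serve.

    Assumption (not from the source): every camera runs at camera_fps and
    inference is the only cost, so this is an optimistic upper bound.
    """
    return int(fps // camera_fps)


if __name__ == "__main__":
    for model, fps in TABLE_2_1_FPS.items():
        print(f"{model}: {latency_ms(fps):.2f} ms/frame, "
              f"up to {max_streams(fps)} streams at 30 fps")
```

Under these assumptions, the object detection models in the table leave headroom for roughly seven to eight 30 fps streams, while the heavier segmentation model covers only one.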
The multicore heterogeneous architecture of the AM68A provides the flexibility to optimize the performance of an edge AI system for various applications by assigning each task to a suitable programmable core or hardware accelerator (HWA). For example, computationally intensive deep learning (DL) inference can run on the MMA with enhanced DL models, while vision processing and video encoding and decoding can be offloaded to the VPAC3 and the hardware-accelerated video codec for the best performance. Other functional blocks can be programmed on the A72 or C7x cores. Section 3 describes in detail how edge AI systems can be built on the AM68A for various industrial (non-automotive) use cases.
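
The task partitioning described above can be summarized as a simple lookup. The Python sketch below is purely illustrative: the task names, the `Block` enum, and `assign_block` are hypothetical and do not correspond to any TI SDK API. It only encodes the mapping stated in the text, with everything else falling back to a programmable core.

```python
from enum import Enum


class Block(Enum):
    """Processing blocks named in the text (labels are illustrative)."""
    MMA = "C7x/MMA deep learning accelerator"
    VPAC3 = "VPAC3 vision processing accelerator"
    CODEC = "hardware-accelerated video codec"
    PROGRAMMABLE = "A72 or C7x programmable core"


# Hypothetical task names mapped to the blocks the text assigns them to.
OFFLOAD = {
    "dl_inference": Block.MMA,
    "vision_processing": Block.VPAC3,
    "video_encode": Block.CODEC,
    "video_decode": Block.CODEC,
}


def assign_block(task: str) -> Block:
    """Return the offload target for a task, defaulting to A72 or C7x."""
    return OFFLOAD.get(task, Block.PROGRAMMABLE)


if __name__ == "__main__":
    for task in ("video_decode", "vision_processing",
                 "dl_inference", "object_tracking"):
        print(f"{task} -> {assign_block(task).value}")
```

Here `object_tracking` stands in for any functional block without a dedicated accelerator, which the text places on the A72 or C7x.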