SPRADA8 White paper

SPRADA8 May 2023 AM68A, TDA4VL-Q1

3.1 AI Box

AI Box is a cost-effective way of adding intelligence to existing non-analytics cameras in retail stores, on roads, and in factories and buildings. AI Box is preferred over an AI camera because adding an AI Box is more cost effective than replacing legacy cameras with smart cameras that have AI capabilities. Such a system receives live video streams from multiple cameras, decodes them, and performs intelligent video analytics at the edge, which removes the burden of transferring large video streams back to the cloud for analysis. Applications of AI Box include security surveillance systems with anomaly or event detection and workplace safety systems that verify that workers wear personal protective equipment (PPE) such as goggles, safety vests, and hard hats before entering a hazardous zone. In traffic management, AI Box is used for vehicle counting, vehicle type classification, and moving-direction prediction for traffic flow measurement and vehicle tracking.

Figure 3-1 AI Box Block Diagram With Data Flow on AM68A

Figure 3-1 shows the data flow for AI Box on AM68A, where six channels of 2MP bitstreams arrive through Ethernet at 30 fps. The hardware-accelerated H.264 or H.265 decoder decodes the bitstreams, and the MSC scales the decoded frames to a smaller resolution. DL networks, accelerated by the MMA, are applied to these smaller-resolution frames at a lower frame rate, for example, 12 fps. In DL preprocessing, the smaller-resolution frames are converted from YUV to RGB, which is the input format of the DL network. In DL post-processing, the outputs (for example, detections) are overlaid on the input frame. Next, the output frames from the six channels are stitched together into a single 2MP frame, and seven channels (six channels plus one composite channel) are encoded by the hardware-accelerated H.264 or H.265 encoder at a lower frame rate and streamed out or saved in storage.

Table 3-1 summarizes the resource utilization and estimated power consumption with six and four channels of 2MP bitstreams, assuming that each channel needs 1 TOPS for inference. The second C7x core remains available for additional vision processing and for JPEG image encoding to create snapshots. Both DL pre- and post-processing run on the A72 cores in this example, but both can instead run on the second C7x, in which case the power estimates can be slightly higher. The AM68A can also support an AI Box with eight channels of 2MP bitstreams. However, because of the maximum throughput of the video codec, the input and output frame rates must then be reduced to 24 fps and 4 fps, respectively.
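The codec limit behind the eight-channel configuration can be checked with a few lines of arithmetic. The following Python sketch is illustrative only. It assumes a shared decode and encode capacity of 480 MP/s, a value inferred from Table 3-1 (360 MP/s is listed as 75%) rather than stated in the text, and it assumes one composite channel is always encoded alongside the input channels. The names `CODEC_CAPACITY_MPS`, `codec_load_mps`, and `fits` are hypothetical.

```python
# Assumed codec capacity, inferred from Table 3-1 (360 MP/s is listed as 75%).
CODEC_CAPACITY_MPS = 360 / 0.75  # 480 MP/s, shared by decode and encode
FRAME_MP = 2                     # each stream is treated as 2 MP per frame


def codec_load_mps(channels, in_fps, out_fps):
    """Return the combined decode + encode pixel rate in MP/s."""
    decode = channels * FRAME_MP * in_fps
    # Every channel is re-encoded, plus one stitched composite channel.
    encode = (channels + 1) * FRAME_MP * out_fps
    return decode + encode


def fits(channels, in_fps, out_fps):
    """True if the combined load stays within the assumed codec capacity."""
    return codec_load_mps(channels, in_fps, out_fps) <= CODEC_CAPACITY_MPS


if __name__ == "__main__":
    # Six channels as in Figure 3-1, then eight channels at the original
    # and at the reduced frame rates.
    for channels, in_fps, out_fps in [(6, 30, 6), (8, 30, 6), (8, 24, 4)]:
        load = codec_load_mps(channels, in_fps, out_fps)
        ok = fits(channels, in_fps, out_fps)
        print(f"{channels} ch, {in_fps} fps in, {out_fps} fps out: "
              f"{load} MP/s, fits={ok}")
```

Under these assumptions, eight channels at 30 fps in and 6 fps out would require 588 MP/s, which exceeds the assumed capacity, whereas 24 fps in and 4 fps out requires 456 MP/s and fits.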

Table 3-1 AM68A Resource Utilization and Power Consumption for the AI Box Use Case

Main IP	Utilization (6 × 2MP at 30 fps)	Utilization (4 × 2MP at 30 fps)
Decoder	6 × 2MP at 30 fps = 360 MP/s (75%)	4 × 2MP at 30 fps = 240 MP/s (50%)
Encoder	6 × 2MP at 6 fps + 1 composite × 2MP at 6 fps = 84 MP/s (18%)	4 × 2MP at 6 fps + 1 composite × 2MP at 6 fps = 60 MP/s (12.5%)
Decoder + Encoder	360 MP/s + 84 MP/s = 444 MP/s (93%)	240 MP/s + 60 MP/s = 300 MP/s (62.5%)
GPU	20%	20%
VPAC (MSC)	6 × 2MP at 30 fps = 360 MP/s (60%)	4 × 2MP at 30 fps = 240 MP/s (40%)
MMA	6 × 1 TOPS per ch = 6 TOPS (75%)	4 × 1 TOPS per ch = 4 TOPS (50%)
2 × A72	DL pre- and post-processing, depacketization, and so forth (50%)	DL pre- and post-processing, depacketization, and so forth (40%)
DDR Bandwidth	5.19 GBps (15%)	3.54 GBps (10%)
Power Consumption (85°C)	6.9 W	6.3 W
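
The throughput-based rows of Table 3-1 follow from the channel count, frame size, and frame rates. The Python sketch below recomputes them as a cross-check. The capacities in `CAPACITY` are assumptions inferred from the six-channel column (360 MP/s = 75% for the codec, 360 MP/s = 60% for the MSC, 6 TOPS = 75% for the MMA), not values stated in the text. The function name `utilization` is hypothetical. GPU, A72, DDR bandwidth, and power are not modeled because the text gives no formula for them.

```python
# Capacities inferred from the six-channel column of Table 3-1 (assumptions):
#   codec: 360 MP/s = 75%, MSC: 360 MP/s = 60%, MMA: 6 TOPS = 75%.
CAPACITY = {"codec_mps": 480.0, "msc_mps": 600.0, "mma_tops": 8.0}


def utilization(channels, in_fps=30, out_fps=6, frame_mp=2, tops_per_ch=1.0):
    """Return the percent utilization of each throughput-bound IP."""
    decode = channels * frame_mp * in_fps          # MP/s into the decoder
    encode = (channels + 1) * frame_mp * out_fps   # channels + 1 composite
    msc = channels * frame_mp * in_fps             # every decoded frame is scaled
    mma = channels * tops_per_ch                   # 1 TOPS per channel assumed
    return {
        "Decoder": 100 * decode / CAPACITY["codec_mps"],
        "Encoder": 100 * encode / CAPACITY["codec_mps"],
        "Decoder + Encoder": 100 * (decode + encode) / CAPACITY["codec_mps"],
        "VPAC (MSC)": 100 * msc / CAPACITY["msc_mps"],
        "MMA": 100 * mma / CAPACITY["mma_tops"],
    }


if __name__ == "__main__":
    for channels in (6, 4):
        print(f"{channels} channels:")
        for ip, percent in utilization(channels).items():
            print(f"  {ip}: {percent:.1f}%")
```

With these assumed capacities, the six-channel encoder and combined codec rows come out at 17.5% and 92.5%, which the table appears to round to 18% and 93%.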