SPRADB4 White paper

SPRADB4 June 2023 AM69A, TDA4VH-Q1

3.2 Machine Vision

Industry 4.0 targets increased automation of production processes in the manufacturing industry. Industry 5.0 emphasizes human-centric collaboration between humans and AI-enabled robots, that is, collaborative robots (cobots), to optimize the manufacturing process with improved automation. Machine vision is one of the key technologies in Industry 4.0 and 5.0, and real-time processing of visual data at the edge is crucial for it. The main use cases of machine vision include:

Quantity and presence inspection where 2D or 3D cameras verify the presence or absence of parts and ingredients in assembly and packaging systems. Vision-based deep learning is used to detect, classify, and count parts and ingredients.
Visual quality inspection where 2D or 3D vision-based deep learning is used for various visual inspection purposes, for example, detecting defects or identifying the characters on printed circuit boards (PCBs), gauging the dimensions of parts, verifying proper assembly of parts and the wrapping of labels around containers, detecting tool wear for preventive maintenance, and UAV- or drone-based fault detection for solar panels, turbines, pipelines, and so forth.
Vision-guided robots, for example, a robot arm picking and placing parts or bins, are another use case of machine vision that improves collaboration between humans and cobots. Using a camera mounted on the robot arm, the pose of the object to pick is estimated and the optimum path is calculated for the robot arm.
Camera-based barcode reading systems are gaining popularity in the e-commerce logistics market. Since many customers make purchases online with shipping in two days or less, the logistics market has been widely adopting camera-based barcode reading systems to improve package processing throughput and, as a result, reduce average shipping time. In such a system, packages are placed on a fast-moving barcode scanning tunnel, and camera-based barcode readers mounted on the upper, left, and right sides of the tunnel read the 2D or 3D barcodes on the packages at a high frame rate. A high read success rate is important because a failure to read a barcode results in manual handling: for a package whose barcode was not read, the operator manually types the information and replaces the damaged barcode with a new one. This manual intervention increases labor costs and reduces efficiency, resulting in poor throughput. To improve the read success rate, camera-based barcode readers use AI to locate barcodes and filter out damaged or poorly captured ones.

Figure 3-2 Machine Vision Block Diagram With Data Flow on AM69A

Figure 3-2 illustrates the data flow for an example machine vision use case on the AM69A, which captures three 8MP image sequences at 30 fps through the MIPI CSI-2 RX ports. The captured raw Bayer images are processed and demosaiced to YUV by VPAC3 VISS, and VPAC3 LDC corrects any lens distortion that is present. In this use case, the DL networks are applied to regions of interest (ROIs), which are extracted using the A72 cores. The number of ROIs and their sizes vary depending on the specific use case, as does the frame rate at which the DL networks are applied. The outputs obtained through DL pre-processing, the DL network on the MMA, and DL post-processing are displayed via DSS. In the event of any unexpected detection, an alarm can be activated for human attention.
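The ROI step in this data flow can be illustrated with a minimal host-side sketch in Python with NumPy. It is not the AM69A SDK implementation: the function names `extract_rois` and `resize_nearest`, the 3840 × 2160 frame size standing in for "8MP", the box coordinates, and the 224 × 224 network input size are all assumptions chosen for illustration. The sketch crops boxes from a frame, clamps them to the frame borders, and resizes each crop to a fixed input size as a stand-in for DL pre-processing.

```python
import numpy as np


def extract_rois(frame, boxes):
    """Crop (x, y, w, h) boxes from an H x W x C frame.

    Boxes are clamped to the frame; boxes entirely outside are dropped.
    """
    h_img, w_img = frame.shape[:2]
    rois = []
    for x, y, w, h in boxes:
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(w_img, x + w), min(h_img, y + h)
        if x1 > x0 and y1 > y0:
            rois.append(frame[y0:y1, x0:x1])
    return rois


def resize_nearest(roi, out_h, out_w):
    """Nearest-neighbor resize of an ROI to a fixed network input size."""
    in_h, in_w = roi.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return roi[rows[:, None], cols]


# Hypothetical 8MP YUV-like frame and ROI boxes (illustrative values only).
frame = np.zeros((2160, 3840, 3), dtype=np.uint8)
boxes = [(100, 200, 640, 480), (3700, 2100, 500, 500), (5000, 5000, 10, 10)]

rois = extract_rois(frame, boxes)
inputs = [resize_nearest(roi, 224, 224) for roi in rois]
```

Because the number and size of ROIs vary per use case, resizing every crop to one fixed shape keeps the input to the DL network uniform regardless of how many ROIs a frame produces.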

The resource utilization and estimated power consumption of the AM69A are shown in Table 3-2 for three 8MP cameras. In this table, a single MMA is assumed to be fully utilized for each 8MP camera, even though the actual MMA utilization depends on the application. CSI-2, VPAC, A72, and DDR bandwidth still have enough headroom to process higher resolutions. Therefore, the AM69A can enable the machine vision use case with higher-resolution cameras, for example, 12MP, at the cost of increased power, as long as the MMA can handle the necessary DL inference.

Table 3-2 AM69A Resource Utilization and Power Consumption Estimate for the Machine Vision Use Case

Main IP	Utilization (3 × 8MP at 30 fps)
3 × CSI-2 RX	3 × 8MP at 30 fps = 11.52 Gbps (38%)
VPAC (VISS, LDC)	3 × 8MP at 30 fps = 720 MP/s (60%)
MMA	24 TOPS (75%)
8 × A72	ROI extraction, DL pre- and post-processing, and so forth (50%)
DSS	100%
DDR Bandwidth	15.35 GBps (24%)
Power Consumption (85°C)	19 W
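
The throughput figures in Table 3-2 can be reproduced with simple arithmetic. The Python sketch below is a back-of-the-envelope check, not a datasheet calculation. It assumes 16 bits per pixel on the CSI-2 link, a value chosen because it reproduces the 11.52 Gbps figure, and it derives the block capacities from the rounded percentages in the table, so those capacities are approximate. It also assumes that load scales linearly with pixel rate when moving to the 12MP example from the text. The helper names are hypothetical.

```python
# Back-of-the-envelope check of the Table 3-2 throughput figures.
CAMERAS = 3
MEGAPIXELS_PER_FRAME = 8      # "8MP" from the text
FPS = 30
BITS_PER_PIXEL = 16           # assumption: reproduces the 11.52 Gbps figure


def pixel_rate_mps(cameras, megapixels, fps):
    """Aggregate pixel rate in megapixels per second."""
    return cameras * megapixels * fps


def link_rate_gbps(rate_mps, bits_per_pixel):
    """Raw payload bit rate in Gbps for a given pixel rate."""
    return rate_mps * bits_per_pixel / 1000


def implied_capacity(load, utilization_pct):
    """Capacity implied by a load and its (rounded) utilization percentage."""
    return load / (utilization_pct / 100)


rate_8mp = pixel_rate_mps(CAMERAS, MEGAPIXELS_PER_FRAME, FPS)   # 720 MP/s
csi_8mp = link_rate_gbps(rate_8mp, BITS_PER_PIXEL)              # 11.52 Gbps

# Capacities implied by the table; approximate because percentages are rounded.
vpac_capacity_mps = implied_capacity(rate_8mp, 60)
csi_capacity_gbps = implied_capacity(csi_8mp, 38)

# Same arithmetic for the 12MP example mentioned in the text.
rate_12mp = pixel_rate_mps(CAMERAS, 12, FPS)
vpac_util_12mp = 100 * rate_12mp / vpac_capacity_mps
csi_util_12mp = 100 * link_rate_gbps(rate_12mp, BITS_PER_PIXEL) / csi_capacity_gbps

print(f"8MP:  {rate_8mp} MP/s, {csi_8mp:.2f} Gbps on CSI-2")
print(f"12MP: VPAC about {vpac_util_12mp:.0f}%, CSI-2 about {csi_util_12mp:.0f}%")
```

Under these assumptions, both the VPAC and CSI-2 estimates for three 12MP cameras stay below 100%, which is consistent with the headroom described above. The MMA is left out of the sketch because its load depends on the DL networks and the ROIs rather than on the raw pixel rate.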