SPRADC1 Application note

SPRADC1 June 2023 DRA829J, DRA829J-Q1, DRA829V, DRA829V-Q1, TDA4VM, TDA4VM-Q1

3.2 Memory Access Latency

Table 3-1 and Table 3-2 show the relative memory read latency from each core to the primary memory end-points in the SoC. Table 3-1 shows the latency for each core to access the different memory end-points, with that core’s access to DDR as the baseline. For example, in the first row of Table 3-1, the A72 core’s latency to access MSMC is 33% of its latency to access DDR, while its latencies to MCU OCRAM and MAIN OCRAM are both greater than its latency to DDR. Table 3-2 compares the latency from each core to DDR, with the A72 core’s access to DDR as the baseline. For example, the MCU R5F core takes 2.55x as long as the A72 core to access DDR.

DDR access latency is typically not constant, due to factors such as SDRAM refresh cycles and periodic retraining. To provide a consistent comparison, this analysis uses the optimal DDR access latency and excludes these non-deterministic factors. The SRAM-based memory end-points do not have this variance in access latency.

Overall, these tables provide insight into the relative memory access latency for each core in the system, with implications for system performance and optimizations.

Note that the relative latency in Table 3-2 is the worst case, that is, the case in which data is read from physical memory rather than from cache. If the data being processed has a high degree of locality, the caches hide most of the latency. The performance benefit of using the different memories becomes visible as the cache miss rate increases. Observed performance from a source to a destination can vary depending on caching and contention. The SDK data sheet compares the cached performance using benchmarks such as LMBench for Linux running on the A72 and a memory benchmarking application for FreeRTOS running on the R5F.
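The effect of the cache miss rate described above can be illustrated with a simplified expected-latency model. This is a minimal sketch, not a measurement: it assumes a hit costs a fixed `hit_latency` and a miss costs the full memory latency. The function name and the `ASSUMED_HIT` value are hypothetical and do not come from this application note; only the A72 DDR and MSMC ratios are taken from Table 3-1.

```python
def effective_latency(miss_rate, memory_latency, hit_latency):
    """Expected read latency for a given cache miss rate.

    Simplified model: a hit costs hit_latency, a miss costs memory_latency.
    All latencies share one unit (here: multiples of the A72 DDR latency).
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return (1.0 - miss_rate) * hit_latency + miss_rate * memory_latency


# A72 row of Table 3-1 (relative to the A72 DDR latency).
A72_DDR = 1.00
A72_MSMC = 0.33
# Hypothetical cache-hit cost, chosen only for illustration.
# This value is NOT from the application note.
ASSUMED_HIT = 0.02

if __name__ == "__main__":
    for miss_rate in (0.0, 0.01, 0.10, 0.50, 1.0):
        ddr = effective_latency(miss_rate, A72_DDR, ASSUMED_HIT)
        msmc = effective_latency(miss_rate, A72_MSMC, ASSUMED_HIT)
        print(f"miss={miss_rate:4.0%}  DDR={ddr:.3f}  MSMC={msmc:.3f}  gap={ddr - msmc:.3f}")
```

In this model the gap between DDR and MSMC is zero when every access hits the cache and grows linearly with the miss rate, which matches the qualitative statement in the text. Real behavior also depends on contention and prefetching, which the sketch ignores.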

Table 3-1 Relative Latency From Each Core to Memory End-Point

	DDR	MSMC	C7x L2SRAM	C6x L2SRAM	MCU OCRAM	MAIN OCRAM
A72	1.00x	0.33x	0.38x	1.30x	1.59x	1.13x
C7x	1.00y	0.36y	0.03y	1.24y	1.50y	1.08y
C6x	1.00z	0.63z	0.67z	0.01z	0.52z	0.28z
MCU R5F	1.00a	0.73a	0.78a	0.54a	0.20a	0.47a
MAIN R5F	1.00b	0.71b	0.75b	0.51b	0.53b	0.42b

Table 3-2 Comparison of DDR Access Latency From Each Core

	A72	C7x	C6x	MCU R5F	MAIN R5F
DDR	1.00c	1.12c	1.98c	2.55c	2.37c
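The two tables use different baselines: each row of Table 3-1 is relative to that core's own DDR latency, while Table 3-2 is relative to the A72 DDR latency. Multiplying the two ratios puts every entry on the common A72 DDR baseline. The sketch below does this, assuming both tables were measured under the same conditions; the names `normalized_latency`, `TABLE_3_1`, and `TABLE_3_2` are illustrative only.

```python
# Column order of Table 3-1.
MEMORIES = ["DDR", "MSMC", "C7x L2SRAM", "C6x L2SRAM", "MCU OCRAM", "MAIN OCRAM"]

# Table 3-1: each row is relative to that core's own DDR latency.
TABLE_3_1 = {
    "A72":      [1.00, 0.33, 0.38, 1.30, 1.59, 1.13],
    "C7x":      [1.00, 0.36, 0.03, 1.24, 1.50, 1.08],
    "C6x":      [1.00, 0.63, 0.67, 0.01, 0.52, 0.28],
    "MCU R5F":  [1.00, 0.73, 0.78, 0.54, 0.20, 0.47],
    "MAIN R5F": [1.00, 0.71, 0.75, 0.51, 0.53, 0.42],
}

# Table 3-2: DDR latency of each core relative to the A72 DDR latency.
TABLE_3_2 = {"A72": 1.00, "C7x": 1.12, "C6x": 1.98, "MCU R5F": 2.55, "MAIN R5F": 2.37}


def normalized_latency(core, memory):
    """Latency from core to memory, in multiples of the A72 DDR latency."""
    ratio_to_own_ddr = TABLE_3_1[core][MEMORIES.index(memory)]
    return ratio_to_own_ddr * TABLE_3_2[core]


if __name__ == "__main__":
    for core in TABLE_3_1:
        row = "  ".join(f"{normalized_latency(core, m):5.2f}" for m in MEMORIES)
        print(f"{core:9s} {row}")
```

For example, the MCU R5F access to MSMC is 0.73 of its own DDR latency, but about 1.86 times the A72 DDR latency (0.73 x 2.55), so a ratio below 1.00 in Table 3-1 does not by itself mean the access is fast in absolute terms.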

From the tables above, the following items should be considered during system design:

- Using memory local to the processor on which the software is running results in lower memory access latency than using DDR.
- When using memory to share data between cores (A72/R5F/C6x/C7x):
  - The memory access time differs depending on which core the software is running on.
  - Care should be taken when selecting which memory is used to share the data.
- MSMC memory has a low access time from all cores when compared to DDR. Use of MSMC memory can lead to increased performance in many use cases. The most efficient use of this memory for the overall system should be reviewed.
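The statement that MSMC has a low access time from all cores can be checked directly against Table 3-1. The sketch below lists the memories whose latency is below the DDR baseline for every core, together with the worst per-core ratio for a given memory. The function names are illustrative, and the check considers latency ratios only; it does not account for memory capacity, bandwidth, or contention.

```python
# Column order of Table 3-1.
MEMORIES = ["DDR", "MSMC", "C7x L2SRAM", "C6x L2SRAM", "MCU OCRAM", "MAIN OCRAM"]

# Table 3-1: each row is relative to that core's own DDR latency.
TABLE_3_1 = {
    "A72":      [1.00, 0.33, 0.38, 1.30, 1.59, 1.13],
    "C7x":      [1.00, 0.36, 0.03, 1.24, 1.50, 1.08],
    "C6x":      [1.00, 0.63, 0.67, 0.01, 0.52, 0.28],
    "MCU R5F":  [1.00, 0.73, 0.78, 0.54, 0.20, 0.47],
    "MAIN R5F": [1.00, 0.71, 0.75, 0.51, 0.53, 0.42],
}


def faster_than_ddr_from_all_cores(table, memories):
    """Return the memories whose ratio is below 1.00 (DDR) in every row."""
    result = []
    for column, memory in enumerate(memories):
        if memory == "DDR":
            continue  # DDR is the baseline itself
        if all(row[column] < 1.00 for row in table.values()):
            result.append(memory)
    return result


def worst_case_ratio(table, memories, memory):
    """Largest per-core latency ratio for one memory column."""
    column = memories.index(memory)
    return max(row[column] for row in table.values())


if __name__ == "__main__":
    print(faster_than_ddr_from_all_cores(TABLE_3_1, MEMORIES))
    print(worst_case_ratio(TABLE_3_1, MEMORIES, "MSMC"))
```

With the Table 3-1 values, the check returns MSMC and C7x L2SRAM, and the worst per-core ratio for MSMC is 0.73 (from the MCU R5F). Because each ratio is relative to the accessing core's own DDR latency, this confirms the comparison with DDR but does not rank absolute latency across cores.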