SPRAD45B July 2022 – December 2024 AM623, AM625

 


3.2 Critical Memory Access Latency

This section provides round-trip read latency measurements from the processors in AM62x to various memory destinations in the system. The measurements were made on the AM62x platform using bare-metal silicon verification tests. The tests execute on the A53, M4F, and R5F processors out of LPDDR4. Each test includes a loop of 8192 iterations that reads a total of 32 KiB of data, that is, 4 bytes per read. The number of cycles for each access was counted and divided by the respective processor clock frequency to obtain the latency. Table 3-4 shows the average latency results.
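The cycle-to-time conversion described above can be illustrated with a minimal Python sketch. It is not the bare-metal test code, which this document does not show. The function names and the sample cycle counts are hypothetical and chosen only for illustration; the iteration count and data size come from the text.

```python
# Minimal sketch of the cycles-to-latency arithmetic described in the text.
# Function names and the sample cycle counts are hypothetical, not taken
# from the actual silicon verification tests.

ITERATIONS = 8192                 # loop iterations per test (from the text)
TOTAL_BYTES = 32 * 1024           # 32 KiB read in total (from the text)
BYTES_PER_READ = TOTAL_BYTES // ITERATIONS  # 4 bytes per access


def cycles_to_ns(cycles: float, clock_hz: float) -> float:
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1e9 / clock_hz


def average_latency_ns(cycle_counts, clock_hz: float) -> float:
    """Average the per-access cycle counts, then convert the mean to ns."""
    mean_cycles = sum(cycle_counts) / len(cycle_counts)
    return cycles_to_ns(mean_cycles, clock_hz)


if __name__ == "__main__":
    # Hypothetical per-access cycle counts on a 400 MHz core.
    sample = [90, 100, 110]
    print(f"{BYTES_PER_READ} bytes per read")
    print(f"average latency: {average_latency_ns(sample, 400e6):.1f} ns")
```

At 400 MHz one cycle lasts 2.5 ns, so a single-cycle access on the R5F or M4F converts to a 2.5 ns latency.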

Table 3-4 Critical Memory Access Latency of A53, R5F WKUP, and M4F MCU

Memory          Arm Cortex-A53   Arm Cortex-R5F WKUP   Arm Cortex-M4F MCU
                (Avg ns)         (Avg ns)              (Avg ns)
LPDDR4          219              228                   350
OCSRAM MAIN     127              70                    170
R5F WKUP TCM    104              2.5                   180
M4F MCU TCM     263              233                   10

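Tests were done at 0.75 V VDD_CORE, with the A53 cores at 1.25 GHz, the R5F core at 400 MHz, the M4F core at 400 MHz, and LPDDR4 at 1600 MT/s. Tightly-Coupled Memory (TCM) is RAM that is directly attached to an Arm Cortex core. The Arm architecture provides a local, low-latency internal path to this memory and also allows external access to it through the SoC bus infrastructure. This is why each core reaches its own TCM in 2.5 ns (R5F) or 10 ns (M4F), while accesses to another core's TCM cross the SoC bus and take more than 100 ns.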
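To relate the reported times back to processor cycles, the averages in Table 3-4 can be multiplied by the clock frequencies listed above. The sketch below does that under two assumptions: the table values are as read from Table 3-4, and each core ran at exactly its stated frequency. The resulting cycle counts are derived here for illustration and are not figures published in this document. The dictionary and function names are hypothetical.

```python
# Sketch: convert the average latencies of Table 3-4 back into cycles of
# the requesting core. The derived cycle counts are illustrative only.

CLOCK_HZ = {"A53": 1.25e9, "R5F WKUP": 400e6, "M4F MCU": 400e6}

# Average latency in ns, per memory destination and requesting core.
LATENCY_NS = {
    "LPDDR4":       {"A53": 219, "R5F WKUP": 228, "M4F MCU": 350},
    "OCSRAM MAIN":  {"A53": 127, "R5F WKUP": 70,  "M4F MCU": 170},
    "R5F WKUP TCM": {"A53": 104, "R5F WKUP": 2.5, "M4F MCU": 180},
    "M4F MCU TCM":  {"A53": 263, "R5F WKUP": 233, "M4F MCU": 10},
}


def ns_to_cycles(latency_ns: float, clock_hz: float) -> float:
    """Convert a latency in nanoseconds to cycles at the given clock."""
    return latency_ns * clock_hz / 1e9


if __name__ == "__main__":
    for memory, row in LATENCY_NS.items():
        cycles = {
            core: round(ns_to_cycles(ns, CLOCK_HZ[core]), 2)
            for core, ns in row.items()
        }
        print(f"{memory:<13} {cycles}")
```

Read this way, the R5F access to its own TCM corresponds to about one 400 MHz cycle and the M4F access to its own TCM to about four, which is consistent with the local low-latency path described above.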