SPRAD45A July 2022 – October 2024 AM623, AM625

 


STREAM

STREAM is a microbenchmark that measures data memory system performance without any data reuse. It is designed to miss in the caches and to exercise the data prefetcher and speculative accesses. STREAM operates on double-precision (64-bit) floating-point values, but on most modern processors the memory access, not the arithmetic, is the bottleneck. The four individual scores are copy, scale (multiply by a constant), add (sum two numbers), and triad (multiply-accumulate). For bandwidth, each byte read counts as one and each byte written counts as one, so the score is double the bandwidth reported by LMBench. Table 3-3 shows the measured bandwidth and the efficiency compared to the theoretical wire rate. The wire rate is the DDR transfer rate in MT/s multiplied by the bus width; for DDR4-1600 with a 16-bit bus this is 1600 MT/s × 2 bytes = 3200 MB/s. To obtain the overall maximum achieved throughput, the command used is stream -M 16M -P 4 -N 10, which runs four parallel benchmark instances and 10 iterations.
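The four kernels and the byte-counting rule described above can be illustrated with a minimal Python sketch. This is not the LMBench implementation, which is written in C and runs over arrays much larger than the caches. The function names, the ARRAYS_TOUCHED table, and the choice of 10^6 bytes per MB are assumptions made for illustration only.

```python
# Illustrative sketch of the four STREAM kernels on small Python lists.
# All names here are hypothetical; the real benchmark is C code that
# runs over arrays far larger than the caches.

DOUBLE_BYTES = 8  # STREAM operates on 64-bit double-precision values


def stream_copy(a):
    """c[i] = a[i]"""
    return [x for x in a]


def stream_scale(a, q):
    """b[i] = q * a[i]"""
    return [q * x for x in a]


def stream_add(a, b):
    """c[i] = a[i] + b[i]"""
    return [x + y for x, y in zip(a, b)]


def stream_triad(a, b, q):
    """c[i] = a[i] + q * b[i]"""
    return [x + q * y for x, y in zip(a, b)]


# (arrays read, arrays written) per element for each kernel.
ARRAYS_TOUCHED = {
    "copy": (1, 1),
    "scale": (1, 1),
    "add": (2, 1),
    "triad": (2, 1),
}


def bytes_moved(kernel, n_elements):
    """Bytes counted for one pass: every byte read and every byte written."""
    reads, writes = ARRAYS_TOUCHED[kernel]
    return (reads + writes) * DOUBLE_BYTES * n_elements


def bandwidth_mb_s(kernel, n_elements, seconds):
    """Bandwidth score, assuming 1 MB = 10**6 bytes (an assumption here)."""
    return bytes_moved(kernel, n_elements) / seconds / 1e6
```

Because copy and scale touch two arrays per element while add and triad touch three, the same elapsed time gives a higher byte count for the latter two kernels.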

Table 3-3 STREAM Benchmarks
Kernel    DDR4-1600MT/s-16-Bit Bandwidth    DDR4-1600MT/s-16-Bit Efficiency
copy      2448 MB/s                         77%
scale     2372 MB/s                         74%
add       2491 MB/s                         78%
triad     2493 MB/s                         78%
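As a cross-check, the efficiency column can be reproduced from the bandwidth column and the wire rate (DDR MT/s rate times the bus width). The sketch below is a minimal example; the function names are hypothetical, and rounding half up to a whole percent is an assumption chosen because it matches the published figures.

```python
# Hypothetical helper names; the inputs are the Table 3-3 values.

def wire_rate_mb_s(mt_per_s, bus_width_bits):
    """Theoretical wire rate: transfers per second times bytes per transfer."""
    return mt_per_s * bus_width_bits / 8


def efficiency_pct(measured_mb_s, mt_per_s=1600, bus_width_bits=16):
    """Measured bandwidth as a whole-number percentage of the wire rate."""
    pct = 100 * measured_mb_s / wire_rate_mb_s(mt_per_s, bus_width_bits)
    # Round half up (assumption); Python's round() would round 76.5 to 76.
    return int(pct + 0.5)


measured = {"copy": 2448, "scale": 2372, "add": 2491, "triad": 2493}

for kernel, mb_s in measured.items():
    print(f"{kernel:5s} {mb_s} MB/s  {efficiency_pct(mb_s)}%")
```

For DDR4-1600 with a 16-bit bus the wire rate works out to 3200 MB/s, so the copy result of 2448 MB/s corresponds to 76.5%, which the table reports as 77%.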