SPRUI30H November 2015 – May 2024 DRA745, DRA746, DRA750, DRA756
Each stage is described in more detail in the following sections.
Note that loads and stores are predicated on the loop variables matching specific conditions, so they are not necessarily carried out at every i4 iteration.
Each i4 iteration takes a number of cycles equal to the maximum of the cycles spent in loads, arithmetic operations, and stores. The cycle count of the arithmetic operations is constant for each loop, but the cycle counts of loads and stores can vary depending on pointer updates, loop level, and read/write memory contention.
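The cycle rule above can be illustrated with a small model. This is only a sketch: the function names `iteration_cycles` and `total_cycles` and the per-iteration cycle counts are hypothetical and not taken from the device documentation, and it assumes that a load or store that is predicated off contributes zero cycles.

```python
def iteration_cycles(load_cycles, arith_cycles, store_cycles):
    """Cycles for one i4 iteration: the longest of the three stages."""
    return max(load_cycles, arith_cycles, store_cycles)


def total_cycles(per_iteration):
    """Sum the per-iteration cost over a sequence of (load, arith, store) tuples."""
    return sum(iteration_cycles(l, a, s) for l, a, s in per_iteration)


# Hypothetical (load, arithmetic, store) cycle counts for three i4 iterations.
# The arithmetic count stays constant (2); load and store counts vary, and a
# count of 0 models an access that is predicated off (an assumption).
example = [(1, 2, 0), (3, 2, 1), (1, 2, 4)]
```

In this made-up sequence the arithmetic stage sets the cost of the first iteration, the load stage sets the second, and the store stage sets the third.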
For example, with lpend1 = 3, lpend2 = 1, lpend3 = 2, and lpend4 = 4, each loop variable iN runs from 0 through lpendN, so the nested for loop executes exactly (3 + 1) × (1 + 1) × (2 + 1) × (4 + 1) = 4 × 2 × 3 × 5 = 120 i4 iterations. For the first few iterations the loop variables progress as:
i1 | i2 | i3 | i4 |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 0 | 0 | 1 |
0 | 0 | 0 | 2 |
0 | 0 | 0 | 3 |
0 | 0 | 0 | 4 |
0 | 0 | 1 | 0 |
0 | 0 | 1 | 1 |
0 | 0 | 1 | 2 |
... | ... | ... | ... |
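The progression in the table can be reproduced with a short enumeration. This is a minimal sketch, not the hardware's loop controller: the helper name `loop_indices` is hypothetical, and it assumes each loop variable counts from 0 through its lpend value inclusive, with i4 as the innermost (fastest-varying) loop, which is what the table and the iteration count above imply.

```python
from itertools import product


def loop_indices(lpend1, lpend2, lpend3, lpend4):
    """Yield (i1, i2, i3, i4) in nested-loop order, i4 varying fastest.

    Assumes each loop variable runs from 0 through lpendN inclusive.
    """
    ranges = [range(lpend + 1) for lpend in (lpend1, lpend2, lpend3, lpend4)]
    yield from product(*ranges)


# Values from the example in the text.
iterations = list(loop_indices(3, 1, 2, 4))

# Print the first eight rows in the same layout as the table.
for i1, i2, i3, i4 in iterations[:8]:
    print(f"{i1} | {i2} | {i3} | {i4} |")
```

With these values, `len(iterations)` is 120, and the first eight tuples match the rows listed in the table.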