SPRUI04F July 2015 – April 2023
A loop iterates some number of times before the loop terminates. The number of iterations is called the trip count. The variable that counts iterations is the trip counter. When the trip counter reaches a limit equal to the trip count, the loop terminates. The Code Generation Tools use the trip count to determine whether or not a loop can be pipelined. The structure of a software pipelined loop requires the execution of a minimum number of loop iterations (a minimum trip count) in order to fill or prime the pipeline.
The minimum trip count for a software pipelined loop is set by the number of iterations executing in parallel. In Figure 4-1, the minimum trip count is five. In the following example, A, B, and C are instructions in a software pipeline. Each column is one loop iteration and each row is one cycle, so the minimum trip count for this single-cycle software pipelined loop is three.
A
B   A
C   B   A   ← Three iterations in parallel = minimum trip count
    C   B
        C
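The overlap in the diagram can be checked with a small model. The following is a minimal sketch, assuming an idealized single-cycle pipeline in which iteration i starts in cycle i and executes one stage per cycle. The function names iterations_in_flight and peak_overlap are hypothetical and are not part of the Code Generation Tools.

```c
/* Number of iterations executing in a given cycle of an idealized
 * single-cycle software pipeline. Iteration i starts in cycle i and
 * then runs one stage per cycle for `stages` cycles. */
int iterations_in_flight(int cycle, int stages, int trip_count)
{
    int count = 0;
    int i;

    for (i = 0; i < trip_count; i++) {
        int stage = cycle - i;        /* stage of iteration i in this cycle */
        if (stage >= 0 && stage < stages)
            count++;
    }
    return count;
}

/* Largest number of iterations that overlap in any single cycle. */
int peak_overlap(int stages, int trip_count)
{
    int peak = 0;
    int cycle;
    int total_cycles = trip_count + stages - 1;

    for (cycle = 0; cycle < total_cycles; cycle++) {
        int n = iterations_in_flight(cycle, stages, trip_count);
        if (n > peak)
            peak = n;
    }
    return peak;
}
```

With three stages (A, B, C), peak_overlap(3, 3) reaches three iterations in parallel, while peak_overlap(3, 2) only reaches two. In this model, a trip count below three never reaches the cycle in which all three stages execute together.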
When the Code Generation Tools cannot determine the trip count for a loop, they generate two loops and control logic by default. The first loop is not pipelined, and it executes if the run-time trip count is less than the loop's minimum safe trip count. The second loop is the software pipelined loop, and it executes when the run-time trip count is greater than or equal to the minimum trip count. For any given call, only one of the two loops executes, so the other is a redundant loop. For example:
void foo(int N)                  /* N is the trip count */
{
    int I;
    for (I = 0; I < N; I++)      /* I is the trip counter */
        ;                        /* loop body omitted */
}
After finding a software pipeline for the loop, the compiler transforms foo() as below, assuming the minimum trip count for the loop is 3. Two versions of the loop would be generated and the following comparison would be used to determine which version should be executed:
void foo(int N)
{
    int I;
    if (N < 3)
    {
        for (I = 0; I < N; I++)  /* Unpipelined version */
            ;                    /* loop body omitted */
    }
    else
    {
        for (I = 0; I < N; I++)  /* Pipelined version */
            ;                    /* loop body omitted */
    }
}
foo(50); /* Execute software pipelined loop */
foo(2);  /* Execute loop (unpipelined) */
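The run-time selection can also be illustrated with a loop that has a real body. This is a minimal source-level sketch, not the code the compiler emits. The names sum_array, loop_version, and MIN_TRIP_COUNT are hypothetical, and the value 3 only repeats the minimum trip count assumed above.

```c
#define MIN_TRIP_COUNT 3   /* assumed minimum trip count, as in the text */

/* Records which of the two loop versions was selected. */
typedef enum { LOOP_UNPIPELINED, LOOP_PIPELINED } loop_version;

/* Source-level model of the generated control logic. Both branches
 * compute the same sum; in generated code only the second loop would
 * be software pipelined. */
loop_version sum_array(const int *data, int n, int *sum_out)
{
    int i;
    int sum = 0;
    loop_version taken;

    if (n < MIN_TRIP_COUNT) {
        taken = LOOP_UNPIPELINED;     /* too few iterations to prime the pipeline */
        for (i = 0; i < n; i++)
            sum += data[i];
    } else {
        taken = LOOP_PIPELINED;       /* enough iterations for the pipelined loop */
        for (i = 0; i < n; i++)
            sum += data[i];
    }

    *sum_out = sum;
    return taken;
}
```

Both branches produce the same result for every trip count. The comparison only selects which copy of the loop runs, which is why one of the two copies is redundant on any given call.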
You may be able to help the compiler avoid producing redundant loops by using --program_level_compile --opt_level=3 (see Section 4.4) or the MUST_ITERATE pragma (see Section 7.9.22).
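As a minimal sketch of the pragma approach, the following function promises the compiler that its loop always runs at least three times. The function name sum_at_least_three is hypothetical, and the value 3 is only the minimum trip count assumed in the example above; the actual minimum for a given loop depends on the software pipeline the compiler finds. The pragma is guarded with the __TI_COMPILER_VERSION__ macro so that the file still builds with other compilers, and the assert checks the promise in debug builds.

```c
#include <assert.h>

/* Sums the first n elements of data. The caller must guarantee n >= 3. */
int sum_at_least_three(const int *data, int n)
{
    int i;
    int sum = 0;

    assert(n >= 3);   /* debug-build check of the promise made to the compiler */

    /* Assumption: a guaranteed minimum trip count of 3 is enough for the
     * compiler to omit the unpipelined copy of this loop. */
#if defined(__TI_COMPILER_VERSION__)
#pragma MUST_ITERATE(3)
#endif
    for (i = 0; i < n; i++)
        sum += data[i];

    return sum;
}
```

The pragma is a guarantee from the programmer, not a request. If the function is called with a smaller trip count than promised, the behavior of the pipelined loop is not defined by this sketch, so the guarantee should only be given when every caller satisfies it.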