SPRUIG8J January 2018 – March 2024
When a loop is software pipelined, a prolog and an epilog are generally required. The prolog pipes up (fills) the loop, and the epilog pipes down (drains) it.
In general, a loop must execute a minimum number of iterations before the software-pipelined version can be safely executed. If the minimum known iteration count is too small, either a redundant loop is added or software pipelining is disabled. Collapsing the prolog and epilog of a loop can reduce the minimum iteration count necessary to safely execute the pipelined loop.
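The relationship between the prolog, the epilog, the minimum iteration count, and the redundant loop can be sketched by hand in C. This is only an illustrative model, not what the compiler emits: the function names `dot_simple` and `dot_pipelined`, the three-stage split (load, multiply, accumulate), and the do-while kernel that always runs at least once are assumptions made for the example.

```c
#include <stddef.h>

/* Plain loop. In this sketch it also plays the role of the redundant loop. */
int dot_simple(const int *a, const int *b, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hand-pipelined version with p = 3 stages: load, multiply, accumulate. */
int dot_pipelined(const int *a, const int *b, size_t n)
{
    enum { P = 3 };

    /* Too few iterations: the pipelined path would read past the arrays,
       so fall back to the redundant (unpipelined) loop. */
    if (n < P)
        return dot_simple(a, b, n);

    int sum = 0;

    /* Prolog: P - 1 = 2 stages that pipe up the loop. */
    int x = a[0], y = b[0];      /* iteration 0, load     */
    int prod = x * y;            /* iteration 0, multiply */
    x = a[1]; y = b[1];          /* iteration 1, load     */

    /* Kernel (steady state): three iterations are in flight per pass.
       It executes at least once, which is why n >= P is required. */
    size_t i = 2;
    do {
        sum += prod;             /* iteration i - 2, accumulate */
        prod = x * y;            /* iteration i - 1, multiply   */
        x = a[i]; y = b[i];      /* iteration i,     load       */
    } while (++i < n);

    /* Epilog: P - 1 = 2 stages that pipe down the loop. */
    sum += prod;                 /* iteration n - 2, accumulate */
    prod = x * y;                /* iteration n - 1, multiply   */
    sum += prod;                 /* iteration n - 1, accumulate */

    return sum;
}
```

In this model the guard `n < P` and the call to `dot_simple` are the cost of not knowing the iteration count in advance. If the minimum safe count were smaller, the fallback would be needed for fewer inputs, or not at all.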
Collapsing can also substantially reduce code size. Software pipelining increases code size: some of this growth is due to the redundant loop, and the remainder is due to the prolog and epilog.
The prolog and epilog of a software-pipelined loop each consist of up to p-1 stages of length ii, where p is the number of iterations that are executed in parallel during the steady state and ii is the cycle time for the pipelined loop body. During prolog and epilog collapsing, the compiler tries to collapse as many stages as possible. However, over-collapsing can have a negative performance impact. Thus, by default, the compiler attempts to collapse as many stages as possible without sacrificing performance. When the --opt_for_speed=2 or --opt_for_speed=1 option is specified, the compiler increasingly favors code size over performance.
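As a rough worked example of the stage arithmetic above, the following C helpers compute the upper bound on prolog (or epilog) length and the length left after some stages are collapsed. The function names are hypothetical, and the assumption that each collapsed stage removes exactly ii cycles is a simplification for illustration, not a statement of how the compiler accounts for collapsing.

```c
/* Upper bound on prolog (or epilog) length in cycles:
   up to p - 1 stages, each ii cycles long. */
unsigned max_stage_cycles(unsigned p, unsigned ii)
{
    return (p > 0) ? (p - 1) * ii : 0;
}

/* Cycles left after collapsing `collapsed` stages (simplified model).
   The number of collapsed stages is clamped to the p - 1 available. */
unsigned remaining_stage_cycles(unsigned p, unsigned ii, unsigned collapsed)
{
    unsigned stages = (p > 0) ? p - 1 : 0;
    if (collapsed > stages)
        collapsed = stages;
    return (stages - collapsed) * ii;
}
```

For instance, with p = 4 and ii = 3, the prolog is at most 3 stages, or 9 cycles. Collapsing 2 of those stages would leave 1 stage, or 3 cycles, under this model.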