SPRUIV4D May 2020 – May 2024

 

Software Pipeline Failure Messages

Possible software pipeline failure messages provided by the compiler include the following:

  • Address increment too large. During software pipelining, the compiler allows reordering of all loads and stores that access the same array or pointer, which maximizes scheduling flexibility. Once a schedule is found, the compiler goes back and adds the appropriate offsets and increments/decrements to each load and store. Sometimes the loads and/or stores end up offset too far from each other after reordering (the limit for standard load pointers is +/− 32). If this happens, try restructuring the loop so that the pointers are closer together, or rewrite the pointer accesses to use precomputed register offsets.
  • Cannot allocate machine registers. After software pipelining and finding a valid schedule, the compiler allocates all values in the loop to specific machine registers. In some cases, the compiler runs out of machine registers in which it can allocate values of variables and intermediate results. If this happens, either try to simplify the loop or break the loop up into multiple smaller loops. In some cases, the compiler can successfully software pipeline a loop at a higher initiation interval (ii).
  • Cycle Count Too High. Not Profitable. In rare cases, the initiation interval (ii) of a software-pipelined loop is higher than the per-iteration cycle count of the non-pipelined loop. In this case, it is more efficient to execute the non-software-pipelined loop. A possible solution is to split the loop into multiple loops or reduce the complexity of the loop.
  • Did not find schedule. Sometimes the compiler simply cannot find a valid software pipeline schedule at a particular initiation interval. A possible solution is to split the loop into multiple loops or reduce the complexity of the loop.
  • Iterations in parallel > max. iteration count. Not all loops can be profitably pipelined. Based on the available information for the largest possible iteration count, the compiler estimates that it will always be more profitable to execute a non-software-pipelined version than to execute the pipelined version, given the schedule found at the current initiation interval. A possible solution may be to unroll the loop completely.
  • Iterations in parallel > min. iteration count. Based on the available information about the minimum iteration count, it is not always safe to execute the pipelined version of the loop. Normally, a redundant loop would be generated. However, in this case, redundant loop generation has been suppressed because the --opt_for_speed option is set to 3 or lower. A possible solution is to add the MUST_ITERATE pragma to give the compiler more information about the minimum iteration count of the loop.
  • Register is live too long. Sometimes the compiler finds a valid software pipeline schedule, but one or more of the values is live too long. The lifetime of a register is the number of cycles between when a value is written into the register and the last cycle in which that value is read by another instruction. By definition, a value can never be live longer than the ii of the loop, because the next iteration of the loop overwrites that value before it is read. After this message, the compiler provides a detailed description of which values are live too long:
    ii = 11 Register is live too long
    |72| −> |74|
    |73| −> |75|
    

    The numbers 72, 73, 74, and 75 in this example correspond to line numbers and can be mapped back to the offending instructions. The compiler aggressively attempts to both prevent and fix live-too-long conditions. Techniques you can use to resolve them have a low probability of success, so they are not discussed in this document. In addition, the compiler can usually find a successful software pipeline schedule at a higher initiation interval (ii).
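Two of the remedies suggested above, splitting a loop into multiple smaller loops and adding the MUST_ITERATE pragma, can be sketched in C as follows. This is a minimal illustration rather than a guaranteed fix: the function names, the kernel itself, and the trip-count bounds (minimum 8, maximum 1024, multiple of 8) are hypothetical, and a loop this small would not normally trigger any of the failure messages. Compilers that do not recognize the pragma ignore it.

```c
/* Hypothetical kernel: one loop body produces two independent results.
 * In a large real loop, a body like this may keep many values live at
 * once and contribute to register allocation or scheduling failures. */
void scale_and_offset_fused(const float *restrict in,
                            float *restrict scaled,
                            float *restrict shifted,
                            int n, float k, float offset)
{
    /* Assumed guarantee: n is in [8, 1024] and is a multiple of 8. */
    #pragma MUST_ITERATE(8, 1024, 8)
    for (int i = 0; i < n; i++) {
        scaled[i]  = in[i] * k;
        shifted[i] = in[i] + offset;
    }
}

/* Same computation split into two simpler loops. Each loop has fewer
 * live values and fewer instructions to schedule, at the cost of
 * reading the input array twice. */
void scale_and_offset_split(const float *restrict in,
                            float *restrict scaled,
                            float *restrict shifted,
                            int n, float k, float offset)
{
    /* Same assumed trip-count guarantee as above. */
    #pragma MUST_ITERATE(8, 1024, 8)
    for (int i = 0; i < n; i++) {
        scaled[i] = in[i] * k;
    }

    #pragma MUST_ITERATE(8, 1024, 8)
    for (int i = 0; i < n; i++) {
        shifted[i] = in[i] + offset;
    }
}
```

The minimum in MUST_ITERATE is a promise made by the programmer. If the loop can run fewer iterations than stated, the generated code may be incorrect, so the bounds should reflect what the caller actually guarantees. Whether splitting helps depends on the loop, so compare the software pipeline information in the assembly comment blocks before and after the change.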