SPRACW5A April   2021  – December 2021 F29H850TU , F29H859TU-Q1 , TMS320F2800132 , TMS320F2800133 , TMS320F2800135 , TMS320F2800137 , TMS320F280021 , TMS320F280021-Q1 , TMS320F280023 , TMS320F280023-Q1 , TMS320F280023C , TMS320F280025 , TMS320F280025-Q1 , TMS320F280025C , TMS320F280025C-Q1 , TMS320F280033 , TMS320F280034 , TMS320F280034-Q1 , TMS320F280036-Q1 , TMS320F280036C-Q1 , TMS320F280037 , TMS320F280037-Q1 , TMS320F280037C , TMS320F280037C-Q1 , TMS320F280038-Q1 , TMS320F280038C-Q1 , TMS320F280039 , TMS320F280039-Q1 , TMS320F280039C , TMS320F280039C-Q1 , TMS320F280040-Q1 , TMS320F280040C-Q1 , TMS320F280041 , TMS320F280041-Q1 , TMS320F280041C , TMS320F280041C-Q1 , TMS320F280045 , TMS320F280048-Q1 , TMS320F280048C-Q1 , TMS320F280049 , TMS320F280049-Q1 , TMS320F280049C , TMS320F280049C-Q1 , TMS320F28075 , TMS320F28075-Q1 , TMS320F28076 , TMS320F28374D , TMS320F28374S , TMS320F28375D , TMS320F28375S , TMS320F28375S-Q1 , TMS320F28376D , TMS320F28376S , TMS320F28377D , TMS320F28377D-EP , TMS320F28377D-Q1 , TMS320F28377S , TMS320F28377S-Q1 , TMS320F28378D , TMS320F28378S , TMS320F28379D , TMS320F28379D-Q1 , TMS320F28379S , TMS320F28384D , TMS320F28384D-Q1 , TMS320F28384S , TMS320F28384S-Q1 , TMS320F28386D , TMS320F28386D-Q1 , TMS320F28386S , TMS320F28386S-Q1 , TMS320F28388D , TMS320F28388S , TMS320F28P650DH , TMS320F28P650DK , TMS320F28P650SH , TMS320F28P650SK , TMS320F28P659DH-Q1 , TMS320F28P659DK-Q1 , TMS320F28P659SH-Q1

 

  1.   Trademarks
  2. 1Introduction
  3. 2ACI Motor Control Benchmark Application
    1. 2.1 Source Code
    2. 2.2 CCS Project for TMS320F28004x
    3. 2.3 CCS Project for TMS320F2837x
    4. 2.4 Validate Application Behavior
    5. 2.5 Benchmarking Methodology
      1. 2.5.1 Details of Benchmarking With Counters
    6. 2.6 ERAD Module for Profiling Application
  4. 3Real-time Benchmark Data Analysis
    1. 3.1 ADC Interrupt Response Latency
    2. 3.2 Peripheral Access
    3. 3.3 TMU (math enhancement) Impact
    4. 3.4 Flash Performance
    5. 3.5 Control Law Accelerator (CLA)
      1. 3.5.1 Full Signal Chain Execution on CLA
        1. 3.5.1.1 CLA ADC Interrupt Response Latency
        2. 3.5.1.2 CLA Peripheral Access
        3. 3.5.1.3 CLA Trigonometric Math Compute
      2. 3.5.2 Offloading Compute to CLA
  5. 4C2000 Value Proposition
    1. 4.1 Efficient Signal Chain Execution With Better Real-Time Response Than Higher Computational MIPS Devices
    2. 4.2 Excellent Real-Time Interrupt Response With Low Latency
    3. 4.3 Tight Peripheral Integration That Scales Applications With Large Number of Peripheral Accesses
    4. 4.4 Best in Class Trigonometric Math Engine
    5. 4.5 Versatile Performance Boosting Compute Engine (CLA)
    6. 4.6 Deterministic Execution due to Low Execution Variance
  6. 5Summary
  7. 6References
  8. 7Revision History

Offloading Compute to CLA

The C28x CPU can software trigger a CLA task using the IACK instruction. This can be used by the C28x to offload computations to the CLA at specific points in the code running on the C28x CPU. The ACI benchmark example is fairly linear in that one control algorithm block requires an input from a previous control algorithm block. However there are two instances where parallelism can be introduced:

  1. PID control block:

    There are three instances of the PID control algorithm and one of these (PID Id instance) does not have dependencies on the others and can be parallelized. In the implementation, the C28x offloads the PID Id control execution to the CLA Task 2 while the C28x is executing the PID for Speed and Iq in parallel.

  2. SVGen control block:

    The SVGen control block depends on the output of the Inverse Park transform. Since this is a sensor-less ACI implementation additional control blocks are also present such as Flux and Speed Estimators. The SVGen is not dependent on these and can be parallelized with the Estimators. In the implemetation the C28x offloads the SVGen calculation to the CLA Task 1.

For the implementation, see the 'SignalChain_RAM_TMU_CLA_OFFLOAD' build configuration. The implementation demonstrates how easy it is to transition code to offload compute to the CLA. The same PID inline C function header file is included on the C28x source file as well as CLA source file and gets compiled into C28x code on C28x side and CLA code on CLA side.

GUID-20211129-SS0I-TJPW-1RD2-R0D0PLR9WM9Z-low.pngFigure 3-9 ACI Motor Benchmark Output for C28x With CLA Offloading

The opportunity to offload compute to CLA resulted in an execution cycle count reduction thus boosting performance by 12%.