SPRUI04F July 2015 – April 2023
The program cache layout tool, clt6x, needs dynamic profile information in the form of a weighted call graph. From that information it produces a preferred function order command file, which guides function placement at link time when your application is rebuilt.
There are several ways to collect this dynamic profile information. For example, if you are running your application on hardware, you may be able to collect a PC discontinuity trace. The discontinuity trace can then be post-processed to construct weighted call graph input information for clt6x.
The method for collecting dynamic profile information that is presented here relies on the path profiling capabilities in the C6000 code generation tools. Here is how it works:
Using --gen_profile_info instructs the compiler to embed counters into the code along the execution paths of each function.
To compile only, use:
cl6x options --gen_profile_info src_file(s)
To compile and link, use:
cl6x options --gen_profile_info src_file(s) --run_linker --library lnk.cmd
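The two invocations above can be sketched as a small shell helper that prints the command line instead of running it, so the result can be inspected first. The function name instrumented_build_cmd and the CL6X_OPTS variable (standing in for the "options" placeholder) are illustrative assumptions, not part of the tools; only flags quoted in this section are used.

```shell
#!/bin/sh
# Sketch: print (not run) the instrumented build command.
# instrumented_build_cmd and CL6X_OPTS are hypothetical names.
# Usage: instrumented_build_cmd compile|link src_file...
instrumented_build_cmd() {
    mode=$1
    shift
    # CL6X_OPTS stands in for "options"; it adds nothing when unset or empty.
    cmd="cl6x${CL6X_OPTS:+ $CL6X_OPTS} --gen_profile_info $*"
    case "$mode" in
        compile) printf '%s\n' "$cmd" ;;
        link)    printf '%s\n' "$cmd --run_linker --library lnk.cmd" ;;
        *)       echo "usage: instrumented_build_cmd compile|link src_file..." >&2
                 return 1 ;;
    esac
}
```

Calling the helper with link and a list of source files prints the compile-and-link form. The printed line can be run by hand once it looks right.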
When the application runs, the counters embedded into the application by --gen_profile_info keep track of how many times a particular execution path through a function is traversed. The data collected in these counters is written out to a profile data file named pprofout.pdat.
The profile data file is automatically generated.
Once you have a profile data file, the file is decoded by the profile data decoder tool, pdd6x, as follows:
pdd6x -e=instrumented_app_out_file -o=pprofout.prf pprofout.pdat
pdd6x produces a .prf file, which is then fed into a recompile of the application. That recompile uses the profile information to generate weighted call graph input data.
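A decode step is only meaningful once the instrumented application has run and written its profile data. The following is a minimal sketch that checks for a non-empty profile data file and then prints the pdd6x command rather than running it. The function name decode_profile_cmd and the example executable name app.out are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: verify the profile data exists, then print the pdd6x command.
# decode_profile_cmd is a hypothetical helper name.
# Usage: decode_profile_cmd instrumented_out_file [pdat_file] [prf_file]
decode_profile_cmd() {
    app_out=$1
    pdat=${2:-pprofout.pdat}   # default name given in this section
    prf=${3:-pprofout.prf}
    if [ ! -s "$pdat" ]; then
        echo "error: $pdat is missing or empty; run the instrumented application first" >&2
        return 1
    fi
    printf 'pdd6x -e=%s -o=%s %s\n' "$app_out" "$prf" "$pdat"
}
```

For example, decode_profile_cmd app.out prints the decode command when pprofout.pdat is present in the current directory, and returns a non-zero status otherwise.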
The --analyze compiler option tells the compiler to generate weighted call graph or code coverage analysis information. Its syntax is as follows:
--analyze=callgraph | Instructs the compiler to generate weighted call graph information.
--analyze=codecov | Instructs the compiler to generate code coverage information. This option replaces the previous --codecov option.
The compiler also supports the --analyze_only option, which instructs the compiler to halt compilation once the analysis information has been generated. This option replaces the previous --onlycodecov option.
To make use of the dynamic profile information that you gathered, recompile the source code for your application using the --analyze=callgraph option in combination with the --use_profile_info option:
cl6x options -mo --analyze=callgraph --use_profile_info=pprofout.prf src_file(s)
The use of -mo instructs the compiler to generate code for each function into its own subsection. This option provides the linker with the means to directly control the placement of the code for a given function.
The compiler generates a CSV file containing weighted call graph information for each source file that is specified on the command line. If such a CSV file already exists, then new call graph analysis information will be appended to the existing CSV file. These CSV files are then input to the cache layout tool (clt6x) to produce a preferred function order command file for your application.
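Because new call graph data is appended to an existing CSV file, a repeated analysis recompile can mix results from an older profiling run with the current one. One way to avoid this is to clear the old files first, sketched below. This section does not state where the CSV files are written or how they are named, so the helper assumes they sit in a directory you pass in and end in .csv. The function names and the PRF_FILE variable are likewise hypothetical.

```shell
#!/bin/sh
# Sketch: remove stale call graph CSV files before the analysis recompile.
# Assumption: the CSV files are in the directory given as $1 and end in
# .csv; check where your build actually writes them before using this.
remove_stale_csv() {
    dir=${1:-.}
    removed=0
    for f in "$dir"/*.csv; do
        [ -f "$f" ] || continue    # the glob matched nothing
        rm -f -- "$f"
        removed=$((removed + 1))
    done
    echo "$removed"                # number of files removed
}

# Print (not run) the analysis recompile; arguments are the source files.
# PRF_FILE is a hypothetical override for the default .prf name.
callgraph_recompile_cmd() {
    prf=${PRF_FILE:-pprofout.prf}
    printf 'cl6x -mo --analyze=callgraph --use_profile_info=%s %s\n' "$prf" "$*"
}
```

Running remove_stale_csv on the output directory and then the command printed by callgraph_recompile_cmd gives clt6x CSV input that reflects only the latest profile. Skip the cleanup if you intend to accumulate data across several runs.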
For more details on the content of the CSV files (containing weighted call graph information) generated by the compiler, see Section 4.11.6.