SPRUIG3C January 2018 – August 2019 TDA4VM, TDA4VM-Q1
Several options can improve performance in certain circumstances. Not every option applies to every kernel, and some require refactoring the source.
--transpose – Enables the streaming-engine-based transpose read transformation to generate more efficient OFFSET_NP1 transpose sequences.
--vcop_simd=16 – Enables 16 lanes and changes VCOP_SIMD_WIDTH to 16.
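Because --vcop_simd=16 changes VCOP_SIMD_WIDTH, source that hard-codes the lane count may be one case where refactoring is needed. The following is a minimal host-side sketch of width-agnostic code, not output of the migration tool. The helper names vector_trip_count and add_arrays are hypothetical, and the fallback value of 8 for VCOP_SIMD_WIDTH is an assumption made only so the example builds with an ordinary C compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the tool chain normally supplies VCOP_SIMD_WIDTH.
   The fallback of 8 exists only so this sketch builds on a host. */
#ifndef VCOP_SIMD_WIDTH
#define VCOP_SIMD_WIDTH 8
#endif

static const size_t kLanes = (size_t)VCOP_SIMD_WIDTH;

/* Hypothetical helper: vector iterations needed to cover n elements. */
static size_t vector_trip_count(size_t n)
{
    return (n + kLanes - 1) / kLanes;
}

/* Hypothetical helper: element-wise add, walked one vector-sized
   chunk at a time. Nothing here hard-codes the lane count. */
static void add_arrays(const short *a, const short *b, short *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    size_t trips = vector_trip_count(n);
    for (size_t i = 0; i < trips; i++) {
        for (size_t lane = 0; lane < kLanes; lane++) {
            size_t idx = i * kLanes + lane;
            if (idx < n) {               /* guard the partial last chunk */
                out[idx] = (short)(a[idx] + b[idx]);
            }
        }
    }
}
```

Building the sketch on a host with -DVCOP_SIMD_WIDTH=16 exercises the same code at the wider width; the loop bounds adapt without further edits.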