TIDUD31B May 2017 – September 2019

 


Low Level Processing Details

An example processing chain for traffic monitoring using the medium range chirp and frame design is implemented on the IWR1642 EVM.

The main processing elements involved in the processing chain consist of the following:

  • Front end
    • Represents the antennas and the analog RF transceiver, which implements the FMCW transmitter and receiver and various hardware-based signal conditioning operations. The front end must be properly configured for the chirp and frame settings of the use case.
  • ADC
    • The ADC is the main element that interfaces to the DSP chain. The ADC output samples are buffered in ADC output buffers for access by the digital part of the processing chain.
  • EDMA controller
    • This is a user-programmed DMA engine employed to move data from one memory location to another without using another processor. The EDMA can be programmed to trigger automatically and can also be configured to reorder some of the data during the movement operations.
  • C674 DSP
    • This is the digital signal processing core that implements the configuration of the front end and executes the low level signal processing operations on the data. This core has access to several memory resources as noted further in the design description.
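
The data reordering mentioned for the EDMA controller can be pictured as a strided copy. The sketch below is plain Python, not EDMA driver code: the function name `strided_copy` and its parameters are hypothetical and only mimic the idea of a transfer whose destination index advances by a fixed stride, used here to turn a chirp-major buffer into a range-bin-major one.

```python
def strided_copy(src, n_rows, n_cols):
    """Copy a row-major n_rows x n_cols buffer into column-major order.

    Hypothetical stand-in for a DMA transfer whose destination index
    advances by n_rows for each consecutive source element.
    """
    dst = [None] * (n_rows * n_cols)
    for r in range(n_rows):          # for example, chirp index
        for c in range(n_cols):      # for example, range-bin index
            dst[c * n_rows + r] = src[r * n_cols + c]
    return dst


# Two chirps of three range bins each, stored chirp after chirp.
chirp_major = ["c0b0", "c0b1", "c0b2", "c1b0", "c1b1", "c1b2"]
bin_major = strided_copy(chirp_major, n_rows=2, n_cols=3)
print(bin_major)
# ['c0b0', 'c1b0', 'c0b1', 'c1b1', 'c0b2', 'c1b2']
```

After the copy, all chirps for a given range bin sit next to each other, which is the kind of layout a later per-range-bin Doppler FFT would read sequentially.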

The Low Level Processing chain is implemented on the DSP. There are several physical memory resources used in the processing chain, which are described in Table 2.

Table 2. Memory Configuration

| SECTION NAME | SIZE (KB) AS CONFIGURED | MEMORY USED (KB) | DESCRIPTION |
|---|---|---|---|
| L1D SRAM | 16 | 16 | Layer one data static RAM is the fastest data access memory for the DSP and is used for the most time-critical DSP processing data that can fit in this section. |
| L1D cache | 16 | Used as cache | Layer one data cache caches data accesses to any other section configured as cacheable. The LL2, L3, and HSRAM are configured as cacheable. |
| L1P SRAM | 28 | 22 | Layer one program static RAM is the fastest program access memory for the DSP and is used for the most time-critical DSP program code that can fit in this section. |
| L1P cache | 4 | Used as cache | Layer one program cache caches program accesses to any other section configured as cacheable. The LL2, L3, and HSRAM are configured as cacheable. |
| LL2 | 256 | 227.3 | Local layer two memory has lower access latency than layer three and is visible only from the DSP. This memory is used for most of the program and data for the signal processing chain. |
| L3 | 768 | 580 | Higher latency memory for DSP accesses that primarily stores the radar cube and the range-Doppler power map. Less time-sensitive program and data can also be stored here. |
| HSRAM | 32 | Currently unused | Shared memory buffer between the DSP and the R4F; relays visualization data to the R4F for output over the UART in this design. |

As shown in Figure 9, the signal-processing chain of the traffic monitoring example consists of the following blocks, implemented as DSP code executing on the C674x core in the IWR1642:

Figure 9. Low Level Signal Processing Chain
  • Range processing
    • For each antenna, EDMA is used to move samples from the ADC output buffer to the DSP's local memory. A 16-bit, fixed-point 1D windowing and a 16-bit, fixed-point 1D FFT are performed. EDMA is then used to move the output from DSP local memory to radar cube storage in layer three (L3) memory. Range processing is interleaved with the active chirp time of the frame. All other processing happens once per frame, except where noted, during the idle time between the active chirp time and the end of the frame.
  • Doppler processing, antenna combining
    • For each antenna, EDMA transfers data between the radar cube in L3 and DSP local L2 memory. The DSP operations are 16-bit, fixed-point 2D windowing, formatting from 16-bit, fixed-point IQ to floating-point IQ, floating-point 2D FFT, and non-coherent combining of received power across antennas in floating point. The output range-Doppler power signal, or heat map, is stored in L3 memory separate from the radar cube. Note that the per-antenna, floating-point Doppler data is discarded to reduce memory storage.
  • Range-Doppler detection
    • An algorithm is applied to the range-Doppler power map to find detection points in range and Doppler space. The algorithm consists of a first pass along the range axis using cell-averaging smallest-of (CASO) CFAR, and a second pass along the Doppler axis using cell-averaging (CA) CFAR. Due to the data access pattern, the detection code accesses the integrated signal in L3 memory through the L1D cache. The output detected point list is stored in L2 memory.
  • Angle estimation
    • For each detected point in range and Doppler space, the input to angle estimation is reconstructed by re-computing the per-antenna Doppler data from the radar cube and applying Doppler compensation for 2-TX antenna TDM-MIMO.
    • In the case of TDM-MIMO with velocity ambiguity, additional processing is needed for Vmax extension. This is subsequently referred to as angle correction for Vmax extension. For the 2-TX antenna case, two hypotheses of Doppler compensation are applied, one corresponding to a speed of (2*N-1)*v and one corresponding to a speed of 2*N*v. A beamforming search algorithm is then performed on both hypotheses, and whichever returns the larger peak is declared the correct angle estimate for this [range, Doppler] pair. The output is stored in L2 memory and then copied to the shared memory between the DSP and the ARM, along with the range and Doppler estimates and the detection SNR.
    • The angle correction for Vmax extension is very sensitive to antenna phase correctness, especially the TX antenna phase. TX antenna phase non-ideality introduces an angle bias similar to the one caused by the Doppler effect. If the TX antenna phase error is not compensated, the angle correction for Vmax extension produces a wrong angle, which shows up as a strongly biased ghost that causes tracking errors. This phase non-ideality must be compensated with the calibration procedure described in the "Range Bias and Rx Channel Gain/Offset Measurement and Compensation" section of the mmWave SDK documentation for the xwr16xx Out of Box Demo.
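
The two-pass range-Doppler detection described above can be sketched as follows. This is a minimal illustration in plain Python, not the DSP implementation: the function names, the guard and training window sizes, the threshold scale, the edge handling along range, and the cyclic wrap along Doppler are all assumptions made for the example.

```python
def _mean(values):
    """Mean of a list, or None when the list is empty."""
    return sum(values) / len(values) if values else None


def side_means(line, i, guard, win, wrap):
    """Mean power of the training cells on each side of cell i."""
    n = len(line)
    left_idx = [i - guard - k for k in range(1, win + 1)]
    right_idx = [i + guard + k for k in range(1, win + 1)]
    if wrap:   # assumed cyclic axis (used here for Doppler)
        left_idx = [j % n for j in left_idx]
        right_idx = [j % n for j in right_idx]
    else:      # assumed edge handling: drop cells outside the axis
        left_idx = [j for j in left_idx if 0 <= j < n]
        right_idx = [j for j in right_idx if 0 <= j < n]
    return _mean([line[j] for j in left_idx]), _mean([line[j] for j in right_idx])


def cfar_detect(line, i, guard, win, scale, mode, wrap=False):
    """True when cell i exceeds scale times the local noise estimate."""
    sides = [m for m in side_means(line, i, guard, win, wrap) if m is not None]
    if not sides:
        return False
    if mode == "caso":
        noise = min(sides)                # smallest of the two side averages
    else:
        noise = sum(sides) / len(sides)   # "ca": average of the side averages
    return line[i] > scale * noise


def detect(power_map, guard=1, win=2, scale=5.0):
    """First pass: CASO along range. Second pass: CA along Doppler."""
    n_range, n_doppler = len(power_map), len(power_map[0])
    points = []
    for d in range(n_doppler):
        column = [power_map[r][d] for r in range(n_range)]
        for r in range(n_range):
            if not cfar_detect(column, r, guard, win, scale, "caso"):
                continue
            if cfar_detect(power_map[r], d, guard, win, scale, "ca", wrap=True):
                points.append((r, d))
    return points


# Flat noise floor with one strong cell at range bin 5, Doppler bin 9.
power_map = [[1.0] * 16 for _ in range(16)]
power_map[5][9] = 50.0
points = detect(power_map)
print(points)
```

Only cells that pass the range-axis test are examined along Doppler, so the second pass touches a small fraction of the map.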
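
The hypothesis test used for angle correction with Vmax extension can be illustrated with a small simulation. The sketch below rests on several assumptions that are not stated in this design: 4 RX antennas (8 virtual antennas with 2 TX), a half-wavelength uniform linear virtual array ordered TX1 then TX2, a conventional beamforming grid search, and two hypotheses modeled as TX2 phase corrections that differ by pi. All function names are hypothetical.

```python
import cmath
import math

N_RX = 4  # assumed RX count; with 2 TX this gives 8 virtual antennas


def steering(theta_deg, n_ant):
    """Steering vector of a half-wavelength ULA (assumed geometry)."""
    w = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * w * k) for k in range(n_ant)]


def beamform_peak(x, angles):
    """Return (peak power, angle) of a conventional beamforming scan."""
    best = (-1.0, None)
    for a in angles:
        s = steering(a, len(x))
        power = abs(sum(xi * si.conjugate() for xi, si in zip(x, s))) ** 2
        if power > best[0]:
            best = (power, a)
    return best


def resolve(x, doppler_phase, angles):
    """Apply both TX2 compensation hypotheses and keep the larger peak.

    doppler_phase is the TX1-to-TX2 phase advance implied by the detected
    (possibly aliased) Doppler bin. The alternate hypothesis adds pi,
    which amounts to a sign flip on the TX2 half of the virtual array.
    """
    results = []
    for hyp, extra in enumerate((0.0, math.pi)):
        comp = cmath.exp(-1j * (doppler_phase + extra))
        y = x[:N_RX] + [v * comp for v in x[N_RX:]]
        power, angle = beamform_peak(y, angles)
        results.append((power, angle, hyp))
    return max(results)


# Simulated target at 20 degrees whose true TX1-to-TX2 phase advance is
# 0.3 + pi, while the aliased Doppler measurement only reports 0.3.
x = steering(20, 2 * N_RX)
true_shift = cmath.exp(1j * (0.3 + math.pi))
x = x[:N_RX] + [v * true_shift for v in x[N_RX:]]

power, angle, hyp = resolve(x, 0.3, range(-60, 61))
print(angle, hyp)
```

In this simulation the uncorrected hypothesis leaves the TX2 half in anti-phase, so its beamforming peak is both lower and displaced from the true angle. A residual TX phase error has the same effect on the comparison, which is why the calibration step matters.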