SPRAC21A June 2016 – June 2019 OMAP-L132, OMAP-L138, TDA2E, TDA2EG-17, TDA2HF, TDA2HG, TDA2HV, TDA2LF, TDA2P-ABZ, TDA2P-ACD, TDA2SA, TDA2SG, TDA2SX, TDA3LA, TDA3LX, TDA3MA, TDA3MD, TDA3MV
NOTE
On the TDA2xx and TDA2ex devices, all DSP subsystem transfer controllers yield identical performance for all transfer scenarios because both TCs have the same configuration and, most importantly, the same FIFOSIZE for a given burst size.
EDMA channel parameters allow many different transfer configurations. Typical transfer configurations result in the transfer controllers bursting the read/write data in chunks of the default burst size, thereby keeping the buses fully utilized. However, in some configurations, the TC issues read/write commands that are smaller than the default burst size, which reduces performance. To properly design a system, it is important to know which configurations offer the best performance for high-speed operations.
On TDA2xx and TDA2ex, there are two transfer controllers to move data between slave end points. The default configuration for the transfer controllers is shown in Table 21.
Name | Description | TC0 | TC1 |
---|---|---|---|
TCCFG[2:0] FIFOSIZE | Channel FIFO Size | 1024 Bytes | 1024 Bytes |
TCCFG[5:4] BUSWIDTH | Data Transfer Bus Width | 16 Bytes | 16 Bytes |
TCCFG[9:8] DSTREGDEPTH | Destination Register Depth | 4 entries | 4 entries |
DBS (Default Burst Size) | Size of each data burst | Configurable | Configurable |
The individual TC performance for paging/memory-to-memory transfers is essentially dictated by the TC configuration. In most scenarios, the FIFOSIZE and default burst size configuration of the TC have the most significant impact on TC performance. The BUSWIDTH configuration depends on the device architecture, and the DSTREGDEPTH value affects the number of in-flight transfers.
The default burst size (DBS) can be controlled with the C66x_OSS_BUS_CONFIG register in the TDA2xx and TDA2ex DSP Subsystem OCP Registers, as shown in Table 22.
Item | Value |
---|---|
Address Offset | 0x0000 0014 |
Physical Address | 0x1D0 0014 (DSP View) |
Instance | C66x_OCP_REGISTERS |
Description | Bus Configuration |
Type | RW |
31 | 30:28 | 27:25 | 24 | 23:22 | 21:20 | 19:18 | 17:16 | 15:14 | 13:12 | 11:10 | 9:8 | 7:6 | 5:4 | 3:2 | 1:0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RESERVED | SDMA_PRI | RESERVED | NOPOSTOVERRIDE | RESERVED | SDMA_L2PRES | RESERVED | CFG_L2PRES | RESERVED | TC1_L2PRES | RESERVED | TC0_L2PRES | RESERVED | TC1_DBS | RESERVED | TC0_DBS |
Bits | Field Name | Description | Type | Reset |
---|---|---|---|---|
31 | RESERVED | Reserved | R | 0x0 |
30:28 | SDMA_PRI | Sets the CBA/VBusM Priority for the CGEM SDMA port. Can typically be left at default value. | R | 0x4 |
27:25 | RESERVED | Reserved | R | 0x0 |
24 | NOPOSTOVERRIDE | Non-Posted writes setting | RW | 0x1 |
23:22 | RESERVED | Reserved | R | 0x0 |
21:20 | SDMA_L2PRES | OCP Slave port L2 interconnect pressure driven on ocp mflag to control arbitration within the L2 interconnect | RW | 0x0 |
19:18 | RESERVED | Reserved | R | 0x0 |
17:16 | CFG_L2PRES | CGEM CFG L2 interconnect pressure driven on ocp mflag to control arbitration within the L2 interconnect | RW | 0x0 |
15:14 | RESERVED | Reserved | R | 0x0 |
13:12 | TC1_L2PRES | TC1 L2 interconnect pressure driven on ocp mflag to control arbitration within the L2 interconnect | RW | 0x0 |
11:10 | RESERVED | Reserved | R | 0x0 |
9:8 | TC0_L2PRES | TC0 L2 interconnect pressure driven on ocp mflag to control arbitration within the L2 interconnect | RW | 0x0 |
7:6 | RESERVED | Reserved | R | 0x0 |
5:4 | TC1_DBS | TC1 Default Burst size setting | RW | 0x3 |
3:2 | RESERVED | Reserved | R | 0x0 |
1:0 | TC0_DBS | TC0 Default Burst size setting | RW | 0x3 |
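As an illustration of the field layout above, the following minimal sketch reads and updates the TC0_DBS and TC1_DBS fields of a C66x_OSS_BUS_CONFIG value. It operates on a plain 32-bit value rather than on the hardware register itself. The helper names are hypothetical, and the code-to-bytes mapping (0 to 3 selecting 16 to 128 bytes) is an assumption: this document only implies that the reset code 0x3 corresponds to 128 bytes, so confirm the encoding against the device TRM.

```c
#include <stdint.h>

/* Field positions taken from the C66x_OSS_BUS_CONFIG table above. */
#define TC0_DBS_SHIFT  0u
#define TC1_DBS_SHIFT  4u
#define DBS_FIELD_MASK 0x3u

/* ASSUMED encoding: 0 -> 16, 1 -> 32, 2 -> 64, 3 -> 128 bytes.
 * Only 0x3 == 128 bytes is implied by this document. */
static inline uint32_t dbs_code_to_bytes(uint32_t code)
{
    return 16u << (code & DBS_FIELD_MASK);
}

/* Extract the DBS code of TC0 (tc == 0) or TC1 (tc != 0) from a register value. */
static inline uint32_t bus_config_get_dbs(uint32_t reg, int tc)
{
    uint32_t shift = tc ? TC1_DBS_SHIFT : TC0_DBS_SHIFT;
    return (reg >> shift) & DBS_FIELD_MASK;
}

/* Return reg with one DBS field replaced; all other bits are preserved. */
static inline uint32_t bus_config_set_dbs(uint32_t reg, int tc, uint32_t code)
{
    uint32_t shift = tc ? TC1_DBS_SHIFT : TC0_DBS_SHIFT;
    reg &= ~(DBS_FIELD_MASK << shift);
    reg |= (code & DBS_FIELD_MASK) << shift;
    return reg;
}
```

On a target, the value would typically be read from the register, modified with `bus_config_set_dbs`, and written back as a read-modify-write so that the priority and pressure fields keep their settings.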
The TC read and write controllers in conjunction with the source and destination register sets are responsible for issuing optimally-sized reads and writes to the slave endpoints. An optimally-sized command is defined by the transfer controller default burst size (DBS).
The EDMA_TPTC attempts to issue the largest possible command size, as limited by the DBS value or by the ABCNT_n[15:0] ACNT and ABCNT_n[31:16] BCNT values of the TR. The EDMA_TPTC obeys two rules: the read/write controllers always issue commands less than or equal to the DBS value, and the first command of a 1D transfer always aligns the address of subsequent commands to the DBS value.
Table 23 lists the TR segmentation rules that are followed by the EDMA_TPTC. In summary, if the ABCNT_n[15:0] ACNT value is larger than the DBS value, the EDMA_TPTC breaks each ACNT-byte array into DBS-sized commands to the source/destination addresses. The ABCNT_n[31:16] BCNT arrays are then serviced in succession.
For BCNT arrays of ACNT bytes (that is, a 2D transfer), if the ABCNT_n[15:0] ACNT value is less than or equal to the DBS value, the TR may be optimized into a 1D transfer to maximize efficiency. The optimization takes place if the EDMA_TPTC recognizes that the 2D transfer is organized as a single dimension (ABCNT_n[15:0] ACNT == BIDX_n) and the ACNT value is a power of 2. Table 23 also requires BCNT ≤ 1023 and incrementing source and destination addressing (SAM/DAM = Increment).
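The segmentation rules can be made concrete with a simplified model. The sketch below estimates how one array of ACNT bytes is split into commands for a single address (source or destination), assuming the first command runs only up to the next DBS boundary and every later command is DBS-sized except possibly the last. The function names are hypothetical, and the model ignores FIFO and interconnect effects, so it is an approximation of the stated rules rather than a description of the actual hardware logic.

```c
#include <stdint.h>

/* Size in bytes of the first command for one array.
 * dbs must be non-zero. If addr is already aligned, the first
 * command can be a full DBS-sized burst. */
static uint32_t first_cmd_bytes(uint32_t addr, uint32_t acnt, uint32_t dbs)
{
    uint32_t to_boundary = dbs - (addr % dbs); /* equals dbs when aligned */
    return (acnt < to_boundary) ? acnt : to_boundary;
}

/* Number of commands needed for one array of acnt bytes at addr. */
static uint32_t cmds_per_array(uint32_t addr, uint32_t acnt, uint32_t dbs)
{
    if (acnt == 0u) {
        return 0u;
    }
    uint32_t first = first_cmd_bytes(addr, acnt, dbs);
    uint32_t rest  = acnt - first;
    /* Remaining bytes go out in DBS-sized commands, rounded up. */
    return 1u + (rest + dbs - 1u) / dbs;
}
```

Under this model, a 256-byte array with a 128-byte DBS takes two commands from an aligned address such as 0x1000, but three commands from 0x1010, because the first command carries only the 112 bytes up to the next boundary.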
ACNT ≤ DBS | ACNT is Power of 2 | BIDX = ACNT | BCNT ≤ 1023 | SAM/DAM = Increment | Description |
---|---|---|---|---|---|
Yes | Yes | Yes | Yes | Yes | Optimized |
No | X | X | X | X | Not Optimized |
X | No | X | X | X | Not Optimized |
X | X | No | X | X | Not Optimized |
X | X | X | No | X | Not Optimized |
X | X | X | X | No | Not Optimized |
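The conditions in the table above can be expressed as a single predicate. The following sketch is a software-side check that a driver or test could use to predict whether a TR is eligible for the 2D-to-1D optimization. The function names and parameter list are hypothetical. The check covers one side of the transfer, so it would presumably need to hold for both the source (SAM, source BIDX) and the destination (DAM, destination BIDX).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if v is a non-zero power of two. */
static bool is_power_of_two(uint32_t v)
{
    return v != 0u && (v & (v - 1u)) == 0u;
}

/* Mirrors the five table columns for one side (source or destination):
 *   ACNT <= DBS, ACNT is a power of 2, BIDX == ACNT,
 *   BCNT <= 1023, addressing mode is increment. */
static bool tr_is_1d_optimizable(uint32_t acnt, uint32_t bcnt, int32_t bidx,
                                 bool increment_mode, uint32_t dbs_bytes)
{
    return acnt <= dbs_bytes
        && is_power_of_two(acnt)
        && bidx >= 0 && (uint32_t)bidx == acnt
        && bcnt <= 1023u
        && increment_mode;
}
```

For example, with a 128-byte DBS, a TR with ACNT = 64, BCNT = 16, and BIDX = 64 in increment mode satisfies every column, whereas the same TR with BIDX = 128 or ACNT = 96 does not.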
In summary, Table 24 lists the factors that affect the EDMA performance.
Factors | Impact | General Recommendation |
---|---|---|
Source/Destination Memory | The transfer speed depends on SRC/DST memory bandwidth. | Know the nature of the source and destination memory, specifically the frequency of operation and the bus width. |
Transfer Size | Throughput is lower for small transfers due to transfer overhead/latency. | Configure EDMA for larger transfer sizes, because the throughput of small transfers is dominated by transfer overhead. |
A-Sync/AB-Sync | Performance depends on the number of TRs (Transfer Requests). More TRs would mean more overhead. | Using AB-Sync transfers gives better performance than chaining A-Sync transfers. |
Source/Destination BIDX | Optimization is not performed if BIDX is not equal to the ACNT value. | Whenever possible, follow the EDMA TC optimization guidelines. See the TPTC spec for optimization details. |
Queue TC Usage | Performance is the same for both TCs. | Both TCs have the same configuration and show the same performance. |
Burst Size | Determines the largest possible read/write command submitted by the TC. | The default burst size for all transfer controllers is 128 bytes. This also results in the most efficient transfers/throughput in most memory-to-memory transfer scenarios. |
Source/Destination Alignment | Slight performance degradation if the source/destination addresses are not aligned to Default Burst Size (DBS) boundaries. | For smaller transfers, align source and destination addresses to DBS boundaries as much as possible. |
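To illustrate the A-Sync/AB-Sync row, the sketch below counts the transfer requests needed to move the same BCNT x CCNT block under each synchronization type. It relies on the general EDMA3 behavior that an A-synchronized event moves one array of ACNT bytes, while an AB-synchronized event moves BCNT arrays. The function names are hypothetical, and the counts say nothing about the per-TR overhead itself, which depends on the system.

```c
#include <stdint.h>

/* A-Sync: one TR per array, so BCNT * CCNT TRs in total.
 * The cast avoids signed overflow from integer promotion. */
static uint32_t trs_a_sync(uint16_t bcnt, uint16_t ccnt)
{
    return (uint32_t)bcnt * ccnt;
}

/* AB-Sync: one TR per frame of BCNT arrays, so CCNT TRs in total. */
static uint32_t trs_ab_sync(uint16_t bcnt, uint16_t ccnt)
{
    (void)bcnt; /* BCNT does not change the TR count for AB-Sync */
    return ccnt;
}
```

With BCNT = 64 and CCNT = 4, the A-Sync configuration needs 256 TRs and the AB-Sync configuration needs 4, which is why the per-TR overhead weighs more heavily on A-Sync transfers.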