SPRUJ28E November 2021 – September 2024 AM68 , AM68A , TDA4AL-Q1 , TDA4VE-Q1 , TDA4VL-Q1
The following code illustrates the basic algorithm for a block move using a 4-level loop nest. The model assumes that data is transferred 1 byte at a time and that the input and output blocks are the same size in all dimensions.
// sptr is a byte pointer.
// dptr is a byte pointer.
// Source address is indirect
if (TR_ISA) {
    sptr = *sptr;
}
// Destination address is indirect
if (TR_IDA) {
    dptr = *dptr;
}
// TR_TRIGx can be selected to come from a global event, a local event, or none.
// Check for trigger of TYPE0
if (TR_TRIG0_TYPE == TYPE0) while (!TR_TRIG0);
if (TR_TRIG1_TYPE == TYPE0) while (!TR_TRIG1);
for (i3 = 0; i3 < ICNT3; i3++) {
    sptr3 = sptr; // save current position before entering next level
    dptr3 = dptr; // save current position before entering next level
    // Check for trigger of TYPE1
    if (TR_TRIG0_TYPE == TYPE1) while (!TR_TRIG0);
    if (TR_TRIG1_TYPE == TYPE1) while (!TR_TRIG1);
    for (i2 = 0; i2 < ICNT2; i2++) {
        sptr2 = sptr; // save current position before entering next level
        dptr2 = dptr; // save current position before entering next level
        // Check for trigger of TYPE2
        if (TR_TRIG0_TYPE == TYPE2) while (!TR_TRIG0);
        if (TR_TRIG1_TYPE == TYPE2) while (!TR_TRIG1);
        for (i1 = 0; i1 < ICNT1; i1++) {
            sptr1 = sptr; // save current position before entering next level
            dptr1 = dptr; // save current position before entering next level
            // Check for trigger of TYPE3
            if (TR_TRIG0_TYPE == TYPE3) while (!TR_TRIG0);
            if (TR_TRIG1_TYPE == TYPE3) while (!TR_TRIG1);
            for (i0 = 0; i0 < ICNT0; i0++) {
                // UTC can combine these in optimized burst aligned accesses.
                *dptr = *sptr;
                sptr++;
                dptr++;
            }
            // Update based on saved pointer for this level
            sptr = sptr1 + SDIM1;
            dptr = dptr1 + DDIM1;
        }
        // Update based on saved pointer for this level
        sptr = sptr2 + SDIM2;
        dptr = dptr2 + DDIM2;
    }
    // Update based on saved pointer for this level
    sptr = sptr3 + SDIM3;
    dptr = dptr3 + DDIM3;
}
This form of addressing lets programs specify regular paths through memory with a small number of parameters. The trigger checks at each loop level also provide stall points where hardware or software can pause the transfer. The following table defines these parameters more explicitly.
Parameter | Definition |
---|---|
ICNT0 | Number of iterations for the innermost loop dimension, loop level 0. Because this model transfers 1 byte per iteration, ICNT0 is the number of contiguous bytes in dimension 0. |
ICNT1 | Number of iterations for the first level above the innermost loop, loop level 1. |
SDIM1 | Number of bytes between the starting points for consecutive iterations of loop level 1 for the source. |
DDIM1 | Number of bytes between the starting points for consecutive iterations of loop level 1 for the destination. |
ICNT2 | Number of iterations for loop level 2. |
SDIM2 | Number of bytes between the starting points for consecutive iterations of loop level 2 for the source. |
DDIM2 | Number of bytes between the starting points for consecutive iterations of loop level 2 for the destination. |
ICNT3 | Number of iterations for loop level 3. |
SDIM3 | Number of bytes between the starting points for consecutive iterations of loop level 3 for the source. |
DDIM3 | Number of bytes between the starting points for consecutive iterations of loop level 3 for the destination. |
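As an illustration of how these parameters combine, the following is a minimal software sketch of the loop nest, not the hardware implementation. It omits the trigger stalls and the indirect-address options (TR_ISA, TR_IDA) and computes each row address in closed form, which is equivalent to the saved-pointer updates in the model: the source byte at indices (i3, i2, i1, i0) is at `src + i3*SDIM3 + i2*SDIM2 + i1*SDIM1 + i0`. The names `block_move_params`, `block_move`, and `copy_tile_example` are hypothetical and exist only for this example; the struct is not a TR descriptor layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the parameters in the table above.
 * This is an illustration only, not a hardware TR layout. */
typedef struct {
    uint32_t  icnt0, icnt1, icnt2, icnt3; /* iteration counts, levels 0..3 */
    ptrdiff_t sdim1, sdim2, sdim3;        /* source strides, in bytes */
    ptrdiff_t ddim1, ddim2, ddim3;        /* destination strides, in bytes */
} block_move_params;

/* Software model of the 4-level block move, 1 byte per innermost
 * iteration. Trigger waits and indirect addressing are left out. */
static void block_move(uint8_t *dst, const uint8_t *src,
                       const block_move_params *p)
{
    assert(dst != NULL && src != NULL && p != NULL);

    for (uint32_t i3 = 0; i3 < p->icnt3; i3++) {
        for (uint32_t i2 = 0; i2 < p->icnt2; i2++) {
            for (uint32_t i1 = 0; i1 < p->icnt1; i1++) {
                /* Start of this level 1 iteration for source and destination. */
                const uint8_t *s = src + (ptrdiff_t)i3 * p->sdim3
                                       + (ptrdiff_t)i2 * p->sdim2
                                       + (ptrdiff_t)i1 * p->sdim1;
                uint8_t *d = dst + (ptrdiff_t)i3 * p->ddim3
                                 + (ptrdiff_t)i2 * p->ddim2
                                 + (ptrdiff_t)i1 * p->ddim1;
                /* Level 0: ICNT0 contiguous bytes. */
                for (uint32_t i0 = 0; i0 < p->icnt0; i0++) {
                    d[i0] = s[i0];
                }
            }
        }
    }
}

/* Example: gather a tile of 3 rows x 4 bytes from a source buffer whose
 * rows are 8 bytes apart into a packed destination (rows 4 bytes apart).
 * Levels 2 and 3 are unused, so their counts are 1 and their strides 0. */
static void copy_tile_example(uint8_t dst[12], const uint8_t src[24])
{
    const block_move_params p = {
        .icnt0 = 4, .icnt1 = 3, .icnt2 = 1, .icnt3 = 1,
        .sdim1 = 8, .sdim2 = 0, .sdim3 = 0,
        .ddim1 = 4, .ddim2 = 0, .ddim3 = 0,
    };
    block_move(dst, src, &p);
}
```

With a source buffer holding the values 0 through 23, `copy_tile_example` fills the destination with bytes 0..3, 8..11, and 16..19: SDIM1 = 8 skips the unused half of each source row, while DDIM1 = 4 packs the rows back to back.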