The C29x CPU pipeline is divided into four sets of pipeline stages that
operate independently of each other. This decoupling allows one set of
pipeline stages to progress while the other sets are stalled. The
decoupling of the pipeline stages is shown in Figure 4-2.
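As a rough illustration of what decoupling means here, the following Python sketch treats each of the four stage groups as advancing on its own condition alone. The group names and condition names (`instruction_valid`, `d2_ready`, `r1_ready`, `exe_ready`) are the ones used in the list that follows; the `GROUPS` table and both function names are hypothetical, and each condition is modeled as an opaque boolean, with no claim about how the hardware derives it.

```python
# Stage groups and the condition that advances each one (names from this section).
# The table layout and the function names are hypothetical, for illustration only.
GROUPS = {
    "F1-D1": "instruction_valid",
    "D2": "d2_ready",
    "R1": "r1_ready",
    "R2-E6": "exe_ready",
}


def advancing_groups(conditions):
    """Return the stage groups that advance this cycle.

    Each group looks only at its own condition, so a stalled group
    does not hold back the others (the decoupled behavior).
    """
    return [group for group, cond in GROUPS.items() if conditions[cond]]


def lockstep_advances(conditions):
    """For contrast: a fully coupled pipeline advances only if every condition holds."""
    return all(conditions[cond] for cond in GROUPS.values())


# Example cycle in which only the R1 condition is not met.
cycle = {
    "instruction_valid": True,
    "d2_ready": True,
    "r1_ready": False,
    "exe_ready": True,
}
print(advancing_groups(cycle))   # ['F1-D1', 'D2', 'R2-E6']
print(lockstep_advances(cycle))  # False
```

In this simplified view, a single unmet condition freezes a lock-step pipeline entirely, while the decoupled model keeps the other three groups moving.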
The four decoupled pipelines are:
- F1, F2, D1:
- The fetch 1 through decode 1 (F1-D1) hardware acts independently of the
decode 2 through execute 6 (D2-E6) hardware. This allows the CPU to
continue fetching instructions while the D2, R1, and R2-E6 phases are
stalled.
- When the read and write buses are not ready, or the pipeline protection
ready condition is not met, instruction fetches can still occur and
populate the instruction buffer. The instruction pipeline of the CPU can
advance even if the program bus is not ready, as long as the current
content of the instruction buffer is sufficient to form instruction
packets.
- The VLIW (Very Long Instruction Word) architecture of the CPU allows for
variable instruction widths and a variable number of instructions within
a packet, with a maximum of 8 instructions per packet. To optimize
performance, the instruction fetch bus always fetches 128 bits. The
instruction fetch buffer contains two sets of 128-bit buffers, which
store the fetched instructions and dispatch them to the D2 phase of the
pipeline based on the instruction packet sizes. This is depicted in
Figure 4-2. The condition that advances the F1, F2, D1 stage pipeline is
called instruction_valid.
- D2:
- Instructions in the fetch 1, fetch 2, and decode 1 phases are discarded
if an interrupt or other program-flow discontinuity occurs. An
instruction that reaches the decode 2 phase always runs to completion
before any program-flow discontinuity is taken.
- The instruction in D2 is forwarded to the next pipeline stage even when
the next instruction packet is not available. This is done to ease
timing related to ECC errors. The condition that advances the D2 stage
pipeline is called d2_ready.
- R1:
- The instruction from D2 can progress to the R1 stage of the pipeline
even when a prior read access is still in progress. In this case, the
read access from the current instruction is stored in a buffer and
reinitiated on the read bus when the ongoing access completes. This
helps to reduce timing issues associated with data read bus ready, write
bus ready, and memory read protection caused by pending writes in the
pipeline. The condition that advances the R1 stage pipeline is called
r1_ready.
- R2-E6:
- These pipeline stages advance together. The condition that advances
these stages is called exe_ready.
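The instruction fetch buffer behavior described for the F1-D1 stages can be sketched as a small Python model. Only the 128-bit fetch width and the two 128-bit buffers come from the text above; the `FetchBuffer` class, its method names, and the 96-bit and 48-bit packet sizes in the example are hypothetical choices for illustration and say nothing about the actual hardware implementation or the real C29x packet encodings.

```python
from collections import deque

FETCH_BITS = 128   # the fetch bus always delivers 128 bits (from the text)
NUM_BUFFERS = 2    # two sets of 128-bit buffers (from the text)


class FetchBuffer:
    """Toy model of the instruction fetch buffer (hypothetical, not the hardware design)."""

    def __init__(self):
        # Undispatched bits left in each occupied 128-bit buffer, oldest first.
        self.buffers = deque()

    def buffered_bits(self):
        """Total number of fetched bits not yet dispatched to D2."""
        return sum(self.buffers)

    def fetch(self, program_bus_ready):
        """Store one 128-bit fetch if the bus is ready and a buffer is free."""
        if program_bus_ready and len(self.buffers) < NUM_BUFFERS:
            self.buffers.append(FETCH_BITS)
            return True
        return False

    def dispatch(self, packet_bits):
        """Send one packet of packet_bits to D2 if enough content is buffered."""
        if packet_bits <= 0 or self.buffered_bits() < packet_bits:
            return False
        remaining = packet_bits
        while remaining > 0:
            take = min(remaining, self.buffers[0])
            self.buffers[0] -= take
            remaining -= take
            if self.buffers[0] == 0:
                self.buffers.popleft()  # buffer fully consumed, so free it
        return True


buf = FetchBuffer()
buf.fetch(program_bus_ready=True)    # fills the first buffer
buf.fetch(program_bus_ready=True)    # fills the second buffer

# Program bus not ready: packets can still be formed from buffered content.
print(buf.fetch(program_bus_ready=False))  # False
print(buf.dispatch(96))                    # True
print(buf.dispatch(48))                    # True
print(buf.buffered_bits())                 # 112
```

In this sketch, a packet may straddle the boundary between the two buffers, and a buffer becomes available for a new fetch only once all of its bits have been dispatched. Dispatch fails only when the buffered content is too small to form the requested packet, which mirrors the statement that the pipeline can advance without the program bus as long as the buffer holds enough content.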