SPRUI30H November 2015 – May 2024 DRA745 , DRA746 , DRA750 , DRA756
The ARP32 ISA contains both 16-bit and 32-bit instructions. The mix is chosen to achieve optimal code size without sacrificing performance, measured as the amount of actual work done per instruction per cycle.
The ARP32 CPU allows 16-bit and 32-bit instructions to be mixed freely, without any overhead and without switching between instruction decoding modes. The program counter (PC) always points to a halfword address aligned with an instruction boundary. However, the program fetch is always aligned with a word boundary: the ARP32 CPU always requests a 32-bit word from a word-aligned address over its 32-bit wide program interface. Internally, the CPU maintains a 16-bit, single-entry fetch buffer (FetchBuffer[15:0]) to store an unused instruction word or part of an instruction word.
This fetch scheme allows a simpler program memory subsystem design and an area- and power-efficient implementation of a tightly coupled program memory subsystem, without incurring any overhead on code density.
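As an illustration of the PC and fetch alignment described above, the following Python sketch derives the word-aligned fetch address from a halfword-aligned PC. It assumes the PC is expressed as a byte address (so bit 0 is always zero). The function names are hypothetical and are not part of any TI tool or API.

```python
def fetch_word_address(pc: int) -> int:
    """Return the word-aligned byte address of the 32-bit word containing pc.

    Assumption: pc is a byte address that is halfword (2-byte) aligned.
    """
    if pc < 0 or pc % 2 != 0:
        raise ValueError("PC must be a non-negative, halfword-aligned address")
    # Clear the two low bits to obtain the enclosing word-aligned address.
    return pc & ~0x3


def starts_in_second_halfword(pc: int) -> bool:
    """True if pc addresses the second (higher-address) halfword of its word."""
    return (pc & 0x2) != 0


if __name__ == "__main__":
    for pc in (0x1000, 0x1002, 0x1004, 0x1006):
        print(hex(pc), hex(fetch_word_address(pc)), starts_in_second_halfword(pc))
```

In this model, two consecutive halfword-aligned PC values can map to the same fetch address, which is why a single-entry halfword buffer is enough to carry the unused half of a fetched word over to the next instruction.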
Because instructions in the ARP32 CPU are either 16 bit or 32 bit and can be mixed freely, any 32-bit word fetched from the program memory (FetchWord[31:0]) falls into one of the following cases:
Thus, for a sequential program with an arbitrary mix of 16-bit and 32-bit instructions, instructions execute without a single stall cycle in the pipeline (that is, without a stall caused by a whole instruction word being unavailable for decoding). However, if the target address of a program discontinuity (due to a branch, call, return, or interrupt) is not word aligned and a 32-bit instruction resides at the target address, a single 32-bit aligned fetch is not sufficient to fetch the full 32-bit instruction word. In this case, there is one stall (bubble) cycle in the CPU pipeline. The fetch engine of the CPU issues an additional instruction fetch request in the next cycle to complete the required 32-bit instruction fetch, and the CPU pipeline then continues normally.
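The discontinuity case above can be sketched with a small Python model that counts how many word-aligned fetches are needed to obtain the first instruction at a target address. This is only an illustrative model, not a description of the actual hardware: it assumes a byte-addressed PC, one 32-bit fetch per cycle, and an empty fetch buffer after a discontinuity. The function names are hypothetical.

```python
def fetches_for_target(target_pc: int, size_bits: int) -> int:
    """Number of word-aligned 32-bit fetches needed to cover one instruction.

    Assumptions: target_pc is a halfword-aligned byte address, and the
    instruction is 16 or 32 bits wide.
    """
    if target_pc < 0 or target_pc % 2 != 0:
        raise ValueError("target must be a non-negative, halfword-aligned address")
    if size_bits not in (16, 32):
        raise ValueError("instruction size must be 16 or 32 bits")
    first_word = target_pc & ~0x3                 # word holding the first halfword
    last_byte = target_pc + size_bits // 8 - 1    # last byte of the instruction
    last_word = last_byte & ~0x3                  # word holding the last halfword
    return (last_word - first_word) // 4 + 1


def stall_cycles_at_target(target_pc: int, size_bits: int) -> int:
    """Extra cycles beyond the first fetch, assuming one fetch per cycle."""
    return fetches_for_target(target_pc, size_bits) - 1


if __name__ == "__main__":
    for pc, size in ((0x2000, 16), (0x2000, 32), (0x2002, 16), (0x2002, 32)):
        print(hex(pc), size, stall_cycles_at_target(pc, size))
```

Under these assumptions, only the combination of a non-word-aligned target and a 32-bit instruction spans two words and therefore costs one extra fetch, which matches the single stall cycle described in the text.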