The VCOP is an SIMD engine with built-in loop control and address generation. It is programmed at the level of array or 2D block processing. The vector core has the following resources:
- 4 nested for loops, with loop variables i1, i2, i3, and i4
- 8 address generators, each capable of 4-dimensional addressing; address pattern is base + i1*const1 + i2*const2 + i3*const3 + i4*const4
- 16-entry vector register file; each entry is 8-way SIMD × 40-bit signed (sign-extended or zero-padded from 8/16/32-bit signed/unsigned memory data, or zero-padded from operation upon register data)
- Two general-purpose functional units, each N-way SIMD, N = 2, 4, 8, 16, 32
- Table lookup unit supporting up to N parallel histogram operations
- 8 load units
- 8 store units
- Flat versus Aliased view of EVE memory
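
The address pattern listed for the address generators can be illustrated with a small software model. This is only a sketch in Python, not VCOP code: the function name `agen_addresses`, the argument names, and the choice of i1 as the outermost loop are assumptions made for illustration, and addresses are treated as plain integers.

```python
from itertools import product


def agen_addresses(base, consts, trip_counts):
    """Yield base + i1*const1 + i2*const2 + i3*const3 + i4*const4.

    consts and trip_counts are 4-tuples ordered (i1, i2, i3, i4).
    Assumption: i1 is the outermost loop and i4 the innermost.
    """
    if len(consts) != 4 or len(trip_counts) != 4:
        raise ValueError("expected four constants and four trip counts")
    # product() varies its last range fastest, so i4 is the inner loop.
    for idx in product(*(range(n) for n in trip_counts)):
        yield base + sum(i * c for i, c in zip(idx, consts))


# Hypothetical example: walk a block of 3 rows by 4 elements where
# consecutive rows are 16 address units apart. i1 and i2 are unused
# (trip count 1), i3 steps over rows, i4 steps along a row.
block = list(agen_addresses(base=0x100,
                            consts=(0, 0, 16, 1),
                            trip_counts=(1, 1, 3, 4)))
```

With these values the generator visits each row left to right before moving to the next row, which is the kind of 2D block traversal that the four loop variables and per-loop constants are meant to express.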
In addition, the VCOP supports the following functions:
- Generic compute
- Table lookup
- Histogram and weighted histogram
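
As a point of reference for the histogram functions, the following scalar Python sketch shows what a histogram and a weighted histogram compute. It does not model the table lookup unit or its parallel operation; the function names and arguments are hypothetical and chosen only for illustration.

```python
def weighted_histogram(indices, weights, num_bins):
    """Add each weight to the bin selected by the matching index."""
    if len(indices) != len(weights):
        raise ValueError("indices and weights must have the same length")
    hist = [0] * num_bins
    for idx, w in zip(indices, weights):
        hist[idx] += w
    return hist


def histogram(indices, num_bins):
    """A plain histogram is a weighted histogram with every weight 1."""
    return weighted_histogram(indices, [1] * len(indices), num_bins)


counts = histogram([0, 2, 2, 1], num_bins=3)
sums = weighted_histogram([0, 2, 2, 1], [5, 1, 2, 3], num_bins=3)
```

Here `counts` records how many times each bin index occurs, while `sums` accumulates the supplied weight per bin instead of a fixed increment of 1.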
For additional information about VCOP, see Section 8.3, VCOP CPU and Instruction Set.