SPRUI04F July 2015 – April 2023
To align an array within a structure, place it inside a union with a dummy object that has the desired alignment. For 8-byte alignment, use a long long dummy field. For example:
struct s
{
    union u
    {
        long long dummy;    /* 8-byte alignment */
        short buffer[50];   /* also 8-byte alignment */
    } u;
    ...
};
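One way to check the union technique is to compute the offset of the array on a host compiler. The sketch below assumes a C11 compiler; the leading tag member and the two helper functions are hypothetical additions for illustration and are not part of the example above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion of the structure: the leading char member is
   added only to show that the union is still placed at an aligned offset. */
struct s
{
    char tag;                 /* 1 byte; the compiler pads after it */
    union u
    {
        long long dummy;      /* gives the union the alignment of long long */
        short buffer[50];     /* starts at offset 0 of the union, like dummy */
    } u;
};

/* Every member of a union starts at offset 0 of that union. */
static_assert(offsetof(union u, buffer) == 0, "buffer must start the union");

/* Byte offset of buffer measured from the start of struct s. */
size_t buffer_offset(void)
{
    return offsetof(struct s, u) + offsetof(union u, buffer);
}

/* Nonzero when buffer lies on a multiple of the alignment of long long. */
int buffer_is_aligned(void)
{
    return buffer_offset() % _Alignof(long long) == 0;
}
```

Where long long is 8-byte aligned, as the text above states for this target, buffer_is_aligned() reports that buffer falls on an 8-byte boundary even though a 1-byte member precedes the union.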
If you want to declare several arrays contiguously and maintain a given alignment, keep the size of each array, measured in bytes, an exact multiple of the desired alignment. For example:
struct s
{
    long long dummy;      /* 8-byte alignment */
    short buffer[50];     /* also 8-byte alignment */
    short buf2[50];       /* 4-byte alignment */
    ...
};
Because the size of buffer is 50 elements × 2 bytes per short = 100 bytes, and 100 is a multiple of 4 but not of 8, buf2 is aligned only on a 4-byte boundary. Padding buffer out to 52 elements (104 bytes) makes buf2 8-byte aligned.
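The arithmetic can be confirmed with offsetof. This is a minimal sketch, assuming that the arrays are ordinary structure members laid out one after another following the long long member, that short is 2 bytes, and that long long is 8 bytes. The type names s50 and s52 and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static_assert(sizeof(short) == 2, "sketch assumes 2-byte short");
static_assert(sizeof(long long) == 8, "sketch assumes 8-byte long long");

/* First array holds 50 shorts (100 bytes). */
struct s50
{
    long long dummy;      /* offset 0, 8 bytes */
    short buffer[50];     /* offset 8, 100 bytes */
    short buf2[50];       /* offset 108: a multiple of 4 but not of 8 */
};

/* First array padded to 52 shorts (104 bytes). */
struct s52
{
    long long dummy;      /* offset 0, 8 bytes */
    short buffer[52];     /* offset 8, 104 bytes */
    short buf2[50];       /* offset 112: a multiple of 8 */
};

/* Byte offset of buf2 in each layout. */
size_t buf2_offset_50(void) { return offsetof(struct s50, buf2); }
size_t buf2_offset_52(void) { return offsetof(struct s52, buf2); }
```

With 50 elements the offset of buf2 is 8 + 100 = 108, which leaves a remainder of 4 when divided by 8. With 52 elements it is 8 + 104 = 112, which is divisible by 8.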
Within a structure or class, there is no way to enforce an array alignment greater than 8 bytes. For the purposes of SIMD optimization, a larger alignment is not necessary.
In most cases, program-level optimization (see Section 4.4) entails compiling all of your source files with a single invocation of the compiler while using the -pm -o3 options. This lets the compiler see all of your source code at once, enabling optimizations that are rarely applied otherwise. For instance, the compiler can see that every call to the function f() passes the base address of an array to ptr, so ptr is always correctly aligned for SIMD optimization. In such a case, the _nassert() is not required: the compiler automatically determines that ptr must be aligned and produces the optimized SIMD instructions.
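The call pattern described here might look like the following sketch. The body of f(), the arrays arr_a and arr_b, and the helper functions are hypothetical. _Alignas(8) is used only so that the 8-byte alignment is explicit when the sketch is built with a C11 host compiler, and the run-time assertions stand in for the property that the compiler would deduce at compile time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical body for f(): sums n shorts starting at ptr. There is no
   _nassert() here; the assumption is that program-level optimization
   infers the alignment of ptr from the call sites. */
int f(const short *ptr, int n)
{
    int sum = 0;
    int i;
    for (i = 0; i < n; i++)
        sum += ptr[i];
    return sum;
}

/* Illustrative arrays with explicit 8-byte alignment. */
_Alignas(8) short arr_a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
_Alignas(8) short arr_b[4] = { 10, 20, 30, 40 };

/* Every call passes an array base address, never an offset such as arr_a + 1. */
int sum_a(void) { return f(arr_a, 8); }
int sum_b(void) { return f(arr_b, 4); }

/* Run-time check that both call sites pass 8-byte-aligned pointers. */
void check_call_sites(void)
{
    assert(((uintptr_t)arr_a & 0x7u) == 0);
    assert(((uintptr_t)arr_b & 0x7u) == 0);
}
```

If any call passed an interior address such as arr_a + 1, the property would no longer hold for every call site, and the compiler could not assume that ptr is aligned.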