SPNU151 User guide

SPNU151W January 1998 – March 2023 66AK2E05 , 66AK2H06 , 66AK2H12 , 66AK2H14 , AM1705 , AM1707 , AM1802 , AM1806 , AM1808 , AM1810 , AM5K2E04 , OMAP-L132 , OMAP-L137 , OMAP-L138 , SM470R1B1M-HT , TMS470R1A288 , TMS470R1A384 , TMS470R1A64 , TMS470R1B1M , TMS470R1B512 , TMS470R1B768

5.14 ARM Instruction Intrinsics

Assembly instructions can be generated using the intrinsics in the following tables. Table 5-3 shows which intrinsics are available on the different ARM targets. Table 5-4 shows the calling syntax for each intrinsic, along with the corresponding assembly instruction and a description. Additional intrinsics for getting and setting the CPSR register and for enabling and disabling interrupts are provided in Section 6.8.1.

Table 5-3 ARM Intrinsic Support by Target

C/C++ Compiler Intrinsic	ARM V5e (ARM9E)	ARM V6 (ARM11)	ARM V6M0 (Cortex-M0)	ARM V7M3 (Cortex-M3)	ARM V7M4 (Cortex-M4)	ARM V7R (Cortex-R4)	ARM V7A8 (Cortex-A8)
__clz	yes	yes		yes	yes	yes	yes
__delay_cycles			yes	yes	yes	yes
__get_MSP			yes	yes	yes
__get_PRIMASK			yes	yes	yes
__ldrex		yes		yes	yes	yes	yes
__ldrexb		yes		yes	yes	yes	yes
__ldrexd		yes				yes	yes
__ldrexh		yes		yes	yes	yes	yes
__MCR	yes	yes		yes	yes	yes	yes
__MRC	yes	yes		yes	yes	yes	yes
__nop	yes	yes	yes	yes	yes	yes	yes
_norm	yes	yes		yes	yes	yes	yes
__rev		yes	yes		yes	yes	yes
__rev16		yes	yes		yes	yes	yes
__revsh		yes	yes		yes	yes	yes
__rbit		yes			yes	yes	yes
__ror	yes	yes	yes	yes	yes	yes	yes
_pkhbt		yes			yes	yes	yes
_pkhtb		yes			yes	yes	yes
_qadd16		yes			yes	yes	yes
_qadd8		yes			yes	yes	yes
_qaddsubx		yes			yes	yes	yes
_qsub16		yes			yes	yes	yes
_qsub8		yes			yes	yes	yes
_qsubaddx		yes			yes	yes	yes
_sadd	yes	yes			yes	yes	yes
_sadd16		yes			yes	yes	yes
_sadd8		yes			yes	yes	yes
_saddsubx		yes			yes	yes	yes
_sdadd	yes	yes			yes	yes	yes
_sdsub	yes	yes			yes	yes	yes
_sel		yes			yes	yes	yes
__set_MSP			yes	yes	yes
__set_PRIMASK			yes	yes	yes
_shadd16		yes			yes	yes	yes
_shadd8		yes			yes	yes	yes
_shsub16		yes			yes	yes	yes
_shsub8		yes			yes	yes	yes
_smac	yes	yes			yes	yes	yes
_smlabb	yes	yes			yes	yes	yes
_smlabt	yes	yes			yes	yes	yes
_smlad		yes			yes	yes	yes
_smladx		yes			yes	yes	yes
_smlalbb	yes	yes			yes	yes	yes
_smlalbt	yes	yes			yes	yes	yes
_smlald		yes			yes	yes	yes
_smlaldx		yes			yes	yes	yes
_smlaltb	yes	yes			yes	yes	yes
_smlaltt	yes	yes			yes	yes	yes
_smlatb	yes	yes			yes	yes	yes
_smlatt	yes	yes			yes	yes	yes
_smlawb	yes	yes			yes	yes	yes
_smlawt	yes	yes			yes	yes	yes
_smlsd		yes			yes	yes	yes
_smlsdx		yes			yes	yes	yes
_smlsld		yes			yes	yes	yes
_smlsldx		yes			yes	yes	yes
_smmla		yes			yes	yes	yes
_smmlar		yes			yes	yes	yes
_smmls		yes			yes	yes	yes
_smmlsr		yes			yes	yes	yes
_smmul		yes			yes	yes	yes
_smmulr		yes			yes	yes	yes
_smuad		yes			yes	yes	yes
_smuadx		yes			yes	yes	yes
_smusd		yes			yes	yes	yes
_smusdx		yes			yes	yes	yes
_smpy	yes	yes			yes	yes	yes
_smsub	yes	yes			yes	yes	yes
_smulbb	yes	yes			yes	yes	yes
_smulbt	yes	yes			yes	yes	yes
_smultb	yes	yes			yes	yes	yes
_smultt	yes	yes			yes	yes	yes
_smulwb	yes	yes			yes	yes	yes
_smulwt	yes	yes			yes	yes	yes
__sqrt	yes	yes				yes	yes
__sqrtf	yes	yes			yes	yes	yes
_ssat16		yes			yes	yes	yes
_ssata	yes	yes		yes	yes	yes	yes
_ssatl	yes	yes		yes	yes	yes	yes
_ssub	yes	yes			yes	yes	yes
_ssub16		yes			yes	yes	yes
_ssub8		yes			yes	yes	yes
_ssubaddx		yes			yes	yes	yes
__strex		yes		yes	yes	yes	yes
__strexb		yes		yes	yes	yes	yes
__strexd		yes				yes	yes
__strexh		yes		yes	yes	yes	yes
_subc	yes	yes			yes	yes	yes
_sxtab		yes			yes	yes	yes
_sxtab16		yes			yes	yes	yes
_sxtah		yes			yes	yes	yes
_sxtb	yes	yes		yes	yes	yes	yes
_sxtb16		yes			yes	yes	yes
_sxth	yes	yes		yes	yes	yes	yes
_uadd16		yes			yes	yes	yes
_uadd8		yes			yes	yes	yes
_uaddsubx		yes			yes	yes	yes
_uhadd16		yes			yes	yes	yes
_uhadd8		yes			yes	yes	yes
_uhsub16		yes			yes	yes	yes
_uhsub8		yes			yes	yes	yes
_umaal		yes			yes	yes	yes
_uqadd16		yes			yes	yes	yes
_uqadd8		yes			yes	yes	yes
_uqaddsubx		yes			yes	yes	yes
_uqsub16		yes			yes	yes	yes
_uqsub8		yes			yes	yes	yes
_uqsubaddx		yes			yes	yes	yes
_usad8		yes			yes	yes	yes
_usat16		yes			yes	yes	yes
_usata	yes	yes		yes	yes	yes	yes
_usatl	yes	yes		yes	yes	yes	yes
_usub16		yes			yes	yes	yes
_usub8		yes			yes	yes	yes
_usubaddx		yes			yes	yes	yes
_uxtab		yes			yes	yes	yes
_uxtab16		yes			yes	yes	yes
_uxtah		yes			yes	yes	yes
_uxtb	yes	yes		yes	yes	yes	yes
_uxtb16		yes			yes	yes	yes
_uxth	yes	yes		yes	yes	yes	yes
__wfe			yes	yes	yes	yes	yes
__wfi			yes	yes	yes	yes	yes
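
Because availability differs by target, code that must build for several of the targets in Table 5-3 may need a fallback for intrinsics that are missing on some of them. The following is a minimal sketch of one way to do that, not a prescribed pattern. HAVE_SADD_INTRINSIC is a hypothetical project-defined macro (the compiler does not define it), and sat_add32 is an illustrative name. The fallback assumes a 32-bit int and a 64-bit long long.

```c
#include <limits.h>

/* HAVE_SADD_INTRINSIC is a hypothetical macro set by the project's build,
   only for targets where Table 5-3 lists _sadd as available. */
#if defined(HAVE_SADD_INTRINSIC)
#define sat_add32(a, b) _sadd((a), (b))
#else
/* Portable fallback: compute the sum in 64 bits, then clamp it to the
   range of a 32-bit signed int. */
static int sat_add32(int a, int b)
{
    long long sum = (long long)a + (long long)b;

    if (sum > INT_MAX) {
        return INT_MAX;
    }
    if (sum < INT_MIN) {
        return INT_MIN;
    }
    return (int)sum;
}
#endif
```

With the fallback in use, sat_add32(INT_MAX, 1) yields INT_MAX rather than wrapping, which is the behavior expected of a saturated add.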

Table 5-4 shows the calling syntax for each intrinsic, along with the corresponding assembly instruction and a description. See Table 5-3 for a list of which intrinsics are available on the different ARM targets. Additional intrinsics for getting and setting the CPSR register and for enabling and disabling interrupts are provided in Section 6.8.1.

Table 5-4 ARM Compiler Intrinsics

C/C++ Compiler Intrinsic	Assembly Instruction	Description
int count = __clz(int src);	CLZ count, src	Returns the count of leading zeros.
void __delay_cycles(unsigned int cycles);	varies	Delays execution for the specified number of cycles, which must be a compile-time constant. The __delay_cycles intrinsic inserts code that consumes precisely the specified number of cycles with no side effects. Note: Cycle timing is based on 0 wait states. Results vary with additional wait states. The implementation does not account for dynamic prediction. Lower delay cycle counts may be less accurate given pipeline flush behaviors.
unsigned int dst = __get_MSP(void);	MRS dst, MSP	Returns the current value of the Main Stack Pointer.
unsigned int dst = __get_PRIMASK(void);	MRS dst, PRIMASK	Returns the current value of the Priority Mask Register. If this value is 1, activation of all exceptions with configurable priority is prevented.
unsigned int dst = __ldrex(void* src);	LDREX dst, src	Performs an exclusive load of word (32-bit) data from the memory address src.
unsigned int dst = __ldrexb(void* src);	LDREXB dst, src	Performs an exclusive load of byte data from the memory address src.
unsigned long long dst = __ldrexd(void* src);	LDREXD dst, src	Performs an exclusive load of doubleword (long long) data from the memory address src.
unsigned int dst = __ldrexh(void* src);	LDREXH dst, src	Performs an exclusive load of halfword (16-bit) data from the memory address src.
void __MCR(unsigned int coproc, unsigned int opc1, unsigned int src, unsigned int coproc_reg1, unsigned int coproc_reg2, unsigned int opc2);	MCR coproc, opc1, src, CR<coproc_reg1>, CR<coproc_reg2>, opc2	Writes src to a coprocessor register.
unsigned int dst = __MRC(unsigned int coproc, unsigned int opc1, unsigned int coproc_reg1, unsigned int coproc_reg2, unsigned int opc2);	MRC coproc, opc1, dst, CR<coproc_reg1>, CR<coproc_reg2>, opc2	Reads a coprocessor register and returns its value as dst.
void __nop(void);	NOP	Performs an instruction that does nothing.
int dst = _norm(int src);	CLZ dst, src	Counts leading zero bits. This intrinsic can be used when implementing integer normalization.
int dst = _pkhbt(int src1, int src2, int shift);	PKHBT dst, src1, src2, #shift	Combines the bottom halfword of src1 with the shifted top halfword of src2.
int dst = _pkhtb(int src1, int src2, int shift);	PKHTB dst, src1, src2, #shift	Combines the top halfword of src1 with the shifted bottom halfword of src2.
int dst = _qadd16(int src1, int src2);	QADD16 dst, src1, src2	Performs two signed saturated halfword additions.
int dst = _qadd8(int src1, int src2);	QADD8 dst, src1, src2	Performs four signed saturated 8-bit additions.
int dst = _qaddsubx(int src1, int src2);	QASX dst, src1, src2	Exchanges the halfwords of src2, then performs a signed saturated addition on the top halfwords and a signed saturated subtraction on the bottom halfwords.
int dst = _qsub16(int src1, int src2);	QSUB16 dst, src1, src2	Performs two signed saturated halfword subtractions.
int dst = _qsub8(int src1, int src2);	QSUB8 dst, src1, src2	Performs four signed saturated 8-bit subtractions.
int dst = _qsubaddx(int src1, int src2);	QSAX dst, src1, src2	Exchanges the halfwords of src2, then performs a signed saturated subtraction on the top halfwords and a signed saturated addition on the bottom halfwords.
int dst = __rbit(int src);	RBIT dst, src	Reverses the bit order in a word.
int dst = __rev(int src);	REV dst, src	Reverses byte order in a word. That is, converts 32-bit data between big-endian and little-endian.
int dst = __rev16(int src);	REV16 dst, src	Reverses byte order in each halfword of a word independently. That is, converts 16-bit data between big-endian and little-endian.
int dst = __revsh(int src);	REVSH dst, src	Reverses byte order in the lower halfword of a word, and extends the sign to 32 bits. That is, converts 16-bit signed data to 32-bit signed data, while also converting between big-endian and little-endian.
int dst = __ror(int src, int shift);	ROR dst, src, shift	Rotates the value to the right by the number of bits specified. Bits rotated off the right end are placed into empty bits on the left.
int dst = _sadd(int src1, int src2);	QADD dst, src1, src2	Saturated add.
int dst = _sadd16(int src1, int src2);	SADD16 dst, src1, src2	Performs two signed halfword additions.
int dst = _sadd8(int src1, int src2);	SADD8 dst, src1, src2	Performs four signed 8-bit additions.
int dst = _saddsubx(int src1, int src2);	SASX dst, src1, src2	Exchanges the halfwords of src2, then adds the top halfwords and subtracts the bottom halfwords.
int dst = _sdadd(int src1, int src2);	QDADD dst, src1, src2	Saturated double-add.
int dst = _sdsub(int src1, int src2);	QDSUB dst, src1, src2	Saturated double-subtract.
int dst = _sel(int src1, int src2);	SEL dst, src1, src2	Selects byte n from src1 if GE bit n is set or from src2 if GE bit n is not set, where n ranges from 0 to 3.
void __set_MSP(unsigned int src);	MSR MSP, src	Sets the value of the Main Stack Pointer to src.
unsigned int dst = __set_PRIMASK(unsigned int src);	MRS dst, PRIMASK (optional) MSR PRIMASK, src	Sets the Priority Mask Register to the src value and returns, as dst, the value it held before being set. Setting this register to 1 prevents the activation of all exceptions with configurable priority.
int dst = _shadd16(int src1, int src2);	SHADD16 dst, src1, src2	Performs two signed halfword additions and halves the results.
int dst = _shadd8(int src1, int src2);	SHADD8 dst, src1, src2	Performs four signed 8-bit additions and halves the results.
int dst = _shsub16(int src1, int src2);	SHSUB16 dst, src1, src2	Performs two signed halfword subtractions and halves the results.
int dst = _shsub8(int src1, int src2);	SHSUB8 dst, src1, src2	Performs four signed 8-bit subtractions and halves the results.
int dst = _smac(int dst, int src1, int src2);	SMULBB tmp, src1, src2 QDADD dst, dst, tmp	Saturated multiply-accumulate.
int dst = _smlabb(int dst, short src1, short src2);	SMLABB dst, src1, src2	Signed multiply-accumulate bottom halfwords.
int dst = _smlabt(int dst, short src1, int src2);	SMLABT dst, src1, src2	Signed multiply-accumulate bottom and top halfwords.
int dst = _smlad(int src1, int src2, int acc);	SMLAD dst, src1, src2, acc	Performs two signed 16-bit multiplications on the top and bottom halfwords of src1 and src2 and adds the results to acc.
int dst = _smladx(int src1, int src2, int acc);	SMLADX dst, src1, src2, acc	Same as _smlad except the halfwords in src2 are exchanged before the multiplication.
long long dst = _smlalbb(long long dst, short src1, short src2);	SMLALBB dstlo, dsthi, src1, src2	Signed multiply-long and accumulate bottom halfwords.
long long dst = _smlalbt(long long dst, short src1, int src2);	SMLALBT dstlo, dsthi, src1, src2	Signed multiply-long and accumulate bottom and top halfwords.
long long dst = _smlald(long long acc, int src1, int src2);	SMLALD dst, src1, src2	Performs two 16-bit multiplications on the top and bottom halfwords of src1 and src2 and adds the results to the 64-bit acc operand.
long long dst = _smlaldx(long long acc, int src1, int src2);	SMLALDX dst, src1, src2	Same as _smlald except the halfwords in src2 are exchanged.
long long dst = _smlaltb(long long dst, int src1, short src2);	SMLALTB dstlo, dsthi, src1, src2	Signed multiply-long and accumulate top and bottom halfwords.
long long dst = _smlaltt(long long dst, int src1, int src2);	SMLALTT dstlo, dsthi, src1, src2	Signed multiply-long and accumulate top halfwords.
int dst = _smlatb(int dst, int src1, short src2);	SMLATB dst, src1, src2	Signed multiply-accumulate top and bottom halfwords.
int dst = _smlatt(int dst, int src1, int src2);	SMLATT dst, src1, src2	Signed multiply-accumulate top halfwords.
int dst = _smlawb(int src1, short src2, int acc);	SMLAWB dst, src1, src2	Signed multiply-accumulate word and bottom halfword.
int dst = _smlawt(int src1, short src2, int acc);	SMLAWT dst, src1, src2	Signed multiply-accumulate word and top halfword.
int dst = _smlsd(int src1, int src2, int acc);	SMLSD dst, src1, src2, acc	Performs two signed 16-bit multiplications on the top and bottom halfwords of src1 and src2 and adds the difference of the results to acc.
int dst = _smlsdx(int src1, int src2, int acc);	SMLSDX dst, src1, src2, acc	Same as _smlsd except the halfwords in src2 are exchanged before the multiplication.
long long dst = _smlsld(long long acc, int src1, int src2);	SMLSLD dst, src1, src2	Performs two 16-bit multiplications on the top and bottom halfwords of src1 and src2 and adds the difference of the results to the 64-bit acc operand.
long long dst = _smlsldx(long long acc, int src1, int src2);	SMLSLDX dst, src1, src2	Same as _smlsld except the halfwords in src2 are exchanged.
int dst = _smmla(int src1, int src2, int acc);	SMMLA dst, src1, src2, acc	Performs a signed multiplication on src1 and src2, extracts the most significant 32 bits of the result, and adds an accumulate value.
int dst = _smmlar(int src1, int src2, int acc);	SMMLAR dst, src1, src2, acc	Same as _smmla except the result is rounded instead of being truncated.
int dst = _smmls(int src1, int src2, int acc);	SMMLS dst, src1, src2, acc	Performs a signed multiplication on src1 and src2, subtracts the result from an accumulate value that is shifted left by 32 bits, and extracts the most significant 32 bits of the result of the subtraction.
int dst = _smmlsr(int src1, int src2, int acc);	SMMLSR dst, src1, src2, acc	Same as _smmls except the result is rounded instead of being truncated.
int dst = _smmul(int src1, int src2, int acc);	SMMUL dst, src1, src2, acc	Performs a signed 32-bit multiplication on src1 and src2 and extracts the most significant 32 bits of the result.
int dst = _smmulr(int src1, int src2, int acc);	SMMULR dst, src1, src2, acc	Same as _smmul except the result is rounded instead of being truncated.
int dst = _smpy(int src1, int src2);	SMULBB dst, src1, src2 QADD dst, dst, dst	Saturated multiply.
int dst = _smsub(int src1, int src2);	SMULBB tmp, src1, src2 QDSUB dst, dst, tmp	Saturated multiply-subtract.
int dst = _smuad(int src1, int src2);	SMUAD dst, src1, src2	Performs two signed 16-bit multiplications on the top and bottom halfwords and adds the products.
int dst = _smuadx(int src1, int src2);	SMUADX dst, src1, src2	Same as _smuad except the halfwords in src2 are exchanged before the multiplication.
int dst = _smulbb(int src1, int src2);	SMULBB dst, src1, src2	Signed multiply bottom halfwords.
int dst = _smulbt(int src1, int src2);	SMULBT dst, src1, src2	Signed multiply bottom and top halfwords.
int dst = _smultb(int src1, int src2);	SMULTB dst, src1, src2	Signed multiply top and bottom halfwords.
int dst = _smultt(int src1, int src2);	SMULTT dst, src1, src2	Signed multiply top halfwords.
int dst = _smulwb(int src1, short src2, int acc);	SMULWB dst, src1, src2	Signed multiply word and bottom halfword.
int dst = _smulwt(int src1, short src2, int acc);	SMULWT dst, src1, src2	Signed multiply word and top halfword.
int dst = _smusd(int src1, int src2);	SMUSD dst, src1, src2	Performs two signed 16-bit multiplications on the top and bottom halfwords and subtracts the products.
int dst = _smusdx(int src1, int src2);	SMUSDX dst, src1, src2	Same as _smusd except the halfwords in src2 are exchanged before the multiplication.
double __sqrt(double);	VSQRT dst, src1	Returns the square root of the specified double. This intrinsic is directly replaced with the VSQRT instruction if --fp_mode=relaxed. If strict floating point mode is used, the function must also set errno if a domain error occurs.
float __sqrtf(float);	VSQRT dst, src1	Returns the square root of the specified float. This intrinsic is directly replaced with the VSQRT instruction if --fp_mode=relaxed. If strict floating point mode is used, the function must also set errno if a domain error occurs.
int dst = _ssat16(int src, int bitpos);	SSAT16 dst, #bitpos, src	Performs two halfword saturations to a selectable signed range specified by bitpos.
int dst = _ssata(int src, int shift, int bitpos);	SSAT dst, #bitpos, src, ASR #shift	Right shifts src and saturates to a selectable signed range specified by bitpos.
int dst = _ssatl(int src, int shift, int bitpos);	SSAT dst, #bitpos, src, LSL #shift	Left shifts src and saturates to a selectable signed range specified by bitpos.
int dst = _ssub(int src1, int src2);	QSUB dst, src1, src2	Saturated subtract.
int dst = _ssub16(int src1, int src2);	SSUB16 dst, src1, src2	Performs two signed halfword subtractions.
int dst = _ssub8(int src1, int src2);	SSUB8 dst, src1, src2	Performs four signed 8-bit subtractions.
int dst = _ssubaddx(int src1, int src2);	SSAX dst, src1, src2	Exchanges the halfwords of src2, then subtracts the top halfwords and adds the bottom halfwords.
int status = __strex(unsigned int src, void* dst);	STREX status, src, dst	Performs an exclusive store of word (32-bit) data to the memory address dst.
int status = __strexb(unsigned char src, void* dst);	STREXB status, src, dst	Performs an exclusive store of byte data to the memory address dst.
int status = __strexd(unsigned long long src, void* dst);	STREXD status, src, dst	Performs an exclusive store of doubleword (long long) data to the memory address dst.
int status = __strexh(unsigned short src, void* dst);	STREXH status, src, dst	Performs an exclusive store of halfword (16-bit) data to the memory address dst.
int dst = _subc(int src1, int src2);	SUBC dst, src1, src2	Subtract with carry.
int dst = _sxtab(int src1, int src2, int rotamt);	SXTAB dst, src1, src2, ROR #rotamt	Extracts an optionally rotated 8-bit value from src2 and sign extends it to 32 bits, then adds the value to src1. The rotation amount can be 0, 8, 16, or 24.
int dst = _sxtab16(int src1, int src2, int rotamt);	SXTAB16 dst, src1, src2, ROR #rotamt	Extracts two optionally rotated 8-bit values from src2 and sign extends them to 16 bits each, then adds the values to the two 16-bit values in src1. The rotation amount can be 0, 8, 16, or 24.
int dst = _sxtah(int src1, int src2, int rotamt);	SXTAH dst, src1, src2, ROR #rotamt	Extracts an optionally rotated 16-bit value from src2 and sign extends it to 32 bits, then adds the result to src1. The rotation amount can be 0, 8, 16, or 24.
int dst = _sxtb(int src1, int rotamt);	SXTB dst, src1, ROR #rotamt	Extracts an optionally rotated 8-bit value from src1 and sign extends it to 32 bits. The rotation amount can be 0, 8, 16, or 24.
int dst = _sxtb16(int src1, int rotamt);	SXTB16 dst, src1, ROR #rotamt	Extracts two optionally rotated 8-bit values from src1 and sign extends them to 16 bits each. The rotation amount can be 0, 8, 16, or 24.
int dst = _sxth(int src1, int rotamt);	SXTH dst, src1, ROR #rotamt	Extracts an optionally rotated 16-bit value from src1 and sign extends it to 32 bits. The rotation amount can be 0, 8, 16, or 24.
int dst = _uadd16(int src1, int src2);	UADD16 dst, src1, src2	Performs two unsigned halfword additions.
int dst = _uadd8(int src1, int src2);	UADD8 dst, src1, src2	Performs four unsigned 8-bit additions.
int dst = _uaddsubx(int src1, int src2);	UASX dst, src1, src2	Exchanges the halfwords of src2, then adds the top halfwords and subtracts the bottom halfwords.
int dst = _uhadd16(int src1, int src2);	UHADD16 dst, src1, src2	Performs two unsigned halfword additions and halves the results.
int dst = _uhadd8(int src1, int src2);	UHADD8 dst, src1, src2	Performs four unsigned 8-bit additions and halves the results.
int dst = _uhsub16(int src1, int src2);	UHSUB16 dst, src1, src2	Performs two unsigned halfword subtractions and halves the results.
int dst = _uhsub8(int src1, int src2);	UHSUB8 dst, src1, src2	Performs four unsigned 8-bit subtractions and halves the results.
int dst = _umaal(long long acc, int src1, int src2);	UMAAL dst1, dst2, src1, src2	Performs an unsigned 32-bit multiplication on src1 and src2, then adds the two unsigned 32-bit values in acc.
int dst = _uqadd16(int src1, int src2);	UQADD16 dst, src1, src2	Performs two unsigned saturated halfword additions.
int dst = _uqadd8(int src1, int src2);	UQADD8 dst, src1, src2	Performs four unsigned saturated 8-bit additions.
int dst = _uqaddsubx(int src1, int src2);	UQASX dst, src1, src2	Exchanges the halfwords of src2, then performs an unsigned saturated addition on the top halfwords and an unsigned saturated subtraction on the bottom halfwords.
int dst = _uqsub16(int src1, int src2);	UQSUB16 dst, src1, src2	Performs two unsigned saturated halfword subtractions.
int dst = _uqsub8(int src1, int src2);	UQSUB8 dst, src1, src2	Performs four unsigned saturated 8-bit subtractions.
int dst = _uqsubaddx(int src1, int src2);	UQSAX dst, src1, src2	Exchanges the halfwords of src2, then performs an unsigned saturated subtraction on the top halfwords and an unsigned saturated addition on the bottom halfwords.
int dst = _usad8(int src1, int src2);	USAD8 dst, src1, src2	Performs four unsigned 8-bit subtractions and adds the absolute values of the differences together.
int dst = _usat16(int src, int bitpos);	USAT16 dst, #bitpos, src	Performs two halfword saturations to a selectable unsigned range specified by bitpos.
int dst = _usata(int src, int shift, int bitpos);	USAT dst, #bitpos, src, ASR #shift	Right shifts src and saturates to a selectable unsigned range specified by bitpos.
int dst = _usatl(int src, int shift, int bitpos);	USAT dst, #bitpos, src, LSL #shift	Left shifts src and saturates to a selectable unsigned range specified by bitpos.
int dst = _usub16(int src1, int src2);	USUB16 dst, src1, src2	Performs two unsigned halfword subtractions.
int dst = _usub8(int src1, int src2);	USUB8 dst, src1, src2	Performs four unsigned 8-bit subtractions.
int dst = _usubaddx(int src1, int src2);	USAX dst, src1, src2	Exchanges the halfwords of src2, then subtracts the top halfwords and adds the bottom halfwords.
int dst = _uxtab(int src1, int src2, int rotamt);	UXTAB dst, src1, src2, ROR #rotamt	Extracts an optionally rotated 8-bit value from src2 and zero extends it to 32 bits, then adds the value to src1. The rotation amount can be 0, 8, 16, or 24.
int dst = _uxtab16(int src1, int src2, int rotamt);	UXTAB16 dst, src1, src2, ROR #rotamt	Extracts two optionally rotated 8-bit values from src2 and zero extends them to 16 bits each, then adds the values to the two 16-bit values in src1. The rotation amount can be 0, 8, 16, or 24.
int dst = _uxtah(int src1, int src2, int rotamt);	UXTAH dst, src1, src2, ROR #rotamt	Extracts an optionally rotated 16-bit value from src2 and zero extends it to 32 bits, then adds the result to src1. The rotation amount can be 0, 8, 16, or 24.
int dst = _uxtb(int src1, int rotamt);	UXTB dst, src1, ROR #rotamt	Extracts an optionally rotated 8-bit value from src1 and zero extends it to 32 bits. The rotation amount can be 0, 8, 16, or 24.
int dst = _uxtb16(int src1, int rotamt);	UXTB16 dst, src1, ROR #rotamt	Extracts two optionally rotated 8-bit values from src1 and zero extends them to 16 bits each. The rotation amount can be 0, 8, 16, or 24.
int dst = _uxth(int src1, int rotamt);	UXTH dst, src1, ROR #rotamt	Extracts an optionally rotated 16-bit value from src1 and zero extends it to 32 bits. The rotation amount can be 0, 8, 16, or 24.
void __wfe(void);	WFE	Wait for event. Saves power by waiting for an exception or event.
void __wfi(void);	WFI	Wait for interrupt. Enters standby, dormant, or shutdown mode, where an interrupt is required to wake up the processor.
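
The exclusive load and store intrinsics are normally used as a pair in a retry loop: the store returns a status of 0 when it succeeds and a non-zero status when exclusive access was lost, in which case the sequence is repeated. The following is a minimal sketch of an atomic add built that way, assuming a target where Table 5-3 lists __ldrex and __strex. The names load_exclusive, store_exclusive, and atomic_add_u32 are illustrative, and the non-target branch is a single-threaded stand-in that exists only so the sketch can be compiled and exercised on a host. The sketch does not address memory barriers.

```c
#include <stdint.h>

#if defined(__TI_ARM__)
/* Target build: map the helpers onto the intrinsics from Table 5-4. */
#define load_exclusive(p)     __ldrex((void *)(p))
#define store_exclusive(v, p) __strex((v), (void *)(p))
#else
/* Host stand-ins (illustrative, single-threaded only): the store always
   reports success by returning 0. */
static uint32_t load_exclusive(volatile uint32_t *p)
{
    return *p;
}

static int store_exclusive(uint32_t v, volatile uint32_t *p)
{
    *p = v;
    return 0;
}
#endif

/* Adds delta to *counter and returns the new value. The loop repeats the
   load/modify/store sequence until the exclusive store reports success. */
static uint32_t atomic_add_u32(volatile uint32_t *counter, uint32_t delta)
{
    uint32_t updated;

    do {
        updated = load_exclusive(counter) + delta;
    } while (store_exclusive(updated, counter) != 0);

    return updated;
}
```

Keeping the work between the load and the store short reduces the chance that exclusive access is lost and the loop has to retry.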

In addition, the compiler supports many of the intrinsics described in the ARM C Language Extensions (ACLE) specification. These intrinsics are applicable for the Cortex-M and Cortex-R processor variants. The ACLE intrinsics are implemented in order to support the development of source code that can be compiled using ACLE-compliant compilers from multiple vendors for a variety of ARM processors. A number of the intrinsics are duplicates of intrinsics listed in the previous table but with slightly different names (such as one vs. two leading underscores).

The compiler does not support all of the ACLE intrinsics listed in the ACLE specification. For example, the __cls, __clsl, and __clsll ACLE intrinsics are not supported, because the CLS instruction is not available on the Cortex-M or Cortex-R architectures.
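
Where code written for another ACLE-compliant compiler relies on one of the unsupported intrinsics, a software substitute may be an option. As an illustration only, the following sketch computes a count of leading sign bits for a 32-bit value from a count of leading zeros. The names clz32 and cls32 are hypothetical helpers, not part of the compiler or of ACLE. The portable clz32 loop could be replaced by the __clz intrinsic on targets where Table 5-3 lists it, provided a zero input is still handled as returning 32.

```c
#include <stdint.h>

/* Portable count of leading zeros; returns 32 when x is 0. */
static int clz32(uint32_t x)
{
    int n = 0;

    if (x == 0u) {
        return 32;
    }
    while ((x & 0x80000000u) == 0u) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Count of leading sign bits: the number of bits directly below the sign
   bit that have the same value as the sign bit. */
static int cls32(int32_t x)
{
    uint32_t u = (uint32_t)x;

    if (x < 0) {
        u = ~u; /* invert so that bits matching the sign become zeros */
    }
    return clz32(u) - 1; /* exclude the sign bit itself */
}
```

Under this definition, both 0 and -1 give 31, while the most negative 32-bit value gives 0.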

In order to use the ACLE intrinsics, your code must include the provided arm_acle.h header file. For details about the ACLE intrinsics, see the ACLE specification. For information about which ACLE intrinsics are supported, see the arm_acle.h file. Where applicable, the declarations of ACLE intrinsics that are not supported are enclosed in comments in that header file along with a brief explanation of why the intrinsic is not supported and a reference to the appropriate section in the ACLE specification where the intrinsic is described.
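
A minimal sketch of how an ACLE intrinsic might be used behind a wrapper is shown below. It assumes that the arm_acle.h file in your installation declares __rev with the ACLE signature (a 32-bit unsigned argument and result), which should be confirmed in the header itself. USE_ARM_ACLE is a hypothetical project-defined macro and byte_swap32 is an illustrative name. When the macro is not defined, a portable fallback is used instead.

```c
#include <stdint.h>

#if defined(USE_ARM_ACLE)
/* USE_ARM_ACLE is a hypothetical macro set by the project's build when the
   ACLE header is available and declares __rev. */
#include <arm_acle.h>
#define byte_swap32(x) __rev(x)
#else
/* Portable fallback: reverse the order of the four bytes in a word. */
static uint32_t byte_swap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
#endif
```

Routing calls through a wrapper such as this keeps the dependency on arm_acle.h in one place, which may help when the same source is built with more than one compiler.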