SPRAB89A September 2011 – March 2014
The compiler generates calls to helper functions to perform operations that need to be supported by the compiler, but are not supported directly by the architecture, such as floating-point operations on devices that lack dedicated hardware. These helper functions must be implemented in the RTS library of any toolchain that conforms to the ABI.
Helper functions are named using the prefix _ _C6000_. Any identifier with this prefix is reserved for the ABI. In addition, the _ _tls_get_addr() helper function is needed to support dynamic linking access to thread-local storage.
The helper functions adhere to the standard calling conventions, except as indicated in Section 8.4.
The following tables specify the helper functions using C notation and syntax. The types in the tables correspond to the generic data types specified in Section 2.2.
The functions in Table 8-1 convert floating-point values to integer values, in accordance with C's conversion rules and the floating-point behavior specified by Section 8.2.
Signature | Description |
---|---|
int32 _ _C6000_fixdi(float64 x); | Convert float64 to int32 |
int40 _ _C6000_fixdli(float64 x); | Convert float64 to int40 |
int64 _ _C6000_fixdlli(float64 x); | Convert float64 to int64 |
uint32 _ _C6000_fixdu(float64 x); | Convert float64 to uint32 |
uint40 _ _C6000_fixdul(float64 x); | Convert float64 to uint40 |
uint64 _ _C6000_fixdull(float64 x); | Convert float64 to uint64 |
int32 _ _C6000_fixfi(float32 x); | Convert float32 to int32 |
int40 _ _C6000_fixfli(float32 x); | Convert float32 to int40 |
int64 _ _C6000_fixflli(float32 x); | Convert float32 to int64 |
uint32 _ _C6000_fixfu(float32 x); | Convert float32 to uint32 |
uint40 _ _C6000_fixful(float32 x); | Convert float32 to uint40 |
uint64 _ _C6000_fixfull(float32 x); | Convert float32 to uint64 |
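As a rough host-side illustration of the C conversion rules these helpers follow, the sketch below models two of them with ordinary C casts. The `model_` names are hypothetical and are not part of the ABI; the real helpers are provided by the RTS library. The sketch assumes the argument is in range for the destination type, since an out-of-range conversion is undefined behavior in C.

```c
#include <stdint.h>

/* Hypothetical host-side models; these are not the RTS implementations. */

/* Models _ _C6000_fixdi: float64 to int32, discarding the fraction
   (truncation toward zero, as C requires). */
int32_t model_fixdi(double x)
{
    return (int32_t)x;
}

/* Models _ _C6000_fixfu: float32 to uint32, discarding the fraction. */
uint32_t model_fixfu(float x)
{
    return (uint32_t)x;
}
```

Under these assumptions, `model_fixdi(-2.7)` yields -2 rather than -3, because the fractional part is discarded instead of rounded.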
The functions in Table 8-2 convert integer values to floating-point values, in accordance with C's conversion rules and the floating-point behavior specified by Section 8.2.
Signature | Description |
---|---|
float64 _ _C6000_fltid(int32 x); | Convert int32 to double-precision float |
float64 _ _C6000_fltlid(int40 x); | Convert int40 to double-precision float |
float64 _ _C6000_fltllid(int64 x); | Convert int64 to double-precision float |
float64 _ _C6000_fltud(uint32 x); | Convert uint32 to double-precision float |
float64 _ _C6000_fltuld(uint40 x); | Convert uint40 to double-precision float |
float64 _ _C6000_fltulld(uint64 x); | Convert uint64 to double-precision float |
float32 _ _C6000_fltif(int32 x); | Convert int32 to single-precision float |
float32 _ _C6000_fltlif(int40 x); | Convert int40 to single-precision float |
float32 _ _C6000_fltllif(int64 x); | Convert int64 to single-precision float |
float32 _ _C6000_fltuf(uint32 x); | Convert uint32 to single-precision float |
float32 _ _C6000_fltulf(uint40 x); | Convert uint40 to single-precision float |
float32 _ _C6000_fltullf(uint64 x); | Convert uint64 to single-precision float |
The functions in Table 8-3 convert floating-point values from one format to another, in accordance with C's conversion rules and the floating-point behavior specified by Section 8.2.
Signature | Description |
---|---|
float32 _ _C6000_cvtdf(float64 x); | Convert double-precision float to single-precision |
float64 _ _C6000_cvtfd(float32 x); | Convert single-precision float to double-precision |
The functions in Table 8-4 perform floating-point arithmetic, in accordance with C semantics and the floating-point behavior specified by Section 8.2.
Signature | Description |
---|---|
float64 _ _C6000_absd(float64 x); | Return absolute value of double-precision float |
float32 _ _C6000_absf(float32 x); | Return absolute value of single-precision float |
float64 _ _C6000_addd(float64 x, float64 y); | Add two double-precision floats (x+y) |
float32 _ _C6000_addf(float32 x, float32 y); | Add two single-precision floats (x+y) |
float64 _ _C6000_divd(float64 x, float64 y); | Divide two double-precision floats (x/y) |
float32 _ _C6000_divf(float32 x, float32 y); | Divide two single-precision floats (x/y) |
float64 _ _C6000_mpyd(float64 x, float64 y); | Multiply two double-precision floats (x*y) |
float32 _ _C6000_mpyf(float32 x, float32 y); | Multiply two single-precision floats (x*y) |
float64 _ _C6000_negd(float64 x); | Return negated double-precision float (-x) |
float32 _ _C6000_negf(float32 x); | Return negated single-precision float (-x) |
float64 _ _C6000_subd(float64 x, float64 y); | Subtract two double-precision floats (x-y) |
float32 _ _C6000_subf(float32 x, float32 y); | Subtract two single-precision floats (x-y) |
int64 _ _C6000_trunc(float64 x); | Truncate double-precision float toward zero |
int32 _ _C6000_truncf(float32 x); | Truncate single-precision float toward zero |
The functions in Table 8-5 perform floating-point comparisons in accordance with C semantics and the floating-point behavior specified by Section 8.2.
The _ _C6000_cmp* functions return an integer less than 0 if x is less than y, 0 if the values are equal, or an integer greater than 0 if x is greater than y. If either operand is NaN, the result is undefined.
The explicit comparison functions operate correctly with unordered (NaN) operands. That is, each returns non-zero if the comparison is true and 0 otherwise, even when one of the operands is NaN.
Signature | Description |
---|---|
int32 _ _C6000_cmpd(float64 x, float64 y); | Double-precision comparison |
int32 _ _C6000_cmpf(float32 x, float32 y); | Single-precision comparison |
int32 _ _C6000_unordd(float64 x, float64 y); | Double-precision check for unordered operands |
int32 _ _C6000_unordf(float32 x, float32 y); | Single-precision check for unordered operands |
int32 _ _C6000_eqd(float64 x, float64 y); | Double-precision comparison: x == y |
int32 _ _C6000_eqf(float32 x, float32 y); | Single-precision comparison: x == y |
int32 _ _C6000_neqd(float64 x, float64 y); | Double-precision comparison: x != y |
int32 _ _C6000_neqf(float32 x, float32 y); | Single-precision comparison: x != y |
int32 _ _C6000_ltd(float64 x, float64 y); | Double-precision comparison: x < y |
int32 _ _C6000_ltf(float32 x, float32 y); | Single-precision comparison: x < y |
int32 _ _C6000_gtd(float64 x, float64 y); | Double-precision comparison: x > y |
int32 _ _C6000_gtf(float32 x, float32 y); | Single-precision comparison: x > y |
int32 _ _C6000_led(float64 x, float64 y); | Double-precision comparison: x <= y |
int32 _ _C6000_lef(float32 x, float32 y); | Single-precision comparison: x <= y |
int32 _ _C6000_ged(float64 x, float64 y); | Double-precision comparison: x >= y |
int32 _ _C6000_gef(float32 x, float32 y); | Single-precision comparison: x >= y |
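The difference between the three-way comparison and the explicit comparisons can be sketched on a host with IEEE 754 arithmetic. The `model_` functions below are hypothetical stand-ins, not the RTS implementations. The sketch also assumes that the unordered check returns non-zero when either operand is NaN, which the table description suggests but does not state.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical host-side models of the double-precision helpers. */

/* Models _ _C6000_cmpd: negative, zero, or positive result.
   The result for NaN operands is undefined by the ABI, so callers
   must not rely on what this model happens to return for NaN. */
int32_t model_cmpd(double x, double y)
{
    return (x > y) - (x < y);
}

/* Models _ _C6000_unordd (assumed: non-zero if either operand is NaN). */
int32_t model_unordd(double x, double y)
{
    return isnan(x) || isnan(y);
}

/* Models _ _C6000_ltd: non-zero only if x < y is true. */
int32_t model_ltd(double x, double y)
{
    return x < y;
}

/* Models _ _C6000_neqd: non-zero only if x != y is true. */
int32_t model_neqd(double x, double y)
{
    return x != y;
}
```

With a NaN operand, `model_ltd` returns 0 because the comparison is false, while `model_neqd` returns non-zero because NaN compares unequal to every value.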
The integer divide and remainder functions in Table 8-6 operate according to C semantics.
The _ _C6000_divremi and _ _C6000_divremu functions compute both a quotient (x/y) and remainder (x%y). The quotient is returned in A4 and the remainder in A5. The _ _C6000_divremll and _ _C6000_divremull functions compute the quotient (x/y) and remainder (x%y) of 64-bit integers. The quotient is returned in A5:A4 and the remainder in B5:B4.
Signature | Description |
---|---|
int32 _ _C6000_divi(int32 x, int32 y); | 32-bit signed integer division (x/y) |
int40 _ _C6000_divli(int40 x, int40 y); | 40-bit signed integer division (x/y) |
int64 _ _C6000_divlli(int64 x, int64 y); | 64-bit signed integer division (x/y) |
uint32 _ _C6000_divu(uint32 x, uint32 y); | 32-bit unsigned integer division (x/y) |
uint40 _ _C6000_divlu(uint40 x, uint40 y); | 40-bit unsigned integer division (x/y) |
uint64 _ _C6000_divllu(uint64 x, uint64 y); | 64-bit unsigned integer division (x/y) |
int32 _ _C6000_remi(int32 x, int32 y); | 32-bit signed integer modulo (x%y) |
int40 _ _C6000_remli(int40 x, int40 y); | 40-bit signed integer modulo (x%y) |
int64 _ _C6000_remlli(int64 x, int64 y); | 64-bit signed integer modulo (x%y) |
uint32 _ _C6000_remu(uint32 x, uint32 y); | 32-bit unsigned integer modulo (x%y) |
uint40 _ _C6000_remul(uint40 x, uint40 y); | 40-bit unsigned integer modulo (x%y) |
uint64 _ _C6000_remull(uint64 x, uint64 y); | 64-bit unsigned integer modulo (x%y) |
_ _C6000_divremi(int32 x, int32 y); | 32-bit combined divide and modulo |
_ _C6000_divremu(uint32 x, uint32 y); | 32-bit unsigned combined divide and modulo |
_ _C6000_divremull(uint64 x, uint64 y); | 64-bit unsigned combined divide and modulo |
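The relationship between the quotient and remainder produced by the combined functions can be sketched in portable C. The `divrem32_t` struct and the `model_divremi` name are hypothetical: the real helper returns the quotient in A4 and the remainder in A5 rather than a struct. The sketch assumes y is non-zero and that the quotient is representable, since C leaves the other cases undefined.

```c
#include <stdint.h>

/* Hypothetical result pair standing in for the A4 (quotient) and
   A5 (remainder) return registers. */
typedef struct {
    int32_t quot;
    int32_t rem;
} divrem32_t;

/* Models _ _C6000_divremi under C semantics: the quotient truncates
   toward zero and the remainder takes the sign of the dividend, so
   x == quot * y + rem always holds for valid inputs. */
divrem32_t model_divremi(int32_t x, int32_t y)
{
    divrem32_t r;
    r.quot = x / y;
    r.rem  = x % y;
    return r;
}
```

For example, dividing -7 by 2 gives a quotient of -3 and a remainder of -1, and (-3 * 2) + (-1) recovers -7.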
The wide integer arithmetic functions in Table 8-7 operate according to C semantics.
Signature | Description |
---|---|
int64 _ _C6000_negll(int64 x); | 64-bit integer negate |
uint64 _ _C6000_mpyll(uint64 x, uint64 y); | 64x64 bit multiply |
int64 _ _C6000_mpyiill(int32 x, int32 y); | 32x32 bit multiply |
uint64 _ _C6000_mpyuiill(uint32 x, uint32 y); | 32x32 bit unsigned multiply |
int64 _ _C6000_llshr(int64 x, uint32 y); | 64-bit signed right shift (x>>y) |
uint64 _ _C6000_llshru(uint64 x, uint32 y); | 64-bit unsigned right shift (x>>y) |
uint64 _ _C6000_llshl(uint64 x, uint32 y); | 64-bit left shift (x<<y) |
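A few of these operations can be modeled on a host as follows. The `model_` names are hypothetical. Judging from the signatures, the sketch assumes that the 32x32 multiplies are widening (two 32-bit operands, full 64-bit product). Shift counts are assumed to be less than 64, since larger counts are undefined in C.

```c
#include <stdint.h>

/* Hypothetical host-side models; these are not the RTS implementations. */

/* Models _ _C6000_mpyiill: signed 32x32 multiply with a 64-bit product.
   Widening both operands first avoids 32-bit overflow. */
int64_t model_mpyiill(int32_t x, int32_t y)
{
    return (int64_t)x * (int64_t)y;
}

/* Models _ _C6000_mpyuiill: unsigned 32x32 multiply with a 64-bit product. */
uint64_t model_mpyuiill(uint32_t x, uint32_t y)
{
    return (uint64_t)x * (uint64_t)y;
}

/* Models _ _C6000_llshru: logical right shift; y must be less than 64. */
uint64_t model_llshru(uint64_t x, uint32_t y)
{
    return x >> y;
}

/* Models _ _C6000_llshl: left shift; y must be less than 64. */
uint64_t model_llshl(uint64_t x, uint32_t y)
{
    return x << y;
}
```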
The miscellaneous helper functions in Table 8-8 are described in the sections that follow.
Signature | Description |
---|---|
void _ _C6000_strasgi(int32 *dst, const int32 *src, uint32 cnt); | Interrupt-safe block copy; cnt >= 28 |
void _ _C6000_strasgi_64plus(int32 *dst, const int32 *src, uint32 cnt); | Interrupt-safe block copy; cnt >= 28 |
void _ _C6000_abort_msg(const char *string); | Report failed assertion |
void _ _C6000_push_rts(void); | Push all callee-saved registers |
void _ _C6000_pop_rts(void); | Pop all callee-saved registers |
void _ _C6000_call_stub(void); | Save caller-save registers; call B31 |
void _ _C6000_weak_return(void); | Resolution target for imported weak calls |
void _ _C6000_get_addr(ptrdiff_t TPR_offst); | Get the address of a thread-local variable from its thread-pointer register (TPR) offset. |
void _ _C6000_get_tp(void); | Get the thread pointer value of the current thread. |
void * _ _tls_get_addr(struct TLS_descriptor); | Get the address of a thread-local variable. |
_ _C6000_strasgi
The function _ _C6000_strasgi is generated by the compiler for efficient out-of-line structure or array copy operations. The cnt argument is the size in bytes, which must be a multiple of 4 greater than or equal to 28 (7 words). It makes the following assumptions:
The 7-word minimum is the threshold that allows a software-pipelined loop to be used on C64x+. For smaller objects, the compiler typically generates an inline sequence of load/store instructions. _ _C6000_strasgi does not disable interrupts and can be safely interrupted.
The function _ _C6000_strasgi_64plus is a version of _ _C6000_strasgi optimized for C64x+ architectures.
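The contract on cnt can be illustrated with a plain C model of the copy. The name `model_strasgi` is hypothetical, and the loop below is only a functional sketch: it makes no attempt to reproduce the software-pipelined implementation, and the assertions simply encode the size requirements stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical functional model of _ _C6000_strasgi.
   cnt is the size in bytes, not the number of words. */
void model_strasgi(int32_t *dst, const int32_t *src, uint32_t cnt)
{
    assert(cnt % 4 == 0);  /* whole 32-bit words only */
    assert(cnt >= 28);     /* at least 7 words */

    /* Copy cnt / 4 words, one word at a time. */
    for (uint32_t i = 0; i < cnt / 4; i++) {
        dst[i] = src[i];
    }
}
```

A structure of exactly 7 words would therefore be copied with cnt equal to 28.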
_ _C6000_abort_msg
The function _ _C6000_abort_msg is generated to print a diagnostic message when a run-time assertion (for example, the C assert macro) fails. It must not return. That is, it must call abort or terminate the program by other means.
_ _C6000_push_rts and _ _C6000_pop_rts
The function _ _C6000_push_rts is used on C64x+ architectures when optimizing for code size. Many functions save and restore most or all of the callee-saved registers. To avoid duplicating the save code in the prolog and restore code in the epilog of each such function, the compiler can employ this library function instead. The function pushes all 13 callee-saved registers on the stack, decrementing SP by 56 bytes, according to the protocol in Section 4.5.4.
The function _ _C6000_push_rts is implemented as shown:
__c6xabi_push_rts:
STW B14, *B15--[2]
STDW A15:A14, *B15--
STDW B13:B12, *B15--
STDW A13:A12, *B15--
STDW B11:B10, *B15--
STDW A11:A10, *B15--
STDW B3:B2, *B15--
B A3
(This is a serial, unscheduled representation. Refer to the source code in the TI run-time library for the actual implementation.)
The function _ _C6000_pop_rts restores the callee-saved registers as pushed by _ _C6000_push_rts and increments (pops) the stack by 56 bytes.
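The 56-byte figure is larger than 13 words (52 bytes). A small tally of the push sequence shown above accounts for the difference, assuming that the STW post-decrement of [2] is scaled by the 4-byte word size and that each STDW post-decrements SP by one 8-byte doubleword. The function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

enum { WORD_BYTES = 4, DWORD_BYTES = 8 };

/* Bytes by which SP drops across the push sequence. */
uint32_t push_rts_frame_bytes(void)
{
    uint32_t stw_bytes  = 2 * WORD_BYTES;   /* STW B14, *B15--[2] */
    uint32_t stdw_bytes = 6 * DWORD_BYTES;  /* six STDW register pairs */
    return stw_bytes + stdw_bytes;
}

/* Registers stored: B14 alone, plus six register pairs. */
uint32_t push_rts_register_count(void)
{
    return 1 + 6 * 2;
}
```

Under these assumptions the tally gives 56 bytes for 13 registers, so one word of the frame is padding that keeps SP aligned to 8 bytes.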
_ _C6000_call_stub
The function _ _C6000_call_stub is also used to help optimize C64x+ functions for code size. Many call sites have several caller-save registers that are live across the call. These registers are not preserved by the call and therefore must be saved and restored by the caller. The compiler can route the call through _ _C6000_call_stub, which performs the following sequence of operations: it saves the selected caller-save registers on the stack, calls the function whose address is in B31, restores the saved registers, and returns to the original caller.
In this way the selected registers are preserved across the call without the caller having to save and restore them. The registers preserved by _ _C6000_call_stub are: A0, A1, A2, A6, A7, B0, B1, B2, B4, B5, B6, B7.
The caller invokes _ _C6000_call_stub by placing the address of the function to be called in B31, then branching to _ _C6000_call_stub. (The return address is in B3 as usual.)
The function _ _C6000_call_stub is implemented as shown:
__c6xabi_call_stub:
STW A2, *B15--[2]
STDW A7:A6, *B15--
STDW A1:A0, *B15--
STDW B7:B6, *B15--
STDW B5:B4, *B15--
STDW B1:B0, *B15--
STDW B3:B2, *B15--
ADDKPC __STUB_RET, B3, 0
CALL B31
__STUB_RET:
LDDW *++B15, B3:B2
LDDW *++B15, B1:B0
LDDW *++B15, B5:B4
LDDW *++B15, B7:B6
LDDW *++B15, A1:A0
LDDW *++B15, A7:A6
LDW *++B15[2], A2
B B3
(This is a serial, unscheduled representation. Refer to the source code in the TI run-time library for the actual implementation.)
Since _ _C6000_call_stub uses non-standard conventions, it cannot be called via a PLT entry. Its definition in the library must be marked as STV_INTERNAL or STV_HIDDEN to prevent it from being importable from a shared library.
_ _C6000_weak_return
The function _ _C6000_weak_return is a function that simply returns. The linker shall include it in a dynamic executable or shared object that contains any unresolved calls to imported weak symbols. The dynamic linker can use it to resolve those calls if they remain unresolved at dynamic load time.
_ _C6000_get_addr
The function _ _C6000_get_addr accepts a 32-bit TPR offset and returns the address of the corresponding thread-local variable. The special offset value -1 indicates a weak undefined reference; in this case the function returns zero. This function is used when compiling for the Static Executable and Bare Metal Dynamic TLS access models. See Chapter 7 for details about thread-local storage.
_ _C6000_get_tp
The function _ _C6000_get_tp returns the thread pointer value for the current thread. This function does not modify any register other than the return register A4. Because it can be called via a PLT entry, the caller should assume that B30 and B31 are modified by the call. See Chapter 7 and Section 14.2.4 for details about thread-local storage.
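The behavior described for _ _C6000_get_addr and _ _C6000_get_tp can be sketched on a host as follows. Everything here is hypothetical: the `model_` names, the static array standing in for the current thread's TLS block, and the assumption that the returned address is the thread pointer plus the TPR offset. Only the handling of the -1 marker is taken directly from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the current thread's TLS block. */
static unsigned char model_tls_block[64];

/* Models _ _C6000_get_tp: returns the current thread pointer. */
void *model_get_tp(void)
{
    return model_tls_block;
}

/* Models _ _C6000_get_addr. An offset of -1 marks a weak undefined
   reference and yields a null (zero) address; otherwise the result
   is assumed to be the thread pointer plus the offset. */
void *model_get_addr(ptrdiff_t tpr_offset)
{
    if (tpr_offset == (ptrdiff_t)-1) {
        return NULL;
    }
    return (unsigned char *)model_get_tp() + tpr_offset;
}
```

A caller that references a weak thread-local symbol would test the returned address against zero before dereferencing it.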
_ _tls_get_addr
The function _ _tls_get_addr returns the address of a thread-local variable. See Section 7.6.1.1 for details about this function and the TLS_descriptor structure passed to it to specify the offset of a thread-local variable. This function is used when compiling for all access models other than the Static Executable and Bare Metal Dynamic TLS access models. See Chapter 7 for details about thread-local storage.