SPRAB89A September 2011 – March 2014
This is the most generic TLS access model. Objects using this access model can be used to build any Linux module: executables, initially loaded modules, and dlopened modules. The generated code for this model cannot assume the module-id or the offset is known during static linking.
With this access model, a dynamic module can be loaded at run time. To allow for this possibility, the thread library’s thread management architecture must provide a way for TLS blocks to be added and removed as dynamic modules are loaded and unloaded at run time.
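As an illustration of what adding and removing TLS blocks could involve, the following is a minimal sketch of a per-thread table of TLS blocks indexed by module-id. The names struct dtv, dtv_add_module(), and dtv_remove_module() are hypothetical and not part of this ABI, and the use of an initialization image plus zero fill is an assumption about how a thread library might populate a new block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread table: block[i] is the TLS block for module-id i,
   or NULL if no block is currently installed for that module. */
struct dtv {
    size_t  count;   /* number of slots in block[] */
    void  **block;
};

/* Install a TLS block for a newly loaded module. Grows the table if needed,
   allocates a zero-filled block, and copies the initialization image into
   its start. Returns 0 on success, -1 on failure. */
static int dtv_add_module(struct dtv *d, size_t module_id,
                          const void *init_image, size_t image_size,
                          size_t block_size)
{
    if (block_size == 0 || image_size > block_size)
        return -1;

    if (module_id >= d->count) {
        size_t new_count = module_id + 1;
        void **grown = realloc(d->block, new_count * sizeof *grown);
        if (grown == NULL)
            return -1;
        for (size_t i = d->count; i < new_count; i++)
            grown[i] = NULL;             /* new slots start empty */
        d->block = grown;
        d->count = new_count;
    }

    void *blk = calloc(1, block_size);   /* zero fill for uninitialized data */
    if (blk == NULL)
        return -1;
    if (image_size != 0)
        memcpy(blk, init_image, image_size);

    free(d->block[module_id]);           /* drop any stale block */
    d->block[module_id] = blk;
    return 0;
}

/* Release the TLS block of a module that is being unloaded. */
static void dtv_remove_module(struct dtv *d, size_t module_id)
{
    if (module_id < d->count) {
        free(d->block[module_id]);
        d->block[module_id] = NULL;
    }
}
```

A real thread library would also have to do this for every live thread, or defer it until a thread first touches the module's TLS; the sketch covers a single thread only.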
The compiler generates a call to __tls_get_addr() to get the address of the thread-local variable. The module-id and the thread-local variable’s offset in the module’s TLS block are passed as parameters. The code obtains the module-id and offset from the Global Offset Table (GOT) entries to ensure position independence (PIC) and symbol preemption.
The simplest interface for passing the module-id and offset to __tls_get_addr() is as follows:
void * __tls_get_addr(unsigned int module_id, ptrdiff_t offset);
Note that both are 32-bit arguments, and the GOT entries are also 32-bit entries. As an optimization, we can load these two GOT entries as a 64-bit double word if the ISA supports this. The two GOT entries must be allocated consecutively and aligned to a 64-bit boundary. This GOT entity can be thought of as the following struct:
struct TLS_descriptor
{
    unsigned int module_id;
    ptrdiff_t offset;
} __attribute__ ((aligned (8)));
Then the __tls_get_addr() interface becomes:
void * __tls_get_addr(struct TLS_descriptor);
In this EABI, a struct of size 64 bits or less is passed by value, resulting in passing the TLS descriptor in the A5:A4 register pair. In little-endian mode, the module-id is passed in A4 and the offset is in A5. In big-endian mode, the registers are swapped as per the C6x EABI calling conventions. The examples in this section use little-endian mode.
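The register assignment described above can be modeled on a host machine. The sketch below is illustrative only: struct tls_descriptor32, struct reg_pair, and load_descriptor() are hypothetical names, and the fixed-width field types stand in for the 32-bit unsigned int and ptrdiff_t of the target. It splits the 8-byte memory image of the descriptor the way a 64-bit load into A5:A4 would.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-width stand-in for the TLS descriptor. */
struct tls_descriptor32 {
    uint32_t module_id;   /* lower-addressed word */
    int32_t  offset;      /* higher-addressed word */
};

_Static_assert(sizeof(struct tls_descriptor32) == 8,
               "descriptor must occupy exactly one 64-bit double word");

/* Hypothetical model of the A5:A4 register pair. */
struct reg_pair {
    uint32_t a4;
    uint32_t a5;
};

/* Model a 64-bit load of the descriptor into A5:A4. In little-endian mode
   the lower-addressed word lands in A4; in big-endian mode it lands in A5. */
static struct reg_pair load_descriptor(const struct tls_descriptor32 *d,
                                       int big_endian)
{
    uint32_t w[2];
    struct reg_pair r;

    memcpy(w, d, sizeof w);   /* w[0] = module_id, w[1] = offset bits */
    if (big_endian) {
        r.a5 = w[0];
        r.a4 = w[1];
    } else {
        r.a4 = w[0];
        r.a5 = w[1];
    }
    return r;
}
```

For a descriptor with module-id 3 and offset 0x40, the little-endian case yields A4 = 3 and A5 = 0x40, and the big-endian case yields the two registers exchanged.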
Using this interface, the thread-local access becomes the following (for C64 and above):
LDDW *+DP($GOT_TLS(X)), A5:A4 ;reloc R_C6000_SBR_GOT_U15_D_TLS
|| CALLP __tls_get_addr,B3 ; A4 has the address of X at return
LDW *A4, A4 ; A4 has the value of X
The relocation R_C6000_SBR_GOT_U15_D_TLS causes the linker to create GOT entries for the module-id and offset of X as follows:
64-bit aligned address:
GOT[n] ;reloc R_C6000_TLSMOD (symbol X)
GOT[n+1] ;reloc R_C6000_TBR_U32 (symbol X)
The linker then resolves the R_C6000_SBR_GOT_U15_D_TLS relocation with the DP-relative offset of the GOT entity. The dynamic loader resolves R_C6000_TLSMOD to the module-id of the module where X is defined. It resolves R_C6000_TBR_U32 to the offset of X in the module’s TLS block.
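The dynamic loader's part of this can be sketched as follows. This is a simplified model, not an actual loader: the enum values are stand-ins for R_C6000_TLSMOD and R_C6000_TBR_U32, and struct tls_symbol and apply_tls_reloc() are hypothetical names for the resolved symbol information and the relocation step.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the two dynamic relocation types placed on the GOT entries. */
enum tls_reloc_kind {
    RELOC_TLSMOD,    /* models R_C6000_TLSMOD  */
    RELOC_TBR_U32    /* models R_C6000_TBR_U32 */
};

/* Hypothetical result of symbol lookup for a thread-local variable. */
struct tls_symbol {
    uint32_t module_id;    /* module in which the variable is defined */
    uint32_t tls_offset;   /* offset within that module's TLS block   */
};

/* Write the value required by the relocation into one 32-bit GOT slot. */
static void apply_tls_reloc(uint32_t *got_slot, enum tls_reloc_kind kind,
                            const struct tls_symbol *sym)
{
    switch (kind) {
    case RELOC_TLSMOD:
        *got_slot = sym->module_id;
        break;
    case RELOC_TBR_U32:
        *got_slot = sym->tls_offset;
        break;
    }
}
```

Applying RELOC_TLSMOD to GOT[n] and RELOC_TBR_U32 to GOT[n+1] leaves the pair holding exactly the contents of the TLS descriptor for the symbol.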
The C6x ISA does not currently have an instruction to load the 64-bit TLS descriptor directly. However, we define the __tls_get_addr() interface using the 64-bit descriptor in anticipation of a future ISA having such support.
void * __tls_get_addr(struct TLS_descriptor);
The linker is required to allocate the GOT entries of a thread-local variable’s module-id and offset consecutively and align the first entry to a 64-bit boundary when the R_C6000_SBR_GOT_U15_D_TLS relocation is found.
Lacking support for a DP-relative 64-bit load, the following sequence can be used on current ISAs:
LDW *+DP($GOT_TLSMOD(X)), A4 ;reloc R_C6000_SBR_GOT_U15_W_TLSMOD
LDW *+DP($GOT_TBR(X)), A5 ;reloc R_C6000_SBR_GOT_U15_W_TBR
|| CALLP __tls_get_addr,B3 ; A4 has the address of X at return
LDW *A4, A4 ; A4 has the value of X
The relocations R_C6000_SBR_GOT_U15_W_TLSMOD and R_C6000_SBR_GOT_U15_W_TBR cause the linker to create GOT entries for the module-id and offset of X, respectively. This access sequence does not require these GOT entries to be consecutive or 64-bit aligned. If the linker does not also see an R_C6000_SBR_GOT_U15_D_TLS relocation for the same symbol, it is free to allocate the module-id and offset GOT entries separately, without 64-bit alignment. However, if it sees R_C6000_SBR_GOT_U15_D_TLS in addition to the TLSMOD/TBR relocations for the same symbol, it must allocate consecutive, 64-bit aligned GOT entries and reuse them for the TLSMOD/TBR relocations.
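The allocation rule can be sketched as a small slot allocator. This is a minimal illustration rather than a description of any actual linker: struct got_alloc, struct tls_got_slots, and alloc_tls_got() are hypothetical names, slots are counted in 32-bit words, and the GOT base is assumed to be 64-bit aligned so that an even slot index corresponds to a 64-bit boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical GOT allocator state: index of the next free 32-bit slot. */
struct got_alloc {
    size_t next;
};

/* Slot indices assigned to one thread-local symbol. */
struct tls_got_slots {
    size_t mod_slot;   /* holds the module-id */
    size_t off_slot;   /* holds the offset    */
};

/* Allocate the two GOT slots for a thread-local symbol. needs_pair is true
   when a double-word TLS relocation was seen for the symbol, in which case
   the slots must be consecutive and start on an even index. */
static struct tls_got_slots alloc_tls_got(struct got_alloc *g, bool needs_pair)
{
    struct tls_got_slots s;

    if (needs_pair) {
        g->next = (g->next + 1) & ~(size_t)1;   /* round up to an even index */
        s.mod_slot = g->next;
        s.off_slot = g->next + 1;
        g->next += 2;
    } else {
        /* No constraint: this sketch simply takes the next two free slots,
           but they could be placed anywhere in the GOT. */
        s.mod_slot = g->next++;
        s.off_slot = g->next++;
    }
    return s;
}
```

Because the paired layout also satisfies the word-sized relocations, a linker following this scheme can decide per symbol after scanning all relocations and then point every TLS GOT reference for that symbol at the same slots.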
If the GOT must be addressed using far-DP addressing, then the general dynamic addressing becomes:
MVKL $DPR_GOT_TLSMOD(X), A4 ;reloc R_C6000_SBR_GOT_L16_W_TLSMOD
MVKH $DPR_GOT_TLSMOD(X), A4 ;reloc R_C6000_SBR_GOT_H16_W_TLSMOD
ADD DP, A4, A4
LDW *A4, A4
MVKL $DPR_GOT_TBR(X), A5 ;reloc R_C6000_SBR_GOT_L16_W_TBR
MVKH $DPR_GOT_TBR(X), A5 ;reloc R_C6000_SBR_GOT_H16_W_TBR
ADD DP, A5, A5
LDW *A5, A5
|| CALLP __tls_get_addr,B3 ; A4 has the address of X at return
LDW *A4, A4 ; A4 has the value of X
__tls_get_addr() can calculate the thread-local address as follows:
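void * __tls_get_addr(struct TLS_descriptor desc)
{
    /* Get the current thread's thread pointer. */
    void *TP = __c6xabi_get_tp();
    /* The first word at the thread pointer holds the address of the dtv. */
    int *dtv = (int*)(((int*) TP)[0]);
    /* The dtv entry indexed by module-id holds that module's TLS block address. */
    char *tls = (char *)dtv[desc.module_id];
    /* Add the variable's offset within the TLS block. */
    return tls + desc.offset;
}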
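The lookup above can be exercised on a host machine with mock data. The sketch below is an approximation under stated assumptions: struct thread_cb and sim_tls_get_addr() are hypothetical names, the thread pointer is passed in explicitly instead of being obtained from __c6xabi_get_tp(), and the dtv is declared as an array of pointers so the code does not depend on pointers being 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the TLS descriptor: module-id first, then offset. */
struct tls_desc {
    unsigned int module_id;
    ptrdiff_t    offset;
};

/* Hypothetical thread control block: its first word points to the dtv,
   matching the access pattern of the lookup routine. */
struct thread_cb {
    void **dtv;   /* dtv[module_id] -> start of that module's TLS block */
};

/* Host-side model of the lookup: index the dtv by module-id, then add the
   variable's offset within the TLS block. */
static void *sim_tls_get_addr(const struct thread_cb *tp, struct tls_desc desc)
{
    char *tls = tp->dtv[desc.module_id];
    return tls + desc.offset;
}
```

With two mock TLS blocks installed at module-ids 1 and 2, a descriptor of {1, 2 * sizeof(int)} resolves to the third int of the first block, which mirrors how the GOT-supplied module-id and offset select a variable at run time.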