SPRAB89A September 2011 – March 2014
Complex multi-threaded programs can be better structured and easier to develop if threads can use variables that have static storage duration but are specific to each thread. That is, other threads cannot see or access such thread-specific variables. Consider the following C code:
int global_x;
void foo(void) {
    int local_x;
    static int static_x = 0;
    ...
}
The global_x and static_x variables are allocated once per process, and all threads share the same instance. In contrast, local_x is allocated from the stack. Since each thread gets its own stack, the variable local_x is thread-specific, while static_x is not. However, there is no easy way to define a global/static variable on a per-thread basis. The POSIX thread interface allows creating thread-specific static storage variables using pthread_getspecific and pthread_setspecific, but this interface is cumbersome to use.
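To illustrate why the POSIX interface is considered cumbersome, the following is a minimal sketch of a per-thread counter built on pthread_getspecific and pthread_setspecific. The names counter_key, thread_counter, tsd_worker, and posix_tsd_demo are hypothetical and chosen only for this example. Note the key creation, lazy allocation, and destructor that are needed for a single integer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t counter_key;                      /* one key, shared by all threads */
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;

static void make_key(void) {
    /* free() is registered as the destructor for each thread's value. */
    int rc = pthread_key_create(&counter_key, free);
    assert(rc == 0);
    (void)rc;
}

/* Return a pointer to the calling thread's private counter,
 * allocating it on first use. */
static int *thread_counter(void) {
    pthread_once(&counter_once, make_key);
    int *p = pthread_getspecific(counter_key);
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        assert(p != NULL);
        pthread_setspecific(counter_key, p);
    }
    return p;
}

static void *tsd_worker(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        ++*thread_counter();
    *(int *)arg = *thread_counter();   /* report this thread's final count */
    return NULL;
}

/* Hypothetical driver: returns 0 if each thread counted independently. */
int posix_tsd_demo(void) {
    int a = 3, b = 5;
    pthread_t ta, tb;
    if (pthread_create(&ta, NULL, tsd_worker, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, tsd_worker, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return (a == 3 && b == 5) ? 0 : 1;
}
```

Every access goes through a function call and a pointer dereference, and the program must manage the lifetime of each thread's storage itself.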
Thread-Local Storage (TLS) solves this problem. It is a class of storage that allows a program to define thread-specific variables with static storage duration. A TLS variable, or "thread-local," is a global/static variable that is instantiated once per thread.
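As a point of comparison with the POSIX key interface, the sketch below declares a thread-local directly. It assumes a compiler that accepts the GCC-style __thread storage-class keyword (C11 spells the same idea _Thread_local) and a POSIX threads library; the names tls_count, shared_count, tls_worker, and tls_demo are hypothetical.

```c
#include <pthread.h>

static __thread int tls_count = 0;   /* one instance per thread */
static int shared_count = 0;         /* one instance per process */
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

static void *tls_worker(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        tls_count++;                     /* private: no locking needed */
        pthread_mutex_lock(&shared_lock);
        shared_count++;                  /* shared: must be protected */
        pthread_mutex_unlock(&shared_lock);
    }
    *(int *)arg = tls_count;             /* report this thread's private count */
    return NULL;
}

/* Hypothetical driver: returns 0 if the thread-local counts stayed
 * separate while the ordinary static accumulated across threads. */
int tls_demo(void) {
    int a = 3, b = 5;
    pthread_t ta, tb;
    shared_count = 0;
    if (pthread_create(&ta, NULL, tls_worker, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, tls_worker, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    /* The calling thread never touched its own tls_count, so it is still 0. */
    return (a == 3 && b == 5 && shared_count == 8 && tls_count == 0) ? 0 : 1;
}
```

The thread-local is read and written like any other static variable; the per-thread allocation is left to the toolchain and the thread library.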
Memory used for TLS is allocated statically for the full time the program runs. Each thread has its own instance of every thread-local variable (even those it does not declare or use) defined by all of the dynamic modules that are loaded at the time the thread is created. When a thread is created, its TLS block is allocated and initialized by the underlying OS thread support library. A thread's TLS block is reinitialized if the thread completes and then runs again within the same program run. TLS variables are not reinitialized if the thread is suspended or blocked by other threads and then resumes execution after it becomes unblocked.
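The per-thread initialization described above can be observed with a small sketch, again assuming the __thread keyword and POSIX threads. The names tls_seed, read_seed, and tls_init_demo are hypothetical. Each newly created thread should see the initializer value, regardless of what other threads have written to their own instances.

```c
#include <pthread.h>

static __thread int tls_seed = 42;   /* initial value copied into each thread's TLS block */

static void *read_seed(void *out) {
    *(int *)out = tls_seed;   /* a new thread starts from the initializer */
    tls_seed = -1;            /* changes only this thread's instance */
    return NULL;
}

/* Hypothetical driver: returns 0 if both new threads saw the initializer
 * and the calling thread's instance kept its own value. */
int tls_init_demo(void) {
    int first = 0, second = 0;
    pthread_t t;

    tls_seed = 7;             /* modify the calling thread's instance only */

    if (pthread_create(&t, NULL, read_seed, &first) != 0)
        return -1;
    pthread_join(t, NULL);

    if (pthread_create(&t, NULL, read_seed, &second) != 0)
        return -1;
    pthread_join(t, NULL);

    return (first == 42 && second == 42 && tls_seed == 7) ? 0 : 1;
}
```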
The way a TLS variable is accessed depends on how the OS or RTOS creates and manages thread-local storage for each thread. Linux systems need to support TLS allocation for multiple dynamic libraries, including libraries loaded at run time using dlopen(). Linux systems may also require TLS storage to be allocated lazily, only when a thread-local is accessed. This requires sophisticated TLS storage management and affects how the thread-local is accessed. On the other hand, a static executable that includes an RTOS needs to manage only a single TLS block, and the access can be simple.
After an overview of thread-local concepts, this document describes how thread-locals are specified in source code and how they are represented in the ELF object file (Section 7.5). It then describes how thread-locals are accessed for the C6x Linux, static executable, and bare-metal dynamic linking TLS models (Section 7.6) and how weak references to thread-local variables are resolved (Section 7.7).
The C6000 TLS mechanism is based on industry-standard conventions, for example, the mechanism described in the paper "ELF Handling for Thread-Local Storage" by Ulrich Drepper.