SPRAB89 Application note

SPRAB89A September 2011 – March 2014

1 Introduction
1. 1.1 ABIs for the C6000
2. 1.2 Scope
3. 1.3 ABI Variants
4. 1.4 Toolchains and Interoperability
5. 1.5 Libraries
6. 1.6 Types of Object Files
7. 1.7 Segments
8. 1.8 C6000 Architecture Overview
9. 1.9 Reference Documents
10. 1.10 Code Fragment Notation
2 Data Representation
1. 2.1 Basic Types
2. 2.2 Data in Registers
3. 2.3 Data in Memory
4. 2.4 Complex Types
5. 2.5 Structures and Unions
6. 2.6 Arrays
7. 2.7 Bit Fields
  1. 2.7.1 Volatile Bit Fields
8. 2.8 Enumeration Types
3 Calling Conventions
1. 3.1 Call and Return
2. 3.2 Register Conventions
3. 3.3 Argument Passing
4. 3.4 Return Values
5. 3.5 Structures and Unions Passed and Returned by Reference
6. 3.6 Conventions for Compiler Helper Functions
7. 3.7 Scratch Registers for Inter-Section Calls
8. 3.8 Setting Up DP
4 Data Allocation and Addressing
1. 4.1 Data Sections and Segments
2. 4.2 Allocation and Addressing of Static Data
3. 4.3 Automatic Variables
4. 4.4 Frame Layout
5. 4.5 Heap-Allocated Objects
5 Code Allocation and Addressing
1. 5.1 Computing the Address of a Code Label
2. 5.2 Branching
3. 5.3 Calls
4. 5.4 Addressing Compact Instructions
6 Addressing Model for Dynamic Linking
1. 6.1 Terms and Concepts
2. 6.2 Overview of Dynamic Linking Mechanisms
3. 6.3 DSOs and DLLs
4. 6.4 Preemption
5. 6.5 PLT Entries
6. 6.6 The Global Offset Table
  1. 6.6.1 GOT-Based Reference Using Near DP-Relative Addressing
  2. 6.6.2 GOT-Based Reference Using Far DP-Relative Addressing
7. 6.7 The DSBT Model
8. 6.8 Performance Implications of Dynamic Linking
7 Thread-Local Storage Allocation and Addressing
1. 7.1 About Multi-Threading and Thread-Local Storage
2. 7.2 Terms and Concepts
3. 7.3 User Interface
4. 7.4 ELF Object File Representation
5. 7.5 TLS Access Models
6. 7.6 Thread-Local Symbol Resolution and Weak References
8 Helper Function API
1. 8.1 Floating-Point Behavior
2. 8.2 C Helper Function API
3. 8.3 Special Register Conventions for Helper Functions
4. 8.4 Helper Functions for Complex Types
5. 8.5 Floating-Point Helper Functions for C99
9 Standard C Library API
1. 9.1 Reserved Symbols
2. 9.2 <assert.h> Implementation
3. 9.3 <complex.h> Implementation
4. 9.4 <ctype.h> Implementation
5. 9.5 <errno.h> Implementation
6. 9.6 <float.h> Implementation
7. 9.7 <inttypes.h> Implementation
8. 9.8 <iso646.h> Implementation
9. 9.9 <limits.h> Implementation
10. 9.10 <locale.h> Implementation
11. 9.11 <math.h> Implementation
12. 9.12 <setjmp.h> Implementation
13. 9.13 <signal.h> Implementation
14. 9.14 <stdarg.h> Implementation
15. 9.15 <stdbool.h> Implementation
16. 9.16 <stddef.h> Implementation
17. 9.17 <stdint.h> Implementation
18. 9.18 <stdio.h> Implementation
19. 9.19 <stdlib.h> Implementation
20. 9.20 <string.h> Implementation
21. 9.21 <tgmath.h> Implementation
22. 9.22 <time.h> Implementation
23. 9.23 <wchar.h> Implementation
24. 9.24 <wctype.h> Implementation
10 C++ ABI
1. 10.1 Limits (GC++ABI 1.2)
2. 10.2 Export Template (GC++ABI 1.4.2)
3. 10.3 Data Layout (GC++ABI Chapter 2)
4. 10.4 Initialization Guard Variables (GC++ABI 2.8)
5. 10.5 Constructor Return Value (GC++ABI 3.1.5)
6. 10.6 One-Time Construction API (GC++ABI 3.3.2)
7. 10.7 Controlling Object Construction Order (GC++ ABI 3.3.4)
8. 10.8 Demangler API (GC++ABI 3.4)
9. 10.9 Static Data (GC++ ABI 5.2.2)
10. 10.10 Virtual Tables and the Key function (GC++ABI 5.2.3)
11. 10.11 Unwind Table Location (GC++ABI 5.3)
11 Exception Handling
1. 11.1 Overview
2. 11.2 PREL31 Encoding
3. 11.3 The Exception Index Table (EXIDX)
4. 11.4 The Exception Handling Instruction Table (EXTAB)
5. 11.5 Unwinding Instructions
6. 11.6 Descriptors
7. 11.7 Special Sections
8. 11.8 Interaction With Non-C++ Code
  1. 11.8.1 Automatic EXIDX Entry Generation
  2. 11.8.2 Hand-Coded Assembly Functions
9. 11.9 Interaction With System Features
10. 11.10 Assembly Language Operators in the TI Toolchain
12 DWARF
1. 12.1 DWARF Register Names
2. 12.2 Call Frame Information
3. 12.3 Vendor Names
4. 12.4 Vendor Extensions
13 ELF Object Files (Processor Supplement)
1. 13.1 Registered Vendor Names
2. 13.2 ELF Header
3. 13.3 Sections
4. 13.4 Symbol Table
5. 13.5 Relocation
14 ELF Program Loading and Dynamic Linking (Processor Supplement)
1. 14.1 Program Header
2. 14.2 Program Loading
3. 14.3 Dynamic Linking
4. 14.4 Bare-Metal Dynamic Linking Model
15 Linux ABI
1. 15.1 File Types
2. 15.2 ELF Identification
3. 15.3 Program Headers and Segments
4. 15.4 Data Addressing
  1. 15.4.1 Data Segment Base Table (DSBT)
  2. 15.4.2 Global Offset Table (GOT)
5. 15.5 Code Addressing
6. 15.6 Lazy Binding
7. 15.7 Visibility
8. 15.8 Preemption
9. 15.9 Import-as-Own Preemption
10. 15.10 Program Loading
11. 15.11 Dynamic Information
12. 15.12 Initialization and Termination Functions
13. 15.13 Summary of the Linux Model
16 Symbol Versioning
1. 16.1 ELF Symbol Versioning Overview
2. 16.2 Version Section Identification
17 Build Attributes
1. 17.1 C6000 ABI Build Attribute Subsection
2. 17.2 C6000 Build Attribute Tags
18 Copy Tables and Variable Initialization
1. 18.1 Copy Table Format
2. 18.2 Compressed Data Formats
  1. 18.2.1 RLE
  2. 18.2.2 LZSS Format
3. 18.3 Variable Initialization
19 Extended Program Header Attributes
1. 19.1 Encoding
2. 19.2 Attribute Tag Definitions
3. 19.3 Extended Program Header Attributes Section Format
20 Revision History

7.1 About Multi-Threading and Thread-Local Storage

Complex multi-threaded programs can be better structured and easier to develop if threads can use variables that have static storage duration but are specific to each thread. That is, other threads cannot see or access such thread-specific variables. Consider the following C code:

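    int global_x;                 /* static storage: one instance per process */
    void foo(void) {
        int local_x;              /* automatic storage: on the calling thread's stack */
        static int static_x = 0;  /* static storage: one instance per process */
        ...
    }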

The global_x and static_x variables are allocated once per process, and all threads share the same instance. In contrast, local_x is allocated on the stack. Because each thread has its own stack, local_x is thread-specific, while global_x and static_x are not. However, there is no easy way to define a global or static variable on a per-thread basis. The POSIX thread interface allows a program to create thread-specific storage with static duration using pthread_getspecific and pthread_setspecific, but this interface is cumbersome to use.
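To illustrate why the POSIX key-based interface is considered cumbersome, the following is a minimal sketch of a per-thread counter built on pthread_key_create, pthread_getspecific, and pthread_setspecific. It assumes a hosted POSIX threads environment rather than a C6000 target, and the names counter_key, counter_increment, counter_worker, and run_two_workers are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical example names; not part of the ABI. */
static pthread_key_t counter_key;
static pthread_once_t counter_key_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once; free() is the per-thread destructor. */
static void make_counter_key(void) {
    int rc = pthread_key_create(&counter_key, free);
    assert(rc == 0);
}

/* Increment and return the calling thread's private counter. */
int counter_increment(void) {
    pthread_once(&counter_key_once, make_counter_key);
    int *counter = pthread_getspecific(counter_key);
    if (counter == NULL) {
        /* First use in this thread: allocate and register the storage. */
        counter = calloc(1, sizeof *counter);
        assert(counter != NULL);
        int rc = pthread_setspecific(counter_key, counter);
        assert(rc == 0);
    }
    return ++*counter;
}

/* Each worker increments its own counter three times. */
static void *counter_worker(void *arg) {
    int *result = arg;
    for (int i = 0; i < 3; i++)
        *result = counter_increment();
    return NULL;
}

/* Run two threads and return the sum of their final private counts. */
int run_two_workers(void) {
    pthread_t t1, t2;
    int r1 = 0, r2 = 0;
    int rc = pthread_create(&t1, NULL, counter_worker, &r1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, counter_worker, &r2);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return r1 + r2;
}
```

Even for a single integer, the program must manage a key, one-time initialization, heap allocation, and a destructor. Every access also goes through a function call rather than a plain variable reference.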

Thread-Local Storage (TLS) addresses this issue. TLS is a class of storage that allows a program to define thread-specific variables with static storage duration. A TLS variable, or "thread-local", is a global or static variable that is instantiated once per thread.
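The difference between a thread-local and an ordinary static variable can be sketched as follows. This example assumes a hosted compiler that accepts the C11 _Thread_local storage-class specifier together with POSIX threads; the keyword a particular toolchain accepts may be spelled differently. The names tls_counter, shared_counter, tls_worker, and tls_demo are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static _Thread_local int tls_counter = 0;  /* one instance per thread */
static int shared_counter = 0;             /* one instance per process */
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

/* Increment both counters three times and report the thread's own count. */
static void *tls_worker(void *arg) {
    int *seen = arg;
    for (int i = 0; i < 3; i++) {
        tls_counter++;                     /* no lock needed: private to this thread */
        pthread_mutex_lock(&shared_lock);
        shared_counter++;                  /* shared: must be protected */
        pthread_mutex_unlock(&shared_lock);
    }
    *seen = tls_counter;
    return NULL;
}

/* Run two workers; store each thread's private count and return the shared count. */
int tls_demo(int *seen1, int *seen2) {
    pthread_t t1, t2;
    int rc = pthread_create(&t1, NULL, tls_worker, seen1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, tls_worker, seen2);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Each worker observes only its own three increments of tls_counter, while shared_counter accumulates the increments of both threads. The thread-local is accessed with ordinary variable syntax, with no key management or allocation in the source code.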

Memory used for TLS is allocated statically for the full time the program runs. Each thread has its own instance of every thread-local variable (even those it does not declare or use) defined by any dynamic module that is loaded at the time the thread is created. When a thread is created, its TLS block is allocated and initialized by the underlying OS thread support library. A thread's TLS block is reinitialized if the thread completes and then runs again within the same program run. TLS variables are not reinitialized if the thread is suspended or blocked by other threads and then resumes execution after it becomes unblocked.
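The initialization behavior described above can be observed with a small sketch, again assuming a hosted C11 compiler with POSIX threads rather than a C6000 RTOS. A thread modifies its own instance of a thread-local and exits; a thread created afterwards starts from the initializer again. The names tls_state, observe_then_modify, initial_value_in_new_thread, and main_thread_value are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static _Thread_local int tls_state = 7;  /* every new thread starts at 7 */

/* Record the value seen on thread entry, then modify this thread's instance. */
static void *observe_then_modify(void *arg) {
    int *initial = arg;
    *initial = tls_state;
    tls_state = 42;  /* affects only the current thread's TLS block */
    return NULL;
}

/* Create a thread, wait for it to finish, and return the value it first saw. */
int initial_value_in_new_thread(void) {
    pthread_t t;
    int initial = -1;
    int rc = pthread_create(&t, NULL, observe_then_modify, &initial);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return initial;
}

/* Return the calling thread's own instance, untouched by the other threads. */
int main_thread_value(void) {
    return tls_state;
}
```

Calling initial_value_in_new_thread() repeatedly returns the initializer value each time, because each newly created thread receives a freshly initialized TLS block. The calling thread's instance is unaffected by the assignments made in the other threads.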

The way a TLS variable is accessed depends on how the OS or RTOS creates and manages thread-local storage for each thread. Linux systems need to support TLS allocation for multiple dynamic libraries, including libraries loaded at run time using dlopen(). Linux systems may also need to allocate TLS lazily, only when a thread-local is first accessed. This requires sophisticated TLS management and affects how the thread-local is accessed. On the other hand, a static executable that includes an RTOS needs to manage only a single TLS block, and the access can be simple.

After an overview of thread-local concepts, this chapter describes how thread-locals are specified in source code (Section 7.3) and how they are represented in the ELF object file (Section 7.4). It then describes how thread-locals are accessed in the C6x Linux, static executable, and bare-metal dynamic linking TLS models (Section 7.5) and how weak references to thread-local variables are resolved (Section 7.6).

The C6000 TLS mechanism is based on industry-standard conventions, such as the mechanism described in the paper "ELF Handling for Thread-Local Storage" by Ulrich Drepper.