|  | =============== | 
|  | LLVM Extensions | 
|  | =============== | 
|  |  | 
|  | .. contents:: | 
|  | :local: | 
|  |  | 
|  | .. toctree:: | 
|  | :hidden: | 
|  |  | 
|  | Introduction | 
|  | ============ | 
|  |  | 
|  | This document describes extensions to tools and formats LLVM seeks compatibility | 
|  | with. | 
|  |  | 
|  | General Assembly Syntax | 
|  | =========================== | 
|  |  | 
|  | C99-style Hexadecimal Floating-point Constants | 
|  | ---------------------------------------------- | 
|  |  | 
|  | LLVM's assemblers allow floating-point constants to be written in C99's | 
|  | hexadecimal format instead of decimal if desired. | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section .data | 
|  | .float 0x1c2.2ap3 | 
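
As a rough cross-check of the constant above, the same literal can be evaluated
in C, which uses the same C99 hexadecimal floating-point syntax. The helper
names ``hex_float_example`` and ``parse_hex_float`` are hypothetical and exist
only for this sketch; they are not part of LLVM.

.. code-block:: c

  #include <stdlib.h>

  /* Hypothetical helper: value of the literal used in the example above.
     0x1c2.2a is 450 + 2/16 + 10/256 = 450.1640625, and p3 scales by 2^3. */
  static double hex_float_example(void) {
    return 0x1c2.2ap3;
  }

  /* Parses the same text at run time; strtod accepts the C99 hex format. */
  static double parse_hex_float(const char *text) {
    return strtod(text, NULL);
  }

Both helpers should yield 3601.3125, the value the assembler is expected to
encode for ``.float 0x1c2.2ap3``.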
|  |  | 
|  | Machine-specific Assembly Syntax | 
|  | ================================ | 
|  |  | 
|  | X86/COFF-Dependent | 
|  | ------------------ | 
|  |  | 
|  | Relocations | 
|  | ^^^^^^^^^^^ | 
|  |  | 
|  | The following additional relocation types are supported: | 
|  |  | 
|  | **@IMGREL** (AT&T syntax only) generates an image-relative relocation that | 
|  | corresponds to the COFF relocation types ``IMAGE_REL_I386_DIR32NB`` (32-bit) or | 
|  | ``IMAGE_REL_AMD64_ADDR32NB`` (64-bit). | 
|  |  | 
|  | .. code-block:: text | 
|  |  | 
|  | .text | 
|  | fun: | 
|  | mov foo@IMGREL(%ebx, %ecx, 4), %eax | 
|  |  | 
|  | .section .pdata | 
|  | .long fun@IMGREL | 
|  | .long (fun@imgrel + 0x3F) | 
|  | .long $unwind$fun@imgrel | 
|  |  | 
|  | **.secrel32** generates a relocation that corresponds to the COFF relocation | 
|  | types ``IMAGE_REL_I386_SECREL`` (32-bit) or ``IMAGE_REL_AMD64_SECREL`` (64-bit). | 
|  |  | 
|  | **.secidx** generates the index of the section that contains the target. It | 
|  | corresponds to the COFF relocation types ``IMAGE_REL_I386_SECTION`` (32-bit) or | 
|  | ``IMAGE_REL_AMD64_SECTION`` (64-bit). | 
|  |  | 
|  | .. code-block:: none | 
|  |  | 
|  | .section .debug$S,"rn" | 
|  | .long 4 | 
|  | .long 242 | 
|  | .long 40 | 
|  | .secrel32 _function_name + 0 | 
|  | .secidx   _function_name | 
|  | ... | 
|  |  | 
|  | ``.linkonce`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | Syntax: | 
|  |  | 
|  | ``.linkonce [ comdat type ]`` | 
|  |  | 
|  | Supported COMDAT types: | 
|  |  | 
|  | ``discard`` | 
|  | Discards duplicate sections with the same COMDAT symbol. This is the default | 
|  | if no type is specified. | 
|  |  | 
|  | ``one_only`` | 
|  | If the symbol is defined multiple times, the linker issues an error. | 
|  |  | 
|  | ``same_size`` | 
|  | Duplicates are discarded, but the linker issues an error if any have | 
|  | different sizes. | 
|  |  | 
|  | ``same_contents`` | 
|  | Duplicates are discarded, but the linker issues an error if any duplicates | 
|  | do not have exactly the same content. | 
|  |  | 
|  | ``largest`` | 
|  | Links the largest section from among the duplicates. | 
|  |  | 
|  | ``newest`` | 
|  | Links the newest section from among the duplicates. | 
|  |  | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section .text$foo | 
|  | .linkonce | 
|  | ... | 
|  |  | 
|  | ``.section`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | MC supports passing the information in ``.linkonce`` at the end of | 
|  | ``.section``. For example, the following two snippets are equivalent: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section secName, "dr", discard, "Symbol1" | 
|  | .globl Symbol1 | 
|  | Symbol1: | 
|  | .long 1 | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section secName, "dr" | 
|  | .linkonce discard | 
|  | .globl Symbol1 | 
|  | Symbol1: | 
|  | .long 1 | 
|  |  | 
|  | Note that in the combined form the COMDAT symbol is explicit. This | 
|  | extension exists to support multiple sections with the same name in | 
|  | different COMDATs: | 
|  |  | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section secName, "dr", discard, "Symbol1" | 
|  | .globl Symbol1 | 
|  | Symbol1: | 
|  | .long 1 | 
|  |  | 
|  | .section secName, "dr", discard, "Symbol2" | 
|  | .globl Symbol2 | 
|  | Symbol2: | 
|  | .long 1 | 
|  |  | 
|  | In addition to the types allowed with ``.linkonce``, ``.section`` also accepts | 
|  | ``associative``. An associative section is linked if a certain other | 
|  | COMDAT section is linked. This other section is indicated by the COMDAT symbol | 
|  | in this directive. It can be any symbol defined in the associated section, but | 
|  | is usually the associated section's COMDAT symbol. | 
|  |  | 
|  | The following restrictions apply to the associated section: | 
|  |  | 
|  | 1. It must be a COMDAT section. | 
|  | 2. It cannot be another associative COMDAT section. | 
|  |  | 
|  | In the following example, the symbol ``sym`` is the COMDAT symbol of ``.foo``, | 
|  | and ``.bar`` is associated with ``.foo``. | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section	.foo,"bw",discard, "sym" | 
|  | .section	.bar,"rd",associative, "sym" | 
|  |  | 
|  | MC supports these flags in the COFF ``.section`` directive: | 
|  |  | 
|  | - ``b``: BSS section (``IMAGE_SCN_CNT_UNINITIALIZED_DATA``) | 
|  | - ``d``: Data section (``IMAGE_SCN_CNT_INITIALIZED_DATA``) | 
|  | - ``n``: Section is not loaded (``IMAGE_SCN_LNK_REMOVE``) | 
|  | - ``r``: Read-only | 
|  | - ``s``: Shared section | 
|  | - ``w``: Writable | 
|  | - ``x``: Executable section | 
|  | - ``y``: Not readable | 
|  | - ``D``: Discardable (``IMAGE_SCN_MEM_DISCARDABLE``) | 
|  |  | 
|  | These flags are all compatible with gas, with the exception of the ``D`` flag, | 
|  | which GNU as does not support. For gas compatibility, sections with a name | 
|  | starting with ``.debug`` are implicitly discardable. | 
|  |  | 
|  |  | 
|  | ARM64/COFF-Dependent | 
|  | -------------------- | 
|  |  | 
|  | Relocations | 
|  | ^^^^^^^^^^^ | 
|  |  | 
|  | The following additional symbol variants are supported: | 
|  |  | 
|  | **:secrel_lo12:** generates a relocation that corresponds to the COFF relocation | 
|  | types ``IMAGE_REL_ARM64_SECREL_LOW12A`` or ``IMAGE_REL_ARM64_SECREL_LOW12L``. | 
|  |  | 
|  | **:secrel_hi12:** generates a relocation that corresponds to the COFF relocation | 
|  | type ``IMAGE_REL_ARM64_SECREL_HIGH12A``. | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | add x0, x0, :secrel_hi12:symbol | 
|  | ldr x0, [x0, :secrel_lo12:symbol] | 
|  |  | 
|  | add x1, x1, :secrel_hi12:symbol | 
|  | add x1, x1, :secrel_lo12:symbol | 
|  | ... | 
|  |  | 
|  |  | 
|  | ELF-Dependent | 
|  | ------------- | 
|  |  | 
|  | ``.section`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | To support creating multiple sections with the same name and COMDAT, a unique | 
|  | number can be added at the end of the ``.section`` directive. | 
|  | For example, the following code creates two sections named ``.text``. | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section	.text,"ax",@progbits,unique,1 | 
|  | nop | 
|  |  | 
|  | .section	.text,"ax",@progbits,unique,2 | 
|  | nop | 
|  |  | 
|  |  | 
|  | The unique number is not present in the resulting object at all. It is just used | 
|  | in the assembler to differentiate the sections. | 
|  |  | 
|  | The ``o`` flag is mapped to ``SHF_LINK_ORDER``. If it is present, a symbol | 
|  | must be given that identifies the section to be placed in the | 
|  | ``sh_link`` field. | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section .foo,"a",@progbits | 
|  | .Ltmp: | 
|  | .section .bar,"ao",@progbits,.Ltmp | 
|  |  | 
|  | which is equivalent to just | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section .foo,"a",@progbits | 
|  | .section .bar,"ao",@progbits,.foo | 
|  |  | 
|  | ``.linker-options`` Section (linker options) | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | To support passing linker options from the frontend to the linker, a | 
|  | special section of type ``SHT_LLVM_LINKER_OPTIONS`` is used (usually named | 
|  | ``.linker-options``, though the name is not significant because the section is | 
|  | identified by its type).  The contents of this section are a simple pair-wise | 
|  | encoding of directives for consideration by the linker.  The strings are encoded | 
|  | as standard null-terminated UTF-8 strings.  They are emitted inline so that the | 
|  | linker does not have to traverse the object file to retrieve the value.  The | 
|  | linker is permitted to not honour an option and instead issue a warning or error | 
|  | telling the user that the requested option was not honoured. | 
|  |  | 
|  | The section has type ``SHT_LLVM_LINKER_OPTIONS`` and has the ``SHF_EXCLUDE`` | 
|  | flag to ensure that the section is treated as opaque by linkers which do not | 
|  | support the feature and will not be emitted into the final linked binary. | 
|  |  | 
|  | This would be equivalent to the following raw assembly: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .section ".linker-options","e",@llvm_linker_options | 
|  | .asciz "option 1" | 
|  | .asciz "value 1" | 
|  | .asciz "option 2" | 
|  | .asciz "value 2" | 
|  |  | 
|  | The following directives are specified: | 
|  |  | 
|  | - lib | 
|  |  | 
|  | The parameter identifies a library to be linked against.  The library will | 
|  | be looked up in the default and any specified library search paths | 
|  | (specified to this point). | 
|  |  | 
|  | - libpath | 
|  |  | 
|  | The parameter identifies an additional library search path to be considered | 
|  | when looking up libraries after the inclusion of this option. | 
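
The pair-wise encoding described above can be walked with a few lines of C. The
sketch below assumes the section contents are already available as a byte
buffer; the function name ``count_linker_option_pairs`` and its ``-1`` error
convention are hypothetical, not taken from LLVM or any linker.

.. code-block:: c

  #include <stddef.h>
  #include <string.h>

  /* Counts the (directive, value) pairs in a buffer of consecutive
     null-terminated strings, two strings per pair. Returns -1 if a string
     is unterminated or a directive has no matching value. */
  static long count_linker_option_pairs(const char *data, size_t size) {
    size_t pos = 0;
    long strings = 0;
    while (pos < size) {
      const char *end = memchr(data + pos, '\0', size - pos);
      if (end == NULL)
        return -1; /* last string is missing its terminator */
      pos = (size_t)(end - data) + 1;
      ++strings;
    }
    if (strings % 2 != 0)
      return -1; /* odd number of strings: directive without a value */
    return strings / 2;
  }

For a buffer holding ``lib``, ``m``, ``libpath``, ``/opt/lib`` (each
null-terminated), the function reports two pairs. A real consumer would act on
each pair instead of only counting them.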
|  |  | 
|  | ``SHT_LLVM_CALL_GRAPH_PROFILE`` Section (Call Graph Profile) | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | This section is used to pass a call graph profile to the linker which can be | 
|  | used to optimize the placement of sections.  It contains a sequence of | 
|  | (from symbol, to symbol, weight) tuples. | 
|  |  | 
|  | It shall have a type of ``SHT_LLVM_CALL_GRAPH_PROFILE`` (0x6fff4c02), shall | 
|  | have the ``SHF_EXCLUDE`` flag set, the ``sh_link`` member shall hold the section | 
|  | header index of the associated symbol table, and shall have a ``sh_entsize`` of | 
|  | 16.  It should be named ``.llvm.call-graph-profile``. | 
|  |  | 
|  | The contents of the section shall be a sequence of ``Elf_CGProfile`` entries. | 
|  |  | 
|  | .. code-block:: c | 
|  |  | 
|  | typedef struct { | 
|  | Elf_Word cgp_from; | 
|  | Elf_Word cgp_to; | 
|  | Elf_Xword cgp_weight; | 
|  | } Elf_CGProfile; | 
|  |  | 
|  | cgp_from | 
|  | The symbol index of the source of the edge. | 
|  |  | 
|  | cgp_to | 
|  | The symbol index of the destination of the edge. | 
|  |  | 
|  | cgp_weight | 
|  | The weight of the edge. | 
|  |  | 
|  | This is represented in assembly as: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .cg_profile from, to, 42 | 
|  |  | 
|  | ``.cg_profile`` directives are processed at the end of the file.  It is an error | 
|  | if either ``from`` or ``to`` is an undefined temporary symbol.  If either symbol | 
|  | is a temporary symbol, then the section symbol is used instead.  If either | 
|  | symbol is undefined, then that symbol is defined as if ``.weak symbol`` has been | 
|  | written at the end of the file.  This forces the symbol to show up in the symbol | 
|  | table. | 
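
The entry layout given earlier in this section can be checked against the
required ``sh_entsize`` of 16 with a small C sketch. It assumes ``Elf_Word`` is
32 bits wide and ``Elf_Xword`` is 64 bits wide, as in the ELF specification; the
names ``Elf_CGProfile_sketch`` and ``cg_profile_entry_count`` are hypothetical.

.. code-block:: c

  #include <stdint.h>

  /* Fixed-width stand-ins for the ELF types used by Elf_CGProfile. */
  typedef struct {
    uint32_t cgp_from;   /* symbol index of the source of the edge */
    uint32_t cgp_to;     /* symbol index of the destination of the edge */
    uint64_t cgp_weight; /* weight of the edge */
  } Elf_CGProfile_sketch;

  /* 4 + 4 + 8 bytes with no padding matches the sh_entsize of 16. */
  _Static_assert(sizeof(Elf_CGProfile_sketch) == 16, "entry must be 16 bytes");

  /* Number of entries in a section of sh_size bytes, or 0 if sh_size is
     not a whole multiple of the entry size. */
  static uint64_t cg_profile_entry_count(uint64_t sh_size) {
    if (sh_size % sizeof(Elf_CGProfile_sketch) != 0)
      return 0;
    return sh_size / sizeof(Elf_CGProfile_sketch);
  }

A section of 48 bytes would therefore hold three edges, one per
``.cg_profile`` directive.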
|  |  | 
|  | ``SHT_LLVM_ADDRSIG`` Section (address-significance table) | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | This section is used to mark symbols as address-significant, i.e. the address | 
|  | of the symbol is used in a comparison or leaks outside the translation unit. It | 
|  | has the same meaning as the absence of the LLVM attributes ``unnamed_addr`` | 
|  | and ``local_unnamed_addr``. | 
|  |  | 
|  | Any sections referred to by symbols that are not marked as address-significant | 
|  | in any object file may be safely merged by a linker without breaking the | 
|  | address uniqueness guarantee provided by the C and C++ language standards. | 
|  |  | 
|  | The contents of the section are a sequence of ULEB128-encoded integers | 
|  | referring to the symbol table indexes of the address-significant symbols. | 
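
For readers unfamiliar with ULEB128, the following C sketch shows how a single
symbol table index would be encoded. The function name ``encode_uleb128`` and
the caller-provided buffer are assumptions made for illustration; this is not
the code LLVM uses.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  /* Encodes value as ULEB128 into out and returns the number of bytes
     written. A 64-bit value needs at most 10 bytes. Each byte carries
     7 bits of the value, least significant first; the high bit is set on
     every byte except the last. */
  static size_t encode_uleb128(uint64_t value, uint8_t *out) {
    size_t n = 0;
    do {
      uint8_t byte = value & 0x7f;
      value >>= 7;
      if (value != 0)
        byte |= 0x80; /* more bytes follow */
      out[n++] = byte;
    } while (value != 0);
    return n;
  }

Indexes below 128 occupy a single byte, so a table that refers to low-numbered
symbols stays compact.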
|  |  | 
|  | There are two associated assembly directives: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .addrsig | 
|  |  | 
|  | This instructs the assembler to emit an address-significance table. Without | 
|  | this directive, all symbols are considered address-significant. | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | .addrsig_sym sym | 
|  |  | 
|  | This marks ``sym`` as address-significant. | 
|  |  | 
|  | CodeView-Dependent | 
|  | ------------------ | 
|  |  | 
|  | ``.cv_file`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^ | 
|  | Syntax: | 
|  | ``.cv_file`` *FileNumber FileName* [ *checksum* ] [ *checksumkind* ] | 
|  |  | 
|  | ``.cv_func_id`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | Introduces a function ID that can be used with ``.cv_loc``. | 
|  |  | 
|  | Syntax: | 
|  | ``.cv_func_id`` *FunctionId* | 
|  |  | 
|  | ``.cv_inline_site_id`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | Introduces a function ID that can be used with ``.cv_loc``. Includes | 
|  | ``inlined at`` source location information for use in the line table of the | 
|  | caller, whether the caller is a real function or another inlined call site. | 
|  |  | 
|  | Syntax: | 
|  | ``.cv_inline_site_id`` *FunctionId* ``within`` *Function* ``inlined_at`` *FileNumber Line* [ *Column* ] | 
|  |  | 
|  | ``.cv_loc`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^ | 
|  | The first number is a file number, which must have been previously assigned | 
|  | with a ``.file`` directive. The second number is the line number, and the | 
|  | optional third number is a column position (zero if not specified).  The | 
|  | remaining optional items are ``.loc`` sub-directives. | 
|  |  | 
|  | Syntax: | 
|  | ``.cv_loc`` *FunctionId FileNumber* [ *Line* ] [ *Column* ] [ *prologue_end* ] [ ``is_stmt`` *value* ] | 
|  |  | 
|  | ``.cv_linetable`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | Syntax: | 
|  | ``.cv_linetable`` *FunctionId* ``,`` *FunctionStart* ``,`` *FunctionEnd* | 
|  |  | 
|  | ``.cv_inline_linetable`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | Syntax: | 
|  | ``.cv_inline_linetable`` *PrimaryFunctionId* ``,`` *FileNumber Line FunctionStart FunctionEnd* | 
|  |  | 
|  | ``.cv_def_range`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | The *GapStart* and *GapEnd* options may be repeated as needed. | 
|  |  | 
|  | Syntax: | 
|  | ``.cv_def_range`` *RangeStart RangeEnd* [ *GapStart GapEnd* ] ``,`` *bytes* | 
|  |  | 
|  | ``.cv_stringtable`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | ``.cv_filechecksums`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | ``.cv_filechecksumoffset`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | Syntax: | 
|  | ``.cv_filechecksumoffset`` *FileNumber* | 
|  |  | 
|  | ``.cv_fpo_data`` Directive | 
|  | ^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | Syntax: | 
|  | ``.cv_fpo_data`` *procsym* | 
|  |  | 
|  | Target Specific Behaviour | 
|  | ========================= | 
|  |  | 
|  | X86 | 
|  | --- | 
|  |  | 
|  | Relocations | 
|  | ^^^^^^^^^^^ | 
|  |  | 
|  | ``@ABS8`` can be applied to symbols which appear as immediate operands to | 
|  | instructions that have an 8-bit immediate form for that operand. It causes | 
|  | the assembler to use the 8-bit form and an 8-bit relocation (e.g. ``R_386_8`` | 
|  | or ``R_X86_64_8``) for the symbol. | 
|  |  | 
|  | For example: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | cmpq $foo@ABS8, %rdi | 
|  |  | 
|  | This causes the assembler to select the form of the 64-bit ``cmpq`` instruction | 
|  | that takes an 8-bit immediate operand that is sign extended to 64 bits, as | 
|  | opposed to ``cmpq $foo, %rdi`` which takes a 32-bit immediate operand. This | 
|  | is also not the same as ``cmpb $foo, %dil``, which is an 8-bit comparison. | 
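
The sign extension mentioned above can be illustrated in C. This is only a
sketch of the arithmetic: the helper names ``sign_extend_imm8`` and
``fits_in_imm8`` are hypothetical, and the range check reflects what an 8-bit
sign-extended immediate can represent rather than any documented assembler or
linker diagnostic.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Sign-extends an 8-bit immediate to 64 bits, as the 8-bit immediate
     form of a 64-bit instruction does. */
  static int64_t sign_extend_imm8(uint8_t imm) {
    return imm >= 0x80 ? (int64_t)imm - 0x100 : (int64_t)imm;
  }

  /* True if value can be represented by a sign-extended 8-bit immediate. */
  static bool fits_in_imm8(int64_t value) {
    return value >= INT8_MIN && value <= INT8_MAX;
  }

Under this arithmetic, an immediate byte of ``0xff`` compares against -1, not
255.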
|  |  | 
|  | Windows on ARM | 
|  | -------------- | 
|  |  | 
|  | Stack Probe Emission | 
|  | ^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | The reference implementation (Microsoft Visual Studio 2012) emits stack probes | 
|  | in the following fashion: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | movw r4, #constant | 
|  | bl __chkstk | 
|  | sub.w sp, sp, r4 | 
|  |  | 
|  | However, this has the limitation of 32 MiB (±16MiB).  In order to accommodate | 
|  | larger binaries, LLVM supports the use of ``-mcode-model=large`` to allow a 4GiB | 
|  | range via a slight deviation.  It will generate an indirect jump as follows: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | movw r4, #constant | 
|  | movw r12, :lower16:__chkstk | 
|  | movt r12, :upper16:__chkstk | 
|  | blx r12 | 
|  | sub.w sp, sp, r4 | 
|  |  | 
|  | Variable Length Arrays | 
|  | ^^^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | The reference implementation (Microsoft Visual Studio 2012) does not permit the | 
|  | emission of Variable Length Arrays (VLAs). | 
|  |  | 
|  | The Windows ARM Itanium ABI extends the base ABI by adding support for emitting | 
|  | a dynamic stack allocation.  When emitting a variable stack allocation, a call | 
|  | to ``__chkstk`` is emitted unconditionally to ensure that guard pages are set up | 
|  | properly.  The emission of this stack probe is handled similarly to the | 
|  | standard stack probe emission. | 
|  |  | 
|  | The MSVC environment does not emit code for VLAs currently. | 
|  |  | 
|  | Windows on ARM64 | 
|  | ---------------- | 
|  |  | 
|  | Stack Probe Emission | 
|  | ^^^^^^^^^^^^^^^^^^^^ | 
|  |  | 
|  | The reference implementation (Microsoft Visual Studio 2017) emits stack probes | 
|  | in the following fashion: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | mov x15, #constant | 
|  | bl __chkstk | 
|  | sub sp, sp, x15, lsl #4 | 
|  |  | 
|  | However, this has the limitation of 256 MiB (±128MiB).  In order to accommodate | 
|  | larger binaries, LLVM supports the use of ``-mcode-model=large`` to allow an 8GiB | 
|  | (±4GiB) range via a slight deviation.  It will generate an indirect jump as | 
|  | follows: | 
|  |  | 
|  | .. code-block:: gas | 
|  |  | 
|  | mov x15, #constant | 
|  | adrp x16, __chkstk | 
|  | add x16, x16, :lo12:__chkstk | 
|  | blr x16 | 
|  | sub sp, sp, x15, lsl #4 | 
|  |  |