diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst index 6ba3759080cc3..8b9a939eda955 100644 --- a/llvm/docs/LangRef.rst +++ b/llvm/docs/LangRef.rst @@ -6748,161 +6748,23 @@ parameter, and it will be included in the ``retainedNodes:`` field of its type: !3) !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3) -.. _DIExpression: - DIExpression """""""""""" ``DIExpression`` nodes represent expressions that are inspired by the DWARF -expression language. They are used in :ref:`debug records ` -(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the -referenced LLVM variable relates to the source language variable. Debug -expressions are interpreted left-to-right: start by pushing the value/address -operand of the record onto a stack, then repeatedly push and evaluate -opcodes from the DIExpression until the final variable description is produced. - -The current supported opcode vocabulary is limited: - -- ``DW_OP_deref`` dereferences the top of the expression stack. -- ``DW_OP_plus`` pops the last two entries from the expression stack, adds - them together and appends the result to the expression stack. -- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts - the last entry from the second last entry and appends the result to the - expression stack. -- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression. -- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8`` - here, respectively) of the variable fragment from the working expression. Note - that contrary to DW_OP_bit_piece, the offset is describing the location - within the described source variable. -- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding - (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the - expression stack is to be converted. Maps into a ``DW_OP_convert`` operation - that references a base type constructed from the supplied values. 
-- ``DW_OP_LLVM_extract_bits_sext, 16, 8,`` specifies the offset and size - (``16`` and ``8`` here, respectively) of bits that are to be extracted and - sign-extended from the value at the top of the expression stack. If the top of - the expression stack is a memory location then these bits are extracted from - the value pointed to by that memory location. Maps into a ``DW_OP_shl`` - followed by ``DW_OP_shra``. -- ``DW_OP_LLVM_extract_bits_zext`` behaves similarly to - ``DW_OP_LLVM_extract_bits_sext``, but zero-extends instead of sign-extending. - Maps into a ``DW_OP_shl`` followed by ``DW_OP_shr``. -- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be - optionally applied to the pointer. The memory tag is derived from the - given tag offset in an implementation-defined manner. -- ``DW_OP_swap`` swaps top two stack entries. -- ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top - of the stack is treated as an address. The second stack entry is treated as an - address space identifier. -- ``DW_OP_stack_value`` marks a constant value. -- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon - function entry. When targeting DWARF, a ``DBG_VALUE(reg, ..., - DIExpression(DW_OP_LLVM_entry_value, 1, ...)`` is lowered to - ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon - function entry onto the DWARF expression stack. - - The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value`` - block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1, - DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where - the entry value of ``reg`` is pushed onto the stack, and is added with 123. - Due to framework limitations ``N`` must be 1, in other words, - ``DW_OP_entry_value`` always refers to the value/address operand of the - instruction. 
- - Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is - usually used in MIR, but it is also allowed in LLVM IR when targeting a - :ref:`swiftasync ` argument. The operation is introduced by: - - - ``LiveDebugValues`` pass, which applies it to function parameters that - are unmodified throughout the function. Support is limited to simple - register location descriptions, or as indirect locations (e.g., - parameters passed-by-value to a callee via a pointer to a temporary copy - made in the caller). - - ``AsmPrinter`` pass when a call site parameter value - (``DW_AT_call_site_parameter_value``) is represented as entry value of - the parameter. - - ``CoroSplit`` pass, which may move variables from allocas into a - coroutine frame. If the coroutine frame is a - :ref:`swiftasync ` argument, the variable is described with - an ``DW_OP_LLVM_entry_value`` operation. - -- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one - value, such as one that calculates the sum of two registers. This is always - used in combination with an ordered list of values, such that - ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For - example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus, - DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to - ``%reg1 - reg2``. This list of values should be provided by the containing - intrinsic/instruction. -- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents a content on the provided - signed offset of the specified register. The opcode is only generated by the - ``AsmPrinter`` pass to describe call site parameter value which requires an - expression over two registers. -- ``DW_OP_push_object_address`` pushes the address of the object which can then - serve as a descriptor in subsequent calculation. This opcode can be used to - calculate bounds of fortran allocatable array which has array descriptors. 
-- ``DW_OP_over`` duplicates the entry currently second in the stack at the top - of the stack. This opcode can be used to calculate bounds of fortran assumed - rank array which has rank known at run time and current dimension number is - implicitly first element of the stack. -- ``DW_OP_LLVM_implicit_pointer`` It specifies the dereferenced value. It can - be used to represent pointer variables which are optimized out but the value - it points to is known. This operator is required as it is different than DWARF - operator DW_OP_implicit_pointer in representation and specification (number - and types of operands) and later can not be used as multiple level. - -.. code-block:: text +expression language. They are used in :ref:`debug records ` +(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the referenced +LLVM variable relates to the source language variable. - IR for "*ptr = 4;" - -------------- - #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20) - !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5, - type: !18) - !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64) - !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) - !20 = !DILocation(line: 10, scope: !12) - - IR for "**ptr = 4;" - -------------- - #dbg_value(i32 4, !17, - !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer), - !21) - !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5, - type: !18) - !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64) - !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64) - !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) - !21 = !DILocation(line: 10, scope: !12) - -DWARF specifies three kinds of simple location descriptions: Register, memory, -and implicit location descriptions. 
Note that a location description is -defined over certain ranges of a program, i.e the location of a variable may -change over the course of the program. Register and memory location -descriptions describe the *concrete location* of a source variable (in the -sense that a debugger might modify its value), whereas *implicit locations* -describe merely the actual *value* of a source variable which might not exist -in registers or in memory (see ``DW_OP_stack_value``). - -A ``#dbg_declare`` record describes an indirect value (the address) of a -source variable. The first operand of the record must be an address of some -kind. A DIExpression operand to the record refines this address to produce a -concrete location for the source variable. - -A ``#dbg_value`` record describes the direct value of a source variable. -The first operand of the record may be a direct or indirect value. A -DIExpression operand to the record refines the first operand to produce a -direct value. For example, if the first operand is an indirect value, it may be -necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a -valid debug record. +See :ref:`diexpression` for details. .. note:: - A DIExpression is interpreted in the same way regardless of which kind of - debug record it's attached to. - DIExpressions are always printed and parsed inline; they can never be referenced by an ID (e.g. ``!1``). +Some examples of expressions: + .. code-block:: text !DIExpression(DW_OP_deref) diff --git a/llvm/docs/SourceLevelDebugging.rst b/llvm/docs/SourceLevelDebugging.rst index ea27ee5b8fb1b..96d5de7341d21 100644 --- a/llvm/docs/SourceLevelDebugging.rst +++ b/llvm/docs/SourceLevelDebugging.rst @@ -160,15 +160,15 @@ which can have a value, including at least: There is no special provision for "true" constants in LLVM today, and they are instead treated as local or global variables. -A variable is represented by a `local variable `_ -or `global variable `_ metadata node. 
+A variable is represented by a :ref:`local variable ` or +:ref:`global variable ` metadata node. A "variable fragment" (or just "fragment") is a contiguous span of bits of a variable. -A :ref:`debug record ` which refers to a ``DIExpression`` ending -with a ``DW_OP_LLVM_fragment`` operation describes a fragment of the variable -it refers to. +A :ref:`debug record ` which refers to a :ref:`diexpression` +ending with a ``DW_OP_LLVM_fragment`` operation describes a fragment of the +variable it refers to. The operands of the ``DW_OP_LLVM_fragment`` operation encode the bit offset of the fragment relative to the start of the variable, and the size of the @@ -205,16 +205,16 @@ debugger to interpret the information. To provide basic functionality, the LLVM debugger does have to make some assumptions about the source-level language being debugged, though it keeps these to a minimum. The only common features that the LLVM debugger assumes -exist are `source files `_, and `program objects -`_. These abstract objects are used by a -debugger to form stack traces, show information about local variables, etc. +exist are :ref:`source files `, and :ref:`program objects +`. These abstract objects are used by a debugger to form +stack traces, show information about local variables, etc. This section of the documentation first describes the representation aspects common to any source-language. :ref:`ccxx_frontend` describes the data layout conventions used by the C and C++ front-ends. -Debug information descriptors are `specialized metadata nodes -`_, first-class subclasses of ``Metadata``. +Debug information descriptors are :ref:`specialized metadata nodes +`, first-class subclasses of ``Metadata``. There are two models for defining the values of source variables at different states of the program and tracking these values through optimization and code @@ -229,7 +229,7 @@ document. .. 
_debug_records: Debug Records ----------------------------- +------------- Debug records define the value that a source variable has during execution of the program; they appear interleaved with instructions, although they are not @@ -256,14 +256,13 @@ comma-separated arguments in parentheses, as with a `call`. #dbg_declare([Value|MDNode], DILocalVariable, DIExpression, DILocation) -This record provides information about a local element (e.g., variable). -The first argument is an SSA ``ptr`` value corresponding to a variable address, -and is typically a static alloca in the function entry block. The second -argument is a `local variable `_ containing a -description of the variable. The third argument is a `complex expression -`_. The fourth argument is a `source location -`_. A ``#dbg_declare`` record describes the -*address* of a source variable. +This record provides information about a local element (e.g., variable). The +first argument is an SSA ``ptr`` value corresponding to a variable address, and +is typically a static alloca in the function entry block. The second argument +is a :ref:`local variable ` containing a description of the +variable. The third argument is a :ref:`complex expression `. +The fourth argument is a :ref:`source location `. A +``#dbg_declare`` record describes the *address* of a source variable. .. code-block:: llvm @@ -299,11 +298,10 @@ must agree on the memory location. #dbg_value([Value|DIArgList|MDNode], DILocalVariable, DIExpression, DILocation) This record provides information when a user source variable is set to a new -value. The first argument is the new value. The second argument is a `local -variable `_ containing a description of the -variable. The third argument is a `complex expression -`_. The fourth argument is a `source location -`_. +value. The first argument is the new value. The second argument is a +:ref:`local variable ` containing a description of the +variable. The third argument is a :ref:`complex expression `. 
+The fourth argument is a :ref:`source location `. A ``#dbg_value`` record describes the *value* of a source variable directly, not its address. Note that the value operand of this intrinsic may @@ -311,7 +309,7 @@ be indirect (i.e, a pointer to the source variable), provided that interpreting the complex expression derives the direct value. ``#dbg_assign`` -^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^ .. toctree:: :hidden: @@ -333,15 +331,21 @@ performs the assignment, and the destination address. The first three arguments are the same as for a ``#dbg_value``. The fourth argument is a ``DIAssignID`` used to reference a store. The fifth is the -destination of the store, the sixth is a `complex -expression `_ that modifies it, and the seventh is a -`source location `_. +destination of the store, the sixth is a :ref:`complex expression +` that modifies it, and the seventh is a :ref:`source location +`. See :doc:`AssignmentTracking` for more info. Debugger intrinsic functions ---------------------------- +.. warning:: + + These intrinsics are deprecated, please use :ref:`debug records + ` instead. For more details see `RemoveDIs + `_. + .. _format_common_intrinsics: In intrinsic-mode, LLVM uses several intrinsic functions (name prefixed with "``llvm.dbg``") to @@ -400,6 +404,189 @@ This intrinsic is equivalent to ``#dbg_assign``: metadata i32 %i, metadata !1, metadata !DIExpression(), metadata !2, metadata ptr %i.addr, metadata !DIExpression(), metadata !3), !dbg !3 +.. _diexpression: + +DIExpression +------------ + +Debug expressions are represented as :ref:`specialized-metadata`. + +Debug expressions are interpreted left-to-right: start by pushing the +value/address operand of the record onto a stack, then repeatedly push and +evaluate opcodes from the DIExpression until the final variable description is +produced. + +The opcodes available in these expressions are described in +:ref:`dwarf-opcodes` and :ref:`internal-opcodes`. 
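The left-to-right evaluation described in the added ``DIExpression`` section can be sketched as a toy stack machine. This is only an illustrative model, not LLVM's implementation: the ``evaluate`` function, its ``memory`` dictionary, and the string spelling of the opcodes are assumptions made for this example, and only three opcodes are modelled.

```python
# Toy model of left-to-right DIExpression evaluation (illustrative only;
# not LLVM's implementation, and only three opcodes are modelled).

def evaluate(operand, ops, memory=None):
    """Evaluate `ops` starting from `operand`; return (result, is_value)."""
    memory = memory or {}          # hypothetical address -> value map
    stack = [operand]              # the record's operand is pushed first
    is_value = False               # set when DW_OP_stack_value is seen
    i = 0
    while i < len(ops):
        op = ops[i]
        i += 1
        if op == "DW_OP_plus_uconst":
            stack[-1] += ops[i]    # consumes one inline constant operand
            i += 1
        elif op == "DW_OP_deref":
            stack.append(memory[stack.pop()])  # replace address by its contents
        elif op == "DW_OP_stack_value":
            is_value = True        # result is a value, not a location
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack[-1], is_value


# An indirect operand refined to a direct value: add 8 to the address,
# load from it, then mark the result as a value.
result = evaluate(
    0x1000,
    ["DW_OP_plus_uconst", 8, "DW_OP_deref", "DW_OP_stack_value"],
    memory={0x1008: 42},
)
```

Under these assumptions, an empty opcode list simply yields the record's operand unchanged, which mirrors an empty ``!DIExpression()``.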
+ +DWARF specifies three kinds of simple location descriptions: register, memory, +and implicit location descriptions. Note that a location description is +defined over certain ranges of a program, i.e., the location of a variable may +change over the course of the program. Register and memory location +descriptions describe the *concrete location* of a source variable (in the +sense that a debugger might modify its value), whereas *implicit locations* +describe only the *value* of a source variable, which might not exist +in registers or in memory (see ``DW_OP_stack_value``). + +A ``#dbg_declare`` record describes an indirect value (the address) of a +source variable. The first operand of the record must be an address of some +kind. A DIExpression operand to the record refines this address to produce a +concrete location for the source variable. + +A ``#dbg_value`` record describes the direct value of a source variable. +The first operand of the record may be a direct or indirect value. A +DIExpression operand to the record refines the first operand to produce a +direct value. For example, if the first operand is an indirect value, it may be +necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a +valid debug record. + +.. note:: + + A DIExpression is interpreted in the same way regardless of which kind of + debug record it's attached to. + + DIExpressions are always printed and parsed inline; they can never be + referenced by an ID (e.g. ``!1``). + +Examples using ``DW_OP_LLVM_implicit_pointer``: + +.. 
code-block:: text + + IR for "*ptr = 4;" + -------------- + #dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20) + !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5, + type: !18) + !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64) + !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) + !20 = !DILocation(line: 10, scope: !12) + + IR for "**ptr = 4;" + -------------- + #dbg_value(i32 4, !17, + !DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer), + !21) + !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5, + type: !18) + !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64) + !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64) + !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) + !21 = !DILocation(line: 10, scope: !12) + + +.. _dwarf-opcodes: + +DWARF Opcodes +^^^^^^^^^^^^^ + +When possible, LLVM reuses DWARF opcodes and gives them the same semantics in +LLVM expressions as in DWARF expressions. The currently supported opcode +vocabulary is limited, but includes at least: + +- ``DW_OP_deref`` dereferences the top of the expression stack. +- ``DW_OP_plus`` pops the last two entries from the expression stack, adds + them together and pushes the result onto the expression stack. +- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts + the last entry from the second-to-last entry and pushes the result onto the + expression stack. +- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression. +- ``DW_OP_swap`` swaps the top two stack entries. +- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the top + of the stack is treated as an address. The second stack entry is treated as an + address space identifier. +- ``DW_OP_stack_value`` marks a constant value. 
+- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the contents at the provided + signed offset from the specified register. The opcode is only generated by the + ``AsmPrinter`` pass to describe a call site parameter value that requires an + expression over two registers. +- ``DW_OP_push_object_address`` pushes the address of the object, which can then + serve as a descriptor in subsequent calculations. This opcode can be used to + calculate the bounds of Fortran allocatable arrays, which have array descriptors. +- ``DW_OP_over`` duplicates the second entry of the stack and pushes the copy on top + of the stack. This opcode can be used to calculate the bounds of a Fortran assumed-rank + array, whose rank is known only at run time and for which the current dimension number is + implicitly the first element of the stack. + +.. _internal-opcodes: + +Internal Opcodes +^^^^^^^^^^^^^^^^ + +Where the DWARF equivalent is not suitable, or no DWARF equivalent exists, LLVM +defines internal-only opcodes which have no direct analog in DWARF. + +.. note:: + + Some opcodes do not influence the final DWARF expression directly, instead + encoding information logically belonging to the debug records which use + them. + +- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and + ``8`` here, respectively) of the variable fragment from the working + expression. Note that, unlike ``DW_OP_bit_piece``, the offset describes + the location within the described source variable. This does not affect the + semantics of the expression. +- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding + (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the + expression stack is to be converted. Maps into a ``DW_OP_convert`` operation + that references a base type constructed from the supplied values. +- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be + optionally applied to the pointer. 
The memory tag is derived from the + given tag offset in an implementation-defined manner. This does not affect + the semantics of the expression. +- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon + function entry. When targeting DWARF, a ``DBG_VALUE(reg, ..., + DIExpression(DW_OP_LLVM_entry_value, 1, ...))`` is lowered to + ``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon + function entry onto the DWARF expression stack. + + The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value`` + block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1, + DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where + the entry value of ``reg`` is pushed onto the stack, and 123 is added to it. + Due to framework limitations, ``N`` must be 1; in other words, + ``DW_OP_entry_value`` always refers to the value/address operand of the + instruction. + + Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is + usually used in MIR, but it is also allowed in LLVM IR when targeting a + :ref:`swiftasync ` argument. The operation is introduced by: + + - The ``LiveDebugValues`` pass, which applies it to function parameters that + are unmodified throughout the function. Support is limited to simple + register location descriptions or indirect locations (e.g., + parameters passed by value to a callee via a pointer to a temporary copy + made in the caller). + - The ``AsmPrinter`` pass, when a call site parameter value + (``DW_AT_call_site_parameter_value``) is represented as the entry value of + the parameter. + - The ``CoroSplit`` pass, which may move variables from allocas into a + coroutine frame. If the coroutine frame is a + :ref:`swiftasync ` argument, the variable is described with + a ``DW_OP_LLVM_entry_value`` operation. + +- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. 
It can + be used to represent pointer variables that are optimized out when the value + they point to is known. This operator is required because it differs from the DWARF + operator ``DW_OP_implicit_pointer`` in representation and specification (number + and types of operands), and the latter cannot be used for multiple levels of indirection. +- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one + value, such as one that calculates the sum of two registers. This is always + used in combination with an ordered list of values, such that + ``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For + example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus, + DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to + ``%reg1 - %reg2``. This list of values should be provided by the containing + intrinsic/instruction. +- ``DW_OP_LLVM_extract_bits_sext, 16, 8`` specifies the offset and size + (``16`` and ``8`` here, respectively) of bits that are to be extracted and + sign-extended from the value at the top of the expression stack. If the top of + the expression stack is a memory location, then these bits are extracted from + the value pointed to by that memory location. Maps into a ``DW_OP_shl`` + followed by ``DW_OP_shra``. +- ``DW_OP_LLVM_extract_bits_zext`` behaves similarly to + ``DW_OP_LLVM_extract_bits_sext``, but zero-extends instead of sign-extending. + Maps into a ``DW_OP_shl`` followed by ``DW_OP_shr``. Object lifetimes and scoping ============================ @@ -506,11 +693,11 @@ scope information for the variable ``X``. isLocal: false, isDefinition: true, scopeLine: 1, isOptimized: false, retainedNodes: !2) -Here ``!13`` is metadata providing `location information -`_. In this example, scope is encoded by ``!4``, a -`subprogram descriptor `_. 
This way the location -information parameter to the records indicates that the variable ``X`` is -declared at line number 2 at a function level scope in function ``foo``. +Here ``!13`` is metadata providing :ref:`location information `. +In this example, scope is encoded by ``!4``, a :ref:`subprogram descriptor +`. This way the location information parameter to the records +indicates that the variable ``X`` is declared at line number 2 at a function +level scope in function ``foo``. Now, let's take another example. @@ -782,8 +969,7 @@ And has the following operands: location operands, which may take any of the same values as the first operand of the ``DBG_VALUE`` instruction above. These variable location operands are inserted into the final DWARF Expression in positions indicated - by the ``DW_OP_LLVM_arg`` operator in the `DIExpression - `_. + by the ``DW_OP_LLVM_arg`` operator in the :ref:`diexpression`. The position at which the DBG_VALUEs are inserted should correspond to the positions of their matching ``#dbg_value`` records in the IR block. As