[LangRef] Rework DIExpression docs #153072

150 changes: 6 additions & 144 deletions llvm/docs/LangRef.rst
@@ -6748,161 +6748,23 @@ parameter, and it will be included in the ``retainedNodes:`` field of its
type: !3)
!2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

.. _DIExpression:

DIExpression
""""""""""""

``DIExpression`` nodes represent expressions that are inspired by the DWARF
expression language. They are used in :ref:`debug records <debugrecords>`
(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the
referenced LLVM variable relates to the source language variable. Debug
expressions are interpreted left-to-right: start by pushing the value/address
operand of the record onto a stack, then repeatedly push and evaluate
opcodes from the DIExpression until the final variable description is produced.
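
As a rough illustration of this evaluation order, the trace below follows one
record step by step. The value ``%p``, the metadata IDs ``!10`` and ``!20``,
and the offset ``8`` are hypothetical and not taken from a real module:

.. code-block:: text

   #dbg_value(ptr %p, !10, !DIExpression(DW_OP_plus_uconst, 8, DW_OP_deref), !20)

   ; step                      stack (top on the right)
   ; push record operand       [ %p ]
   ; DW_OP_plus_uconst, 8      [ %p + 8 ]
   ; DW_OP_deref               [ *(%p + 8) ]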

The currently supported opcode vocabulary is limited:

- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
the last entry from the second last entry and appends the result to the
expression stack.
- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
here, respectively) of the variable fragment from the working expression. Note
that, unlike ``DW_OP_bit_piece``, the offset describes the location
within the described source variable.
- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
(``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
that references a base type constructed from the supplied values.
- ``DW_OP_LLVM_extract_bits_sext, 16, 8`` specifies the offset and size
(``16`` and ``8`` here, respectively) of bits that are to be extracted and
sign-extended from the value at the top of the expression stack. If the top of
the expression stack is a memory location then these bits are extracted from
the value pointed to by that memory location. Maps into a ``DW_OP_shl``
followed by ``DW_OP_shra``.
- ``DW_OP_LLVM_extract_bits_zext`` behaves similarly to
``DW_OP_LLVM_extract_bits_sext``, but zero-extends instead of sign-extending.
Maps into a ``DW_OP_shl`` followed by ``DW_OP_shr``.
- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
optionally applied to the pointer. The memory tag is derived from the
given tag offset in an implementation-defined manner.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the top
of the stack is treated as an address. The second stack entry is treated as an
address space identifier.
- ``DW_OP_stack_value`` marks a constant value.
- ``DW_OP_LLVM_entry_value, N`` refers to the value a register had upon
function entry. When targeting DWARF, a ``DBG_VALUE(reg, ...,
DIExpression(DW_OP_LLVM_entry_value, 1, ...))`` is lowered to
``DW_OP_entry_value [reg], ...``, which pushes the value ``reg`` had upon
function entry onto the DWARF expression stack.

The next ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value, 1,
DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an expression where
the entry value of ``reg`` is pushed onto the stack and 123 is added to it.
Due to framework limitations, ``N`` must be 1; in other words,
``DW_OP_entry_value`` always refers to the value/address operand of the
instruction.

Because ``DW_OP_LLVM_entry_value`` is defined in terms of registers, it is
usually used in MIR, but it is also allowed in LLVM IR when targeting a
:ref:`swiftasync <swiftasync>` argument. The operation is introduced by:

- ``LiveDebugValues`` pass, which applies it to function parameters that
are unmodified throughout the function. Support is limited to simple
register location descriptions or indirect locations (e.g.,
parameters passed-by-value to a callee via a pointer to a temporary copy
made in the caller).
- ``AsmPrinter`` pass when a call site parameter value
(``DW_AT_call_site_parameter_value``) is represented as entry value of
the parameter.
- ``CoroSplit`` pass, which may move variables from allocas into a
coroutine frame. If the coroutine frame is a
:ref:`swiftasync <swiftasync>` argument, the variable is described with
a ``DW_OP_LLVM_entry_value`` operation.

- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
value, such as one that calculates the sum of two registers. This is always
used in combination with an ordered list of values, such that
``DW_OP_LLVM_arg, N`` refers to the ``N``\ :sup:`th` element in that list. For
example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
``%reg1 - %reg2``. This list of values should be provided by the containing
intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the content at the provided
signed offset from the specified register. The opcode is only generated by the
``AsmPrinter`` pass to describe a call site parameter value that requires an
expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object, which can then
serve as a descriptor in subsequent calculations. This opcode can be used to
calculate the bounds of a Fortran allocatable array, which has an array descriptor.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
of the stack. This opcode can be used to calculate the bounds of a Fortran
assumed-rank array, whose rank is known only at run time and for which the
current dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
be used to represent pointer variables that are optimized out but whose pointee
value is known. This operator is required because it differs from the DWARF
operator ``DW_OP_implicit_pointer`` in representation and specification (number
and types of operands), and the latter cannot be used for multiple levels of
indirection.

.. code-block:: text
expression language. They are used in :ref:`debug records <debug_records>`
(such as ``#dbg_declare`` and ``#dbg_value``) to describe how the referenced
LLVM variable relates to the source language variable.

IR for "*ptr = 4;"
--------------
#dbg_value(i32 4, !17, !DIExpression(DW_OP_LLVM_implicit_pointer), !20)
!17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
type: !18)
!18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
!19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!20 = !DILocation(line: 10, scope: !12)

IR for "**ptr = 4;"
--------------
#dbg_value(i32 4, !17,
!DIExpression(DW_OP_LLVM_implicit_pointer, DW_OP_LLVM_implicit_pointer),
!21)
!17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
type: !18)
!18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
!19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
!20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!21 = !DILocation(line: 10, scope: !12)
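
The ``DW_OP_LLVM_fragment`` and ``DW_OP_LLVM_arg`` operations from the list
above can be sketched in the same way. The records below are an illustrative
assumption rather than output from a real compilation: ``%lo``, ``%hi``,
``%a``, ``%b`` and the metadata IDs are hypothetical, and ``!10`` is assumed to
describe a 64-bit variable in the first two records.

.. code-block:: text

   ; A 64-bit variable split into two 32-bit halves. Offsets and sizes are
   ; in bits and locate each piece within the source variable.
   #dbg_value(i32 %lo, !10, !DIExpression(DW_OP_LLVM_fragment, 0, 32), !20)
   #dbg_value(i32 %hi, !10, !DIExpression(DW_OP_LLVM_fragment, 32, 32), !20)

   ; A value computed from two operands: DW_OP_LLVM_arg, 0 refers to %a and
   ; DW_OP_LLVM_arg, 1 refers to %b, so the expression evaluates to %a - %b.
   #dbg_value(!DIArgList(i32 %a, i32 %b), !11,
              !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
                            DW_OP_stack_value), !20)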

DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions. Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
change over the course of the program. Register and memory location
descriptions describe the *concrete location* of a source variable (in the
sense that a debugger might modify its value), whereas *implicit locations*
describe merely the actual *value* of a source variable which might not exist
in registers or in memory (see ``DW_OP_stack_value``).

A ``#dbg_declare`` record describes an indirect value (the address) of a
source variable. The first operand of the record must be an address of some
kind. A DIExpression operand to the record refines this address to produce a
concrete location for the source variable.

A ``#dbg_value`` record describes the direct value of a source variable.
The first operand of the record may be a direct or indirect value. A
DIExpression operand to the record refines the first operand to produce a
direct value. For example, if the first operand is an indirect value, it may be
necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
valid debug record.
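
A minimal sketch of the distinction between the two record kinds, using
hypothetical names (``%x.addr``, ``%x``) and metadata IDs that do not come from
a real module:

.. code-block:: text

   ; The operand is the address of the variable's storage.
   #dbg_declare(ptr %x.addr, !10, !DIExpression(), !20)

   ; The operand is the variable's value itself.
   #dbg_value(i32 %x, !10, !DIExpression(), !20)

   ; The operand is an indirect value (a pointer to the variable's value),
   ; so DW_OP_deref is needed to produce a direct value.
   #dbg_value(ptr %x.addr, !10, !DIExpression(DW_OP_deref), !20)
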
See :ref:`diexpression` for details.

.. note::

A DIExpression is interpreted in the same way regardless of which kind of
debug record it's attached to.

DIExpressions are always printed and parsed inline; they can never be
referenced by an ID (e.g. ``!1``).

Some examples of expressions:

.. code-block:: text

!DIExpression(DW_OP_deref)