Draft
Changes from 3 commits
200 changes: 184 additions & 16 deletions docs/core/diagnostics/dotnet-trace.md
@@ -43,7 +43,7 @@
* Is a cross-platform .NET Core tool.
* Enables the collection of .NET Core traces of a running process without a native profiler.
* Is built on [`EventPipe`](./eventpipe.md) of the .NET Core runtime.
* Delivers the same experience on Windows, Linux, or macOS.
* On Linux, provides additional integration with kernel user_events for native tracing tool compatibility.
Member

We might want to tweak the wording a bit later, but not focusing on this for the moment :)


## Options

@@ -55,15 +55,12 @@

Displays the version of the dotnet-trace utility.

- **`--duration`**

How long to run the trace. `--duration 00:00:00:05` will run it for 5 seconds.

## Commands

| Command |
|-----------------------------------------------------------|
| [dotnet-trace collect](#dotnet-trace-collect) |
| [dotnet-trace collect-linux](#dotnet-trace-collect-linux) |
| [dotnet-trace convert](#dotnet-trace-convert) |
| [dotnet-trace ps](#dotnet-trace-ps) |
| [dotnet-trace list-profiles](#dotnet-trace-list-profiles) |
@@ -76,16 +73,27 @@
### Synopsis

```dotnetcli
dotnet-trace collect [--buffersize <size>] [--clreventlevel <clreventlevel>] [--clrevents <clrevents>]
dotnet-trace collect
[--buffersize <size>]
[--clreventlevel <clreventlevel>]
[--clrevents <clrevents>]
[--dsrouter <ios|ios-sim|android|android-emu>]
[--format <Chromium|NetTrace|Speedscope>] [-h|--help] [--duration dd:hh:mm:ss]
[-n, --name <name>] [--diagnostic-port] [-o|--output <trace-file-path>] [-p|--process-id <pid>]
[--profile <profile-name>] [--providers <list-of-comma-separated-providers>]
[--format <Chromium|NetTrace|Speedscope>]
[-h|--help]
[--duration dd:hh:mm:ss]
[-n, --name <name>]
[--diagnostic-port]
[-o|--output <trace-file-path>]
[-p|--process-id <pid>]
[--profile <profile-name>]
[--providers <list-of-comma-separated-providers>]
[-- <command>] (for target applications running .NET 5 or later)
[--show-child-io] [--resume-runtime]
[--show-child-io]
[--resume-runtime]
[--stopping-event-provider-name <stoppingEventProviderName>]
[--stopping-event-event-name <stoppingEventEventName>]
[--stopping-event-payload-filter <stoppingEventPayloadFilter>]
[--event-filters <list-of-comma-separated-event-filters>]
```

### Options
@@ -158,7 +166,7 @@

- **`--dsrouter {ios|ios-sim|android|android-emu}`**

Starts [dotnet-dsrouter](dotnet-dsrouter.md) and connects to it. Requires [dotnet-dsrouter](dotnet-dsrouter.md) to be installed. Run `dotnet-dsrouter -h` for more information.
Starts [dotnet-dsrouter](dotnet-dsrouter.md) and connects to it. Requires [dotnet-dsrouter](dotnet-dsrouter.md) to be installed. Run `dotnet-dsrouter -h` for more information.

- **`--format {Chromium|NetTrace|Speedscope}`**

@@ -204,11 +212,11 @@

A named pre-defined set of provider configurations that allows common tracing scenarios to be specified succinctly. The following profiles are available:

| Profile | Description |
|---------|-------------|
|`cpu-sampling`|Useful for tracking CPU usage and general .NET runtime information. This is the default option if no profile or providers are specified.|
|`gc-verbose`|Tracks GC collections and samples object allocations.|
|`gc-collect`|Tracks GC collections only at very low overhead.|
| Profile | Description |
|---------|-------------|
|`cpu-sampling`|Useful for tracking CPU usage and general .NET runtime information. This is the default option if no profile or providers are specified.|
|`gc-verbose`|Tracks GC collections and samples object allocations.|
|`gc-collect`|Tracks GC collections only at very low overhead.|

- **`--providers <list-of-comma-separated-providers>`**

@@ -249,6 +257,34 @@

A string, parsed as [payload_field_name]:[payload_field_value] pairs separated by commas, that will stop the trace upon hitting an event containing all specified payload pairs. Requires `--stopping-event-provider-name` and `--stopping-event-event-name` to be set. For example, `--stopping-event-provider-name Microsoft-Windows-DotNETRuntime --stopping-event-event-name Method/JittingStarted --stopping-event-payload-filter MethodNameSpace:Program,MethodName:OnButtonClick` stops the trace upon the first `Method/JittingStarted` event for the method `OnButtonClick` in the `Program` namespace emitted by the `Microsoft-Windows-DotNETRuntime` event provider.

- **`--event-filters <list-of-comma-separated-event-filters>`**

Defines an additional optional filter for each provider's events. When no `--event-filters` is specified for a provider, all events allowed by the provider's keywords and level configuration are collected. Event filters provide additional granular control beyond the keyword/level filtering.

**Format:** `ProviderName:<Enable>:<EventIds>`

Where:
- `ProviderName`: The EventPipe provider name (e.g., `Microsoft-Windows-DotNETRuntime`).
- `Enable`: Boolean value indicating whether the listed `EventIds` are enabled or disabled. Defaults to `false`.
- `EventIds`: Plus-delimited event IDs to enable or disable. Defaults to empty.

**Examples:**
```

Check failure on line 272 in docs/core/diagnostics/dotnet-trace.md (GitHub Actions / lint): MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```"] https://github.com/DavidAnson/markdownlint/blob/v0.38.0/doc/md031.md

# Scenario: Disable specific events from Microsoft-Windows-DotNETRuntime
--event-filters "Microsoft-Windows-DotNETRuntime:false:1+2+3+4+5+6+7+8+9"

# Scenario: Enable specific events from a provider
--event-filters "Microsoft-Windows-DotNETRuntime:true:80+129+130+250"
# Only events 80, 129, 130, and 250 will be collected from this provider (others are filtered out)

# Scenario: Multiple providers with mixed filtering - some providers have no filters
--providers "Microsoft-Windows-DotNETRuntime:0xFFFFFFFF:5,System.Threading.Tasks.TplEventSource:0xFFFFFFFF:5,MyCustomProvider:0xFFFFFFFF:5"
--event-filters "Microsoft-Windows-DotNETRuntime:false:1+2+3,System.Threading.Tasks.TplEventSource:true:7+8+9"
# Microsoft-Windows-DotNETRuntime: All events EXCEPT 1,2,3 are collected
# System.Threading.Tasks.TplEventSource: ONLY events 7,8,9 are collected
# MyCustomProvider: ALL events are collected (no filter specified - follows provider keywords/level)
```
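
The filter semantics described for `--event-filters` can be sketched in a few lines of Python. This is only an illustration of the documented format (`ProviderName:<Enable>:<EventIds>`) and its defaults, not the parser used by dotnet-trace; the names `EventFilter`, `parse_event_filters`, and `is_collected` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EventFilter:
    """One parsed `ProviderName:<Enable>:<EventIds>` entry (hypothetical type)."""
    provider: str
    enable: bool = False  # documented default
    event_ids: frozenset = field(default_factory=frozenset)  # documented default: empty

    def allows(self, event_id: int) -> bool:
        # enable=True  -> only the listed IDs are collected
        # enable=False -> every ID except the listed ones is collected
        return (event_id in self.event_ids) == self.enable


def parse_event_filters(spec: str) -> dict:
    """Parse a comma-separated --event-filters value into {provider: EventFilter}."""
    filters = {}
    for entry in filter(None, (e.strip() for e in spec.split(","))):
        parts = entry.split(":")
        provider = parts[0]
        enable = len(parts) > 1 and parts[1].strip().lower() == "true"
        ids = parts[2] if len(parts) > 2 else ""
        event_ids = frozenset(int(i) for i in ids.split("+") if i)
        filters[provider] = EventFilter(provider, enable, event_ids)
    return filters


def is_collected(filters: dict, provider: str, event_id: int) -> bool:
    """A provider with no filter keeps every event its keywords/level allow."""
    f = filters.get(provider)
    return True if f is None else f.allows(event_id)
```

With the mixed-filtering scenario above, `is_collected` returns `False` for event 2 of `Microsoft-Windows-DotNETRuntime`, `True` for event 8 of `System.Threading.Tasks.TplEventSource`, and `True` for any event of `MyCustomProvider`.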

> [!NOTE]

> - Stopping the trace may take a long time (up to minutes) for large applications. The runtime needs to send over the type cache for all managed code that was captured in the trace.
@@ -259,6 +295,138 @@

> - When specifying a stopping event through the `--stopping-event-*` options, as the EventStream is being parsed asynchronously, there will be some events that pass through between the time a trace event matching the specified stopping event options is parsed and the EventPipeSession is stopped.

## dotnet-trace collect-linux

Collects diagnostic traces from .NET applications using Linux user_events as a transport layer. This command provides the same functionality as [`dotnet-trace collect`](#dotnet-trace-collect) but routes .NET runtime events through the Linux kernel's user_events subsystem before writing them to `.nettrace` files.
Member

I'd recommend the initial technology we refer to should be perf_events and try to explain that:

  • It's a Linux OS technology
  • It supports capturing a variety of events from kernel and user mode
  • It requires admin privileges
  • By default it captures events from all processes

We could mention that the .NET portion of those events are communicated using the user_events feature as a detail in the broader explanation.

> This command provides the same functionality as dotnet-trace collect

We probably don't want to say this is the 'same functionality as dotnet-trace collect' because it can do more. Instead we might say it supports including the same .NET events.

Member Author

@mdh1418 Aug 15, 2025

Thanks for the feedback! Reworded the introduction for the new verb


This transport approach enables automatic unification of user-space .NET events with kernel-space system events, since both are captured in the same kernel tracing infrastructure. Linux tools like `perf` and `ftrace` can monitor events in real-time while maintaining full compatibility with existing .NET profiling workflows.

### Prerequisites

- Linux kernel with `CONFIG_USER_EVENTS=y` support (kernel 6.4+)
- Appropriate permissions to access `/sys/kernel/tracing/user_events_data`
- .NET 10+

### Synopsis

```dotnetcli
dotnet-trace collect-linux
Member

I think we can prune away some of these options

  • buffersize - If OneCollect can reasonably pick the size then we may not need an option to explicitly set it.
  • diagnostics-port - I think we'd only need this in some advanced scenarios. Given that we could be profiling multiple processes maybe we'd even need multiple ports? I'd suggest we leave it out for now and wait to see what happens.
  • resume-runtime - if we don't have diagnostics port then we shouldn't need this.
  • event-filters - I'd suggest we leave this out for now to keep things simpler.
  • tracepoint-configs - can we leave this out? If the user cares about the tracepoint names that implies they are going to use some other tool besides dotnet-trace to record the events. But if that is true I'm not sure why they'd want dotnet-trace to be creating a nettrace file as well?

Member Author

Would there be a scenario where users are capturing events with another tool, but they want to start capturing runtime events from their .NET app as well, and just use dotnet-trace as the medium to start a user_events EventPipeSession?

Member

I can conceive of that scenario, but I'm suggesting we don't do anything to proactively support it right now to keep the scope smaller.

Member Author

Pruned and grouped the options into functionality (the previous order already deviates from dotnet-trace collect --help, and unless there's a reason to keep the options in some sort of alphabetical order, I think the functional order is easier to understand)

[--buffersize <size>]
[--clreventlevel <clreventlevel>]
[--clrevents <clrevents>]
[--format <Chromium|NetTrace|Speedscope>]
[-h|--help]
[--duration dd:hh:mm:ss]
[-n, --name <name>]
[--diagnostic-port]
[-o|--output <trace-file-path>]
[-p|--process-id <pid>]
[--profile <profile-name>]
Member

We'll need to decide what profiles are available and which events they each collect. I expect this is mostly the same as for the 'collect' verb, but cpu-sampling probably should be different. We also might want a thread-time profile that collects context switches.

If we do update these profiles, we should strongly consider also renaming the highly misleading "cpu-sampling" profile for the collect verb. Currently that profile collects thread-time information, not CPU samples.

Member Author

I lightly updated the profiles for both collect and collect-linux. Still up for refinement once I have a better understanding of how to mix/match the .NET events and linux perf events. I'm wondering if we should allow --profiles if we want to offer several orthogonal profiles and allow customers to mix/match. We already have some logic to mix a profile with the providers/keywords as well.

[--providers <list-of-comma-separated-providers>]
[-- <command>] (for target applications running .NET 10 or later)
[--show-child-io]
[--resume-runtime]
[--stopping-event-provider-name <stoppingEventProviderName>]
[--stopping-event-event-name <stoppingEventEventName>]
[--stopping-event-payload-filter <stoppingEventPayloadFilter>]
[--event-filters <list-of-comma-separated-event-filters>]
[--tracepoint-configs <list-of-comma-separated-tracepoint-configs>]
[--kernel-events <list-of-kernel-events>]
```

### Options

`dotnet-trace collect-linux` supports all the same options as [`dotnet-trace collect`](#dotnet-trace-collect), excluding `--dsrouter`, and additionally offers:

- **`--tracepoint-configs <list-of-comma-separated-tracepoint-configs>` (required)**

Defines the explicit mapping between EventPipe providers and kernel tracepoints. Each provider in `--providers` must have a corresponding entry in `--tracepoint-configs`.

**Format:** `ProviderName:<DefaultTracepointName>:<TracepointSets>`

Where:
- `ProviderName`: The EventPipe provider name (e.g., `Microsoft-Windows-DotNETRuntime`)
- `DefaultTracepointName`: Default tracepoint name for this provider (can be empty to require explicit assignment)
- `TracepointSets`: Semicolon-delimited list of `TracepointName=<EventIds>` entries
- `EventIds`: Plus-delimited event IDs to route to that tracepoint

> [!NOTE]
> All tracepoint names are automatically prefixed with the provider name to avoid collisions. For example, `gc_events` for the `Microsoft-Windows-DotNETRuntime` provider becomes `Microsoft_Windows_DotNETRuntime_gc_events`.

> [!TIP]
> Use `--event-filters` to disable specific events before they are routed to tracepoints. Event filtering happens before tracepoint routing - only events that pass the filter will be sent to their assigned tracepoints.

**Examples:**
```

Check failure on line 360 in docs/core/diagnostics/dotnet-trace.md (GitHub Actions / lint): MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```"] https://github.com/DavidAnson/markdownlint/blob/v0.38.0/doc/md031.md

# Scenario: All events from provider go to a default tracepoint
--tracepoint-configs "Microsoft-Windows-DotNETRuntime:dotnet_runtime"
# All enabled events from Microsoft-Windows-DotNETRuntime will be written to Microsoft_Windows_DotNETRuntime_dotnet_runtime

# Scenario: Split events by categories
--tracepoint-configs "Microsoft-Windows-DotNETRuntime::gc_events=1+2+3;jit_events=10+11+12"
# EventIDs 1, 2, and 3 will be written to Microsoft_Windows_DotNETRuntime_gc_events
# EventIDs 10, 11, and 12 will be written to Microsoft_Windows_DotNETRuntime_jit_events

# Multiple providers (comma-separated)
--tracepoint-configs "Microsoft-Windows-DotNETRuntime::gc_events=1+2+3,MyCustomProvider:custom_events"
# EventIds 1, 2, and 3 from Microsoft-Windows-DotNETRuntime will be written to Microsoft_Windows_DotNETRuntime_gc_events
# All enabled events from MyCustomProvider will be written to MyCustomProvider_custom_events
```

- **`--kernel-events <list-of-kernel-events>` (optional)**
Member

Perhaps we should call this perf-events or some other name? kernel-events to me implies that the kernel is generating the event yet some of these events in this list might be generated by user-mode code.

Member Author

Good point, changed to perf-event-tracepoints

Member

We should work with Beau to figure out what kind of event names can be supported. For example https://man7.org/linux/man-pages/man1/perf-record.1.html shows a whole bunch of different things can be specified as an event and I have no idea what part of that OneCollect handles.

I'm hoping we can at least support anything in /sys/kernel/tracing/available_events as well as the symbolic PMU events like cpu-cycles.

Member Author

It seems like all events under /sys/kernel/tracing/available_events can be enabled, so I changed the wording to suggest that all of those events found there or found categorically under /sys/kernel/tracing/events/ can be enabled through this option --perf-event-tracepoints.


A comma-separated list of kernel event categories to include in the trace. These events are automatically grouped into kernel-named tracepoints. Available categories include:

| Category | Description | Linux Tracepoints |
|----------|-------------|-------------------|
| `syscalls` | System call entry/exit events | `syscalls:sys_enter_*`, `syscalls:sys_exit_*` |
| `sched` | Process scheduling events | `sched:sched_switch`, `sched:sched_wakeup` |
| `net` | Network-related events | `net:netif_rx`, `net:net_dev_xmit` |
| `fs` | Filesystem I/O events | `ext4:*`, `vfs:*` |
| `mm` | Memory management events | `kmem:*`, `vmscan:*` |

Member

If someone specified "sched:sched_wakeup,sched:sched_switch" what provider name, event name, and field lists would we expect to show up in the nettrace file? (This level of detail may not go into the docs, but we should understand it ourselves to decide what info should go in the docs)

These events correspond to Linux kernel tracepoints documented in the [Linux kernel tracing documentation](https://www.kernel.org/doc/html/latest/trace/index.html). For more details on available tracepoints, see [ftrace](https://www.kernel.org/doc/html/latest/trace/ftrace.html) and [tracepoints](https://www.kernel.org/doc/html/latest/trace/tracepoints.html).

Example: `--kernel-events syscalls,sched,net`

### Linux Integration

**Tracepoint Configuration Requirements:**

- **Mandatory Mapping**: Every provider must be explicitly mapped via `--tracepoint-configs` to a default tracepoint, to one or more exclusive tracepoint sets, or to both
- **Tracepoint Isolation**: Each tracepoint can only receive events from one provider
- **Event Routing**: Different event IDs within a provider can be routed to different tracepoints for granular control
- **Automatic Prefixing**: All tracepoint names are prefixed with the provider name to avoid collisions
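
To make the routing rules above concrete, here is a rough Python sketch of how a `--tracepoint-configs` value (`ProviderName:<DefaultTracepointName>:<TracepointSets>`) could be parsed and an event ID resolved to its prefixed tracepoint name. The function names are hypothetical. The prefix rule is an assumption inferred from the single documented example (`Microsoft_Windows_DotNETRuntime_gc_events`), where hyphens in the provider name become underscores; how other characters such as dots are handled is not stated in the proposal.

```python
def parse_tracepoint_configs(spec: str) -> dict:
    """Parse a --tracepoint-configs value into {provider: (default_name, {event_id: name})}."""
    configs = {}
    for entry in filter(None, (e.strip() for e in spec.split(","))):
        parts = entry.split(":", 2)
        provider = parts[0]
        default_name = parts[1] if len(parts) > 1 else ""
        routes = {}
        if len(parts) > 2 and parts[2]:
            # TracepointSets: semicolon-delimited "TracepointName=<EventIds>"
            for tracepoint_set in parts[2].split(";"):
                name, _, ids = tracepoint_set.partition("=")
                for event_id in ids.split("+"):
                    if event_id:
                        routes[int(event_id)] = name
        configs[provider] = (default_name, routes)
    return configs


def resolve_tracepoint(configs: dict, provider: str, event_id: int):
    """Return the prefixed tracepoint name for an event, or None if it is unmapped."""
    if provider not in configs:
        return None  # mandatory mapping: a provider without an entry has no tracepoint
    default_name, routes = configs[provider]
    name = routes.get(event_id, default_name)
    if not name:
        return None  # empty default and no explicit set for this event ID
    # Assumed prefix rule: hyphens in the provider name become underscores.
    return f"{provider.replace('-', '_')}_{name}"
```

Because every resolved name starts with its provider's prefix, two providers cannot end up sharing a tracepoint in this sketch, which mirrors the tracepoint isolation requirement.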

**Kernel Integration Points:**

The kernel tracepoints can be accessed through standard Linux tracing interfaces:

- **ftrace**: `/sys/kernel/tracing/events/user_events/`
- **perf**: Use `perf list user_events*` to see available events
- **System monitoring tools**: Any tool that can consume Linux tracepoints
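
As a small illustration of the ftrace integration point, the following Python sketch lists the tracepoints registered under the `user_events` directory named above. It assumes tracefs is mounted at `/sys/kernel/tracing` and that each registered event appears as a subdirectory there; the helper name `list_user_event_tracepoints` is hypothetical, and reading the directory typically requires elevated permissions.

```python
from pathlib import Path


def list_user_event_tracepoints(events_dir="/sys/kernel/tracing/events/user_events"):
    """Return the names of registered user_events tracepoints, or [] if unavailable."""
    root = Path(events_dir)
    try:
        # Each event is expected to be a subdirectory; plain files are skipped.
        return sorted(p.name for p in root.iterdir() if p.is_dir())
    except (FileNotFoundError, NotADirectoryError, PermissionError):
        # Kernel without user_events, tracefs not mounted, or insufficient rights.
        return []


if __name__ == "__main__":
    for name in list_user_event_tracepoints():
        print(name)
```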

### Examples
Member

Lower in this doc are some examples of using dotnet-trace collect in various situations. No need just yet, but we'd probably want to update or add to those examples for the collect-linux functionality once we've clarified what it will be.


```dotnetcli
# All runtime events to one tracepoint
dotnet-trace collect-linux --process-id 1234 \
--providers Microsoft-Windows-DotNETRuntime:0x8000:5 \
--kernel-events syscalls,sched \
--tracepoint-configs "Microsoft-Windows-DotNETRuntime:dotnet_runtime"

# Split runtime events by category
dotnet-trace collect-linux --process-id 1234 \
--providers Microsoft-Windows-DotNETRuntime:0x8001:5 \
--kernel-events syscalls,sched,net,fs \
--tracepoint-configs "Microsoft-Windows-DotNETRuntime::exception_events=80;gc_events=1+2"

# Multiple providers
dotnet-trace collect-linux --process-id 1234 \
--providers "Microsoft-Windows-DotNETRuntime:0x8001:5,MyCustomProvider:0xFFFFFFFF:5" \
--tracepoint-configs "Microsoft-Windows-DotNETRuntime:dotnet_runtime,MyCustomProvider:custom_events"
```

## dotnet-trace convert

Converts `nettrace` traces to alternate formats for use with alternate trace analysis tools.
@@ -214,7 +214,7 @@ into other formats, such as Chromium or [Speedscope](https://www.speedscope.app/
Trace completed.
```

dotnet-trace uses the [conventional text format](#conventions-for-describing-provider-configuration) for describing provider configuration in
dotnet-trace uses a comma-delimited variant of the [conventional text format](#conventions-for-describing-provider-configuration) for describing provider configuration in
the `--providers` argument. For more options on how to take traces using dotnet-trace, see the
[dotnet-trace docs](./dotnet-trace.md#collect-a-trace-with-dotnet-trace).
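
For readers unfamiliar with the comma-delimited form, a minimal Python sketch of splitting a `--providers` value into its parts might look like this. It assumes each entry has the shape `Name[:Keywords[:Level]]`, with keywords in hexadecimal and a numeric level, as in the examples used in this PR; it ignores any further fields, and `parse_providers` is a hypothetical helper, not part of dotnet-trace.

```python
def parse_providers(spec: str) -> list:
    """Parse 'Name[:Keywords[:Level]]' entries separated by commas."""
    providers = []
    for entry in filter(None, (e.strip() for e in spec.split(","))):
        parts = entry.split(":")
        name = parts[0]
        # Keywords are assumed to be hexadecimal, with or without a 0x prefix.
        keywords = int(parts[1], 16) if len(parts) > 1 and parts[1] else None
        # Level is assumed to be numeric here; named levels are not handled.
        level = int(parts[2]) if len(parts) > 2 and parts[2] else None
        providers.append((name, keywords, level))
    return providers
```

For example, `parse_providers("Microsoft-Windows-DotNETRuntime:0x8001:5,MyCustomProvider")` yields one entry with keywords `0x8001` and level `5`, and one entry with neither set.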
