[Diangostics][dotnet-trace] Add collect-linux verb #47894
base: main
Changes from 3 commits
a666306
a2e4f74
e1a3ec7
6c9fdb6
728afed
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -43,7 +43,7 @@ | |
* Is a cross-platform .NET Core tool. | ||
* Enables the collection of .NET Core traces of a running process without a native profiler. | ||
* Is built on [`EventPipe`](./eventpipe.md) of the .NET Core runtime. | ||
* Delivers the same experience on Windows, Linux, or macOS. | ||
* On Linux, provides additional integration with kernel user_events for native tracing tool compatibility. | ||
|
||
## Options | ||
|
||
|
@@ -55,15 +55,12 @@ | |
|
||
Displays the version of the dotnet-trace utility. | ||
|
||
- **`--duration`** | ||
|
||
How long to run the trace. `--duration 00:00:00:05` will run it for 5 seconds. | ||
|
||
## Commands | ||
|
||
| Command | | ||
|-----------------------------------------------------------| | ||
| [dotnet-trace collect](#dotnet-trace-collect) | | ||
| [dotnet-trace collect-linux](#dotnet-trace-collect-linux) | | ||
| [dotnet-trace convert](#dotnet-trace-convert) | | ||
| [dotnet-trace ps](#dotnet-trace-ps) | | ||
| [dotnet-trace list-profiles](#dotnet-trace-list-profiles) | | ||
|
@@ -76,16 +73,27 @@ | |
### Synopsis | ||
|
||
```dotnetcli | ||
dotnet-trace collect [--buffersize <size>] [--clreventlevel <clreventlevel>] [--clrevents <clrevents>] | ||
dotnet-trace collect | ||
[--buffersize <size>] | ||
[--clreventlevel <clreventlevel>] | ||
[--clrevents <clrevents>] | ||
[--dsrouter <ios|ios-sim|android|android-emu>] | ||
[--format <Chromium|NetTrace|Speedscope>] [-h|--help] [--duration dd:hh:mm:ss] | ||
[-n, --name <name>] [--diagnostic-port] [-o|--output <trace-file-path>] [-p|--process-id <pid>] | ||
[--profile <profile-name>] [--providers <list-of-comma-separated-providers>] | ||
[--format <Chromium|NetTrace|Speedscope>] | ||
[-h|--help] | ||
[--duration dd:hh:mm:ss] | ||
[-n, --name <name>] | ||
[--diagnostic-port] | ||
[-o|--output <trace-file-path>] | ||
[-p|--process-id <pid>] | ||
[--profile <profile-name>] | ||
[--providers <list-of-comma-separated-providers>] | ||
[-- <command>] (for target applications running .NET 5 or later) | ||
[--show-child-io] [--resume-runtime] | ||
[--show-child-io] | ||
[--resume-runtime] | ||
[--stopping-event-provider-name <stoppingEventProviderName>] | ||
[--stopping-event-event-name <stoppingEventEventName>] | ||
[--stopping-event-payload-filter <stoppingEventPayloadFilter>] | ||
[--event-filters <list-of-comma-separated-event-filters>] | ||
|
||
``` | ||
|
||
### Options | ||
|
@@ -158,7 +166,7 @@ | |
|
||
- **`--dsrouter {ios|ios-sim|android|android-emu}`**
|
||
Starts [dotnet-dsrouter](dotnet-dsrouter.md) and connects to it. Requires [dotnet-dsrouter](dotnet-dsrouter.md) to be installed. Run `dotnet-dsrouter -h` for more information. | ||
Starts [dotnet-dsrouter](dotnet-dsrouter.md) and connects to it. Requires [dotnet-dsrouter](dotnet-dsrouter.md) to be installed. Run `dotnet-dsrouter -h` for more information. | ||
|
||
- **`--format {Chromium|NetTrace|Speedscope}`** | ||
|
||
|
@@ -204,11 +212,11 @@ | |
|
||
A named pre-defined set of provider configurations that allows common tracing scenarios to be specified succinctly. The following profiles are available: | ||
|
||
| Profile | Description | | ||
|---------|-------------| | ||
|`cpu-sampling`|Useful for tracking CPU usage and general .NET runtime information. This is the default option if no profile or providers are specified.| | ||
|`gc-verbose`|Tracks GC collections and samples object allocations.| | ||
|`gc-collect`|Tracks GC collections only at very low overhead.| | ||
| Profile | Description | | ||
|---------|-------------| | ||
|`cpu-sampling`|Useful for tracking CPU usage and general .NET runtime information. This is the default option if no profile or providers are specified.| | ||
|`gc-verbose`|Tracks GC collections and samples object allocations.| | ||
|`gc-collect`|Tracks GC collections only at very low overhead.| | ||
|
||
- **`--providers <list-of-comma-separated-providers>`** | ||
|
||
|
@@ -249,6 +257,34 @@ | |
|
||
A string, parsed as [payload_field_name]:[payload_field_value] pairs separated by commas, that will stop the trace upon hitting an event containing all specified payload pairs. Requires `--stopping-event-provider-name` and `--stopping-event-event-name` to be set. For example, use `--stopping-event-provider-name Microsoft-Windows-DotNETRuntime --stopping-event-event-name Method/JittingStarted --stopping-event-payload-filter MethodNameSpace:Program,MethodName:OnButtonClick` to stop the trace upon the first `Method/JittingStarted` event for the method `OnButtonClick` in the `Program` namespace emitted by the `Microsoft-Windows-DotNETRuntime` event provider.
|
||
- **`--event-filters <list-of-comma-separated-event-filters>`** | ||
|
||
Defines an additional optional filter for each provider's events. When no `--event-filters` is specified for a provider, all events allowed by the provider's keywords and level configuration are collected. Event filters provide additional granular control beyond the keyword/level filtering. | ||
|
||
**Format:** `ProviderName:<Enable>:<EventIds>` | ||
|
||
Where: | ||
- `ProviderName`: The EventPipe provider name (e.g., `Microsoft-Windows-DotNETRuntime`) | ||
- `Enable`: Boolean value indicating whether the listed `EventIds` are enabled (`true`) or disabled (`false`). Defaults to `false`.
- `EventIds`: Plus-delimited event IDs to enable or disable. Defaults to empty.
|
||
**Examples:** | ||
``` | ||
|
||
# Scenario: Disable specific events from Microsoft-Windows-DotNETRuntime | ||
--event-filters "Microsoft-Windows-DotNETRuntime:false:1+2+3+4+5+6+7+8+9" | ||
|
||
# Scenario: Enable specific events from a provider | ||
--event-filters "Microsoft-Windows-DotNETRuntime:true:80+129+130+250" | ||
# Only events 80, 129, 130, and 250 will be collected from this provider (others are filtered out) | ||
|
||
# Scenario: Multiple providers with mixed filtering - some providers have no filters | ||
--providers "Microsoft-Windows-DotNETRuntime:0xFFFFFFFF:5,System.Threading.Tasks.TplEventSource:0xFFFFFFFF:5,MyCustomProvider:0xFFFFFFFF:5" | ||
--event-filters "Microsoft-Windows-DotNETRuntime:false:1+2+3,System.Threading.Tasks.TplEventSource:true:7+8+9" | ||
# Microsoft-Windows-DotNETRuntime: All events EXCEPT 1,2,3 are collected | ||
# System.Threading.Tasks.TplEventSource: ONLY events 7,8,9 are collected | ||
# MyCustomProvider: ALL events are collected (no filter specified - follows provider keywords/level) | ||
``` | ||
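The filter semantics described above can be sketched as a small shell helper. This is an illustrative sketch only, not part of `dotnet-trace`: the function name `event_collected` is hypothetical, and it assumes that `Enable` set to `true` acts as an allow-list and `false` as a deny-list, as the examples suggest.

```shell
# event_collected FILTER EVENT_ID   (hypothetical helper, not a dotnet-trace command)
# FILTER has the form ProviderName:<Enable>:<EventIds>.
# Prints "yes" if EVENT_ID would pass the filter, "no" otherwise.
event_collected() {
    filter=$1
    event_id=$2

    # Drop "ProviderName:"; a bare provider name means no filter fields were given.
    case $filter in
        *:*) rest=${filter#*:} ;;
        *)   rest= ;;
    esac

    # Split "<Enable>:<EventIds>".
    case $rest in
        *:*) enable=${rest%%:*}; ids=${rest#*:} ;;
        *)   enable=$rest; ids= ;;
    esac
    [ -n "$enable" ] || enable=false    # documented default

    # Check whether the event ID appears in the plus-delimited list.
    listed=no
    case "+$ids+" in
        *"+$event_id+"*) listed=yes ;;
    esac

    if [ "$enable" = true ]; then
        echo "$listed"                  # assumption: only listed IDs are collected
    elif [ "$listed" = yes ]; then
        echo no                         # assumption: listed IDs are dropped
    else
        echo yes
    fi
}
```

For instance, `event_collected 'Microsoft-Windows-DotNETRuntime:false:1+2+3' 2` prints `no`, while the same call with event ID `80` prints `yes`.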
|
||
> [!NOTE] | ||
|
||
> - Stopping the trace may take a long time (up to minutes) for large applications. The runtime needs to send over the type cache for all managed code that was captured in the trace. | ||
|
@@ -259,6 +295,138 @@ | |
|
||
> - When specifying a stopping event through the `--stopping-event-*` options, as the EventStream is being parsed asynchronously, there will be some events that pass through between the time a trace event matching the specified stopping event options is parsed and the EventPipeSession is stopped. | ||
|
||
## dotnet-trace collect-linux | ||
|
||
Collects diagnostic traces from .NET applications using Linux user_events as a transport layer. This command provides the same functionality as [`dotnet-trace collect`](#dotnet-trace-collect) but routes .NET runtime events through the Linux kernel's user_events subsystem before writing them to `.nettrace` files. | ||
Review comment: I'd recommend the initial technology we refer to should be perf_events and try to explain that:
We could mention that the .NET portion of those events are communicated using the user_events feature as a detail in the broader explanation.
We probably don't want to say this is the 'same functionality as dotnet-trace collect' because it can do more. Instead we might say it supports including the same .NET events.
Review comment: Thanks for the feedback! Reworded the introduction for the new verb
||
|
||
This transport approach enables automatic unification of user-space .NET events with kernel-space system events, since both are captured in the same kernel tracing infrastructure. Linux tools like `perf` and `ftrace` can monitor events in real-time while maintaining full compatibility with existing .NET profiling workflows. | ||
|
||
|
||
### Prerequisites | ||
|
||
- Linux kernel with `CONFIG_USER_EVENTS=y` support (kernel 6.4+) | ||
- Appropriate permissions to access `/sys/kernel/tracing/user_events_data` | ||
|
||
- .NET 10+ | ||
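A quick way to check these prerequisites on a target machine might look like the following. It is a sketch under assumptions: the helper name `kernel_at_least` is hypothetical, the version check only inspects the `major.minor` prefix reported by `uname -r`, and the presence of `/sys/kernel/tracing/user_events_data` is used as a stand-in for `CONFIG_USER_EVENTS=y`. It doesn't check the .NET version.

```shell
# kernel_at_least MIN_MAJOR MIN_MINOR RELEASE   (hypothetical helper)
# Succeeds if RELEASE (for example "6.8.0-45-generic") is at least MIN_MAJOR.MIN_MINOR.
kernel_at_least() {
    min_major=$1
    min_minor=$2
    release=$3

    major=${release%%.*}       # text before the first dot
    rest=${release#*.}         # text after the first dot
    minor=${rest%%[!0-9]*}     # leading digits of the remainder

    {
        [ "$major" -gt "$min_major" ] ||
            { [ "$major" -eq "$min_major" ] && [ "$minor" -ge "$min_minor" ]; }
    } 2>/dev/null              # unparsable versions simply count as "too old"
}

if kernel_at_least 6 4 "$(uname -r)"; then
    echo "kernel version: 6.4 or later"
else
    echo "kernel version: older than 6.4 (or not recognized)"
fi

# Assumption: write access is what "appropriate permissions" means here.
data_file=/sys/kernel/tracing/user_events_data
if [ ! -e "$data_file" ]; then
    echo "user_events: $data_file not found"
elif [ -w "$data_file" ]; then
    echo "user_events: $data_file is writable"
else
    echo "user_events: $data_file exists but is not writable by this user"
fi
```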
|
||
### Synopsis | ||
|
||
```dotnetcli | ||
dotnet-trace collect-linux | ||
Review comment: I think we can prune away some of these options
Review comment: Would there be a scenario where users are capturing events with another tool, but they want to start capturing runtime events from their .NET app as well, and just use dotnet-trace as the medium to start a user_events EventPipeSession?
Review comment: I can conceive of that scenario, but I'm suggesting we don't do anything to proactively support it right now to keep the scope smaller.
Review comment: Pruned and grouped the options into functionality (the previous order already deviates from
||
[--buffersize <size>] | ||
[--clreventlevel <clreventlevel>] | ||
[--clrevents <clrevents>] | ||
[--format <Chromium|NetTrace|Speedscope>] | ||
[-h|--help] | ||
[--duration dd:hh:mm:ss] | ||
[-n, --name <name>] | ||
[--diagnostic-port] | ||
[-o|--output <trace-file-path>] | ||
[-p|--process-id <pid>] | ||
|
||
[--profile <profile-name>] | ||
Review comment: We'll need to decide what profiles are available and which events they each collect. I expect this is mostly the same as for the 'collect' verb, but cpu-sampling probably should be different. We also might want a thread-time profile that collects context switches. If we do update these profiles, we should strongly consider also renaming the highly misleading "cpu-sampling" profile for the collect verb. Currently that profile collects thread-time information, not CPU samples.
Review comment: I lightly updated the profiles for both collect and collect-linux. Still up for refinement once I have a better understanding of how to mix/match the .NET events and linux perf events. I'm wondering if we should allow
||
[--providers <list-of-comma-separated-providers>] | ||
[-- <command>] (for target applications running .NET 10 or later) | ||
[--show-child-io] | ||
[--resume-runtime] | ||
[--stopping-event-provider-name <stoppingEventProviderName>] | ||
[--stopping-event-event-name <stoppingEventEventName>] | ||
[--stopping-event-payload-filter <stoppingEventPayloadFilter>] | ||
|
||
[--event-filters <list-of-comma-separated-event-filters>] | ||
[--tracepoint-configs <list-of-comma-separated-tracepoint-configs>] | ||
[--kernel-events <list-of-kernel-events>] | ||
``` | ||
|
||
### Options | ||
|
||
`dotnet-trace collect-linux` supports all the same options as [`dotnet-trace collect`](#dotnet-trace-collect), excluding `--dsrouter`, and additionally offers: | ||
|
||
|
||
- **`--tracepoint-configs <list-of-comma-separated-tracepoint-configs>` (required)** | ||
|
||
|
||
Defines the explicit mapping between EventPipe providers and kernel tracepoints. Each provider in `--providers` must have a corresponding entry in `--tracepoint-configs`.
|
||
**Format:** `ProviderName:<DefaultTracepointName>:<TracepointSets>` | ||
|
||
Where: | ||
- `ProviderName`: The EventPipe provider name (e.g., `Microsoft-Windows-DotNETRuntime`) | ||
- `DefaultTracepointName`: Default tracepoint name for this provider (can be empty to require explicit assignment) | ||
- `TracepointSets`: Semicolon-delimited list of `TracepointName=<EventIds>` entries
- `EventIds`: Plus-delimited event IDs to route to that tracepoint | ||
|
||
> [!NOTE] | ||
> All tracepoint names are automatically prefixed with the provider name to avoid collisions. For example, `gc_events` for the `Microsoft-Windows-DotNETRuntime` provider becomes `Microsoft_Windows_DotNETRuntime_gc_events`. | ||
|
||
> [!TIP] | ||
> Use `--event-filters` to disable specific events before they are routed to tracepoints. Event filtering happens before tracepoint routing - only events that pass the filter will be sent to their assigned tracepoints. | ||
|
||
**Examples:** | ||
``` | ||
|
||
# Scenario: All events from provider go to a default tracepoint | ||
--tracepoint-configs "Microsoft-Windows-DotNETRuntime:dotnet_runtime" | ||
# All enabled events from Microsoft-Windows-DotNETRuntime will be written to Microsoft_Windows_DotNETRuntime_dotnet_runtime | ||
|
||
# Scenario: Split events by categories | ||
--tracepoint-configs "Microsoft-Windows-DotNETRuntime::gc_events=1+2+3;jit_events=10+11+12" | ||
# EventIDs 1, 2, and 3 will be written to Microsoft_Windows_DotNETRuntime_gc_events | ||
# EventIDs 10, 11, and 12 will be written to Microsoft_Windows_DotNETRuntime_jit_events | ||
|
||
# Multiple providers (comma-separated) | ||
--tracepoint-configs "Microsoft-Windows-DotNETRuntime::gc_events=1+2+3,MyCustomProvider:custom_events" | ||
# EventIds 1, 2, and 3 from Microsoft-Windows-DotNETRuntime will be written to Microsoft_Windows_DotNETRuntime_gc_events | ||
# All enabled events from MyCustomProvider will be written to MyCustomProvider_custom_events | ||
``` | ||
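To make the routing rules concrete, the sketch below resolves which tracepoint a given event ID would land in for a single `--tracepoint-configs` entry. The helper names `tracepoint_name` and `route_event` are hypothetical, and the prefixing rule is inferred from the example above (hyphens in the provider name become underscores); how other characters are handled is not covered here.

```shell
# tracepoint_name PROVIDER NAME   (hypothetical helper)
# Prints the provider-prefixed tracepoint name, with hyphens replaced by underscores.
tracepoint_name() {
    printf '%s_%s\n' "$(printf '%s' "$1" | tr '-' '_')" "$2"
}

# route_event ENTRY EVENT_ID   (hypothetical helper)
# ENTRY has the form ProviderName:<DefaultTracepointName>:<TracepointSets>.
# Prints the full tracepoint name for EVENT_ID, or nothing if it is unassigned.
route_event() {
    entry=$1
    event_id=$2

    provider=${entry%%:*}
    case $entry in
        *:*) rest=${entry#*:} ;;
        *)   rest= ;;
    esac
    case $rest in
        *:*) default=${rest%%:*}; sets=${rest#*:} ;;
        *)   default=$rest; sets= ;;
    esac

    # Start from the default tracepoint (possibly empty).
    target=$default

    # Walk the semicolon-delimited TracepointName=<EventIds> sets;
    # in this sketch the last matching set wins.
    old_ifs=$IFS
    IFS=';'
    for tp_set in $sets; do
        name=${tp_set%%=*}
        ids=${tp_set#*=}
        case "+$ids+" in
            *"+$event_id+"*) target=$name ;;
        esac
    done
    IFS=$old_ifs

    if [ -n "$target" ]; then
        tracepoint_name "$provider" "$target"
    fi
}
```

With the entry `Microsoft-Windows-DotNETRuntime::gc_events=1+2+3;jit_events=10+11+12`, calling `route_event` for event ID `11` prints `Microsoft_Windows_DotNETRuntime_jit_events`, and an ID outside both sets prints nothing because the default tracepoint is empty.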
|
||
- **`--kernel-events <list-of-kernel-events>` (optional)** | ||
Review comment: Perhaps we should call this
Review comment: Good point, changed to
Review comment: We should work with Beau to figure out what kind of event names can be supported. For example https://man7.org/linux/man-pages/man1/perf-record.1.html shows a whole bunch of different things can be specified as an event and I have no idea what part of that OneCollect handles. I'm hoping we can at least support anything in /sys/kernel/tracing/available_events as well as the symbolic PMU events like cpu-cycles.
Review comment: It seems like all events under /sys/kernel/tracing/available_events can be enabled, so I changed the wording to suggest that all of those events found there or found categorically under /sys/kernel/tracing/events/ can be enabled through this option
||
|
||
A comma-separated list of kernel event categories to include in the trace. These events are automatically grouped into kernel-named tracepoints. Available categories include: | ||
|
||
|
||
| Category | Description | Linux Tracepoints | | ||
|----------|-------------|-------------------| | ||
| `syscalls` | System call entry/exit events | `syscalls:sys_enter_*`, `syscalls:sys_exit_*` | | ||
| `sched` | Process scheduling events | `sched:sched_switch`, `sched:sched_wakeup` | | ||
| `net` | Network-related events | `net:netif_rx`, `net:net_dev_xmit` | | ||
| `fs` | Filesystem I/O events | `ext4:*`, `vfs:*` | | ||
| `mm` | Memory management events | `kmem:*`, `vmscan:*` | | ||
|
||
Review comment: If someone specified "sched:sched_wakeup,sched:sched_switch" what provider name, event name, and field lists would we expect to show up in the nettrace file? (This level of detail may not go into the docs, but we should understand it ourselves to decide what info should go in the docs)
||
These events correspond to Linux kernel tracepoints documented in the [Linux kernel tracing documentation](https://www.kernel.org/doc/html/latest/trace/index.html). For more details on available tracepoints, see [ftrace](https://www.kernel.org/doc/html/latest/trace/ftrace.html) and [tracepoints](https://www.kernel.org/doc/html/latest/trace/tracepoints.html). | ||
|
||
|
||
Example: `--kernel-events syscalls,sched,net` | ||
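To see which concrete tracepoints a pattern such as `sched:*` would cover on a given machine, one option is to match it against the kernel's list of available events. This is a minimal sketch, assuming the list lives at `/sys/kernel/tracing/available_events` with one `subsystem:event` name per line. The helper name `match_tracepoints` is hypothetical, and reading the file typically requires elevated permissions.

```shell
# match_tracepoints PATTERN [FILE]   (hypothetical helper)
# Prints every line of FILE that matches the shell glob PATTERN.
# FILE defaults to the kernel's list of available tracepoints.
match_tracepoints() {
    pattern=$1
    file=${2:-/sys/kernel/tracing/available_events}

    while IFS= read -r line; do
        case $line in
            $pattern) echo "$line" ;;   # unquoted on purpose so it acts as a glob
        esac
    done < "$file"
}
```

For example, `match_tracepoints 'syscalls:sys_enter_*'` would print the system call entry tracepoints known to the running kernel.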
|
||
### Linux Integration | ||
|
||
|
||
**Tracepoint Configuration Requirements:** | ||
|
||
- **Mandatory Mapping**: Every provider must be explicitly mapped via `--tracepoint-configs` to a default tracepoint, to one or more exclusive tracepoint sets, or to both
- **Tracepoint Isolation**: Each tracepoint can only receive events from one provider | ||
- **Event Routing**: Different event IDs within a provider can be routed to different tracepoints for granular control | ||
- **Automatic Prefixing**: All tracepoint names are prefixed with the provider name to avoid collisions | ||
|
||
**Kernel Integration Points:** | ||
|
||
The kernel tracepoints can be accessed through standard Linux tracing interfaces: | ||
|
||
- **ftrace**: `/sys/kernel/tracing/events/user_events/` | ||
- **perf**: Use `perf list user_events*` to see available events | ||
- **System monitoring tools**: Any tool that can consume Linux tracepoints | ||
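As a rough illustration of the ftrace entry point, the helper below lists the tracepoints registered under the `user_events` directory. It is a sketch, assuming each registered event appears as a subdirectory there (as is usual for tracefs event groups). `list_user_events` is a hypothetical name, and reading tracefs typically requires elevated permissions.

```shell
# list_user_events [DIR]   (hypothetical helper)
# Prints the name of each event directory under DIR, one per line.
# DIR defaults to the user_events group in tracefs.
list_user_events() {
    dir=${1:-/sys/kernel/tracing/events/user_events}

    if [ ! -d "$dir" ]; then
        echo "no user_events directory at $dir" >&2
        return 1
    fi

    for entry in "$dir"/*/; do
        [ -d "$entry" ] || continue   # skip the literal pattern when nothing matches
        basename "$entry"
    done
}
```

While a trace configured with `--tracepoint-configs "Microsoft-Windows-DotNETRuntime:dotnet_runtime"` is running, the output would be expected to include `Microsoft_Windows_DotNETRuntime_dotnet_runtime`, following the prefixing rule described earlier.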
|
||
### Examples | ||
Review comment: Lower in this doc are some examples of using dotnet-trace collect in various situations. No need just yet, but we'd probably want to update or add to those examples for the collect-linux functionality once we've clarified what it will be.
||
|
||
```dotnetcli | ||
# All runtime events to one tracepoint | ||
dotnet-trace collect-linux --process-id 1234 \ | ||
--providers Microsoft-Windows-DotNETRuntime:0x8000:5 \ | ||
--kernel-events syscalls,sched \ | ||
--tracepoint-configs "Microsoft-Windows-DotNETRuntime:dotnet_runtime" | ||
|
||
# Split runtime events by category | ||
dotnet-trace collect-linux --process-id 1234 \ | ||
--providers Microsoft-Windows-DotNETRuntime:0x8001:5 \ | ||
--kernel-events syscalls,sched,net,fs \ | ||
--tracepoint-configs "Microsoft-Windows-DotNETRuntime::exception_events=80;gc_events=1+2" | ||
|
||
# Multiple providers | ||
dotnet-trace collect-linux --process-id 1234 \ | ||
--providers "Microsoft-Windows-DotNETRuntime:0x8001:5,MyCustomProvider:0xFFFFFFFF:5" \ | ||
--tracepoint-configs "Microsoft-Windows-DotNETRuntime:dotnet_runtime,MyCustomProvider:custom_events" | ||
``` | ||
|
||
## dotnet-trace convert | ||
|
||
Converts `nettrace` traces to alternate formats for use with alternate trace analysis tools. | ||
|
Review comment: We might want to tweak the wording a bit later, but not focusing on this for the moment :)