Added new page for automated analysis, as well as screenshots #29406

Merged
merged 15 commits into from
May 19, 2025
25 changes: 15 additions & 10 deletions config/_default/menus/main.en.yaml
@@ -4093,56 +4093,61 @@ menu:
   parent: profiler
   identifier: profiler_compare
   weight: 5
+- name: Automated Analysis
+  url: profiler/automated_analysis
+  parent: profiler
+  identifier: profiler_automated_analysis
+  weight: 6
 - name: Profiler Troubleshooting
   url: profiler/profiler_troubleshooting/
   parent: profiler
   identifier: profiler_troubleshooting
-  weight: 6
+  weight: 7
 - name: Java
   url: profiler/profiler_troubleshooting/java/
   parent: profiler_troubleshooting
   identifier: profiler_troubleshooting_java
-  weight: 601
+  weight: 701
 - name: Python
   url: profiler/profiler_troubleshooting/python/
   parent: profiler_troubleshooting
   identifier: profiler_troubleshooting_python
-  weight: 602
+  weight: 702
 - name: Go
   url: profiler/profiler_troubleshooting/go/
   parent: profiler_troubleshooting
   identifier: profiler_troubleshooting_go
-  weight: 603
+  weight: 703
 - name: Ruby
   url: profiler/profiler_troubleshooting/ruby/
   parent: profiler_troubleshooting
   identifier: profiler_troubleshooting_ruby
-  weight: 604
+  weight: 704
 - name: Node.js
   url: profiler/profiler_troubleshooting/nodejs/
   parent: profiler_troubleshooting
   identifier: profiler_troubleshooting_nodejs
-  weight: 605
+  weight: 705
 - name: .NET
   url: profiler/profiler_troubleshooting/dotnet/
   parent: profiler_troubleshooting
   identifier: profiler_troubleshooting_dotnet
-  weight: 606
+  weight: 706
 - name: PHP
   url: profiler/profiler_troubleshooting/php/
   parent: profiler_troubleshooting
   identifier: profiler_troubleshooting_php
-  weight: 607
+  weight: 707
 - name: C/C++/Rust
   url: profiler/profiler_troubleshooting/ddprof/
   parent: profiler_troubleshooting
   identifier: profiler_troubleshooting_linux
-  weight: 608
+  weight: 708
 - name: Guides
   url: profiler/guide/
   parent: profiler
   identifier: profiler_guides
-  weight: 7
+  weight: 8
 - name: Database Monitoring
   url: database_monitoring/
   pre: database-2
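
The weight changes in this hunk appear to follow a simple convention: a child entry's weight is its parent's weight times 100 plus the child's position, so moving Profiler Troubleshooting from weight 6 to 7 shifts its children from 601-608 to 701-708. A minimal Python sketch of that arithmetic, assuming the convention holds (`child_weights` is a hypothetical helper for illustration, not part of the documentation tooling):

```python
def child_weights(parent_weight: int, child_count: int) -> list[int]:
    """Return child weights parent*100 + 1 through parent*100 + child_count.

    Assumption: children are numbered from 1 in menu order, as the
    troubleshooting entries in this hunk suggest.
    """
    return [parent_weight * 100 + position for position in range(1, child_count + 1)]


# Eight troubleshooting pages (Java through C/C++/Rust) under a parent
# whose weight moves from 6 to 7.
old_weights = child_weights(6, 8)
new_weights = child_weights(7, 8)
```

Under this assumption, inserting a new sibling ahead of a parent entry means renumbering that parent and all of its children, which matches the ten replaced `weight` lines in the hunk.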
72 changes: 72 additions & 0 deletions content/en/profiler/automated_analysis.md
@@ -0,0 +1,72 @@
---
title: Automated Analysis
Contributor

High-level question, but what's the reason we call this Automated Analysis instead of Recommendations like we have for APM, CCM, DBM?

Contributor Author

I've pushed for that in the past, but many of these insights are missing a vital piece needed to make them a Recommendation: always having a clear impact statement.

We can say that the findings here can help with a given issue, but it's hard to definitively tie that to an impact (for example, fixing GC issues will reduce CPU load, but we can't concretely say by how much due to many other factors). However, when we do have that info, we promote the insight and it shows up in APM recommendations.

description: Automatically surface critical issues with contextual insights and recommended next steps
further_reading:
- link: 'profiler/enabling'
tag: 'Documentation'
text: 'Enable continuous profiler for your application'
- link: 'getting_started/profiler'
tag: 'Documentation'
text: 'Getting Started with Profiler'
- link: 'https://www.datadoghq.com/blog/introducing-datadog-profiling/'
tag: 'Blog'
text: 'Introducing always-on production profiling in Datadog'
- link: 'https://www.datadoghq.com/blog/continuous-profiler-timeline-view/'
tag: 'Blog'
text: "Diagnose runtime and code inefficiencies using Continuous Profiler's timeline view"
---

## Overview
Automated Analysis uses Continuous Profiler data to detect performance issues in your applications and provides actionable insights for resolving them. When an issue is detected, Automated Analysis provides:

- A high-level summary explaining the issue and why it matters
- Contextual insights from profiling data (for example, affected methods, packages, or processes)
- Recommended next steps to help you resolve the issue

This reduces the profiling expertise needed to identify and resolve performance issues in your applications that might otherwise go unnoticed.

{{< img src="profiler/profiling_automated_analysis.png" alt="The Profiler Thread Timeline showing a Thrown Exception insight" style="width:100%;" >}}

## Explore insights
Access Automated Analysis from the [Profile explorer][1]. Insights are displayed:

- In the **Top Insights** banner at the top of the page when you're scoped to a specific service
{{< img src="profiler/profiling_automated_analysis_banner.png" alt="The Automated Analysis banner displaying insights detected for a given service" style="width:100%;">}}

- In the **Insights** column within the service list
{{< img src="profiler/profiling_automated_analysis_column.png" alt="The Automated Analysis column displaying insights detected for a given service within the service list" style="width:100%;">}}

- Within a flame graph view
{{< img src="profiler/profiling_automated_analysis_flamegraph_viz.png" alt="Automated Analysis insights detected for a given service, displayed within a flame graph" style="width:100%;">}}

- Within a timeline view
{{< img src="profiler/profiling_automated_analysis.png" alt="Automated Analysis insights detected for a given service, displayed within a timeline" style="width:100%;">}}

Click an insight to see a high-level summary that explains the issue, contextual insights from profiling data, and recommended next steps.
{{< img src="profiler/profiling_automated_analysis_details.png" alt="Expanded Profiling Insights showing the details of a detected Issue" style="width:100%;">}}

## Supported insights

Automated Analysis can detect the following insights:

| Name | Severity | Description |
|---------------------------|----------|-------------|
| Duplicated Flags | Info | Triggers if duplicate flags were provided to the runtime (for example, `-Xmx2g -Xmx5g`). Duplicate flags can cause configuration changes to not have the expected effect. |
| Explicit GC | Info | Triggers if there are `System.gc()` calls. |
| GC Pause Peak Duration | Info | Triggers if at least one GC pause took more than 1 second. |
| GC Setup | Info | Triggers when one of the following is detected: serial GC used on a multi-core machine, parallel GC used on a single-core machine, more GC threads configured than available cores, or a parallel GC configured to run in one thread. |
| Head of line blocking | Info | Triggers if a queue event gets stuck behind the given activity. |
| Primitive Value Boxing | Info | Triggers if more than 5% of CPU time was spent converting between primitive and object values. |
| Deadlocked Threads Detected | Warn | Triggers if the maximum number of deadlocked threads over the query context is greater than 0. |
| GC Pauses | Warn | Triggers if more than 10% of time was spent in GC pauses. |
| Options | Warn | Triggers if undocumented, deprecated, or non-recommended option flags were detected. |
| Stackdepth Setting | Warn | Triggers if events were found with truncated stack traces, which may make it hard to understand profiling data. |
| Thrown Exceptions | Warn | Triggers when the rate of thrown (caught and uncaught) exceptions per minute goes above a threshold (defaults to 10K). |
| VMOperation Peak Duration | Warn | Triggers if a blocking VM operation (or combination of operations close in time) takes more than 2 seconds. Reports details about the operation with the highest duration. |

## Further reading

{{< partial name="whats-next/whats-next.html" >}}

[1]: https://app.datadoghq.com/profiling/explorer
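
To make the Duplicated Flags row in the table above more concrete, here is a minimal Python sketch of how repeated JVM options such as `-Xmx2g -Xmx5g` could be spotted in a command line. It is an illustration only, not how Automated Analysis detects them: the helper names `flag_key` and `find_duplicated_flags` and the normalization rules are assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumption for this sketch: only these size-style -X options carry their
# value directly after the flag name (for example, -Xmx2g).
_SIZE_FLAG = re.compile(r"^(-Xmx|-Xms|-Xss|-Xmn)(\d+[kKmMgG]?)$")


def flag_key(flag: str) -> str:
    """Reduce a JVM option to the setting it controls, ignoring its value."""
    match = _SIZE_FLAG.match(flag)
    if match:
        return match.group(1)                      # -Xmx2g -> -Xmx
    if flag.startswith("-XX:"):
        body = flag[4:]
        if body[:1] in ("+", "-"):                 # -XX:+UseG1GC -> -XX:UseG1GC
            return "-XX:" + body[1:]
        return "-XX:" + body.split("=", 1)[0]      # -XX:Foo=1 -> -XX:Foo
    if flag.startswith("-D"):
        return flag.split("=", 1)[0]               # -Dkey=value -> -Dkey
    return flag


def find_duplicated_flags(args: list[str]) -> dict[str, list[str]]:
    """Return settings that appear more than once, with every occurrence."""
    seen: dict[str, list[str]] = defaultdict(list)
    for arg in args:
        if arg.startswith("-"):
            seen[flag_key(arg)].append(arg)
    return {key: values for key, values in seen.items() if len(values) > 1}


duplicates = find_duplicated_flags(["-Xmx2g", "-Xms512m", "-Xmx5g", "-XX:+UseG1GC"])
```

Here `duplicates` maps `-Xmx` to both of its occurrences, which is the situation the insight describes: someone editing one of the two values may see no change in behavior.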

13 changes: 8 additions & 5 deletions content/en/profiler/connect_traces_and_profiles.md
@@ -116,15 +116,18 @@ The Trace to Profiling integration is enabled when you [turn on profiling for yo

 ### Span execution timeline view

-{{< img src="profiler/profiles_tab.png" alt="Profiles tab has a timeline view that breaks down threads and execution over time" >}}
+{{< img src="profiler/profiling_automated_analysis_individual.png" alt="Profiles tab has a timeline view that breaks down threads and execution over time" >}}

-The timeline view surfaces time-based patterns and work distribution over the period of the span.
+The timeline view surfaces time-based patterns and work distribution over the period of the span. It provides a visual breakdown of how threads contributed to the request over time.

 With the span timeline view, you can:

-- Isolate time-consuming methods.
-- Sort out complex interactions between threads.
-- Surface runtime activity that impacted the request.
+- Isolate time-consuming methods
+- Sort out complex interactions between threads
+- Surface runtime activity that impacted the request
+- Leverage [Automated Analysis][1] to highlight performance issues directly in the view, such as oversized thread pools or GC contention
+
+[1]: /profiler/automated_analysis/

 Depending on the runtime and language, the lanes vary:
