[MLOB-2411] Add Distributed Proxy/Gateway Service Guide for LLM Observability #28593

Merged: estherk15 merged 19 commits into master from sabrenner/llmobs-proxy-service-quickstart-guide on Jun 9, 2025.

Commits (19):

- 9fa3de6 add initial layout (sabrenner)
- a77384c different code blocks (sabrenner)
- 445f8a7 change to messages (sabrenner)
- eeca755 update example (sabrenner)
- 0a3d31b add images for assisted viewing (sabrenner)
- 4996f8b Merge branch 'master' into sabrenner/llmobs-proxy-service-quickstart-… (sabrenner)
- 6ad0fe3 move proxy guide, reword with new language and steps (sabrenner)
- 4d6e909 add timeseries image (sabrenner)
- da7b195 visual fixes and add image (sabrenner)
- 7ae31b8 fix section hyperlink (sabrenner)
- e5f0e74 Update content/en/llm_observability/trace_proxy_services.md (sabrenner)
- 25d9f26 Update content/en/llm_observability/trace_proxy_services.md (sabrenner)
- ac25ec0 review comments (sabrenner)
- cb551cd Merge branch 'master' into sabrenner/llmobs-proxy-service-quickstart-… (sabrenner)
- 6ee8a58 fix whatsnext links for guide index (sabrenner)
- d007a34 fix whatsnext links (sabrenner)
- afdf9de Update content/en/llm_observability/trace_proxy_services.md (sabrenner)
- 8d744ed Update content/en/llm_observability/trace_proxy_services.md (sabrenner)
- f41d4d4 Merge branch 'master' into sabrenner/llmobs-proxy-service-quickstart-… (sabrenner)

File: content/en/llm_observability/trace_proxy_services.md (new file, +188 lines)

---
title: Trace Proxy Services
---

{{< site-region region="gov" >}}
<div class="alert alert-warning">LLM Observability is not available in the selected site ({{< region-param key="dd_site_name" >}}).</div>
{{< /site-region >}}

## Overview

Like traditional applications, an LLM application can span multiple microservices. With LLM Observability, if one of these services is an LLM proxy or gateway, you can trace LLM calls within a complete end-to-end trace, capturing the full request path across services.

## Enabling LLM Observability for a proxy or gateway service

To enable LLM Observability for a proxy or gateway service that is used by multiple ML applications, configure it with a service name instead of an ML application name. This allows you to [filter for spans specific to that proxy or gateway service within LLM Observability](#observing-llm-gateway-and-proxy-services).

{{< tabs >}}
{{% tab "Python" %}}

```python
# proxy.py
from ddtrace.llmobs import LLMObs

LLMObs.enable(service="chat-proxy")

# proxy-specific logic, including guardrails, sensitive data scans, and the LLM call
```

{{% /tab %}}
{{% tab "Node.js" %}}

```javascript
// proxy.js
const tracer = require('dd-trace').init({
  llmobs: true,
  service: "chat-proxy"
});
const llmobs = tracer.llmobs;

// proxy-specific logic, including guardrails, sensitive data scans, and the LLM call
```

{{% /tab %}}
{{< /tabs >}}
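
For example, if the proxy makes its LLM calls with a supported library such as `openai`, the SDK's auto-instrumentation captures each call as an `llm` span automatically. The following is a minimal sketch; the helper function and model name are illustrative, not part of the original example:

```python
# proxy.py
from ddtrace.llmobs import LLMObs

LLMObs.enable(service="chat-proxy")

# with LLM Observability enabled, supported libraries such as openai are
# auto-instrumented, so the call below is captured as an `llm` span
from openai import OpenAI

client = OpenAI()

def forward_chat(messages):
    # guardrails and sensitive data scans would run here
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=messages,
    )
    return completion.choices[0].message.content
```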

In each of the services that orchestrate the ML applications making calls to the proxy or gateway service, enable LLM Observability with the ML application name:

{{< tabs >}}
{{% tab "Python" %}}

```python
# application.py
from ddtrace.llmobs import LLMObs
LLMObs.enable(ml_app="my-ml-app")

import requests

if __name__ == "__main__":
    with LLMObs.workflow(name="run-chat"):
        # other application-specific logic (such as RAG steps and parsing)

        response = requests.post("http://localhost:8080/chat", json={
            # data to pass to the proxy service
        })

        # other application-specific logic handling the response
```

{{% /tab %}}
{{% tab "Node.js" %}}

```javascript
// application.js
const tracer = require('dd-trace').init({
  llmobs: {
    mlApp: 'my-ml-app'
  }
});
const llmobs = tracer.llmobs;

const axios = require('axios');

async function main () {
  await llmobs.trace({ name: 'run-chat', kind: 'workflow' }, async () => {
    // other application-specific logic (such as RAG steps and parsing)

    // call out to the proxy service
    const response = await axios.post('http://localhost:8080/chat', {
      // data to pass to the proxy service
    });

    // other application-specific logic handling the response
  });
}

main();
```

{{% /tab %}}
{{< /tabs >}}

When the LLM application makes a request to the proxy or gateway service, the LLM Observability SDK automatically propagates the ML application name from the original LLM application. The propagated ML application name takes precedence over the one specified in the proxy or gateway service.
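
Propagation uses Datadog's standard distributed tracing headers, which instrumented HTTP clients and web frameworks inject and extract automatically. If your client or framework is not auto-instrumented, the Python SDK also exposes manual helpers. The following is a minimal sketch, assuming your `ddtrace` version provides `LLMObs.inject_distributed_headers` and `LLMObs.activate_distributed_headers` (check the SDK reference for your version):

```python
import requests
from ddtrace.llmobs import LLMObs

LLMObs.enable(ml_app="my-ml-app")

# application side: inject the trace context (which carries the ML
# application name) into the headers of the outgoing request
with LLMObs.workflow(name="run-chat") as span:
    headers = LLMObs.inject_distributed_headers({}, span=span)
    response = requests.post(
        "http://localhost:8080/chat",
        headers=headers,
        json={},  # data to pass to the proxy service
    )

# proxy side: activate the propagated context before starting any spans,
# so the proxy's spans join the caller's trace and inherit its ML app name
def handle_chat(request_headers):
    LLMObs.activate_distributed_headers(request_headers)
    with LLMObs.workflow(name="chat-proxy-entrypoint"):
        ...  # proxy-specific logic
```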

## Observing LLM gateway and proxy services

### All requests to the proxy or gateway service

To view all requests to the proxy service as top-level spans, wrap the entrypoint of the proxy service endpoint in a `workflow` span:

{{< tabs >}}
{{% tab "Python" %}}

```python
# proxy.py
from flask import Flask  # assumes a Flask app, implied by the route decorator below

from ddtrace.llmobs import LLMObs

LLMObs.enable(service="chat-proxy")

app = Flask(__name__)

@app.route('/chat')
def chat():
    with LLMObs.workflow(name="chat-proxy-entrypoint"):
        # proxy-specific logic, including guardrails, sensitive data scans, and the LLM call
        ...
```

{{% /tab %}}
{{% tab "Node.js" %}}

```javascript
// proxy.js
const tracer = require('dd-trace').init({
  llmobs: true,
  service: "chat-proxy"
});
const llmobs = tracer.llmobs;

const express = require('express'); // assumes an Express app, implied by app.post below
const app = express();

app.post('/chat', async (req, res) => {
  await llmobs.trace({ name: 'chat-proxy-entrypoint', kind: 'workflow' }, async () => {
    // proxy-specific logic, including guardrails, sensitive data scans, and the LLM call
    res.send("Hello, world!");
  });
});
```

{{% /tab %}}
{{< /tabs >}}
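
Optionally, you can also record the request and response payloads on the entrypoint `workflow` span. The following is a minimal sketch, assuming the same Flask handler as above and that your `ddtrace` version supports `LLMObs.annotate`; the payload shapes are illustrative:

```python
from flask import request, jsonify

@app.route('/chat', methods=['POST'])
def chat():
    with LLMObs.workflow(name="chat-proxy-entrypoint"):
        data = request.get_json()
        # ... guardrails, sensitive data scans, and the LLM call ...
        reply = {"message": "Hello, world!"}  # placeholder response
        # attach the payloads to the current workflow span
        LLMObs.annotate(input_data=data, output_data=reply)
        return jsonify(reply)
```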

All requests to the proxy service can now be viewed as top-level spans within the LLM trace view:

1. On the [LLM trace][1] page, select **All Applications** from the top-left dropdown.
2. Switch to the **All Spans** view in the top-right dropdown.
3. Filter the list by the `service` tag and the workflow name.

{{< img src="llm_observability/all-spans-with-service-and-span-name.png" alt="View all spans from all ML applications with the service and workflow name tags" style="width:100%;" >}}

You can also filter the workflow **Span Name** using the facet on the left-hand side of the trace view:

{{< img src="llm_observability/span-name-facet-for-proxy-service-monitoring.png" alt="Select the workflow span name from the facet on the left-hand side of the trace view" style="width:50%;" >}}

### All LLM calls made within the proxy or gateway service

To monitor only the LLM calls made within a proxy or gateway service, filter by `llm` spans in the trace view:

{{< img src="llm_observability/all-spans-with-service-and-span-kind.png" alt="View all spans from all ML applications with the service tags and the LLM span kind" style="width:100%;" >}}

You can also filter using the **Span Kind** facet on the left-hand side of the trace view:

{{< img src="llm_observability/span-kind-facet-for-proxy-service-monitoring.png" alt="Select the LLM span kind facet from the left-hand side of the trace view" style="width:50%;" >}}

### Filtering by a specific ML application and observing patterns and trends

You can apply both filtering processes ([top-level calls to the proxy service](#all-requests-to-the-proxy-or-gateway-service) and [LLM calls made within the proxy or gateway service](#all-llm-calls-made-within-the-proxy-or-gateway-service)) to a specific ML application to view its interaction with the proxy or gateway service.

1. In the top-left dropdown, select the ML application of interest.
2. To see all traces for the ML application, switch from the **All Spans** view to the **Traces** view in the top-right dropdown.
3. To see a timeseries of traces for the ML application, switch back to the **All Spans** filter in the top-right dropdown and, next to **Visualize as**, select **Timeseries**.

{{< img src="llm_observability/timeseries-view-for-proxy-services.png" alt="Switch from a List view to a Timeseries view in the trace view while maintaining the All Spans filter" style="width:100%;" >}}

## Observing end-to-end usage of LLM applications making calls to a proxy or gateway service

To observe the complete end-to-end usage of an LLM application that makes calls to a proxy or gateway service, filter for traces with that ML application name:

1. In the LLM trace view, select the ML application name of interest from the top-left dropdown.
2. Switch to the **Traces** view in the top-right dropdown.

[1]: https://app.datadoghq.com/llm/traces

Binary files added:

- static/images/llm_observability/all-spans-with-service-and-span-kind.png (+31.4 KB)
- static/images/llm_observability/all-spans-with-service-and-span-name.png (+32.6 KB)
- static/images/llm_observability/span-kind-facet-for-proxy-service-monitoring.png (+7.57 KB)
- static/images/llm_observability/span-name-facet-for-proxy-service-monitoring.png (+13 KB)
- static/images/llm_observability/timeseries-view-for-proxy-services.png (+123 KB)