
Commit 92f6ee4

ncybul, Yun-Kim, and brettlangdon authored and committed
fix(litellm): [MLOS-182] select metadata keys to tag from litellm kwargs (#14067) [backport to 3.10] (#14083)
We received a customer ticket explaining that `vertex_credentials` was being collected as a metadata field on LLM spans emitted by the LiteLLM integration. This PR addresses that issue and safeguards against other potentially sensitive information being exposed by explicitly selecting a subset of kwargs to attach to the LLM span's metadata. Keys were chosen based on the arguments passed into the LiteLLM SDK [completion](https://github.com/BerriAI/litellm/blob/main/litellm/main.py#L874-L917) method and [text_completion](https://github.com/BerriAI/litellm/blob/main/litellm/main.py#L4446-L4497) method.

I verified that this PR resolves the issue. The following script was run to produce this [trace](https://app.datadoghq.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZgeNigRWrccCwAAABhBWmdlTmlnUkFBQmM2eEc4M1pUNUFBQUEAAAAkMDE5ODFlMzYtNDNiMi00NGQ1LWJlMTUtNzk3MDUxZTNmNTBhAAAALQ%22%7D%5D&spanId=1740319997238873855&start=1752852637158&end=1752853537158&paused=false), where the metadata field does not contain `vertex_credentials`:

```python
from litellm import completion
import json

file_path = '/path/to/credentials'
with open(file_path, 'r') as file:
    vertex_credentials = json.load(file)
vertex_credentials_json = json.dumps(vertex_credentials)

response = completion(
    model="vertex_ai/gemini-1.5-flash",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    vertex_credentials=vertex_credentials_json,
    stream=True,
)

for chunk in response:
    print(chunk)
```

## Checklist

- [x] PR author has checked that all the criteria below are met
  - The PR description includes an overview of the change
  - The PR description articulates the motivation for the change
  - The change includes tests OR the PR description describes a testing strategy
  - The PR description notes risks associated with the change, if any
  - Newly-added code is easy to change
  - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
  - The change includes or references documentation updates if necessary
  - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Reviewer has checked that all the criteria below are met
  - Title is accurate
  - All changes are related to the pull request's stated goal
  - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
  - Testing strategy adequately addresses listed risks
  - Newly-added code is easy to change
  - Release note makes sense to a user of the library
  - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

(cherry picked from commit e31f11f)

Co-authored-by: Yun Kim <[email protected]>
Co-authored-by: Brett Langdon <[email protected]>
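To make the failure mode concrete: a denylist only strips keys it already knows about, so a new or provider-specific kwarg such as `vertex_credentials` passes straight through, while an allowlist drops anything it does not recognize. A minimal self-contained sketch (the `denylist` and `allowlist` contents below are illustrative, not the integration's real constants):

```python
kwargs = {"temperature": 0.2, "vertex_credentials": "<secret>"}

# Old behavior (denylist): keys that are not explicitly skipped leak through.
denylist = {"messages", "api_key"}
print({k: v for k, v in kwargs.items() if k not in denylist})
# {'temperature': 0.2, 'vertex_credentials': '<secret>'}

# New behavior (allowlist): anything not explicitly recognized is dropped.
allowlist = {"temperature", "max_tokens", "stream"}
print({k: v for k, v in kwargs.items() if k in allowlist})
# {'temperature': 0.2}
```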
1 parent 48961a1 · commit 92f6ee4

3 files changed (+78, -6 lines)


ddtrace/llmobs/_integrations/litellm.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -71,9 +71,9 @@ def _llmobs_set_tags(

         # use Open AI helpers since response format will match Open AI
         if self.is_completion_operation(operation):
-            openai_set_meta_tags_from_completion(span, kwargs, response)
+            openai_set_meta_tags_from_completion(span, kwargs, response, integration_name="litellm")
         else:
-            openai_set_meta_tags_from_chat(span, kwargs, response)
+            openai_set_meta_tags_from_chat(span, kwargs, response, integration_name="litellm")

         # custom logic for updating metadata on litellm spans
         self._update_litellm_metadata(span, kwargs, operation)
```
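Because `integration_name` defaults to `"openai"` in the shared helpers (see the `utils.py` diff below), existing OpenAI call sites keep their old code path; only the two LiteLLM call sites above opt into the allowlist. A minimal sketch of that default-argument dispatch pattern, using stand-in names rather than the real helpers:

```python
def set_meta_tags(kwargs, integration_name="openai"):
    # Callers that do not pass integration_name keep the original behavior.
    return "allowlist" if integration_name == "litellm" else "denylist"

assert set_meta_tags({}) == "denylist"                               # OpenAI, unchanged
assert set_meta_tags({}, integration_name="litellm") == "allowlist"  # LiteLLM opts in
```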

ddtrace/llmobs/_integrations/utils.py

Lines changed: 72 additions & 4 deletions
```diff
@@ -50,6 +50,57 @@
     LITELLM_ROUTER_INSTANCE_KEY,
 )

+LITELLM_METADATA_CHAT_KEYS = (
+    "timeout",
+    "temperature",
+    "top_p",
+    "n",
+    "stream",
+    "stream_options",
+    "stop",
+    "max_completion_tokens",
+    "max_tokens",
+    "modalities",
+    "prediction",
+    "presence_penalty",
+    "frequency_penalty",
+    "logit_bias",
+    "user",
+    "response_format",
+    "seed",
+    "tool_choice",
+    "parallel_tool_calls",
+    "logprobs",
+    "top_logprobs",
+    "deployment_id",
+    "reasoning_effort",
+    "base_url",
+    "api_base",
+    "api_version",
+    "model_list",
+)
+LITELLM_METADATA_COMPLETION_KEYS = (
+    "best_of",
+    "echo",
+    "frequency_penalty",
+    "logit_bias",
+    "logprobs",
+    "max_tokens",
+    "n",
+    "presence_penalty",
+    "stop",
+    "stream",
+    "stream_options",
+    "suffix",
+    "temperature",
+    "top_p",
+    "user",
+    "api_base",
+    "api_version",
+    "model_list",
+    "custom_llm_provider",
+)
+

 def extract_model_name_google(instance, model_name_attr):
     """Extract the model name from the instance.
```
```diff
@@ -299,12 +350,14 @@ def get_messages_from_converse_content(role: str, content: list):
     return messages


-def openai_set_meta_tags_from_completion(span: Span, kwargs: Dict[str, Any], completions: Any) -> None:
+def openai_set_meta_tags_from_completion(
+    span: Span, kwargs: Dict[str, Any], completions: Any, integration_name: str = "openai"
+) -> None:
     """Extract prompt/response tags from a completion and set them as temporary "_ml_obs.meta.*" tags."""
     prompt = kwargs.get("prompt", "")
     if isinstance(prompt, str):
         prompt = [prompt]
-    parameters = {k: v for k, v in kwargs.items() if k not in OPENAI_SKIPPED_COMPLETION_TAGS}
+    parameters = get_metadata_from_kwargs(kwargs, integration_name, "completion")
     output_messages = [{"content": ""}]
     if not span.error and completions:
         choices = getattr(completions, "choices", completions)
```
```diff
@@ -318,15 +371,17 @@ def openai_set_meta_tags_from_completion(span: Span, kwargs: Dict[str, Any], com
     )


-def openai_set_meta_tags_from_chat(span: Span, kwargs: Dict[str, Any], messages: Optional[Any]) -> None:
+def openai_set_meta_tags_from_chat(
+    span: Span, kwargs: Dict[str, Any], messages: Optional[Any], integration_name: str = "openai"
+) -> None:
     """Extract prompt/response tags from a chat completion and set them as temporary "_ml_obs.meta.*" tags."""
     input_messages = []
     for m in kwargs.get("messages", []):
         tool_call_id = m.get("tool_call_id")
         if tool_call_id:
             core.dispatch(DISPATCH_ON_TOOL_CALL_OUTPUT_USED, (tool_call_id, span))
         input_messages.append({"content": str(_get_attr(m, "content", "")), "role": str(_get_attr(m, "role", ""))})
-    parameters = {k: v for k, v in kwargs.items() if k not in OPENAI_SKIPPED_CHAT_TAGS}
+    parameters = get_metadata_from_kwargs(kwargs, integration_name, "chat")
     span._set_ctx_items({INPUT_MESSAGES: input_messages, METADATA: parameters})

     if span.error or not messages:
```
```diff
@@ -398,6 +453,19 @@ def openai_set_meta_tags_from_chat(span: Span, kwargs: Dict[str, Any], messages:
     span._set_ctx_item(OUTPUT_MESSAGES, output_messages)


+def get_metadata_from_kwargs(
+    kwargs: Dict[str, Any], integration_name: str = "openai", operation: str = "chat"
+) -> Dict[str, Any]:
+    metadata = {}
+    if integration_name == "openai":
+        keys_to_skip = OPENAI_SKIPPED_CHAT_TAGS if operation == "chat" else OPENAI_SKIPPED_COMPLETION_TAGS
+        metadata = {k: v for k, v in kwargs.items() if k not in keys_to_skip}
+    elif integration_name == "litellm":
+        keys_to_include = LITELLM_METADATA_CHAT_KEYS if operation == "chat" else LITELLM_METADATA_COMPLETION_KEYS
+        metadata = {k: v for k, v in kwargs.items() if k in keys_to_include}
+    return metadata
+
+
 def openai_get_input_messages_from_response_input(
     messages: Optional[Union[str, List[Dict[str, Any]]]]
 ) -> List[Dict[str, Any]]:
```
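A hedged usage sketch of the new helper (the kwargs are illustrative; the LiteLLM result follows from the allowlist above, while the OpenAI branch's exact output depends on `OPENAI_SKIPPED_CHAT_TAGS`, which this diff does not show):

```python
from ddtrace.llmobs._integrations.utils import get_metadata_from_kwargs

kwargs = {"temperature": 0.1, "vertex_credentials": "<secret>"}

# LiteLLM branch: allowlist, so only keys named in LITELLM_METADATA_CHAT_KEYS survive.
assert get_metadata_from_kwargs(kwargs, "litellm", "chat") == {"temperature": 0.1}

# OpenAI branch: denylist, so any kwarg not listed in OPENAI_SKIPPED_CHAT_TAGS is kept,
# preserving that integration's pre-existing behavior.
print(get_metadata_from_kwargs(kwargs, "openai", "chat"))
```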
Lines changed: 4 additions & 0 deletions
```diff
@@ -0,0 +1,4 @@
+fixes:
+  - |
+    litellm: This fix resolves an issue where potentially sensitive parameters were being tagged as metadata on LLM Observability spans.
+    Now, metadata tags are based on an allowlist instead of a denylist.
```
