Bug Report
ddtrace has a contrib module that counts tokens when tiktoken is installed.
it's found in ddtrace/contrib/internal/openai/utils.py
openai usually doesn't encode special tokens found in the user's input, because doing so would mess with the generation. these special tokens are used as separators between the user/assistant messages and for marking the end of a message. one exception is <|endoftext|>.
we encountered an issue when sending the text <|endoftext|> - the tiktoken call in ddtrace raises an exception, because that token is on the list of disallowed_special tokens.
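a minimal standalone reproduction of the tiktoken behaviour (outside the ddtrace code path), assuming any recent tiktoken version:

```python
import tiktoken

# tiktoken's encode() defaults to disallowed_special="all", so any text that
# matches a special token raises instead of being encoded
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
enc.encode("user text containing <|endoftext|> somewhere in the middle")
# ValueError: Encountered text corresponding to disallowed special token '<|endoftext|>'.
```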
while this might indicate a bug on the caller side (since the caller might not want to send a special token, but rather just plain text), it shouldn't change how we calculate token counts. encoding this text as a token for token count calculation is the right behaviour, since this is what will happen on the openai backend.
I'd say this is the right behaviour for any LLM with its default tokenizer - and tiktoken loads the default tokenizer based on the model name.
I suggest setting disallowed_special=(), so we don't get runtime exceptions when this happens.
openai will interpret the literal string as the EOS token anyway, so this is correct.
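a sketch of the suggested change (not the current ddtrace code), assuming the utils helper keeps calling enc.encode() the same way:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

# with disallowed_special=() the check is skipped entirely: the call no longer
# raises and the string is encoded as normal text, which is good enough for
# token count accounting
tokens = enc.encode("<|endoftext|>", disallowed_special=())
print(len(tokens))  # a small token count instead of a ValueError
```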
another issue with this contrib module is that there's no way to disable the exact token count calculation, or to fall back to rough token count estimation when an exception occurs. ideally, there'd be an env var or some other option to disable tiktoken.
we have tiktoken installed in our project for other reasons - not for ddtrace - so auto-enabling it when tiktoken is available is not desirable in our case.
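a rough sketch of what the opt-out plus fallback could look like (the env var name, the heuristic, and the return-value details are assumptions for illustration, not existing ddtrace behaviour):

```python
import os


def _compute_prompt_token_count(prompt, model):
    # hypothetical: gate tiktoken behind an env var so projects that have
    # tiktoken installed for other reasons can still opt out of exact counting
    if os.getenv("DD_OPENAI_DISABLE_TIKTOKEN", "").lower() not in ("1", "true"):
        try:
            import tiktoken

            enc = tiktoken.encoding_for_model(model)
            # exact count; disallowed_special=() avoids the <|endoftext|> crash
            return False, len(enc.encode(prompt, disallowed_special=()))
        except Exception:
            pass  # fall through to the rough estimate instead of raising
    # placeholder heuristic (~4 characters per token) standing in for a real
    # estimation function
    return True, max(1, len(prompt) // 4)
```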
I'm happy to open a PR once we've agreed on the right course of action.
Trace
File "/usr/lib/python3.11/site-packages/ddtrace/contrib/openai/_endpoint_hooks.py", line 133, in _handle_streamed_response
estimated, prompt_tokens = _compute_prompt_token_count(m.get("content", ""), kwargs.get("model"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/ddtrace/contrib/openai/utils.py", line 37, in _compute_prompt_token_count
num_prompt_tokens += len(enc.encode(prompt))
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/tiktoken/core.py", line 117, in encode
raise_disallowed_special_token(match.group())
File "/usr/lib/python3.11/site-packages/tiktoken/core.py", line 400, in raise_disallowed_special_token
raise ValueError(
ValueError: Encountered text corresponding to disallowed special token '<|endoftext|>'.
If you want this text to be encoded as a special token, pass it to `allowed_special`, e.g. `allowed_special={'<|endoftext|>', ...}`.
If you want this text to be encoded as normal text, disable the check for this token by passing `disallowed_special=(enc.special_tokens_set - {'<|endoftext|>'})`.
To disable this check for all special tokens, pass `disallowed_special=()`.
Reproduction Code
No response
Error Logs
No response
Libraries in Use
No response
Operating System
No response
Tracer Version(s)
2.5.1
Python Version(s)
Python 3.11
Pip Version(s)
pip 24.0