wip(vercel-ai): add vercel ai integration #5858
base: master
Conversation
Overall package size

Self size: 9.66 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/libdatadog | 0.6.0 | 30.47 MB | 30.47 MB |
| @datadog/native-appsec | 8.5.2 | 19.33 MB | 19.34 MB |
| @datadog/pprof | 5.8.0 | 12.55 MB | 12.92 MB |
| @datadog/native-iast-taint-tracking | 4.0.0 | 11.72 MB | 11.73 MB |
| @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB |
| protobufjs | 7.5.3 | 2.95 MB | 5.6 MB |
| @datadog/wasm-js-rewriter | 4.0.1 | 2.85 MB | 3.58 MB |
| @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.14.0 | 120.58 kB | 841.68 kB |
| source-map | 0.7.4 | 226 kB | 226 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| lru-cache | 7.18.3 | 133.92 kB | 133.92 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 5.3.2 | 53.63 kB | 53.63 kB |
| istanbul-lib-coverage | 3.2.2 | 34.37 kB | 34.37 kB |
| rfdc | 1.4.1 | 27.15 kB | 27.15 kB |
| @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB |
| dc-polyfill | 0.1.9 | 25.11 kB | 25.11 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| shell-quote | 1.8.2 | 23.54 kB | 23.54 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| semifies | 1.0.0 | 15.84 kB | 15.84 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| ttl-set | 1.0.0 | 4.61 kB | 9.69 kB |
| mutexify | 1.4.0 | 5.71 kB | 8.74 kB |
| path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| module-details-from-path | 1.0.4 | 3.96 kB | 3.96 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
## master #5858 +/- ##
==========================================
+ Coverage 80.65% 80.75% +0.10%
==========================================
Files 463 464 +1
Lines 19841 19924 +83
==========================================
+ Hits 16002 16090 +88
+ Misses 3839 3834 -5
Benchmarks
Benchmark execution time: 2025-06-11 19:06:23
Comparing candidate commit 407d6dc in PR branch.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 1269 metrics, 54 unstable metrics.
Datadog Report
Branch report: ✅ 0 Failed, 1256 Passed, 0 Skipped, 21m 58.51s Total Time
return new Identifier(value, radix)
  }
}, { Identifier })
this isn't necessary but is a helpful export for some instanceof checks we do in llmobs/tagger.js - seems all tests still pass as-is, if we're ok to do this!
Let's not expose this, as we should replace the current implementation with BigInt at some point, and any exposed internals will make that more difficult.
isEnabled: true,
// TODO(sabrenner): need to figure out how a user can configure this tracer
// maybe we advise they do this manually?
tracer: provider.getTracer()
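For reference on the "advise they do this manually" option from the TODO, the user-facing wiring could look roughly like the sketch below. This is only a sketch: it assumes dd-trace's exported TracerProvider for OTel interop and the AI SDK's experimental_telemetry option, and the tracer name and model are illustrative.

const tracer = require('dd-trace').init()
const { TracerProvider } = tracer

// register dd-trace's OTel-compatible provider and hand its tracer to the AI SDK
const provider = new TracerProvider()
provider.register()

const { generateText } = require('ai')
const { openai } = require('@ai-sdk/openai')

async function main () {
  const { text } = await generateText({
    model: openai('gpt-4o-mini'),
    prompt: 'Hello!',
    experimental_telemetry: {
      isEnabled: true,
      tracer: provider.getTracer('vercel-ai') // illustrative tracer name
    }
  })
  console.log(text)
}

main()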
Ideally we wouldn't go through translation as that could have unintended side-effects and won't be as flexible. Is there a centralized place in the module that handles tracing? Maybe we could patch something there instead?
Can we use orchestrion for this? I think we're not ready to use it for APM yet, but for LLMObs I think it's already used? If not I guess passing in our OTel tracer might be the only option, although I'd probably prefer to pass a "fake" tracer that only works for this integration at that point to avoid having to go through the OTel SDK.
yeah i thought about this too. aside from the node version problem (it's possible we have customers not yet on node 20.6+, which orchestrion requires), it would just mean more maintenance on our side to patch something they already instrument with the otel sdk themselves.
imho, i'm more in favor of just re-using their otel instrumentation and translating it as-is - patching ourselves wouldn't change the existing logic in the plugins too much, it would mostly just add patching for us to maintain in the instrumentation.
I'd probably prefer to pass a "fake" tracer that only works for this integration at that point to avoid having to go through the OTel SDK
could you clarify this a bit - i think i'm confused, i'm not seeing the harm in passing ours in as-is 😅 we only pass ours in when the user hasn't passed a tracer themselves, which by default they haven't anyways.
What I mean is that if, for example, it expects to receive an object with startSpan on it, you could just pass an object created in the instrumentation that sends a message on a channel like other integrations do, instead of passing in a provider. Basically converting OTel to what the library should actually be doing itself 😅 Then the plugin can listen to that as normal without any hacks to make it work with OTel.
For example:
tracer: {
  startSpan ( ... ) {
    startCh.publish( ... )
  }
}
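A slightly fuller sketch of that fake-tracer idea, assuming dc-polyfill channels like other integrations use - channel names and the payload shape here are illustrative, not necessarily what the PR ships:

'use strict'

const { channel } = require('dc-polyfill')

// illustrative channel names
const startCh = channel('apm:vercel-ai:span:start')
const endCh = channel('apm:vercel-ai:span:end')

// a "fake" tracer that implements only what the AI SDK calls and translates
// those calls into channel messages instead of going through the OTel SDK
const fakeTracer = {
  startActiveSpan (...args) {
    const fn = args[args.length - 1] // the callback is always the last argument
    const ctx = { name: args[0] }
    startCh.publish(ctx)

    const span = {
      setAttributes (attributes) { ctx.attributes = attributes; return this },
      addEvent () { return this },
      recordException () { return this },
      setStatus () { return this },
      end () { endCh.publish(ctx) }
    }

    return fn(span)
  }
}

The plugin could then subscribe to startCh/endCh like any other integration, with no OTel types involved.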
ok, i've tried this in a4014cc. it works (yeah there are no tests but it behaves the same with the dummy app i'm using for now 😆), but my main concern is that if someone wants to pass in their own dd-trace-provider otel tracer (as noted is possible in the PR description), we'll still need to handle those events (or just always override the tracer option, which i don't think is a good solution).
at that point, rather than handling both the case where they pass in their own dd-trace otel tracer and the case where we use this custom one in the instrumentation, it would be better to just keep the implementation from before this commit.
let me know what you think - i actually do agree it's easier for parent context setting/manipulation if we do it ourselves as i implemented in the commit, but i wonder if we should account for folks passing in their own dd-trace otel tracer.
to further that point a bit, the dummy 'span' passed into the startActiveSpan callback in the instrumentation doesn't have all of the otel span methods, just the ones used by vercel-ai. should they use one that isn't there, we break. imho any solution there - like a) subclassing the span from a noop otel span kind, or b) just implementing all the methods - points us back to just listening for otel spans in the first place 😅
my main concern is if someone wants to pass in their own dd-trace provider otel tracer (as noted is possible in the PR description), we'll need to still handle those events (or just always override the tracer option, which i don't think is a good solution)
i've pushed up one more change in 407d6dc that addresses this concern, which seems a lot cleaner. but i guess i still have a question about reusing otel tracer spans from integrations we want to support. if we add more integrations that do their own otel tracing under the hood, is this a pattern we want to repeat every time?
span.context()._trace.tags[PROPAGATED_PARENT_ID_KEY] ??
  ROOT_PARENT_ID
let parentId
if (parent instanceof Identifier) { |
Why is an identifier as the parent needed?
oh i suppose i was being silly with what i was doing here haha. i can change this back to what i had before
ah actually i had it this way because i don't think i have a reference to the parent span at the point of the vercel ai integration for llmobs parentage - just the current span and, on it, its parent id, which i opted to pass in here (but correct me if i'm wrong about spans holding a reference to their entire parent span)
this.start(ctx, {
  useLlmObsParent,
  enterIntoLlmObsStorage: false
Could these be replaced by a different class hierarchy or by runStores/bindStore?
this was a bit tricky for me - because we're hooking into the otel spans at their start and end points, there isn't much flexibility for running/binding stores over a context when the only points we have are disjoint start and end events. but i could be thinking about this wrong - lmk if so!
since the otel spans started by the vercel ai sdk are activated spans in the proper scope, i wanted to just add an option to use the APM span parentage as opposed to the LLMObs parentage, since these operations should be isolated from whatever manual LLMObs instrumentation users might do.
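To illustrate the constraint being described here - a sketch only, with illustrative channel names: runStores wants a single callback it can wrap, while the otel span lifecycle only hands us two separate events, so there is no enclosing function to bind a store around.

const { channel } = require('dc-polyfill')

const startCh = channel('apm:example:start') // illustrative
const otelSpanStartCh = channel('dd-trace:otel:span:start') // illustrative
const otelSpanEndCh = channel('dd-trace:otel:span:end') // illustrative

const ctx = {}

// what a runStores-style integration can do: wrap a single callback, so a
// bound store is entered before it runs and restored automatically afterwards
startCh.runStores(ctx, () => {
  // ...call the original instrumented function here...
})

// what the otel span lifecycle gives us instead: two disjoint events with
// arbitrary (possibly async) work in between, so storage has to be
// entered/exited manually per event
otelSpanStartCh.publish(ctx)
// ...the vercel-ai operation runs...
otelSpanEndCh.publish(ctx)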
const noopTracer = {
  startActiveSpan () {
    const fn = arguments[arguments.length - 1]

    const span = {
      end () {},
      setAttributes () { return this },
      addEvent () { return this },
      recordException () { return this },
      setStatus () { return this }
    }

    return fn(span)
  }
}
i guess this could be extracted out into an otel noop tracer that could be shared
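If it does get extracted, the shared version could also stub the rest of the @opentelemetry/api span surface so a library calling a method we didn't anticipate doesn't break (which ties back to the earlier concern about the dummy span). A sketch under that assumption - the module path is illustrative:

'use strict'

// e.g. packages/datadog-instrumentations/src/helpers/otel-noop-tracer.js (illustrative path)

// stub the methods on the @opentelemetry/api Span interface
const noopSpan = {
  spanContext () {
    // an "invalid" span context, mirroring what the OTel no-op span returns
    return { traceId: '00000000000000000000000000000000', spanId: '0000000000000000', traceFlags: 0 }
  },
  setAttribute () { return this },
  setAttributes () { return this },
  addEvent () { return this },
  addLink () { return this }, // present in newer @opentelemetry/api versions
  addLinks () { return this }, // present in newer @opentelemetry/api versions
  setStatus () { return this },
  updateName () { return this },
  end () {},
  isRecording () { return false },
  recordException () {}
}

const noopTracer = {
  startSpan () { return noopSpan },
  startActiveSpan (...args) {
    const fn = args[args.length - 1] // callback is the last argument in every overload
    return fn(noopSpan)
  }
}

module.exports = { noopTracer, noopSpan }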
What does this PR do?
Adds support for [email protected] and greater.

The Vercel AI SDK provides OTel tracing of its operations under the hood. This gives us a nice "in" to automatically pass our OTel-compatible tracer in patching and do whatever we want on span start and end. This is a better alternative to doing the patching ourselves, as the Vercel AI SDK does not nicely expose model provider functions through a base class, and certain functions, like tool executions, are not exported and are instead inlined.
To assist with this, we emit span start and end events for OpenTelemetry spans from our OTel tracer provider, and from there assert that the spans are Vercel AI spans in the corresponding APM and LLMObs integrations.
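On the plugin side, that flow could look roughly like the sketch below - the channel name, payload shape, and the span-name check are illustrative rather than the exact ones in this PR:

'use strict'

const { channel } = require('dc-polyfill')

// illustrative channel emitted by our OTel tracer provider on span start
const otelSpanStartCh = channel('dd-trace:otel:span:start')

otelSpanStartCh.subscribe(({ span }) => {
  // only handle spans produced by the Vercel AI SDK's built-in telemetry,
  // e.g. span names like ai.generateText or ai.toolCall
  if (!span.name || !span.name.startsWith('ai.')) return

  // ...map the otel span onto the corresponding APM / LLMObs span here...
})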
Motivation
Need to link FR when this PR is ready
Plugin Checklist
Additional Notes