Description
It would be helpful to gain further visibility into where Elasticsearch spends its time during bulk indexing. Currently, spans are emitted at the per-shard-task level: `indices:data/write/bulk[s][p]` and `indices:data/write/bulk[s][r]`, for the primary and the replica respectively. However, no tracing separates persisting the write to the segment from the post-write refresh (`wait_until` or `immediate`) part of the write path. The post-write refresh, especially with `wait_until` and a long refresh interval, may be the slow part of a bulk write. This can come up when a user doesn't know the index's refresh interval setting and issues writes with a `wait_until` refresh policy without realizing the implications.
Particularly for `wait_until` refreshes, the existing tracing architecture does not make this contribution easy to build. Every span needs to be attached to a task, unless the span opens and closes on the same thread (in which case `withScope` can be used). A `wait_until` post-write refresh registers the refresh listener on the writer thread, but completion is then deferred until the asynchronous periodic refresh. Perhaps a task should track the post-write refresh action instead, as that would give us built-in tracing for it.
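
To illustrate the threading problem, here is a minimal, self-contained sketch. It is not Elasticsearch code: `Span`, `writeAndWaitForRefresh`, and the `refresh-thread` name are hypothetical stand-ins, and a scheduled executor plays the role of the periodic refresh. It only shows that a span started on the writer thread ends on a different thread, after roughly one refresh interval, so a same-thread scope cannot cover it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DeferredRefreshSpanSketch {

    /** Hypothetical span handle: records which threads started and ended it. */
    static final class Span {
        final String name;
        final String startThread = Thread.currentThread().getName();
        final long startNanos = System.nanoTime();
        volatile String endThread;
        volatile long elapsedNanos = -1;

        Span(String name) {
            this.name = name;
        }

        void end() {
            elapsedNanos = System.nanoTime() - startNanos;
            endThread = Thread.currentThread().getName();
        }
    }

    /**
     * Simulates a write that waits for the next periodic refresh.
     * The span starts on the calling (writer) thread, but it can only be
     * ended from the refresh thread, once the simulated refresh has run.
     */
    static Span writeAndWaitForRefresh(ScheduledExecutorService refresher, long refreshIntervalMillis) {
        Span span = new Span("post-write-refresh");
        CompletableFuture<Void> refreshed = new CompletableFuture<>();

        // Writer thread: register the "refresh listener" and hand off.
        refresher.schedule(() -> {
            // Refresh thread: the refresh has happened, so the span ends here.
            span.end();
            refreshed.complete(null);
        }, refreshIntervalMillis, TimeUnit.MILLISECONDS);

        // Block only so the sketch can inspect the finished span.
        refreshed.join();
        return span;
    }

    public static void main(String[] args) {
        ScheduledExecutorService refresher =
            Executors.newSingleThreadScheduledExecutor(r -> new Thread(r, "refresh-thread"));
        try {
            Span span = writeAndWaitForRefresh(refresher, 200);
            System.out.println(span.name + " started on " + span.startThread
                + ", ended on " + span.endThread
                + " after " + TimeUnit.NANOSECONDS.toMillis(span.elapsedNanos) + " ms");
        } finally {
            refresher.shutdownNow();
        }
    }
}
```

In this sketch the span's duration is dominated by the refresh interval rather than by the write itself, which is the latency that the current per-shard-task spans do not break out.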