Switch to a custom pebble merger for codesearch (#9512)
Previously, we relied on the default pebble merger, which appended
merged values together, to store posting lists. When unmarshalling
posting lists, we would just repeatedly read from the buffer until we
exhausted it.
This was simple and worked, but over the long term it causes the
index to grow in proportion to the number of updates: each time new
postings are added to an existing list, we write out an entirely new
roaring bitmap next to the old ones, rather than adding the new
postings to the existing roaring bitmap.
The merger included in this PR merges posting lists together (using
`Or`), which will result in a single serialized roaring bitmap per entry,
rather than N concatenated bitmaps.
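
As a rough illustration of the difference between appending and merging, here is a minimal, stdlib-only sketch. It is not the code in this PR: a sorted list of fixed-width `uint32` IDs stands in for a serialized roaring bitmap, a map-based set union stands in for roaring's `Or`, and `unionMerger`, `encodePostings` and `decodePostings` are hypothetical names. The `MergeNewer` / `MergeOlder` / `Finish` methods only loosely mirror the shape of pebble's `ValueMerger`, with simplified signatures.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// encodePostings serializes a posting list as big-endian uint32s.
// Assumption: this is a stand-in for a serialized roaring bitmap.
func encodePostings(ids []uint32) []byte {
	buf := make([]byte, 4*len(ids))
	for i, id := range ids {
		binary.BigEndian.PutUint32(buf[4*i:], id)
	}
	return buf
}

// decodePostings is the inverse of encodePostings.
func decodePostings(buf []byte) ([]uint32, error) {
	if len(buf)%4 != 0 {
		return nil, fmt.Errorf("posting list has %d bytes, not a multiple of 4", len(buf))
	}
	ids := make([]uint32, 0, len(buf)/4)
	for i := 0; i < len(buf); i += 4 {
		ids = append(ids, binary.BigEndian.Uint32(buf[i:]))
	}
	return ids, nil
}

// unionMerger (hypothetical) accumulates the union of every operand it sees.
type unionMerger struct {
	set map[uint32]struct{}
}

// newUnionMerger starts a merge from the first operand.
func newUnionMerger(value []byte) (*unionMerger, error) {
	m := &unionMerger{set: make(map[uint32]struct{})}
	if err := m.add(value); err != nil {
		return nil, err
	}
	return m, nil
}

func (m *unionMerger) add(value []byte) error {
	ids, err := decodePostings(value)
	if err != nil {
		return err
	}
	for _, id := range ids {
		m.set[id] = struct{}{}
	}
	return nil
}

// Set union is commutative, so operand order does not matter here.
func (m *unionMerger) MergeNewer(value []byte) error { return m.add(value) }
func (m *unionMerger) MergeOlder(value []byte) error { return m.add(value) }

// Finish returns one serialized, sorted, de-duplicated posting list.
func (m *unionMerger) Finish() []byte {
	ids := make([]uint32, 0, len(m.set))
	for id := range m.set {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return encodePostings(ids)
}

func main() {
	a := encodePostings([]uint32{1, 5, 9})
	b := encodePostings([]uint32{5, 7})

	// Append-style merge: the stored value is simply a ++ b.
	appended := append(append([]byte{}, a...), b...)

	// Union-style merge: the stored value is a single combined list.
	m, err := newUnionMerger(a)
	if err != nil {
		panic(err)
	}
	if err := m.MergeNewer(b); err != nil {
		panic(err)
	}
	merged := m.Finish()

	fmt.Println(len(appended), len(merged)) // prints "20 16"
}
```

With appending, the stored value grows with every update, even when the new postings are already present. With a union, its size depends only on the distinct postings in the entry.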
More on pebble mergers:
https://pkg.go.dev/github.com/cockroachdb/[email protected]/internal/base#Merger
and
https://pkg.go.dev/github.com/cockroachdb/[email protected]/internal/base#ValueMerger.
Important note: the merger name is stored in the pebble db. Trying to
open an existing db with a different merger is an error. So, when this
change goes in, we'll need to blow away the existing code search dbs and
re-index the repos. This PR will stay in draft until we have a plan for
that.