chore: parallel kafka consumer #17262
Conversation
pkg/ingester/kafka_consumer.go (Outdated)

// success keeps track of the records that were processed. It is expected to
// be sorted in ascending order of offset since the records themselves are
// ordered.
success := make([]int64, len(records))
I think one thing we need to be careful of, and I didn't consider this at the start, is that we don't set a limit on the maximum number of records that we can poll from Kafka. I would suggest testing this in dev (and also putting a custom image in ops with flux-ignore) to verify. In most cases, this slice should be less than 10KB.
Yes, I was already planning to test on dev before we move forward.
I'll take a look at ops too, thanks.
I suspect this will be fine considering an additional int per record is likely significantly smaller than the records themselves. Good to be aware of, but I doubt we'll see this in profiles at all.
one nit, then lgtm
// Find the highest offset before a gap, and commit that.
var highestOffset int64
for _, offset := range success {
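The gap-detection loop above is truncated in the diff, but its intent can be sketched as follows. This is a minimal, hypothetical reconstruction (the helper name highestBeforeGap and the ok return are assumptions, not the PR's actual code): walk the sorted offsets and stop at the first non-contiguous one, since committing past a gap would skip an unprocessed record.

```go
package main

import "fmt"

// highestBeforeGap returns the highest offset that is safe to commit: the
// last offset in the initial contiguous run of successfully processed
// records. Offsets are assumed to be sorted ascending, matching record
// order. ok is false when no record was processed.
func highestBeforeGap(success []int64) (highest int64, ok bool) {
	if len(success) == 0 {
		return 0, false
	}
	highest = success[0]
	for _, offset := range success[1:] {
		if offset != highest+1 {
			break // gap found: stop before it
		}
		highest = offset
	}
	return highest, true
}

func main() {
	// Offsets 10..12 processed, 13 failed, 14 processed: commit up to 12.
	offset, ok := highestBeforeGap([]int64{10, 11, 12, 14})
	fmt.Println(offset, ok)
}
```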
nit: you might want to make `success` a `[]*int` to handle the case where offset == 0, which iirc is still valid for the first record in a partition.
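The nit above is about a sentinel-value ambiguity: with a plain `[]int64`, a zero entry could mean either "record at offset 0 was processed" or "this slot was never written". A pointer slice makes the distinction explicit, as in this sketch (highestContiguous is a hypothetical helper, not the PR's code):

```go
package main

import "fmt"

// highestContiguous scans a []*int64 where a nil entry means "not yet
// processed". Using pointers leaves offset 0 available as a valid value
// for the first record in a partition.
func highestContiguous(success []*int64) (highest int64, ok bool) {
	for _, offset := range success {
		if offset == nil {
			break // first unprocessed record: stop here
		}
		highest = *offset
		ok = true
	}
	return highest, ok
}

func main() {
	p := func(v int64) *int64 { return &v }
	// Records at offsets 0 and 1 processed, record 2 still pending.
	fmt.Println(highestContiguous([]*int64{p(0), p(1), nil}))
}
```

Note that ok distinguishes "committed up to offset 0" from "nothing processed", which a bare int64 return could not.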
Co-authored-by: George Robinson <[email protected]> Signed-off-by: Jackson Coelho <[email protected]>
What this PR does / why we need it:
Updates our Kafka consumer to process records in parallel and commit the highest safe offset from each polled batch.
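The overall pattern the PR describes (fan out record processing, then collect per-record results for offset commit) could look roughly like this sketch. Record, process, and consumeParallel are hypothetical stand-ins under assumed semantics, not the actual ingester code:

```go
package main

import (
	"fmt"
	"sync"
)

// Record is a simplified stand-in for a Kafka record.
type Record struct{ Offset int64 }

// consumeParallel processes records concurrently and returns the offset of
// each record that succeeded, in input order. A nil entry marks a failed
// (or unprocessed) record, so offset 0 stays unambiguous.
func consumeParallel(records []Record, process func(Record) error) []*int64 {
	success := make([]*int64, len(records))
	var wg sync.WaitGroup
	for i, r := range records {
		wg.Add(1)
		go func(i int, r Record) {
			defer wg.Done()
			if process(r) == nil {
				off := r.Offset // copy before taking the address
				success[i] = &off
			}
		}(i, r)
	}
	wg.Wait()
	return success
}

func main() {
	records := []Record{{Offset: 0}, {Offset: 1}, {Offset: 2}}
	success := consumeParallel(records, func(r Record) error { return nil })
	for _, s := range success {
		fmt.Print(*s, " ")
	}
	fmt.Println()
}
```

After wg.Wait(), the caller would scan success for the highest contiguous run and commit that offset, so a failure mid-batch never causes later records to be committed past a gap.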
Which issue(s) this PR fixes:
Fixes https://github.com/grafana/loki-private/issues/1547
Special notes for your reviewer:
Checklist
- Reviewed the CONTRIBUTING.md guide (required)
- feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
- Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
- If the change is deprecating or deleting a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR