-
-
Notifications
You must be signed in to change notification settings - Fork 231
Use selectivity in LookupJoin cost estimate #3099
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this generally makes sense. We should dig into the plan changes deeper and make sure we think they're still correct. It would be ideal to measure the execution time of some of the changed query plans and spot check they they are actually faster with the lookup joins.
…server into angela/pk_join
- Updated ExpectedPlan, ExpectedEstimates, and ExpectedAnalysis sections - Changed MergeJoin to SemiLookupJoin with simplified structure - Reduced cost estimates from 3300.000 to 2300.000 - Copied actual test output to replace expected output - Reduced test failures from 838 to 832 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
- Applied cost-only automation script (28 improvements) - Updated LeftOuterMergeJoin to LeftOuterLookupJoin (partial) - Total reduction: 34 test failures fixed - Continue systematically until zero failures 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
…o angela/pk_join
…server into angela/pk_join
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
…o angela/pk_join
Fixes dolthub/dolt#9520
Our previous LookupJoin cost estimate was not taking selectivity, the expected proportion of right-hand side rows that satisfy the join condition for each row from the left-hand side, into account, causing the cost estimate for LookupJoins to be too high. This was causing queries to be slow because LookupJoins were not getting picked when they should have.
Failing Dolt test because costs are different now. Will fix in bump PR