Use TensorIndexer for the view tests #4237
!test --diff
!test --diff
This reverts commit 03a1b69.
GitHub cannot show the diff 😢
Not totally sure how I should interpret the diff.
Searching by ^-, the generated CUDA indexing does look simpler. I'm a bit surprised to see the first code diff, though (line 11925 in the diff):
- if ((threadIdx.x + (128 * blockIdx.x)) < 120)
+ if (threadIdx.x < 120)
I'm not holding anything against this PR, since it only turns the new indexer on in the tests. But let's remove the temporary file before I stamp it. I don't want to accidentally add it to our history.
IIRC, that's because the loop ID parallelized by BIDx is actually just a broadcast ID, and the new indexer is able to simplify the index. Do you have any questions about the other changes? Any concerns? I'm planning to enable the new indexer globally by default once we are sufficiently confident in it. I'm going to enable it for some of the C++ tests for now. Manually checking the diff results seems to be the only way to gain some confidence. All the tests are passing in my local branch, but green test results alone don't necessarily mean everything is properly ported to the new indexer. I'll also check perf changes with the benchmarks, but they may not give clear signals, as indexing is only one of several possible performance bottlenecks.
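To make the simplification above concrete, here is a minimal host-side sketch, not actual nvFuser output. It assumes that the `128` in the quoted diff is the block size along x and that the broadcast ID has extent 1, so only block 0 exists along x. The function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Values taken from the quoted diff. Treating 128 as blockDim.x is an
// assumption made for this sketch.
constexpr int kBlockDimX = 128;
constexpr int kBound = 120;

// Old form: the BIDx-parallelized loop ID contributes 128 * blockIdx.x.
bool oldPredicate(int tid_x, int bid_x) {
  return (tid_x + kBlockDimX * bid_x) < kBound;
}

// New form: the index of the broadcast ID is treated as zero, so the
// blockIdx.x term drops out.
bool newPredicate(int tid_x) {
  return tid_x < kBound;
}

// Returns true if both predicates agree for every thread of every block
// in a grid with grid_dim_x blocks along x.
bool predicatesAgree(int grid_dim_x) {
  assert(grid_dim_x > 0);
  for (int bid = 0; bid < grid_dim_x; ++bid) {
    for (int tid = 0; tid < kBlockDimX; ++tid) {
      if (oldPredicate(tid, bid) != newPredicate(tid)) {
        return false;
      }
    }
  }
  return true;
}
```

Under these assumptions, `predicatesAgree(1)` holds, which is the case where the ID is a pure broadcast. With two or more blocks along x the two forms differ, so the shorter predicate is only valid because the parallelized ID really is a broadcast.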
No, it doesn't. Please download it and open it locally.
Thanks. My earlier quick scan did suggest that the indexing code is at least getting shorter, so that's a positive thing.
Are we seeing a mixed performance impact? I don't think we necessarily have to answer all those questions, but is there any significant regression that's worth investigating? I don't have much concern about this PR other than the question above. But most importantly, let's remove the diff file so I can stamp it.
!test
Thanks for getting me to double check.
Looks like tmp files are cleaned up.
🚢
Enabled TensorIndexer for the reshape tests.
I temporarily added a codegen diff result to this PR. This one is more concise because I disabled index hoisting. As far as I can see, there's no concerning change. I haven't verified everything, but I believe most of the differences arise because TensorIndexer can detect more divisible splits, which helps generate simplified indices through more aggressive contig indexing.
Once approved, I'll remove the html file.
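The point about divisible splits can be illustrated with a small host-side sketch. This is not nvFuser code: the function names and the example extents are assumptions made for illustration only. It shows that a divisible split produces no out-of-bounds iterations, and that for a contiguous 2D tensor the decomposed index `i0 * n1 + i1` equals the linear loop index, which is why the division and modulo can be dropped.

```cpp
#include <cassert>
#include <cstdint>

// Ceiling division: the outer extent produced by splitting `a` by factor `b`.
constexpr int64_t ceilDiv(int64_t a, int64_t b) {
  return (a + b - 1) / b;
}

// A split is divisible when the factor evenly divides the extent.
constexpr bool isDivisibleSplit(int64_t extent, int64_t factor) {
  return extent % factor == 0;
}

// Counts (outer, inner) iterations whose recombined index falls outside
// the original extent. Any such iteration would need a predicate.
int64_t outOfBoundsIterations(int64_t extent, int64_t factor) {
  assert(factor > 0);
  int64_t count = 0;
  for (int64_t outer = 0; outer < ceilDiv(extent, factor); ++outer) {
    for (int64_t inner = 0; inner < factor; ++inner) {
      if (outer * factor + inner >= extent) {
        ++count;
      }
    }
  }
  return count;
}

// For a contiguous [n0, n1] tensor, checks that decomposing the linear
// index into (i0, i1) and recombining with strides (n1, 1) gives back the
// linear index, so the linear index can be used directly.
bool contigIndexMatchesLinear(int64_t n0, int64_t n1) {
  for (int64_t linear = 0; linear < n0 * n1; ++linear) {
    const int64_t i0 = linear / n1;
    const int64_t i1 = linear % n1;
    if (i0 * n1 + i1 != linear) {
      return false;
    }
  }
  return true;
}
```

For example, splitting an extent of 12 by 4 leaves no out-of-bounds iterations, while splitting 10 by 4 leaves two. Proving divisibility is therefore what allows the simpler index to be used safely.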
Context
Part of #4175.
I'm planning to enable the new indexer globally by default once we are sufficiently confident with it. I'm going to enable it for some of the C++ tests for now. Just manually checking the diff results seems to be the only way to gain some confidence.
All the tests are passing in my local branch, but green test results alone don't necessarily mean everything is properly ported to the new indexer. I'll also check perf changes with the benchmarks, but they may not give clear signals, as indexing is only one of several possible performance bottlenecks.