-
Notifications
You must be signed in to change notification settings - Fork 5.1k
[APX] fix a few emitter bugs. #117791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[APX] fix a few emitter bugs. #117791
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Hi @tannergooding, while waiting for the CI, would you please share some thoughts on this, shall we fix this bug first before I start to work on #117326? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes two APX emitter bugs related to instruction encoding: correcting LEA encoding and disabling EVEX compressed displacement for APX extended EVEX instructions. The changes address issues where APX extended EVEX instructions were incorrectly applying compressed displacement optimization, which could enable unexpected features since these instructions have a constant scaling factor of 1.
- Adds tuple type information checks to prevent compressed displacement optimization on APX extended EVEX instructions
- Fixes LEA encoding by adding the X86 prefix when needed
- Updates displacement handling logic to properly differentiate between original EVEX and extended EVEX instructions
Failures should be unrelated timeout error. PR is ready for review. c.c @dotnet/intel |
Hi @tannergooding, we have a few bug fixes for APX, this PR is expected to target on RC1, please take a look, thanks! |
Thanks for the quick response! |
This PR fixes 2 bugs:
ShiftRotate
andMul
.A few more explanation for the second fix:
In the current context, APX extended EVEX is specifically referring to the EVEX functionality for those instructions originally without EVEX and promoted to EVEX space because of some new features like: NDD, NF, ZU, etc. As these instructions do not have tuple information and the scaling factor N when using disp8 is constantly treated as 1, so the optimization in
TryEvexCompressedDisplacement
, where the optimization tries to use embedded broadcast to get a smaller displacement - disp8 instead of disp32, is not really necessary for this group of instructions as the scaling factor N is not affected by EVEX.b, whether to use disp8 or disp32 should purely depend ondspInByte
. (And in fact EVEX.b has different semantic in the extended EVEX case, hence this bug adds EVEX to legacy instructions in some unnecessary scenarios and enables some unexpected features - EVEX.NF as how compressed displacement is defined on instruction descriptor.). The PR fixes this by skipping the optimization on instruction without tuple information.p.s. I understand that we have logged an issue improving the emitter logics for APX-EVEX: letting
TakesEvexPrefix
andIsEvexEncodableInstruction
manage all the EVEX encodable instructions including APX extended ones. But it might be better to have the emitter in a good state (clean SuperPMI replay with APX on) before we start to improve the code quality?p.p.s More explanation for the third fix:
When doing code generating for
ShftRotate
andMul
, there are some optimizations that BMI instructions will be used. Without BMI instructions, Shiftrotate and Mul nodes will be mapped to legacy instructions so EVEX is not required for EGPR access since APX provides REX2, but if there are optimizations that these nodes will be mapped to BMI instructions, EVEX is required, so we choose a conservative strategy and mask out EGPRs in register allocator for these nodes when EVEX is not available.