fix quantization save and load error #21504
base: master
Conversation
Summary of Changes
Hello @divyashreepathihalli, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a critical bug where Keras models containing quantized layers failed to save and load correctly. The changes introduce specific logic within the layer's variable management to properly handle the serialization and deserialization of quantized weights, ensuring that the state of quantized models can be persistently stored and retrieved.
Highlights
- Quantization Save Logic: Modified the `save_own_variables` method in `keras/src/layers/layer.py` to differentiate between quantized and non-quantized variables. Quantized variables are now saved with a `quantized_` prefix, while non-quantized trainable variables are saved without a prefix. This ensures that the state of quantized layers is correctly serialized (see the sketch after this list).
- Quantization Load Logic: Updated the `load_own_variables` method in `keras/src/layers/layer.py` to detect whether the saved state contains `quantized_`-prefixed keys. If so, it delegates to a new `_load_quantized_variables` helper, which correctly loads both quantized and non-quantized variables based on their respective keys.
- Quantized Variable Identification: Introduced a new private helper method, `_get_quantized_variables`, in `keras/src/layers/layer.py`. It identifies quantized variables by checking whether their `dtype` is not a float, allowing for proper segregation during save and load operations.
- New Test Case for Quantization: Added a new test, `test_quantized_model_save_and_load`, in `keras/src/layers/layer_test.py`. It verifies that a Keras model, after being quantized, can be successfully saved to disk and then loaded back, ensuring the fix addresses the reported issue.
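A minimal sketch of the save/load split described in the highlights. The `quantized_` key prefix and the `_load_quantized_variables` delegation come from the summary above; the class stub, the `_is_quantized` flag check, and the exact key layout are assumptions, not the PR's actual diff.

```python
from keras.src import backend


class Layer:  # illustrative stub, not the real keras.src.layers.Layer
    def save_own_variables(self, store):
        all_vars = self._trainable_variables + self._non_trainable_variables
        if getattr(self, "_is_quantized", False):  # assumed flag check
            for i, v in enumerate(all_vars):
                if not backend.is_float_dtype(v.dtype):
                    # Quantized state (e.g. int8 kernels) gets a prefix...
                    store[f"quantized_{i}"] = v
                else:
                    # ...while float variables keep plain index keys.
                    store[f"{i}"] = v
            return
        for i, v in enumerate(all_vars):
            store[f"{i}"] = v

    def load_own_variables(self, store):
        if any(key.startswith("quantized_") for key in store.keys()):
            # Delegate to the helper the PR introduces for mixed state.
            self._load_quantized_variables(store)
            return
        all_vars = self._trainable_variables + self._non_trainable_variables
        for i, v in enumerate(all_vars):
            v.assign(store[f"{i}"])
```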
Code Review
The PR fixes an issue with saving and loading quantized models. The changes correctly separate quantized and non-quantized variables. However, I found a critical issue where non-trainable float variables (like those in `BatchNormalization`) would be dropped during saving and loading of a quantized model. I've suggested fixes for this in both `save_own_variables` and `_load_quantized_variables`. I also suggested some minor performance and readability improvements. The added test case is good, but it doesn't cover the case with non-trainable float variables.
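The review's concern could be made concrete with a test along these lines. This is a sketch of the missing coverage, not the PR's test, and it assumes `Model.quantize` skips layers (such as `BatchNormalization`) that don't support int8 quantization.

```python
import numpy as np
import keras


def test_quantized_model_with_batchnorm_save_and_load(tmp_path):
    model = keras.Sequential(
        [
            keras.Input(shape=(4,)),
            keras.layers.Dense(8),
            keras.layers.BatchNormalization(),
        ]
    )
    model.quantize("int8")
    path = str(tmp_path / "model.keras")
    model.save(path)
    loaded = keras.saving.load_model(path)
    # BatchNormalization's moving_mean / moving_variance are float and
    # non-trainable; they must survive the round trip alongside the
    # int8 weights instead of being silently dropped.
    for a, b in zip(model.weights, loaded.weights):
        np.testing.assert_allclose(np.array(a), np.array(b))
```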
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #21504 +/- ##
==========================================
+ Coverage 78.00% 82.85% +4.84%
==========================================
Files 565 565
Lines 55701 55755 +54
Branches 8691 8700 +9
==========================================
+ Hits 43451 46194 +2743
+ Misses 10212 7444 -2768
- Partials 2038 2117 +79
Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Force-pushed from 5b624ce to 886bfbe.
First, what is the root cause of the bug, and why does this change fix it?
Regardless of whether we quantize or not, we do `v.assign`. So what's the issue? The order of variables?
keras/src/layers/layer.py
Outdated
def _get_quantized_variables(self): | ||
quantized_vars = [] | ||
for v in self._trainable_variables + self._non_trainable_variables: | ||
if not backend.is_float_dtype(v.dtype): |
This assumes that all integral variables come from quantization. But what if you have variables that intrinsically represent ints and are unrelated to quantization? We definitely have layers using int vars.
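For instance, integer state can exist for reasons entirely unrelated to quantization; `keras.random.SeedGenerator`, for one, keeps its RNG state in a non-float variable. The layer below is a hypothetical illustration, not one from the Keras codebase:

```python
import keras


class CountingLayer(keras.layers.Layer):
    """Hypothetical layer with intrinsically integer, non-quantized state."""

    def build(self, input_shape):
        self.call_count = self.add_weight(
            name="call_count",
            shape=(),
            dtype="int64",
            trainable=False,
            initializer="zeros",
        )

    def call(self, x):
        # An int variable that a dtype-based check would misclassify
        # as quantization state.
        self.call_count.assign_add(1)
        return x
```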
Changed this to check for `_is_quantized` instead.
The issue is that Keras's standard save-and-load function relies on a basic, index-based system to match up the variables. This process completely breaks for a quantized model because the number and names of the variables no longer match what the saved file expects. So when Keras tried to load the model, it was looking for the original variables but found the new, quantized ones instead.
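The mismatch is easy to see on a single layer. A rough illustration follows; the exact variable names and dtypes after quantization are assumptions, so check them on your build:

```python
import keras

layer = keras.layers.Dense(4)
layer.build((None, 8))
print([(v.name, v.dtype) for v in layer.weights])
# e.g. [('kernel', 'float32'), ('bias', 'float32')]

layer.quantize("int8")
print([(v.name, v.dtype) for v in layer.weights])
# e.g. [('kernel', 'int8'), ('bias', 'float32'), ('kernel_scale', 'float32')]
# A freshly built Dense has two variables; the quantized one has three,
# so index-based matching between the two states falls apart.
```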
So the unit test you have passes without any of your changes, which is consistent with this comment: #21378 (comment). Have you tested with Gemma?
But your logic relies on having the right number of variables in some order. All it's changing is the order. If the
That makes me think that the issue has to do with turning on ("building") the quantization before reloading the variables.
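The comment's hypothesis, sketched as the order of operations that would make index-based loading line up. `build_model` is a hypothetical stand-in for reconstructing the saved architecture:

```python
import keras

model = build_model()   # hypothetical: same architecture as the saved model
model.quantize("int8")  # recreate the quantized variable structure first
# Only now do variable counts and order match the checkpoint:
model.load_weights("quantized.weights.h5")
```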
    store[f"{i}"] = v
    return

    # Case: quantized layer
Can you invert the `if`? `if getattr(self, "_is_quantized", False): ...` (more readable)
Fixes: #21378
When layers are quantized, their state changes. This was causing the Keras saving and loading to fail.
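For context, a minimal reproduction of the reported failure. This is a sketch; the exact exception and message depend on the backend and Keras version:

```python
import keras

model = keras.Sequential([keras.Input(shape=(8,)), keras.layers.Dense(4)])
model.quantize("int8")
model.save("quantized.keras")

# Before this fix, reloading failed with a variable count/shape mismatch:
restored = keras.saving.load_model("quantized.keras")
```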