fix quantization save and load error #21504


Open · divyashreepathihalli wants to merge 3 commits into master

Conversation

divyashreepathihalli (Collaborator)

Fixes: #21378

When layers are quantized, their variable state changes: the number and names of their variables no longer match the original layer. This was causing Keras model saving and loading to fail.

@gemini-code-assist (bot) left a comment
Summary of Changes

Hello @divyashreepathihalli, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug where Keras models containing quantized layers failed to save and load correctly. The changes introduce specific logic within the layer's variable management to properly handle the serialization and deserialization of quantized weights, ensuring that the state of quantized models can be persistently stored and retrieved.

Highlights

  • Quantization Save Logic: Modified the save_own_variables method in keras/src/layers/layer.py to differentiate between quantized and non-quantized variables. Quantized variables are now saved with a quantized_ prefix, while non-quantized trainable variables are saved without a prefix. This ensures that the state of quantized layers is correctly serialized (see the sketch after this list).
  • Quantization Load Logic: Updated the load_own_variables method in keras/src/layers/layer.py to detect if the saved state contains quantized_ prefixed keys. If so, it delegates to a new _load_quantized_variables helper, which correctly loads both quantized and non-quantized variables based on their respective keys.
  • Quantized Variable Identification: Introduced a new private helper method _get_quantized_variables in keras/src/layers/layer.py. This method identifies quantized variables by checking if their dtype is not a float, allowing for proper segregation during save and load operations.
  • New Test Case for Quantization: Added a new test test_quantized_model_save_and_load in keras/src/layers/layer_test.py. This test verifies that a Keras model, after being quantized, can be successfully saved to disk and then loaded back, ensuring the fix addresses the reported issue.

@gemini-code-assist (bot) left a comment

Code Review

The PR fixes an issue with saving and loading quantized models. The changes correctly separate quantized and non-quantized variables. However, I found a critical issue where non-trainable float variables (like those in BatchNormalization) would be dropped during saving and loading of a quantized model. I've suggested fixes for this in both save_own_variables and _load_quantized_variables. I also suggested some minor performance and readability improvements. The added test case is good but doesn't cover the case with non-trainable float variables.

@codecov-commenter commented Jul 23, 2025

Codecov Report

Attention: Patch coverage is 54.90196% with 23 lines in your changes missing coverage. Please review.

Project coverage is 82.85%. Comparing base (129e3d7) to head (6daa6a5).
Report is 1 commit behind head on master.

Files with missing lines | Patch % | Lines
keras/src/layers/layer.py | 20.68% | 21 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #21504      +/-   ##
==========================================
+ Coverage   78.00%   82.85%   +4.84%     
==========================================
  Files         565      565              
  Lines       55701    55755      +54     
  Branches     8691     8700       +9     
==========================================
+ Hits        43451    46194    +2743     
+ Misses      10212     7444    -2768     
- Partials     2038     2117      +79     
Flag | Coverage Δ
keras | 82.66% <54.90%> (+4.81%) ⬆️
keras-jax | 63.45% <54.90%> (?)
keras-numpy | 58.66% <54.90%> (-0.01%) ⬇️
keras-openvino | 33.97% <3.92%> (-0.03%) ⬇️
keras-tensorflow | 63.88% <54.90%> (-0.02%) ⬇️
keras-torch | 63.52% <54.90%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown.
@hertschuh (Collaborator) left a comment

First, what is the root cause of the bug, and why does this change fix it?

Regardless of whether we quantize or not, we do v.assign. So what's the issue? The order of variables?

def _get_quantized_variables(self):
    quantized_vars = []
    for v in self._trainable_variables + self._non_trainable_variables:
        if not backend.is_float_dtype(v.dtype):
A Collaborator commented on this snippet:
This assumes that all integral variables come from quantization. But what if you have variables that intrinsically represent ints and are unrelated to quantization? We definitely have layers using int vars.

divyashreepathihalli (Author) replied:

Changed this to check for _is_quantized instead.

@divyashreepathihalli commented:

The issue is that Keras's standard save-and-load function relies on a basic, index-based system to match up the variables. This process completely breaks for a quantized model because the number and names of the variables no longer match what the saved file expects.

So, when Keras tried to load the model, it was looking for the original variables but found the new, quantized ones instead.

@hertschuh commented Jul 24, 2025

So the unit test you have passes without any of your changes, which is consistent with this comment: #21378 (comment). I don't think this is addressing the issue.

Have you tested with Gemma?

> The issue is that Keras's standard save-and-load function relies on a basic, index-based system to match up the variables. This process completely breaks for a quantized model because the number and names of the variables no longer match what the saved file expects.

But your logic relies on having the right number of variables in some order; all it changes is the order. If the kernel_scale variable is missing in the dense layer, it's still going to fail.

> So, when Keras tried to load the model, it was looking for the original variables but found the new, quantized ones instead.

That makes me think that the issue has to do with turning on ("building") the quantization before reloading the variables.

store[f"{i}"] = v
return

# Case: quantized layer
A Collaborator commented on this snippet:
Can you invert the if? If getattr(self, "_is_quantized", False): ... (more readable)

Development

Successfully merging this pull request may close these issues.

Keras Fails to load quantized model (#21378)
5 participants