A unified framework for attributing model components, data, and training dynamics to model behavior.
neural-networks interpretability explainability training-dynamics data-attribution model-attribution kernel-machines
-
Updated
May 27, 2025 - Python