ilan-gold
Contributor

@ilan-gold ilan-gold commented Aug 5, 2025

A bit of an "oops" from me, but hopefully this is more robust than #3082! It worked for my one example, but testing this without a spy object seems impossible (happy to contribute that, or some other test).

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@ilan-gold ilan-gold changed the title from "fix: check for non-int fill values" to "fix: check for non-int 0 fill values" Aug 5, 2025
@github-actions github-actions bot added the "needs release notes" label Aug 5, 2025

codecov bot commented Aug 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.70%. Comparing base (0e28404) to head (5fe5391).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3345   +/-   ##
=======================================
  Coverage   94.70%   94.70%           
=======================================
  Files          79       79           
  Lines        9532     9533    +1     
=======================================
+ Hits         9027     9028    +1     
  Misses        505      505           
Files with missing lines Coverage Δ
src/zarr/core/buffer/cpu.py 100.00% <100.00%> (ø)

@d-v-b
Contributor

d-v-b commented Aug 6, 2025

To test this, could we perhaps patch np.full and np.zeros to raise exceptions that you would then catch in the test? I'm thinking there would be two test functions: one testing the various inputs that should hit the np.full branch, and another testing the various inputs that should hit the np.zeros branch. This could also work as a single test function that internally checks which numpy routine the input scalar should trigger.
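A minimal sketch of what such a test could look like. Note that `create_filled` and `routine_used` are hypothetical names used only for illustration, and the zero-check inside `create_filled` is an assumed stand-in, not the actual condition in this PR or in zarr's buffer code:

```python
from unittest import mock

import numpy as np


def create_filled(shape, dtype, fill_value=None):
    """Hypothetical stand-in for the buffer-creation code under test."""
    # Assumption: None and any zero-valued scalar (0, 0.0, np.float32(0), ...)
    # should take the np.zeros branch; everything else goes to np.full.
    if fill_value is None or (np.isscalar(fill_value) and fill_value == 0):
        return np.zeros(shape, dtype=dtype)
    return np.full(shape, fill_value, dtype=dtype)


def routine_used(fill_value, dtype=np.float32):
    """Return "zeros" or "full" depending on which numpy routine was called."""
    # Both routines are patched to raise; the message identifies the branch.
    with mock.patch.object(np, "zeros", side_effect=RuntimeError("zeros")):
        with mock.patch.object(np, "full", side_effect=RuntimeError("full")):
            try:
                create_filled((2, 2), dtype, fill_value)
            except RuntimeError as err:
                return str(err)
    raise AssertionError("neither np.zeros nor np.full was called")


if __name__ == "__main__":
    for value in (None, 0, 0.0, np.float32(0)):
        assert routine_used(value) == "zeros"
    assert routine_used(1) == "full"
```

In a real test suite the loop at the bottom would presumably become a `pytest.mark.parametrize` over the fill values, with the expected routine as a second parameter.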

Contributor

@dstansby dstansby left a comment

Looks good to me - needs a release note entry, and I left one suggestion to improve the test too.



@pytest.mark.parametrize("dtype", [np.int8, np.uint16, np.float32, int, float])
@pytest.mark.parametrize("fill_value", [None, 0, 1])
Contributor

Suggested change
- @pytest.mark.parametrize("fill_value", [None, 0, 1])
+ @pytest.mark.parametrize("fill_value", [None, 0, 0.0, 1])

Worth explicitly including a float here?

Contributor Author

I think everything is cast anyway by dtype, so it shouldn't matter, no?
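A quick way to check the casting claim, as a sketch: `same_after_cast` is a hypothetical helper, and the dtype list is copied from the test parametrization above. It only shows that the resulting arrays agree once numpy casts the fill value to the dtype; which branch (np.zeros or np.full) a float zero takes in the code under review is a separate question that the extra parameter would exercise.

```python
import numpy as np

# Same dtypes as in the parametrized test above.
DTYPES = [np.int8, np.uint16, np.float32, int, float]


def same_after_cast(fill_a, fill_b, dtype, shape=(4,)):
    """True if both fill values yield identical arrays for the given dtype."""
    a = np.full(shape, fill_a, dtype=dtype)
    b = np.full(shape, fill_b, dtype=dtype)
    return bool(a.dtype == b.dtype and np.array_equal(a, b))


if __name__ == "__main__":
    for dtype in DTYPES:
        # Compare an int zero against a float zero for every dtype.
        print(np.dtype(dtype).name, same_after_cast(0, 0.0, dtype))
```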

@dstansby dstansby added this to the 3.1.2 milestone Aug 11, 2025
@ilan-gold
Contributor Author

I've done some digging into this topic and will post shortly, but I would like to hold off on merging for now.

@ilan-gold
Contributor Author

OK, apologies. I know that Tom's PR supersedes this in many ways, but for anyone curious, my worry was about how memory is allocated.

I learned that np.zeros is only faster than np.full because it doesn't actually allocate the memory when requested; it does so lazily, later, when the memory is first needed (see https://stackoverflow.com/questions/70055063/how-is-memory-handled-once-touched-for-the-first-time-in-numpy-zeros/70056188#70056188). In benchmarks, the times of subsequent operations on the array, whether something like sum that should touch all elements or __setitem__, are identical regardless of how the array was initialized, so I definitely think this should be merged!

import numpy as np
buf = np.arange(100_000, dtype=np.float64)
# cell 1
%%timeit full = np.full((1_000_000,), 0, dtype=np.float64)

full[900_000:] = buf
# cell 2
%%timeit zeros = np.zeros((1_000_000,), dtype=np.float64)
    
zeros[900_000:] = buf
# cell 3
%%timeit full = np.full((1_000_000,), 0, dtype=np.float64)

full.sum()
# cell 4
%%timeit zeros = np.zeros((1_000_000,), dtype=np.float64)

zeros.sum()
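For anyone who wants to run the same comparison outside of IPython, here is a rough stand-alone sketch using the standard-library `timeit` module. The names `SETUPS`, `STATEMENTS` and `run_benchmarks` are hypothetical, and the `repeat`/`number` defaults are arbitrary choices. As with any `timeit` setup statement, the array is created once per repeat and is not part of the timed statement.

```python
import timeit

import numpy as np

N = 1_000_000

# Untimed setup statements: the two ways of initializing the array.
SETUPS = {
    "full": "arr = np.full((N,), 0, dtype=np.float64)",
    "zeros": "arr = np.zeros((N,), dtype=np.float64)",
}

# Timed statements: a partial write and a full read of the array.
STATEMENTS = {
    "setitem": "arr[900_000:] = buf",
    "sum": "arr.sum()",
}


def run_benchmarks(repeat=5, number=20):
    """Return {(init, op): best seconds per loop} for every combination."""
    namespace = {"np": np, "N": N, "buf": np.arange(100_000, dtype=np.float64)}
    results = {}
    for init_name, setup in SETUPS.items():
        for op_name, stmt in STATEMENTS.items():
            times = timeit.repeat(
                stmt, setup=setup, repeat=repeat, number=number, globals=namespace
            )
            results[(init_name, op_name)] = min(times) / number
    return results


if __name__ == "__main__":
    for (init_name, op_name), seconds in run_benchmarks().items():
        print(f"{init_name:>5} then {op_name:<7}: {seconds * 1e6:10.1f} us per loop")
```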
[Image: Screenshot 2025-08-25 at 16 46 16]
