-
Notifications
You must be signed in to change notification settings - Fork 2
More validation for partial read parameters #116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #116 +/- ##
==========================================
- Coverage 97.09% 95.80% -1.30%
==========================================
Files 8 8
Lines 241 262 +21
==========================================
+ Hits 234 251 +17
- Misses 7 11 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
try | ||
zeros(count, obj.Datatype.MATLABType); | ||
catch ME | ||
if strcmp(ME.identifier, 'MATLAB:array:SizeLimitExceeded') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you have to use IF here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I only want to check for the error related to exceeding size limit. I am choosing to silently swallow any other possible errors (rare/unlikely edge-cases). We are trying to create an array of zeros here just as a way to see if reading a similarly-sized array from Zarr file will exceed the memory limit (unfortunately couldn't find a better way) , so my thinking is that any other errors would be unrelated, and we shouldn't throw them and interrupt user workflow. If something goes wrong with creating an array if zeros that is unrelated to memory limit, we should just try reading from the zarr file anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying this.
Changes include:
Preventing a crash when trying to read too much data ( Fixes MATLAB crashed while using zarrread function to read a big (2317GB) array #108 )
a. The crash is caused by running out of memory, and the best we can do is error out instead of crashing. There isn't an easy way to determine memory limit in MATLAB, so doing a very hacky workaround of trying to create a zeros array of same size and datatype and seeing if it errors. Luckily creating an array like that is efficient and errors out quickly. At least this is better than accidentally causing the whole MATLAB to crash.
b. To know the datatype before reading the data I am using the results from
zarrinfo
(instead of introspection on already read data). To determine datatype fromzarrinfo
I needed to addfromZarrType
method to ZarrDatatype.mc. Added a test for this in tZarrRead -
tooBigArray
.d. Example of the proposed error when reading data that's too big:
Start too big
Before:
After:
Stride too big
Before:
After:
Combination of Count and Start is too big
Before:
After:
But logical values do not make sense for Start/Stride/Count, so I added a
mustBeNumeric
validator. Now the behavior is clearer: