Sequence db size != result db size #996
Comments
That is a very surprising error. Could you upload the input set so I can try to reproduce this locally?
Does this only happen with
So far yes, we have only hit this on BeeGFS. We also have another scenario hitting this error from Foldseek, but I don't know which version of MMseqs2 is embedded in that binary.
Could you please upload the full terminal output of one of the runs that crashes with this issue?
This is stdout only; stderr contains only the error from the title.
Hi,
My users reported hitting this error with MMseqs2-17 on our HPC BeeGFS parallel storage. I worked with them to isolate a small, quick way to reproduce it consistently.
We run
mmseqs easy-linclust input.faa ccl_clus tmp --min-seq-id 1.0 -c 1.0 --cov-mode 0 --threads 1 -v 3 --remove-tmp-files 1
and this works as expected. As soon as we increase --threads to 2, we hit the db size inconsistency from the title. I did some digging in the code and suspect that the issue is timing-related and comes from OpenMP scheduling in Util::ompCountLines. I see that a similar case was already worked around in issue #210. Can you comment on this and suggest a workaround or patch?
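For anyone trying to reproduce this, the one-thread versus two-thread comparison can be scripted roughly as follows. This is only a minimal sketch: the helper names `build_cmd` and `run_once` are hypothetical, it assumes `mmseqs` is on PATH and `input.faa` is in the working directory, and the flags are exactly those from the command above. It prints exit codes and any stderr output without assuming how the failure is signalled.

```python
import subprocess
import sys


def build_cmd(threads, fasta="input.faa", out_prefix="ccl_clus", tmp_dir="tmp"):
    """Return the easy-linclust command from the report for a given thread count."""
    return [
        "mmseqs", "easy-linclust", fasta, out_prefix, tmp_dir,
        "--min-seq-id", "1.0", "-c", "1.0", "--cov-mode", "0",
        "--threads", str(threads), "-v", "3", "--remove-tmp-files", "1",
    ]


def run_once(threads):
    """Run the command once and return (exit code, stripped stderr text)."""
    proc = subprocess.run(build_cmd(threads), capture_output=True, text=True)
    return proc.returncode, proc.stderr.strip()


if __name__ == "__main__":
    # Same input and flags; only --threads differs between the two runs.
    for threads in (1, 2):
        code, err = run_once(threads)
        print(f"--threads {threads}: exit code {code}")
        if err:
            print(err, file=sys.stderr)
```

Running it several times in a row on the affected file system should show whether the failure is consistent or intermittent, which would help confirm or rule out a timing-related cause.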