FINUFFT version 2.4.1
Small fixes:
- Fixed cufinufft import bug found by @fzimmermann89 (Barbone, issue #707, PR #708)
- Cached the optimal thread number (# physical cores) to reduce system-call
overhead in repeated small transforms (@YuWei-CH, #697, fixing #693).
New feature:
- Python simple CUDA API for type3 (@MaximEremenko)
Minor changes: