Add the capability to do adjoint transforms #633

Open: wants to merge 48 commits into base: master
Commits
1180e27
first version; adjoint still completely untested
mreineck Feb 18, 2025
e7bd659
add Python interface; fix a few bugs
mreineck Feb 18, 2025
fe0dcdc
add tests; add overlooked file
mreineck Feb 18, 2025
e148935
use assertions that are more helpful in case of errors
mreineck Feb 18, 2025
b7d7ce1
test commit to see if tests pass with increased tolerance for adjoint …
mreineck Feb 19, 2025
f54a1dd
increase overall requested plan accuracy for adjoint type 3 transform…
mreineck Feb 19, 2025
d50a912
be more clever about memory consumption
mreineck Feb 19, 2025
a4c313f
document new parameters for execute_internal()
mreineck Feb 24, 2025
2692658
Merge remote-tracking branch 'origin/master' into add_adjoint
mreineck Feb 24, 2025
eafcf3e
update CHANGELOG
mreineck Feb 24, 2025
7f19185
merge master
mreineck Feb 26, 2025
0fa2a4e
more comments
mreineck Feb 26, 2025
da3df92
small tweak
mreineck Feb 26, 2025
ec969b5
merge master
mreineck Mar 3, 2025
5a61c86
fix typos
mreineck Mar 22, 2025
9bc0e02
merge master
mreineck Apr 2, 2025
3dbd742
merge master
mreineck Apr 3, 2025
ee21762
merge master
mreineck Apr 14, 2025
d17dc7a
add explanations for obscure tricks
mreineck Apr 14, 2025
886cdf9
merge master
mreineck Apr 30, 2025
8846fea
merge master
mreineck Jun 15, 2025
e2e0e5c
merge master
mreineck Jun 24, 2025
a04172f
added an example
DiamonDinoia Jun 24, 2025
1c7824c
create FFTW plans for adjoint transforms
mreineck Jun 25, 2025
ab23ae1
better variable names and debug prints
mreineck Jun 25, 2025
2f199c2
add docstring for execute_adjoint
mreineck Jun 25, 2025
4a87d06
comments
mreineck Jun 25, 2025
01e88ff
corrected doc comments in example/guru2d1_adjoint.cpp
ahbarnett Jun 25, 2025
478b898
add execute_adjoint to C/C++ guru doc strings
ahbarnett Jun 25, 2025
a4e980b
execute_adjoint fully described in docs/c.rst
ahbarnett Jun 25, 2025
0909357
execute_adjoint added to docs/cex.rst
ahbarnett Jun 25, 2025
bc8afbe
matlab mwrap interface add execute_adjoint
ahbarnett Jun 25, 2025
ff54c2a
actually add matlab execute_adjoint
ahbarnett Jun 25, 2025
08029a1
exec_adj into matlab docs and overview.src
ahbarnett Jun 25, 2025
3d6a993
exec_adj matlab docs and example/guru1d1_adjoint.m
ahbarnett Jun 25, 2025
dbbd266
exec_adj to matlab.rst and Contents.m
ahbarnett Jun 25, 2025
ed19475
mention additional benefits of the change
mreineck Jun 26, 2025
02eda19
merge master
mreineck Jun 26, 2025
733f025
add Fortran adjoint interface and example
mreineck Jun 26, 2025
20a5bc0
add adjoint to Fortran docs
mreineck Jun 26, 2025
e3d9da3
execute_adjoint in Fortran, with example. No docs yet
ahbarnett Jun 26, 2025
7d7056a
Merge branch 'add_adjoint' of https://github.com/flatironinstitute/fi…
ahbarnett Jun 26, 2025
225f6ed
add Martin's guru1d2_adjoint.f to makefile
ahbarnett Jun 26, 2025
0e206f2
negate iflags in Martin's guru1d2_adjoint{f}.f
ahbarnett Jun 26, 2025
6312e20
both fort adj examples in doc page
ahbarnett Jun 26, 2025
f74c302
simplified guru1d2_adjoint{f}.f
ahbarnett Jun 26, 2025
2526214
make octave runs adjoint example
ahbarnett Jun 26, 2025
7504635
merge master
mreineck Jul 12, 2025
6 changes: 6 additions & 0 deletions CHANGELOG
@@ -3,6 +3,12 @@ If not stated, FINUFFT is assumed (cuFINUFFT <=1.3 is listed separately).

Master (working towards v2.5.0), 7/8/25

* Added functionality for adjoint execution of FINUFFT plans (Reinecke #633,
addresses #566 and #571).
Work arrays are now only allocated during plan execution, reducing overall
memory consumption.
A single plan can now safely be executed by several threads concurrently.

V 2.4.1 7/8/25

* Update Python cufinufft unit tests to use complex dtypes (Andén, #705).
18 changes: 12 additions & 6 deletions docs/c.rst
@@ -53,7 +53,7 @@ with the word "many" in the function name) perform ``ntr`` transforms with the s

.. note::

The motivations for the vectorized interface (and guru interface, see below) are as follows. 1) It is more efficient to bin-sort the nonuniform points only once if there are not to change between transforms. 2) For small problems, certain start-up costs cause repeated calls to the simple interface to be slower than necessary. In particular, we note that FFTW takes around 0.1 ms per thread to look up stored wisdom, which for small problems (of order 10000 or less input and output data) can, sadly, dominate the runtime.
The motivations for the vectorized interface (and guru interface, see below) include the following. 1) It is more efficient to bin-sort the nonuniform points only once if they are not to change between transforms. 2) For small problems, certain start-up costs cause repeated calls to the simple interface to be slower than necessary. In particular, we note that FFTW takes around 0.1 ms per thread to look up stored wisdom, which for small problems (of order 10000 or fewer input and output data) can, sadly, dominate the runtime.


1D transforms
@@ -77,13 +77,19 @@ with the word "many" in the function name) perform ``ntr`` transforms with the s
Guru plan interface
-------------------

This provides more flexibility than the simple or vectorized interfaces.
This provides more flexibility than either the simple or the vectorized interface.
Any transform requires (at least)
calling the following four functions in order. However, within this
sequence one may insert repeated ``execute`` calls, or another ``setpts``
followed by more ``execute`` calls, as long as the transform sizes (and number of transforms ``ntr``) are
calling four of the following five functions in order. However, within this
sequence one may insert repeated ``execute`` and/or ``execute_adjoint`` calls,
or another ``setpts``
followed by more ``execute`` and/or ``execute_adjoint`` calls, as long as the transform sizes (and number of transforms ``ntr``) are
consistent with those that have been set in the ``plan`` and in ``setpts``.
Keep in mind that ``setpts`` retains *pointers* to the user's list of nonuniform points, rather than copying these points; thus the user must not change their nonuniform point arrays until after any ``execute`` calls that use them.
Keep in mind that ``setpts`` retains *pointers* to the user's list of nonuniform points, rather than copying these points; thus the user must not change their nonuniform point arrays until after any ``execute`` or ``execute_adjoint`` calls that use them.
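
The pointer-retention hazard described above can be illustrated with a toy model. This is only a sketch: ``ToyPlan`` and its methods are hypothetical stand-ins invented for illustration, not the FINUFFT API, and the "transform" is a plain weighted sum. The one assumption carried over from the text is that the plan keeps a reference to the caller's point array and rereads it at execution time.

```python
class ToyPlan:
    """Hypothetical stand-in for a plan object; not the FINUFFT API."""

    def setpts(self, x):
        # Keep a reference to the caller's list (no copy), mimicking a
        # retained pointer to the nonuniform points.
        self.x = x

    def execute(self, c):
        # The points are read at execution time, not at setpts time.
        return sum(cj * xj for cj, xj in zip(c, self.x))


x = [1.0, 2.0, 3.0]
plan = ToyPlan()
plan.setpts(x)
before = plan.execute([1.0, 1.0, 1.0])  # uses x as it was passed to setpts
x[0] = 100.0                            # caller modifies the array in place
after = plan.execute([1.0, 1.0, 1.0])   # silently sees the modified points
```

In this toy model the second result differs from the first even though ``setpts`` was never called again, which is the kind of surprise the rule about leaving the point arrays untouched is meant to prevent.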

The goal of the ``execute_adjoint`` feature (fully supported in v2.5.0)
is to allow the
common use-case of transform and adjoint transform pairs to be accessible
via a single plan stage and a single setpts call.
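
What "transform and adjoint transform pair" means can be checked numerically with slow direct sums. The sketch below is not how FINUFFT computes anything; the helper names (``modes``, ``type1``, ``type2``, ``inner``) are made up for illustration, and the 1D mode ordering from -N/2 to (N-1)/2 is assumed. It verifies that a type 2 with the opposite sign behaves as the adjoint of a type 1 on the same points, via the inner-product identity <f, A c> = <A^H f, c>.

```python
import cmath
import math
import random


def modes(N):
    # Integer frequencies k with -N/2 <= k <= (N-1)/2 (assumed ordering).
    return range(-(N // 2), (N - 1) // 2 + 1)


def type1(x, c, N, isign):
    # Direct-sum type 1: strengths c at points x -> N Fourier coefficients.
    return [sum(cj * cmath.exp(1j * isign * k * xj) for xj, cj in zip(x, c))
            for k in modes(N)]


def type2(x, f, N, isign):
    # Direct-sum type 2: N Fourier coefficients -> values at points x.
    return [sum(fk * cmath.exp(1j * isign * k * xj)
                for k, fk in zip(modes(N), f))
            for xj in x]


def inner(u, v):
    # Complex inner product, conjugate-linear in the first argument.
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


random.seed(0)
M, N, isign = 7, 5, +1
x = [random.uniform(-math.pi, math.pi) for _ in range(M)]
c = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(M)]
f = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]

lhs = inner(f, type1(x, c, N, isign))    # <f, A c>
rhs = inner(type2(x, f, N, -isign), c)   # <A^H f, c>, A^H = type 2, sign flipped
```

Both transforms read the same ``x``, which is why a single plan and a single ``setpts`` call can serve the pair.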

.. note::

10 changes: 7 additions & 3 deletions docs/cex.rst
@@ -236,6 +236,8 @@ previous wisdom which would be significant when doing many small transforms.
You may also send in a new
set of stacked strength data (for type 1 and 3, or coefficients for type 2),
reusing the existing FFTW plan and sorted points.
Finally, you may execute *adjoints* of the planned transforms without
re-planning, making forward-adjoint transform pairs very convenient.
Now we redo the above 2D type 1 C++ example with the guru interface.

One first makes a plan giving transform parameters, but no data:
@@ -254,6 +256,7 @@ One first makes a plan giving transform parameters, but no data:
// step 3: do the planned transform to the c strength data, output to F...
finufft_execute(plan, &c[0], &F[0]);
// ... you could now send in new points, and/or do transforms with new c data
// ... or even adjoint transforms with the same points but now mapping F to c.
// ...
// step 4: when done, free the memory used by the plan...
finufft_destroy(plan);
@@ -264,14 +267,15 @@ is that the ``int64_t`` type (aka ``long long int``)
is needed since the Fourier coefficient dimensions are passed as an array.

.. warning::
You must not change the nonuniform point arrays (here ``x``, ``y``) between passing them to ``finufft_setpts`` and performing ``finufft_execute``. The latter call expects these arrays to be unchanged. We chose this style of interface since it saves RAM and time (by avoiding unnecessary duplication), allowing the largest possible problems to be solved.
You must not change the nonuniform point arrays (here ``x``, ``y``) between passing them to ``finufft_setpts`` and performing ``finufft_execute`` or ``finufft_execute_adjoint``. The last two calls expect these arrays to be unchanged. We chose this style of interface since it saves RAM and time (by avoiding unnecessary duplication), allowing the largest possible problems to be solved.

.. warning::
You must destroy a plan before making a new plan using the same
plan object, otherwise a memory leak results.

The complete code with a math test is in ``examples/guru2d1.cpp``, and for
more examples see ``examples/guru1d1*.c*``
The complete code with a math test is in ``examples/guru2d1.cpp``;
a demo of adjoint execution is in ``examples/guru2d1_adjoint.cpp``;
for more examples see ``examples/guru1d1*.c*``

Using the guru interface to perform a vectorized transform (multiple 1D type 1
transforms each with the same nonuniform points) is demonstrated in
50 changes: 47 additions & 3 deletions docs/cguru.doc
@@ -7,9 +7,10 @@

Make a plan to perform one or more general transforms.

Under the hood, for type 1 and 2, this does FFTW planning and kernel Fourier
transform precomputation. For type 3, this does very little, since the FFT
sizes are not yet known.
Under the hood, for type 1 and 2, this chooses spread/interp kernel
parameters, precomputes the kernel Fourier transform, and (for FFTW), plans
a pair of FFTs. For type 3, only the kernel parameters are chosen, since
the FFT sizes are not yet known.

Inputs:
type type of transform (1,2, or 3)
@@ -128,6 +129,49 @@
if ntr>1, being the "slowest" (outer) dimension.


::

int finufft_execute_adjoint(finufft_plan plan, complex<double>* c, complex<double>* f)
int finufftf_execute_adjoint(finufftf_plan plan, complex<float>* c, complex<float>* f)

Perform one or more NUFFT transforms using previously entered nonuniform
points and the *adjoint* of the existing planned transform. The point is to
enable transforms and their adjoints to be accessible via a single plan.
Recall that the adjoint of a type 1 is a type 2 of opposite isign, and
vice versa. The adjoint of a type 3 is a type 3 of opposite isign and
flipped input and output. To summarize, this operation maps
adjoint of type 1: f -> c
adjoint of type 2: c -> f
adjoint of type 3: f -> c

Inputs:
plan plan object

Input/Outputs:
c If adjoints of types 1 and 3, the output values at the
nonuniform point sources (size M*ntr complex array).
If adjoint of type 2, the input strengths at the nonuniform
point targets (size M*ntr complex array).
f If adjoint of type 1, the input Fourier mode coefficients (size
N1*ntr or N1*N2*ntr or N1*N2*N3*ntr complex array, when
dim = 1, 2, or 3 respectively).
If adjoint of type 2, the output Fourier mode coefficients (size
N1*ntr or N1*N2*ntr or N1*N2*N3*ntr complex array, when
dim = 1, 2, or 3 respectively).
If adjoint of type 3, the input values at the nonuniform
frequency sources (size N*ntr complex array).

Outputs:
return value 0: success, 1: success but warning, >1: error (see error.rst)

Notes:
* The contents of the arrays x, y, z, s, t, u must not have changed since
the finufft_setpts call that read them. The adjoint execution rereads them
(this way of doing business saves RAM).
* f and c are contiguous Fortran-style arrays with the transform number,
if ntr>1, being the "slowest" (outer) dimension.
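
The type 3 case, where input and output swap roles, can be sketched with direct sums as follows. This is an illustration under stated assumptions, not the library's algorithm: the function names are hypothetical, the problem is 1D with a single transform, and ``x`` and ``s`` stand for the nonuniform points and frequencies. The adjoint takes a length-N input at the frequencies and returns a length-M output at the points, with the sign of the exponent flipped.

```python
import cmath
import random


def type3(x, c, s, isign):
    # Direct-sum type 3: M strengths at points x -> N values at frequencies s.
    return [sum(cj * cmath.exp(1j * isign * sk * xj) for xj, cj in zip(x, c))
            for sk in s]


def type3_adjoint(x, f, s, isign):
    # Adjoint of the above: N inputs at frequencies s -> M outputs at points x,
    # using the opposite sign in the exponent.
    return [sum(fk * cmath.exp(-1j * isign * sk * xj) for sk, fk in zip(s, f))
            for xj in x]


def inner(u, v):
    # Complex inner product, conjugate-linear in the first argument.
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


random.seed(2)
M, N, isign = 6, 4, -1
x = [random.uniform(-3.0, 3.0) for _ in range(M)]
s = [random.uniform(-10.0, 10.0) for _ in range(N)]
c = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(M)]
f = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]

out = type3_adjoint(x, f, s, isign)     # length M, one value per point in x
lhs = inner(f, type3(x, c, s, isign))   # <f, A c>
rhs = inner(out, c)                     # <A^H f, c>
```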


::

int finufft_destroy(finufft_plan plan)
47 changes: 44 additions & 3 deletions docs/cguru.docsrc
@@ -2,9 +2,10 @@ int @G_makeplan(int type, int dim, int64_t* nmodes, int iflag, int ntr, double e

Make a plan to perform one or more general transforms.

Under the hood, for type 1 and 2, this does FFTW planning and kernel Fourier
transform precomputation. For type 3, this does very little, since the FFT
sizes are not yet known.
Under the hood, for type 1 and 2, this chooses spread/interp kernel
parameters, precomputes the kernel Fourier transform, and (for FFTW), plans
a pair of FFTs. For type 3, only the kernel parameters are chosen, since
the FFT sizes are not yet known.

Inputs:
type type of transform (1,2, or 3)
@@ -114,6 +115,46 @@ int @G_execute(finufft_plan plan, complex<double>* c, complex<double>* f)
if ntr>1, being the "slowest" (outer) dimension.


int @G_execute_adjoint(finufft_plan plan, complex<double>* c, complex<double>* f)

Perform one or more NUFFT transforms using previously entered nonuniform
points and the *adjoint* of the existing planned transform. The point is to
enable transforms and their adjoints to be accessible via a single plan.
Recall that the adjoint of a type 1 is a type 2 of opposite isign, and
vice versa. The adjoint of a type 3 is a type 3 of opposite isign and
flipped input and output. To summarize, this operation maps
adjoint of type 1: f -> c
adjoint of type 2: c -> f
adjoint of type 3: f -> c

Inputs:
plan plan object

Input/Outputs:
c If adjoints of types 1 and 3, the output values at the
nonuniform point sources (size M*ntr complex array).
If adjoint of type 2, the input strengths at the nonuniform
point targets (size M*ntr complex array).
f If adjoint of type 1, the input Fourier mode coefficients (size
N1*ntr or N1*N2*ntr or N1*N2*N3*ntr complex array, when
dim = 1, 2, or 3 respectively).
If adjoint of type 2, the output Fourier mode coefficients (size
N1*ntr or N1*N2*ntr or N1*N2*N3*ntr complex array, when
dim = 1, 2, or 3 respectively).
If adjoint of type 3, the input values at the nonuniform
frequency sources (size N*ntr complex array).

Outputs:
@r

Notes:
* The contents of the arrays x, y, z, s, t, u must not have changed since
the finufft_setpts call that read them. The adjoint execution rereads them
(this way of doing business saves RAM).
* f and c are contiguous Fortran-style arrays with the transform number,
if ntr>1, being the "slowest" (outer) dimension.
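
The array layout in the last note can be made concrete with a small offset calculation. This is a sketch: ``flat_index`` is a hypothetical helper, and it assumes zero-based array offsets (not frequency indices). "Fortran-style" is taken to mean that the first mode dimension varies fastest; the transform number varies slowest, as stated in the note.

```python
def flat_index(i, nmodes, t=0):
    """Offset of entry i = (i1[, i2[, i3]]) of transform t in the flat f array.

    The first index varies fastest; the transform number t varies slowest.
    """
    offset, stride = 0, 1
    for idx, n in zip(i, nmodes):
        offset += idx * stride
        stride *= n              # after the loop, stride is N1*N2*...
    return offset + t * stride


# 2D example with N1 = 4, N2 = 3: each transform occupies 12 contiguous entries.
first = flat_index((1, 2), (4, 3))         # 1 + 4*2
second = flat_index((1, 2), (4, 3), t=1)   # same entry, next transform
```

The same rule applies to ``c`` with a single "dimension" of size M, so transform ``t`` starts at offset ``t*M``.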


int @G_destroy(finufft_plan plan)

Deallocate a plan object. This must be used upon clean-up, or before reusing
3 changes: 3 additions & 0 deletions docs/fortran.rst
@@ -161,6 +161,7 @@ These routines and arguments are, in double-precision:
call finufft_makeplan(type,dim,n_modes,iflag,ntrans,tol,plan,opts,ier)
call finufft_setpts(plan,M,xj,yj,zj,Nk,sk,yk,uk,ier)
call finufft_execute(plan,cj,fk,ier)
call finufft_execute_adjoint(plan,cj,fk,ier)
call finufft_destroy(plan,ier)

The single-precision (ie, ``real*4`` and ``complex*8``)
@@ -178,6 +179,8 @@ Each has a math test to check the correctness of some or all outputs::

simple1d1.f - 1D type 1, simple interface, default and various opts
guru1d1.f - 1D type 1, guru interface, default and various opts
guru1d1_adjoint.f - adjoint of 1D type 1, guru interface, default opts
guru1d2_adjoint.f - adjoint of 1D type 2, guru interface, default and various opts
nufft1d_demo.f - 1D types 1,2,3, minimally changed from CMCL demo codes
nufft2d_demo.f - 2D "
nufft3d_demo.f - 3D "
4 changes: 2 additions & 2 deletions docs/makefile.doc
@@ -1,4 +1,4 @@
make[1]: Entering directory '/home/marco/repos/finufft'
make[1]: Entering directory '/home/alex/numerics/finufft'
Makefile for FINUFFT CPU library. Please specify your task:
make lib - build the main library (in lib/ and lib-static/)
make examples - compile and run all codes in examples/
@@ -23,4 +23,4 @@ Make options:
You must at least 'make objclean' before changing such options!

Also see docs/install.rst and docs/README
make[1]: Leaving directory '/home/marco/repos/finufft'
make[1]: Leaving directory '/home/alex/numerics/finufft'
8 changes: 5 additions & 3 deletions docs/matlab.rst
@@ -54,12 +54,14 @@ interface. For smaller transform sizes the acceleration factor of this vectorize

If you want yet more control, consider using the "guru" interface.
This can be faster than fresh calls to the simple or vectorized interfaces
for the same number of transforms, for reasons such as this:
for the same number of transforms, since
the nonuniform points can be changed between transforms, without forcing
FFTW to look up a previously stored plan.
Usually, such an acceleration is only important when doing
repeated small transforms, where "small" means each transform takes of
order 0.01 sec or less.
The guru interface is also very convenient for applying forward-adjoint
transform pairs, common in imaging or optimization applications.
Here we use the guru interface to repeat the first demo above:

.. code-block:: matlab
@@ -74,12 +76,12 @@ Here we use the guru interface to repeat the first demo above:
c = randn(M,1)+1i*randn(M,1); % iid random complex data (row or col vec)
f = plan.execute(c); % do the transform (0.008 sec, ie, faster)
% ...one could now change the points with setpts, and/or do new transforms
% with new c data...
% ...with new c data, and/or do adjoint transforms with new data...
delete(plan); % don't forget to clean up
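
Why forward-adjoint pairs matter in imaging or optimization can be sketched with a tiny least-squares fit. The example below is an illustration only: it uses slow direct sums rather than FINUFFT, and all names (``forward``, ``adjoint``, ``norm``) are hypothetical. The forward operator A is taken to be a 1D type 2 transform, and each gradient step on 0.5*||A f - d||^2 needs one forward and one adjoint application on the same points. The step size 1/(M*N) is a conservative choice assumed here, based on the fact that every entry of A has modulus 1.

```python
import cmath
import math
import random


def modes(N):
    # Integer frequencies k with -N/2 <= k <= (N-1)/2 (assumed ordering).
    return range(-(N // 2), (N - 1) // 2 + 1)


def forward(x, f, N):
    # A: direct-sum type 2 (coefficients -> values at points), sign +1.
    return [sum(fk * cmath.exp(1j * k * xj) for k, fk in zip(modes(N), f))
            for xj in x]


def adjoint(x, r, N):
    # A^H: direct-sum type 1 with the opposite sign (values -> coefficients).
    return [sum(rj * cmath.exp(-1j * k * xj) for xj, rj in zip(x, r))
            for k in modes(N)]


def norm(v):
    return math.sqrt(sum(abs(z) ** 2 for z in v))


random.seed(1)
M, N = 12, 4
x = [random.uniform(-math.pi, math.pi) for _ in range(M)]
d = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(M)]

f = [0j] * N
step = 1.0 / (M * N)        # safe step: ||A||^2 <= M*N
res_initial = norm(d)       # residual norm at the starting guess f = 0
for _ in range(50):
    r = [a - b for a, b in zip(forward(x, f, N), d)]   # residual A f - d
    g = adjoint(x, r, N)                               # gradient A^H (A f - d)
    f = [fk - step * gk for fk, gk in zip(f, g)]
res_final = norm([a - b for a, b in zip(forward(x, f, N), d)])
```

With a plan-based interface, the two operator applications inside the loop would presumably correspond to one ``execute`` and one ``execute_adjoint`` call on the same plan.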

.. warning::

If an existing array is passed to ``setpts``, then this array must not be altered before ``execute`` is called! This is because, in order to save RAM (allowing larger problems to be solved), internally FINUFFT stores only *pointers* to ``x`` (etc), rather than unnecessarily duplicating this data. This is not true if an *expression* such as ``-x`` or ``2*pi*rand(M,1)`` is passed to ``setpts``, since in those cases the ``plan`` object does make internal copies, as per MATLAB's usual shallow-copy argument passing.
If an existing array is passed to ``setpts``, then this array must not be altered before ``execute`` or ``execute_adjoint`` is called! This is because, in order to save RAM (allowing larger problems to be solved), internally FINUFFT stores only *pointers* to ``x`` (etc), rather than unnecessarily duplicating this data. This is not true if an *expression* such as ``-x`` or ``2*pi*rand(M,1)`` is passed to ``setpts``, since in those cases the ``plan`` object does make internal copies, as per MATLAB's usual shallow-copy argument passing.

Finally, we demo a 2D type 1 transform using the simple interface. Let's
request a rectangular Fourier mode array of 1000 modes in the x direction but 500 in the
43 changes: 38 additions & 5 deletions docs/matlabhelp.doc
@@ -461,8 +461,7 @@

FINUFFT_PLAN is a class which wraps the guru interface to FINUFFT.

Full documentation is given in ../finufft-manual.pdf and online at
http://finufft.readthedocs.io
Full documentation is given online at http://finufft.readthedocs.io
Also see examples in the matlab/examples and matlab/test directories.

PROPERTIES
@@ -478,6 +477,7 @@
finufft_plan - create guru plan object for one/many general nonuniform FFTs.
setpts - process nonuniform points for general FINUFFT transform(s).
execute - execute single or many-vector FINUFFT transforms in a plan.
execute_adjoint - execute adjoint of planned transform(s).

General notes:
* use delete(plan) to remove a plan after use.
@@ -605,10 +605,43 @@
plan stage using opts.floatprec, otherwise an error is raised.


4) To deallocate (delete) a nonuniform FFT plan, use delete(plan)
4) EXECUTE_ADJOINT execute adjoint of planned transform(s).

This deallocates all stored FFTW plans, nonuniform point sorting arrays,
kernel Fourier transforms arrays, etc.
result = plan.execute_adjoint(data_in);

Perform the adjoint of the planned transform(s) that plan.execute would
perform (see above documentation for EXECUTE). This is convenient in the
common case of needing forward-adjoint transform pairs for the same set of
nonuniform points.
The adjoint of a type 1 is a type 2 of opposite isign, and vice versa.
The adjoint of a type 3 is a type 3 of opposite isign and flipped input
and output.

Inputs:
plan finufft_plan object
data_in strengths (adjoint type 2 and 3) or Fourier coefficients
(adjoint type 1) vector, matrix, or array of appropriate size.
For adjoint type 1, in 1D this is length-ms, in 2D size (ms,mt),
or in 3D size (ms,mt,mu), or each of these with an extra last
dimension ntrans if ntrans>1. For adjoint types 2 and 3, it is
a column vector of length M (for type 2, the length of xj),
or nk (for type 3, the length of s). If ntrans>1 it is a stack
of such objects, ie, it has an extra last dimension ntrans.
Outputs:
result strengths (adjoint of type 1 or 3) or Fourier coefficients
(adjoint of type 2) vector, matrix, or array of appropriate size.
For adjoint of type 1 and 3, this is either a length-M vector
(where M is the length of xj), or an (M,ntrans) matrix when
ntrans>1. For adjoint of type 2, in 1D this is
length-ms, in 2D size (ms,mt), or in 3D size (ms,mt,mu), or
each of these with an extra last dimension ntrans if ntrans>1.

Notes:
* The precision (double/single) of all inputs must match that chosen at the
plan stage using opts.floatprec, otherwise an error is raised.
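
The input and output sizes listed above can be summarized as a small lookup. This is a sketch of the documented rules only: ``adjoint_shapes`` is a hypothetical helper, not part of any FINUFFT interface, and it treats a column vector of length M as the shape ``(M,)``.

```python
def adjoint_shapes(type_, nmodes=(), M=0, nk=0, ntrans=1):
    """Return (input_shape, output_shape) for an adjoint execution.

    nmodes is (ms,), (ms, mt) or (ms, mt, mu) for types 1 and 2;
    M is the number of nonuniform points, nk the number of type 3 frequencies.
    """
    extra = (ntrans,) if ntrans > 1 else ()   # extra last dimension if stacked
    coeffs = tuple(nmodes) + extra
    points = (M,) + extra
    if type_ == 1:    # adjoint of type 1: Fourier coefficients in, strengths out
        return coeffs, points
    if type_ == 2:    # adjoint of type 2: strengths in, Fourier coefficients out
        return points, coeffs
    if type_ == 3:    # adjoint of type 3: length-nk input, length-M output
        return (nk,) + extra, points
    raise ValueError("type must be 1, 2, or 3")


shapes_2d_type1 = adjoint_shapes(1, nmodes=(8, 6), M=100)
shapes_type3_many = adjoint_shapes(3, M=100, nk=50, ntrans=4)
```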


5) To deallocate (delete) a nonuniform FFT plan, use delete(plan)

This deallocates all stored FFTW plans, nonuniform point sorting arrays,
kernel Fourier transforms arrays, etc.