Skip to content

Commit

Permalink
Merge pull request #232 from MPoL-dev/nufft_bifurcate
Browse files Browse the repository at this point in the history
splitting NuFFT into base and uu,vv cached class.
  • Loading branch information
iancze authored Dec 13, 2023
2 parents ef11906 + bb4ea6d commit cb28c37
Show file tree
Hide file tree
Showing 8 changed files with 921 additions and 572 deletions.
4 changes: 4 additions & 0 deletions docs/changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

# Changelog

## v0.2.1

- Changed API of {class}`~mpol.fourier.NuFFT`. Previous signature took `uu` and `vv` points at initialization (`__init__`), and the `.forward` method took only an image cube. This behaviour is preserved in a new class {class}`~mpol.fourier.NuFFTCached`. The updated signature of {class}`~mpol.fourier.NuFFT` *does not* take `uu` and `vv` at initialization. Rather, its `forward` method is modified to take an image cube and the `uu` and `vv` points. This allows an instance of this class to be used with new `uu` and `vv` points in each forward call. This follows the standard expectation of a layer (e.g., a linear regression function predicting at new `x`) and the pattern of the TorchKbNuFFT package itself. It is expected that the new `NuFFT` will be the default routine and `NuFFTCached` will only be used in specialized circumstances (and possibly deprecated/removed in future updates). Changes implemented by #232.

## v0.2.0

- Moved docs build out of combined and into standalone test workflow
Expand Down
6 changes: 3 additions & 3 deletions docs/ci-tutorials/loose-visibilities.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,10 +97,10 @@ uu_chan = uu[chan]
vv_chan = vv[chan]
```

and then use these values to initialize a {class}`mpol.fourier.NuFFT` object
Now, we'll initialize a {class}`mpol.fourier.NuFFT` object

```{code-cell}
nufft = fourier.NuFFT(coords=coords, nchan=nchan, uu=uu_chan, vv=vv_chan)
nufft = fourier.NuFFT(coords=coords, nchan=nchan)
```

Now let's put the NuFFT aside for a moment while we initialize the {class}`mpol.gridding.DataAverager` object and create an image for use in the forward model.
Expand Down Expand Up @@ -163,7 +163,7 @@ icube = images.ImageCube(coords=coords, nchan=nchan, cube=img_packed_tensor)
The interesting part of the NuFFT is that it will carry an image plane model all the way to the Fourier plane in loose visibilities, resulting in a model visibility array the same shape as the original visibility data.

```{code-cell}
vis_model_loose = nufft(icube())
vis_model_loose = nufft(icube(), uu_chan, vv_chan)
print("Loose model visibilities from the NuFFT have shape {:}".format(vis_model_loose.shape))
print("The original loose data visibilities have shape {:}".format(data.shape))
```
Expand Down
4 changes: 2 additions & 2 deletions docs/ci-tutorials/optimization.md
Original file line number Diff line number Diff line change
Expand Up @@ -312,14 +312,14 @@ $$
For speed reasons, the {class}`mpol.precomposed.SimpleNet` does not work with the original data visibilities directly, but instead uses an averaged version of them in {class}`~mpol.datasets.GriddedDataset`. To calculate model visibilities corresponding to the original $u,v$ points of the dataset, we will need to use the {class}`mpol.fourier.NuFFT` layer. More detail on this object is in the [Loose Visibilities tutorial](loose-visibilities.md), but basically we instantiate the NuFFT layer relative to some image dimensions and $u,v$ locations

```{code-cell} ipython3
nufft = fourier.NuFFT(coords=coords, nchan=dset.nchan, uu=uu, vv=vv)
nufft = fourier.NuFFT(coords=coords, nchan=dset.nchan)
```

and then we can calculate model visibilities corresponding to some model image (in this case, our optimal image). Since {meth}`mpol.fourier.NuFFT.forward` returns a Pytorch tensor, we'll need to detach it and convert it to numpy. We'll also remove the channel dimension.

```{code-cell} ipython3
# note the NuFFT expects a "packed image cube," as output from ImageCube.forward()
vis_model = nufft(rml.icube())
vis_model = nufft(rml.icube(), uu, vv)
# convert from Pytorch to numpy, remove channel dimension
vis_model = np.squeeze(vis_model.detach().numpy())
```
Expand Down
877 changes: 470 additions & 407 deletions docs/large-tutorials/pyro.ipynb

Large diffs are not rendered by default.

41 changes: 26 additions & 15 deletions docs/large-tutorials/pyro.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ jupytext:
extension: .md
format_name: myst
format_version: 0.13
jupytext_version: 1.14.1
jupytext_version: 1.15.2
kernelspec:
display_name: Python 3 (ipykernel)
language: python
Expand All @@ -27,7 +27,23 @@ In this tutorial, we will explore how we can use MPoL with a probabilistic progr

In this tutorial, we will use [Stochastic Variational Inference](http://pyro.ai/examples/svi_part_i.html) algorithms to obtain the posterior distribution of the model parameters. These algorithms are quick to implement in Pyro and--important for this tutorial--quick to run. Pyro also has full support for MCMC algorithms like Hamiltonian Monte Carlo and the No U-Turn Sampler (NUTS) ([for example](http://pyro.ai/examples/bayesian_regression_ii.html#HMC)) that are relatively straightforward to use in an extension from this model. However, because their run times are significantly longer than SVI algorithms, more computational resources are needed beyond the scope of this tutorial.

+++
If the following output says `Using cuda`, then this tutorial was executed on a GPU. We found that it took about 5 minutes to converge the SVI, which is pretty exciting. You may be able to run this on CPU-only machine, but expect the runtime to take significantly longer. You may want to shorten the number of iterations and reduce the number of predictive samples to get a sense that the routine will in fact execute, but be aware that your solution may not fully converge.

```{code-cell} ipython3
import torch
if torch.cuda.is_available():
device = torch.device('cuda')
else:
device = torch.device('cpu')
print(f"Using {device}.")
```

```{code-cell} ipython3
# import arviz now, to check UTF-8 loading issue.
import arviz as az
```

## MPoL and models

Expand Down Expand Up @@ -244,7 +260,6 @@ We also recommend reading Gelman et al. 2020's paper on [Bayesian Workflow](http
There are many ways to build a Pyro model. In this tutorial we will take a class-based approach and use the [PyroModule](http://pyro.ai/examples/modules.html) construct, but models can just as easily be built using function definitions (for [example](http://pyro.ai/examples/intro_long.html#Models-in-Pyro)).

```{code-cell} ipython3
import torch
from torch import nn
from mpol import geometry, gridding, images, fourier, utils
from mpol.constants import deg
Expand Down Expand Up @@ -573,9 +588,10 @@ class VisibilityModel(PyroModule):
self.flayer = fourier.FourierCube(coords=coords)
# create a NuFFT, but only use it for predicting samples
self.nufft = fourier.NuFFT(
coords=self.coords, nchan=self.nchan, uu=uu, vv=vv, sparse_matrices=False
)
# store the uu and vv points we might use
self.uu = torch.as_tensor(uu, device=device)
self.vv = torch.as_tensor(vv, device=device)
self.nufft = fourier.NuFFT(coords=self.coords, nchan=self.nchan)
def forward(self, predictive=True):
Expand All @@ -592,7 +608,7 @@ class VisibilityModel(PyroModule):
if predictive:
# use the NuFFT to produce and store samples
vis_nufft = self.nufft(img)[0]
vis_nufft = self.nufft(img, self.uu, self.vv)[0]
pyro.deterministic("vis_real", torch.real(vis_nufft))
pyro.deterministic("vis_imag", torch.imag(vis_nufft))
Expand Down Expand Up @@ -623,14 +639,9 @@ As we described in the [NuFFT](../ci-tutorials/loose-visibilities.md) tutorial,

When visualizing model or residual visibility values, it is often far more useful to work with the loose visibility values produced from the NuFFT. This is because the loose visibilities can be gridded using a weighting scheme like Briggs robust weighting, which can dramatically increase the sensitivity of the resulting image. So that is why our `VisibilityModel` uses a {class}`~mpol.fourier.NuFFT` layer to produce model visibilities when working in a predictive mode but otherwise uses a more efficient {class}`~mpol.fourier.FourierCube` layer to produce model visibilities when working in a likelihood evaluation loop.

Now we'll do a predictive check with the `VisibilityModel` using the same disk values found by Guzmán et al. 2018. We will also place it on the GPU, if the device is available.
Now we'll do a predictive check with the `VisibilityModel` using the same disk values found by Guzmán et al. 2018. We will also place it on the GPU with the `.to` call, if the device is available.

```{code-cell} ipython3
if torch.cuda.is_available():
device = torch.device('cuda')
else:
device = torch.device('cpu')
# we will use this object throghout the rest of the tutorial, so we'll just call it 'model'
model = VisibilityModel(coords=coords, distance=distance, uu=uu, vv=vv, weight=weight, data=data, device=device)
model.to(device);
Expand Down Expand Up @@ -1039,14 +1050,14 @@ ax.set_xlabel("radius [au]")
ax.set_ylabel(r"$I_\nu$ [Jy $\mathrm{arcsec}^{-2}$]");
```

We see that there is a slightly larger scatter in the draws compared to the `AutoNormal` guide, especially around 40 au. This is because the `AutoMultivariateNormal` guide captured more of the covariance between parameters, resulting in a greater dispersion of draws.
We see that there is a *slightly* larger scatter in the draws compared to the `AutoNormal` guide, most noticeable around 40 au. This is because the `AutoMultivariateNormal` guide captured more of the covariance between parameters, resulting in a greater dispersion of draws.

Encouragingly, both our image and 1D profile results compare favorably with those found by [Guzmán et al. 2018](https://ui.adsabs.harvard.edu/abs/2018ApJ...869L..48G/abstract) (compare their Figures 2 & 4).

The true uncertainty in the radial profile may still be underestimated. As we discussed, one source could be the parameterization of the model. In reality, the disk rings are not perfect Gaussian shapes, and so, as currently implemented, our model could never capture the true intensity profile.


In our opinion, SVI is a very useful inference technique because of its speed and scalability. There is the risk, though, that your guide distribution does not fully capture complex covariances of your posterior distributions. Perhaps some parameter posteriors are significantly non-Gaussian or banana-shaped, and therefore not able to be captured by the multivariate Normal guide. This risk can be hard to assess from SVI fits alone, though there are steps you can take by trying out more [complex guides](https://docs.pyro.ai/en/stable/infer.autoguide.html#) or [writing your own](https://pyro.ai/examples/svi_part_i.html#Guide), parameterized around anticipated covariances.
In our opinion, SVI is a very useful inference technique because of its speed and scalability. There is the risk, though, that your guide distribution does not fully capture complex covariances of your posterior distributions. Perhaps some parameter posteriors are significantly non-Gaussian or banana-shaped, and therefore not able to be captured by the multivariate Normal guide. This risk can be hard to assess from SVI fits alone, though there are steps you can take by trying out more [complex guides](https://docs.pyro.ai/en/stable/infer.autoguide.html#) or [writing your own](https://pyro.ai/examples/svi_part_i.html#Guide), parameterized around anticipated covariances.

+++

Expand Down
Loading

0 comments on commit cb28c37

Please sign in to comment.