Is your feature request related to a problem or opportunity? Please describe.
Currently, mpol.Gridding.Gridder can only accept rectangular arrays of visibilities: either a single-channel dataset where uu, vv, weight, data_re, and data_im have shape (1, nvis), or a multi-channel dataset where they have shape (nchan, nvis).
There may be cases where a multi-channel dataset has a different number of unflagged visibilities in each channel. Right now I can think of two situations where this arises:
- a multi-channel continuum dataset that the user wants to model with nterms > 1.
- a multi-channel spectral line dataset that had transient RFI or contamination in only some EBs (not super likely, but possible).
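To make the problem concrete, here is a minimal NumPy sketch of how per-channel flagging turns a rectangular (nchan, nvis) array into rows of unequal length. The `flag` mask and the random `uu` values are made up for illustration and are not part of any MPoL API.

```python
import numpy as np

rng = np.random.default_rng(0)
nchan, nvis = 3, 5

# Rectangular input: every channel has the same number of visibilities.
uu = rng.normal(size=(nchan, nvis))

# Hypothetical flag mask (True = flagged); channel 1 lost two samples.
flag = np.zeros((nchan, nvis), dtype=bool)
flag[1, [0, 3]] = True

# Dropping flagged samples channel by channel leaves rows of unequal length,
# which can no longer be stacked into a single (nchan, nvis) array.
uu_ragged = [uu[i, ~flag[i]] for i in range(nchan)]
lengths = [len(row) for row in uu_ragged]
print(lengths)  # [5, 3, 5]
```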
How then to input these products to mpol.Gridding.Gridder?
An additional concern is how to represent the data for Spectral Covariance (#18) applications. ALMA does not "Doppler track" but instead "Doppler sets." I think this means we'll want to select a uniform number of channels in the Topocentric frame, which will be mapped onto an equal number of channels in the Barycentric or Topocentric frame. This means we'll need a (wasted) buffer region of channels on either end to cover the most blueshifted or most redshifted visibility samples at the start/end of the observations, but the simplicity of keeping a uniform number of channels is most likely worth the memory overhead.
Describe the solution you'd like
Expand mpol.Gridding.Gridder to accept ragged array collections: either a list of NumPy arrays (one per channel) or something like an Awkward Array.
This will also require internal modification to how mpol.Gridding.Gridder performs the gridding operation, as well as changes to the ingest routine _check_data_inputs_2d.
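As a rough idea of what a ragged-aware ingest check could look like, here is a sketch assuming the inputs arrive as per-channel lists of 1D arrays. The function name `check_ragged_inputs` and its behavior are hypothetical; this is not the existing `_check_data_inputs_2d`, whose actual checks may differ.

```python
import numpy as np

def check_ragged_inputs(uu, vv, weight, data_re, data_im):
    """Hypothetical validation for per-channel lists of 1D arrays.

    Returns the number of channels, or raises ValueError on a mismatch.
    """
    collections = {
        "uu": uu, "vv": vv, "weight": weight,
        "data_re": data_re, "data_im": data_im,
    }
    nchan = len(uu)
    # Every quantity must provide the same number of channels.
    for name, coll in collections.items():
        if len(coll) != nchan:
            raise ValueError(f"{name} has {len(coll)} channels, expected {nchan}")
    # Within a channel, every quantity must be 1D with the same length.
    # Different channels are allowed to have different lengths.
    for i in range(nchan):
        nvis = len(uu[i])
        for name, coll in collections.items():
            arr = np.asarray(coll[i])
            if arr.ndim != 1 or arr.shape[0] != nvis:
                raise ValueError(
                    f"{name}[{i}] has shape {arr.shape}, expected ({nvis},)"
                )
    return nchan
```

For example, two channels with 4 and 2 visibilities pass the check, while a `weight` list whose second channel has 3 entries would raise a `ValueError`.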
Once the arrays are "gridded," they are no longer ragged, since the core data representation is the (full) coordinate grid.
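The sketch below illustrates why raggedness disappears after gridding: each channel is averaged onto the same (u, v) cells, so the output is a dense (nchan, ncell, ncell) cube no matter how many visibilities each channel started with. It uses `np.histogram2d` for a simple weighted cell average, which is an assumption made for illustration and not how the Gridder is actually implemented; `grid_ragged` and `edges` are hypothetical names.

```python
import numpy as np

def grid_ragged(uu, vv, weight, data_re, data_im, edges):
    """Weighted-average each channel's visibilities onto a shared (u, v) grid.

    Inputs are per-channel lists of 1D arrays; `edges` holds the cell edges
    used along both u and v. Empty cells are left at zero.
    """
    nchan = len(uu)
    ncell = len(edges) - 1
    cube = np.zeros((nchan, ncell, ncell), dtype=complex)
    bins = [edges, edges]
    for i in range(nchan):
        # Passing v first makes rows index v and columns index u.
        wsum, _, _ = np.histogram2d(vv[i], uu[i], bins=bins, weights=weight[i])
        re, _, _ = np.histogram2d(
            vv[i], uu[i], bins=bins, weights=weight[i] * data_re[i]
        )
        im, _, _ = np.histogram2d(
            vv[i], uu[i], bins=bins, weights=weight[i] * data_im[i]
        )
        filled = wsum > 0
        cube[i, filled] = (re[filled] + 1j * im[filled]) / wsum[filled]
    return cube

# Three channels with different numbers of visibilities (made-up values).
rng = np.random.default_rng(1)
nvis_per_chan = [6, 4, 5]

def draw():
    return [rng.uniform(-1, 1, n) for n in nvis_per_chan]

uu, vv, data_re, data_im = draw(), draw(), draw(), draw()
weight = [np.ones(n) for n in nvis_per_chan]
edges = np.linspace(-1.0, 1.0, 9)

cube = grid_ragged(uu, vv, weight, data_re, data_im, edges)
print(cube.shape)  # (3, 8, 8)
```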
Considerations
We also considered the situation where mpol.Gridding.Gridder retains its current input scheme but the user also supplies an extra flag argument. Then, the indexing is done internally. I think this will get more confusing internally (since ragged arrays will still need to be managed, and masked arrays don't play super well with most operations). MPoL in general has no need for the flagged visibilities, so it seems extraneous to have us keep track of them internally.
Additional context
Ragged tensors might also need to be considered in the "loose" visibility case (i.e., individual interpolation with torchkbnufft rather than pregridding), though that is a separate issue from the one described here, since that will need to work in a PyTorch context. The issue discussed here doesn't need to work in PyTorch, since the GriddedDataset output to PyTorch will be dense and rectangular.
iancze changed the title from "mpol.gridding.Gridder to accept ragged arrays of visibilities" to "Gridder and/or output routines to accept ragged arrays of visibilities or utilize flags" on Dec 25, 2022