Skip to content

Latest commit

 

History

History
190 lines (142 loc) · 19 KB

35.viz_and_volrendering.md

File metadata and controls

190 lines (142 loc) · 19 KB

The primary method by which researchers interact with their data in yt is via visualization; from the standpoint of the library, however, this is a side-effect of the various analysis, regularization and data-processing algorithms that are implemented within yt. Nearly all of the visualization that is done using yt utilizes the matplotlib library for actual deposition of pixels into an image format, although all of the input to that deposition is conducted by yt. Making this distinction is important, because it underscores the relationship between the different libraries and how they exist in the ecosystem of scientific software; yt does not replace matplotlib, but rather, augments it by providing a grammar of analysis of volumetric data and defining how that grammar is translated into visual representations as presented by matplotlib.

CC: discuss ray traversal for patch-based datasets + oct-based datasets.

Pixelizing Variable-Mesh Objects {#sec:pixelization}

The results of either projecting or slicing through a logically-cartesian finite volume dataset is represented in yt as a collection of pixel positions and widths. These objects, hereafter referred to as exposing the "variable mesh" interface (as originated in HippoDraw), are not typically suitable for direct visualization. Many visualization libraries, including matplotlib, would necessarily regard these as collections of patches of fixed size, supplying them to the underlying engine. To optimize for repeated rendering, yt provides its own "pixelization" routines that take advantage of the input data structures. These "pixelizers" (or "rasterizers") can account for periodic data, variable resolution, overlapping and disjoint datasets, and non-Cartesian coordinate systems.

The pixelizers in yt are implemented in Cython, and they accept an input "image plane" buffer (with extent) as well as the variable mesh to be deposited. Pixelizers exist for cartesian coordinates, cylindrical and spherical coordinates, off-axis cartesian planes, and for the Mollweide orthographic projection. Each of these pixelizers follows a roughly identical process for depositing source pixels into the image plane. The outer loop is over the input pixels, $p_i$, composed of $x_i, y_i, dx_i, dy_i, v_i$, where $x$ and $y$ refer to the coordinate system; in practice this means they may actually represent the $r$, $\theta$, $\phi$ or other coordinates.

  1. Compute left and right edges of the bounding box for this pixel in the resolution of the image plane
  2. Iterate over the first image plane coordinate from the left edge to the right edge of the bounding box
  3. Iterate over the second image plane coordinate from the left edge to the right edge of the bounding box
  4. Map from the coordinate system to the image plane and deposit $v$

In practice, this is a fast operation, as long as the inner loops are sufficiently well determined; for instance, when depositing an input pixel with a width of $w$ into an image plane where the pixel width corresponds to a width of $w/16$, only $16^2$ pixels (with a high-degree of sequential ordering) have to be iterated over. The spherical and cylindrical pixelization routines operate similarly, but are somewhat degraded by a lower degree of locality in the final mapping from coordinate system to image plane.

Recent work has been done to port the pixelization routines to Rust and compiling these to WebAssembly, resulting in the development of the Widgyts project. Widgyts provides a browser-side Jupyterlab interface to the pixelization routines, enabling extremely low-latency exploration of datasets.

Higher-Order Unstructured Mesh Elements

Software Volume Rendering {#sec:software-volume-rendering}

The volume rendering is based on classical concepts for rendering 3D objects, and relies on the notion of a scene, a camera and an object to render. The object to render can be any data container of supported AMR datasets (either patch-based and octree-based datasets). The implementation of the volume rendering is based on integrating a transfer function along individual "rays." By defining a traversal function, cells can be ordered and a sampling function and accumulation can be applied to them. Below, we describe the traversal operations, and then the different "sampling" operations that can be applied.

Conceptually, it is easies to think about this as constructing a global "ordering" of the cells, and then an in-order application of the sampler function with affiliated ray-specific data storage. In reality, however, these operations are interleaved and decomposed to enable task- and data-based parallelism. Data is distributed to different processors wherever possible based on different decomposition strategies, and within the traversal algorithms OpenMP parallelism is utilized to allow for individual rays to be integrated simultaneously. Typically, multiple "sampling points" are identified within each element to be sampled; the default number is 5. We have found that the image quality improves somewhat as this number is increased, as the interpolation allows for finer-grained variation, but 5 is typically sufficient for all but the highest resolution data and highest-gradient data.

Lenses for Volume Rendering

In addition to specifying traversal algorithms, the software volume rendering in yt relies on the concept of "lenses." A lens, in yt terminology, is a descriptive object from which emanate the rays that are integrated. The traditional lens is that of the perspective projection, wherein a plane of rays are sent outward. The simplest lens, however, is that of a plane-parallel set of rays, each of which is emitted by an individual pixel with constant direction. Additional lenses yt provides include a stereo projection lens (for stereoscopic 3D volume rendering), a fisheye lens (for hemispheric rendering suitable for planetarium domes), a spherical lens (for spherical rendering suitable for full-sphere projections (such as for "$360^{\circ}$ YouTube videos), and the stereoscopic spherical lens.

Each of these lens objects provides a functional form of ray direction and origin. In some cases, an inverse function can be provided, which in some cases enables the traversal function to more intelligently check intersection and reduce computation time.

Patch-based ray traversal

Unlike taking standard "projections" of datasets, the process of volume rendering requires that elements be traversed in order, accumulating values that may be reliant on previous values. While this is straightforward to implement for grids [@doi:10.2312/egtp.19871000], in situations where there are overlapping sets of values the process must be handled more carefully.

<script type="text/paperscript" canvas="vrFigure" src="images/volume_rendering/vr-figure.js"></script>

To ensure that cells are traversed in-order, and to ensure that only the highest-resolution cells are used in cases where grids overlap, we decompose the grid hierarchy into a kD-tree. To construct the kD-tree, each grid at the root level is added to the tree. Wherever these grids include "child" grids (i.e., grids that include higher-resolution data that overlaps with the grids) a cutting plane is inserted in the kD-tree such that the child grid is isolated. This process is then applied recursively until all of the grids in the dataset (or its data subselection) have been added. By inserting cutting planes at child-grid boundaries, the tree can be specified to include only "leaf" node data, thus ensuring that any traversal will only include the highest-resolution data available, and thus can be traversed in order.

kD-trees also provide so-called "viewpoint traversals," which allows the grids to be ordered in either front-to-back or back-to-front order. These traversals can also be ordered to allow for radially-emanating traversals, for such cases as hemispheric renderings.

Octree ray traversal

Casting rays through an octree can be achieved efficiently by relying on the octree structure. In order to abstract away the underlying layout of the data, we first construct an octree that contains all leaf cells in the data container. We store all cells as octs with no children, and mark them with their position within the data container, going from $0$ to $N_\mathrm{cell}-1$. Octs that are inserted in the process of building the tree are not marked nor indexed. We also compute the vertex-centered data for all cells in the container. Note that, contrary to the octree utilized to index the data from octree datasets, this octree may span multiple domains and contains all levels from the root level (that contains a single oct the size of the simulation domain) all the way down to the leaf cells. We then cast rays off the camera, and for each ray, we compute the ordered list of the $N$ cells it intersects with together with the intersection points along the ray [@revelles_efficient_parametric_algorithm_2000]. In the following, we will write the coordinate along the ray as $t$, with the camera located at $t=0$. In general the tree may contain holes (this may happen if the data container is a region selector), so that the exit coordinate out of a cell may not coincide with the entry coordinate through the next cell. In practice, we solve this by storing for each cell both the entry and exit coordinates of the ray.

The algorithm relies on the fact that if a ray passes through an oct and intersects with its six faces at coordinate along the ray $t_{xi}, t_{yi}, t_{zi}$ (on entry) and $t_{xo}, t_{yo}, t_{zo}$ (on exit), then its intersection with the inner cells' faces can be computed explicitly from these six values and their half point $(t_{xi}+t_{xo})/2, (t_{yi}+t_{yo})/2, (t_{zi}+t_{zo})/2$. This implies that each call to the algorithm only need computing one division and a few simple arithmetic comparisons. It also uses the fact that for a given oct, we can compute which cell the ray will intersect with first, and from any given cell, which cell the the ray will intersect next.

The algorithm then works as follows. If the ray does not intersect with the root oct, then the algorithm returns an empty list of cell crossed and $t$ values. Otherwise, initialize an empty list of cells traversed and $t$-values. a) Find the intersection of the ray with all six faces of the oct. b) If the current oct is marked, store the entry and exit $t$-values and the index of the oct in their respective list and return. c) If the current oct is not marked, find the first cell the ray intersects with and call the algorithm recursively (starting at step a) with the oct contained in the cell, if any). d) Find the next cell within the oct. If there is no next cell, return. Otherwise, call the algorithm recursively (starting at step a) with the oct contained in the cell, if any) then go back to d). On exit of the algorithm, we then have a list of cells and $t$-values. For each cell in the list, we then call the sampler with the vertex-centered values and the entry and exit coordinates.

An example of the volume rendering of a galaxy in a zoom-in cosmological simulation made with RAMSES is shown on Figure {@fig:ramses-volume-rendering}.

 Volume rendering of gas density isocontours around a galaxy in a cosmological zoom-in simulation performed with RAMSES. Adapted from [@cadiou_angular_momentum_2021]. {#fig:ramses-volume-rendering}

Sampling functions

As a ray passes through each sampling point, we apply a sampling function. In the yt software volume renderer, these cython subclasses of the ImageSampler base class. Each defines a setup process and a sample process. Once a ray arrives at a sampling point (for instance, as it walks across a grid and encounters values) the sampler is called with the data it is accessing, information about the ray, the "enter" and "exit" parameters of the sample region, and the accumulated data so far.

Each ImageSampler object implements a different method of integration through the volume. Below, we discuss each in turn. To do so, we apply the convention that our volume rendering ray-casting starts at a position defined by the parameter $t$, starting at the position $\mathbf{\vec{x}}(t_0) \equiv \mathbf{\vec{x}}_0$ and terminates at $\mathbf{\vec{x}}(t_1) \equiv \mathbf{\vec{x}}_1$. Field values are taken as $f(\mathbf{\vec{x}})$, and the distance through with a ray has passed is defined as the path length.

Projection Sampler

This sampler object conducts a simple integration through the domain. $$ v(\mathbf{\vec{x}}(t_1)) = \int_{t_0}^{t_1} f(x, y, z) dl $$

Because the sampling function includes no dependence on previous values, it can be sampled completely independently and out of order. This allows for much simpler data-based parallelism. There are two formulations of this sampler in yt; the first is a piecewise-linear interpolation scheme, wherein the field is assumed to be constant within each sampling volume. This method requires neither vertex-centered data nor multiple sampling points and is considerably faster to conduct. The second method available, which is the default, is that of an interpolated projection. This sampler requires vertex-centered data and utilizes multiple samples. Because the arithmetic is reasonably simple and the data can be supplied in any order, it is still reasonably fast to conduct.

This sampler function operates independently of color channels, exclusively in floating-point space. This allows colormapping to be conducted at the end of the process and allows for high dynamic range beyond that typically enabled by RGBA channels.

Transfer Function Sampler

The transfer function sampler is the most complex of the sampling functions available in yt, as it affords the most flexibility. Unfortunately, this also results in a cumbersome and difficult-to-describe setup process, wherein the simple operations are straightforward to accomplish but the more complex operations are subject to trial-and-error.

The transfer function sampler is designed for constructing images; as such, it provides integration for up to six channels of data. Typically, this is used in the four-channel mode, corresponding to red, green, blue and alpha (RGBA) data. (It is important to note, however, that this is not by any means the only mechanism that these could be assigned; the values are accumulated in floating point and could correspond to any output data, not just these three colors and one alpha.) In six-channel mode, typically the channels are for red, green, blue, and a corresponding alpha channel for each of those colors, accumulated independently.

The transfer function is set up around the concept of "field interpolation tables." These are 1D arrays, typically 256 elements, that include floating point values as well as concrete bounds in data-space. These tables provide a mapping $f(v) \rightarrow u$, where $v$ is a local field value, as well as an optional "weight" value drawn from a second table. Careful construction of these tables can produce volume renderings via isocontours, where specific values in field space are highlighted with distinct colors. Indeed, this is the most common method of applying this sampler, as it allows a straightforward way of seeing "through" the outer layers and viewing nesting structures simultaneously.

We use notation $C_i^0$ for the input values for a given channel (where $i$ is one of $R, G, B, A$) and $C_i^1$ for the output. With the sampled field value ($f(t)$) and optional weight field value ($w(t)$) we can then define a sample, $S_i$ for each channel including alpha. we compute the updated values for each sampling point via:

$$ C_i^0 a_t \leftarrow \mathrm{max}(1 - dt * S_A, 0) \\ C_i^1 \leftarrow S_i \delta t + a_t C_i^0 $$

This is the case for "grey" opacity. In the use case of differing opacity for each channel, we modify this such that we utilize additional alpha channels for each primary channel. $a_t$ is then defined individually for each output channel.

This particular set of transfer functions, with what amounts to multi-channel, multi-field sampling in both output and opacity, can be specified to perform complex functions such as variable scattering, different weights for different channels, etc.

Hardware-accelerated Volume Rendering

Software volume rendering, as described above, provides a number of affordances for careful visualization. Specifically, as the code and kernels are written in languages that are similar to more traditional languages such as C and C++, the barrier to entry for describing a new sampling system can be lower. That being said, when examining responsiveness, software volume rendering is rarely -- if ever -- competitive with hardware-based volume rendering, such as that accelerated through graphics processing units (GPUs) and using OpenGL, Metal, Vulkan, or one of the higher-level platforms for graphics. Fluid interactivity is essentially inaccessible for software volume rendering except on the smallest of datasets. And yet, fluid interactivity enables much deeper exploration of the parameter space of visualization, as well as the ability to immerse oneself in data.

To support more interactive visualization (in addition to that described in @sec:jupyter_integration) a basic system for conducting hardware-accelerate, OpenGL-based visualization of yt-supported data has been developed in an external package, yt_idv. yt_idv is built on PyOpenGL and provides support for grid-based, particle-based and finite element datasets, with a heavy emphasis on the grid-based datasets. While the process of volume rendering is interesting, with many different fascinating areas of inquiry and opportunities for optimization, yt_idv is notable for its architecture more than its algorithms and optimization.

yt_idv is built using the Traitlets library, and utilizes the immediate-mode graphical user interface system "Dear ImGUI" for presenting a user interface. This allows for the system to be largely data-driven; frame redraws are only executed when a parameter changes (such as the camera path, transfer function, etc) and new parameters for the visualization can be easily exposed to the user interface. Furthermore, this allows shaders to be dynamically recompiled only as necessary.

In addition to this, the shaders themselves are specified in a configuration file that links GLSL source files into multi-component shaders, allowing declarative construction of shader pipelines and improving interoperability. For ray-casting routines, only a small kernel needs to be modified to change the functionality. This allows the relatively complex process of casting rays through multiple volumes to be hidden, much as described in @sec:software-volume-rendering. Annotations can be added simply, and an event loop has been enabled so that users can interact with the running visualization interface through IPython.

While this project includes a number of additional features designed for accessibility of data and in-depth coupling of visualization with quantitative analysis, they extend beyond the scope of this paper.