Skip to content

Commit

Permalink
Merge pull request #190 from SWIFTSIM/load_fof_catalogues
Browse files Browse the repository at this point in the history
Load FOF catalogues & Cosmology Metadata Changes
  • Loading branch information
JBorrow authored Sep 19, 2024
2 parents 5492c3f + 45e2b8f commit ac05b8d
Show file tree
Hide file tree
Showing 23 changed files with 1,699 additions and 1,206 deletions.
3 changes: 1 addition & 2 deletions .github/workflows/pytest.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.8, 3.9, "3.10", "3.11", "3.12"]
python-version: ["3.10", "3.11", "3.12"]

steps:
- uses: actions/checkout@v2
Expand All @@ -26,7 +26,6 @@ jobs:
${{ runner.os }}-pip-${{ runner.os }}-
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install flake8 pytest
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f optional_requirements.txt ]; then pip install -r optional_requirements.txt; fi
Expand Down
37 changes: 33 additions & 4 deletions docs/source/loading_data/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,9 @@ Loading Data
The main purpose of :mod:`swiftsimio` is to load data. This section will tell
you all about four main objects:

+ :obj:`swiftsimio.reader.SWIFTUnits`, responsible for creating a correspondence between
+ :obj:`swiftsimio.metadata.objects.SWIFTUnits`, responsible for creating a correspondence between
the SWIFT units and :mod:`unyt` objects.
+ :obj:`swiftsimio.reader.SWIFTMetadata`, responsible for loading any required information
+ :obj:`swiftsimio.metadata.objects.SWIFTMetadata`, responsible for loading any required information
from the SWIFT headers into python-readable data.
+ :obj:`swiftsimio.reader.SWIFTDataset`, responsible for holding all particle data, and
keeping track of the above two objects.
Expand Down Expand Up @@ -47,8 +47,8 @@ notebook, and you will see that it contains several sub-objects:
simulation.
+ ``data.dark_matter``, likewise containing information about the dark matter
particles in the simulation.
+ ``data.metadata``, an instance of :obj:`swiftsimio.reader.SWIFTMetadata`
+ ``data.units``, an instance of :obj:`swiftsimio.reader.SWIFTUnits`
+ ``data.metadata``, an instance of :obj:`swiftsimio.metadata.objects.SWIFTSnapshotMetadata`
+ ``data.units``, an instance of :obj:`swiftsimio.metadata.objects.SWIFTUnits`

Using metadata
--------------
Expand Down Expand Up @@ -268,3 +268,32 @@ in SWIFT will be automatically read.
data = sw.load(
"extra_test.hdf5",
)
Halo Catalogues
---------------

SWIFT-compatible halo catalogues, such as those written with SOAP, can be
loaded entirely transparently with ``swiftsimio``. It is generally possible
to use all of the functionality (masking, visualisation, etc.) that is used
with snapshots with these files, assuming the files conform to the
correct metadata standard.

An example SOAP file is available at
``http://virgodb.cosma.dur.ac.uk/swift-webstorage/IOExamples/soap_example
.hdf5``

You can load SOAP files as follows:

.. code-block:: python
from swiftsimio import load
catalogue = load("soap_example.hdf5")
print(catalogue.spherical_overdensity_200_mean.total_mass)
# >>> [ 591. 328.5 361. 553. 530. 507. 795.
# 574. 489.5 233.75 0. 1406. 367.5 2308.
# ...
# 0. 534. 0. 191.75 1450. 600. 290. ] 10000000000.0*Msun (Physical)
12 changes: 6 additions & 6 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,27 +12,27 @@ packages = [
"swiftsimio.metadata.particle",
"swiftsimio.metadata.unit",
"swiftsimio.metadata.writer",
"swiftsimio.metadata.soap",
"swiftsimio.visualisation",
"swiftsimio.visualisation.projection_backends",
"swiftsimio.visualisation.slice_backends",
"swiftsimio.visualisation.tools",
"swiftsimio.visualisation.smoothing_length"
"swiftsimio.visualisation.smoothing_length",
]

[project]
name = "swiftsimio"
version="8.0.1"
version="9.0.0"
authors = [
{ name="Josh Borrow", email="[email protected]" },
]
description="SWIFTsim (swift.dur.ac.uk) i/o routines for python."
description="SWIFTsim (swiftsim.com) i/o routines for python."
readme = "README.md"
requires-python = ">3.8.0"
requires-python = ">3.10.0"
classifiers = [
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"License :: OSI Approved :: GNU Lesser General Public License v3 or later (LGPLv3+)",
"Operating System :: OS Independent",
]
Expand Down
7 changes: 3 additions & 4 deletions swiftsimio/__init__.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
from .reader import *
from .writer import SWIFTWriterDataset
from .snapshot_writer import SWIFTSnapshotWriter
from .masks import SWIFTMask
from .statistics import SWIFTStatisticsFile
from .__version__ import __version__
Expand Down Expand Up @@ -75,7 +75,7 @@ def mask(filename, spatial_only=True) -> SWIFTMask:
"""

units = SWIFTUnits(filename)
metadata = SWIFTMetadata(filename, units)
metadata = metadata_discriminator(filename, units)

return SWIFTMask(metadata=metadata, spatial_only=spatial_only)

Expand Down Expand Up @@ -109,5 +109,4 @@ def load_statistics(filename) -> SWIFTStatisticsFile:
return SWIFTStatisticsFile(filename=filename)


# Rename this object to something simpler.
Writer = SWIFTWriterDataset
Writer = SWIFTSnapshotWriter
105 changes: 65 additions & 40 deletions swiftsimio/masks.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@
snapshots.
"""

import warnings

import unyt
import h5py

Expand Down Expand Up @@ -48,6 +50,11 @@ def __init__(self, metadata: SWIFTMetadata, spatial_only=True):
self.units = metadata.units
self.spatial_only = spatial_only

if not self.metadata.masking_valid:
raise NotImplementedError(
f"Masking not supported for {self.metadata.output_type} filetype"
)

if self.metadata.partial_snapshot:
raise InvalidSnapshot(
"You cannot use masks on partial snapshots. Please use the virtual "
Expand All @@ -65,9 +72,11 @@ def _generate_empty_masks(self):
types.
"""

for ptype in self.metadata.present_particle_names:
for group_name in self.metadata.present_group_names:
setattr(
self, ptype, np.ones(getattr(self.metadata, f"n_{ptype}"), dtype=bool)
self,
group_name,
np.ones(getattr(self.metadata, f"n_{group_name}"), dtype=bool),
)

return
Expand Down Expand Up @@ -100,21 +109,24 @@ def _unpack_cell_metadata(self):
# contain at least one of each type of particle).
sort = None

for ptype, pname in zip(
self.metadata.present_particle_types, self.metadata.present_particle_names
for group, group_name in zip(
self.metadata.present_groups, self.metadata.present_group_names
):
part_type = f"PartType{ptype}"
counts = count_handle[part_type][:]
offsets = offset_handle[part_type][:]
if self.metadata.shared_cell_counts is None:
counts = count_handle[group][:]
offsets = offset_handle[group][:]
else:
counts = count_handle[self.metadata.shared_cell_counts][:]
offsets = offset_handle[self.metadata.shared_cell_counts][:]

# When using MPI, we cannot assume that these are sorted.
if sort is None:
# Only compute once; not stable between particle
# types if some datasets do not have particles in a cell!
sort = np.argsort(offsets)

self.offsets[pname] = offsets[sort]
self.counts[pname] = counts[sort]
self.offsets[group_name] = offsets[sort]
self.counts[group_name] = counts[sort]

# Also need to sort centers in the same way
self.centers = unyt.unyt_array(centers_handle[:][sort], units=self.units.length)
Expand All @@ -128,7 +140,7 @@ def _unpack_cell_metadata(self):

def constrain_mask(
self,
ptype: str,
group_name: str,
quantity: str,
lower: unyt.array.unyt_quantity,
upper: unyt.array.unyt_quantity,
Expand All @@ -139,13 +151,13 @@ def constrain_mask(
We update the mask such that
lower < ptype.quantity <= upper
lower < group_name.quantity <= upper
The quantities must have units attached.
Parameters
----------
ptype : str
group_name : str
particle type
quantity : str
Expand All @@ -169,23 +181,17 @@ def constrain_mask(
print("Please re-initialise the SWIFTMask object with spatial_only=False")
return

current_mask = getattr(self, ptype)
current_mask = getattr(self, group_name)

particle_metadata = getattr(self.metadata, f"{ptype}_properties")
group_metadata = getattr(self.metadata, f"{group_name}_properties")
unit_dict = {
k: v
for k, v in zip(
particle_metadata.field_names, particle_metadata.field_units
)
k: v for k, v in zip(group_metadata.field_names, group_metadata.field_units)
}

unit = unit_dict[quantity]

handle_dict = {
k: v
for k, v in zip(
particle_metadata.field_names, particle_metadata.field_paths
)
k: v for k, v in zip(group_metadata.field_names, group_metadata.field_paths)
}

handle = handle_dict[quantity]
Expand All @@ -203,7 +209,7 @@ def constrain_mask(

current_mask[current_mask] = new_mask

setattr(self, ptype, current_mask)
setattr(self, group_name, current_mask)

return

Expand Down Expand Up @@ -282,7 +288,7 @@ def _generate_cell_mask(self, restrict):

return cell_mask

def _update_spatial_mask(self, restrict, ptype: str, cell_mask: np.array):
def _update_spatial_mask(self, restrict, group_name: str, cell_mask: np.array):
"""
Updates the particle mask using the cell mask.
Expand All @@ -296,28 +302,28 @@ def _update_spatial_mask(self, restrict, ptype: str, cell_mask: np.array):
restrict : list
currently unused
ptype : str
group_name : str
particle type to update
cell_mask : np.array
cell mask used to update the particle mask
"""

if self.spatial_only:
counts = self.counts[ptype][cell_mask]
offsets = self.offsets[ptype][cell_mask]
counts = self.counts[group_name][cell_mask]
offsets = self.offsets[group_name][cell_mask]

this_mask = [[o, c + o] for c, o in zip(counts, offsets)]

setattr(self, ptype, np.array(this_mask))
setattr(self, f"{ptype}_size", np.sum(counts))
setattr(self, group_name, np.array(this_mask))
setattr(self, f"{group_name}_size", np.sum(counts))

else:
counts = self.counts[ptype][~cell_mask]
offsets = self.offsets[ptype][~cell_mask]
counts = self.counts[group_name][~cell_mask]
offsets = self.offsets[group_name][~cell_mask]

# We must do the whole boolean mask business.
this_mask = getattr(self, ptype)
this_mask = getattr(self, group_name)

for count, offset in zip(counts, offsets):
this_mask[offset : count + offset] = False
Expand Down Expand Up @@ -367,8 +373,8 @@ def constrain_spatial(self, restrict, intersect: bool = False):
# we just make a new mask
self.cell_mask = self._generate_cell_mask(restrict)

for ptype in self.metadata.present_particle_names:
self._update_spatial_mask(restrict, ptype, self.cell_mask)
for group_name in self.metadata.present_group_names:
self._update_spatial_mask(restrict, group_name, self.cell_mask)

return

Expand All @@ -391,19 +397,38 @@ def convert_masks_to_ranges(self):
# Use the accelerate.ranges_from_array function to convert
# This into a set of ranges.

for ptype in self.metadata.present_particle_names:
for group_name in self.metadata.present_group_names:
setattr(
self,
ptype,
group_name,
# Because it nests things in a list for some reason.
np.where(getattr(self, ptype))[0],
np.where(getattr(self, group_name))[0],
)

setattr(self, f"{ptype}_size", getattr(self, ptype).size)
setattr(self, f"{group_name}_size", getattr(self, group_name).size)

for ptype in self.metadata.present_particle_names:
setattr(self, ptype, ranges_from_array(getattr(self, ptype)))
for group_name in self.metadata.present_group_names:
setattr(self, group_name, ranges_from_array(getattr(self, group_name)))

return

def constrain_index(self, index: int):
"""
Constrain the mask to a single row.
Intended for use with SOAP catalogues, mask to read only a single row.
Parameters
----------
index : int
The index of the row to select.
"""
if not self.metadata.filetype == "SOAP":
warnings.warn("Not masking a SOAP catalogue, nothing constrained.")
return
for group_name in self.metadata.present_group_names:
setattr(self, group_name, np.array([[index, index + 1]]))
setattr(self, f"{group_name}_size", 1)
return

def get_masked_counts_offsets(self) -> (Dict[str, np.array], Dict[str, np.array]):
Expand Down
2 changes: 2 additions & 0 deletions swiftsimio/metadata/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@
from .particle import particle_types
from .particle import particle_fields

from .soap import soap_types

from .unit import unit_types
from .unit import unit_fields

Expand Down
4 changes: 4 additions & 0 deletions swiftsimio/metadata/metadata/metadata_fields.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,8 +23,12 @@
header_unpack_arrays = {
"BoxSize": "boxsize",
"NumPart_ThisFile": "num_part",
"NumGroup_ThisFile": "num_group",
"NumSubhalos_ThisFile": "num_subhalo",
"CanHaveTypes": "has_type",
"NumFilesPerSnapshot": "num_files_per_snapshot",
"OutputType": "output_type",
"SubhaloTypes": "subhalo_types",
}

# Some of these 'arrays' are really types of mass table, so unpack
Expand Down
Loading

0 comments on commit ac05b8d

Please sign in to comment.