Update documentation about validity metadata files #11

59 changes: 37 additions & 22 deletions docs/src/metadata.md
@@ -1,55 +1,70 @@
# Metadata

LEGEND metadata is stored in [JSON](https://www.json.org). Formatting guidelines:
LEGEND metadata is stored in [YAML](https://www.yaml.org) or
[JSON](https://www.json.org). Formatting guidelines:

* In general, field names should be interpretable as valid variable names in
common programming languages (e.g. use underscores (`_`) instead of dashes
(`-`))
* Snake casing is preferred, i.e. separating non-capitalized words with
underscores (`_`).

## Physical units

Physical units should be specified as part of a field name by adding
`_in_<units>` at the end. For example:

```json
{
"radius_in_mm": 11,
"temperature_in_K": 7
}
```yaml
radius_in_mm: 11
temperature_in_K: 7
```
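
A consumer of such metadata can recover the unit from the field name. The following is a minimal Python sketch of that idea; the helper name `split_unit` is hypothetical and not part of any LEGEND package, and it assumes the unit is whatever follows the last `_in_` in the name.

```python
def split_unit(field_name):
    """Split 'radius_in_mm' into ('radius', 'mm').

    Returns (field_name, None) when the name carries no '_in_<units>' suffix.
    Hypothetical helper, for illustration only.
    """
    base, sep, unit = field_name.rpartition("_in_")
    if not sep or not base or not unit:
        return field_name, None
    return base, unit


record = {"radius_in_mm": 11, "temperature_in_K": 7}

# Map each quantity name to a (value, unit) pair.
parsed = {}
for key, value in record.items():
    name, unit = split_unit(key)
    parsed[name] = (value, unit)

print(parsed)
```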

## Specifying metadata validity in time (and system)

LEGEND adopts a custom file format to specify the validity of metadata (for
LEGEND adopts the YAML format to specify the validity of metadata (for
example a data production configuration that varies in time or according to the
data taking mode), called JSONL (JSON + Legend).
data taking mode).

A JSONL file is essentially a collection of JSON-formatted records. Each record
is formatted as follows:
A `validity.yaml` file is essentially a collection of records. Each record is
formatted as follows:

```json
{"valid_from": "TIMESTAMP", "category": "DATATYPE", "apply": ["FILE1", "FILE2", ...]}
```yaml
- valid_from: TIMESTAMP
  category: DATATYPE
  mode: MODE
  apply:
    - FILE1.yaml
    - FILE2.yaml
    - ...
```

where:

* `TIMESTAMP` is a LEGEND-style timestamp `yyyymmddThhmmssZ` (in UTC),
also used to label data cycles, specifying the start of validity
* `DATATYPE` is the data type (`all`, `phy`, `cal`, `lar`, etc.) to which the
metadata applies
* `apply` takes an array of metadata files, to be combined "in cascade"
(precedence order right to left) into the final metadata object
metadata applies. If omitted, it should default to `all`.
* `MODE` can be `reset`, `append`, `remove`, `replace`. If omitted and this is
the first record in the file, defaults to `reset`, otherwise defaults to
`append`.
* `apply` takes an array of metadata files, to be combined into the main
metadata object depending on `mode` (see below). In general, the files are
combined "in cascade" (precedence order first to last) into the final metadata
object.
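
The field rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: the function names `parse_timestamp` and `normalize_record` are hypothetical, and records are assumed to be already loaded as plain dictionaries.

```python
from datetime import datetime, timezone


def parse_timestamp(ts):
    """Parse a LEGEND-style timestamp such as '20230101T000000Z' as UTC."""
    return datetime.strptime(ts, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)


def normalize_record(record, is_first):
    """Return a copy of the record with the documented defaults filled in.

    - category defaults to 'all'
    - mode defaults to 'reset' for the first record, 'append' otherwise
    """
    normalized = dict(record)
    normalized.setdefault("category", "all")
    normalized.setdefault("mode", "reset" if is_first else "append")
    return normalized


raw = {"valid_from": "20230101T000000Z", "apply": ["FILE1.yaml"]}
first = normalize_record(raw, is_first=True)
later = normalize_record(raw, is_first=False)
start = parse_timestamp(first["valid_from"])
```

Because the timestamp format has fixed width and goes from the most to the least significant field, plain string comparison of two timestamps gives the same ordering as comparing the parsed values.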

The record above translates to:
The above example record, if appearing at the top of the validity file,
translates to:

> Combine `FILE1`, `FILE2` etc. into a single metadata object. Fields in
> `FILE2` override fields in `FILE1`. This metadata applies only to `DATATYPE`
> data and is valid from `TIMESTAMP` on.
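
A minimal Python sketch of the "in cascade" combination, following the rule quoted above that fields in `FILE2` override fields in `FILE1`. The function names are hypothetical, and merging nested mappings recursively (rather than replacing them wholesale) is an assumption, since the text does not specify it.

```python
def merge(base, override):
    """Return a new dict where values from `override` take precedence.

    Nested dicts are merged recursively (an assumption of this sketch).
    """
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


def cascade(objects):
    """Combine metadata objects in order; later ones override earlier ones."""
    final = {}
    for obj in objects:
        final = merge(final, obj)
    return final


file1 = {"a": 1, "b": {"x": 1, "y": 2}}
file2 = {"b": {"y": 3}, "c": 4}
combined = cascade([file1, file2])
```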

Records are stored in JSONL files one per line, without special delimiters:
Modes:

```json
{"valid_from": "TIMESTAMP1", "category": "DATATYPE1", "apply": ["FILE1", "FILE2", ...]}
{"valid_from": "TIMESTAMP2", "category": "DATATYPE2", "apply": ["FILE3", "FILE4", ...]}
...
```
* `reset`: remove all entries from the existing metadata file list before
applying (in cascade) the ones listed in `apply`.
* `append`: append (in cascade) files listed in `apply` to the current file
list.
* `remove`: remove the file(s) listed in `apply` from the current file list.
* `replace`: replace, in the current file list, the first file listed in
`apply` with the second one.
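
The four modes can be illustrated with a short Python sketch that tracks the current file list while walking through the records. Everything here is hypothetical and for illustration only: the function names, the example file names, the assumption that records are listed in chronological order, and the assumption that a record with category `all` applies to every data type.

```python
def apply_record(files, mode, apply):
    """Return the new file list after applying one record."""
    if mode == "reset":
        return list(apply)
    if mode == "append":
        return files + list(apply)
    if mode == "remove":
        return [f for f in files if f not in apply]
    if mode == "replace":
        old, new = apply  # exactly two entries: old file, new file
        return [new if f == old else f for f in files]
    raise ValueError(f"unknown mode: {mode!r}")


def active_files(records, timestamp, category="all"):
    """List the files valid at `timestamp` for the given category."""
    files = []
    for index, record in enumerate(records):
        # Fixed-width timestamps compare correctly as strings.
        if record["valid_from"] > timestamp:
            continue
        if record.get("category", "all") not in ("all", category):
            continue
        mode = record.get("mode", "reset" if index == 0 else "append")
        files = apply_record(files, mode, record["apply"])
    return files


records = [
    {"valid_from": "20230101T000000Z", "apply": ["a.yaml", "b.yaml"]},
    {"valid_from": "20230201T000000Z", "category": "cal", "apply": ["cal.yaml"]},
    {"valid_from": "20230301T000000Z", "mode": "replace", "apply": ["b.yaml", "c.yaml"]},
    {"valid_from": "20230401T000000Z", "mode": "remove", "apply": ["a.yaml"]},
]

print(active_files(records, "20230215T000000Z", "cal"))
print(active_files(records, "20230501T000000Z", "phy"))
```

The resulting list would then be loaded and combined in cascade to produce the final metadata object.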