Missing data when a public alert is first encountered in the previous history of a private alert #5

Open
wombaugh opened this issue Mar 4, 2019 · 2 comments
@wombaugh
Contributor

wombaugh commented Mar 4, 2019

As photopoints are not reingested, a photopoint first encountered as a previous candidate is not saved again if it later appears as a candidate. This matters because the previous-candidate information does not include all data. The DB content therefore depends on the order in which alerts are processed. (In principle this is a problem with the stream rather than with ampel.)

What can be done:

  • A T3 could make sure information is updated consistently, but this would make the base DB content dependent on whether and when a T3 was run.
  • We could force all candidates to be ingested, even if already in the DB. This could imply a large performance penalty. This could still be an option for some channels that are critically using this information.
  • Change the ingestor so that some of the missing information is inherited by the previous candidates.

Right now it seems like the third, least invasive, option is sufficient. These are the affected keys:

Keys that can be inherited: 'nmtchps', 'srmag2', 'objectidps1', 'distpsnr1', 'maggaiabright', 'sgmag2', 'simag2', 'maggaia', 'nmatches', 'objectidps3', 'distpsnr2', 'sgscore3', 'distpsnr3', 'sgmag3', 'objectidps2', 'simag3', 'jdstartref', 'sgmag1', 'neargaiabright', 'szmag3', 'szmag1', 'srmag1', 'srmag3', 'neargaia', 'jdendref', 'simag1', 'sgscore1', 'sgscore2', 'rfid', 'nframesref', 'szmag2'
(Note that this assumes the reference build did not change between the history and the new observation.)

There are a number of keys that should not be saved as a matter of principle: ['jdstarthist', 'ndethist', 'ncovhist', 'jdendhist']. These are different in that they depend not only on the candidate but also on the state at processing time (in that sense they are more similar to a science record).

A final list of properties also cannot be inherited, as they are exposure dependent, but they should really be part of the alert. We should contact Caltech about these: 'zpclrcov', 'ssnrms', 'dsdiff', 'dsnrms', 'clrrms', 'zpmed', 'clrmed', 'exptime', 'tooflag'
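
The three groups of keys above could be kept as constants next to the ingestion code. The following is only a minimal sketch, assuming alert candidates are plain dicts; the names INHERITABLE_KEYS, STATE_KEYS, EXPOSURE_KEYS and split_candidate are hypothetical and not taken from ampel:

```python
# Hypothetical constants; the key names themselves are copied from this issue.

# Reference-image and catalog-match keys that can be inherited.
INHERITABLE_KEYS = frozenset({
    'nmtchps', 'srmag2', 'objectidps1', 'distpsnr1', 'maggaiabright',
    'sgmag2', 'simag2', 'maggaia', 'nmatches', 'objectidps3', 'distpsnr2',
    'sgscore3', 'distpsnr3', 'sgmag3', 'objectidps2', 'simag3', 'jdstartref',
    'sgmag1', 'neargaiabright', 'szmag3', 'szmag1', 'srmag1', 'srmag3',
    'neargaia', 'jdendref', 'simag1', 'sgscore1', 'sgscore2', 'rfid',
    'nframesref', 'szmag2',
})

# Keys that depend on the state at processing time; not to be saved.
STATE_KEYS = frozenset({'jdstarthist', 'ndethist', 'ncovhist', 'jdendhist'})

# Exposure-dependent keys; cannot be inherited.
EXPOSURE_KEYS = frozenset({
    'zpclrcov', 'ssnrms', 'dsdiff', 'dsnrms', 'clrrms', 'zpmed', 'clrmed',
    'exptime', 'tooflag',
})


def split_candidate(candidate):
    """Sort the items of a candidate dict into the three groups plus 'other'."""
    groups = {"inheritable": {}, "state": {}, "exposure": {}, "other": {}}
    for key, value in candidate.items():
        if key in INHERITABLE_KEYS:
            groups["inheritable"][key] = value
        elif key in STATE_KEYS:
            groups["state"][key] = value
        elif key in EXPOSURE_KEYS:
            groups["exposure"][key] = value
        else:
            groups["other"][key] = value
    return groups
```

Keeping the groups as disjoint sets makes it easy to check that no key is accidentally listed twice.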

@vbrinnel
Member

vbrinnel commented Mar 7, 2019

When processing an alert, the ingester first requests all photopoints associated with the corresponding transient. (A projection is used so that only a subset of the existing fields is returned, currently: _id, alTags, jd, fid, rcid, alExcluded, magpsf.) This query serves several purposes. It makes it possible:

  • to optimize the ingestion by not re-inserting existing data into the DB
  • to create compounds containing all previous photopoints (not only the one from the last 30 days)
  • to detect superseded data
  • to take channel custom photopoint rejections into account

The way I thought of handling this issue was not to inherit properties from one photopoint to another, but to make sure that the information about a given photopoint contained in candidate is really present in the DB, and to add it if it is not. More precisely:

  1. I would add nframesref to the projection described previously (no performance penalty). This makes it possible to detect DB photopoints lacking this information, i.e. photopoints that were inserted into the DB based on information originating from prv_candidates.
  2. Then I would check whether the current photopoint (defined in the dict candidate from the alert) has a DB counterpart that is lacking nframesref.
  3. If so, I would add an UpdateOne operation to the list of bulk operations, which would, based on the information from candidate, $set the fields nmtchps, srmag2, objectidps1, distpsnr1, maggaiabright, sgmag2, simag2, maggaia, nmatches, objectidps3, distpsnr2, sgscore3, distpsnr3, sgmag3, objectidps2, simag3, jdstartref, sgmag1, neargaiabright, szmag3, szmag1, srmag1, srmag3, neargaia, jdendref, simag1, sgscore1, sgscore2, rfid, nframesref, szmag2
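
The three steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual ingester code: build_backfill_update and db_photopoints are hypothetical names, the sketch assumes the DB _id of a photopoint equals the candid of the alert candidate, and REF_KEYS is abbreviated (the full list is the one in step 3). The function returns a plain (filter, update) pair, which could then be passed to pymongo's UpdateOne(filter, update) and appended to the list of bulk operations.

```python
# Abbreviated for readability; the full key list is the one given in step 3.
REF_KEYS = ("rfid", "nframesref", "jdstartref", "jdendref", "nmatches")

# Step 1: the existing projection, extended with nframesref.
PROJECTION = {
    key: 1
    for key in ("_id", "alTags", "jd", "fid", "rcid", "alExcluded",
                "magpsf", "nframesref")
}


def build_backfill_update(candidate, db_photopoints, keys=REF_KEYS):
    """Return a (filter, update) pair for the alert candidate, or None.

    candidate:      the 'candidate' dict of the alert being processed.
    db_photopoints: dicts returned by the projected pre-ingestion query.
    Assumption (not confirmed by the source): _id equals candid.
    """
    for photopoint in db_photopoints:
        if photopoint["_id"] != candidate["candid"]:
            continue
        # Step 2: a DB counterpart exists; does it lack nframesref?
        if "nframesref" in photopoint:
            return None  # inserted from a full candidate, nothing to do
        # Step 3: $set the fields taken from the alert candidate.
        values = {key: candidate[key] for key in keys if key in candidate}
        if not values:
            return None  # an empty $set would be rejected by MongoDB
        return {"_id": photopoint["_id"]}, {"$set": values}
    return None  # no DB counterpart: regular insertion applies
```

Because the check only reads a field that is already part of the projected query result, the extra cost per alert is a single pass over photopoints that were fetched anyway.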

Advantage: this is fairly simple to implement, should catch most of the cases and does not imply any performance penalty.

Drawback 1: it only fixes photopoints that triggered an alert at some point. Say the first alert of a transient contains photopoints in prv_candidates: those will never be updated. For these, you would need to 'inherit' these properties.

Drawback 2: the fix does not work if the alert is rejected for some reason (color too blue...) and auto-complete is not activated.

@vbrinnel
Member

Regarding Drawback 1:
The photodata query performed before each ingestion makes it possible to determine whether the transient is new. If it is new, all photodata included in prv_candidates are inserted into the DB (photopoints and upper limits). At this point, we could make sure the previously described properties included in candidate are transferred to the photopoints contained in prv_candidates. Should an alert later be triggered by a photopoint originally present in prv_candidates, the $set operation described in #5 (comment) will write the exact properties of this photopoint (from candidate) to the DB. The photopoint properties will then be the "true" ones and no longer the inherited ones.
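
For a new transient, the transfer of properties could look roughly like this. It is a sketch only: inherit_reference_keys is a hypothetical name, REF_KEYS is abbreviated, and the sketch assumes that upper limits in prv_candidates can be recognised by a missing or null candid.

```python
# Abbreviated for readability; see the full list of inheritable keys above.
REF_KEYS = ("rfid", "nframesref", "jdstartref", "jdendref", "nmatches")


def inherit_reference_keys(alert, keys=REF_KEYS):
    """Return copies of prv_candidates with reference keys inherited.

    Photopoints receive the listed keys from alert['candidate'] unless they
    already carry a value of their own. Upper limits are copied unchanged.
    Assumption: an entry without a candid is an upper limit.
    """
    candidate = alert["candidate"]
    inherited = {key: candidate[key] for key in keys if key in candidate}
    result = []
    for prv in alert.get("prv_candidates") or []:
        if prv.get("candid") is None:
            result.append(dict(prv))  # upper limit: nothing to inherit
        else:
            # Values already present on the photopoint take precedence.
            result.append({**inherited, **prv})
    return result
```

Since the inherited values include nframesref, such photopoints would no longer be detected by a check for a missing nframesref, so a separate marker for inherited values may be needed if both approaches are combined.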
