-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
validator would fail on "legit" "n/a" value #1026
Comments
Could you amplify on which validator you are using? I'm pretty sure that the python validator treats missing values (n/a) values in the onset column as a fatal error. What does it mean if an events.tsv field has an n/a onset value? |
It is Python one as from this PR https://github.com/hed-standard/hed-python/pull/1025/files Means that we don't know when it was exactly presented, likely between events with timings present. @jungheejung could elaborate more |
hmmm -- since it has been established that event files in BIDS do not have to be ordered by onset, I don't think it would be meaningful to allow onset times to be "n/a". It used to be explicitly so in the spec. It is implied in the description under Task Events in that it explicitly says that duration allows n/a but does not say that for onset. Both are required columns. |
well, "nothing forbids" it to be n/a ATM, and as for meaningful -- might still be useful to know that there was some event of that kind but timing information was lost, in comparison to completely leaving having such an event out. A somewhat odd example: earthquake ;) you would know it did happen during the scan but would not know when exactly... Or a little more realistic -- "someone passed with a metal roller bad by the MEG scanner room" during session ;) So indeed -- could be argued, but allowed for in BIDS ATM, thus could be present/used, and tool should ideally be robust to encounter such cases |
This doesn't address the issue.... what does that event mean? It seems that the implication is that this event has a mystery time that is between the times of the two events in which it appears between in the tsv. This is problematic because events don't have to be sorted by event time. If you sort -- those will appear at the beginning or the end and you will have lost the association. Other numeric columns can have n/a, no problem, but it doesn't seem reasonable that events --- which are supposed to be aligned on the time line should be allowed to have no onset times. A potential way to handle this is to introduce a "foggy" time notion. We could introduce a tag that says the value of something is known to within +/- some amount (e.g., Error-bar/#). If an n/a in the onset column is to mean that it was just something that happened on some unknown time during the experiment, then this specific meaning must be spelled out really explicitly in BIDS and I can't help thinking it should go somewhere else. @sappelhoff @effigies @Remi-Gau --- the question here is whether the |
IMO it would mean that the dataset author accepts undefined treatment of those rows. A tool that needs values to operate can justifiably drop missing values. |
@yarikoptic -- we'll fix the issue as follows --- clearly the validator should not throw an exception. For now we will ignore the rows with |
Sounds great! Thank you in advance! |
@VisLab Hi Kay, not to be pushy, but I wonder when do you think you would have time to address this issue? we would like to make our BIDS dataset sparkling clean for HED ;) |
I'm testing now --- maybe tomorrow.....
…On Fri, Oct 11, 2024 at 1:08 PM Yaroslav Halchenko ***@***.***> wrote:
@VisLab <https://github.com/VisLab> Hi Kay, not to be pushy, but I wonder
when do you think you would have time to address this issue? we would like
to make our BIDS dataset sparkling clean for HED ;)
—
Reply to this email directly, view it on GitHub
<#1026 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAJCJOQUEBSFWFYCHYA6K7DZ3AH2RAVCNFSM6AAAAABPKBU7A2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMBXHEYDEOBWGE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Fixed by PR #1028. Let me know if there are issues. The validator now treats n/a onset rows as though there were no onset column. It validates the HED but does not allow event processes that have onsets and offsets. Let me know if you encounter any issues. Thx |
on "spacetop" dataset (which is yet to be made public) there is .tsv files with
n/a
in some .tsv's, which is the way BIDS mandates to code "missing values". See https://bids-specification.readthedocs.io/en/stable/common-principles.html#tabular-files which statesI filed
though.
ATM for my sample dataset (not yet public on openneuro, attn @jungheejung - remind me if we do have it somewhere public meanwhile), a version of
hed-validator
(#1025) blows withnote that it is also unclear on which file it blows -- so might be worth providing some feedback (logging ERROR level or just catch/enhance exception while working on a specific file) on which file it happens.
I think it is likely on
n/a
in onset field in some_events.tsv
, e.g.FWIW -- we did not see similar crash while running deno bids-validator (I guess uses JS version?).
The text was updated successfully, but these errors were encountered: