Craft TSV records for an SFB demo #1
Specification of dataset metadata in a table format, where each dataset property is described in a single row. Below is one section on each recognized property. Each section contains a table snippet with details on the syntax and semantics for that property.

Dataset identifier [required]
Definition: https://schema.org/identifier

Dataset name [required]
Definition: https://schema.org/name

Dataset description [optional]
Definition: https://schema.org/description

Dataset version [optional]
Definition: https://schema.org/version

Author [optional]
Definition: https://schema.org/author
One or more rows are supported, one row per author. The order in the table (top to bottom) defines the author order (first is first, last is last).

Publication [optional]
Definition: https://schema.org/publication
One or more rows are supported, one row per publication related to the dataset, or possibly a publication of the dataset itself. The citation is free-format. Citation style should be consistent across publications.

Dataset keywords [optional]
Definition: https://schema.org/keywords
One keyword per cell, no limit on the number of keywords.

Custom dataset property [optional]
Definition: https://schema.org/Property
One special property category label (col1) is recognized:
Any row with a col1 cell value that does not match a recognized field is interpreted as a custom property. Col4 and col5 allow for amending the "display name" and "display value" of the property with identifiers in the form of URLs to the definitions of the respective concepts and items.
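For illustration, the row-per-property layout could be read along these lines. This is only a sketch under assumptions: the col1 labels (`identifier`, `name`, `description`, `version`, `author`, `publication`, `keywords`) are hypothetical stand-ins for whatever the table snippets define, the file is assumed to be tab-separated, and custom rows are assumed to carry the display name in col2, the display value in col3, and the two URLs in col4 and col5.

```python
import csv
import io

# Hypothetical col1 labels; the real ones are defined by the spec's table snippets.
SINGLE_VALUE = {"identifier", "name", "description", "version"}
MULTI_ROW = {"author", "publication"}
REQUIRED = {"identifier", "name"}


def parse_metadata(text):
    """Collect one-property-per-row TSV records into a dict."""
    meta = {"author": [], "publication": [], "keywords": [], "custom": []}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        cells = [c.strip() for c in row]
        if not cells or not cells[0]:
            continue  # skip blank rows
        label, values = cells[0], cells[1:]
        if label in SINGLE_VALUE:
            meta[label] = values[0] if values else ""
        elif label in MULTI_ROW:
            # Rows are appended top to bottom, so table order is preserved
            # (this is what defines author order).
            meta[label].append([v for v in values if v])
        elif label == "keywords":
            # One keyword per cell, any number of cells.
            meta["keywords"].extend(v for v in values if v)
        else:
            # Unrecognized col1 value: interpreted as a custom property.
            # Assumed layout: col2 name, col3 value, col4/col5 definition URLs.
            padded = cells + [""] * (5 - len(cells))
            meta["custom"].append({
                "label": padded[0],
                "display_name": padded[1],
                "display_value": padded[2],
                "name_url": padded[3],
                "value_url": padded[4],
            })
    missing = REQUIRED - set(meta)
    if missing:
        raise ValueError(f"missing required properties: {sorted(missing)}")
    return meta
```

Keeping custom rows in a separate list means a typo in a recognized label shows up as an unexpected custom property instead of being silently dropped.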
Given the specification above, for the SFB1451 we can additionally require the following (already covered by the spec above, just spelled out here).

Origin SFB project [required]
Columns 3 and 4 are implicit: 3 is constant, and 4 is the link to the project landing page on the SFB website (for example).

To be extended with relevant metadata, such as species, limbs, recording method, ...
Looks good to me so far. Regarding "publication": if the current catalog schema & rendering is the target, then the catalog would show (in different fields): Title, Authors, Publication year, DOI & Journal. I would be cautious about allowing free-form citations; they can be a nightmare to parse. Requiring a DOI is a great idea. Maybe we add other fields instead of a free-form citation?
I'm confused by the definitions of Custom dataset property [optional] and Origin SFB project [required]. This is my interpretation in the form of data:
For the first two rows, col2 and col3 represent the
Where/what is the confusion?
If the sample data I provided in the table looks correct, then I guess my interpretation is correct. But for clarity: Custom properties:
Does that mean it should not be indicated with
Origin SFB project: I'm not confused about this anymore.
Wording could probably be improved, but you did it according to my intent for the spec, hence I cannot explain why it should be different. It should be exactly like you did. But ideally also with the other two columns that bring the semantic clarity to name and value.
Some context in: datalad/datalad-catalog#311
Example dataset (here in csv):
Example file (here in csv):
Some notes from discussion:
dataset_id ourselves deterministically, from e.g. project and identifier and name?
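If `dataset_id` were derived deterministically from project, identifier, and name, as the note above asks, one option would be a name-based UUID (version 5). The following is a sketch only: the namespace URL and the function name `make_dataset_id` are hypothetical, and nothing in the discussion settles on this approach.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/sfb-demo")


def make_dataset_id(project, identifier, name):
    """Derive a stable ID from the three fields using a name-based UUID."""
    # Join with a separator that is unlikely to occur in the fields, so that
    # ("a", "bc", ...) and ("ab", "c", ...) do not produce the same key.
    key = "\x1f".join(part.strip() for part in (project, identifier, name))
    return str(uuid.uuid5(NAMESPACE, key))
```

The same three inputs always yield the same ID, so records can be regenerated without tracking previously assigned IDs. The trade-off is that renaming a dataset changes its ID if `name` is part of the key.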