Data access description example #170
This uses a fictitious installation of the RWTH RDM system Coscine as an example. Most of the needed aspects can be mapped onto properties provided by DCAT. The missing pieces are:

1. Specification of the full details of how to perform an actual download request
2. Required authentication

Specifically, `DCAT:endpointURL` is just the "root location or primary endpoint of the service", but most APIs offer more than a download API and hence need some further specification. Coscine uses a particular URL path for its "get blob" API, which needs to be parameterized with three parameters. These parameters need to be put into a qualified relation (Distribution-DataService).

The approach via `endpoint_path` and `access_parameter` is arbitrary. I could not find a common pattern or vocabulary to do this. An equally good (from my POV) alternative would be:

- `DataService.download_url_template`, with some convention to put placeholders into a download URL
- `QualifiedAccess.download_url_parameters`, with a list of name/value pairs

Yet another alternative would be to encode all three shown parameters into a single "identifier" and call that `access_id` (matching what the annex access example is doing). The disadvantage of that would be that the knowledge of the individual parameters that make up the compound identifier is lost. If it is preserved, however, a single-point fix of `endpoint_path` could fix up data access in case of an API version update that does not change the meaning of the identifiers.

Theoretically, we would also need to add information on authentication, but this is also a vast space of possibilities. A system implementing a download based on this schema is given a `contact_point`, and also a data service description, so that it can at least present a user with a meaningful request for entering a credential.
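To make the `endpoint_path` plus `access_parameter` idea concrete, here is a minimal Python sketch of how a client might assemble a download URL. It is only an illustration under stated assumptions: the `{name}` placeholder convention, the function `build_download_url`, the host, the path layout, and the three parameter names are all hypothetical and are not taken from Coscine or from the schema in this PR.

```python
from string import Formatter
from urllib.parse import quote


def build_download_url(endpoint_url, endpoint_path, access_parameters):
    """Join a service endpoint URL with a parameterized path.

    `endpoint_path` uses `{name}` placeholders (an assumed convention);
    `access_parameters` maps placeholder names to values.
    """
    # Collect the placeholder names that the path template requires.
    needed = {name for _, name, _, _ in Formatter().parse(endpoint_path) if name}
    missing = needed - set(access_parameters)
    if missing:
        raise ValueError(f"missing access parameters: {sorted(missing)}")
    # Percent-encode every value so that it stays within one path segment.
    quoted = {k: quote(str(v), safe="") for k, v in access_parameters.items()}
    path = endpoint_path.format(**quoted)
    return endpoint_url.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical service description and per-distribution parameters.
url = build_download_url(
    "https://coscine.example.org/api",
    "v2/projects/{project_id}/resources/{resource_id}/blobs/{path}",
    {"project_id": "p1", "resource_id": "r1", "path": "data/file 1.csv"},
)
print(url)
```

Because the individual parameters are kept separate, an API version change would only require updating the path template on the service side; the per-distribution parameter values stay untouched.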
My initial instinct is that I agree with your description of pros/cons of having a single identifier ( Whether it's named |
Have you seen https://www.hydra-cg.com/spec/latest/core/ ? I came across it several times in my search for schema-driven UI generation. It has several terms related to some aspects of this PR, such as templated links and discovering an API endpoint. I haven't inspected it in detail, though, so I am not sure whether it is actually practically useful. Perhaps worth a quick scan.
The (more or less arbitrary) definition is:

- a property is something that is observed or measured, something that is a fact, given a concrete thing
- a characteristic is the underlying, more generic concept

This is done to be able to distinguish a `Parameter` from a `Property` later on (or in another schema). A `Parameter` is a more causal/functional concept: something that is variable and impacts a system or function.

The reason to base both concepts on the same foundation is their similarity with respect to the schema design principles. We need to be able to express parameters without prescribing a particular vocabulary. Employing the same approach twice in identical fashion (only changing the semantics) avoids needless complexity.
So far it was only in `Distribution`, but we also need it for services and processes, etc.
da0483a to 455d2f3
Major changes are:

- more useful and accurate `distribution` schema description
- more complete `DataService` class
- new `Parameter` concept
- examples that document the capabilities

While most changes more-or-less reflect the continued adoption of `DCAT` concepts and properties, the introduction of `Parameter` is noteworthy. `Parameter` is a variant of `Property` and serves a similar purpose (declare arbitrary additional aspects without prescribing a vocabulary to do so) with only a change in semantics of the class itself. In contrast to `Property` (observed or measured, fixed), `Parameter` is a variable with impact on a system or function.

Closes #171

Via the property `has_parameter`, particular parameters can be declared as supported/needed (e.g., `DataService`), or provided (`QualifiedAccess`). Examples are included.

`QualifiedAccess` is no longer derived from `EntityInfluence`; that has been too much of a stretch. It is now focused on access, and no longer requires a role specification.

Closes #156
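The split between parameters that a `DataService` declares as needed and parameters that a `QualifiedAccess` provides can be sketched roughly as follows. This is not the schema's actual data model: the Python classes only borrow the class names and the `has_parameter` property from the description above, while the `name` and `value` attributes and the helper `unmet_parameters` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Parameter:
    name: str                    # assumed attribute, not from the schema
    value: Optional[str] = None  # None: declared only, no value provided


@dataclass
class DataService:
    endpoint_url: str
    # Parameters the service declares as supported/needed.
    has_parameter: List[Parameter] = field(default_factory=list)


@dataclass
class QualifiedAccess:
    # Parameters provided for one particular distribution.
    has_parameter: List[Parameter] = field(default_factory=list)


def unmet_parameters(service: DataService, access: QualifiedAccess) -> List[str]:
    """Return names the service needs but the access record does not provide."""
    provided = {p.name for p in access.has_parameter if p.value is not None}
    return [p.name for p in service.has_parameter if p.name not in provided]


# Hypothetical parameter names, chosen only for this example.
service = DataService(
    "https://coscine.example.org/api",
    [Parameter("project_id"), Parameter("resource_id"), Parameter("path")],
)
access = QualifiedAccess(
    [Parameter("project_id", "p1"), Parameter("resource_id", "r1")]
)
print(unmet_parameters(service, access))
```

A client could run such a check before attempting a download and report which values are still missing.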
Also prune some prefixes that are not strictly needed.