Rework abstraction layers #132

DominikPinsel · 2021-10-08T11:45:09Z

DominikPinsel
Oct 8, 2021
Collaborator

Hello everyone,

I have a feeling that we need to rethink some of our design decisions. Please note that this text is based on my
assumptions about what a self-sovereign data ecosystem might look like.

GAIA-X Data Ecosystem

So which data ecosystems are there?

To have some ecosystems to compare, let’s look at IDS and Ocean Protocol, both of which should become GAIA-X compliant
at some point in the future, and at GAIA-X itself. That makes:

IDS
GAIA-X
Ocean Protocol

I’d also assume that the GAIA-X ecosystem will take more ideas from SSI, where, for example, contracts can also be
described using verifiable credentials. But let’s leave this aspect out for now.

Ecosystems Similarity

Is there something these data ecosystems have in common? We can use this information to define our "core".

I assume all these ecosystems are similar in that they all can

create data offers
create contracts for data exchange
exchange data (based on a contract)

There are other things the core might consist of, but we are only looking at the ecosystem parts.

Similarity 1: Create data/service offers

For IDS this basically means describing Contract Offers and making them public via Self-Description and Broker. A
contract offer describes different things, like obligations, permissions, resources, artifacts, and so on. These things
must all be described using the IDS Information Model.

I don’t know enough about GAIA-X to be able to describe a contract in detail. But a GAIA-X contract contains rights
& obligations, which might be a simple text document. The contract offer is also called a Service Offering. The Service
Offering must be described using a GAIA-X self-description. Other things that are part of the Service Offering, and
that must also have a GAIA-X compliant self-description, are Resources and Assets.

The Ocean Protocol creates contract offers by providing data offers in an ocean market. There are three different
kinds of data offers: one for a fixed price, one for auctions and one for automated price discovery. Also, some kinds of
offers are only valid for a certain amount of time (“buy now and be able to download the data for one year”). Another
interesting thing is that all offers have their own DID, so creating and updating offers costs real money.

Similarity 2: Create agreement/contracts for data/service exchange

Again let’s start with IDS. Two IDS connectors are able to initiate a contract negotiation between two participants.
How this negotiation happens and what is exchanged is defined by IDS.

Again, for GAIA-X I’m not that sure how this works. As far as I can see, GAIA-X mainly cares that an agreement
happens. No negotiation or exchange/processing of verifiable credentials is defined. But I would assume it
might be necessary to do something before the agreement happens, as all GAIA-X Self-Descriptions may contain
verifiable credentials.

The Ocean Protocol contract should be pretty straightforward. Someone buys something at the market. Now both
parties have a contract.

Similarity 3: Exchange data (based on contract)

IDS has a defined process that needs to be followed to exchange data. In the current IDS, the data exchange is
always synchronous.

Again I’m not sure how GAIA-X wants the data transfer. But I assume they don’t care as long as the data is
transferred.

For the Ocean Protocol I’m not sure whether there is some kind of API, too. But I assume the data may be
synchronously downloaded on their marketplace website, using a download button.

So how is this important for our software architecture? I assume the most important part is that these three systems might
have very little in common, so our initial assumptions about certain aspects are wrong and introduce unnecessary
complexity to our codebase. Basically, I think we put an abstraction layer over stuff we are not able to reuse.

Let’s take some concrete examples.

Policy-Engine & -Model

I assume the ODRL policy engine and model are purely IDS. Therefore, it is not necessary to create an abstraction layer.
We should describe IDS contracts using the IDS terms.

Asset & Asset Index

I assume assets from the different systems have very little in common. So little that we are not able to properly
describe them using the same value objects. Therefore, we should only create an Asset interface in the SPI, and each
ecosystem must provide its own implementation of this interface.
In conclusion, an Asset-Index implementation that wants to create assets for IDS must reference the IDS SPI (?) and create the IDS Asset itself.

This would solve multiple problems we have with our current Asset objects. For example:

For each EDC ContractOffer one IDS Resource is created. This mapping is strange because you would expect artifacts of
the same family in the same resource.
An asset is always mapped to exactly one representation and artifact.
It’s not possible to use all IDS properties to describe the asset/artifact without our Asset object becoming pretty
noisy.
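
A minimal sketch of the SPI-only idea described above, assuming a single-method `Asset` interface in the SPI and an IDS extension that owns its own value objects. All type names here (`IdsRepresentation`, `IdsResourceAsset`, `InMemoryAssetIndex`) are hypothetical and are not taken from the EDC or IDS code bases.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical SPI contract: only what the core needs to know about any asset.
interface Asset {
    String getId();
}

// Hypothetical IDS-side value object, owned by the IDS extension rather than the core.
final class IdsRepresentation {
    private final String mediaType;
    private final List<String> artifactNames;

    IdsRepresentation(String mediaType, List<String> artifactNames) {
        this.mediaType = mediaType;
        this.artifactNames = Collections.unmodifiableList(new ArrayList<>(artifactNames));
    }

    String getMediaType() { return mediaType; }

    List<String> getArtifactNames() { return artifactNames; }
}

// IDS implementation of the SPI interface. One resource may group several
// representations, each with several artifacts.
final class IdsResourceAsset implements Asset {
    private final String id;
    private final List<IdsRepresentation> representations;

    IdsResourceAsset(String id, List<IdsRepresentation> representations) {
        this.id = id;
        this.representations = Collections.unmodifiableList(new ArrayList<>(representations));
    }

    @Override
    public String getId() { return id; }

    List<IdsRepresentation> getRepresentations() { return representations; }
}

// Core-side index that depends on the interface only.
final class InMemoryAssetIndex {
    private final List<Asset> assets = new ArrayList<>();

    void register(Asset asset) { assets.add(asset); }

    // Returns the asset with the given id, or null if none is registered.
    Asset findById(String id) {
        for (Asset asset : assets) {
            if (asset.getId().equals(id)) {
                return asset;
            }
        }
        return null;
    }
}
```

In this sketch the core never sees representations or artifacts. Only code that references the IDS extension can read them, which is the trade-off the proposal accepts.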

Contract & Contract Offer (Framework)

I assume contracts from the different systems have very little in common. So little that we are not able to describe
the contracts using a common value object. Therefore, we should only create a Contract interface in the SPI, and each
system must provide its own implementation of this interface. The contract offer framework that wants to create IDS
Contract Offers must reference the IDS SPI (?) and create the object itself.

This would solve the big problem that our current policy model cannot be mapped to the IDS model.

Data Transfer

I assume the data transfer (if it becomes possible to transfer data synchronously) is common enough, so it won’t be
necessary to create a custom data transfer for each ecosystem. But maybe it’s necessary to introduce new extension
points for the ecosystems, e.g. for Access or Usage Control or IDS Apps.

We could do this by adding all the necessary extension points to the transfer process, or by refactoring the transfer code
so that each ecosystem is able to set up a custom transfer extension.
But as long as we don't know any requirements, and we already have a working transfer core, I would leave it as it is.

Conclusion

At the moment we have the extensions for data ecosystems in the data-protocol package. But I assume
each ecosystem brings a lot more code that should not be summarized as "data-protocol". For example, the IDS extensions might need a lot of code to support IDS rules and apps.

I also think it is not possible to define some kind of abstract asset or contract that can be used in all data ecosystems without various disadvantages.

Maybe each of the data ecosystems needs to be treated as a first-class EDC citizen and be moved to its own core extension? If we move them to core extensions, we could also move their corresponding SPIs to the Core SPI. This in turn makes it easier for the AssetIndex (and other extensions) to create ecosystem-specific implementations of Assets/Contracts/etc.

So, sorry for the wall of text. I hope I was able to explain my thoughts. What do you think?

jimmarino · 2021-10-08T16:39:04Z

jimmarino
Oct 8, 2021
Collaborator

Hi Dominik,

What you seem to be suggesting is to create parallel modules and runtimes for each data-sharing technology because there appears to be little in common. I think the issue will become more tractable (and generalizable) once we look at concrete use cases and protocols.

So, I would vote for worrying about this when it happens. At this point, I think these issues are too theoretical to address. GAIA-X hasn't defined anything yet. Ocean Protocol "data sharing/compute-to-data" is a thin layer over K8S and not IMO a policy-based data sharing protocol.

Today we discussed how to map IDS Assets to EDC Assets and vice-versa. Let's try that and see where things go, remembering that we do not need to support the full expressivity of IDS.

Jim

2 replies

DominikPinsel Oct 9, 2021
Collaborator Author

Yeah, I also think that it's a bit theoretical right now. What worries me is that it's currently really difficult to map between the EDC and the IDS model, especially the policy part. That's why I wanted to share my thought that a model that is easily mappable to IDS is probably a model we won't be able to reuse for other GAIA-X implementations.

For me it's totally fine if we follow the current approach. I mean, Denis and I implemented most of it :)
But I think we should keep in mind that it may become necessary to use another approach, where we use a specific data model for each ecosystem that is easier to map.
Maybe in the future, if the number of workarounds and tradeoffs grows (as I assume it will), we can continue this discussion :)

jimmarino Oct 9, 2021
Collaborator

I don't see how it is difficult to map between IDS and EDC. We have already discussed how to do this for Asset types by flattening the Asset hierarchy in IDS. Policy is also mappable since the EDC model is very close to ODRL (and IDS).

Nothing precludes Asset from being an extensible type, particularly if it is made to use Jackson's support for polymorphic types like other types in the EDC currently do. This would allow the runtime to support very expressive data models for specific catalogs easily. I believe this change would involve adding a few annotations to Asset and generifying its builder (cf. ProvisionedResource for an example).
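
As a rough illustration of the "generified builder" part only, here is a sketch of a self-bounded builder that a catalog-specific subtype can extend. The class names (`BaseAsset`, `BaseAssetBuilder`, `IdsCatalogAsset`) and their fields are hypothetical and do not mirror the actual EDC `Asset` or `ProvisionedResource`. The Jackson annotations for polymorphic types (such as `@JsonTypeInfo`) are deliberately left out so that the sketch compiles with the JDK alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical extensible base type; the real EDC Asset looks different.
class BaseAsset {
    String id;
    final Map<String, Object> properties = new HashMap<>();

    String getId() { return id; }

    Map<String, Object> getProperties() { return properties; }
}

// Generified builder: A is the asset type being built, B the concrete builder type,
// so inherited setters keep returning the subclass builder.
abstract class BaseAssetBuilder<A extends BaseAsset, B extends BaseAssetBuilder<A, B>> {
    protected final A asset;

    protected BaseAssetBuilder(A asset) { this.asset = asset; }

    protected abstract B self();

    B id(String id) {
        asset.id = id;
        return self();
    }

    B property(String key, Object value) {
        asset.properties.put(key, value);
        return self();
    }

    A build() {
        if (asset.id == null) {
            throw new IllegalStateException("id is required");
        }
        return asset;
    }
}

// Hypothetical catalog-specific subtype with a more expressive model.
final class IdsCatalogAsset extends BaseAsset {
    final List<String> representationMediaTypes = new ArrayList<>();

    static final class Builder extends BaseAssetBuilder<IdsCatalogAsset, Builder> {
        private Builder() { super(new IdsCatalogAsset()); }

        static Builder newInstance() { return new Builder(); }

        @Override
        protected Builder self() { return this; }

        Builder representation(String mediaType) {
            asset.representationMediaTypes.add(mediaType);
            return this;
        }
    }
}
```

With this shape, a call chain such as `IdsCatalogAsset.Builder.newInstance().id("asset-1").representation("text/csv").build()` compiles, because `id(...)` returns the subclass builder rather than the base one.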

GAIA-X has not defined any policy-related mechanism, so I would not worry about it.

At a higher level, let's keep in mind one of the primary architectural principles we have with the project and how to achieve it: supporting multiple data sharing protocols with a common platform. Having what would essentially be multiple code bases for each data sharing protocol doesn't map well to that vision.

mspiekermann · 2021-11-02T09:01:43Z

mspiekermann
Nov 2, 2021
Maintainer

@jimmarino @DominikPinsel Do we have some documentation that shows the mapping between the EDC and IDS Asset hierarchies as mentioned above? I think it would be great to have this as a starting point for further discussions and as an anchor for assessing models from other initiatives (once they make more detailed statements or provide specifications).

I closed #146 to merge streams, but we should keep the assessments and suggestions of @DominikPinsel in mind in our further discussion.

2 replies

jimmarino Nov 2, 2021
Collaborator

Here's what I have. Perhaps we can start a wiki page and attach this for the time being?

IDS Notes.pdf

DominikPinsel Nov 2, 2021
Collaborator Author

Sure. I haven't done a deep analysis between the two models, but I think I found enough problems to show that the two models are incompatible as they are now.

1. IDS has a pre-duty and a post-duty. Our EDC model only supports one duty field. If we add a pre- and a post-duty field to the EDC model, it is no longer ODRL compliant. If we don't add them, we cannot differentiate between pre- and post-duties in EDC.
2. IDS only supports AtomicConstraints, consisting of an Operator, a LeftExpression and a RightExpression. ODRL and EDC support other constraint types as well (e.g. OrConstraint). Therefore, the EDC model can only be mapped to IDS if only constraints of type AtomicConstraint are used. => IDS supports only a subset of the EDC capabilities.
3. IDS supports 46 different types of Operators for the AtomicConstraint. EDC only supports 7 different Operators. => EDC only supports a subset of the IDS capabilities.
4. IDS supports 17 hard-coded types of LeftExpression. EDC is extensible and allows basically everything. => IDS only supports a subset of EDC.
5. IDS supports only RdfResource for the RightExpression. EDC is extensible and allows basically everything. => IDS only supports a subset of EDC.

Dominik Pinsel, Daimler TSS GmbH

jimmarino · 2021-11-02T14:34:40Z

jimmarino
Nov 2, 2021
Collaborator

We need to be careful with our wording. The two models are not incompatible; they can be mapped.

1. This is a bug in IDS because they claim to support ODRL. IDS needs to fix this. We can also discuss ways to work around once we have a concrete use case (i.e. in a separate thread).
2. Again, this is the same problem with IDS. Also, see my comments on expressiveness between models below.
3. Again, this is not an incompatibility. Rather, EDC coding is not complete. Additional operators can be added.
4. Ditto
5. Again, see my comments on expressiveness below

When mapping between different policy systems, it may be the case that one policy model is more expressive than another. That doesn't mean they are "incompatible". When model A is more expressive than B, and the transformation is A->B, we need to limit the type of expressions in A so that they can be expressed in B. Practically in the EDC, when running IDS, this can be enforced by having policy inputted using an "IDS policy editor" or performing policy validation.
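
The policy validation mentioned above could look roughly like the following sketch. It assumes a heavily simplified constraint model and takes from the thread only the fact that IDS accepts nothing but atomic constraints. `Constraint`, `AtomicConstraint`, `OrConstraint` and `IdsPolicyValidator` are hypothetical stand-ins, not the actual EDC policy classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified constraint model; not the actual EDC policy classes.
interface Constraint { }

final class AtomicConstraint implements Constraint {
    final String leftExpression;
    final String operator;
    final String rightExpression;

    AtomicConstraint(String leftExpression, String operator, String rightExpression) {
        this.leftExpression = leftExpression;
        this.operator = operator;
        this.rightExpression = rightExpression;
    }
}

final class OrConstraint implements Constraint {
    final List<Constraint> constraints;

    OrConstraint(Constraint... constraints) {
        this.constraints = Arrays.asList(constraints);
    }
}

final class IdsPolicyValidator {
    private IdsPolicyValidator() { }

    // Returns one message per constraint that the less expressive target model cannot represent.
    static List<String> validate(List<Constraint> constraints) {
        List<String> problems = new ArrayList<>();
        for (Constraint constraint : constraints) {
            if (!(constraint instanceof AtomicConstraint)) {
                problems.add("Not expressible in IDS: " + constraint.getClass().getSimpleName());
            }
        }
        return problems;
    }
}
```

An empty result means the policy stays within the subset that can be transformed. A non-empty result can be shown to the policy author before any mapping is attempted.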

When B is more expressive than A, we can choose to (1) map the parts of B to coarser-grained concepts in A; or (2) not support that type of expression. Given how wide open ODRL is, I suspect option 2 may happen, which is not a problem. Users can author policy using expressions EDC supports or submit a PR.
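
Both options can be sketched with a lookup table for operators. The operator names below are placeholders chosen for illustration and are not copied from the EDC or IDS models. Which fine-grained operators fold into which coarse ones is likewise an assumption.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

// Placeholder operator sets; the names are illustrative, not copied from EDC or IDS.
enum EdcOperator { EQ, NEQ, GT, GEQ, LT, LEQ, IN }

enum IdsOperator { EQUALS, TEMPORAL_EQUALS, AFTER, BEFORE, OVERLAPS }

final class OperatorMapper {
    private static final Map<IdsOperator, EdcOperator> MAPPING = new EnumMap<>(IdsOperator.class);

    static {
        MAPPING.put(IdsOperator.EQUALS, EdcOperator.EQ);
        // Option 1: fold finer-grained operators into a coarser one.
        MAPPING.put(IdsOperator.TEMPORAL_EQUALS, EdcOperator.EQ);
        MAPPING.put(IdsOperator.AFTER, EdcOperator.GT);
        MAPPING.put(IdsOperator.BEFORE, EdcOperator.LT);
        // Option 2: OVERLAPS has no entry, so it is reported as unsupported.
    }

    private OperatorMapper() { }

    // Returns the coarser operator, or an empty Optional when the expression is not supported.
    static Optional<EdcOperator> toEdc(IdsOperator operator) {
        return Optional.ofNullable(MAPPING.get(operator));
    }
}
```

Returning an empty `Optional` instead of throwing lets the caller decide whether an unsupported operator rejects the whole policy or only skips one rule.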

Note that the above situations frequently happen when mapping type systems (e.g. JSON, Java, XML, etc.). There are well-established patterns we can lean on to deal with these cases.

I'm happy to take this forward and work with IDSA to resolve ambiguities with respect to policy since there are a number of them. The first issue that needs to be addressed is the simultaneous support in IDS for ODRL and its policy grammar additions. That is the root of many of the ambiguities we are running into.
