Discovery of storage #310
There are basically two different mechanisms that I have in mind. One is that we have a resource The other is that we define that the response payload to Personally, Orthogonal to that is the content of that response. I see two options, one is for example:

<> pim:storage </pods/users/foo/> ,
   </pods/users/bar/> .

However, there's no value to this being self-describing. Also, it appears to stretch the definition of Therefore, I rather suggest we do for example:

</pods/users/foo/> a pim:Storage .
</pods/users/bar/> a pim:Storage .

That tells you exactly what you need to know in concise terms, and without stretching the definition of anything, it describes the storages exactly the way they describe themselves. |
I don't quite understand this. Any given resource is part of only one storage and there are no overlapping storages.
No. /pods/users/foo/ and /bar/baz are under different storages.
That's old. There was an update and the current definition is https://github.com/solid/vocab/blob/bd9ce1c7806254bf1307c73f2b5e0dd3eeaab101/space.n3#L91-L92 . I'll follow up with Tim to push latest to w3.org.
That works only for a 200 response. Clients may need to navigate up the hierarchy, e.g. to find the storage owner, irrespective of their permission on each resource on the way. This was the consensus.
I don't see a problem :) but I'll respond to the following:
The Protocol describes one way that is guaranteed for clients to discover the storage. Clients are not required to discover the storage.
What's an "anonymous pod" - defined anywhere? Close issue? |
What is that requires Does that mean that I am not allowed to have a something like a |
How was
Of course |
A legacy CMS system or something. Whatever, but not the Solid protocol.
OK, I didn't quite parse that, but And BTW, |
FYI proposal for notification also defines |
If |
Yes, and therefore, if |
What breaks exactly? There is nothing prior to Storage. The path may be in the URI but there is nothing to see in |
Yes, but which? If you do not know anything about the structure of the pod, how do you get started? |
I'll bite. How did a client come across the pod? How do you find out my inbox? What's the input? A client selects or arrives at a target somehow. Do you think it would help to introduce language along the lines of https://www.w3.org/TR/ldn/#discovery :
|
No, I don't think that would help. Say that Solid lives in a Web ecosystem, where servers manage pods along with legacy systems, using the same origin. You may then have arrived at the server by very conventional means, like a link to a non-Solid resource, news articles. Whatever stuff you find on the Web today. And then you, as the client, want to discover if there are any pods there. The content admin may not have made that data available on the resource that you already have, which is not very accommodating, but you want to know. We can't control all the content out there, but in order to lead a healthy existence in an ecosystem, with a ramp-up part, I think it is vitally important to have some hooks that you can be sure exist. |
Hmm, I think that's a broader question, or orthogonal to discovering Storage, and there may be a simple answer to it by checking the type of resource. For example, LDP mentions rel=type ldp:Resource ( https://www.w3.org/TR/ldp/#ldpr-gen-linktypehdr ) to indicate LDP support, and for Solid, there could be something like solid:Resource ( see also #194 (comment) ) |
So, testing out the algorithm:

GET /pods/users/
200 OK
Content-Type: text/html
<h1>These are our users</h1>
<ul><li>foo</li><li>bar</li></ul>
GET /pods/
200 OK
Content-Type: text/html
<h1>Yeah, we really love Solid!</h1>
GET /
200 OK
Content-Type: text/html
<h1>Welcome to our site!</h1>

Result: Storage not discovered.

GET /pods/users/foo/assets/images/bar.jpg
200 OK
Content-Type: image/jpeg
sdoglghsdgfjh
GET /pods/users/foo/assets/images/
200 OK
Content-Type: text/turtle
<> ldp:contains <bar.jpg>
GET /pod/users/foo/assets/
Accept: text/html
200 OK
Content-Type: text/html
<h1>Seriously</h1>
<p>I've done six requests now, and I still haven't discovered a storage.
How long do you actually expect me to keep doing this?
This is no way to make a discovery protocol.
Can't I just have a list of storages?</p>
Result: The client received no guidance, it wasn't required to, and so it gave up. Very reasonably, I might add. |
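For reference, the walk-up procedure exercised in the trace above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (`walk_up`, `discover_storage`) and the in-memory `SERVER` map are hypothetical, no HTTP is performed, and the layout assumes that only /pods/users/foo/ advertises rel="type" pim:Storage. Each dictionary lookup stands in for one HEAD or GET request.

```python
PIM_STORAGE = "http://www.w3.org/ns/pim/space#Storage"


def walk_up(path):
    """Yield the request target itself, then each ancestor container up to '/'."""
    yield path
    while path != "/":
        # Drop the last path segment; containers keep their trailing slash.
        path = path.rstrip("/").rsplit("/", 1)[0] + "/"
        yield path


def discover_storage(path, types_by_path):
    """Return (storage_path, requests_made); storage_path is None if not found."""
    requests_made = 0
    for candidate in walk_up(path):
        requests_made += 1  # stands in for one HEAD request
        # Hypothetical lookup of the rel="type" Link targets of the response.
        if PIM_STORAGE in types_by_path.get(candidate, set()):
            return candidate, requests_made
    return None, requests_made


# Hypothetical server layout: a single storage rooted at /pods/users/foo/.
SERVER = {"/pods/users/foo/": {PIM_STORAGE}}

print(discover_storage("/pods/users/", SERVER))
# (None, 3)
print(discover_storage("/pods/users/foo/assets/images/bar.jpg", SERVER))
# ('/pods/users/foo/', 4)
```

Under these assumptions, starting above the storage never finds it, and starting below it costs one request per path segment, which is the scaling concern raised in this thread.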
There are too many heuristics in there. We need a simple mechanism just to get started. |
First of all, what kind of an evil application are you that uses GET instead of HEAD just to discover Storage in the HTTP header? The spec allows both for obvious reasons... And again, the response could've been 403. The client needs to discover the root container / Storage somehow. There has to be an input. The Protocol already works with RFC 3986 on hierarchical paths and containment. So what's in place is a quick way to break the segments and check. The Protocol also states:
When you say:
If it is a Solid server, it should've had Link rel=type Storage in header. If it is not discovered by that point, the application would know it is not working with a Solid server.
We already have something but you want something different. What I'm saying is that the use case for discovering the Solid server's Storage is different from the use case for determining whether a server supports the Solid Protocol. If you don't like the discovery algorithm looking for rel=type Storage or a resource with rel=type solid:Resource (for a Solid server), we could introduce: "Clients can discover the storage which contains the resource of an HTTP HEAD or GET request target by checking for the Link header with rel="http://www.w3.org/ns/pim/space#storage". The target of the relation is the storage (pim:Storage)." And the corresponding requirement for the server, obviously. |
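To make the two header checks in this comment concrete, here is a minimal sketch of how a client might inspect a Link header value: once for rel="type" with pim:Storage (the existing check) and once for the proposed rel="http://www.w3.org/ns/pim/space#storage". The helper names (`link_targets`, `is_storage`, `containing_storage`) and the sample header values are hypothetical, and the regular expressions are a simplification that does not cover every corner of the Link header grammar (for example, a quoted parameter value that itself contains "; rel=").

```python
import re

PIM_STORAGE_TYPE = "http://www.w3.org/ns/pim/space#Storage"
PIM_STORAGE_REL = "http://www.w3.org/ns/pim/space#storage"

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(
    r'<([^>]*)>((?:\s*;\s*[^;,="\s]+(?:\s*=\s*(?:"[^"]*"|[^;,\s]*))?)*)'
)
# The rel parameter, quoted or unquoted.
_REL = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^;,\s]*))', re.IGNORECASE)


def link_targets(header, rel):
    """Return the targets of all links in a Link header value that carry `rel`."""
    targets = []
    for target, params in _LINK.findall(header):
        match = _REL.search(params)
        # rel may hold several space-separated relation types.
        rels = (match.group(1) or match.group(2) or "").split() if match else []
        if rel in rels:
            targets.append(target)
    return targets


def is_storage(header):
    """True if the response advertises rel="type" pim:Storage."""
    return PIM_STORAGE_TYPE in link_targets(header, "type")


def containing_storage(header):
    """Target of the proposed pim:storage relation, or None if absent."""
    targets = link_targets(header, PIM_STORAGE_REL)
    return targets[0] if targets else None


# Hypothetical header value for a resource inside a storage.
resource_header = (
    '<http://www.w3.org/ns/ldp#Resource>; rel="type", '
    '</pods/users/foo/>; rel="http://www.w3.org/ns/pim/space#storage"'
)
print(is_storage(resource_header))          # False
print(containing_storage(resource_header))  # /pods/users/foo/
```

With the proposed relation, a single response on any resource would point at its storage, so no walk up the hierarchy is needed.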
Completely different use case. The 1) storage of a resource, is different than 2) list of storages in a pod, is different than 3) whether a server supports the Solid protocol. |
I do not understand the use case motivating the feature that lists all storage locations on a Solid server. Surely there are cases where storage location would not be discoverable by public agents. The use case that I do understand is the following: As a user, what are the locations of my storage roots on a Solid server? For example, I may have a WebID at It would be useful to have an endpoint on the storage server that, given an access token asserting a user's identity, lists the data pods that this user owns. Clearly, this endpoint would require authentication. I also see this only being relevant in cases where there are multiple storage roots on a Solid server. |
Primarily, it is not so much a use case, it is a quality attribute, that conformance should be easy to verify. The current specification makes it difficult. |
Nod.
Nod.
I do not understand the use case motivating the feature that lists all storage locations of an agent on a Solid server. The use case that I do understand is the following: As a user, what are the locations of my storages on the Web. Discovery of storages starts from the agent because the agent can have storages on different origins. One way that apps can start discovery of an agent's storages is from the WebID Profile document (accessible by any agent with Read). Another would be from a different (access controlled) resource listing the storages such as the Edit: In the Protocol, the subject would be the agent as per the use case above:
|
I have started to think about this differently: It is also about the authority of the URI space of the storage. There may/will be other APIs on a Solid server, and they will occupy parts of the URI space. It is further very likely that parts of the storage's space will be occupied by URIs naming things that aren't controlled by the storage. In fact, if the storage is at The question is then, who has the authority to name things in the URI space of a storage? If we leave that to some "server admin" entity, which can have a server-wide discovery mechanism, then that "server admin" entity can override the wishes and priorities of the storage's owner or user. I think we should be very careful to use server-wide discovery mechanisms in other protocols that define APIs. Instead, the server should make storage discovery really easy, and thereby leave the control to the storage owner or user of their own URI space. The root container is a better place to manage further discovery. |
I think that with the resolution of solid-contrib/conformance-test-harness#119 it now makes sense to remove this from the milestone and return to it for 1.0. While I still think the current behavior is not sufficiently testable (as also confirmed by the above PR), we don't need to prioritize it right now. Any opposition to remove it from the milestone? |
Having had time to think about this further, I retract my idea to have a listing of a server's storages; that is not sufficiently aligned with privacy expectations in the case where a server hosts many (my assumption has been that that's a rare case, you'd want a domain). I still think the algorithm needs work: it doesn't scale to have to move up the tree, possibly in a large tree. Whether that should be addressed in this issue or more generally in #355 is an open question. |
So, I have a slightly different use case than one that has been discussed already. I do not care about listing all the storages of a server (and agree with others that this is probably a bad idea), but as an application developer I need to know where to store data given a WebId. Therefore, I would like to propose extending section 4.1 on storage to include something along the lines of the following:
It would also require an addition to the WebId section along the lines of:
As stated earlier, exposing the storage directly in the WebID document itself won't work, because it is a public document and not all storages will be public. However, there MUST be a way for application developers, given a WebID, to get a list of storages owned / controlled by that WebID. The natural place to put it then, would be in a private This aligns with the expectation set by the WebId profile group that a well-formed document MUST include a That document, however, suggests that if a preferencesFile does not exist, it should be created by application developers. As an application developer, this puts me in an impossible catch-22. If I have a WebId that has no This is something that MUST be exposed by the server in some way, and this seems like the most reasonable way to do it. |
While writing tests for Section 4.1, I found no reliable way to discover a storage, and therefore no reliable way to test this section's requirements.

Now, since we generally have a storage on the server's /, for the most part, you could just start with that, and the tests would pass, but that's not the requirement. A server may support several storages, but parts of the URI space don't need to be a storage.

In the current protocol, you can discover the storage by traversing towards its root container. So, in principle, you could GET /foo/bar to discover that the root container is /foo/. However, this can't be used for discovery, because it would require the entire URI space to be covered by storages, and that may not be the case. If I have understood it correctly, you could for example have that /pods/users/foo/ is the root container for the user foo, and similarly for other users. Then, just GET /bar/baz will not yield a pointer to a storage.

To remedy this problem, I suggest that we have a way that clients can discover the pods as a SHOULD level requirement (so that a server may support anonymous pods, but otherwise should make them discoverable).