Dataset ordering #114

Open
aozalevsky opened this issue May 2, 2023 · 2 comments


aozalevsky commented May 2, 2023

A branched topology of datasets (see below) breaks the logical ordering of datasets in the mmCIF file.

With a topology like this

A00 (primary)
|
A10

     B00 (primary)
    /    \ 
B10       B11
(parent) 
|
B20

the Dataset table looks like this:

B00
B11
A00
A10
B10
B20

But if I delete the parallel node B11, everything is ordered in a more reasonable manner:

A00
A10
B00
B10
B20

I'm adding B11 to the protocol as po.system.orphan_datasets.append(B11). Am I missing something, or is this a bug?

Contributor

benmwebb commented May 2, 2023

I'm not sure what you mean by "logic dataset order" but the IHM dictionary doesn't mandate ordering for any table IIRC. python-ihm generally will output objects in a consistent but unsorted order. In the case of datasets they will be output in the same order they're encountered in the Python object hierarchy. This is not a bug unless the output dataset IDs are actually wrong.

BTW, generally it should not be necessary to place objects in the various orphan_ lists. python-ihm stores objects in a hierarchy, so generally something like a Dataset should be referenced by another object in that hierarchy (such as a Restraint or another Dataset). But on reading an mmCIF file it's possible that an object such as a dataset is listed in a table but nothing refers to it. The orphan_ lists are provided to keep references to such objects.
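
To illustrate "output in the order encountered", here is a toy sketch in plain Python. It does not use python-ihm and is not its actual implementation. It rests on three assumptions that are not confirmed in this thread: orphan datasets are visited before datasets reached through other objects, the other objects reference only the leaf datasets A10 and B20, and each dataset's parents are emitted before the dataset itself. The names PARENTS and encounter_order are hypothetical.

```python
# Toy model only: plain strings stand in for python-ihm Dataset objects.
# Maps each dataset to its parents, following the topology in the issue.
PARENTS = {
    "A00": [], "A10": ["A00"],
    "B00": [], "B10": ["B00"], "B11": ["B00"], "B20": ["B10"],
}


def encounter_order(roots):
    """Return datasets in first-encounter order, parents before children."""
    seen = set()
    order = []

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for parent in PARENTS[name]:
            visit(parent)
        order.append(name)

    for root in roots:
        visit(root)
    return order


# Assumption: the orphan B11 is visited first, then A10 and B20
# (hypothetically reached via other objects such as restraints).
with_b11 = encounter_order(["B11", "A10", "B20"])
without_b11 = encounter_order(["A10", "B20"])
print(with_b11)
print(without_b11)
```

Under those assumptions the sketch gives B00, B11, A00, A10, B10, B20 with the orphan and A00, A10, B00, B10, B20 without it, which matches both tables reported above. That is consistent with the IDs being assigned in encounter order rather than sorted, but it is only one possible explanation.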

@aozalevsky
Author

My first impression was that it should, just as you said, traverse the hierarchy (something like a depth-first search, starting from primary datasets). But it looks like the object hierarchy is more complicated. For instance, in the example above, the objects were created in the following order: B20, B00, B10, B11. And yet B00 and B11 end up at the top of the list, while B10 and B20 are at the bottom.
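
For reference, the depth-first ordering described here could be sketched as follows in plain Python. This is only an illustration of the expected ordering, not something python-ihm provides. The names CHILDREN and depth_first_order are hypothetical, and the order of the primary datasets (A00, then B00) is an assumption.

```python
# Toy model only: plain strings stand in for datasets.
# Maps each dataset to its children, following the topology in the issue.
CHILDREN = {
    "A00": ["A10"], "A10": [],
    "B00": ["B10", "B11"], "B10": ["B20"], "B11": [], "B20": [],
}


def depth_first_order(primaries):
    """Pre-order depth-first walk starting from the primary datasets."""
    seen = set()
    order = []

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        order.append(name)
        for child in CHILDREN[name]:
            visit(child)

    for primary in primaries:
        visit(primary)
    return order


dfs_order = depth_first_order(["A00", "B00"])
print(dfs_order)
```

With this walk each branch stays together: A00, A10, B00, B10, B20, B11.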
