canGoTo of BitmapTriplesIterator #212

Open
Chat-Wane opened this issue Sep 23, 2024 · 3 comments

Comments

@Chat-Wane

I noticed a discrepancy between hdt-java and hdt-cpp in BitmapTriplesIterator's canGoTo function: the Java version forbids goTo for every pattern except ?s ?p ?o.

The rest of the goTo code seems similar, so should the Java version of canGoTo simply return true as well?

@ate47
Contributor

ate47 commented Sep 23, 2024

Looking at the cpp code, I'd say they are wrong: first, the boundary check is inside the goTo method, and second, it only checks the max.

A simple test would be to use an S ?? pattern with S > 1 and then try to goTo 1 to see if it creates issues.
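To make the concern concrete, here is a toy model that does not depend on hdt-java. It assumes a hypothetical `RangeIterator` over a sorted array of subject IDs, restricted to the positions of one subject, where `goToMaxOnly` takes an absolute position and only checks the upper bound. The class, its methods and the absolute-position semantics are assumptions made for illustration, not the actual BitmapTriplesIterator code in either library.

```java
import java.util.NoSuchElementException;

public class GoToRangeSketch {
    /** Hypothetical iterator over positions [min, max) of a sorted subject column. */
    static final class RangeIterator {
        private final long[] subjects; // subject ID stored at each triple position
        private final int min;         // first position matching the pattern
        private final int max;         // one past the last matching position
        private int pos;

        RangeIterator(long[] subjects, long subjectId) {
            this.subjects = subjects;
            int lo = 0;
            while (lo < subjects.length && subjects[lo] < subjectId) lo++;
            int hi = lo;
            while (hi < subjects.length && subjects[hi] == subjectId) hi++;
            this.min = lo;
            this.max = hi;
            this.pos = lo;
        }

        /** Max-only check (assumed): accepts any position below max, even one below min. */
        void goToMaxOnly(int target) {
            if (target >= max) throw new NoSuchElementException("past the end");
            pos = target;
        }

        /** Range-aware variant (assumed): treats the argument as an offset from min. */
        void goToRelative(int offset) {
            if (offset < 0 || min + offset >= max) throw new NoSuchElementException("out of range");
            pos = min + offset;
        }

        boolean hasNext() { return pos < max; }

        long next() {
            if (!hasNext()) throw new NoSuchElementException();
            return subjects[pos++];
        }
    }

    public static void main(String[] args) {
        long[] subjects = {1, 1, 1, 2, 2, 2, 3}; // toy data, sorted by subject

        RangeIterator a = new RangeIterator(subjects, 2); // pattern "2 ? ?"
        a.goToMaxOnly(1);
        System.out.println("max-only check yields subject " + a.next()); // subject 1

        RangeIterator b = new RangeIterator(subjects, 2);
        b.goToRelative(1);
        System.out.println("relative offset yields subject " + b.next()); // subject 2
    }
}
```

In this toy model, the max-only check lets goTo land on a triple of subject 1 while iterating over subject 2. Whether the real iterators behave this way depends on whether their goTo positions are absolute or relative to the pattern's range, which is what the proposed test would reveal.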

@Chat-Wane
Author

Chat-Wane commented Sep 23, 2024

Thanks for the reply!

You are right: in hdt-cpp, canGoTo is not used in the corresponding goTo, but it is used in the search function. It looks to me like a test of whether this iterator implementation is suited for skips, in which case only checking for the max makes sense.

I used the hdt-cpp implementation and never had issues with the skip. That's why I trust it, although I cannot be sure it used BitmapTriplesIterator under the hood.

On the other hand, in hdt-java, I cannot perform any skip, whatever the triple pattern (except for the fully unbound ?s ?p ?o). I tested with an HDT both in memory and as a file, both indexed and not.

@Chat-Wane
Author

I apologize for the inaccurate statement: ?s ?p ?o, ?s P O and ?s ?p O succeed.

On WatDiv10M, with the following arbitrary values:

Long sId = hdt.getDictionary().stringToId("http://db.uwaterloo.ca/~galuc/wsdbm/City1", TripleComponentRole.SUBJECT);
Long pId = hdt.getDictionary().stringToId("http://www.geonames.org/ontology#parentCountry", TripleComponentRole.PREDICATE);
Long oId = hdt.getDictionary().stringToId("http://db.uwaterloo.ca/~galuc/wsdbm/Country23", TripleComponentRole.OBJECT);

Only a few actually manage to skip.

assertTrue(indeedSkipped(hdt, sId, 0L, 0L, 1)); // BitmapTriplesIterator fails
assertTrue(indeedSkipped(hdt, sId, pId, 0L, 1)); // BitmapTriplesIterator fails
assertTrue(indeedSkipped(hdt, sId, 0L, oId, 1)); // SequentialSearchIteratorTripleID unsupported operation
assertTrue(indeedSkipped(hdt, 0L, pId, 0L, 1)); // BitmapTriplesIteratorYFOQ fails
assertTrue(indeedSkipped(hdt, 0L, pId, oId, 1)); // BitmapTriplesIteratorZFOQ succeeds
assertTrue(indeedSkipped(hdt, 0L, 0L, oId, 1)); // BitmapTriplesIteratorZFOQ succeeds
assertTrue(indeedSkipped(hdt, sId, pId, oId, 1)); // BitmapTriplesIterator fails

If I remember correctly, there should be more...
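
The `indeedSkipped` helper is not shown above. For readers who want to reproduce the check, here is a minimal sketch of what such a helper could look like. It is written against a hypothetical `SkippableIterator` interface that only mirrors the `canGoTo`/`goTo` methods discussed in this thread, so it runs without hdt-java. The interface, the `Supplier`-based signature and the assumption that `goTo(n)` is equivalent to calling `next()` n times are all illustrative choices, not the code used for the results above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.function.Supplier;

public class SkipCheckSketch {
    /** Hypothetical stand-in for an iterator exposing canGoTo/goTo. */
    interface SkippableIterator<T> extends Iterator<T> {
        boolean canGoTo();
        void goTo(long pos);
    }

    /**
     * Returns true if goTo(skip) lands on the same element that is reached
     * by calling next() skip times on a fresh iterator for the same pattern.
     */
    static <T> boolean indeedSkipped(Supplier<SkippableIterator<T>> search, long skip) {
        // Reference run: advance one element at a time.
        SkippableIterator<T> reference = search.get();
        for (long i = 0; i < skip; i++) {
            if (!reference.hasNext()) return false; // fewer than skip results
            reference.next();
        }
        if (!reference.hasNext()) return false;
        T expected = reference.next();

        // Skipping run: jump directly, if the iterator allows it.
        SkippableIterator<T> skipping = search.get();
        if (!skipping.canGoTo()) return false; // iterator refuses to skip
        skipping.goTo(skip);
        return skipping.hasNext() && Objects.equals(expected, skipping.next());
    }

    /** Toy list-backed iterator, used only to exercise the helper. */
    static <T> SkippableIterator<T> over(List<T> items, boolean allowGoTo) {
        return new SkippableIterator<T>() {
            private int pos = 0;
            @Override public boolean hasNext() { return pos < items.size(); }
            @Override public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                return items.get(pos++);
            }
            @Override public boolean canGoTo() { return allowGoTo; }
            @Override public void goTo(long target) { pos = (int) target; }
        };
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("t0", "t1", "t2");
        Supplier<SkippableIterator<String>> skippable = () -> over(results, true);
        Supplier<SkippableIterator<String>> refusing = () -> over(results, false);
        System.out.println(indeedSkipped(skippable, 1)); // true
        System.out.println(indeedSkipped(refusing, 1));  // false
    }
}
```

With hdt-java, the supplier would wrap a search for the pattern under test. An iterator that throws from `goTo` instead of returning false from `canGoTo` would surface as an exception here rather than as a false result.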
