TTL files as input to rdf2hdt produces invalid blank node IDs #210

Open
GregHanson opened this issue May 6, 2024 · 1 comment

Comments

@GregHanson

Using an input ttl file from the W3C SPARQL 1.0 Test Suite (i18n), I ran it through rdf2hdt and dumped the contents using hdtSearch:

./bin/rdf2hdt.sh sample.ttl sample.hdt
[INFO] Scanning for projects...
[INFO] Inspecting build with total of 1 modules...
[INFO] Installing Nexus Staging features:
[INFO]   ... total of 1 executions of maven-deploy-plugin replaced with nexus-staging-maven-plugin
[INFO]
[INFO] ----------------------< org.rdfhdt:hdt-java-cli >-----------------------
[INFO] Building HDT Java Command line Tools 3.0.10
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ hdt-java-cli ---
[WARN] base uri not specified, using 'file:///path/to/sample.ttl'
[INFO] Converting path/to/sample.ttl to path/to/sample.hdt as TURTLE
File converted in ..... 524 ms 808 us
Total Triples ......... 9
Different subjects .... 4
Different predicates .. 5
Different objects ..... 9
Common Subject/Object . 0
HDT saved to file in .. 7 ms 942 us
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  1.314 s
[INFO] Finished at: 2024-05-06T16:36:52-04:00
[INFO] ------------------------------------------------------------------------

./bin/hdtSearch.sh sample.hdt
[INFO] Scanning for projects...
[INFO] Inspecting build with total of 1 modules...
[INFO] Installing Nexus Staging features:
[INFO]   ... total of 1 executions of maven-deploy-plugin replaced with nexus-staging-maven-plugin
[INFO]
[INFO] ----------------------< org.rdfhdt:hdt-java-cli >-----------------------
[INFO] Building HDT Java Command line Tools 3.0.10
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ hdt-java-cli ---
>> ? ? ?
Query: |?| |?| |?|
_:@0 http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization.ttl#resumé "Alice's normalized resumé"
_:@0 http://xmlns.com/foaf/0.1/name "Alice"
_:@1 http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization.ttl#resumé "Bob's non-normalized resumé"
_:@1 http://xmlns.com/foaf/0.1/name "Bob"
_:@2 http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization.ttl#resumé "Eve's non-normalized resumé"
_:@2 http://www.w3.org/2001/sw/DataAccess/tests/data/i18n/normalization.ttl#resumé "Eve's normalized resumé"
_:@2 http://xmlns.com/foaf/0.1/name "Eve"
file:///path/to/sample.ttl http://www.w3.org/2000/01/rdf-schema#comment "Normalized and non-normalized IRIs"
file:///path/to/sample.ttl http://www.w3.org/2002/07/owl#versionInfo "$Id: normalization-01.ttl,v 1.1 2005/10/25 09:38:08 aseaborne Exp $"
Iterated 9 triples in 22 ms 504 us

While I cannot find @ called out in the ttl or nt spec, when I use @ in blank node labels in the examples from the docs above, the riot CLI reports a validation error for any blank node label that begins with @:

cat <<EOF > blanknode.ttl
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:@123 foaf:knows _:@1234 .
_:@1234 foaf:knows _:@123 .
EOF

cat <<EOF > blanknode.nt
_:@123 <http://xmlns.com/foaf/0.1/knows> _:bob .
_:bob <http://xmlns.com/foaf/0.1/knows> _:@123.
EOF

riot --validate --time blanknode.ttl
17:02:00 ERROR riot            :: [line: 3, col: 3 ] Blank node label does not start with alphabetic or _ : '@'
blanknode.ttl :  (No Output) : 1 errors : 0 warnings
riot --validate --time blanknode.nt
17:02:05 ERROR riot            :: [line: 1, col: 3 ] Blank node label does not start with alphabetic or _ : '@'
blanknode.nt :  (No Output) : 1 errors : 0 warnings
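
Until the generated labels change, a consumer that needs to hand these triples to a strict parser could rewrite the labels itself. This is only a minimal sketch of that idea, assuming the labels always look like the `_:@0` form in the dump above; the function name `relabel` and the `hdt` prefix are made up for illustration and are not part of rdf2hdt or riot.

```python
def relabel(term: str, prefix: str = "hdt") -> str:
    """Rewrite a blank node term such as '_:@0' to '_:hdt0'.

    Works on a single subject or object term, not on a whole line,
    so an '@' inside a literal or an IRI is never touched.
    The prefix is a hypothetical choice, not something HDT defines.
    """
    if term.startswith("_:@"):
        # Drop the '@' and put a letter prefix in front of the rest.
        return "_:" + prefix + term[3:]
    return term


if __name__ == "__main__":
    for t in ["_:@0", "_:@1234", "_:bob", '"Alice"']:
        print(t, "->", relabel(t))
```

The prefix has to be one that does not already occur in the data's own blank node labels, otherwise two distinct nodes could end up with the same label.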

GregHanson changed the title from "TTL files as input to rdf2hdt produces invalid blank nodes" to "TTL files as input to rdf2hdt produces invalid blank node IDs" on May 6, 2024

@GregHanson
Author

Actually, the spec does list the valid characters:

RDF blank nodes in Turtle are expressed as _: followed by a blank node label which is a series of name characters. The characters in the label are built upon PN_CHARS_BASE, liberalized as follows:

Where PN_CHARS_BASE is the following list:

[A-Z] 
[a-z] 
[#x00C0-#x00D6] 
[#x00D8-#x00F6] 
[#x00F8-#x02FF] 
[#x0370-#x037D] 
[#x037F-#x1FFF] 
[#x200C-#x200D] 
[#x2070-#x218F] 
[#x2C00-#x2FEF]
[#x3001-#xD7FF] 
[#xF900-#xFDCF] 
[#xFDF0-#xFFFD] 
[#x10000-#xEFFFF]

This list does not include #x0040, the code point for @.
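
As a rough cross-check, the character list can be turned into a regular expression. The sketch below is my reading of the BLANK_NODE_LABEL production in the Turtle 1.1 grammar: the first character is PN_CHARS_BASE, `_` or a digit; later characters may also be `-`, `.`, #x00B7, #x0300-#x036F or #x203F-#x2040; and the label may not end with `.`. The function name `is_valid_blank_node` is hypothetical, and this is not how riot or rdf2hdt implement the check.

```python
import re

# PN_CHARS_BASE ranges as listed in the Turtle grammar.
PN_CHARS_BASE = (
    "A-Za-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# PN_CHARS_U adds '_'; PN_CHARS adds digits, '-', and a few combining ranges.
PN_CHARS_U = PN_CHARS_BASE + "_"
PN_CHARS = PN_CHARS_U + "0-9\u00B7\u0300-\u036F\u203F-\u2040\\-"

# '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?
BLANK_NODE_LABEL = re.compile(
    "_:[" + PN_CHARS_U + "0-9](?:[" + PN_CHARS + ".]*[" + PN_CHARS + "])?"
)


def is_valid_blank_node(term: str) -> bool:
    """Return True if the whole term matches the blank node label grammar."""
    return BLANK_NODE_LABEL.fullmatch(term) is not None


if __name__ == "__main__":
    for t in ["_:@0", "_:@123", "_:bob", "_:b0", "_:a.b", "_:a."]:
        print(t, is_valid_blank_node(t))
```

Under this reading, `_:@0` and `_:@123` are rejected because @ is not an allowed first character, while `_:bob` is accepted.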

GregHanson added a commit to DeciSym/oxigraph that referenced this issue May 20, 2024