Term memoization #272
7a6c734 to 9fbd794
So, I've created a Turtle parsing benchmark, and the results show about a 40% decrease in memory usage (~55% after garbage collection) and an 8% reduction in processing time. Looking at the difference before and after the gc call, there seems to be further room for reducing memory usage.

-    if (datatype) {
-      this.datatype = NamedNode.fromValue(datatype)
-    }
+    this.datatype = datatype ? NamedNode.fromValue(datatype) : XSD.string
   }
   copy () {
This method (`copy`) seems rather nonsensical now: since indexing is done by value, either reindexing logic has to be created, or the user should pass the intended modifications to this function.

    nn.sI = ++this.termIndex
    nn.term = ln
    this.termMap[nn.sI] = this.nsMap[nn.value] = nn
Checking whether `this.nsMap[nn.value]` already contains a value can aid in debugging, but introduces a runtime overhead. If rdflib has a 'dev build', it'd certainly be a good addition.
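To make the trade-off concrete, here is a minimal sketch of such an opt-in check. The `registerNamedNode` helper, the `devChecks` flag, and the `index` object are hypothetical names for illustration; only `sI`, `termIndex`, `termMap`, and `nsMap` come from the diff above, and how rdflib would actually toggle a dev build is left open.

```javascript
// Hypothetical helper: registers a node in the index, optionally guarding
// against a second registration for the same value (dev-only check).
function registerNamedNode (index, nn, devChecks = false) {
  // The lookup below is the runtime overhead mentioned in the comment,
  // so it only runs when the caller opts in.
  if (devChecks && index.nsMap[nn.value] !== undefined) {
    throw new Error('Duplicate term registered for value: ' + nn.value)
  }
  nn.sI = ++index.termIndex
  index.termMap[nn.sI] = index.nsMap[nn.value] = nn
  return nn
}

// Assumed shape of the index; Object.create(null) avoids inherited keys.
const index = { termIndex: 0, termMap: {}, nsMap: Object.create(null) }
registerNamedNode(index, { value: 'http://example.com/a' }, true)

let duplicateError = null
try {
  registerNamedNode(index, { value: 'http://example.com/a' }, true)
} catch (e) {
  duplicateError = e
}
console.log(duplicateError !== null) // true
console.log(index.termIndex) // 1
```

With `devChecks` left at its default of `false`, the function behaves like the three lines in the diff and silently overwrites the `nsMap` entry.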
9fbd794 to 265c189
265c189 to 4631500
It was using `ns` assuming it's available globally
This deduplicates all objects entering the store, keeping at most one instance for each value in memory
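As a rough illustration of the idea (not the code in this PR), deduplication by value can be sketched with a lookup table in front of object creation. `TermStore` and `namedNodeByValue` are hypothetical names chosen for this example.

```javascript
// Hypothetical interning store: at most one node object per distinct value.
class TermStore {
  constructor () {
    this.byValue = new Map() // IRI string -> the single shared node object
  }

  namedNodeByValue (value) {
    let node = this.byValue.get(value)
    if (node === undefined) {
      // Frozen so that the shared instance cannot be mutated by one holder
      // and surprise all the others.
      node = Object.freeze({ termType: 'NamedNode', value })
      this.byValue.set(value, node)
    }
    return node
  }
}

const store = new TermStore()
const a = store.namedNodeByValue('http://example.com/friend')
const b = store.namedNodeByValue('http://example.com/friend')
console.log(a === b) // true
console.log(store.byValue.size) // 1
```

Every later request for the same value returns the existing object, so memory grows with the number of distinct values rather than the number of occurrences.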
4631500 to 1c54ffb
This implements some of the methods shared by all memoized Node descendants. Though RDF would only contain named nodes, blank nodes, and literals, rdflib also allows Collection and Graph types to be introduced into a triple, so an additional layer is introduced to override some of the core methods with more performant versions which take advantage of the memoization (e.g. `equals()` using a single `===`). Another major change is the introduction of `#generateString`, which contains the implementation details of how a class should be stringified. Since there should only ever be a single instance for every unique node, the `#toString` method is overridden to memoize the result of the first `#generateString` call, trading a persistent memory allocation for fewer string concatenations and less garbage collection later on.
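The pattern described in this commit message could look roughly like the sketch below. `MemoizedNode`, `ExampleNamedNode`, and the `cachedString` field are hypothetical names; only `generateString`, `toString`, and `equals` are taken from the message, and the identity-based `equals` is only valid under the assumption that each unique node has exactly one instance.

```javascript
// Hypothetical base class for nodes that are guaranteed to be unique per value.
class MemoizedNode {
  // Subclasses describe how their string form is built.
  generateString () {
    throw new Error('generateString() must be implemented by a subclass')
  }

  // Builds the string once and reuses it on every later call.
  toString () {
    if (this.cachedString === undefined) {
      this.cachedString = this.generateString()
    }
    return this.cachedString
  }

  // Identity comparison; correct only because instances are deduplicated.
  equals (other) {
    return this === other
  }
}

let generateCalls = 0 // counts how often the string is actually built

class ExampleNamedNode extends MemoizedNode {
  constructor (value) {
    super()
    this.value = value
  }

  generateString () {
    generateCalls++
    return '<' + this.value + '>'
  }
}

const node = new ExampleNamedNode('http://example.com/a')
node.toString()
node.toString()
console.log(node.toString()) // <http://example.com/a>
console.log(generateCalls) // 1
console.log(node.equals(node)) // true
```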
This has a few advantages:
* Makes rdflib.js comply with the rdfjs task force definition (http://rdf.js.org/#literal-interface)
* Makes behaviour consistent with the new `Term.literalByValue` implementation
* Downstream can use switch without additional undefined and null checking

We now only have one instance of xsd:string, so the overhead is only a reference rather than an entire object for each literal.
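A small sketch of what this enables downstream, under the assumption that every literal carries a datatype node. `makeLiteral` and `toNative` are hypothetical helpers written for this example and are not part of rdflib.

```javascript
// One shared node per datatype, so each literal only stores a reference.
const xsdString = Object.freeze({
  termType: 'NamedNode',
  value: 'http://www.w3.org/2001/XMLSchema#string'
})
const xsdInteger = Object.freeze({
  termType: 'NamedNode',
  value: 'http://www.w3.org/2001/XMLSchema#integer'
})

// Hypothetical factory: falls back to the shared xsd:string node.
function makeLiteral (value, datatype) {
  return { termType: 'Literal', value, datatype: datatype || xsdString }
}

// Hypothetical consumer: no undefined/null guard is needed before the switch,
// because `datatype` is always set.
function toNative (literal) {
  switch (literal.datatype.value) {
    case xsdInteger.value:
      return parseInt(literal.value, 10)
    case xsdString.value:
    default:
      return literal.value
  }
}

const plain = makeLiteral('hello')
const other = makeLiteral('world')
console.log(plain.datatype === other.datatype) // true
console.log(toNative(makeLiteral('42', xsdInteger))) // 42
console.log(toNative(plain)) // hello
```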
1c54ffb to cadecfb
Awesome work! Trying to write down what you, @Vinnl and I just discussed f2f: I do find it a bit scary that when you do:

var a = new A();
var b = new A();
a.value = 'x';
b.value = 'y';
console.log(a.value);

you would get 'y' instead of 'x'. So I think that we might be better off going for nodes that are just objects. So instead of:

var node = new NamedNode('http://example.com/friend')

we could maybe just do:

var node = {
  termType: 'NamedNode',
  value: 'http://example.com/friend'
}

That would also be a lot cheaper for passing these to webworkers and back.
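To illustrate the worker argument, the sketch below uses a JSON round trip as a stand-in for sending a node across a worker boundary (an assumption made to keep the example self-contained; `postMessage` uses structured cloning, which likewise does not carry over class prototypes). `ExampleNode` and `roundTrip` are hypothetical names.

```javascript
// Hypothetical class-based node with behaviour attached via the prototype.
class ExampleNode {
  constructor (value) {
    this.termType = 'NamedNode'
    this.value = value
  }

  equals (other) {
    return other != null && other.termType === this.termType && other.value === this.value
  }
}

// Stand-in for crossing a worker boundary: serialize, then deserialize.
const roundTrip = (x) => JSON.parse(JSON.stringify(x))

const plainNode = { termType: 'NamedNode', value: 'http://example.com/friend' }
const plainCopy = roundTrip(plainNode)
const classCopy = roundTrip(new ExampleNode('http://example.com/friend'))

// The plain object arrives complete, with nothing to restore.
console.log(plainCopy.termType, plainCopy.value) // NamedNode http://example.com/friend

// The class instance keeps its data but loses its prototype and methods.
console.log(classCopy instanceof ExampleNode) // false
console.log(typeof classCopy.equals) // undefined
```

A class-based node would need to be rebuilt on the receiving side, whereas the plain object is usable as soon as it arrives.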
Not sure how this PR affects conformance to the RDF/JS Data Model. As a note, one of the latest changes to that draft provides To go the other way and 'downgrade' an RDF/JS-conformant object to a JSON-serializable object, I recall the issue opened by @RubenVerborgh, rdfjs/data-model-spec#94, which might need revising.
Currently I am experimenting with a prototype based on https://github.com/PolymerLabs/actor-helpers, which actually aims at moving as much as possible into web workers. An easy round trip between RDF/JS-conformant objects and JSON-serializable objects will be a requirement there.
As per request, I've integrated some of the external changes to rdflib into the library.
This is still a WIP and still contains some minor bugs, but the tests succeed and it works properly when integrated into our rdflib app with minimal changes. Some additional code isn't ported yet, which might come this weekend, but the overall mechanics and changes are already visible.
The largest change is having to use factory methods rather than constructor calls. IIRC this could be overcome by using ES5 functional types rather than ES6 classes, but I can see value in using factories and leaving the constructor semantics unchanged to prevent unexpected behaviour (e.g. `new Literal('a') === new Literal('a') // => true`).
TODO: