2024-02-08 kernel meeting notes #115
zachschuermann started this conversation in General
summary
todo
action items
PRs in flight:
attendees
@tdas @roeap @zachschuermann @nicklan @ryan-johnson-databricks @hntd187
notes stream
PRs in flight (listed above).
Robert's PR: for JSON parsing, can we use the from-file hack? C++ Arrow lets you read JSON from a file but not from a column (all the machinery is there). Ask @wjones127? apache/arrow#33662
nick: should we just parse all stats at once? We should have a clear picture of when we would use this. Add a microbenchmark.
returning results in C: (1) result slot in the parameters plus a return code, or (2) null means error, non-null means success
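The two conventions can be compared side by side. This is a minimal sketch, not the kernel's FFI: `Snapshot`, `open_snapshot`, and every function name below are hypothetical, and the "negative version is an error" rule exists only to give the example a failure path. A real export would also need a no-mangle attribute, omitted here.

```rust
use std::os::raw::c_int;
use std::ptr;

/// Hypothetical opaque handle handed across the FFI boundary.
pub struct Snapshot {
    pub version: u64,
}

/// Hypothetical fallible operation: negative versions are rejected.
fn open_snapshot(version: i64) -> Result<Snapshot, ()> {
    if version < 0 {
        Err(())
    } else {
        Ok(Snapshot { version: version as u64 })
    }
}

/// Option 1: the result goes into an out-parameter slot; the return code
/// reports success (0) or failure (non-zero).
///
/// Safety: `out` must be null or point to writable storage for one pointer.
pub unsafe extern "C" fn snapshot_open_code(version: i64, out: *mut *mut Snapshot) -> c_int {
    if out.is_null() {
        return 1;
    }
    match open_snapshot(version) {
        Ok(s) => {
            unsafe { *out = Box::into_raw(Box::new(s)) };
            0
        }
        Err(()) => {
            unsafe { *out = ptr::null_mut() };
            2
        }
    }
}

/// Option 2: the pointer itself is the result; null means error.
pub extern "C" fn snapshot_open_ptr(version: i64) -> *mut Snapshot {
    match open_snapshot(version) {
        Ok(s) => Box::into_raw(Box::new(s)),
        Err(()) => ptr::null_mut(),
    }
}

/// Releases a handle produced by either function. Null is a no-op.
///
/// Safety: `p` must be null or a pointer returned by the functions above,
/// and must not be used again afterwards.
pub unsafe extern "C" fn snapshot_free(p: *mut Snapshot) {
    if !p.is_null() {
        drop(unsafe { Box::from_raw(p) });
    }
}
```

Option 1 leaves room to distinguish error kinds through the return code, while option 2 is shorter for the caller but collapses every failure into a single null.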
module naming: rename defaultclient/simpleclient. good to have non-async client.
from last time: nick's scan result. Doing selection vector now? Yep! Has a large boolean vector with it. Allocating with lots of false is fast; true takes a while.
from last time: How can we flatten the roaring bitmap? Maybe build the dumb thing now (for loop) and take this on as an optimization later? Roaring bitmap stores indices but we would want a sparse array.
Aside (zach): does allocating for a selection vector cause memory management issues?
Allocate big vector, then set indices in the treemap.
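The "dumb" for-loop version discussed above might look like the sketch below. It assumes the bitmap can be iterated as `u64` row indices; any iterator works here (a `BTreeSet<u64>` is a convenient stand-in for a roaring treemap), and the function name and the choice to mark listed indices as `true` are illustrative, not the kernel's actual polarity.

```rust
/// Flattens a set of row indices into a dense boolean vector of length
/// `num_rows`. Starts from an all-false allocation and sets one slot per
/// index. Indices at or beyond `num_rows` are ignored.
pub fn flatten_to_bools<I>(indices: I, num_rows: usize) -> Vec<bool>
where
    I: IntoIterator<Item = u64>,
{
    // One allocation up front, filled with `false`.
    let mut selection = vec![false; num_rows];
    for idx in indices {
        // Bounds check in u64 so a huge index cannot wrap when cast.
        if idx < num_rows as u64 {
            selection[idx as usize] = true;
        }
    }
    selection
}
```

The cost is one pass over the stored indices plus an allocation proportional to the row count, which is the part a later optimization would target.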
Ryan: 3 things
FFI version of EngineInterface
From nick/robert: unsafe in kernel is used in one place to do Box::into_raw. Can we work around this? Fixed: we can implement a method that takes the box rather than the type itself. self is a box of the thing, then into_any, then downcast. Yay! Box<trait> -> Box<any> -> Box<concrete>, then turn into record batch. No transmute.
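A minimal sketch of that Box<trait> -> Box<any> -> Box<concrete> chain, using only `std::any::Any`. The trait `OpaqueData` and the structs `ConcreteBatch` and `OtherBatch` are hypothetical stand-ins, not the kernel's types; the point is only that a method with a `self: Box<Self>` receiver allows a safe downcast with no `unsafe` and no transmute.

```rust
use std::any::Any;

/// Hypothetical trait object type passed around by the kernel.
pub trait OpaqueData {
    /// Takes the box itself (not `&self`), so ownership moves through.
    fn into_any(self: Box<Self>) -> Box<dyn Any>;
}

/// Hypothetical concrete type, standing in for something that wraps a
/// record batch.
pub struct ConcreteBatch {
    pub rows: Vec<i64>,
}

/// A second implementor, used to show the failing downcast.
pub struct OtherBatch;

impl OpaqueData for ConcreteBatch {
    fn into_any(self: Box<Self>) -> Box<dyn Any> {
        self // Box<ConcreteBatch> coerces to Box<dyn Any>
    }
}

impl OpaqueData for OtherBatch {
    fn into_any(self: Box<Self>) -> Box<dyn Any> {
        self
    }
}

/// Box<dyn OpaqueData> -> Box<dyn Any> -> Box<ConcreteBatch>.
/// Returns None if the box held some other implementor.
pub fn into_concrete(data: Box<dyn OpaqueData>) -> Option<Box<ConcreteBatch>> {
    data.into_any().downcast::<ConcreteBatch>().ok()
}
```

A caller holding a `Box<dyn OpaqueData>` gets back the owned concrete value on success and `None` on a type mismatch, so a wrong type is a checked failure rather than undefined behavior.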