Batch inserts #68
Conversation
@@ -170,14 +174,31 @@ def optail
    if tail_from.is_a? Time
      tail_from = tailer.most_recent_position(tail_from)
    end

    last_batch_insert = Time.now
    tailer.tail(:from => tail_from)
    until @done
      tailer.stream(1000) do |op|
I haven't done the digging to confirm this, but I'm pretty sure the contract on `Tailer` by default is that once your block returns, the op is considered to have been handled, and the timestamp may be persisted to postgres. However, with batched inserts, we haven't actually processed the op until we've flushed the inserts, so this could result in data loss if we save a timestamp before flushing the inserts.

mongoriver does have a `batch` mode, which allows you to explicitly mark batches and tell mongoriver when you're done with a batch. Unfortunately I've forgotten the details, so you'll probably have to source-dive :(
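To make the hazard concrete, here is a hypothetical sketch of the invariant being described: the resume timestamp may only advance past ops whose buffered inserts have actually been flushed. The `Checkpoint` class and its method names are invented for illustration; they are not mongoriver's actual API.

```ruby
# Hypothetical sketch: track which buffered ops have been flushed, and
# only persist the newest timestamp with no unflushed op before it.
# Names (Checkpoint, buffer, flushed_up_to) are illustrative only.
class Checkpoint
  attr_reader :persisted

  def initialize
    @persisted = nil   # last timestamp safe to resume from
    @pending   = []    # [timestamp, flushed?] pairs, in oplog order
  end

  # An op was handed to the batch buffer but not yet written out.
  def buffer(ts)
    @pending << [ts, false]
  end

  # The batch covering ops up to ts was written to postgres.
  def flushed_up_to(ts)
    @pending.each { |p| p[1] = true if p[0] <= ts }
    advance
  end

  private

  # Advance the persisted timestamp only across flushed ops.
  def advance
    while !@pending.empty? && @pending.first[1]
      @persisted = @pending.shift[0]
    end
  end
end
```

If the process dies before a flush, `persisted` still points at the last fully flushed position, so replaying from it re-reads the unflushed ops instead of losing them.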
Good point!

My original assumption was that if the process is told to stop, it would flush due to the signal handler. However, in hindsight you're right: if something catastrophic happens, the data would not get flushed, resulting in data loss.
Yeah, we can't assume we'll get to shut down gracefully -- we need to handle the case where the machine dies, the program gets killed via SIGKILL, whatever.
Modulo the concerns around making sure we don't update timestamps too early, I think this lgtm.
Did you figure out how to address the concerns around timestamps? We really need this optimization in our environment.
This pull request adds support for batching sequential INSERTs when tailing, speeding up tailing under certain conditions while never being slower than the current behavior. See also issue #47.
r? @nelhage
cc @snoble
Basic strategy is to batch consecutive inserts together per namespace. The batch gets saved whenever:
- an update or delete is done to the same namespace as the insert,
- after streaming (up to) 1000 updates from the oplog, the time since the last batch save is larger than 5 seconds,
- more than a threshold of updates have happened in this namespace, or
- the program is exiting/streaming stops.
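As a rough illustration of those flush conditions, here is a hypothetical sketch of a per-namespace insert buffer. `BatchBuffer` and all of its method names are invented for illustration; they are not the actual code in this PR.

```ruby
# Hypothetical sketch: buffer consecutive inserts per namespace and flush
# on the conditions listed above. Illustrative names, not MoSQL's API.
class BatchBuffer
  def initialize(max_size: 500, max_age: 5.0, &flush)
    @max_size   = max_size   # threshold of buffered inserts per namespace
    @max_age    = max_age    # seconds since last flush before forcing one
    @buffers    = Hash.new { |h, k| h[k] = [] }
    @last_flush = Time.now
    @flush      = flush      # called with (namespace, rows)
  end

  def insert(ns, row)
    @buffers[ns] << row
    flush!(ns) if @buffers[ns].size >= @max_size  # size threshold hit
  end

  # An update/delete to a namespace must flush its pending inserts first,
  # so operations are applied in oplog order within that namespace.
  def before_mutation(ns)
    flush!(ns)
  end

  # Called after each stream(1000) round: flush everything if too old.
  def maybe_flush_all
    flush_all! if Time.now - @last_flush > @max_age
  end

  # Called on shutdown (and from maybe_flush_all).
  def flush_all!
    @buffers.keys.each { |ns| flush!(ns) }
    @last_flush = Time.now
  end

  private

  def flush!(ns)
    rows = @buffers.delete(ns)
    @flush.call(ns, rows) if rows && !rows.empty?
  end
end
```

The per-namespace split matters because a multi-row INSERT can only target one table, and ordering only has to be preserved relative to other ops on the same namespace.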
Some handwavy measurements for tailing 20000 oplog entries:
Notes on potential future work (that I may or may not be working on soonish):

The next "low-hanging" performance fruit to work on after this would be optimizing updates, though this wouldn't have as large an effect.
Some ideas on how this could be done:

`$set` entries in the oplog can directly be translated into postgres queries updating only the columns mentioned. Updates without `$set` can replace the current row in postgres with the data in the oplog entry. The tricky part here is figuring out if/how this applies to tokumx even after mongoriver does oplog entry translation (if they support any other `$` operations) and `$unset`.

Another performance improvement would be to have multiple tailers in either separate threads or processes, separated by namespace. This would however require keeping multiple tailing states in the database (one per namespace), and I'm not quite sure what the performance implications are for mongo of querying the same oplog (with filters?) from multiple processes.
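As a toy illustration of the `$set` translation idea: a hypothetical sketch that turns a `$set` document into a column-wise UPDATE. The `set_to_update` helper, the column mapping, and the quoting are all invented and simplified; a real implementation would go through MoSQL's schema map and use parameterized queries rather than string interpolation.

```ruby
# Hypothetical sketch: translate a $set oplog update into a postgres
# UPDATE touching only the columns mentioned. Simplified quoting;
# illustration only, not safe against SQL injection.
def set_to_update(table, selector, set_fields)
  assignments = set_fields.map { |col, val| "#{col} = #{quote(val)}" }.join(", ")
  "UPDATE #{table} SET #{assignments} WHERE _id = #{quote(selector['_id'])};"
end

def quote(val)
  # Numbers pass through; everything else is quoted with '' escaping.
  val.is_a?(Numeric) ? val.to_s : "'#{val.to_s.gsub("'", "''")}'"
end
```

Updates without `$set` would instead map to an UPDATE (or DELETE + INSERT) replacing the whole row with the document from the oplog entry.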