Skip to content
This repository has been archived by the owner on Nov 22, 2018. It is now read-only.

Questions about scalability, and doing backups #59

Open
twitchyliquid64 opened this issue Jun 19, 2014 · 31 comments
Open

Questions about scalability, and doing backups #59

twitchyliquid64 opened this issue Jun 19, 2014 · 31 comments

Comments

@twitchyliquid64
Copy link
Collaborator

Hi There,

I had a quick look in the documentation, and I could not find any reference about how the DB scales.

I would like to know how I should expect it to behave performance wise as the number of records increases.

Consider the query:
SELECT * FROM invoices WHERE customerid = $1;
With an index:
CREATE INDEX invoicesLookup ON invoices(customerid);

What kind of performance can I 'expect' average case and worst case? (no guarantees ofc). O(n)? O(log(n))? O(1)?

Likewise for a range query with dates:
SELECT * FROM invoices WHERE creationdate < now();
With an index:
CREATE INDEX invoicesDateLookup ON invoices(creationdate );

What kind of performance?

Moreover, what kind of performance can I expect for ORDER BY? O(n^2)?

My tinkering and blooping around suggests that ql is relatively stable, as I haven't been able to cause corruption no matter what I do. However, I haven't been able to work out a good way of 'dumping'/backing-up the database. Other database engines such as postgres have mechanisms designed to dump a database to a file such that it can be reconstructed. But I havent been able to work out how to dump ql, without closing the db and re-opening it.

So my second question is: how should one do backups of a live database in ql?

@twitchyliquid64
Copy link
Collaborator Author

db.flush() added, so I can use that.

@cznic
Copy link
Owner

cznic commented Sep 8, 2014

Reopening b/c the intent is to answer once the new DB backend is finished. No promises on timeline, but it is an active WIP.

@cznic cznic reopened this Sep 8, 2014
@twitchyliquid64
Copy link
Collaborator Author

Any update on this? I've found myself wanting to use ql again :)

@cznic
Copy link
Owner

cznic commented May 13, 2017

I cannot recall what the situation was in 2014, but ATM I think

  • SELECT * FROM invoices WHERE customerid = $1;: O(n).

  • With CREATE INDEX invoicesLookup ON invoices(customerid);: O(log(n)+m) where m is the cardinality of the result set.

  • SELECT * FROM invoices WHERE creationdate < now();: O(n).

  • With CREATE INDEX invoicesDateLookup ON invoices(creationdate );: O(log(n)+m).

I suggest to verify the above by writing benchmarks* - it may also be the case that the logarithmic behavior works only for some types.

Wrt dumps/backups: No progress. I've never finished the new backend, but PRs are always welcome 😉

Maybe @gernest has some ideas?

(*) They would also make for a nice contribution probably.

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

@cznic is the new backend you are referring to related to the HTTPFS?

@twitchyliquid64 regarding backup, since ql uses a single file don't you think you can just copy&gzip it? Or do you want something like pg_dump dumping SQL to recreate the database state?

@cznic
Copy link
Owner

cznic commented May 13, 2017

is the new backend you are referring to related to the HTTPFS?

No, the new backend should have replaced cznic/lldb.

regarding backup, since ql uses a single file don't you think you can just copy&gzip it?

I think he wants to make a dump/backup while mutating the DB. Then it's not that easy anymore.

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

I see, probablySQL dump should be possible. Since all ops relate to the single database. It is okay to say you can restore by recreating the database file content.

This is hard.

@cznic
Copy link
Owner

cznic commented May 13, 2017

Thinking about it, maybe the solution is to

  • Wait for any pending R/W transaction to finish to have a consistent file image.
  • Enter backup mode which keeps the WAL growing during the dump/backup which then becomes a simple file copy.
  • Flush the WAL while coordinating with any R/W transaction pending, if any.
  • Exit dump mode.

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

Wait for any pending R/W transaction to finish to have a consistent file image.
Tricky

Do we keep count of pending transactions?

@cznic
Copy link
Owner

cznic commented May 13, 2017

Do we keep count of pending transactions?

No, but IIRC DB.rw reflects if there is any R/W transaction executing. If so, it executes in isolation.

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

That still wont be able to address to when all transactions are done.

@cznic
Copy link
Owner

cznic commented May 13, 2017

The DB state machine knows when it turns the rw field to false: here and here. In those places we can check for a pending dump/backup request and enter the according state.

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

I see, that makes sense. I haven't groked th FSM yet!

offtopic: I have been running go-fuzz against db.Run for 1h20m no crashes yet!

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

// Flush ends the transaction collecting window, if applicable. IOW, if the DB
// is dirty, it schedules a 2PC (WAL + DB file) commit on the next outer most
// DB.Commit or performs it synchronously if there's currently no open
// transaction.
//
// The collecting window is an implementation detail and future versions of
// Flush may become a no operation while keeping the operation semantics.

By any change was there any point in time when this was implemented? The explanation looks like it is what we will need to do when dumping.

@cznic
Copy link
Owner

cznic commented May 13, 2017

I haven't groked th FSM yet!

Here's its State Transition Table.

I have been running go-fuzz against db.Run for 1h20m no crashes yet!

Interesting experiment and glad to hear that!

@cznic
Copy link
Owner

cznic commented May 13, 2017

By any change was there any point in time when this was implemented? The explanation looks like it is what we will need to do when dumping.

It's a nop because currently the WAL flush is performed unconditionally on COMMIT of the outermost R/W transaction. WAL cannot be flushed before that actually because we van get ROLLBACK instead of COMMIT in the end.

@cznic
Copy link
Owner

cznic commented May 13, 2017

// The collecting window is an implementation detail and future versions of
// Flush may become a no operation while keeping the operation semantics.

There has been some point in time when we were collecting transactions for 1 sec before flushing the WAL for performance reasons. You're right, this is similar to what dump/backup needs to do.

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

While we are on this subject. I faced a challenge when trying to enforce canceling queries. From my understanding, when execution is happening in the FSM. This is the expected behavior when we cancel the context.

  • If we are entering rw state, we just halt and rollback.
  • If we are inside rw state , what happens?

Or can we just limit the scope to when we are evaluating the result set? I have feeling for it to make sense we need db.run1ctx which will also accept context.Context and handle canceling signal well.

@cznic
Copy link
Owner

cznic commented May 13, 2017

If we are inside rw state , what happens?

In such case we probably/maybe need to effectively "inject" a ROLLBACK into the executing statement list.

Or can we just limit the scope to when we are evaluating the result set? I have feeling for it to make sense we need db.run1ctx which will also accept context.Context and handle canceling signal well.

I think you're right. As we don't yet have SELECT INTO, cancelling a R/W transaction is probably not really useful.

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

I see. One strategy is, properly cancel result sets( which is trivial), then we make sure we don't step into rw mode when the query is cancelled. When the cancel signal is sent and we are already in rw mode we leave it to the gods 😃

select {
  case <-ctx.Done():
    return //  no op
  default:
    db.run1()
}

@cznic
Copy link
Owner

cznic commented May 13, 2017

Sounds like a good plan to me ;-)

@gernest
Copy link
Collaborator

gernest commented May 13, 2017

While the idea seems good. I tried it and it doesn't work well with the database/sql interfaces.

For canceling queries, the databas/sql QueryContext expects cancellation at all levels, ie execution and evaluation of result set. But with ql we delay the evaluation until when we call Do method of the returned multi row result set.

rs,err:=db.QueryContext(ctx, "select sleep(5)");

err is expected not to be nil, somehow when ctx is canneled while we are still sleeping, but sleep function is not evaluated/called until you start scannning the rows.

What I'm saying is, for properly supporting cancellation. This must happen at execution level , there is no way for the gods to say stop otherwise. Since the execution FSM is way over my head, there is nothing I can do about the #172 at the moment.

@twitchyliquid64
Copy link
Collaborator Author

WRT backups: Can i just gzip the database file? Will there be a time where this is in a 'corrupt' state? I would expect the semantics of db.Flush() to mean the any successful commits made prior to Flush() would be captured in the db file. I understand Flush() is currently a no-op, is my expectation still correct?

I have considered contribution, however I get very lost in the code. Is there a unofficial HACKING file/guide or similar?

@cznic
Copy link
Owner

cznic commented May 14, 2017

Can i just gzip the database file?

Yes.

Will there be a time where this is in a 'corrupt' state?

The DB file is in inconsistent state during committing the WAL file to the DB file - that being a result of outermost COMMIT of a R/W transaction.

I would expect the semantics of db.Flush() to mean the any successful commits made prior to Flush() would be captured in the db file.

Correct.

I understand Flush() is currently a no-op, is my expectation still correct?

Yes and it's valid until a new R/W transaction WAL starts to be commited back to the DB file.

Is there a unofficial HACKING file/guide or similar?

Unfortunately not, sorry.

@twitchyliquid64
Copy link
Collaborator Author

It sounds like If I start gzipping the database while other mutating db ops are in flight, then the file could become corrupted (failed backup). Is there a way to 'take a lock' on mutating the database for the sake of a backup? If this is as simple as holding an internal lock, this might be a nice first contribution.

@cznic
Copy link
Owner

cznic commented May 14, 2017

Untested (error handling omitted):

Direct API

ctx := ql.NewRWCtx()
db.Run(ctx, "begin transaction")
backup()
db.Run(ctx, "rollback")

sql API

tx := db.Begin()
backup()
tx.Rollback()

@cznic
Copy link
Owner

cznic commented May 22, 2017

@twitchyliquid64 Please let us know if you have tried something and what were the results, if any, thank you.

@twitchyliquid64
Copy link
Collaborator Author

Sorry for the delay, ive been blocked on other features so haven't done backup yet, but will update this issue once I circle back.

Cheers & <3
Tom

@cznic
Copy link
Owner

cznic commented May 22, 2017

No problem, thanks.

@twitchyliquid64
Copy link
Collaborator Author

FYI, current plan is to build a package that uses __Table and _Column systables, so a user just has to call backup(). Slated for implementation after I revamp the planner.

@cznic
Copy link
Owner

cznic commented Jul 15, 2017

Sounds reasonable.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants