-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Odd database corruption #72
Comments
Hi Daniel. I agree that it does not seem to be the same sort of error as #71. All data is sequentially appended to the tail of buntdb db file it should not be possible to write in the middle of the file. Also, loading a corrupted db file or a file with partially written entries (as in #71) is not allowed and buntdb will returns an error, so it shouldn't be caused by a previous process failing to finish writing data or a bad fsync. The only thing that I can think that might cause an issue like this is if two or more processes opened the buntdb database at the same time and each tried to write entries at the same time. |
I've been thinking about this issue some more. I'm wondering if I should not introduce an flock feature to BuntDB. If the problem is related to multiple processes clobbering the file then perhaps this could help. |
I've been banging around in the code and I saw that there is a write operation that could partially write data and still fail. It's likely not common, but in the case of low system resources, it could happen. Which would leave the database in a corrupted state. I wrote a test utility to reproduce the issue. https://gist.github.com/tidwall/e823bcb75558c2a590cc17227d630ee1 I fixed the problem b6a7490 and pushed a new version v1.2.3. |
Thanks so much! I agree that this is more likely than concurrent file access. We were thinking that if concurrent file access were causing this, the second half of the write would have to be interleaved somewhere else in the file, but we couldn't find it. |
You’re welcome. I’m glad it’s squared away. Let me know if you happen upon any more problems. Thanks for the detailed reporting. |
Hi tidwall! We've been using buntdb in our irc server for a number of years now and just ran into a really odd instance of a corrupt db from a user. We've been looking into it over here: ergochat/ergo#1603
Basically, two commands in the db are truncated, in two separate places in the middle of the database file. The actual file itself isn't truncated, just those two entries, so it doesn't seem to be the same sort of error as #71?
The user shutdown our software successfully, and after a reboot buntdb just returns
invalid database
.Here's the first instance of the failure w/ anonymised data (~60k lines into a ~69k line db file) where ~20 characters are truncated from the command. The third long json blob is what's cut off:
The above instance returns an error at line 842 because it gets dumped in the middle of a line.
The second instance looks almost identical, just that ~103 characters are cut off and it happens to fall on a
\r\n
so buntdb thinks it's finished a command, tries to read a new one, and gets ana
instead of a*
(since*
indicates 'new command') and returns an error on line 781. This second instance is about 63k lines into the 69k line db file.There's no extra data inserted into the db at all, just the end of those two long lines that are missing.
If it can help, we open up our db here and we save that key in the
Update
call here.Thanks again for the great library, it's done really well for our uses so far! I wanted to track this down further, but figured it'd be wise to ask first since it looked strange. If you're after any more info just let me know.
The text was updated successfully, but these errors were encountered: