Internals for Developers

Mario Juric edited this page Jan 1, 2017 · 1 revision

Transactions in LSD

LSD has implemented transactions since version 0.4. Transactions are used to:

  • prevent database corruption in case of failed updates
  • provide a consistent view of the database to readers while updates are ongoing

Any operation that modifies table data or metadata must be performed within a transaction. This is done with the DB.transaction() context manager, which commits automatically on exit. For example:

import lsd
import lsd.smf

db = lsd.DB("test")
with db.transaction():
    # The transaction is committed automatically when this block exits
    db.create_table('ps1_det', lsd.smf.det_table_def)

As part of the commit, LSD automatically updates the table's neighbor cache and its catalog (a list of the data files that make up the table data), so that both stay in a consistent state.

Implementation

LSD implements transactions using a variant of the [http://en.wikipedia.org/wiki/Snapshot_isolation snapshot isolation] technique. Each LSD table has a 'snapshots' directory, whose subdirectories store the snapshot data. A snapshot is either open or committed; a committed snapshot contains a special file, '.committed', that marks its state.

The data logically contained in the table is the union of the contents of all committed snapshot directories, merged from the oldest to the newest committed snapshot, with files in newer snapshots overriding files of the same name in older ones. For example, consider a table 'table1' with two snapshots, '0001' and '0002', containing the following:

table1/snapshots/0001/tablets/+0.5+0.5/T55555/main.h5
table1/snapshots/0001/tablets/+0.5+0.5/T55556/main.h5

table1/snapshots/0002/tablets/+0.5+0.5/T55556/main.h5
table1/snapshots/0002/tablets/+0.5+0.5/T55557/main.h5

Logically, this table is equivalent to the one having:

table1/tablets/+0.5+0.5/T55555/main.h5  # file from 0001
table1/tablets/+0.5+0.5/T55556/main.h5  # file from 0002
table1/tablets/+0.5+0.5/T55557/main.h5  # file from 0002

LSD performs this "directory merging" automatically and caches the result for fast lookup in {{{catalog.pkl}}} files stored in each snapshot's directory. The 0001 and 0002 in the example above are placeholders: actual snapshot IDs are the times at which the transactions were created, formatted as "YYYYMMDDHHmmss.ssssss".
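The merge rule can be illustrated with a short, self-contained Python sketch. This is not LSD's actual code: the function names (merged_view, build_example) are hypothetical, and the sketch assumes that snapshot IDs sort chronologically as strings and that '.committed' and 'catalog.pkl' are bookkeeping files rather than table data.

```python
import os
import tempfile

# Assumption of this sketch: these files are bookkeeping, not table data.
IGNORED = {".committed", "catalog.pkl"}


def merged_view(snapshots_dir):
    """Return {relative file path: ID of the snapshot that supplies it}."""
    view = {}
    # Assumes snapshot IDs sort chronologically when compared as strings.
    for snap_id in sorted(os.listdir(snapshots_dir)):
        snap_path = os.path.join(snapshots_dir, snap_id)
        if not os.path.isfile(os.path.join(snap_path, ".committed")):
            continue  # open or failed snapshot: ignored
        for root, _dirs, files in os.walk(snap_path):
            for name in files:
                if name in IGNORED:
                    continue
                rel = os.path.relpath(os.path.join(root, name), snap_path)
                # A newer snapshot overwrites the entry of an older one.
                view[rel.replace(os.sep, "/")] = snap_id
    return view


def build_example(snapshots_dir):
    """Recreate the two-snapshot 'table1' layout from the text (empty files)."""
    layout = {
        "0001": ["T55555", "T55556"],
        "0002": ["T55556", "T55557"],
    }
    for snap_id, tablets in layout.items():
        for tablet in tablets:
            d = os.path.join(snapshots_dir, snap_id, "tablets", "+0.5+0.5", tablet)
            os.makedirs(d)
            open(os.path.join(d, "main.h5"), "w").close()
        # Mark the snapshot as committed.
        open(os.path.join(snapshots_dir, snap_id, ".committed"), "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_example(tmp)
        for path, snap_id in sorted(merged_view(tmp).items()):
            print(path, "# file from", snap_id)
```

Run as a script, the sketch lists T55555 as coming from 0001 and both T55556 and T55557 as coming from 0002, matching the merged listing above. Removing the '.committed' marker from 0002 makes the view fall back to the files of 0001 alone.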

As a consequence of this implementation:

  • Rolling back to an older snapshot can be achieved by removing the directories containing newer snapshots. In principle the directories don't even have to be removed -- LSD just needs to be told which snapshot to read -- but this is not implemented yet.
  • To read a given snapshot, all older snapshots must be present. You can view each snapshot as a "diff" between the current and the previous state of the database, going back to the beginning; all diffs have to be present to construct the current state.
  • If anything goes wrong in a transaction, the snapshot directory it created will be left in the snapshots/ subdirectory, but it won't have a '.committed' file and will therefore be ignored by LSD. Such directories can be safely removed, either manually ({{{'rm -rf'}}}) or using {{{lsd-vacuum}}}.
  • Queries to the database don't see the data added by the current transaction; they see the database state as it was when the transaction started. For example, if you have a table with 10 rows, begin a transaction, add or modify some rows and, without committing, query the table again, you will get the original 10 rows. Only after the transaction has been committed (by calling db.commit(), or on exit from the DB.transaction() block) will your queries begin returning the new data.
  • Upon commit, LSD does the necessary housekeeping: it updates the table catalogs ({{{catalog.pkl}}}) and the neighbor caches, the latter only for the cells that were modified by the transaction.
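The consequences listed above follow from the '.committed' marker convention, which the following minimal Python sketch tries to make concrete. It is not LSD's implementation: snapshot_transaction and list_snapshots are hypothetical helpers, the optional snap_id argument exists only to make the example deterministic, and the use of local time for the ID is an assumption.

```python
import contextlib
import datetime
import os
import tempfile

MARKER = ".committed"


@contextlib.contextmanager
def snapshot_transaction(snapshots_dir, snap_id=None):
    """Create a snapshot directory; mark it committed only on clean exit."""
    if snap_id is None:
        # "YYYYMMDDHHmmss.ssssss"; using local time is an assumption here.
        snap_id = datetime.datetime.now().strftime("%Y%m%d%H%M%S.%f")
    snap_path = os.path.join(snapshots_dir, snap_id)
    os.makedirs(snap_path)
    # An exception raised in the with-body propagates from this yield,
    # so the marker below is written only if the body succeeded.
    yield snap_path
    open(os.path.join(snap_path, MARKER), "w").close()


def list_snapshots(snapshots_dir, committed=True, as_of=None):
    """Sorted snapshot IDs that are committed (or not), optionally up to as_of."""
    ids = []
    for snap_id in sorted(os.listdir(snapshots_dir)):
        snap_path = os.path.join(snapshots_dir, snap_id)
        if not os.path.isdir(snap_path):
            continue
        has_marker = os.path.isfile(os.path.join(snap_path, MARKER))
        if has_marker == committed and (as_of is None or snap_id <= as_of):
            ids.append(snap_id)
    return ids


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with snapshot_transaction(tmp, "0001"):
            pass  # a successful update
        try:
            with snapshot_transaction(tmp, "0002"):
                raise RuntimeError("failed update")
        except RuntimeError:
            pass  # directory 0002 is left behind without a marker
        print("committed:", list_snapshots(tmp))
        print("leftovers:", list_snapshots(tmp, committed=False))
```

In this sketch, list_snapshots(..., committed=False) returns the leftover directories that a cleanup tool could delete, while the as_of argument selects the chain of "diffs" needed to reconstruct an older state without removing anything from disk.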