Erasure coding
gilbertchen edited this page Sep 27, 2020 · 1 revision
This feature has been available since CLI version 2.7.0.
To initialize a storage with erasure coding enabled, run this command (assuming 5 data shards and 2 parity shards):
duplicacy init -erasure-coding 5:2 repository_id storage_url
Then you can run backup, check, prune, etc. as usual.
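To make the 5:2 setting concrete, here is a minimal sketch of the arithmetic behind it. It assumes the usual erasure-coding behaviour, where any set of shards as large as the data shard count is enough to rebuild a chunk. The function names are hypothetical and are not part of Duplicacy.

```python
from fractions import Fraction

def parse_shard_spec(spec):
    """Split a 'data:parity' spec such as '5:2' into two integers."""
    data, parity = (int(part) for part in spec.split(":"))
    return data, parity

def storage_overhead(data, parity):
    """Ratio of stored shard bytes to original bytes (headers and hashes ignored)."""
    return Fraction(data + parity, data)

data, parity = parse_shard_spec("5:2")
print(f"shards per chunk: {data + parity}")
# Assumption: up to `parity` shards of a chunk can be bad and the chunk is still recoverable.
print(f"bad shards tolerated per chunk: {parity}")
print(f"approximate size ratio: {float(storage_overhead(data, parity)):.2f}x")
```

With 5:2 each chunk is stored as 7 shards, so the storage grows by roughly 40% in exchange for tolerating two bad shards per chunk.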
When a bad chunk is detected, you'll see log messages like this:
Restoring /private/tmp/duplicacy_test/repository to revision 1
Recovering a 1824550 byte chunk from 364910 byte shards: ***--**
Downloaded chunk 1 size 1817347, 1.73MB/s 00:00:11 9.0%
Recovering a 6617382 byte chunk from 1323477 byte shards: **--***
Downloaded chunk 2 size 6591322, 8.02MB/s 00:00:02 42.0%
Recovering a 5136934 byte chunk from 1027387 byte shards: --*****
Downloaded chunk 3 size 5116593, 12.90MB/s 00:00:01 67.6%
Recovering a 2515494 byte chunk from 503099 byte shards: -*****-
Downloaded chunk 4 size 2505558, 15.29MB/s 00:00:01 80.1%
Recovering a 3984934 byte chunk from 796987 byte shards: --*****
Downloaded chunk 5 size 3969180, 19.07MB/s 00:00:01 100.0%
Downloaded file1 (20000000)
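The numbers in these messages can be checked by hand: in every sample line the shard size equals the chunk size divided by the 5 data shards, rounded up. The sketch below parses one such line and performs that check. It assumes `*` marks a usable shard and `-` an unusable one, which the log itself does not state; the `summarize` helper is hypothetical.

```python
import re

LOG_LINE = re.compile(
    r"Recovering a (\d+) byte chunk from (\d+) byte shards: ([*-]+)"
)

def summarize(line, data_shards=5):
    """Summarize one 'Recovering ...' log line, or return None if it does not match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    chunk_size = int(match.group(1))
    shard_size = int(match.group(2))
    marks = match.group(3)
    # Integer ceiling division: chunk size split across the data shards, rounded up.
    expected_shard_size = (chunk_size + data_shards - 1) // data_shards
    return {
        "chunk_size": chunk_size,
        "shard_size": shard_size,
        "shard_size_matches": shard_size == expected_shard_size,
        # Assumption: '*' marks a usable shard and '-' an unusable one.
        "usable": marks.count("*"),
        "unusable": marks.count("-"),
    }

sample = "Recovering a 6617382 byte chunk from 1323477 byte shards: **--***"
print(summarize(sample))
```

Under that reading, every line in the sample has five usable shards out of seven, which is exactly the number of data shards.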
To check whether a storage is configured with erasure coding, run duplicacy -d list; it should report the number of data and parity shards:
Data shards: 5, parity shards: 2
The encoded chunk file starts with a 10-byte unique banner, followed by a 14-byte header containing the chunk size and parity parameters, the hash of each shard, the contents of the shards, and finally the 14-byte header again for redundancy:
-----------------------------
| duplicacy\0003 (10 bytes) |
-------------------------------------------------------------------------------------------------
| chunk size (8 bytes) | #data shards (2 bytes) | #parity shards (2 bytes) | checksum (2 bytes) |
-------------------------------------------------------------------------------------------------
| hash of data shard #1 (32 bytes) |
------------------------------------
...
| hash of parity shard #1 (32 bytes) |
--------------------------------------
...
| data shard #1 |
-----------------
...
| parity shard #1 |
-------------------
...
-------------------------------------------------------------------------------------------------
| chunk size (8 bytes) | #data shards (2 bytes) | #parity shards (2 bytes) | checksum (2 bytes) |
-------------------------------------------------------------------------------------------------
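The layout above can be turned into byte offsets. The following is a sketch only: the little-endian byte order is an assumption (the page does not state one), the shard size formula is inferred from the sample log rather than documented, and the checksum is read but not verified because its algorithm is not described here. All names are hypothetical.

```python
import struct

BANNER_SIZE = 10   # banner at the start of the file
HEADER_SIZE = 14   # 8 + 2 + 2 + 2 bytes
HASH_SIZE = 32     # one hash per shard

# Assumption: fields are little-endian; '<' also disables padding.
HEADER_FORMAT = "<QHHH"
assert struct.calcsize(HEADER_FORMAT) == HEADER_SIZE

def parse_header(header):
    """Return (chunk_size, data_shards, parity_shards, checksum) from 14 bytes."""
    return struct.unpack(HEADER_FORMAT, header)

def layout(chunk_size, data_shards, parity_shards):
    """Compute section offsets for an encoded chunk file."""
    shards = data_shards + parity_shards
    # Assumption inferred from the sample log: ceil(chunk_size / data_shards).
    shard_size = (chunk_size + data_shards - 1) // data_shards
    hashes_at = BANNER_SIZE + HEADER_SIZE
    shards_at = hashes_at + HASH_SIZE * shards
    trailer_at = shards_at + shard_size * shards
    return {
        "shard_size": shard_size,
        "hashes_at": hashes_at,
        "shards_at": shards_at,
        "trailing_header_at": trailer_at,
        "total_size": trailer_at + HEADER_SIZE,
    }

# Round-trip a header built by hand (checksum left as 0 for illustration).
header = struct.pack(HEADER_FORMAT, 1824550, 5, 2, 0)
print(parse_header(header))
print(layout(1824550, 5, 2))
```

Because the header is repeated at the end of the file, a reader that finds the leading copy damaged could fall back to the last 14 bytes to learn the shard counts.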