Add DESIGN doc - dedup - deduplicating backup program

git clone git://bitreich.org/dedup/ git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/dedup/

---
commit a7753b65b2b40ba265e30e8f2f0bda25da7baa53
parent 42797a877f6efb89a46968af77fe8ab9c0e335fa
Author: sin <sin@2f30.org>
Date:   Mon, 6 May 2019 01:00:49 +0100

Add DESIGN doc

Diffstat:
  A DESIGN | 51 +++++++++++++++++++++++++++++++

1 file changed, 51 insertions(+), 0 deletions(-)
---
diff --git a/DESIGN b/DESIGN
@@ -0,0 +1,51 @@
+Design notes
+============
+
+There are three main abstractions in the design of dedup:
+
+ - The chunker interface
+ - The snapshot layer
+ - The block layer
+
+The block layer
+---------------
+
+From the outside world, the block layer is just an abstraction for
+dealing with variable length blocks. All blocks are referenced with
+their hash.
+
+The block layer is arranged into a stack of layers. From top to
+bottom these are as follows:
+
+ - Generic layer
+ - The compression layer
+ - The encryption layer
+ - The storage layer
+
+The generic layer is the one that client code interfaces with. It is
+the top level entrypoint to the block layer.
+
+The compression layer will prepend a compression descriptor to the
+block and then compress the block using snappy or lz4. It is possible
+to disable compression in which case a special descriptor is prepended
+and the data is passed uncompressed to the layer below.
+
+The encryption layer will prepend an encryption descriptor to the
+block and then encrypt/authenticate the block using XChaCha20 and
+Poly1305. It is possible to disable encryption in which case it acts
+as a bypass with a special type of encryption descriptor.
+
+The storage layer will prepend a storage descriptor and append the
+descriptor and the data to a single backing file.
+
+The snapshot layer
+------------------
+
+The snapshot abstraction is currently very simplistic. A snapshot is
+a file under $repo/archive/. The contents of the file are the
+block hashes of the data stored in the snapshot.
+
+The chunker interface
+---------------------
+
+TBD
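
Illustrative sketch (not part of the commit)
--------------------------------------------

The layering described in DESIGN can be pictured with a small Python
model. This is only a sketch under stated assumptions, not dedup's
actual code: the descriptor layouts, the constant values, the use of
SHA-256 as the block hash, the in-memory index and the names Store,
put and get are all hypothetical. zlib stands in for snappy/lz4
because it ships with Python, and only the "encryption disabled"
bypass path is modelled.

```python
import hashlib
import io
import struct
import zlib

# Hypothetical descriptor values; the DESIGN text does not define them.
COMP_NONE, COMP_ZLIB = 0, 1   # zlib is a stand-in for snappy/lz4
ENC_NONE = 0                  # "encryption disabled" bypass descriptor


def compress_layer(data, enabled=True):
    """Prepend a 1-byte compression descriptor to the (maybe compressed) data."""
    if enabled:
        return bytes([COMP_ZLIB]) + zlib.compress(data)
    return bytes([COMP_NONE]) + data


def decompress_layer(buf):
    kind, payload = buf[0], buf[1:]
    if kind == COMP_ZLIB:
        return zlib.decompress(payload)
    if kind == COMP_NONE:
        return payload
    raise ValueError("unknown compression descriptor")


def encrypt_layer(buf):
    """Bypass mode only: prepend a descriptor and pass the data through."""
    return bytes([ENC_NONE]) + buf


def decrypt_layer(buf):
    if buf[0] != ENC_NONE:
        raise ValueError("unknown encryption descriptor")
    return buf[1:]


class Store:
    """Generic layer on top of an append-only backing file.

    The hash -> offset index kept in memory is an assumption made so the
    example can read blocks back; DESIGN does not describe an index.
    """

    def __init__(self, backing):
        self.backing = backing
        self.index = {}

    def put(self, data):
        digest = hashlib.sha256(data).digest()
        if digest in self.index:
            return digest  # block already stored, nothing is appended
        payload = encrypt_layer(compress_layer(data))
        self.backing.seek(0, io.SEEK_END)
        offset = self.backing.tell()
        # Hypothetical storage descriptor: 32-byte hash + 8-byte length.
        self.backing.write(digest + struct.pack("<Q", len(payload)) + payload)
        self.index[digest] = offset
        return digest

    def get(self, digest):
        self.backing.seek(self.index[digest])
        header = self.backing.read(32 + 8)
        if header[:32] != digest:
            raise ValueError("storage descriptor does not match hash")
        (size,) = struct.unpack("<Q", header[32:])
        return decompress_layer(decrypt_layer(self.backing.read(size)))
```

In this model a snapshot is just the ordered list of hashes returned
by put(), which mirrors the snapshot file holding block hashes.
Restoring means calling get() on each hash in order and concatenating
the results.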