content addressed replication

Supercedes #45 

In this replication variant, pages are replicated based on their content hash. Thus the current state of the file can be computed from the content store (provides access to pages by their content hash) and the current snapshot (list of content hashes).

# Hash algorithms
- [see experiments repo](https://github.com/orbitinghail/experiments)
- [blake3](https://github.com/BLAKE3-team/BLAKE3) is the safest choice which outputs cryptographic 32 byte hashes
- [polymur](https://github.com/jedisct1/rust-polymur-hash) is a faster choice which outputs 8 byte hashes at a collision rate of `n * 2^-60.2` where n is the input size in bytes

# snapshot size
assuming an object size of 4k and a hash output size of 32 bytes, a 1GB database snapshot will consume 8MB

# notes on efficient replication
- clients always want to update to the latest snapshot
- ideally we are able to only store changes to the snapshot rather than re-storing the full snapshot
- a tree based structure can allow for efficient structural sharing (conceptually similar to a merkel tree)
- blake3 is a tree based hash function that internally computes a tree of hashes over 1kb chunks, and [bao](https://github.com/oconnor663/bao) is an application of blake3 for verifiable streaming/seeking without needing full access to the data
- one of the authors of Blake3 is currently [factoring out the needed pieces of the blake3 crate](https://github.com/BLAKE3-team/BLAKE3/tree/master/rust/guts) for me to sit my replication algorithm directly on top of it (I think)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

content addressed replication #49

Hash algorithms

snapshot size

notes on efficient replication

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

content addressed replication #49

Description

Hash algorithms

snapshot size

notes on efficient replication

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions