hackernews client

bibiver

4 months ago

We've just released rs-merkle-tree, a Merkle tree crate designed with performance and modularity in mind. It comes with the following key features:

* Fixed depth: All proofs have a constant size equal to the depth of the tree. The depth can be configured via a const generic.

* Append-only: Leaves are added sequentially starting from index 0. Once added, a leaf cannot be modified.

* Optimized for Merkle proof retrieval: Intermediate nodes are stored so that proofs can be fetched directly from storage without recomputation, resulting in very fast retrieval times.

* Configurable storage and hash functions: Currently supports Keccak and Poseidon hashers, and in-memory, Sled, RocksDB, and SQLite stores.

The Rust ecosystem already offers several Merkle tree implementations, but rs-merkle-tree is built for a specific use case: append-only data structures such as blockchains, distributed ledgers, audit logs, or certificate transparency logs. It’s particularly optimized for proof retrieval, storing intermediate nodes in a configurable and extensible storage backend so they don’t need to be recomputed when requested.

xpe

4 months ago

For those in the know... Do you have any recommended reading for Merkle-curious people that may have some scars from blockchain-related hype? It would be nice to find a nice introductory resource that covers some interesting use-cases that may not be well-known. Perhaps with nice graphics, interactivity, etc.?

For example, here is a slick web site that shows how certificate transparency uses Merkle trees: https://certificate.transparency.dev/howctworks/

sunshowers

4 months ago

The joke is there are two kinds of distributed systems: Spanner and Bitcoin.

xpe

4 months ago

I'm currently exploring how different APIs compute ETags [1]. I'm inclined to think the overhead of Merkle-trees make them relatively less useful for ETags for paginated APIs where responses are small and flat hashing is fast. One rule of thumb would be: the overhead isn't justified unless you need either (a) efficient updates on very large collections, or (b) collection-level state tracking across pagination boundaries. Sound about right?

[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...

Also, ETags can use weak validators:

> W/ (case-sensitive) indicates that a weak validator is used. Weak ETags are easy to generate, but are far less useful for comparisons. Strong validators are ideal for comparisons but can be very difficult to generate efficiently. Weak ETag values of two representations of the same resources might be semantically equivalent, but not byte-for-byte identical. This means weak ETags prevent caching when byte range requests are used, but strong ETags mean range requests can still be cached.

raphinou

4 months ago

I'm working on a project where I need to prove that a file in a git repo is append only, ie all changes to the file only added lines. The only way I can think of is looking at the git history of the file, but would there be a faster way using Merkel trees somehow?

jmount

4 months ago

Is this trying to deal with that git history can be replaced?

raphinou

4 months ago

We handle the case to prevent git history rewrites, but an alternative solution could be adopted if less cumbersome.

6r17

4 months ago

There are use-cases outside of "crypto" for blockchain - notably for security related use-cases such as historization ; tough I concede that use-case can be handled with more standard technologies it is very fun to get into it and study such use as it may have properties that could be interesting. One has to have some time for this but to be frank it's not really something super hard to pull-off (Bc itself conceptually)

tialaramex

4 months ago

I think the question, certainly the question I have - is, what are some use cases outside of "crypto" and the CT log ?

kevinastone

4 months ago

Git and other vcs are generally built on merkle-trees for consistent hashing.

There's a list of other common hashing cases on the Wikipedia entry: https://en.wikipedia.org/wiki/Merkle_tree#Uses

trn217

4 months ago

You can make use of merkle-trees when ever you want to proof the data integrity of large amounts of individually independent data (distributed FS etc). After playing arround with SMTs for a bit, possible use-cases came to mind quite frequently.

vlovich123

4 months ago

The challenge to me is that the ability to prove the data integrity is quite hard once you store the data elsewhere, particularly on a single disk - I don’t know that people are regularly reading and reevalidating the stored data and likely not on every I/o operation.

geopert

4 months ago

[dead]

NoahZuniga

4 months ago

How are you supposed to show that two tree heads are consistent?

Retr0id

4 months ago

Isn't the whole point of a merkle tree that you just... compare them?

NoahZuniga

4 months ago

Blockchains are an alternative way to make an append only data structure, so it's not clear why you would want to use a merkle tree to create a block chain.

infogulch

4 months ago

Yes, the term block chain does has a specific technical meaning: a sequence of hashed values where each contains the hash of the previous, similar to the way a git commit includes the hash of its parent commit. But the term blockchain has also taken on a broader colloquial meaning of "log with certain cryptographic properties", which both block chain and merkle tree implementations can satisfy with various advantages and disadvantages. I think it's fair to allow usage of the broader definition.

tromp

4 months ago

A blockchain can use Merkle trees in several ways. First, the block header usually contains the Merkle root of all transactions. This makes the header, and in turn the Proof-of-Work, commit to all txs, which helps make the chain immutable. Second, the header can also directly commit to the UTXO set, simplifying the design of light clients (sadly, Bitcoin doesn't do this).

dafelst

4 months ago

Doesn't bitcoin use merkle-trees internally to create its blockchain?

geopert

4 months ago

[dead]

vlovich123

4 months ago

Nice. Since I/O is typically asynchronous, any ideas on how an async storage backend could be integrated elegantly?

Also do you have any benchmarks in the repo that would stress the I/O layer for latency too?

Build something useful in Rust and not mention that it's in fact built in Rust.

Difficulty:

Impossible

Our modular, high-performance Merkle Tree library for Rust

32 Comments

bibiver

xpe

sunshowers

xpe

raphinou

jmount

raphinou

6r17

tialaramex

kevinastone

trn217

vlovich123

geopert

NoahZuniga

Retr0id

NoahZuniga

infogulch

tromp

dafelst

geopert

vlovich123

bibiver

geopert

Retr0id

bibiver

gritzko

geopert

anonymousDan

wngr

bibiver

N_Lens

torginus