Concurrent Hash Table Designs

87 pointsposted a month ago
by signa11

15 Comments

mrkeen

a month ago

  public V get(Object key) { synchronized (mutex) { ... } }
  public V put(K key, V value) { synchronized (mutex) { ... } }
  public V remove(Object key) { synchronized (mutex) { ... } }
> From a correctness standpoint, this strategy is almost trivial to reason about.

It is indeed trivial to reason about. This design invites users to stumble into the race condition between get and set.

PaulHoule

a month ago

What a good transactional API is like is an interesting question.

   public void modify(K key, Function<Optional<V>,Optional<V>> modifier)
or something along those lines covers the race of a single value being changed out from under you, but if you want to update one or more keys you might want

   public void modifyMany(Set<K> key, Consumer<Map<K,V>> modifier)
where the modifier gets passed a modifiable view that has just the keys in the set.

There is also the Clojure-style immutable API for maps though I can say I never really enjoyed working with that the way I enjoyed immutable lists (which are Lispier than Lisp: I went through all the stages of grief reading On Lisp and couldn't help think "If he was using Clojure he wouldn't be struggling with nconc)

mrkeen

a month ago

STM containers does it right.

  lookup :: k -> Map K v -> STM (Maybe v)

  insert :: v -> k -> Map K v -> STM ()
It won't execute until you wrap it with atomically, so you're forced to be explicit about whether these things should happen in one logical step or separately.

https://hackage-content.haskell.org/package/stm-containers/d...

user

a month ago

[deleted]

eterm

a month ago

There's a design in the .NET ConcurrentDictionary that is even more inviting:

  GetOrAdd(TKey, Func<Tkey,TValue> )
This lets you specify a key, and a method to run to generate a value to store if the key does not exist.

This is thread-safe, however it is not atomic much to the surprise of many that use it.

If you wrongly assume that the factory method can only execute once, you fall into a trap, since it actually can execute many times before the result is stored.

This can cause a problem if you use it for caching, because you might for example put an expensive calculation you want to cache there, only to find out cache expiration causes a stampede as the factory suddenly executes a lot at once.

the-smug-one

a month ago

I don't see the problem at all, tbh. Both `get(K); put(K);` and `put(K); get(K);` are valid execution traces on a uniprocessor.

mrkeen

a month ago

Oh yeah, I'm not sure what an execution trace is, but ConcurrentHashMap is indeed fit-for-purpose for single-threaded use.

nickmonad

a month ago

Would love to read this, although I'm seeing some pretty horrific code formatting issues in both Firefox and Chrome.

ramon156

a month ago

weirdest part is, the initial load looks fine (try refreshing, scroll persists).

firefox's reader view helps too

manwe150

a month ago

I saw a blog about this yesterday: disable extensions (notably 1Password) to fix formatting inside code tags.

pama

a month ago

Looks decent on iPhone safari FWIW.