From 19k to 4.2M events/sec: story of a SQLite query optimisation

5 pointsposted 5 months ago
by Hywan

3 Comments

idoubtit

5 months ago

I don't understand why they didn't try to alter the ON condition. Instead of moving the JOIN into a per-field-subquery or a preliminary query, why not replace

    LEFT JOIN event_chunks as ec
      ON ec.chunk_id = lc.id
with

    LEFT JOIN event_chunks as ec
      ON case lc.type
        WHEN 'E' THEN ec.chunked_id = lc.id
        ELSE 0
        END
I don't know if it more performant with Sqlite's query planner (and it may depend on many conditions, e.g. strict tables), but it looked like the obvious fix to the initial query. And it's 100% SQL, no need for a Rust hashmap.

Hywan

4 months ago

Thanks for mentioning this other possibility! To be honest, I've discovered this possibility from you. Sadly, it doesn't improve the performance that much. We get the following performance:

- time: 98ms - throughput: 101Kelem/s

which is way slower than the 2.3ms and 4.18Melem/s we have with our final solution. It's even slower than the `CASE` without the `LEFT JOIN` (which was 39ms and 251Kelem/s).

Again, thanks for having pointed that out!

BinaryIgor

5 months ago

These are exactly the problems that often make software maintenance actually interesting :)