Logbookd – SQLite Backed Syslogd

77 points | posted 5 days ago by tosh

32 Comments

pzmarzly

5 days ago

  -g kB       Remove old log lines when the in-memory database crosses x kB
Seems like garbage collection is only implemented for the in-memory database (by reading SQLITE_DBSTATUS_CACHE_USED). Maybe logrotate could be set up to do it instead, but nothing in the documentation indicates so.

Otherwise looks like a great project.

thebeardisred

5 days ago

What's the maximum write speed? At what point do you start losing log messages?

cryptonector

5 days ago

You can amortize the write cost significantly by not committing often, either in the sense of a SQL `COMMIT` or in the sense of doing a _synchronous_ `COMMIT`. You could commit every N seconds, say, for some sufficiently large N, or you could commit after N seconds of idle time and no more than M seconds since the last commit. You can also disable `fsync()`, commit often, and re-enable `fsync()` once every N seconds. There are many tactics you can use for data where some loss due to power failure is tolerable.
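For illustration, here is a minimal sketch of the periodic-commit tactic using Python's sqlite3 module; the `logs` table, its schema, and the 5-second interval are invented for the example, not anything logbookd actually does:

  import sqlite3
  import time

  conn = sqlite3.connect("logs.db", isolation_level=None)  # manage transactions manually
  conn.execute("CREATE TABLE IF NOT EXISTS logs (ts REAL, msg TEXT)")

  COMMIT_INTERVAL = 5.0  # N: tolerate losing up to ~5s of log lines on power failure
  last_commit = time.monotonic()
  conn.execute("BEGIN")

  def append(msg):
      global last_commit
      conn.execute("INSERT INTO logs VALUES (?, ?)", (time.time(), msg))
      # One synchronous COMMIT amortized over every line logged in the window.
      if time.monotonic() - last_commit >= COMMIT_INTERVAL:
          conn.execute("COMMIT")
          conn.execute("BEGIN")
          last_commit = time.monotonic()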

I.e., you can probably get pretty close to the storage device's max sustained write throughput, though with some losses to write amplification from, e.g., the B-tree itself and any indices you might want to maintain.

B-tree write amplification can be amortized by committing infrequently (which is why I listed that tactic _first_ above). Though there should be little need, because SQLite3 already amortizes B-tree write amplification by using a write-ahead log (WAL), so be sure to enable the WAL for this sort of application.
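Enabling it is a one-time pragma. Continuing the hypothetical sketch above:

  conn.execute("PRAGMA journal_mode=WAL")    # persistent; stored in the database file
  conn.execute("PRAGMA synchronous=NORMAL")  # in WAL mode: sync at checkpoints, not per commit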

Index write amplification can be amortized by partitioning your tables by time range, using a VIEW to unify the partitions, and creating an index on each partition only once it closes to new log entries. This approach makes searches over the newest log entries slower, but those will probably all fit in memory, so it's not a problem if you have a large enough page cache.
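A sketch of that layout in SQL, with invented partition and index names, run through the same hypothetical connection:

  # One table per closed time range, plus the currently-open partition.
  conn.execute("CREATE TABLE IF NOT EXISTS logs_2024_09_14 (ts REAL, msg TEXT)")
  conn.execute("CREATE TABLE IF NOT EXISTS logs_current (ts REAL, msg TEXT)")

  # Readers query one unified name.
  conn.execute("""
      CREATE VIEW IF NOT EXISTS logs_all AS
          SELECT * FROM logs_2024_09_14
          UNION ALL
          SELECT * FROM logs_current
  """)

  # Index a partition once, after it stops receiving new entries.
  conn.execute(
      "CREATE INDEX IF NOT EXISTS logs_2024_09_14_ts ON logs_2024_09_14 (ts)")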

Now I've not built anything like this so I can't say for sure, but I suspect that one could get very aggressive with these techniques and reach a sustained write rate of around 75% of the storage device(s)' sustained write rate.

singron

5 days ago

Turning off fsync is pretty dangerous since a crash could corrupt the database. You might think you would just lose a couple seconds of data, but that's only true if writes are applied in order.

E.g., if some data is moved from page A to page B, you normally write B with the new data, fsync, and then write A without the data. Without fsync, you might only write page A, and you would lose that data. This might happen on an internal data structure and corrupt the whole database.

cryptonector

5 days ago

You could also disable sync writes for the WAL but enable them for the main DB.
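(In stock SQLite that is roughly what `PRAGMA synchronous=NORMAL` gives you in WAL mode: individual commits to the WAL are not synced, but the WAL is synced before each checkpoint writes pages back into the main database file.)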

bubblesnort

5 days ago

I don't think this is going to be an issue as Linux has a built-in rate limiter.

thebeardisred

3 days ago

This is a core design challenge for all logging systems. It's why there are mechanisms for intentionally dropping messages to relieve queue pressure, and optimizations around the use of io_uring. Conversely, the fact that logging systems can drop messages is one of the primary reasons "MARK"-type mechanisms exist (https://lists.debian.org/debian-user/1998/09/msg00915.html).

cryptonector

5 days ago

That's not GP's question. GP wants to know how high the write rate can be regardless of the systemd log rate limiter, likely so as to be able to increase that rate limit!

Spivak

5 days ago

I'm actually kinda surprised they went with SQLite here: log messages are about the most trivial data format there is, and there's no way you can't beat SQLite's speed by just not having database logic in the middle at all. Just being able to BYOAllocator for the logs themselves, with such predictable linear memory usage, would make this thing scream.

WhyNotHugo

5 days ago

The advantage of SQLite is being able to perform queries like “logs for services X and Y yesterday between 15:00 and 16:00”.
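Assuming a hypothetical `logs (ts, service, msg)` table and an open sqlite3 connection `conn` (the schema and service names are made up), that query is a single SELECT:

  import datetime as dt

  yesterday = dt.date.today() - dt.timedelta(days=1)
  start = dt.datetime.combine(yesterday, dt.time(15)).timestamp()
  end = dt.datetime.combine(yesterday, dt.time(16)).timestamp()

  rows = conn.execute(
      "SELECT ts, service, msg FROM logs"
      " WHERE service IN (?, ?) AND ts BETWEEN ? AND ?"
      " ORDER BY ts",
      ("sshd", "exim4", start, end)).fetchall()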

nolist_policy

5 days ago

  journalctl -u ssh.service -u exim4.service --since='2024-09-14 15:00' --until='2024-09-14 16:00'
(Systemd accepts e.g. 'yesterday' as a timestamp but not together with a time)

cryptonector

5 days ago

You're sort of right, in that a B-tree is not a good data structure for logs given that append-only files are perfect for them. But the point of using an RDBMS for logs is to be able to (a) index the logs and (b) provide a great search facility over them. Perhaps a better design would be a virtual table plugin for SQLite3 that lets one use log files as tables, then index and search them with SQLite3; but if one lacks the time to investigate that approach, one can't be faulted for using SQLite3 directly.

ivzhh

5 days ago

Agreed. /var/log/messages has been around for a long time; writing logs has never been the problem. Digesting logs is the niche, and it's profitable enough that there are plenty of tools in this market (rotation, transmission, parsing, etc.).

simscitizen

5 days ago

How does this work exactly? Is every log line a separate transaction in autocommit mode? Because I don't see any begin/commit statements in this codebase so far...

nine_k

5 days ago

Maybe autocommit mode is set? I'd expect that.

simscitizen

4 days ago

Inserting a new row for each log line in autocommit mode would be absurdly inefficient compared to just appending a log line to a file.
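A rough micro-benchmark sketch of that gap (the schema is made up, and absolute numbers depend entirely on the disk and filesystem):

  import sqlite3
  import time

  def timed_insert(path, n, batched):
      conn = sqlite3.connect(path, isolation_level=None)  # autocommit unless we BEGIN
      conn.execute("CREATE TABLE logs (ts REAL, msg TEXT)")
      t0 = time.monotonic()
      if batched:
          conn.execute("BEGIN")  # one transaction, one durable commit for all rows
      for i in range(n):
          # unbatched: every INSERT is its own durably-committed transaction
          conn.execute("INSERT INTO logs VALUES (?, ?)", (time.time(), f"line {i}"))
      if batched:
          conn.execute("COMMIT")
      conn.close()
      return time.monotonic() - t0

  print("per-line autocommit:", timed_insert("a.db", 1000, batched=False))
  print("single transaction: ", timed_insert("b.db", 1000, batched=True))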

marcrosoft

5 days ago

I did something similar, but not open source: centrallogging.com. It is surprising how well SQLite can scale for smallish amounts of logs (1 TB).

juvenn

4 days ago

Why not use DuckDB? It is a columnar database, and (it seems to me) better suited to log entries.

righthand

5 days ago

This looks right up my alley. I am experimenting to see how much systemd I can strip from my everyday laptop, as an exercise in futility and to understand how deeply embedded systemd has become in a distribution like Debian.

UI_at_80x24

5 days ago

I stopped trying to swim against the current and switched to OpenBSD/FreeBSD. You might be surprised how viable it really is.

koeng

5 days ago

It’s really a shame that OpenBSD doesn’t have a good file system; otherwise I’d use it for my production systems. (I could run it inside Proxmox, with ZFS outside for stability inside the VM, but I like to run bare metal, no VMs.)

ivzhh

5 days ago

I tried to install OpenBSD on three VM hosts: an Intel MacBook, a FreeBSD box, and a KVM host. OpenBSD failed to install in all three environments; the latter two crashed during the file-system creation stage.

nine_k

5 days ago

I run Void Linux on my laptop; it exhibits many BSD-esque approaches (and lacks systemd) while also being a rolling-release distro, with fresh stuff usually appearing within a few days of an upstream release.

Pretty fine so far (about 6 years).

righthand

5 days ago

That’s my concern. The Debian experience is usually pretty lovely and I’d hate to leave it behind, but maybe there’s no point in fiddling with something stuck in trends.

yjftsjthsd-h

5 days ago

I think Alpine would be a smaller jump?

ivzhh

5 days ago

Yes, Alpine is a good choice for small servers. I run both FreeBSD and Alpine in my homelab; Alpine feels very close to FreeBSD in style. I still prefer pf over ufw/awall.
