Logbookd – SQLite Backed Syslogd

77 points · posted 10 months ago
by tosh

32 Comments

pzmarzly

10 months ago

  -g kB       Remove old log lines when the in-memory database crosses x kB
Seems like garbage collection is only implemented for the in-memory database (by reading SQLITE_DBSTATUS_CACHE_USED). Maybe logrotate could be set up to do it instead, but nothing in the documentation indicates so.

Otherwise looks like a great project.

thebeardisred

10 months ago

What's the maximum write speed? At what point do you start losing log messages?

cryptonector

10 months ago

You can amortize the write speed significantly by not committing often, either in the sense of a SQL `COMMIT` or in the sense of doing a _synchronous_ `COMMIT`. You could commit every N seconds, say, for some sufficiently large N, or you could commit after N seconds of idle time and no more than M seconds since the last commit. You can also disable `fsync()`, commit often, and re-enable `fsync()` once every N seconds. There are many tactics you can use for data where some loss due to power failure is tolerable.
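The interval-based commit strategy described above can be sketched in a few lines. This is a minimal illustration, not logbookd's actual code; the table schema and the `COMMIT_INTERVAL` value are made up, and an in-memory database stands in for a real file:

```python
import sqlite3
import time

COMMIT_INTERVAL = 5.0  # hypothetical N seconds between commits

# In-memory DB for the demo; a real logger would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (ts REAL, line TEXT)")

last_commit = time.monotonic()

def write_log(line: str) -> None:
    """Insert a log line, committing at most once per COMMIT_INTERVAL."""
    global last_commit
    conn.execute("INSERT INTO logs VALUES (?, ?)", (time.time(), line))
    # Amortize the commit/fsync cost: only COMMIT when the interval elapsed.
    if time.monotonic() - last_commit >= COMMIT_INTERVAL:
        conn.commit()
        last_commit = time.monotonic()

for i in range(10):
    write_log(f"message {i}")
conn.commit()  # flush the buffered tail on shutdown
```

The trade-off is exactly the one described: anything inserted since the last commit is lost on a crash, which is often acceptable for logs.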

I.e., you can probably get pretty close to the storage device's max sustained write throughput, though with some losses to write amplification from the B-tree structure and from any indices you might want to maintain.

B-tree write amplification can be amortized by committing infrequently (which is why I listed that _first_ above). Though there should be no need, because SQLite3 already amortizes B-tree write amplification by using a write-ahead log (WAL), so be sure to enable the WAL for this sort of application.

Write amplification due to indices can be amortized by partitioning your tables by time range, using a VIEW to unify the partitions, and creating an index on each partition only when it closes to new log entries. This approach makes searches over the newest log entries slow, but those will probably all fit in memory, so it's not a problem if you have a large enough page cache.
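A minimal sketch of that partitioning scheme (the table names and schema are invented for illustration): each partition is a plain table, a VIEW unions them, and an index is built on a partition only once it stops receiving entries, off the hot write path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One table per time range (hypothetical monthly partitions).
CREATE TABLE logs_2024_09 (ts INTEGER, msg TEXT);
CREATE TABLE logs_2024_10 (ts INTEGER, msg TEXT);

-- A VIEW presents the partitions as one logical table for queries.
CREATE VIEW logs AS
    SELECT * FROM logs_2024_09
    UNION ALL
    SELECT * FROM logs_2024_10;
""")

conn.execute("INSERT INTO logs_2024_09 VALUES (1, 'old entry')")
conn.execute("INSERT INTO logs_2024_10 VALUES (2, 'new entry')")

# The September partition is closed to new writes: index it now, once.
conn.execute("CREATE INDEX idx_logs_2024_09_ts ON logs_2024_09 (ts)")

rows = conn.execute("SELECT msg FROM logs ORDER BY ts").fetchall()
print(rows)  # [('old entry',), ('new entry',)]
```

Writes into the open partition never pay index-maintenance cost; only the closed partitions carry indices.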

Now I've not built anything like this so I can't say for sure, but I suspect that one could get very aggressive with these techniques and reach a sustained write rate of around 75% of the storage device(s)' sustained write rate.

singron

10 months ago

Turning off fsync is pretty dangerous since a crash could corrupt the database. You might think you would just lose a couple seconds of data, but that's only true if writes are applied in order.

E.g., if some data is moved from page A to page B, you normally write B with the new data, fsync, and then write A without the data. Without fsync, you might only write page A and you would lose that data. This might happen to an internal data structure and corrupt the whole database.

cryptonector

10 months ago

You could also disable sync writes for the WAL but enable them for the main DB.
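In SQLite this roughly corresponds to WAL mode with `PRAGMA synchronous=NORMAL`: commits no longer fsync the WAL, but the database file is still synced at checkpoint time, so a power loss can drop recent commits without corrupting the main DB. A minimal sketch (WAL mode needs an on-disk database, so a temp file stands in for a real path):

```python
import os
import sqlite3
import tempfile

# WAL mode is unavailable on :memory: databases, so use a temp file.
path = os.path.join(tempfile.mkdtemp(), "logs.db")
conn = sqlite3.connect(path)

# Enable the write-ahead log; the pragma returns the resulting mode.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# NORMAL: skip the per-commit WAL fsync, keep the checkpoint fsync,
# so the main DB stays consistent even if recent commits are lost.
conn.execute("PRAGMA synchronous=NORMAL")
print(mode)  # 'wal'
```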

bubblesnort

10 months ago

I don't think this is going to be an issue as Linux has a built-in rate limiter.

thebeardisred

10 months ago

This is a core design challenge for all logging systems. It is why there are mechanisms for intentionally dropping messages to relieve queue pressure, and optimizations around the use of io_uring. Conversely, because logging systems can drop messages, this is one of the primary reasons for "MARK"-type mechanisms (https://lists.debian.org/debian-user/1998/09/msg00915.html).

cryptonector

10 months ago

That's not GP's question. GP wants to know how high the write rate can be regardless of the systemd log rate limiter, likely so as to be able to increase that rate limit!

Spivak

10 months ago

I'm actually kinda surprised they went with SQLite here; log messages are about the most trivial data format there is, and there's no way you can't beat SQLite's speed by just not having database logic in the middle at all. Just being able to BYOAllocator for the logs themselves, with such predictable linear memory usage, would make this thing scream.

WhyNotHugo

10 months ago

The advantage of SQLite is being able to perform queries like "logs for services X and Y yesterday between 15:00 and 16:00".

nolist_policy

10 months ago

  journalctl -u ssh.service -u exim4.service --since='2024-09-14 15:00' --until='2024-09-14 16:00'
(Systemd accepts e.g. 'yesterday' as a timestamp but not together with a time)

cryptonector

10 months ago

You're sort of right in that a B-tree is not a good data structure for logs, given that append-only files are perfect for them. But the point of using an RDBMS for logs is to be able to (a) index the logs and (b) provide a great search facility over them. Perhaps a better design would be a virtual table plugin for SQLite3 that allows one to use log files as tables, then index and search them with SQLite3, but if one lacks the time to investigate that approach then one can't be faulted for using SQLite3 directly.

ivzhh

10 months ago

Agreed. /var/log/messages has been around for a long time; writing logs has never been the problem. Digesting the logs is the niche, and it is profitable enough that there are a lot of tools in this market (rotation, transmission, parsing, etc.).

simscitizen

10 months ago

How does this work exactly? Is every log line a separate transaction in autocommit mode? I don't see any BEGIN/COMMIT statements in this codebase so far...

nine_k

10 months ago

Maybe autocommit mode is set? I'd expect that.

simscitizen

10 months ago

Inserting a new row for each log line in autocommit mode would be absurdly inefficient compared to just appending a log line to a file.
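The cost difference is easy to see in a sketch (the schema here is invented): committing per row pays the transaction overhead N times, while batching pays it once per batch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (line TEXT)")

# Per-line "autocommit" style: one transaction (and, on disk, one
# fsync) per row.
for i in range(3):
    conn.execute("INSERT INTO logs VALUES (?)", (f"slow {i}",))
    conn.commit()

# Batched style: one transaction for the whole batch.
conn.executemany("INSERT INTO logs VALUES (?)",
                 [(f"fast {i}",) for i in range(3)])
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(count)  # 6
```

On a real file-backed database the per-row version is bounded by fsync latency, which is what makes it so much slower than appending lines to a file.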

marcrosoft

10 months ago

I did something similar, but not open source: centrallogging.com. It is surprising how far SQLite can scale for smallish amounts of logs (1 TB).

juvenn

10 months ago

Why not use DuckDB? It is a columnar database, and (it seems to me) better suited to log entries.

righthand

10 months ago

This looks right up my alley. I am experimenting to see how much systemd I can strip from my everyday laptop, as an exercise in futility and to understand how entangled with it a distribution like Debian has become.

UI_at_80x24

10 months ago

I stopped trying to swim against the current and switched to OpenBSD/FreeBSD. You might be surprised how viable it really is.

koeng

10 months ago

It’s really a shame that OpenBSD doesn’t have a good file system; otherwise I’d use it for my production systems. (I could run it inside Proxmox, with ZFS outside for stability, but I like to run bare metal, no VMs.)

ivzhh

10 months ago

I tried to install OpenBSD on three VM hosts: an Intel MacBook, a FreeBSD host, and a KVM host. OpenBSD failed to install in all three environments; the latter two crashed during the file-system creation stage.

nine_k

10 months ago

I run Void Linux on my laptop, which seems to exhibit many BSD-esque approaches (and lacks systemd), while also being a rolling-release distro where fresh stuff usually appears within a few days of an upstream release.

Pretty fine so far (about 6 years).

righthand

10 months ago

That’s my concern, the Debian experience is usually pretty lovely and I’d hate to leave it behind, but maybe there’s no point in fiddling with something stuck in trends.

yjftsjthsd-h

10 months ago

I think Alpine would be a smaller jump?

ivzhh

10 months ago

Yes, Alpine is a good choice for small servers. I run both FreeBSD and Alpine for my homelab. Alpine feels very close to FreeBSD style. I still prefer pf over ufw/awall.
