krunck
4 days ago
My way of doing things:
1. Scripts should always return an error (>0) when things did not go as planned and 0 when they did. Always.
2. Scripts should always notify you when they return >0. Either in their own way or via emails sent by Cron.
3. Use chronic (from the Debian moreutils package) to ensure that cron jobs only email output when they end in error. That way you don't need to worry about things sent to STDOUT spamming you.
4. Create wrapper scripts for jobs that need extra functionality: notification, logging, or sanity checks.
BlackPearl02
4 days ago
Those are all solid practices! I use chronic too, and proper exit codes are essential.
The gap I found is that even with all of that, you can still have "successful" jobs (exit code 0, no errors) that produce wrong results. Like a backup script that runs successfully but creates empty files because the source directory was empty, or a sync that only processes 10% of records because of a logic bug.
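To make that concrete, here is a minimal sketch of the kind of post-run check that would catch both cases. The function names, the size threshold, and the 90% ratio are hypothetical choices for illustration, not part of any particular tool.

```python
import os


def check_output_file(path, min_bytes=1):
    """Return a list of problems with an output file (empty list = looks sane)."""
    if not os.path.isfile(path):
        return [f"missing output: {path}"]
    size = os.path.getsize(path)
    if size < min_bytes:
        return [f"output too small: {path} is {size} bytes, expected >= {min_bytes}"]
    return []


def check_record_count(processed, expected, min_ratio=0.9):
    """Flag a run that processed far fewer records than expected.

    min_ratio is an assumed tolerance; pick whatever fits the job.
    """
    if expected > 0 and processed < expected * min_ratio:
        return [f"only {processed} of {expected} records processed"]
    return []
```

A wrapper script could run these after the job and exit non-zero if either list is non-empty, so the usual exit-code alerting picks it up.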
But there's another issue: chronic doesn't detect when cron jobs don't run at all. If your crontab gets corrupted, the server time changes, or the cron daemon stops, chronic won't alert you because there's no output to email.
That's why I built result validation monitoring - it expects a ping after your job completes. If the ping doesn't arrive (job didn't run, crashed before completion, etc.), it alerts. Plus it validates the actual results (file sizes, record counts, content validation) and alerts if they don't match expectations.
Works alongside chronic/exit codes, but adds detection for jobs that never executed and validation of the actual outputs.
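The "expects a ping" part boils down to a dead man's switch. A rough sketch of the core check, assuming the monitor stores the time of the last ping and knows the job's schedule interval (the name `is_overdue` and the 5 minute grace period are made up for this example, and this is not the actual implementation):

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_ping, now, interval, grace=timedelta(minutes=5)):
    """Return True when no ping arrived within interval + grace.

    last_ping: datetime of the most recent ping, or None if never pinged.
    interval:  how often the job is scheduled to run (timedelta).
    grace:     assumed slack for slow runs before alerting.
    """
    if last_ping is None:
        return True
    return now - last_ping > interval + grace
```

The job only sends its ping as the last step after validation passes, so a job that never started, crashed halfway, or produced bad output all look the same to the monitor: the ping is missing and the alert fires.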
PenguinCoder
4 days ago
+1 for chronic. Very useful for knowing when a cron job fails without needing to manually review the logs of every run.
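For anyone who hasn't used it, the idea is roughly the following. This is only an approximation in Python of the behaviour described in this thread (stay silent on success, show the output on failure), not chronic's own code, and `run_quietly` is a made-up name.

```python
import subprocess
import sys


def run_quietly(cmd):
    """Run cmd and return (exit_code, output_to_show).

    Output is captured and only handed back when the command fails,
    so a successful run gives cron nothing to email.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured output
        text=True,
    )
    shown = proc.stdout if proc.returncode != 0 else ""
    return proc.returncode, shown
```

A cron wrapper would print `shown` and exit with the returned code, which keeps the original exit status intact for anything else watching it.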