> For something like tax software, you should have people on call, or even 24/7 staffing, for that specific week.
In my country, the tax system (EDS, Electronic Declaration System) is down pretty much every single year on the day when tax declaration submissions start.
2020: "SRS: Significantly increasing EDS capacity is expensive and not cost-effective" https://www-lsm-lv.translate.goog/raksts/zinas/ekonomika/vid...
2022: "SRS urges not to rush to submit annual income tax returns so as not to overload the EDS" https://www-lsm-lv.translate.goog/raksts/zinas/ekonomika/vid...
2023: "The SRS urges not to rush to submit income tax returns in the first days of March" https://www-lsm-lv.translate.goog/raksts/zinas/ekonomika/vid...
2025: "A virtual queue will be open this year for submitting annual income tax returns to the SRS" https://www-lsm-lv.translate.goog/raksts/zinas/ekonomika/28....
So basically their "solution" for the longest time was to just tell people that making the system highly available is too expensive and that they shouldn't use it in the first days of the submission period. Eventually they just added a queue in front of the system to manage the concurrent users.
It seems that taxes still get handled correctly and that nobody really cares that much. Found this to be an interesting example of going against the established culture of trying to go above and beyond for availability, even if I scoffed at it a few years ago.
It definitely wouldn't be horrible to live in a world where a prod outage doesn't mean "Sorry wife, I'm not coming home today, will be stuck in some random war room for hours and then fudge up the groceries massively due to sleep deprivation" but rather "Sorry boss, the system is down, what a bummer. I'll look into it tomorrow at 9 AM." for pretty much anything aside from truly critical and time sensitive systems (e.g. air traffic control, as opposed to your music streaming app).
If it's down on the day that submissions open, then don't rush it. But when the window is closing there are thousands of dollars at stake for millions of people and I consider that pretty critical. It's not a generic outage. And it's also not unexpected. There's a lot less "Sorry wife, I'm not coming home today" when you scheduled it three months in advance.
Health and Life >> thousands of dollars at stake
Millions of thousands of dollars. When "health and life" is talking about whether ten people have overtime for a week, it's far less important than billions of dollars. And you can easily pay them enough to compensate for the stress.
And as I already said, when it comes to missing the tax deadline, leaving things broken would have a huge impact on the customers' health and life. The total stress levels they'd feel would be enough to kill your server engineers outright.
I disagree.
The IRS can wait. If a million people can't file their taxes the IRS will wait and I'm okay with that.
I'm not risking another cardiac arrest so that a bunch of people can file their taxes on time.
> If a million people can't file their taxes the IRS will wait and I'm okay with that.
Even if you're right, it would cause a fuckton of stress in the people filing taxes, many of them now undergoing a significantly higher risk of cardiac arrest than the guy who's on call 1 week per year.
And if there's even a few percent chance you're wrong, the fallout would be enormous. In both money lost and even more stress.
And how many people do you think it'll take to make the IRS wait? What if you're a bit under that threshold, still with a whole lot of very stressed customers?
As long as the amount of on-call time is very small, I don't think it needs to be restricted to a super critical subset of jobs.
I feel you're missing the point. Your angle is self-perpetuating. People have a higher risk of cardiac arrest from fear of the consequences of missing the IRS date. The argument being made is that there won't really be any major consequences: if millions of people miss it because of a TurboTax issue, an extension will be granted.
Why should the engineers be stressed and overworked because other people are scared of something that doesn't have to happen?
The world is less stressful and - I think - better without manufactured urgency like what you're defending.
Don't get me wrong, some things are life and death, like life support machines. Taxes are not.
Pretend it's a smaller company. 100k people late. That's small enough to make a special exception quite unlikely, but big enough to be a lot of very stressed people. It's not self-perpetuating logic, it's how deadlines work. Letting those engineers off the hook won't solve the deadline, those people will just be told they should have done it sooner and they will suffer the consequences.
Despite not being anywhere near life or death, the stress is real. And for most people it's not crippling stress, but neither is being on call for a single week out of the year. If we're going to blow that level of on-call into a "risk of cardiac arrest" then to be reasonable we have to do the same thing for tax filing failures.
There's no way for deadlines to not be moderately stressful. You can't decide to avoid urgency and stress.