mnahkies
9 hours ago
That was difficult to read, smelt very AI assisted though the message was worthwhile, it could've been shorter and more to the point.
A few things I've been thinking about recently:
- we have authentication everywhere in our stack, so I've started including the user id on every log line. This makes getting a holistic view of what a user experienced much easier.
- logging an error as a separate log line from the request log is a pain. You can filter on the trace, but it makes it hard to answer "show me all the logs for 5xx requests and the error associated with each" - it's doable, but it's more difficult than filtering on the status code of the request log
- it's not enough to just start including that context, you have to educate your coworkers that it's now present. I've seen people making life hard for themselves because they didn't realize we'd added this context
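A minimal sketch of the first two bullets: emit one JSON log line per request that always carries the user id, and attach the error to that same line so filtering on the status code also surfaces the error. Field names here are illustrative assumptions, not the commenter's actual schema.

```python
import json


def request_log_line(method, path, status, user_id, error=None):
    """Build one structured log line per request.

    The user id is always present, so "show me everything this user
    experienced" is a single filter. On failures the error is inlined,
    so filtering on status >= 500 surfaces the error too, without
    chasing a separate log line for it.
    """
    line = {
        "msg": "request",
        "method": method,
        "path": path,
        "status": status,
        "user_id": user_id,
    }
    if error is not None:
        line["error"] = error  # e.g. str(exc) or a structured dict
    return json.dumps(line)
```

Filtering `status >= 500` in the log backend then returns the request and its error in one record.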
xmprt
8 hours ago
On the other hand, investing in better tracing tools unlocks a whole other level of logging and debugging capabilities that aren't feasible with just request logs. It's kind of like what you mentioned with using the user id as a "trace" in your first message, but on steroids.
dexwiz
8 hours ago
These tools tend to be very expensive in my experience unless you are running your own monitoring cloud. Either you end up sampling traces at low rates to save on costs, or your observability bill is more than your infrastructure bill.
jonasdegendt
6 hours ago
We self host Grafana Tempo and whilst the cost isn’t negligible (at 50k spans per second), the money saved in developer time when debugging an error, compared to having to sift through and connect logs, is easily an order of magnitude higher.
dietr1ch
8 hours ago
Doing stuff like turning on tracing for clients that saw errors in the last 2 minutes, or for requests that were retried should only gather a small portion of your data. Maybe you can include other sessions/requests at random if you want to have a baseline to compare against.
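The sampling rule described above could be sketched like this. The window, baseline rate, and function names are assumptions for illustration; real tail-based samplers (e.g. in an OpenTelemetry collector) make this decision differently.

```python
import random
import time

ERROR_WINDOW_SECONDS = 120  # "errors in the last 2 minutes"
_recent_errors = {}         # client_id -> timestamp of last error


def record_error(client_id, now=None):
    """Remember when this client last saw an error."""
    _recent_errors[client_id] = time.time() if now is None else now


def should_trace(client_id, was_retried=False, now=None, baseline_rate=0.01):
    """Trace retried requests and clients that errored recently;
    everything else only at a small random baseline rate, to keep a
    comparison sample without gathering most of the data."""
    if was_retried:
        return True
    now = time.time() if now is None else now
    last = _recent_errors.get(client_id)
    if last is not None and now - last <= ERROR_WINDOW_SECONDS:
        return True
    return random.random() < baseline_rate
```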
oulipo2
6 hours ago
I've tried HyperDX and SigNoz, they seem easy to self-host and decent enough
spike021
8 hours ago
If your codebase has the concept of a request ID, you could also feasibly use that to trace what a user has been doing with more specificity.
mnahkies
8 hours ago
We do have both a span id and a trace id - but I personally find this more cumbersome than filtering on a user id. YMMV: if you're interested in a single trace, you'd filter for that, but I find you often also care what happened "around" a trace
ivan_gammel
8 hours ago
…and the same ID can be displayed to the user on an HTTP 500 alongside the support contact, making everyone's life much easier.
dexwiz
8 hours ago
I have seen pushback on this kind of behavior because "users don't like error codes" or other such nonsense. UX and Product like to pretend nothing will ever break, and when it does they want some funny little image, not useful output.
A good compromise is to log whenever a user would see the error code, and treat those events with very high priority.
spockz
7 hours ago
We put the error code behind a kind of message/dialog that invites the user to contact us if the problem persists and then report that code.
It’s my long-standing wish to be able to link traces/errors automatically to callers when they call the helpdesk. We have all the required information. It’s just that the helpdesk has very little use for this level of detail, so they can only attach it to the ticket so that the actual application teams don’t have to search for it.
ivan_gammel
7 hours ago
Nah, that’s an easy problem to solve with UX copy: "Something went wrong. Try again or contact support. Your support request number is XXXX XXXX" (a base58 version of a UUID).
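A base58 support code as suggested could be produced like this. The Bitcoin-style alphabet (no 0/O or I/l, so codes survive being read aloud to support) is an assumed choice, not stated in the thread.

```python
import uuid

# Base58 alphabet without the ambiguous characters 0, O, I, l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def support_code(u):
    """Encode a UUID's 128-bit value in base58 for a short user-facing code."""
    n = u.int
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))
```

A full random UUID yields roughly 22 base58 characters; displaying it in small groups, as in the comment's example, keeps it readable.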
inkyoto
3 hours ago
> I have seen pushback on this kind of behavior because "users don't like error codes" or other such nonsense […]
There are two dimensions to it: UX and security.
Displaying excessive technical information on an end-user interface will complicate support and likely reveal too much about the internal system design, making the system vulnerable to external attack.
The latter is particularly concerning for any design facing the public internet. A frequently recommended approach is exception shielding. It involves logging two messages upon encountering a problem: a nondescript user-facing message (potentially including a reference ID pinpointing the problem in space and time) and a detailed internal message with the problem’s details and context for L3 support / engineering.
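The two-message pattern described above could look like this. Logger name, field names, and the return shape are assumptions for illustration.

```python
import logging
import traceback
import uuid

internal_log = logging.getLogger("internal")  # logger name is an assumption


def shield_exception(exc):
    """Exception shielding, as a sketch: log full detail internally for
    L3 support / engineering, return only a nondescript message plus a
    reference id that pinpoints the problem when the user contacts support."""
    ref = uuid.uuid4().hex
    internal_log.error(
        "ref=%s error=%r\n%s",
        ref,
        exc,
        "".join(traceback.format_exception(type(exc), exc, exc.__traceback__)),
    )
    return {
        "message": "Something went wrong. Please try again or contact support.",
        "reference_id": ref,
    }
```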
kulahan
6 hours ago
If you care about this more than anything else (e.g. if you care about audits a LOT and need them perfect), you can simply code the app via action paths, rather than for modularity. It makes changes harder down the road, but for codebases that don’t change much, this can be a viable tradeoff to significantly improve tracing and logging.
nine_k
6 hours ago
...if it does not, you should add it. A request ID, trace ID, correlation key, whatever you call it, you should thread it through every remote call, if you value your sanity.
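Threading the id through every remote call usually means: reuse the one the caller sent, or mint one at the edge. A minimal sketch, with the header name as an assumption (W3C Trace Context defines `traceparent` for the standardized version):

```python
import uuid

CORRELATION_HEADER = "X-Correlation-Id"  # header name is an assumption


def ensure_correlation_id(incoming_headers):
    """Reuse the id the caller sent, or mint one at the edge, so a single
    id threads through every hop of the request."""
    return incoming_headers.get(CORRELATION_HEADER) or str(uuid.uuid4())


def outgoing_headers(incoming_headers):
    """Headers to attach to every remote call this service makes."""
    return {CORRELATION_HEADER: ensure_correlation_id(incoming_headers)}
```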
giancarlostoro
6 hours ago
TIDs (transaction ids) are good here too. If you generate one and enforce it across all your services, spanning various teams and APIs, anyone on any team can grab a TID you provide and get the full end-to-end view of one transaction.
khazhoux
7 hours ago
> That was difficult to read, smelt very AI assisted though the message was worthwhile...
It won’t be long before ad computem comments like this are frowned upon.
bccdee
6 hours ago
Why? "This was written badly" is a perfectly normal thing to say; "this was written badly because you didn't put in the effort of writing it yourself" doubly so.
0xbadcafebee
6 hours ago
Say they used AI to write it, it came out bad, and they published it anyway. They had the opportunity to "make it better" before publishing, but didn't. The only conclusion for this is, they just aren't good at writing. So whether AI is used or not, it'll suck either way. So there's no need to complain about the AI.
It's like complaining that somebody typed a crappy letter rather than hand-wrote it. Either way the letter's gonna suck, so why complain that it was typed?
minitech
4 hours ago
Compared to human bad writing, AI writing tends to suck more verbosely and in exciting new ways (e.g. by introducing factual errors).
Dylan16807
5 hours ago
> The only conclusion for this is, they just aren't good at writing.
Not true. It's likely an effort issue in that situation.
And that kind of effort issue is good to call out, because it compounds the low quality.
alwa
6 hours ago
I read it as a more-or-less kind comment: “even though you’ll notice that they let an AI make the writing terrible, the underlying point is good enough to be worth struggling through that and discussing”
mnahkies
6 hours ago
I felt unsure whether to include that particular comment, but landed on including it because I think it's a real danger. I've got no problem with people using AI, and I use it for some things myself.
However, I don't think you should outsource understanding to LLMs, and I also think that shifting the effort from the writer to the reader is a poor strategy (and disrespectful to the reader).
edit: in case it's unclear, I'm not accusing the author of having outsourced their understanding to AI, but I think it's a real risk people can fall into. The value is in the thinking people put into things, not the mechanics of typing it out.