mnahkies
2 months ago
That was difficult to read and smelt very AI assisted, though the message was worthwhile; it could've been shorter and more to the point.
A few things I've been thinking about recently:
- we have authentication everywhere in our stack, so I've started including the user id on every log line. This makes getting a holistic view of what a user experienced much easier (rough sketch after this list).
- logging an error as a separate log line from the request log is a pain. You can filter for the trace, but it makes it hard to surface "show me all the logs for 5xx requests and the associated error" - it's doable, but it's more difficult than filtering on the status code of the request log
- it's not enough to just start including that context; you have to educate your coworkers that it's now present. I've seen people making life hard for themselves because they didn't realize we'd added this context
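A rough sketch of the user-id-on-every-line idea, assuming Python's standard logging module (the get_current_user_id helper and the log format are made up; in practice they'd come from whatever your auth stack exposes):

    import logging

    def get_current_user_id() -> str:
        # Hypothetical helper - resolve the authenticated user however your stack does it.
        return "user-123"

    class UserIdFilter(logging.Filter):
        """Attach the authenticated user id to every log record."""
        def filter(self, record: logging.LogRecord) -> bool:
            record.user_id = get_current_user_id()
            return True  # never drop records, just enrich them

    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s user_id=%(user_id)s %(message)s"
    ))
    handler.addFilter(UserIdFilter())

    logger = logging.getLogger("app")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    logger.info("fetched profile")  # -> ... INFO user_id=user-123 fetched profile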
xmprt
2 months ago
On the other hand, investing in better tracing tools unlocks a whole other level of logging and debugging capabilities that aren't feasible with just request logs. It's like what you mentioned about using the user id as a "trace" in your first message, but on steroids.
dexwiz
2 months ago
These tools tend to be very expensive in my experience unless you are running your own monitoring cloud. Either you end up sampling traces at low rates to save on costs, or your observability bill is more than your infrastructure bill.
jonasdegendt
2 months ago
We self host Grafana Tempo and whilst the cost isn’t negligible (at 50k spans per second), the money saved in developer time when debugging an error, compared to having to sift through and connect logs, is easily an order of magnitude higher.
dietr1ch
2 months ago
Doing stuff like turning on tracing for clients that saw errors in the last 2 minutes, or for requests that were retried, should only gather a small portion of your data. Maybe you can include other sessions/requests at random if you want a baseline to compare against.
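A minimal sketch of that kind of decision logic (the client-id map and thresholds are invented for illustration; a real setup would push this into the tracing pipeline as tail-based or rule-based sampling):

    import random
    import time

    ERROR_WINDOW_SECONDS = 120   # "saw an error in the last 2 minutes"
    BASELINE_SAMPLE_RATE = 0.01  # random baseline to compare against

    # Hypothetical in-memory map of client id -> timestamp of last error.
    last_error_at: dict[str, float] = {}

    def record_error(client_id: str) -> None:
        """Call this when a request for the client fails."""
        last_error_at[client_id] = time.time()

    def should_trace(client_id: str, is_retry: bool) -> bool:
        """Decide whether to record a full trace for this request."""
        recently_errored = (
            time.time() - last_error_at.get(client_id, 0.0) < ERROR_WINDOW_SECONDS
        )
        if recently_errored or is_retry:
            return True
        # Keep a small random sample of healthy traffic as a baseline.
        return random.random() < BASELINE_SAMPLE_RATE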
valyala
2 months ago
Try open-source databases specially designed for traces, such as Grafana Tempo or VictoriaTraces. They can handle ingestion rates of hundreds of thousands of trace spans per second on a regular laptop.
sahilagarwal
a month ago
I like to write them on my own in every company I'm in, using bash. So I have a local set of bash commands to help me figure out logs and colorize the items I want to.
Takes some time and it's a pain in the ass initially, but once I've matured them, work becomes so much easier. It reduces dependence on other people / teams / access as well.
Edit: Thinking about this, they won't work in other use cases. I'm a data engineer, so my jobs are mostly sequential.
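A rough Python equivalent of the idea, purely illustrative (the actual scripts are bash; the patterns and colours here are arbitrary examples):

    import re
    import sys

    # Highlight rules: regex -> ANSI colour code. Adjust to taste.
    RULES = [
        (re.compile(r"\bERROR\b"), "\033[31m"),       # red
        (re.compile(r"\bWARN(ING)?\b"), "\033[33m"),  # yellow
        (re.compile(r"\buser_id=\S+"), "\033[36m"),   # cyan
    ]
    RESET = "\033[0m"

    for line in sys.stdin:
        for pattern, colour in RULES:
            line = pattern.sub(lambda m, c=colour: f"{c}{m.group(0)}{RESET}", line)
        sys.stdout.write(line)

Used as a pipe, e.g. tail -f app.log | python colorize.py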
oulipo2
2 months ago
I've tried HyperDX and SigNoz; they seem easy to self-host and decent enough
spike021
2 months ago
If your codebase has the concept of a request ID, you could also feasibly use that to trace what a user has been doing with more specificity.
ivan_gammel
2 months ago
…and the same ID can be displayed to the user on an HTTP 500 along with the support contact, making everyone's life much easier.
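A rough sketch of the pattern, assuming a Flask app (the X-Request-ID header name and the wording are just conventions, not anything stack-specific):

    import logging
    import uuid

    from flask import Flask, g, request

    app = Flask(__name__)
    log = logging.getLogger("app")

    @app.before_request
    def assign_request_id():
        # Reuse an upstream id if one was sent, otherwise mint a new one.
        g.request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))

    @app.errorhandler(Exception)
    def internal_error(exc):
        request_id = getattr(g, "request_id", "unknown")
        # Full detail goes to the logs, keyed by the same id the user sees.
        log.error("unhandled error request_id=%s", request_id, exc_info=exc)
        body = ("Something went wrong. Please contact support and "
                f"quote reference {request_id}.")
        return body, 500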
dexwiz
2 months ago
I have seen pushback on this kind of behavior because "users don't like error codes" or other such nonsense. UX and Product like to pretend nothing will ever break, and when it does they want some funny little image, not useful output.
A good compromise is to log whenever a user would see the error code, and treat those events with very high priority.
spockz
2 months ago
We put the error code behind a kind of message/dialog that invites the user to contact us if the problem persists and then report that code.
It’s my long-standing wish to be able to link traces/errors automatically to callers when they call the helpdesk. We have all the required information. It’s just that the helpdesk actually has very little use for this level of detail, so they can only attach it to the ticket so that the actual application teams don’t have to search for it.
inkyoto
2 months ago
> I have seen pushback on this kind of behavior because "users don't like error codes" or other such nonsense […]
There are two dimensions to it: UX and security.
Displaying excessive technical information on an end-user interface will complicate support and likely reveal too much about the internal system design, making it vulnerable to external attacks.
The latter is particularly concerning for any design facing the public internet. A frequently recommended approach is exception shielding. It involves producing two messages upon encountering a problem: a nondescript user-facing message (potentially including a reference ID pinpointing the problem in space and time) and a detailed internal log message with the problem’s details and context for L3 support / engineering.
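A minimal sketch of exception shielding (names are made up; the point is the two messages, detailed internally and nondescript externally):

    import logging
    import uuid

    log = logging.getLogger("internal")

    def shielded(operation, *args, **kwargs):
        """Run an operation; on failure, log the details internally and
        return only a nondescript message plus a reference id."""
        try:
            return {"ok": True, "result": operation(*args, **kwargs)}
        except Exception:
            ref = uuid.uuid4().hex[:12]  # reference id pinpointing the incident in space and time
            # Detailed internal message: stack trace, operation name, reference id.
            log.exception("operation=%s ref=%s failed", operation.__name__, ref)
            # Nondescript user-facing message: nothing about the internals leaks out.
            return {"ok": False, "message": f"Something went wrong. Reference: {ref}"}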
dist1ll
2 months ago
Sorry for the OT response, I was curious about this comment[0] you made a while back. How did you measure memory transfer speed?
inkyoto
2 months ago
I used «powermetrics» bundled with macOS with «bandwidth» as one of the samplers (--samplers / -s set to «cpu_power,gpu_power,thermal,bandwidth»).
Unfortunately, Apple has taken out the «bandwidth» sampler from «powermetrics», and it is no longer possible to measure the memory bandwidth as easily.
KronisLV
2 months ago
> UX and Product like to pretend nothing will ever break, and when it does they want some funny little image, not useful output.
Just ignore them, or provide appeasement insofar as it doesn’t mess with your ability to maintain the system.
(cat picture or something)
Oh no, something went wrong.
Please don’t hesitate to reach out to our support: (details)
This code will better help us understand what happened: (request or trace ID)
ivan_gammel
2 months ago
Nah, that’s an easy problem to solve with UX copy. „Something went wrong. Try again or contact support. Your support request number is XXXX XXXX“ (base58 version of a UUID).
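A minimal sketch of the base58 part (a full 128-bit UUID comes out at roughly 22 characters with the usual Bitcoin-style alphabet, so a code as short as XXXX XXXX implies a smaller id):

    import uuid

    # Base58 alphabet: no 0/O or I/l, which helps when a user reads the code to support.
    ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

    def uuid_to_base58(u: uuid.UUID) -> str:
        n = int.from_bytes(u.bytes, "big")
        out = []
        while n:
            n, rem = divmod(n, 58)
            out.append(ALPHABET[rem])
        return "".join(reversed(out)) or ALPHABET[0]

    print(uuid_to_base58(uuid.uuid4()))  # prints a ~22-character code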
mnahkies
2 months ago
We do have both a span id and a trace id, but I personally find this more cumbersome than filtering on a user id. YMMV - if you're interested in a single trace then you'd filter for that, but I find you often also care what happened "around" a trace
kulahan
2 months ago
If you care about this more than anything else (e.g. if you care about audits a LOT and need them perfect), you can simply code the app via action paths, rather than for modularity. It makes changes harder down the road, but for codebases that don’t change much, this can be a viable tradeoff to significantly improve tracing and logging.
nine_k
2 months ago
...if it does not, you should add it. A request ID, trace ID, correlation key, whatever you call it, you should thread it through every remote call, if you value your sanity.
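A rough sketch of threading it through, using contextvars plus requests in Python (the X-Correlation-ID header name is just a common convention; the W3C traceparent header is the standardized equivalent):

    import contextvars
    import uuid

    import requests

    # Holds the correlation id for the request currently being handled.
    correlation_id = contextvars.ContextVar("correlation_id", default="unset")

    def start_request(incoming_headers: dict) -> None:
        """At the edge: reuse the caller's id if present, otherwise mint one."""
        correlation_id.set(incoming_headers.get("X-Correlation-ID") or str(uuid.uuid4()))

    def call_downstream(url: str) -> requests.Response:
        """Every outgoing call carries the same id so logs line up across services."""
        return requests.get(url, headers={"X-Correlation-ID": correlation_id.get()})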
giancarlostoro
2 months ago
TIDs are good here too. If you generate one and enforce it across all your services, spanning various teams and APIs, anyone on any team can grab a TID you provide and get the full end-to-end view of one transaction.
valtism
2 months ago
Wow, I didn't think this was badly written at all! I certainly don't think it smells like AI. Are you conflating lists with AI written prose?
weebull
a month ago
> - we have authentication everywhere in our stack, so I've started including the user id on every log line. This makes getting a holistic view of what a user experienced much easier.
Depends on the service, but tracking everything a user does may not be an option under data retention laws
khazhoux
2 months ago
> That was difficult to read and smelt very AI assisted, though the message was worthwhile...
It won’t be long before ad computem comments like this are frowned upon.
bccdee
2 months ago
Why? "This was written badly" is a perfectly normal thing to say; "this was written badly because you didn't put in the effort of writing it yourself" doubly so.
0xbadcafebee
2 months ago
Say they used AI to write it, it came out bad, and they published it anyway. They had the opportunity to "make it better" before publishing, but didn't. The only conclusion is that they just aren't good at writing. So whether or not AI is used, it'll suck either way, and there's no need to complain about the AI.
It's like complaining that somebody typed a crappy letter rather than hand-wrote it. Either way the letter's gonna suck, so why complain that it was typed?
minitech
2 months ago
Compared to human bad writing, AI writing tends to suck more verbosely and in exciting new ways (e.g. by introducing factual errors).
0xbadcafebee
2 months ago
> AI writing tends to suck more verbosely
So, it's the style you oppose, the way a grammar nazi complains about "improper" English
> and in exciting new ways (e.g. by introducing factual errors).
Because factually incorrect comments didn't exist before AI?
Your concern is that you read something you don't like, so you pick the lowest-effort criterion to complain about. Speaks more about you than the original commenter.
damentz
2 months ago
I'm pretty sure "verbose" here means the realization that you've wasted precious time reading AI bloat, time you'll never get back. On top of that, you now need to reread the text for hallucinations, or just take a loss and ignore any conclusions at the risk that they came from bad data.
Dylan16807
2 months ago
> The only conclusion for this is, they just aren't good at writing.
Not true. It's likely an effort issue in that situation.
And that kind of effort issue is good to call out, because it compounds the low quality.
0xbadcafebee
2 months ago
I don't know if you're new to the internet, but low-effort comments have existed before AI, and will continue to exist regardless of AI.
alwa
2 months ago
I read it as a more-or-less kind comment: “even though you’ll notice that they let an AI make the writing terrible, the underlying point is good enough to be worth struggling through that and discussing”
mnahkies
2 months ago
I felt unsure whether to include that particular comment, but landed on including it because I think it's a real danger. I've got no problem with people using AI and do use it for some things myself.
However, I don't think you should outsource understanding to LLMs, and I also think that shifting the effort from the writer to the reader is a poor strategy (and disrespectful to the reader).
edit: in case it's unclear, I'm not accusing the author of having outsourced their understanding to AI, but I think it's a real risk people can fall into; the value is in the thinking people put into things, not the mechanics of typing it out