Good article, but one (very minor) nit I have is with the PizzaOrder example.
struct PizzaOrder {
    size: PizzaSize,
    toppings: Vec<Topping>,
    crust_type: CrustType,
    ordered_at: SystemTime,
}
The problem they want to address is partial equality when you want to compare orders but ignoring the ordered_at timestamp. To me, the problem is throwing too many unrelated concerns into one struct. Ideally instead of using destructuring to compare only the specific fields you care about, you'd decompose this into two structs:
#[derive(PartialEq, Eq)]
struct PizzaDetails {
    size: PizzaSize,
    toppings: Vec<Topping>,
    crust_type: CrustType,
    … // additional fields
}

#[derive(Eq)]
struct PizzaOrder {
    details: PizzaDetails,
    ordered_at: SystemTime,
}

impl PartialEq for PizzaOrder {
    fn eq(&self, rhs: &Self) -> bool {
        self.details == rhs.details
    }
}
I get that this is a toy example meant to illustrate the point; there are certainly more complex cases where there's no clean boundary to split your struct across. But this should be the first tool you reach for.
You have a good point there, that is better. But it is still, well honestly, wrong. Two orders placed at different times are just not the same order, and using a typeclass approach to say that they most definitely are is going to bite you down the road.
PartialEq and Eq for PizzaDetails is good. If there is a business function that computes whether or not someone orders the same thing, then that should start by projecting the details.
I do agree that implementing PartialEq on orders in this way is a bad fit. But it is a synthetic example to make a point, so I tried to keep it in the spirit of the original article (while ironically picking nits in the same vein myself).
Yeah, I immediately twitched when I saw the PartialEq implementation. Somebody is going to write code which finds the "correct" order and ends up allowing someone to order the same pizza but get yours, while you have to wait for it to be made and cooked again.
It's not difficult to write a predicate like same_details_as(); then it's obvious to reviewers that this is what we meant, and it discourages weird ad-hoc comparisons that might stop working when PizzaDetails is redefined.
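Building on the PizzaOrder/PizzaDetails split from upthread, the predicate could be as small as this sketch:

impl PizzaOrder {
    fn same_details_as(&self, other: &PizzaOrder) -> bool {
        // Compare everything except ordered_at, by projecting the details.
        self.details == other.details
    }
}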
You can solve this in the general case by implementing the typeclass for the coarser equality relation over an ad-hoc wrapper newtype.
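Something along these lines, reusing the structs from upthread (ByDetails is a made-up name):

// Equality on the wrapper deliberately ignores ordered_at,
// while PizzaOrder itself keeps no PartialEq at all.
struct ByDetails<'a>(&'a PizzaOrder);

impl PartialEq for ByDetails<'_> {
    fn eq(&self, rhs: &Self) -> bool {
        self.0.details == rhs.0.details
    }
}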
Well it isn't a good call. This is the kind of code that OOP makes people write.
Avoiding indexing into arrays and vectors is really wise.
The same day Cloudflare had its unwrap fiasco, I found a bug in my code because of a slice that in certain cases went past the end of a vector. Switched it to use iterators and will definitely be more careful with slices and array indexes in the future.
Funny, it's really the same argument Rust people make for why we should abandon C. Meanwhile in C, it is also common to hand out handles instead of indices precisely because of this problem.
It's pretty similar, but writing `for item in container { item.do_it() }` is a lot less error prone than the C equivalent. The ha-ha-but-serious take is that once you get that snippet to compile, there's almost nothing you could ever do to break it without also making the compiler scream at you.
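A quick sketch of the contrast (sum_indexed/sum_iterated are made-up names):

// The indexed version can panic, or quietly do the wrong thing if n is stale;
// the iterator version has no index to get wrong.
fn sum_indexed(items: &[i32], n: usize) -> i32 {
    let mut total = 0;
    for i in 0..n {
        total += items[i]; // panics if n > items.len()
    }
    total
}

fn sum_iterated(items: &[i32]) -> i32 {
    items.iter().sum()
}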
In "Pattern: Defensively Handle Constructors", it recommends using a nested inner module with a private Seal type. But the Seal type is unnecessary. The nested inner module is all you need: a private field `_private: ()` is only settable from within that inner module, so you don't need the extra private type.
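A minimal sketch of that variant (module and field names are made up):

mod order {
    pub struct Order {
        pub id: u64,
        // Private field: code outside this module cannot write
        // `Order { id, _private: () }`, so new() is the only way in.
        _private: (),
    }

    impl Order {
        pub fn new(id: u64) -> Self {
            Order { id, _private: () }
        }
    }
}

fn main() {
    let o = order::Order::new(1);
    println!("{}", o.id);
}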
As someone unfamiliar with Rust (yet! it's on my ever growing list of things I'd like to absorb into my brain), unwrap_or_else() sounds like part of the "What You See Is What I Threatened the Computer To Do" paradigm.
> INTERCAL has many other features designed to make it even more aesthetically unpleasing to the programmer: it uses statements such as "READ OUT", "IGNORE", "FORGET", and modifiers such as "PLEASE". This last keyword provides two reasons for the program's rejection by the compiler: if "PLEASE" does not appear often enough, the program is considered insufficiently polite, and the error message says this; if it appears too often, the program could be rejected as excessively polite.
Immediately thought of INTERCAL :)
There are also the equally threatening and useful `map_or_else` (on Result and Option) and `ok_or_else` (on Option and experimentally on bool).
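For anyone unfamiliar, a quick sketch of the stable ones:

fn main() {
    let parsed: Result<i32, std::num::ParseIntError> = "42".parse();

    // map_or_else: one closure for the error case, one for the success case.
    let described = parsed.map_or_else(
        |err| format!("failed: {err}"),
        |n| format!("got {n}"),
    );
    println!("{described}");

    // ok_or_else: turn an Option into a Result with a lazily built error.
    let maybe: Option<i32> = None;
    let result: Result<i32, String> = maybe.ok_or_else(|| "nothing here".to_string());
    println!("{result:?}");
}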
This made me wonder: why aren't there usually teams whose job is to keep an eye on the coding patterns used in the various codebases? Similar to how you have an SOC team monitoring traffic patterns, or an Operations Support team monitoring health probes, KPIs, and logs, or QA continually writing tests against new code, maybe there would be value in tracking what coding patterns develop into over the lifetime of a codebase?
Like whenever I read posts like this, they're always fairly anecdotal. Sometimes there will even be posts about how large refactor x unlocked new capability y. But the rationale always reads somewhat retconned (or again, anecdotal*). It seems to me that maybe such continuous meta-analysis of one's own codebases would have great potential utility?
I'd imagine automated code smell checking tools can only cover so much at least.
* I hammer on about anecdotes, but I do recognize that sentiment matters. For example, when you're planning work, if something just sounds like a lot of work, that's already going to be impactful, even if that judgment is incorrect (since the misjudgment may never come to light).
There are. All the big tech companies have them. It’s just difficult to accomplish when you have millions of lines of code.
Is there an industry standard name for these teams that I somehow missed then?
You may wish to search for "readability at Google". Here is one article:
https://www.moderndescartes.com/essays/readability/
(I have not read this article closely, but it is about the right concept, so I provide it as a starting point since "readability" writ large can be an ambiguous term.)
Wow that’s amazing. The partial equality implementation is really surprising.
One question about avoiding boolean parameters, I’ve just been using structs wrapping bools. But you can’t treat them like bools… you have to index into them like wrapper.0.
Is there a way to treat the enum-style replacement for bools like normal bools, or is it just done with matches! or match statements?
It’s probably not too important but if we could treat them like normal bools it’d feel nicer.
I almost always prefer enums and matches! vs bool parameters.
Another way is to implement a Trait that you find useful that encapsulates the logic. And don't forget you can do impl <Enum> {} blocks to add useful functions that execute regardless of which member of the enum you got.
enum MyType {
    ...
}

impl MyType {
    pub fn is_useable_in_this_way(&self) -> bool {
        // possibly ...
        match self { ... }
    }
}
and later:
pub fn use_in_that_way(e: MyType) {
    if e.is_useable_in_this_way() { ... }
}
Or if you hate all that there's always:
if let MyType::Member(x) = e {
    ...
}
`if let` is probably the closest to a regular bool.
For ints you can implement the Deref trait on wrapper structs, so you can treat YourType(u64) as a u64 without destructuring. I couldn't figure out a way to do that with YourType(bool).
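A minimal sketch of the u64 case (YourType being the hypothetical wrapper from above):

use std::ops::Deref;

struct YourType(u64);

impl Deref for YourType {
    type Target = u64;

    fn deref(&self) -> &u64 {
        &self.0
    }
}

fn main() {
    let x = YourType(41);
    // Explicit deref (and deref coercion) hand back the plain u64.
    assert_eq!(*x + 1, 42);
}

Worth noting that implementing Deref on non-pointer newtypes is often discouraged, since it blurs exactly the type boundary the newtype was meant to create.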
Nice article. The problem of multiple booleans is just one instance of a more general problem: when a function takes multiple arguments of the same type (i32, String, etc.). The newtype pattern allows you to create distinct types in such cases and enforce correctness at compile time.
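For instance, a sketch with made-up domain types:

// Distinct newtypes make swapped arguments a compile error instead of a bug.
struct UserId(u64);
struct OrderId(u64);

fn cancel_order(user: UserId, order: OrderId) {
    println!("cancelling order {} for user {}", order.0, user.0);
}

fn main() {
    let user = UserId(7);
    let order = OrderId(1234);
    cancel_order(user, order);
    // cancel_order(order, user) would not compile: mismatched types.
}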
Loved the article, such a nice read. I am still slowly ramping up my proficiency in Rust and this gave me a lot of things to think through. I particularly enjoyed the temporary mutability pattern, very cool and didn't think about it before!
> I particularly enjoyed the temporary mutability pattern, very cool and didn't think about it before!
It's not too uncommon in other languages (sometimes under the name "immediately invoked function expression"), though depending on the language you may see lambdas involved. For example, here's one of the examples from the article ported to C++:
auto data = []() {
    auto data = get_vector();
    auto temp = compute_something();
    data.insert_range(data.end(), temp);
    std::ranges::sort(data);
    return data;
}();
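For comparison, a rough sketch of the same shape in Rust (keeping the same placeholder functions): no lambda is needed, since a plain block is already an expression.

let data = {
    let mut data = get_vector();
    let temp = compute_something();
    data.extend(temp);
    data.sort();
    data
};
// data is immutable from here on.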
I'm not reading a solid argument as to why not to use `..Default::default()`. Because doing so suggests that you may introduce a bug, and therefore you should be explicit about EVERYTHING instead? Ugh. Hard disagree.
The tech industry is full of brash but lightly-seasoned people resurrecting discredited ideas for contrarianism cred and making the rest of us put down monsters we thought we'd slain a long time ago.
"Defensive programming" has multiple meanings. To the extent it means "avoid using _ as a catch-all pattern so that the compiler nags you if someone adds an enum arm you need to care about", "defensive" programming is good.
That said, I wouldn't use the word "defensive" to describe it. The term lacks precision. The above good practice ends up getting mixed up with the bad "defensive" practices of converting contract violations to runtime errors or just ignoring them entirely --- the infamous pattern in Java codebases of scrawling the following lines of graffiti all over the clean lines of your codebase:
if (someArgument == null) {
    throw new NullPointerException("someArgument cannot be null");
}
That's just noise. If someArgument can't be null, let the program crash.
Needed file not found? Just return ""; instead.
Negative number where input must be contractually not negative? Clamp to zero.
Program crashing because a method doesn't exist? if not hasattr(self, "blah"): return None
People use the term "defensive" to refer to code like the above: programs that "defend" against crashes by misbehaving. These programs end up being flakier and harder to debug than programs that are "defensive" in the sense that they continually validate their assumptions and crash if they detect a situation that should be impossible.
The term "defensive programming" has been buzzing around social media the past few weeks and it's essential that we be precise that
1) constraint verification (preferably at compile time) is good; and
2) avoidance of crashes at runtime at all costs after an error has occurred is harmful.
For a second I thought you were advocating for something of those, and I had a rant primed up.
Yes. Defensively handle all the failure modes you know how to handle, but nothing else. If you're writing a service daemon and the user passes in a config filename that doesn't exist, crash and say why. Don't try to guess, or offer up a default config, or otherwise try to paper over the idea that the user asked you to do something impossible. Pretty much anything you try other than just crashing is guaranteed to be wrong.
And for the love of Knuth, don't freaking clamp to zero or otherwise convert inputs into semantically different value than specified. (Like, it's fine to load a string representation of a float into an IEEE754 datatype if you're not working with money or other exact values. But don't parse 256 as 255 and call it good enough. It isn't.)
This is one of the best Rust articles I've ever read. It's obviously from experience and covers a lot of _business logic_ foot guns that Rust doesn't typically protect you against without a little bit of careful coding that allows the compiler to help you.
So many Rust articles are focused on people doing dark sorcery with "unsafe", and this is just normal, everyday API design, which is far more practical for most people.