hackernews client

Eliminating Memory Safety Vulnerabilities at the Source

330 pointsposted 9 months ago

(security.googleblog.com)

199 Comments

steveklabnik

9 months ago

This is a very interesting post! One takeaway is that you don't need to re-write the world. Transitioning new development to a memory safe language can bring meaningful improvements. This is much easier (and cheaper) than needing to port everything over in order to get an effect.

gary_0

9 months ago

In fact, these results imply that the benefits of re-writing the world are limited in terms of security. This raises the cost-benefit ratio of keeping mature legacy code and only using memory-safe languages for new code.

This also implies that languages and tooling with robust support for integrating with unsafe legacy code are even more desirable.

Xylakant

9 months ago

Essentially, this has been Rusts value proposition from the outset - build a language that you can integrate into other codebases seamlessly, hence the choice of no runtime, no garbage collector etc. Bindgen (https://github.com/rust-lang/rust-bindgen) and similar tooling were around essentially since day one to assist in that.

It’s the only approach that has any chance of transitioning away from unsafe languages for existing, mature codebases. Rewriting entirely in a different language is not a reasonable proposition for every practical real-world project.

dadrian

9 months ago

Rust had to be dragged kicking and screaming into integration with other languages, and its C++ compatibility is a joke compared to Swift.

It's absolutely true that you need integration and compatibility to enable iterative improvements, but Rust historically has been hostile to anything besides unsafe C ABI FFI, which is not suitable for the vast majority of incremental development that needs to happen.

Luckily, this is starting to change.

Philpax

9 months ago

"Hostile" is assigning intent that was not present. C++ ABI integration was and is extremely difficult; you need to be able to fully handle C++ semantics and to target each platform's take on the ABI. Most C++ competitors have struggled with this for much the same reason.

This means that "solving this" requires partial integration of a C++ compiler; it's not a coincidence that the languages with the most success have been backed by the organisations that already had their own C++ compiler.

A much easier solution is to generate glue on both sides, which is what `cxx` does.

bluGill

9 months ago

If the intent was to interoyerate with c++ then not supporting the api is hostile. I have a lot of code using vector, if I have to convert that to arrays then you are hostile and worse that is new code which per the article makes it suspect

d does support c++ abi. It seems almost dead now but it is possible.

realistically there are two c++ abis in the world. Itanimum and msvc. both are known well enough that you can imblement them if you want (it is tricky)

Philpax

9 months ago

> If the intent was to interoyerate with c++ then not supporting the api is hostile. I have a lot of code using vector, if I have to convert that to arrays then you are hostile and worse that is new code which per the article makes it suspect

It's not hostile to not commit resources to integrating with another language. It might be shortsighted, though.

> d does support c++ abi. It seems almost dead now but it is possible.

That was made possible by Digital Mars's existing C++ compiler and their ability to integrate with it / borrow from it. Rust can't take the same path. Additionally, D's object model is closer to C++ than Rust's is; it's not a 1:1 map, but the task is still somewhat easier. (My D days are approaching a decade ago, so I'm not sure what the current state of affairs is.)

> realistically there are two c++ abis in the world. Itanimum and msvc. both are known well enough that you can imblement them if you want (it is tricky)

The raw ABI is doable, yeah - but how do you account for the differences in how the languages work? As a simple example - C++ has copy constructors, Rust doesn't. Rust has guarantees around lifetimes, C++ doesn't. What does that look like from a binding perspective? How do you expose that in a way that's amenable to both languages?

`cxx`'s approach is to generate lowest-common denominator code that both languages can agree upon. That wouldn't work as a language feature because it requires buy-in from the other side, too.

bluGill

9 months ago

I'm not arguing rust made the wrong decisions. I'm arguing the concequences of those decisions is hostile to C++ interoperability.

pjmlp

9 months ago

As long as Rust depends on GCC and Clang, it can surely take the same path.

I don't see Cranelift ever replacing them as the main backend.

SkiFire13

9 months ago

> Rust had to be dragged kicking and screaming into integration with other languages

Rust has integration with the C ABI(s) from day 1, which makes sense because the C ABI(s) are effectively the universal ABI(s).

> and its C++ compatibility is a joke

I'm not sure why you would want Rust to support interoperability with C++ though, they are very different languages. Moreover why does this fall on Rust? C++ has yet to support integration with Rust!

> compared to Swift

Swift bundled a whole C++ compiler (Clang) to make it work, that's a complete deal breaker for many.

> Rust historically has been hostile to anything besides unsafe C ABI FFI

Rust historically has been hostile to making the Rust ABI stable, but not to introducing other optional ABIs. Introducing e.g. a C++ ABI has just not been done because there's no mapping for many features like inheritance, move constructors, etc etc. Ultimately the problem is that most ABIs are either so simple that they're the same as the C ABI or they carry so many features that it's not possible to map them onto a different language.

maxk42

9 months ago

I would like to point out also that C++ is not yet compatible with itself. Even GCC has dozens of C++ features it doesn't implement and it was the first compiler to implement 100% of C++11: https://gcc.gnu.org/projects/cxx-status.html

bluGill

9 months ago

Technically true but in the real world those are rare edge cases.

maxk42

9 months ago

Only because they're not used. If you're not going to implement the spec then is it really a spec?

SkiFire13

9 months ago

It could also be argued that they aren't used because not all major compilers implement them.

bluGill

9 months ago

If there was demand major compilers would implement them. That is why despite being difficult all major compilers are working on getting modules working.

bluGill

9 months ago

I have millions of lines of c++. I'd love t use something else but without interoperability it is much harder. If you only write C you are fine. If you write java you want java interoperability, similear for lisp, or whatever your existing code is in.

steveklabnik

9 months ago

That was created with the intention of being integrated into a very large C++ codebase.

I do agree that they could do more. I find it kind of wild that they haven’t copied Zig’s cross compilation story for example. But this stuff being just fine instead of world class is more of a lack of vision and direction than a hostility towards the idea. Indifference is not malice.

Also, you seem to be ignoring the crabi work, which is still not a thing, of course, but certainly isn’t “actively hostile towards non-c ABIs”.

samatman

9 months ago

I think they had to get over the knee-jerk reaction that Zig was a, quote, "massive step back for the industry", frowny face.

It seems they have, which is all to the good: the work to make custom allocators practical in Rust is another example. Rust still doesn't have anything like `std.testing.checkAllAllocationFailures`, but I expect that it will at some future point. Zig certainly learned a lot from Rust, and it's good that Rust is able to do the same.

Zig is not, and will not be, a memory-safe language. But the sum of the decisions which go into the language make it categorically different from C (the language is so different from C++ as to make comparisons basically irrelevant).

Memory safety is a spectrum, not a binary, or there wouldn't be "Rust" and "safe Rust". What Rust has done is innovative, even revolutionary, but I believe this also applies to Zig in its own way. I view them as representing complementary approaches to the true goal, which is correct software with optimal performance.

steveklabnik

9 months ago

While Patrick played a critical role in Rust's history, he said that years after he stopped working on Rust. I haven't seen people still involved share strong opinions about Zig in either direction, to be honest.

I've known Andrew for a long time, and am a big fan of languages learning from each other. I have always found that the community of people working on languages is overall much more collegial to each other than their respective communities can be to each other, as a general rule. Obviously there are exceptions.

dadrian

9 months ago

I'll admit hostile was perhaps too harsh, however looking back to the late 2010s, it seems clear that the attitude was:

- C ABI is good enough for any compatibility

- "Rewrite it Rust" / RESF

- Prioritize language proposals, not fundamentals

I think the attitude as changed somewhat in the last 2-3 years, with more interest in cxx and crubit and crabi, although many of those things are at least partially blocked on stalled language proposals, and largely ignored by the "core" Rust community. I would say indifference and posturing does approach hostility.

There's also consistent conflation of "compatibility with C++" and "rich compatibility with all of C++". Swift is a great example of how targeting PODs and std::vector and aligning concurrency expectations gets you extremely far, especially if you limit yourself to just LLVM.

btw, I really appreciate your Rust book.

steveklabnik

9 months ago

Thanks for the kind words about the book!

I think indifference is very different than hostility. That there's only limited time, resources, and interest and those must be balanced and that means some projects advance before others is a very different thing than a refusal to do something.

nindalf

9 months ago

> Rust had to be dragged kicking and screaming into integration

On what basis do you make this flagrant claim? There is certainly an impedance mismatch between Rust and C++, but "kicking and screaming" implies a level of malice that doesn't exist and never existed.

Such a low quality comment.

bobajeff

9 months ago

>has been hostile to anything besides unsafe C ABI FFI

As opposed to integrating with a whole C++ frontend? Gee, I wonder why more languages haven't done this obvious idea of integrating a whole C++ frontend within their build system.

dadrian

9 months ago

This is the exact attitude I am referring to. Also, Rust already uses LLVM.

kibwen

9 months ago

Unfortunately, sharing a codegen backend is neither necessary nor sufficient to enable the sort of seamless high-level cross-language integration that you appear to be requesting.

rurban

9 months ago

Why is Rust then insecure by default, with exicit unsafe blocks leading to type and memory unsafeties, plus blocking IO leading to concurrency unsafeties?

They are talking about actually safe languages, secury by design and secure by default. Which would be a lisp or scheme without an FFI, or a beam language (Erlang, Elixir). Or pony, without the FFI. Or Concurrent Pascal, not Go.

Or a safe scripting language

dieortin

9 months ago

The article explicitly uses Rust as an example of a safe language

pjc50

9 months ago

Rewrites are so expensive that they're going to be very rare. But incremental nibbling at the edges is very effective.

I wonder if there's a "bathtub curve" effect for very old code. I remember when a particularly serious openssl vulnerability (heartbleed?) caused a lot of people to look at the code and recoil in horror at it.

Ygg2

9 months ago

From security perspective I agree but what if you want to be rid of GC or just reduce your overall resource consumption?

pjmlp

9 months ago

Learning how to actually use the language features that don't rely on GC, like on Swift, D, C#, Linear Haskell, Go, Eiffel, OCaml effects... is already a great step forward.

Plenty of people keep putting GC languages on the same basket without understanding what they are talking about.

Then if it is a domain where any kind of automatic resource management is impossible due to execution deadlines, or memory availability, Rust is an option.

elcritch

9 months ago

Well Rust also has certain aspects of “automatic resource management”. You can run into execution deadline issues with allocation or reallocated (drop) in Rust. The patterns to avoid this in critical areas is largely the same in any of the languages you listed.

Though I like using effect systems like in Nim or Ocaml for preventing allocation in specific areas.

pjmlp

9 months ago

You can even run into execution deadlines with malloc()/free(), that is why there are companies that made business out of selling specialized versions of them, and nowadays that is only less of a business because there are similar FOSS implementations.

The point is not winning micro-benchmarks games, rather there are specific SLAs for resource consumption and execution deadlines, is the language toolchain able to meet them, when the language features are used as they should, or does it run out of juice to meet those targets?

The recent Guile performance discussion thread is a good example.

If language X does meet the targets, and one still goes for "Rewrite XYZ into ZYW" approach, then we are beyond pure technical considerations.

throwaway2037

9 months ago

    > The recent Guile performance discussion thread is a good example.

Sounds very interesting. Can you share a link?

itishappy

9 months ago

Optimizing Guile Scheme

https://dthompson.us/posts/optimizing-guile-scheme.html

https://news.ycombinator.com/item?id=41600903

redman25

9 months ago

Most GC languages give you limited control over "when" garbage collection or allocation occur. With non-GC'd languages you can at least control them manually based on data structure choice (i.e. arenas), lifetimes, or manual drops.

throwaway2037

9 months ago

Do you count Python, where the ref impl (CPython) uses reference counting, as a GC language? If yes, you have a better idea when GC will occur compared to non-deterministic GC like Java/JVM and C#/CLR.

neonsunset

9 months ago

.NET has a few runtime knobs to control GC behavior at runtime:

GC.TryStartNoGCRegion - https://learn.microsoft.com/en-us/dotnet/api/system.gc.tryst...

GCSettings.LatencyMode - https://learn.microsoft.com/en-us/dotnet/standard/garbage-co...

The latter is what Osu! uses when you load and start playing a map, to sustain 1000hz game loop. Well, it also avoids allocations within it like fire but it's going to apply to any non-soft realtime system.

neonsunset

9 months ago

These are not mutually exclusive. In some GC-based languages these techniques are immediately available and some other languages take more abstracted away approach, relying more on the underlying runtime.

pjmlp

9 months ago

All of those options are available in Swift, D, C#, F#, Linear Haskell, OCaml with effects.

Only pointing out the ones relevant in 2024, and not the whole CS history.

steveklabnik

9 months ago

Then you should use the language that’s memory safe without GC.

Ygg2

9 months ago

Yes, I'm saying in those cases it does make some sense to rewrite it in Rust.

UncleMeat

9 months ago

If you have a codebase that has no security implication and won't receive significant benefit from improved productivity/stability from a more modern language, almost certainly not.

Such projects exist. That's fine.

nine_k

9 months ago

Your engineers are usually your most expensive resource. Developing software in Typescript or even Ruby is a way to get to the revenue faster, having spent less money on development. Development cost and time (that is, the opportunity cost) are usually the most important limitations for a project where the defect rate does not need to be extremely low (like in aircraft control firmware). Rust saves you development time because less of it is spent fixing bugs, but often would pick it not because it saves you RAM and CPU cycles; Haskell or, well, Ada/Spark may be comparably efficient if you can wield them.

hyperman1

9 months ago

This is true, but there is a crossover point where engineers spend more time understanding existing code than writing new code. Crossing it is typically a point where more static languages with more built in checks become cheaper than more dynamic code. In my experience, it takes about a year to reach this point, less if you hire more people.

user

9 months ago

[deleted]

Ygg2

9 months ago

To a point. Let's say you're optimizing a backend written in TS on Amazon, sure it's cheaper to hire guys to optimize it, but at some point it won't be. Either you need some really high class talent to start optimizing shit out of TS or you can't scale as fast as before.

Didn't something similar happened at Discord. It was Go if I recall.

nine_k

9 months ago

Indeed, the larger the scale, the more impact (including the costs of operation) a piece of software has. In a dozen^W^W OK, a thousand places where the scale is large, engineers shave off a percent or two of resource usage, and this saves company money.

But most places are small, and engineers optimize the time to market while remaining at an acceptable levels of resource consumption and support expenses, by using stuff like next.js, RoR, etc. And that saves the company money.

There is also a spectrum in between, with associated hassles of transition as the company grows.

My favorite example is that eBay rewrote their backend three times, and they did it not because they kept building the wrong thing. They kept building the right thing for their scale at the moment. Baby clothes don't fit a grown-up, but wearing baby clothes while a baby was not a mistake.

Ideally, of course, you have a tool that you can keep using from the small prototype stage to the world-scale stage, and it lets you build highly correct, highly performant software quickly. Let's imagine that such a fantastical tool exists. To my mind, the problem is usually not in it but in the architecture: what's efficient at a large scale is uselessly complex at a small scale. The ability to express intricate constraints that precisely match the intricacies of your highly refined business logic may feel like an impediment while you're prototyping and haven't yet discovered what the logic should really be.

In short, more precision takes more thinking, and thinking is expensive. It should be applied where it matters most, and often it's not (yet) the technical details.

pjmlp

9 months ago

Indeed, however I would vouch those rewrites have been more a people issue than technical.

Apparently there is unwillingness to have "high class talent", while starting from scratch in a completly different programming language stack where everyone is a junior, is suddenly ok. Clearly CV driven development decision.

Secondly, in many cases, even if all optimization options on the specific language are exhausted, it is still easier to call into a native library in a lower level language, than a full rewrite from scratch.

geodel

9 months ago

My takeaway was if it is our server/cloud cost we will try to optimize the hell out of it. When it is user / client side cost, fuck the user, we will write the crappiest possible software in JS/TS/Electron whatever and ask user upgrade their machine if they whine too much.

And totally agree about CV driven development. All these rewrite articles looks like they are trying to convince themselves instead if intelligent readers about rewrites.

user

9 months ago

[deleted]

WuxiFingerHold

9 months ago

I don't think it is in itself that surprising, that writing all new code in MSLs leads to less MS vulnerabilities. The likelihood of new code introducing new MS issues is after all higher than to discover existing ones in old code (which still happens of course).

However, it is impressive that from 2019 to 2023 the issues went down by almost 60% without rewriting, just by adding new code in MSLs (Rust mostly, I guess, probably a bit Kotlin). 60% is a massive achievement. I wonder why there's a plateau from 2021 and 2023. The fact that they use extrapolated number for 2024 reveals their intention to show nice numbers, but it would have been more beneficial if they'd have spent the time to analyse and explain the rise from 2022 to 2023.

I'm a bit disappointed that they don't tell us how much of the new MS code is written in which language. Seems like they're using both Rust and Kotlin. Is it 95% Rust? Or "just" 50% Rust? If the Kotlin portion is significant, in what areas do they use it instead of Rust?

infogulch

9 months ago

I'd like to acknowledge that the charts in this article are remarkably clear and concise. A great demonstration of how careful data selection and labeling can communicate the intended ideas so effortlessly that they virtually disappear into the prose.

So the upshot of the fact that vulnerabilities decay exponentially is that the focus should be on net-new code. And spending effort on vast indiscriminate RiiR projects is a poor use of resources, even for advancing the goal of maximal memory safety. The fact that the easiest strategy, and the strategy recommended by all pragmatic rust experts, is actually also the best strategy to minimize memory vulnerabilities according to the data is notably convergent if not fortuitous.

> The Android team has observed that the rollback rate of Rust changes is less than half that of C++.

Wow!

Wowfunhappy

9 months ago

> The answer lies in an important observation: vulnerabilities decay exponentially. They have a half-life. [...] A large-scale study of vulnerability lifetimes2 published in 2022 in Usenix Security confirmed this phenomenon. Researchers found that the vast majority of vulnerabilities reside in new or recently modified code.

It stands to reason, then, that it would be even better for security to stop adding new features when they aren't absolutely necessary. Windows LTSC is presumably the most secure version of Windows.

Animats

9 months ago

Individual bugs triggered in normal operation ought to decay over time on software that is maintained. If bugs cause problems, someone may report them and some fraction of them will be fixed. That's a decay mechanism.

Not-yet exploited vulnerabilities, though, don't have that decay mechanism. They don't generate user unhappiness and bug reports. They just sit there, until an enemy with sufficient resources and motivation finds and exploits them.

There are more enemies in that league than there used to be.

elcritch

9 months ago

Your assertions contradict the Usenix research cited in TFA, which found that the lifetime of vulnerabilities _do_ follow an exponential decay. If it takes longer to find a vulnerability, then its lifetime is longer.

Animats

9 months ago

What the article calls a "vunerability" is something they found internally.

Looking at vulnerabilities that were found from attacks, it looks different. [1] Most vulnerabilities are fixed in the first weeks or months. But ones that aren't fixed within a year hang on for a long time. About 18% of reported vulnerabilities are never fixed.

[1] https://www.tenable.com/blog/what-is-the-lifespan-of-a-vulne...

fanf2

9 months ago

I think that’s about time to deploy fixed code, not about time to discover the vulnerability nor about time to fix the code.

UncleMeat

9 months ago

Vulns aren't just vulns. "Hey, in some weird circumstances we see a uaf here and the system crashes" is the sort of thing you might see in an extremely ordinary crash report while also having significant security implications.

You can also uncover latent vulns over time through fuzzing or by adding new code that suddenly exercises new paths that were previously ill-tested.

Yes, there are some vulns that truly will never get exercised by ordinary interaction and won't become naturally visible over time. But plenty do get uncovered in this manner.

pfdietz

9 months ago

> vulnerabilities decay exponentially

This should be true not just of vulnerabilities, but bugs of any kind. I certainly see this in testing of the free software project I'm involved with (SBCL). New bugs tend to be in parts that have been recently changed. I'm sure you all have seen the same sort of effect.

(This is not to say all bugs are in recent code. We've all seen bugs that persist undetected for years. The question for those should be how did testing miss them.)

So this suggests testing should be focused on recently changed code. In particular, mutation testing can be highly focused on such code, or on code closely coupled with the changed code. This would greatly reduce the overhead of applying this testing.

Google has had a system where mutation testing has been used with code reviews that does just this.

Ygg2

9 months ago

> that it would be even better for security to stop adding new features when they aren't absolutely necessary

Even if features aren't necessary to sell your software, new hardware and better security algorithms or full on deprecation of existing algos will still happen. Which will introduce new code.

wepple

9 months ago

Or an alternative approach: only compile the subset of features you explicitly need.

Obviously there’s a ton of variance in how practical this is any place, but it’s less common than it should be.

adgjlsfhk1

9 months ago

This can be a really bad idea since it drastically increases the risk of users running a compiled combination of features that was never tested.

pjc50

9 months ago

This is usually absolutely horrendous to do, and of course exponentially increases your test workload for every possible combination of feature flags. If you're doing it by #ifdef it has the added effect of making code unreadable.

Only really practical if "features" are "plugins".

pfdietz

9 months ago

Or it's a really great idea, since you now can produce a diversity of software products, all of which should be correct on a subset of features, and all of which can be tested independently. Perhaps bugs can be flushed out that would be latent and undetectable in your standard build.

Having lots of knobs you can tweak is great for randomized testing. The more the merrier.

sieabahlpark

9 months ago

Allow me to introduce a whole new suite of bugs that occur when feature A exists but feature B doesn't.

Congrats you're back to square 1!

wolrah

9 months ago

> Allow me to introduce a whole new suite of bugs that occur when feature A exists but feature B doesn't.

Yeah, but are those bugs security bugs? Memory safety bugs are a big focus because they're the most common kind of bugs that can be exploited in a meaningful way.

Disabling entire segments of code is unlikely to introduce new memory safety bugs. It's certainly likely to find race conditions, and those can sometimes lead to security bugs, but its not nearly as likely as with memory safety bugs.

ReleaseCandidat

9 months ago

> Yeah, but are those bugs security bugs?

If the software is unusable, it doesn't matter if it has security bugs too. Or, to rephrase, the safest software is software nobody uses.

theptip

9 months ago

This is why I advise most folks to not take the latest point release of your language or libraries.

The bleeding edge is where many of the new vulns are. In general the oldest supported release is usually the safest.

The trade-off is when newer versions have features which will add value, of course. But usually a bad idea to take any version that ends “.0” IMO.

gortok

9 months ago

There is a correlation between new code and memory vulnerabilities (a possible explanation is given in the blog post, that vulnerabilities have a half-life that decays rapidly), but why does the blog post indicate causation between the two factors?

There is more than one possible and reasonable explanation for this correlation:

1. New code often relates to new features, and folks focus on new features for vulnerabilities. 2. Older code has been through more real life usage, which can exercise those edge cases where memory vulnerabilities reside.

I’m just not comfortable saying new code causes memory vulnerabilities and that vulnerabilities have a half-life that decays rapidly. That may —- may be true in sheer number count, but doesn’t seem to be true in impact, thinking back to the high-impact vulnerabilities in OSS like the heartbleed bug, and the cache-invalidation bugs for CPUs.

nitwit005

9 months ago

I'd just assume the oldest code is written by the best people, and under less time pressure.

My current company is an example of this. Early code written by founders. Some new code written by contractors under a tight deadline.

anymouse123456

9 months ago

I'm here with you for the downvotes.

This essay takes some interesting data from very specific (and unusual) projects and languages from a very specific (and unusual) culture and stridently extrapolates hard numeric values to all code without qualification.

> For example, based on the average vulnerability lifetimes, 5-year-old code has a 3.4x (using lifetimes from the study) to 7.4x (using lifetimes observed in Android and Chromium) lower vulnerability density than new code.

Given this conclusion, I can write a defect-filled chunk of C code and just let it marinate for 5 years offline in order for it become safe?

I'm pretty sure there are important data in this research and there is truth underneath what is being shared, but the unsupported confidence and overreach of the writing is too distracting for me.

throwaway2037

9 months ago

Why is this downvoted? It raises some interesting and important issues. I saw a big bank's back office settlement system once. 30+ years old with almost no unit tests. It changes very little now and is virtually bug free because people have been fixing bugs in it for 30+ years! When they need to make changes these days, they first write unit tests for existing behaviour, then fix the bug. It is an example of how code can mature with a very low defect rate with limited unit tests.

steveklabnik

9 months ago

I didn't downvote it, but, while I agree that there's reason to be skeptical that this research generalizes, the framing is aggressive.

> stridently extrapolates hard numeric values to all code without qualification.

The sentence they quote as evidence of this directly qualifies that this is from Android and Chromium.

anymouse123456

9 months ago

Please read the quotation more carefully. I appreciate the author calls out the source of the data, but the claims remain overly strong, broad and unqualified.

I concede this may not be the strongest example, but in my opinion, the language throughout the article, starting with the title, makes stronger claims than the evidence provided supports.

I agree with the author, that these are useful projects to use for research. I'm struggling with the lack of qualification when it comes to the conclusions.

Perhaps I missed it, but I also didn't see information about trade-offs experienced in the transition to Rust on these projects.

Was there any change related to other kinds of vulnerabilities or defects?

How did the transition to Rust impact the number of features introduced over a given time period?

Were the engineers able to move as quickly in this (presumably) new-to-them language?

I'm under the impression that it can take many engineers multiple years to begin to feel productive in Rust, is there any measure of throughput (even qualitative) that could be compared before, during and after that period?

I'm hung up on what reads as a sales pitch that implies broad and deep benefits to any software project of any scope, scale or purpose and makes no mention of trade offs or disadvantages in exchange for this incredible benefit.

throwaway2037

9 months ago

Wow, this an excellent follow-up. You shared so many interesting questions. I too want to know the answers! How about you post these as a comment to their blog post? It might spur them to write a follow-up and dive into more details and nuance.

steveklabnik

9 months ago

Oh, a day later, I realized that there have been some of these questions of yours answered, but previously:

> I also didn't see information about trade-offs experienced in the transition to Rust on these projects.

Yeah, I mean that's just not the topic of this particular post. But they have talked about it. Specifically this, from 2023:

https://opensource.googleblog.com/2023/06/rust-fact-vs-ficti...

> I'm under the impression that it can take many engineers multiple years to begin to feel productive in Rust, is there any measure of throughput (even qualitative) that could be compared before, during and after that period?

That is something that people say, but Google found differently. As that post says, "more than 2/3 of respondents are confident in contributing to a Rust codebase within two months or less when learning Rust. Further, a third of respondents become as productive using Rust as other languages in two months or less."

On this one:

> Were the engineers able to move as quickly in this (presumably) new-to-them language?

Regarding "presumably," 13% had Rust experience, but the rest did not.

They say "a third of respondents become as productive using Rust as other languages in two months or less" and "we’ve seen no data to indicate that there is any productivity penalty for Rust relative to any other language these developers previously used at Google."

anymouse123456

9 months ago

Thanks for such a detailed response and especially the new link, it's really helpful.

steveklabnik

9 months ago

As I said, I think you're fine to be skeptical, and there's surely a lot more stuff to be researched in the future, including these questions. I was just trying to speculate on maybe why you got downvotes.

bluGill

9 months ago

The idea is you write bug filled code but someone notices and fixes some of those bugs and soethe code counts as newer but it also has less bugs and eventualay enough bugs are fixed that nobody has to touch it and then the now bug free code gets old.

fn-mote

9 months ago

I actually don't think this is right. Bug fixes in the old code aren't in the "new, memory safe" code. You're fixing the old code as it gets older, reducing the defects.

bluGill

9 months ago

but now the old code has less memory defects

hoten

9 months ago

I guess the article could have made it clearer, but software that is not used or where security issues are not actively fixed, they obviously don't apply to the findings. But that kind of code is uninteresting to talk about in the first place.

benwilber0

9 months ago

> Increasing productivity: Safe Coding improves code correctness and developer productivity by shifting bug finding further left, before the code is even checked in. We see this shift showing up in important metrics such as rollback rates (emergency code revert due to an unanticipated bug).

> The Android team has observed that the rollback rate of Rust changes is less than half that of C++.

I've been writing high-scale production code in one language or another for 20 years. But I when I found Rust in 2016 I knew that this was the one. I was going to double-down on this. I got Klabnik and Carol's book literally the same day. Still have my dead-tree copy.

It's honestly re-invigorated my love for programming.

pclmulqdq

9 months ago

That makes sense because the #1 reason I have had to roll back my own C++ commits is due to crashes from some dumb failure to check whether a null pointer is null. If Rust is going to prevent that issue and other similar issues of stupid coding, you would expect whole classes of rollbacks to go away.

acdha

9 months ago

I now compare C to an IKEA bed we have in our guest room which has storage drawers making the edge straight down to the floor without a gap. I’m a grownup, I know that I need to stop half a step early, but every few weeks I stub a toe while I’m thinking about something else.

ahoka

9 months ago

TBH most of these issues go away when your language has no implicit nullability.

steveklabnik

9 months ago

That’s very kind, thank you.

benwilber0

9 months ago

You're a legend. Thanks for writing The Book. It really affected my life in a very positive way.

j-krieger

9 months ago

I feel entirely the same. I actively miss Rust when I need to choose another language.

ramon156

9 months ago

This is so relatable. Without sounding like a fanboy, Rust makes other languages feel like toy languages.

ahoka

9 months ago

Toy languages, like Haskell, Ocaml, Kotlin and F#?

aloisdg

9 months ago

replace Kotlin with Elixir and I am with you

user

9 months ago

[deleted]

SkyMarshal

9 months ago

They talk about "memory safe languages (MSL)" plural, as if there is more than one, but only explicitly name Rust as the MSL they're transitioning to and improving interoperability with. They also mention Kotlin in the context of improving Rust<>Kotlin interop, which also has some memory-safe features but maybe not to same extent as Rust. Are those the only two Google uses, or are there others they could be referring to?

steveklabnik

9 months ago

A few thoughts:

People who care about this issue, especially in the last few years, have been leaning into a "memory safe language" vs "non memory safe language" framing. This is because it gets at the root of the issue, which is safe by default vs not safe by default. It tries to avoid pointing fingers at, or giving recommendations for, particular languages, by instead putting the focus on the root cause.

In the specific case of Android, the subject of this post, I'm not aware of attempts to move into other MSLs than those. But I also don't follow Android development generally, but I do follow these posts pretty closely, and I don't remember any of them talking about stuff other than Rust or Kotlin.

amluto

9 months ago

> I don't remember any of them talking about stuff other than Rust or Kotlin.

Don’t forget the old, boring one: Java.

I assume the reason that Go doesn’t show up so much is that most Android processes have their managed, GC’d Java-ish-virtual-machine world and their native C/C++ world. Kotlin fits in with the former and Rust fits in with the latter. Go is somewhat of its own thing.

vvanders

9 months ago

Android has a surprising amount of core OS functionality in boring managed Java code. ART/Dalvik are quite impressive combined with a few other clever tricks to make a system that ran in a pretty small footprint.

pjmlp

9 months ago

Including mixed mode JIT/AOT compilation with PGO feedback, which many still aren't aware of.

pdimitar

9 months ago

I would think one of the reasons that Golang is not utilized is its lack of tagged unions. Another might be that it has a runtime and a GC which is typically undesirable for systems (as in: very close to the metal) software.

user

9 months ago

[deleted]

pjmlp

9 months ago

Go doesn't have any role on Android, other than being used on a build system that uses Go as its DSL.

user

9 months ago

[deleted]

nightpool

9 months ago

It's not just Rust—rewriting a C network service into Java or Python or Go is also an example of transitioning to memory safe languages. The point is that you're not exposed to memory safety bugs in your own code. Arguably it's much better to choose a language without Rust-like manual memory management when you don't absolutely need it.

pdimitar

9 months ago

I have chosen Rust over Golang on a number of occasions for a much more boring reason: Golang lacks enums / tagged unions / sum types. When you have to manually eyeball your code to ensure exhaustiveness, it gets old and tiring really fast.

For that reason I'd use OCaml as well even though it has GC, because it has sum types. That is, if I ever learn OCaml properly.

tialaramex

9 months ago

If you need concurrency then depending on exactly what you're doing with it Rust still looks like the right choice. If you can just have it magically work no worries, but the moment you need synchronization primitives or phrases like "thread safe" come into the conversation you're much more likely to get it wrong in the other languages.

alpire

9 months ago

Android talked more about the memory-safe languages they're using in a previous blog post: https://security.googleblog.com/2022/12/memory-safe-language...

Google also published their perspective on memory safety in https://security.googleblog.com/2024/03/secure-by-design-goo..., which also goes over some of the memory-safe languages in use like Java, Go and Rust.

dgacmu

9 months ago

There are many and google uses several - rust, python, java, and go among them. But low-level code for Android has historically been in c++ and Rust is the primary memory-safe replacement for the stuff they're building.

jnwatson

9 months ago

Java and Kotlin are used for apps. Rust is used for new system software.

Across Google, Go is used for some system software, but I haven't seen it used in Android.

GeekyBear

9 months ago

Google attempted to write a network stack in Go for their Fuchsia OS, but it had to be ripped out and rewritten for performance reasons.

https://news.ycombinator.com/item?id=22409838

GeekyBear

9 months ago

There are many memory safe languages, but not many of those are compiled and able to offer performance that is in the same ballpark as C.

Rust and Swift are the two most widely used.

Interestingly, Swift had interoperating with C as an explicit design goal, while Rust had data race safety as a design goal.

Now we have data race safety added in the latest version of Swift, and Rust looking to improve interoperability with C.

Narishma

9 months ago

I was under the impression that Swift was more in the Go/C# bucket rather than Rust/C++ in terms of performance.

GeekyBear

9 months ago

Reference counting doesn't have the same cost as a sweeping garbage collector.

Apple has also optimized their custom ARM core to further reduce the cost.

> retaining and releasing an NSObject takes ~30 nanoseconds on current gen Intel, and ~6.5 nanoseconds on an M1

https://blog.metaobject.com/2020/11/m1-memory-and-performanc...

That said, Swift is working toward adding a future opt-in Rust inspired approache to memory management for those who need it.

https://forums.swift.org/t/manifesto-ownership/5212

https://forums.swift.org/t/a-roadmap-for-improving-swift-per...

pjmlp

9 months ago

Swift's long term roadmap is to be usable in all scenarios Apple currently uses C, Objective-C and C++ for, it is even on its official documentation and product description site.

How much it gets there, depends on squizzing juice out of LLVM backend for Swift code.

GeekyBear

9 months ago

Given that the person who created LLVM also created Swift, the ability to be used as a systems programming language was always there.

Chris Lattner, the creator of both LLVM and Swift, has referred to Swift as “syntactic sugar for LLVM.” They are deeply tied together.

The amount of Swift code in Apple's operating systems has increased every year.

https://blog.timac.org/2023/1019-state-of-swift-and-swiftui-...

Lattner gave an interview looking at the advantages of reference counting over the sort of garbage collection used in languages like Java, C#, and Go while still avoiding error-prone manual memory management.

> ARC has clear advantages in terms of allowing Swift to scale down to systems that can’t tolerate having a garbage collector, for example, if you want to write firmware in Swift. I think that it does provide a better programming model where programmers think just a little bit about memory.

https://atp.fm/205-chris-lattner-interview-transcript#gc

NoahKAndrews

9 months ago

Kotlin is memory-safe. It runs with a GC.

adgjlsfhk1

9 months ago

GC is memory-safe.

saagarjha

9 months ago

That's what they said.

adgjlsfhk1

9 months ago

I can't read. oops.

wmf

9 months ago

Google has always tried to use a small set of languages. For Android they try to use C/C++, Java/Kotlin, and now Rust. The same lessons still apply in rampantly polyglot environments though.

_ph_

9 months ago

That is the separation between abstract considerations and a given projects constraints. In a given project there might be few choices, but if you talk about fundamental phenomena, you have to reason about arbitrary projects. And of course, there are plenty of alternatives to Rust. Even limited to Android, there are several choices, even if they might be a smaller set.

pdimitar

9 months ago

It's impractical to increase the surface of complexity even further. I for one approve that they settled on just one memory-safe language for their new code.

_ph_

9 months ago

Oh sure, for the Linux kernel, Rust makes a lot of sense and one would try to keep the number of languages used minimal. But I was speaking about an abstract context. There are plenty of memory safe languages which could be used. But eventually, one has to decide for one.

throwaway2037

9 months ago

    > memory safe languages

I would say anything that runs on JVM and CLR, and scripting langs, like Python, Perl, Ruby, etc.

Edit: I forgot Golang!

jeffbee

9 months ago

Perhaps they are even considering carbon to be memory safe

ievans

9 months ago

So the argument is because the vulnerability lifetime is exponentially distributed, focusing on secure defaults like memory safety in new code is disproportionately valuable, both theoretically and now evidentially seen over six years on the Android codebase.

Amazing, I've never seen this argument used to support shift/left secure guardrails but it's great. Especially for those with larger, legacy codebases who might otherwise say "why bother, we're never going to benefit from memory-safety on our 100M lines of C++."

I think it also implies any lightweight vulnerability detection has disproportionate benefit -- even if it was to only look at new code & dependencies vs the backlog.

naming_the_user

9 months ago

I'm a little uneasy about the conclusions being drawn here as the obvious counterpoint isn't being raised - what if older code isn't being looked at as hard and therefore vulnerabilities aren't being discovered?

It's far more common to look at recent commit logs than it is to look at some library that hasn't changed for 20 years.

MBCook

9 months ago

> what if older code isn't being looked at as hard and therefore vulnerabilities aren't being discovered?

It wasn’t being look at as hard before either. I don’t think that’s changed.

They don’t give a theory for why older code has fewer bugs, but I’ve got one: they’ve been found.

If we assumed that any piece of code has a fixed amount of unknown bugs per 1000 lines, it stands to reason that overtime the sheer number of times the code is run with different inputs in prod makes it more and more likely they will be discovered. Between fixing them and the code reviews while fixing them the hope would be that on average things are being made better.

So overtime, there are fewer bugs per thousand lines in existing code. It’s been battle tested.

As the post says, if you continue introducing new bugs at the same rate you’re not going to make progress. But if using a memory safe language means you’re introducing fewer bugs in new features then overtime the total number of bugs should be going down.

pacaro

9 months ago

I've always thought of this as being equivalent to "work hardening"

My concern with it is more about legitimately old code (android is 20ish years old, so reasonably falls into this category) which was written using standards and tools of the time (necessarily)

It requires a constant engineering effort to keep such code up to date. And the older code is, typically, less well understood.

In addition older code (particularly in systems programming) is often associated with older requirements, some of which may have become niche over time.

That long tail of old, less frequently exercised, code feels like it may well have a sting in its tail.

The halflife/work-hardening model depends on the code being stressed to find bugs

kernal

9 months ago

Android was released on September 23, 2008, so it just had its sweet 16.

pacaro

9 months ago

I believe that they started writing it in 2003. It's hard to precisely age code unless you cut it down and count the number of rings

SoylentOrange

9 months ago

I don’t understand this point. The project under scrutiny is Android and people are detecting vulnerabilities both manually and automatically based on source code/binary, not over commit logs. Why would the commit logs be relevant at all to finding bugs?

The commits are just used for attribution. If there was some old lib that hasn’t been changed in 20 years that’s passed fuzzing and manual code inspection for 20 years without updates, chances are it’s solid.

saagarjha

9 months ago

Exploit authors look at commit logs because new features have bugs in them, and it's easier to follow that to find vulnerabilities than dive into the codebase to find what's already there.

e28eta

9 months ago

I wasn’t entirely satisfied with the assertion that older code has fewer vulnerabilities either. It feels like there could be explanations other than age for the discrepancy.

For example: maybe the engineers over the last several years have focused on rewriting the riskiest parts in a MSL, and were less likely to change the lower risk old code.

Or… maybe there was a process or personnel change that led to more defects.

With that said, it does seem plausible to me that any given bug has a probability of detection per unit of time, and as time passes fewer defects remain to be found. And as long as your maintainers fix more vulnerabilities than they introduce, sure, older code will have fewer and the ones that remain are probably hard to find.

mccr8

9 months ago

Their concern is not with theoretical vulnerabilities, but actual ones that are being exploited. If an attacker never tries to find a vulnerability in some code, then it might as well not have it.

daft_pink

9 months ago

I’m curious how this applies to Mac vs Windows, where most newer Mac code is written in memory safe swift, while Windows still uses primarily uses C or C++.

akyuu

9 months ago

Apple is still adding large amounts of new Objective-C code in each new macOS version [0].

I haven't found any language usage numbers for recent versions of Windows, but Microsoft is using Rust for both new development and rewriting old features [1] [2].

[0] Refer to section "Evolution of the programming languages" https://blog.timac.org/2023/1128-state-of-appkit-catalyst-sw...

[1] https://www.theregister.com/2023/04/27/microsoft_windows_rus...

[2] https://www.theregister.com/2024/01/31/microsoft_seeks_rust_...

TazeTSchnitzel

9 months ago

It should be noted that Objective-C code is presumably a lot less prone to memory safety issues than C code on average, especially since Apple introduced Automatic Reference Counting (ARC). For example:

• Use-after-frees are avoided by ARC

• Null pointer dereferences are usually safe (sending a message to nil returns nil)

• Objective-C has a great standard library (Foundation) with safe collections among many other things; most of C's dangerous parts are easily avoided in idiomatic Objective-C code that isn't performance-critical

But a good part of Apple's Objective-C code is probably there for implementing the underlying runtime, and that's difficult to get right.

saagarjha

9 months ago

Most of Apple's Objective-C code is in the application layer just like yours is

daft_pink

9 months ago

I found the evolution of programming languages article you sourced very interesting.

Just to summarize the article, it shows that writing completely new code in memory safe language, while maintaining non-memory safe code, results in a steep reduction in memory safe errors overtime even though it results in an overall increase in unsafe code. It says that most memory safe vulnerabilities come from completely new code not maintained code and thus argues you can get the most of the benefits of memory safe code without rewriting your entire code base, which I think is the main takeaway from the article.

I’m not sure that’s totally happening in MacOS from reading your article, but it kind of is, so I think my hypothesis is correct that MacOS will likely have less vulnerabilities as it transitions many newer projects to swift although its important to note that important vulnerable projects such as webkit are still written in C++.

munificent

9 months ago

> while Windows still uses primarily uses C or C++.

Do you have data for that? My impression is that a large fraction of Windows development is C# these days. Back when I was at EA, nearly fifteen years ago, we were already leaning very heavily towards C# for internal tools.

pjmlp

9 months ago

WinDev culture is largely C++ and they aren't in any rush to change that, other than some incremental use of Rust.

WinRT is basically COM and C++, nowadays they focus on C# as consumer language, mostly because after killing C++/CX, they never improved the C++/WinRT developer experience since 2016, and only Windows teams use it, while the few folks that still believe WinUI has any future rather reach out for C#.

If you want to see big chuncks of C# adoption you have to look into business units under Azure org chart.

tsujamin

9 months ago

Linking to my thread from yesterday, there’s definitely memory safe systems programming happening over there: https://news.ycombinator.com/item?id=41642788

0xDEAFBEAD

9 months ago

Trying to think through the endgame here -- As vulnerabilities become rarer, they get more valuable. The remaining vulnerabilities will be jealously hoarded by state actors, and used sparingly on high-value targets.

So if this blog post describes the 4th generation, perhaps the 5th generation looks something like Lockdown Mode for iOS. Let users who are concerned with security check a box that improves their security, in exchange for decreased performance. The ideal checkbox detects and captures any attack, perhaps through some sort of virtualization, then sends it to the security team for analysis. This creates deterrence for the attacker. They don't want to burn a scarce vulnerability if the user happens to have that security box checked. And many high-value targets will check the box.

Herd immunity, but for software vulnerabilities instead of biological pathogens.

Security-aware users will also tend to be privacy-aware. So instead of passively phoning home for all user activity, give the user an alert if an attack was detected. Show them a few KB of anomalous network activity or whatever, which should be sufficient for a security team to reconstruct the attack. Get the user to sign off before that data gets shared.

musicale

9 months ago

"The net result is that a PL/I programmer would have to work very hard to program a buffer overflow error, while a C programmer has to work very hard to avoid programming a buffer overflow error."

https://www.acsac.org/2002/papers/classic-multics.pdf

pjmlp

9 months ago

UNIX being free beer is the root cause of what we are still trying to fix.

musicale

9 months ago

Are you implying that proprietary UNIX wasn't written in C? Or that (like the internet perhaps) it was basically a research prototype with minimal support for security or reliability that was accidentally (and perhaps inadvisedly) turned into something that people use and depend on?

kernal

9 months ago

>Note that the data for 2024 is extrapolated to the full year (represented as 36, but currently at 27 after the September security bulletin).

The reduction of memory safety bugs to a projected 36 in 2024 for Android is extremely impressive.

cakoose

9 months ago

What happens if we gradually transition to memory-safe languages for new features, while leaving existing code mostly untouched except for bug fixes?

...

In the final year of our simulation, despite the growth in memory-unsafe code, the number of memory safety vulnerabilities drops significantly, a seemingly counterintuitive result [...]

Why would this be counterintuitive? If you're only touching the memory-unsafe code to fix bugs, it seems obviously that the number of memory-safety bugs will go down.

Am I missing something?

cogman10

9 months ago

The counter intuitive part is that there is now more code written in memory unsafe languages than there was before. Even if it's just bug fixing.

It's not as if bug fixes haven't resulted in new memory bugs, but apparently that rate is much lower in bug fixes than it is in brand new code.

MBCook

9 months ago

I think the standard assumption would be that you need to start replacing older code with memory safe code to see improvements.

Instead they’ve shown that only using memory safe languages for new code is enough for the total bug count to drop.

Stem0037

9 months ago

The idea of "Safe Coding" as a fundamental shift in security approach is intriguing. I'd be interested in hearing more about how this is implemented in practice.

alpire

9 months ago

For more information on our safe coding approach, as applied to the web domain, check out this paper (https://static.googleusercontent.com/media/research.google.c...) or this talk (https://www.youtube.com/watch?v=ccfEu-Jj0as).

Animats

9 months ago

Half a century since Pascal. Forty years since Ada. 28 years since Java. Fifteen years since Go. Ten years since Rust. And still unsafe code is in the majority.

wmf

9 months ago

You can't really write a library in Java or Go that a C program can use. The real magic is memory safety without GC and a huge runtime. But if you point that out people will call you a fanboy.

archargelod

9 months ago

Am I missing something or can I just write such library in a language with RAII or ARC (Nim, Swift) memory management? I don't understand why Rust people believe Rust is the only option.

wmf

9 months ago

Rust isn't literally the only option but it seems to be the only mainstream option.

MBCook

9 months ago

Is it?

If you add up all the JavaScript, C#, Java, Python, and PHP being written every year, that’s a lot of code.

Are we sure that all that combined isn’t more than C/C++? Or at least somewhat close?

tiffanyh

9 months ago

Doesn't libc interface requirement of C/C++ use, create massive downstream challenges for other languages to gain super mass adoption (at the OS level).

sanxiyn

9 months ago

Yes, but Linux (hence Android) doesn't have that problem, because its interface is system call not libc.

saagarjha

9 months ago

The practical interface for Android is Binder, which has an interface that can be made amendable to a richer language.

tiffanyh

9 months ago

But don't syscalls require C/C++ data structures & type definitions.

So while not technically "requiring" C/C++, if your language cannot map exactly to C/C++ data structs & type definitions - it won't work.

nineteen999

9 months ago

Yes.That's a problem for the contenders, Linux/UNIX kernel are written in C. Unless we want to add language-specific syscall interfaces for every compiled language out there.

Alternative is to, y'know, write a kernel in your language of choice and choose your own syscall specification suiting that language, and gain mass adoption. Easy!

cyberax

9 months ago

Neither Pascal nor Ada are memory-safe.

docandrew

9 months ago

It’s possible to write unsafe code in either but it’s also much easier to write safe code in both Pascal and Ada than C/C++. And it’s easier to write safe code in C++ than C. Memory safety exists in a spectrum, it’s not all or nothing.

pjmlp

9 months ago

More than C or C++ will ever be.

It is like complaining a bullet vest doesn't protect against heavy machine gun bullets.

cyberax

9 months ago

Not really. Standard Pascal's dynamic features were barely present, but even they allowed dangling references to exist.

And all practical Pascal clones (e.g. Object Pascal) had to grow constructor/destructor systems to cleanup the memory on de-allocation (or they went the garbage collection route). So they were on-par with C++ for safety.

Ada is similar. It provided safety only for static allocations with bounds known at the compile-time. Their version of safe dynamic allocations basically borrowed Rust's borrow checker: https://blog.adacore.com/using-pointers-in-spark

pjmlp

9 months ago

No one used Standard Pascal, just like everyone embraces their Compiler C extensions and many of them are blissfully unaware their code fails to compiler under strict ISO C conformance mode.

We are beyond Ada 83, Controlled Types and SPARK exist, and Rust also does runtime bounds checking, so what.

pjmlp

9 months ago

60 years since NEWP/ESPOL/JOVIAL.

ESPOL is one of the first recorded uses of unsafe code blocks, 1961.

user

9 months ago

[deleted]

mpweiher

9 months ago

Well, that tells us something...

xyst

9 months ago

[flagged]

user

9 months ago

[deleted]

wepple

9 months ago

We love it. Can move on to the other two hundred problems to work on.

Including where CNE will go next; logic and web bugs.

dataking

9 months ago

well put! making sure that the new, memory safe code interoperates with the old code and is equally well supported by tooling takes an incredible amount of work.

0xbadcafebee

9 months ago

So there's a C program. There's a bunch of sub-par programmers, who don't use the old, well documented, stable, memory-safe functions and techniques. And they write code with memory safety bugs.

They are eventually forced to transition to a new language, which makes the memory safety bugs moot. Without addressing the fact that they're still sub-par, or why they were to begin with, why they didn't use the memory safe functions, why we let them ship code to begin with.

They go on to make more sub-par code, with more avoidable security errors. They're just not memory safety related anymore. And the hackers shift their focus to attack a different way.

Meanwhile, nobody talks about the pink elephant in the room. That we were, and still are, completely fine with people writing code that is shitty. That we allow people to continuously use the wrong methods, which lead to completely avoidable security holes. Security holes like the injection attacks, which make up 40% of all CVEs now, when memory safety only makes up 25%.

Could we have focused on a default solution for the bigger class of security holes? Yes. Did we? No. Why? Because none of this is about security. Programmers just like new toys to play with. Security is a red herring being used to justify the continuation of allowing people to write shitty code, and play with new toys.

Security will continue to be bad, because we are not addressing the way we write software. Rather than this one big class of bugs, we will just have the million smaller ones to deal with. And it'll actually get harder to deal with it all, because we won't have the "memory safety" bogey man to point at anymore.

pornel

9 months ago

People have a finite amount of time and effort they can spend on making the code correct. When the language is full of traps, spec gotchas, antiquated misfeatures, gratuitous platform differences, fragmented build systems, then a lot of effort is wasted just on managing all of that nonsense that is actively working against writing robust code, and it takes away from the effort to make a quality product beyond just the language-wrangling.

You can't rely on people being perfect all the time. We've been trying that for 50 years, and only got an endless circle of CVEs and calls to find better programmers next time.

The difference is how the language reacts to the mistakes that will happen. It could react with "oops, you've made a mistake! Here, fix this", and let the programmer apply a fix and move on, shipping code without the bug. Or the language could silently amplify smallest mistakes in the least interesting code into corruption that causes catastrophic security failures.

When concatenating strings and adding numbers securely is a thing that exists, and a thing that requires top-skilled programmers, you're just wasting people's talent on dumb things.

tptacek

9 months ago

No, this just isn't how it works. You'll find business logic, domain-specific, and systems programming security errors on every platform, but you don't find them with the density and the automatic severity you do memory safety issues. This is not about "language hipsters".

0xbadcafebee

9 months ago

If the impetus behind moving to new languages was primarily focused on improving security, we could have first attempted to address the underlying cause of programming security issues. The underlying cause is not "it's hard to check buffer boundaries". If you don't believe me, do a 5 Whys on the average buffer overflow, and tell me the best solution is still a new language.

steveklabnik

9 months ago

> we could have first attempted to address the underlying cause of programming security issues.

A lot of the recent moves here are motivated by multiple large industry players noticing a correlation between security issues and memory-unsafe language usage. "70% of security vulnerabilities are due to memory unsafety" is a motivating reason to move towards memory safe languages.

What do you believe the underlying cause to be?

0xbadcafebee

9 months ago

Security bugs are just a quality defect. Quality defects have many causes, but they all roll up to three root causes: 1) lack of a process to ensure quality, 2) a process which does not address the defect, and 3) inability to follow the process.

If there is no process to eliminate the defects, then they will always appear. Based on my knowledge and experience of this kind of software development, I have noticed that a majority of the time, there is no process applied to the development which would eliminate these defects. We just expect, or hope, that the developers are smart enough and rigorous enough to do it by themselves without anyone asking. Well, in the real world, you can't just hope for quality. The results speak for themselves.

Even when there is a process applied to eliminate the defect, it is often not effective. Either the process doesn't go far enough (many still don't even use Valgrind), or the people applying the process either don't follow it, or not well enough.

Many in the industry have decided that the solution to these defect issues is to use a new system [programming language] which avoids the need for the process to begin with. That's not a bad idea in principle. But they've only done that for a single class of quality defect. The rest of the security bugs, and every other kind of defect imaginable, is still there. And new ones will appear over time.

So rather than play whack-a-mole very slowly designing brand new systems to eliminate one single class of defect at a time, my proposal is we change the way we do development fundamentally. Regardless of the system in use, we should be able to define processes, ensure they are applied correctly, and continuously improve them, to eliminate quality defects.

This would not only solve security issues, but all kinds of bugs. If this method was standard practice (and mandatory), the Crowdstrike issue never would have happened, because they would have been required to be looking for quality defects and eliminating them.

tptacek

9 months ago

Security bugs are a distinct kind of defect and can't be reasoned about the same was as ordinary bugs, because ordinary bugs don't have intentional adversaries driving them. It makes sense to accept the risk of some kinds of bugs in ways you generally can't with security bugs (without some other mitigation in place).

steveklabnik

9 months ago

I fully agree that if you want the highest quality outcomes, you have to consider the whole process.

However, I don't see that as being in conflict with using tools that can also help you achieve higher quality within a certain stage of that process.

0xbadcafebee

9 months ago

The problem isn't the new tool, it's wasting an opportunity to have a more significant impact. The same amount of time, money, and resources that's going to be invested in this, could have been invested in a larger solution that still would have solved the immediate goals, as well as many others. We're pissing away the chance to improve not only more security but reliability of software as a whole. This is going to distract people for years away from other issues, and the result will be real world harm that could have been prevented. All because one class of bug is meme-worthy.

tptacek

9 months ago

"It's hard to check buffer boundaries" is memory corruption circa 1998.

alpire

9 months ago

> You have a bunch of sub-par programmers, who don't use the old, well documented, stable, memory-safe functions and techniques. They write code with memory safety bugs.

We should really stop putting the blame on developers. The issue is not that developers are sub-par, but that they are provided with tools making it virtually impossible to write secure software. Everyone writes memory safety bugs when using memory-unsafe languages.

And one helpful insight here is that the security posture of a software application is substantially an emergent property of the developer ecosystem that produced it, and that includes having secure-by-design APIs and languages. https://queue.acm.org/detail.cfm?id=3648601 goes into more details on this.

bcoates

9 months ago

I mostly agree with you but take the opposite position: attacking memory safety bugs has been so successful that we should use the same pattern on other large classes of bugs.

It's absolutely possible to write a reasonably usable language that makes injection/escaping/pollution/datatype confusion errors nearly impossible, but it would involve language support and rewriting most libraries--just like memory safety did. Unfortunately we are moving in the opposite direction (I'm still angry about javascript backticks, a feature seemingly designed solely to allow the porting of php-style sql injection errors)

0xbadcafebee

9 months ago

Take that to its logical end: a language designed with high-level functions to automate all possible work that, if left to a human to craft, would expose security vulnerabilities.

You know what you have there? A new method for developing software.

If we just switch languages every time we have another class of vuln that we want to eliminate, it will take us 1,000 years to get them all.

Or we could just get them all today by fundamentally rethinking how we write software.

The problem isn't software. The problem is humans writing software. There are bugs because we are using a highly fallible process to create software. If we want to eliminate the bugs, we have to change how the software is created.

jnwatson

9 months ago

It is 100-1000 times harder to find application-level security bugs than it is to find memory safety bugs.

It is also far easier to apply formal analysis to code if you don't have to model arbitrary pointers.

Android has hard evidence that just eliminating memory safety makes a significant difference.

saagarjha

9 months ago

> There's a bunch of sub-par programmers, who don't use the old, well documented, stable, memory-safe functions and techniques.

…which are?

pjc50

9 months ago

> old, well documented, stable, memory-safe functions and techniques.

I'm sorry, are we still talking about C here? Where the old functions are the ones that are explicitly labelled as too dangerous to use, like strcmp()?

user

9 months ago

[deleted]

uecker

9 months ago

I think it is somewhat about security. The "Nobody can write secure C/C++ code" nonsense translates to "We are not at fault that are products are broken garbage, because it is simply impossible to write secure code." Now we can pretend to fix "a whole class of errors" (as long as there is no "unsafe" anywhere) by imposing Rust on programmers (and the open-source community that then will produce memory safe code for free for some to use) and while this may indeed help a bit, one can hope to avoid being really being held responsible for product security for another decade or so.

uid65534

9 months ago

Having the tool actively prevent classes of errors is a worthwhile endeavor, but I do agree it gets overly focused on alone when several other _massive_ classes of vulnerabilities continue to be introduced. At a high level though, it is a lot easier to just have a framework enforce 'x' on all devs to raise the minimum bar. It doesn't help that the average Rust-bro, from which a lot of these memory safety arguments come from, is utterly terrible at making that case in reality. Case example: https://github.com/pyca/cryptography/issues/5771.

I think a lot of the arguments around C++ for example being 'memory unsafe' is a bit ridiculous because its trivial to write memory safe C++. Just run -Wall and enforce the use of smart pointers, there are nearly zero instances in the modern day where you should be dealing with raw pointers or performing offsets that lead to these bugs directly. The few exceptions are hopefully with devs that are intelligent enough to do so safely with modern language features. Unfortunately, this rarely gets focused on by security teams it seems since they are instead chasing the newest shiny language like you mention.

alpire

9 months ago

> its trivial to write memory safe C++

It is not unfortunately. That's why we see memory safety being responsible for 70% of severe vulns across many C and C++ projects.

Some of the reasons include: - C++ does little to prevent out-of-bounds vulns - Preventing use-after-free with smart pointers requires heavy use of shared pointers, which often incurs a performance cost that is unacceptable in the environment C++ is used.

Dylan16807

9 months ago

> It is not unfortunately. That's why we see memory safety being responsible for 70% of severe vulns across many C and C++ projects.

I don't think that's really a rebuttal to what they're trying to say. If the vast majority of C++ devs don't follow those two rules, then that's not much evidence against those two rules providing memory safety.

adgjlsfhk1

9 months ago

Right but the reason they don't follow those 2 rules is that using them would require not using most C++ libraries that don't follow the rules, and would introduce performance regressions that negate the main reason they chose C++ in the first place.

Dylan16807

9 months ago

This entire post is about a gradual transition. You don't have to avoid libraries that break the rules, you just have to accept that they're outside the safe zone.

For performance, you'll have to be more specific about your shared pointer claims. But I bet that it's a very small fraction of C++ functions that need the absolute best performance and can't avoid those performance problems while following the two rules.

drivebycomment

9 months ago

> its trivial to write memory safe C++.

Very bold claim, and as such, it needs substantial evidence, as there is practically no meaningful evidence to support this. There are some real world non-trivial c++ code that are known to have very few defects, but almost all of them required extremely significant effort to get there.

roca

9 months ago

What does "enforce the use of smart pointers" actually mean? Idiomatic C++ uses "this" and references ubiquitously, both of which are vulnerable to use-after-free bugs just like raw pointers.