Eliminating Memory Safety Vulnerabilities at the Source

307 points, posted a day ago
by coffeeaddict1

153 Comments

steveklabnik

a day ago

This is a very interesting post! One takeaway is that you don't need to re-write the world. Transitioning new development to a memory safe language can bring meaningful improvements. This is much easier (and cheaper) than needing to port everything over in order to get an effect.

gary_0

20 hours ago

In fact, these results imply that the benefits of re-writing the world are limited in terms of security. This strengthens the cost-benefit case for keeping mature legacy code and only using memory-safe languages for new code.

This also implies that languages and tooling with robust support for integrating with unsafe legacy code are even more desirable.

Xylakant

15 hours ago

Essentially, this has been Rust's value proposition from the outset: build a language that you can integrate into other codebases seamlessly, hence the choice of no runtime, no garbage collector, etc. Bindgen (https://github.com/rust-lang/rust-bindgen) and similar tooling have been around essentially since day one to assist in that.

It’s the only approach that has any chance of transitioning away from unsafe languages for existing, mature codebases. Rewriting entirely in a different language is not a reasonable proposition for every practical real-world project.

dadrian

15 hours ago

Rust had to be dragged kicking and screaming into integration with other languages, and its C++ compatibility is a joke compared to Swift.

It's absolutely true that you need integration and compatibility to enable iterative improvements, but Rust historically has been hostile to anything besides unsafe C ABI FFI, which is not suitable for the vast majority of incremental development that needs to happen.

Luckily, this is starting to change.

Philpax

13 hours ago

"Hostile" is assigning intent that was not present. C++ ABI integration was and is extremely difficult; you need to be able to fully handle C++ semantics and to target each platform's take on the ABI. Most C++ competitors have struggled with this for much the same reason.

This means that "solving this" requires partial integration of a C++ compiler; it's not a coincidence that the languages with the most success have been backed by the organisations that already had their own C++ compiler.

A much easier solution is to generate glue on both sides, which is what `cxx` does.
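For readers who haven't seen it, here's roughly what that two-sided glue looks like. This is a minimal sketch of a `cxx` bridge module; the header path, `Logger` type, and function names are all hypothetical, and the C++ implementation plus the `cxx_build` step are omitted, so it won't link as-is:

    #[cxx::bridge]
    mod ffi {
        // A shared struct: cxx generates a matching C++ definition,
        // so both sides agree on the layout.
        struct LogEntry {
            level: i32,
            message: String,
        }

        unsafe extern "C++" {
            // Hypothetical C++ header implementing Logger.
            include!("mylib/include/logger.h");

            type Logger;

            fn new_logger() -> UniquePtr<Logger>;
            fn write_entry(logger: &Logger, entry: &LogEntry);
        }

        extern "Rust" {
            // Exposed to C++ via a generated header.
            fn format_entry(entry: &LogEntry) -> String;
        }
    }

    fn format_entry(entry: &ffi::LogEntry) -> String {
        format!("[{}] {}", entry.level, entry.message)
    }

The key design point: neither language pretends to speak the other's full object model. Both agree on a restricted vocabulary (shared structs, opaque types, `UniquePtr`), and the generated glue enforces it at compile time on both sides.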

bluGill

3 hours ago

If the intent was to interoperate with C++, then not supporting the API is hostile. I have a lot of code using vector; if I have to convert that to arrays, then you are hostile, and worse, that is new code, which per the article makes it suspect.

D does support the C++ ABI. It seems almost dead now, but it is possible.

Realistically there are two C++ ABIs in the world: Itanium and MSVC. Both are known well enough that you can implement them if you want (it is tricky).

Philpax

an hour ago

> If the intent was to interoperate with C++, then not supporting the API is hostile. I have a lot of code using vector; if I have to convert that to arrays, then you are hostile, and worse, that is new code, which per the article makes it suspect

It's not hostile to not commit resources to integrating with another language. It might be shortsighted, though.

> D does support the C++ ABI. It seems almost dead now, but it is possible.

That was made possible by Digital Mars's existing C++ compiler and their ability to integrate with it / borrow from it. Rust can't take the same path. Additionally, D's object model is closer to C++ than Rust's is; it's not a 1:1 map, but the task is still somewhat easier. (My D days are approaching a decade ago, so I'm not sure what the current state of affairs is.)

> Realistically there are two C++ ABIs in the world: Itanium and MSVC. Both are known well enough that you can implement them if you want (it is tricky)

The raw ABI is doable, yeah - but how do you account for the differences in how the languages work? As a simple example - C++ has copy constructors, Rust doesn't. Rust has guarantees around lifetimes, C++ doesn't. What does that look like from a binding perspective? How do you expose that in a way that's amenable to both languages?

`cxx`'s approach is to generate lowest-common denominator code that both languages can agree upon. That wouldn't work as a language feature because it requires buy-in from the other side, too.

SkiFire13

9 hours ago

> Rust had to be dragged kicking and screaming into integration with other languages

Rust has had integration with the C ABI(s) since day 1, which makes sense because the C ABI is effectively the universal ABI.

> and its C++ compatibility is a joke

I'm not sure why you would want Rust to support interoperability with C++, though; they are very different languages. Moreover, why does this fall on Rust? C++ has yet to support integration with Rust!

> compared to Swift

Swift bundled a whole C++ compiler (Clang) to make it work, that's a complete deal breaker for many.

> Rust historically has been hostile to anything besides unsafe C ABI FFI

Rust historically has been hostile to making the Rust ABI stable, but not to introducing other optional ABIs. Introducing e.g. a C++ ABI has just not been done because there's no mapping for many features like inheritance, move constructors, and so on. Ultimately the problem is that most ABIs are either so simple that they're the same as the C ABI, or they carry so many features that it's not possible to map them onto a different language.
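To make the day-one C ABI point concrete, here is a minimal sketch of both FFI directions. `strlen` is the real libc function; `rust_add` is a made-up export:

    use std::ffi::CStr;
    use std::os::raw::c_char;

    // Importing: declare the C function and call it (unsafely).
    extern "C" {
        fn strlen(s: *const c_char) -> usize; // from libc
    }

    // Exporting: C code can call this as `int rust_add(int a, int b);`
    #[no_mangle]
    pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
        a + b
    }

    fn main() {
        let s = CStr::from_bytes_with_nul(b"hello\0").unwrap();
        let n = unsafe { strlen(s.as_ptr()) };
        println!("strlen = {n}, rust_add = {}", rust_add(2, 3));
    }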

maxk42

5 hours ago

I would like to point out also that C++ is not yet compatible with itself. Even GCC has dozens of C++ features it doesn't implement, even though it was the first compiler to implement 100% of C++11: https://gcc.gnu.org/projects/cxx-status.html

bluGill

3 hours ago

Technically true but in the real world those are rare edge cases.

maxk42

2 hours ago

Only because they're not used. If you're not going to implement the spec then is it really a spec?

bluGill

3 hours ago

I have millions of lines of C++. I'd love to use something else, but without interoperability it is much harder. If you only write C you are fine. If you write Java you want Java interoperability; similarly for Lisp, or whatever your existing code is in.

steveklabnik

8 hours ago

Rust was created with the intention of being integrated into a very large C++ codebase.

I do agree that they could do more. I find it kind of wild that they haven’t copied Zig’s cross compilation story for example. But this stuff being just fine instead of world class is more of a lack of vision and direction than a hostility towards the idea. Indifference is not malice.

Also, you seem to be ignoring the crABI work, which is still not a thing, of course, but certainly isn't "actively hostile towards non-C ABIs".

samatman

6 hours ago

I think they had to get over the knee-jerk reaction that Zig was a, quote, "massive step back for the industry", frowny face.

It seems they have, which is all to the good: the work to make custom allocators practical in Rust is another example. Rust still doesn't have anything like `std.testing.checkAllAllocationFailures`, but I expect that it will at some future point. Zig certainly learned a lot from Rust, and it's good that Rust is able to do the same.

Zig is not, and will not be, a memory-safe language. But the sum of the decisions which go into the language make it categorically different from C (the language is so different from C++ as to make comparisons basically irrelevant).

Memory safety is a spectrum, not a binary, or there wouldn't be "Rust" and "safe Rust". What Rust has done is innovative, even revolutionary, but I believe this also applies to Zig in its own way. I view them as representing complementary approaches to the true goal, which is correct software with optimal performance.

steveklabnik

5 hours ago

While Patrick played a critical role in Rust's history, he said that years after he stopped working on Rust. I haven't seen people still involved share strong opinions about Zig in either direction, to be honest.

I've known Andrew for a long time, and am a big fan of languages learning from each other. I have always found that the community of people working on languages is overall much more collegial to each other than their respective communities can be to each other, as a general rule. Obviously there are exceptions.

nindalf

12 hours ago

> Rust had to be dragged kicking and screaming into integration

On what basis do you make this flagrant claim? There is certainly an impedance mismatch between Rust and C++, but "kicking and screaming" implies a level of malice that doesn't exist and never existed.

Such a low quality comment.

bobajeff

10 hours ago

>has been hostile to anything besides unsafe C ABI FFI

As opposed to integrating with a whole C++ frontend? Gee, I wonder why more languages haven't done this obvious idea of integrating a whole C++ frontend within their build system.

pjc50

13 hours ago

Rewrites are so expensive that they're going to be very rare. But incremental nibbling at the edges is very effective.

I wonder if there's a "bathtub curve" effect for very old code. I remember when a particularly serious openssl vulnerability (heartbleed?) caused a lot of people to look at the code and recoil in horror at it.

Ygg2

19 hours ago

From a security perspective I agree, but what if you want to be rid of GC or just reduce your overall resource consumption?

pjmlp

17 hours ago

Learning how to actually use the language features that don't rely on GC, as in Swift, D, C#, Linear Haskell, Go, Eiffel, OCaml effects... is already a great step forward.

Plenty of people keep putting GC languages in the same basket without understanding what they are talking about.

Then if it is a domain where any kind of automatic resource management is impossible due to execution deadlines, or memory availability, Rust is an option.

elcritch

17 hours ago

Well, Rust also has certain aspects of "automatic resource management". You can run into execution-deadline issues with allocation or deallocation (drop) in Rust. The patterns to avoid this in critical areas are largely the same in any of the languages you listed.

Though I like using effect systems like in Nim or OCaml for preventing allocation in specific areas.

pjmlp

16 hours ago

You can even run into execution deadlines with malloc()/free(); that is why there are companies that made a business out of selling specialized versions of them, and nowadays that is less of a business only because there are similar FOSS implementations.

The point is not winning micro-benchmark games. Rather, there are specific SLAs for resource consumption and execution deadlines: is the language toolchain able to meet them when the language features are used as they should be, or does it run out of juice to meet those targets?

The recent Guile performance discussion thread is a good example.

If language X does meet the targets, and one still goes for the "Rewrite XYZ into ZYW" approach, then we are beyond pure technical considerations.

redman25

9 hours ago

Most GC languages give you limited control over "when" garbage collection or allocation occurs. With non-GC'd languages you can at least control them manually based on data structure choice (e.g. arenas), lifetimes, or manual drops.
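A minimal sketch of what that manual control looks like in Rust; the workload here is made up, but the point is that deallocation happens exactly where `drop` is written, not at some collector-chosen moment:

    fn main() {
        // Large scratch buffer used only during setup.
        let scratch = vec![0u8; 1024 * 1024];
        let checksum: u64 = scratch.iter().map(|&b| b as u64).sum();

        // Free it deterministically, *before* the latency-sensitive part,
        // rather than at the end of scope.
        drop(scratch);

        // Latency-sensitive work runs here with no pending deallocation
        // (and no GC pause, because there is no GC).
        println!("checksum: {checksum}");
    }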

throwaway2037

8 hours ago

Do you count Python, where the ref impl (CPython) uses reference counting, as a GC language? If yes, you have a better idea when GC will occur compared to non-deterministic GC like Java/JVM and C#/CLR.

neonsunset

8 hours ago

These are not mutually exclusive. In some GC-based languages these techniques are immediately available, while other languages take a more abstracted approach, relying more on the underlying runtime.

nine_k

18 hours ago

Your engineers are usually your most expensive resource. Developing software in TypeScript or even Ruby is a way to get to revenue faster, having spent less money on development. Development cost and time (that is, the opportunity cost) are usually the most important limitations for a project where the defect rate does not need to be extremely low (as in aircraft control firmware). Rust saves you development time because less of it is spent fixing bugs; often you would pick it for that reason, not because it saves you RAM and CPU cycles. Haskell or, well, Ada/SPARK may be comparably efficient if you can wield them.

hyperman1

17 hours ago

This is true, but there is a crossover point where engineers spend more time understanding existing code than writing new code. Crossing it is typically the point where more static languages with more built-in checks become cheaper than more dynamic ones. In my experience, it takes about a year to reach this point, less if you hire more people.

Ygg2

15 hours ago

To a point. Let's say you're optimizing a backend written in TS on Amazon: sure, it's cheaper to hire guys to optimize it, but at some point it won't be. Either you need some really high-class talent to optimize the shit out of TS, or you can't scale as fast as before.

Didn't something similar happen at Discord? It was Go, if I recall.

nine_k

10 hours ago

Indeed, the larger the scale, the more impact (including the costs of operation) a piece of software has. In a dozen^W^W OK, a thousand places where the scale is large, engineers shave off a percent or two of resource usage, and this saves the company money.

But most places are small, and engineers optimize the time to market while remaining at acceptable levels of resource consumption and support expenses, by using stuff like next.js, RoR, etc. And that saves the company money.

There is also a spectrum in between, with associated hassles of transition as the company grows.

My favorite example is that eBay rewrote their backend three times, and they did it not because they kept building the wrong thing. They kept building the right thing for their scale at the moment. Baby clothes don't fit a grown-up, but wearing baby clothes while it was a baby was not a mistake.

Ideally, of course, you have a tool that you can keep using from the small prototype stage to the world-scale stage, and it lets you build highly correct, highly performant software quickly. Let's imagine that such a fantastical tool exists. To my mind, the problem is usually not in it but in the architecture: what's efficient at a large scale is uselessly complex at a small scale. The ability to express intricate constraints that precisely match the intricacies of your highly refined business logic may feel like an impediment while you're prototyping and haven't yet discovered what the logic should really be.

In short, more precision takes more thinking, and thinking is expensive. It should be applied where it matters most, and often it's not (yet) the technical details.

pjmlp

10 hours ago

Indeed; however, I would venture that those rewrites have been more a people issue than a technical one.

Apparently there is unwillingness to pay for "high-class talent", while starting from scratch in a completely different programming language stack where everyone is a junior is suddenly OK. Clearly a CV-driven development decision.

Secondly, in many cases, even if all optimization options in the specific language are exhausted, it is still easier to call into a native library in a lower-level language than to do a full rewrite from scratch.

geodel

an hour ago

My takeaway was: if it is our server/cloud cost, we will try to optimize the hell out of it. When it is a user/client-side cost: fuck the user, we will write the crappiest possible software in JS/TS/Electron or whatever, and ask users to upgrade their machines if they whine too much.

And I totally agree about CV-driven development. All these rewrite articles look like they are trying to convince themselves, rather than intelligent readers, about rewrites.

steveklabnik

19 hours ago

Then you should use the language that’s memory safe without GC.

Ygg2

18 hours ago

Yes, I'm saying in those cases it does make some sense to rewrite it in Rust.

UncleMeat

11 hours ago

If you have a codebase that has no security implication and won't receive significant benefit from improved productivity/stability from a more modern language, almost certainly not.

Such projects exist. That's fine.

infogulch

20 hours ago

I'd like to acknowledge that the charts in this article are remarkably clear and concise. A great demonstration of how careful data selection and labeling can communicate the intended ideas so effortlessly that they virtually disappear into the prose.

So the upshot of the fact that vulnerabilities decay exponentially is that the focus should be on net-new code. And spending effort on vast indiscriminate RiiR projects is a poor use of resources, even for advancing the goal of maximal memory safety. The fact that the easiest strategy, and the strategy recommended by all pragmatic Rust experts, is actually also the best strategy to minimize memory vulnerabilities according to the data is notably convergent, if not fortuitous.

> The Android team has observed that the rollback rate of Rust changes is less than half that of C++.

Wow!

gortok

12 hours ago

There is a correlation between new code and memory vulnerabilities (a possible explanation is given in the blog post, that vulnerabilities have a half-life that decays rapidly), but why does the blog post indicate causation between the two factors?

There is more than one possible and reasonable explanation for this correlation:

1. New code often relates to new features, and folks focus on new features when looking for vulnerabilities.

2. Older code has been through more real-life usage, which can exercise those edge cases where memory vulnerabilities reside.

I'm just not comfortable saying new code causes memory vulnerabilities and that vulnerabilities have a half-life that decays rapidly. That may, just may, be true in sheer number count, but it doesn't seem to be true in impact, thinking back to high-impact vulnerabilities in OSS like the Heartbleed bug and the cache-invalidation bugs in CPUs.

anymouse123456

11 hours ago

I'm here with you for the downvotes.

This essay takes some interesting data from very specific (and unusual) projects and languages from a very specific (and unusual) culture and stridently extrapolates hard numeric values to all code without qualification.

> For example, based on the average vulnerability lifetimes, 5-year-old code has a 3.4x (using lifetimes from the study) to 7.4x (using lifetimes observed in Android and Chromium) lower vulnerability density than new code.

Given this conclusion, I can write a defect-filled chunk of C code and just let it marinate for 5 years offline in order for it to become safe?

I'm pretty sure there are important data in this research and there is truth underneath what is being shared, but the unsupported confidence and overreach of the writing is too distracting for me.

bluGill

3 hours ago

The idea is you write bug-filled code, but someone notices and fixes some of those bugs, so the code counts as newer but also has fewer bugs. Eventually enough bugs are fixed that nobody has to touch it, and the now bug-free code gets old.

hoten

3 hours ago

I guess the article could have made it clearer, but the findings obviously don't apply to software that is not used or where security issues are not actively fixed. That kind of code is uninteresting to talk about in the first place, though.

throwaway2037

8 hours ago

Why is this downvoted? It raises some interesting and important issues. I saw a big bank's back office settlement system once. 30+ years old with almost no unit tests. It changes very little now and is virtually bug free because people have been fixing bugs in it for 30+ years! When they need to make changes these days, they first write unit tests for existing behaviour, then fix the bug. It is an example of how code can mature with a very low defect rate with limited unit tests.

steveklabnik

5 hours ago

I didn't downvote it, but, while I agree that there's reason to be skeptical that this research generalizes, the framing is aggressive.

> stridently extrapolates hard numeric values to all code without qualification.

The sentence they quote as evidence of this directly qualifies that this is from Android and Chromium.

anymouse123456

4 hours ago

Please read the quotation more carefully. I appreciate that the author calls out the source of the data, but the claims remain overly strong, broad, and unqualified.

I concede this may not be the strongest example, but in my opinion, the language throughout the article, starting with the title, makes stronger claims than the evidence provided supports.

I agree with the author, that these are useful projects to use for research. I'm struggling with the lack of qualification when it comes to the conclusions.

Perhaps I missed it, but I also didn't see information about trade-offs experienced in the transition to Rust on these projects.

Was there any change related to other kinds of vulnerabilities or defects?

How did the transition to Rust impact the number of features introduced over a given time period?

Were the engineers able to move as quickly in this (presumably) new-to-them language?

I'm under the impression that it can take many engineers multiple years to begin to feel productive in Rust. Is there any measure of throughput (even qualitative) that could be compared before, during, and after that period?

I'm hung up on what reads as a sales pitch that implies broad and deep benefits to any software project of any scope, scale, or purpose, and makes no mention of trade-offs or disadvantages in exchange for this incredible benefit.

steveklabnik

4 hours ago

As I said, I think you're fine to be skeptical, and there's surely a lot more stuff to be researched in the future, including these questions. I was just trying to speculate on maybe why you got downvotes.

Wowfunhappy

a day ago

> The answer lies in an important observation: vulnerabilities decay exponentially. They have a half-life. [...] A large-scale study of vulnerability lifetimes2 published in 2022 in Usenix Security confirmed this phenomenon. Researchers found that the vast majority of vulnerabilities reside in new or recently modified code.

It stands to reason, then, that it would be even better for security to stop adding new features when they aren't absolutely necessary. Windows LTSC is presumably the most secure version of Windows.

Animats

18 hours ago

Individual bugs triggered in normal operation ought to decay over time on software that is maintained. If bugs cause problems, someone may report them and some fraction of them will be fixed. That's a decay mechanism.

Not-yet exploited vulnerabilities, though, don't have that decay mechanism. They don't generate user unhappiness and bug reports. They just sit there, until an enemy with sufficient resources and motivation finds and exploits them.

There are more enemies in that league than there used to be.

elcritch

17 hours ago

Your assertions contradict the Usenix research cited in TFA, which found that the lifetimes of vulnerabilities _do_ follow an exponential decay. If it takes longer to find a vulnerability, then its lifetime is longer.

Animats

16 hours ago

What the article calls a "vulnerability" is something they found internally.

Looking at vulnerabilities that were found from attacks, it looks different. [1] Most vulnerabilities are fixed in the first weeks or months. But ones that aren't fixed within a year hang on for a long time. About 18% of reported vulnerabilities are never fixed.

[1] https://www.tenable.com/blog/what-is-the-lifespan-of-a-vulne...

fanf2

12 hours ago

I think that’s about time to deploy fixed code, not about time to discover the vulnerability nor about time to fix the code.

UncleMeat

11 hours ago

Vulns aren't just vulns. "Hey, in some weird circumstances we see a uaf here and the system crashes" is the sort of thing you might see in an extremely ordinary crash report while also having significant security implications.

You can also uncover latent vulns over time through fuzzing or by adding new code that suddenly exercises new paths that were previously ill-tested.

Yes, there are some vulns that truly will never get exercised by ordinary interaction and won't become naturally visible over time. But plenty do get uncovered in this manner.

pfdietz

11 hours ago

> vulnerabilities decay exponentially

This should be true not just of vulnerabilities, but bugs of any kind. I certainly see this in testing of the free software project I'm involved with (SBCL). New bugs tend to be in parts that have been recently changed. I'm sure you all have seen the same sort of effect.

(This is not to say all bugs are in recent code. We've all seen bugs that persist undetected for years. The question for those should be how testing missed them.)

So this suggests testing should be focused on recently changed code. In particular, mutation testing can be highly focused on such code, or on code closely coupled with the changed code. This would greatly reduce the overhead of applying this testing.

Google has had a system that does just this, applying mutation testing to code under review.
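For readers unfamiliar with mutation testing, a toy illustration of the idea (real tools such as cargo-mutants automate the mutate-and-rerun loop; everything below is illustrative):

    // A recently changed function. A mutation tool would flip the `>=`
    // to `>` (and try other small edits) and re-run the tests.
    fn is_adult(age: u32) -> bool {
        age >= 18
    }

    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn boundary_is_covered() {
            // This test "kills" the `>=` -> `>` mutant; without the
            // age == 18 case, the mutant would survive, flagging a gap.
            assert!(is_adult(18));
            assert!(!is_adult(17));
        }
    }

Restricting the mutants to lines touched by the current change is what keeps the cost proportional to the diff rather than to the whole codebase.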

theptip

5 hours ago

This is why I advise most folks to not take the latest point release of your language or libraries.

The bleeding edge is where many of the new vulns are. In general the oldest supported release is usually the safest.

The trade-off is when newer versions have features which will add value, of course. But usually a bad idea to take any version that ends “.0” IMO.

wepple

a day ago

Or an alternative approach: only compile the subset of features you explicitly need.

Obviously there's a ton of variance in how practical this is in any given place, but it's less common than it should be.

adgjlsfhk1

20 hours ago

This can be a really bad idea since it drastically increases the risk of users running a compiled combination of features that was never tested.

pjc50

13 hours ago

This is usually absolutely horrendous to do, and of course your test workload grows exponentially with the number of feature flags if you want to cover every possible combination. If you're doing it by #ifdef it has the added effect of making code unreadable.

Only really practical if "features" are "plugins".

pfdietz

11 hours ago

Or it's a really great idea, since you now can produce a diversity of software products, all of which should be correct on a subset of features, and all of which can be tested independently. Perhaps bugs can be flushed out that would be latent and undetectable in your standard build.

Having lots of knobs you can tweak is great for randomized testing. The more the merrier.

sieabahlpark

21 hours ago

Allow me to introduce a whole new suite of bugs that occur when feature A exists but feature B doesn't.

Congrats you're back to square 1!

wolrah

19 hours ago

> Allow me to introduce a whole new suite of bugs that occur when feature A exists but feature B doesn't.

Yeah, but are those bugs security bugs? Memory safety bugs are a big focus because they're the most common kind of bugs that can be exploited in a meaningful way.

Disabling entire segments of code is unlikely to introduce new memory safety bugs. It's certainly likely to surface race conditions, and those can sometimes lead to security bugs, but it's not nearly as likely as with memory safety bugs.

ReleaseCandidat

17 hours ago

> Yeah, but are those bugs security bugs?

If the software is unusable, it doesn't matter if it has security bugs too. Or, to rephrase, the safest software is software nobody uses.

Ygg2

20 hours ago

> that it would be even better for security to stop adding new features when they aren't absolutely necessary

Even if features aren't necessary to sell your software, new hardware, better security algorithms, or full-on deprecation of existing algorithms will still happen, which will introduce new code.

benwilber0

a day ago

> Increasing productivity: Safe Coding improves code correctness and developer productivity by shifting bug finding further left, before the code is even checked in. We see this shift showing up in important metrics such as rollback rates (emergency code revert due to an unanticipated bug).

> The Android team has observed that the rollback rate of Rust changes is less than half that of C++.

I've been writing high-scale production code in one language or another for 20 years. But when I found Rust in 2016, I knew that this was the one. I was going to double down on it. I got Klabnik and Carol's book literally the same day. Still have my dead-tree copy.

It's honestly re-invigorated my love for programming.

pclmulqdq

19 hours ago

That makes sense, because the #1 reason I have had to roll back my own C++ commits is crashes from some dumb failure to check whether a pointer is null. If Rust is going to prevent that issue and other similar issues of stupid coding, you would expect whole classes of rollbacks to go away.
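A minimal sketch of why that class of rollback disappears: in Rust a possibly-absent value is an `Option`, and the compiler rejects any path that uses it without handling the `None` case (the names here are made up):

    fn user_display_name(name: Option<&str>) -> &str {
        match name {
            Some(n) => n,
            None => "anonymous", // deleting this arm is a compile error
        }
    }

    fn main() {
        assert_eq!(user_display_name(Some("ada")), "ada");
        assert_eq!(user_display_name(None), "anonymous");
    }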

acdha

8 hours ago

I now compare C to an IKEA bed we have in our guest room which has storage drawers making the edge straight down to the floor without a gap. I’m a grownup, I know that I need to stop half a step early, but every few weeks I stub a toe while I’m thinking about something else.

ahoka

6 hours ago

TBH most of these issues go away when your language has no implicit nullability.

steveklabnik

19 hours ago

That’s very kind, thank you.

benwilber0

18 hours ago

You're a legend. Thanks for writing The Book. It really affected my life in a very positive way.

j-krieger

18 hours ago

I feel entirely the same. I actively miss Rust when I need to choose another language.

ramon156

6 hours ago

This is so relatable. Without sounding like a fanboy, Rust makes other languages feel like toy languages.

ahoka

6 hours ago

Toy languages, like Haskell, Ocaml, Kotlin and F#?

aloisdg

6 hours ago

replace Kotlin with Elixir and I am with you

SkyMarshal

a day ago

They talk about "memory safe languages (MSL)" plural, as if there is more than one, but only explicitly name Rust as the MSL they're transitioning to and improving interoperability with. They also mention Kotlin in the context of improving Rust<>Kotlin interop; Kotlin also has some memory-safe features, but maybe not to the same extent as Rust. Are those the only two Google uses, or are there others they could be referring to?

steveklabnik

a day ago

A few thoughts:

People who care about this issue, especially in the last few years, have been leaning into a "memory safe language" vs "non memory safe language" framing. This is because it gets at the root of the issue, which is safe by default vs not safe by default. It tries to avoid pointing fingers at, or giving recommendations for, particular languages, by instead putting the focus on the root cause.

In the specific case of Android, the subject of this post, I'm not aware of attempts to move into MSLs other than those. I don't follow Android development generally, but I do follow these posts pretty closely, and I don't remember any of them talking about stuff other than Rust or Kotlin.

amluto

21 hours ago

> I don't remember any of them talking about stuff other than Rust or Kotlin.

Don’t forget the old, boring one: Java.

I assume the reason that Go doesn’t show up so much is that most Android processes have their managed, GC’d Java-ish-virtual-machine world and their native C/C++ world. Kotlin fits in with the former and Rust fits in with the latter. Go is somewhat of its own thing.

vvanders

21 hours ago

Android has a surprising amount of core OS functionality in boring managed Java code. ART/Dalvik are quite impressive combined with a few other clever tricks to make a system that ran in a pretty small footprint.

pdimitar

13 hours ago

I would think one of the reasons that Golang is not utilized is its lack of tagged unions. Another might be that it has a runtime and a GC, which are typically undesirable for systems (as in: very close to the metal) software.

pjmlp

10 hours ago

Go doesn't have any role on Android, other than in the build system, which uses Go as its DSL.

nightpool

a day ago

It's not just Rust—rewriting a C network service into Java or Python or Go is also an example of transitioning to memory safe languages. The point is that you're not exposed to memory safety bugs in your own code. Arguably it's much better to choose a language without Rust-like manual memory management when you don't absolutely need it.

pdimitar

13 hours ago

I have chosen Rust over Golang on a number of occasions for a much more boring reason: Golang lacks enums / tagged unions / sum types. When you have to manually eyeball your code to ensure exhaustiveness, it gets old and tiring really fast.

For that reason I'd use OCaml as well even though it has GC, because it has sum types. That is, if I ever learn OCaml properly.
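A minimal sketch of the exhaustiveness point in Rust (the domain and names are made up):

    enum Payment {
        Card { last4: String },
        BankTransfer { iban: String },
        Cash,
    }

    fn describe(p: &Payment) -> String {
        // Add a new variant later and this match stops compiling until
        // the new case is handled -- no manual eyeballing required.
        match p {
            Payment::Card { last4 } => format!("card ending in {last4}"),
            Payment::BankTransfer { iban } => format!("transfer from {iban}"),
            Payment::Cash => "cash".to_string(),
        }
    }

    fn main() {
        println!("{}", describe(&Payment::Cash));
    }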

tialaramex

5 hours ago

If you need concurrency, then depending on exactly what you're doing with it, Rust still looks like the right choice. If you can just have it magically work, no worries; but the moment you need synchronization primitives, or phrases like "thread safe" come into the conversation, you're much more likely to get it wrong in the other languages.
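A minimal sketch of what that looks like in practice: shared mutable state across threads requires `Arc` + `Mutex` (or similar), and dropping either piece is a compile error rather than a silent data race:

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        let counter = Arc::new(Mutex::new(0));

        let handles: Vec<_> = (0..4)
            .map(|_| {
                let counter = Arc::clone(&counter);
                thread::spawn(move || {
                    // Access is only possible through the lock.
                    *counter.lock().unwrap() += 1;
                })
            })
            .collect();

        for h in handles {
            h.join().unwrap();
        }
        println!("count = {}", *counter.lock().unwrap());
    }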

dgacmu

a day ago

There are many, and Google uses several: Rust, Python, Java, and Go among them. But low-level code for Android has historically been in C++, and Rust is the primary memory-safe replacement for the stuff they're building.

GeekyBear

7 hours ago

There are many memory safe languages, but not many of those are compiled and able to offer performance that is in the same ballpark as C.

Rust and Swift are the two most widely used.

Interestingly, Swift had interoperating with C as an explicit design goal, while Rust had data race safety as a design goal.

Now we have data race safety added in the latest version of Swift, and Rust looking to improve interoperability with C.

jnwatson

a day ago

Java and Kotlin are used for apps. Rust is used for new system software.

Across Google, Go is used for some system software, but I haven't seen it used in Android.

throwaway2037

8 hours ago

> memory safe languages

I would say anything that runs on the JVM and CLR, and scripting langs like Python, Perl, Ruby, etc.

Edit: I forgot Golang!

_ph_

16 hours ago

That is the separation between abstract considerations and a given project's constraints. In a given project there might be few choices, but if you talk about fundamental phenomena, you have to reason about arbitrary projects. And of course, there are plenty of alternatives to Rust. Even limited to Android, there are several choices, even if they might be a smaller set.

pdimitar

13 hours ago

It's impractical to increase the surface of complexity even further. I for one approve that they settled on just one memory-safe language for their new code.

wmf

21 hours ago

Google has always tried to use a small set of languages. For Android they try to use C/C++, Java/Kotlin, and now Rust. The same lessons still apply in rampantly polyglot environments though.

jeffbee

21 hours ago

Perhaps they are even considering Carbon to be memory safe.

ievans

a day ago

So the argument is: because vulnerability lifetimes are exponentially distributed, focusing on secure defaults like memory safety in new code is disproportionately valuable, both theoretically and now evidentially, as seen over six years on the Android codebase.

Amazing, I've never seen this argument used to support shift-left security guardrails, but it's great. Especially for those with larger, legacy codebases who might otherwise say "why bother, we're never going to benefit from memory-safety on our 100M lines of C++."

I think it also implies any lightweight vulnerability detection has disproportionate benefit -- even if it was to only look at new code & dependencies vs the backlog.

I'm a little uneasy about the conclusions being drawn here as the obvious counterpoint isn't being raised - what if older code isn't being looked at as hard and therefore vulnerabilities aren't being discovered?

It's far more common to look at recent commit logs than it is to look at some library that hasn't changed for 20 years.

MBCook

a day ago

> what if older code isn't being looked at as hard and therefore vulnerabilities aren't being discovered?

It wasn't being looked at as hard before either. I don't think that's changed.

They don’t give a theory for why older code has fewer bugs, but I’ve got one: they’ve been found.

If we assume that any piece of code has a fixed number of unknown bugs per 1000 lines, it stands to reason that over time the sheer number of times the code is run with different inputs in prod makes it more and more likely they will be discovered. Between fixing them and the code reviews while fixing them, the hope would be that on average things are being made better.

So over time, there are fewer bugs per thousand lines in existing code. It's been battle tested.

As the post says, if you continue introducing new bugs at the same rate you're not going to make progress. But if using a memory safe language means you're introducing fewer bugs in new features, then over time the total number of bugs should go down.

pacaro

20 hours ago

I've always thought of this as being equivalent to "work hardening"

My concern with it is more about legitimately old code (Android is 20ish years old, so it reasonably falls into this category) which was, necessarily, written using the standards and tools of the time.

It requires a constant engineering effort to keep such code up to date. And the older code is, typically, less well understood.

In addition older code (particularly in systems programming) is often associated with older requirements, some of which may have become niche over time.

That long tail of old, less frequently exercised, code feels like it may well have a sting in its tail.

The half-life/work-hardening model depends on the code being stressed to find bugs.

kernal

5 hours ago

Android was released on September 23, 2008, so it just had its sweet 16.

pacaro

3 hours ago

I believe that they started writing it in 2003. It's hard to precisely age code unless you cut it down and count the number of rings

SoylentOrange

a day ago

I don’t understand this point. The project under scrutiny is Android and people are detecting vulnerabilities both manually and automatically based on source code/binary, not over commit logs. Why would the commit logs be relevant at all to finding bugs?

The commits are just used for attribution. If there was some old lib that hasn’t been changed in 20 years that’s passed fuzzing and manual code inspection for 20 years without updates, chances are it’s solid.

saagarjha

19 hours ago

Exploit authors look at commit logs because new features have bugs in them, and it's easier to follow that to find vulnerabilities than dive into the codebase to find what's already there.

e28eta

a day ago

I wasn’t entirely satisfied with the assertion that older code has fewer vulnerabilities either. It feels like there could be explanations other than age for the discrepancy.

For example: maybe the engineers over the last several years have focused on rewriting the riskiest parts in a MSL, and were less likely to change the lower risk old code.

Or… maybe there was a process or personnel change that led to more defects.

With that said, it does seem plausible to me that any given bug has a probability of detection per unit of time, and as time passes fewer defects remain to be found. And as long as your maintainers fix more vulnerabilities than they introduce, sure, older code will have fewer and the ones that remain are probably hard to find.

mccr8

10 hours ago

Their concern is not with theoretical vulnerabilities, but actual ones that are being exploited. If an attacker never tries to find a vulnerability in some code, then it might as well not have it.

0xDEAFBEAD

12 hours ago

Trying to think through the endgame here -- As vulnerabilities become rarer, they get more valuable. The remaining vulnerabilities will be jealously hoarded by state actors, and used sparingly on high-value targets.

So if this blog post describes the 4th generation, perhaps the 5th generation looks something like Lockdown Mode for iOS. Let users who are concerned with security check a box that improves their security, in exchange for decreased performance. The ideal checkbox detects and captures any attack, perhaps through some sort of virtualization, then sends it to the security team for analysis. This creates deterrence for the attacker. They don't want to burn a scarce vulnerability if the user happens to have that security box checked. And many high-value targets will check the box.

Herd immunity, but for software vulnerabilities instead of biological pathogens.

Security-aware users will also tend to be privacy-aware. So instead of passively phoning home for all user activity, give the user an alert if an attack was detected. Show them a few KB of anomalous network activity or whatever, which should be sufficient for a security team to reconstruct the attack. Get the user to sign off before that data gets shared.

daft_pink

a day ago

I'm curious how this applies to Mac vs Windows, where most newer Mac code is written in memory-safe Swift, while Windows still primarily uses C or C++.

akyuu

a day ago

Apple is still adding large amounts of new Objective-C code in each new macOS version [0].

I haven't found any language usage numbers for recent versions of Windows, but Microsoft is using Rust for both new development and rewriting old features [1] [2].

[0] Refer to section "Evolution of the programming languages" https://blog.timac.org/2023/1128-state-of-appkit-catalyst-sw...

[1] https://www.theregister.com/2023/04/27/microsoft_windows_rus...

[2] https://www.theregister.com/2024/01/31/microsoft_seeks_rust_...

TazeTSchnitzel

21 hours ago

It should be noted that Objective-C code is presumably a lot less prone to memory safety issues than C code on average, especially since Apple introduced Automatic Reference Counting (ARC). For example:

• Use-after-frees are avoided by ARC

• Null pointer dereferences are usually safe (sending a message to nil returns nil)

• Objective-C has a great standard library (Foundation) with safe collections among many other things; most of C's dangerous parts are easily avoided in idiomatic Objective-C code that isn't performance-critical

But a good part of Apple's Objective-C code is probably there for implementing the underlying runtime, and that's difficult to get right.

saagarjha

19 hours ago

Most of Apple's Objective-C code is in the application layer just like yours is

munificent

a day ago

> while Windows still primarily uses C or C++.

Do you have data for that? My impression is that a large fraction of Windows development is C# these days. Back when I was at EA, nearly fifteen years ago, we were already leaning very heavily towards C# for internal tools.

pjmlp

10 hours ago

WinDev culture is largely C++ and they aren't in any rush to change that, other than some incremental use of Rust.

WinRT is basically COM and C++. Nowadays they focus on C# as the consumer language, mostly because after killing C++/CX they never improved the C++/WinRT developer experience since 2016; only Windows teams use it, while the few folks who still believe WinUI has any future reach for C# instead.

If you want to see big chunks of C# adoption, you have to look into the business units under the Azure org chart.

cakoose

a day ago

What happens if we gradually transition to memory-safe languages for new features, while leaving existing code mostly untouched except for bug fixes?

...

In the final year of our simulation, despite the growth in memory-unsafe code, the number of memory safety vulnerabilities drops significantly, a seemingly counterintuitive result [...]

Why would this be counterintuitive? If you're only touching the memory-unsafe code to fix bugs, it seems obvious that the number of memory-safety bugs will go down.

Am I missing something?

cogman10

a day ago

The counterintuitive part is that there is now more code written in memory-unsafe languages than there was before, even if it's just bug fixing.

It's not as if bug fixes haven't resulted in new memory bugs, but apparently that rate is much lower in bug fixes than it is in brand new code.

MBCook

a day ago

I think the standard assumption would be that you need to start replacing older code with memory safe code to see improvements.

Instead they’ve shown that only using memory safe languages for new code is enough for the total bug count to drop.
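A toy version of that arithmetic (the numbers are invented for illustration, not taken from the article): if existing memory-unsafe code loses latent bugs with a fixed half-life and all new code is memory safe, total memory-safety bugs fall even as total code grows:

    fn main() {
        let half_life_years = 2.0_f64;
        let yearly_decay = 0.5_f64.powf(1.0 / half_life_years);

        // Latent memory-safety bugs in the legacy code at year 0.
        let mut latent_bugs = 1000.0_f64;

        for year in 0..=5 {
            println!("year {year}: ~{latent_bugs:.0} latent memory-safety bugs");
            // New code contributes zero memory-safety bugs (it's in an MSL);
            // existing bugs keep being found and fixed.
            latent_bugs *= yearly_decay;
        }
    }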

kernal

7 hours ago

>Note that the data for 2024 is extrapolated to the full year (represented as 36, but currently at 27 after the September security bulletin).

The reduction of memory safety bugs to a projected 36 in 2024 for Android is extremely impressive.

Animats

a day ago

Half a century since Pascal. Forty years since Ada. 28 years since Java. Fifteen years since Go. Ten years since Rust. And still unsafe code is in the majority.

wmf

21 hours ago

You can't really write a library in Java or Go that a C program can use. The real magic is memory safety without GC and a huge runtime. But if you point that out people will call you a fanboy.

MBCook

a day ago

Is it?

If you add up all the JavaScript, C#, Java, Python, and PHP being written every year, that’s a lot of code.

Are we sure that all that combined isn’t more than C/C++? Or at least somewhat close?

tiffanyh

a day ago

Doesn't the libc interface requiring C/C++ create massive downstream challenges for other languages to gain mass adoption (at the OS level)?

sanxiyn

a day ago

Yes, but Linux (and hence Android) doesn't have that problem, because its interface is the system call layer, not libc.

saagarjha

15 hours ago

The practical interface for Android is Binder, which has an interface that can be made amenable to a richer language.

tiffanyh

21 hours ago

But don't syscalls require C/C++ data structures & type definitions?

So while not technically "requiring" C/C++, if your language cannot map exactly to C/C++ data structs & type definitions, it won't work.
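For what that mapping looks like in practice, here is a minimal sketch: `#[repr(C)]` pins a Rust struct to the C layout so it can cross the libc/syscall boundary. The field widths assume 64-bit Linux, where `CLOCK_MONOTONIC` is 1:

    use std::mem::MaybeUninit;

    // Mirrors C's `struct timespec` on 64-bit Linux.
    #[repr(C)]
    struct Timespec {
        tv_sec: i64,  // time_t
        tv_nsec: i64, // long
    }

    extern "C" {
        // Real libc function: int clock_gettime(clockid_t, struct timespec *);
        fn clock_gettime(clockid: i32, tp: *mut Timespec) -> i32;
    }

    fn main() {
        const CLOCK_MONOTONIC: i32 = 1; // Linux value
        let mut ts = MaybeUninit::<Timespec>::uninit();
        if unsafe { clock_gettime(CLOCK_MONOTONIC, ts.as_mut_ptr()) } == 0 {
            let ts = unsafe { ts.assume_init() };
            println!("{}s {}ns", ts.tv_sec, ts.tv_nsec);
        }
    }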

nineteen999

19 hours ago

Yes. That's a problem for the contenders; the Linux/UNIX kernels are written in C. Unless we want to add language-specific syscall interfaces for every compiled language out there.

The alternative is to, y'know, write a kernel in your language of choice, choose your own syscall specification suiting that language, and gain mass adoption. Easy!

cyberax

a day ago

Neither Pascal nor Ada is memory-safe.

docandrew

20 hours ago

It's possible to write unsafe code in either, but it's also much easier to write safe code in both Pascal and Ada than in C/C++. And it's easier to write safe code in C++ than in C. Memory safety exists on a spectrum; it's not all or nothing.

pjmlp

10 hours ago

More than C or C++ will ever be.

It is like complaining that a bulletproof vest doesn't protect against heavy machine gun bullets.

cyberax

3 hours ago

Not really. Standard Pascal's dynamic features were barely present, but even they allowed dangling references to exist.

And all practical Pascal clones (e.g. Object Pascal) had to grow constructor/destructor systems to clean up memory on de-allocation (or they went the garbage collection route). So they were on par with C++ for safety.

Ada is similar. It provided safety only for static allocations with bounds known at compile time. Their version of safe dynamic allocations basically borrowed Rust's borrow checker: https://blog.adacore.com/using-pointers-in-spark

pjmlp

10 hours ago

60 years since NEWP/ESPOL/JOVIAL.

ESPOL is one of the first recorded uses of unsafe code blocks, 1961.

mpweiher

a day ago

Well, that tells us something...

0xbadcafebee

21 hours ago

So there's a C program. There's a bunch of sub-par programmers, who don't use the old, well documented, stable, memory-safe functions and techniques. And they write code with memory safety bugs.

They are eventually forced to transition to a new language, which makes the memory safety bugs moot. Without addressing the fact that they're still sub-par, or why they were to begin with, why they didn't use the memory safe functions, why we let them ship code to begin with.

They go on to make more sub-par code, with more avoidable security errors. They're just not memory safety related anymore. And the hackers shift their focus to attack a different way.

Meanwhile, nobody talks about the pink elephant in the room. That we were, and still are, completely fine with people writing code that is shitty. That we allow people to continuously use the wrong methods, which lead to completely avoidable security holes. Security holes like the injection attacks, which make up 40% of all CVEs now, when memory safety only makes up 25%.

Could we have focused on a default solution for the bigger class of security holes? Yes. Did we? No. Why? Because none of this is about security. Programmers just like new toys to play with. Security is a red herring being used to justify the continuation of allowing people to write shitty code, and play with new toys.

Security will continue to be bad, because we are not addressing the way we write software. Rather than this one big class of bugs, we will just have the million smaller ones to deal with. And it'll actually get harder to deal with it all, because we won't have the "memory safety" bogey man to point at anymore.

pornel

20 hours ago

People have a finite amount of time and effort they can spend on making the code correct. When the language is full of traps, spec gotchas, antiquated misfeatures, gratuitous platform differences, fragmented build systems, then a lot of effort is wasted just on managing all of that nonsense that is actively working against writing robust code, and it takes away from the effort to make a quality product beyond just the language-wrangling.

You can't rely on people being perfect all the time. We've been trying that for 50 years, and only got an endless circle of CVEs and calls to find better programmers next time.

The difference is how the language reacts to the mistakes that will happen. It could react with "oops, you've made a mistake! Here, fix this", and let the programmer apply a fix and move on, shipping code without the bug. Or the language could silently amplify the smallest mistakes in the least interesting code into corruption that causes catastrophic security failures.

When concatenating strings and adding numbers securely is a thing that exists, and a thing that requires top-skilled programmers, you're just wasting people's talent on dumb things.

tptacek

21 hours ago

No, this just isn't how it works. You'll find business logic, domain-specific, and systems programming security errors on every platform, but you don't find them with the density and the automatic severity you do memory safety issues. This is not about "language hipsters".

0xbadcafebee

6 hours ago

If the impetus behind moving to new languages was primarily focused on improving security, we could have first attempted to address the underlying cause of programming security issues. The underlying cause is not "it's hard to check buffer boundaries". If you don't believe me, do a 5 Whys on the average buffer overflow, and tell me the best solution is still a new language.

steveklabnik

5 hours ago

> we could have first attempted to address the underlying cause of programming security issues.

A lot of the recent moves here are motivated by multiple large industry players noticing a correlation between security issues and memory-unsafe language usage. "70% of security vulnerabilities are due to memory unsafety" is a motivating reason to move towards memory safe languages.

What do you believe the underlying cause to be?

tptacek

6 hours ago

"It's hard to check buffer boundaries" is memory corruption circa 1998.

alpire

21 hours ago

> You have a bunch of sub-par programmers, who don't use the old, well documented, stable, memory-safe functions and techniques. They write code with memory safety bugs.

We should really stop putting the blame on developers. The issue is not that developers are sub-par, but that they are provided with tools making it virtually impossible to write secure software. Everyone writes memory safety bugs when using memory-unsafe languages.

And one helpful insight here is that the security posture of a software application is substantially an emergent property of the developer ecosystem that produced it, and that includes having secure-by-design APIs and languages. https://queue.acm.org/detail.cfm?id=3648601 goes into more details on this.

bcoates

21 hours ago

I mostly agree with you but take the opposite position: attacking memory safety bugs has been so successful that we should use the same pattern on other large classes of bugs.

It's absolutely possible to write a reasonably usable language that makes injection/escaping/pollution/datatype-confusion errors nearly impossible, but it would involve language support and rewriting most libraries--just like memory safety did. Unfortunately we are moving in the opposite direction (I'm still angry about JavaScript backticks, a feature seemingly designed solely to allow the porting of PHP-style SQL injection errors).
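A minimal sketch of what such language/library support amounts to, in Rust for convenience; the `Html` type and its methods are entirely hypothetical. The trick is making the unescaped path unrepresentable: the only way to turn untrusted text into markup is through the escaping constructor:

    struct Html(String);

    impl Html {
        // The only path from arbitrary &str to Html goes through escaping.
        fn text(untrusted: &str) -> Html {
            Html(untrusted
                .replace('&', "&amp;")
                .replace('<', "&lt;")
                .replace('>', "&gt;"))
        }

        // Raw markup must be a compile-time literal, never user input.
        fn raw(trusted: &'static str) -> Html {
            Html(trusted.to_string())
        }
    }

    fn main() {
        let user_input = "<script>alert(1)</script>";
        let page = format!(
            "{}{}{}",
            Html::raw("<p>").0,
            Html::text(user_input).0,
            Html::raw("</p>").0
        );
        println!("{page}"); // <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
    }

The same shape works for SQL: a query type constructible only from literals, with data bound separately as parameters.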

0xbadcafebee

6 hours ago

Take that to its logical end: a language designed with high-level functions to automate all possible work that, if left to a human to craft, would expose security vulnerabilities.

You know what you have there? A new method for developing software.

If we just switch languages every time we have another class of vuln that we want to eliminate, it will take us 1,000 years to get them all.

Or we could just get them all today by fundamentally rethinking how we write software.

The problem isn't software. The problem is humans writing software. There are bugs because we are using a highly fallible process to create software. If we want to eliminate the bugs, we have to change how the software is created.

jnwatson

21 hours ago

It is 100-1000 times harder to find application-level security bugs than it is to find memory safety bugs.

It is also far easier to apply formal analysis to code if you don't have to model arbitrary pointers.

Android has hard evidence that just eliminating memory-safety bugs makes a significant difference.

saagarjha

15 hours ago

> There's a bunch of sub-par programmers, who don't use the old, well documented, stable, memory-safe functions and techniques.

…which are?

pjc50

13 hours ago

> old, well documented, stable, memory-safe functions and techniques.

I'm sorry, are we still talking about C here? Where the old functions are the ones that are explicitly labelled as too dangerous to use, like strcmp()?

uecker

17 hours ago

I think it is somewhat about security. The "nobody can write secure C/C++ code" nonsense translates to "we are not at fault that our products are broken garbage, because it is simply impossible to write secure code." Now we can pretend to fix "a whole class of errors" (as long as there is no "unsafe" anywhere) by imposing Rust on programmers (and on the open-source community, which will then produce memory-safe code for free for some to use), and while this may indeed help a bit, one can hope to avoid really being held responsible for product security for another decade or so.

uid65534

21 hours ago

Having the tool actively prevent classes of errors is a worthwhile endeavor, but I do agree it gets overly focused on while several other _massive_ classes of vulnerabilities continue to be introduced. At a high level though, it is a lot easier to just have a framework enforce 'x' on all devs to raise the minimum bar. It doesn't help that the average Rust-bro, from whom a lot of these memory safety arguments come, is utterly terrible at making that case in reality. Case example: https://github.com/pyca/cryptography/issues/5771.

I think a lot of the arguments around C++, for example, being 'memory unsafe' are a bit ridiculous, because it's trivial to write memory-safe C++. Just run -Wall and enforce the use of smart pointers; there are nearly zero instances in the modern day where you should be dealing with raw pointers or performing offsets that lead to these bugs directly. The few exceptions are hopefully handled by devs who are intelligent enough to do so safely with modern language features. Unfortunately, this rarely gets focused on by security teams, it seems, since they are instead chasing the newest shiny language like you mention.

alpire

20 hours ago

> it's trivial to write memory-safe C++

It is not unfortunately. That's why we see memory safety being responsible for 70% of severe vulns across many C and C++ projects.

Some of the reasons include:

- C++ does little to prevent out-of-bounds vulns.

- Preventing use-after-free with smart pointers requires heavy use of shared pointers, which often incurs a performance cost that is unacceptable in the environments where C++ is used.

Dylan16807

20 hours ago

> It is not unfortunately. That's why we see memory safety being responsible for 70% of severe vulns across many C and C++ projects.

I don't think that's really a rebuttal to what they're trying to say. If the vast majority of C++ devs don't follow those two rules, then that's not much evidence against those two rules providing memory safety.

adgjlsfhk1

19 hours ago

Right, but the reason they don't follow those 2 rules is that using them would require not using most C++ libraries that don't follow the rules, and would introduce performance regressions that negate the main reason they chose C++ in the first place.

Dylan16807

19 hours ago

This entire post is about a gradual transition. You don't have to avoid libraries that break the rules, you just have to accept that they're outside the safe zone.

For performance, you'll have to be more specific about your shared pointer claims. But I bet that it's a very small fraction of C++ functions that need the absolute best performance and can't avoid those performance problems while following the two rules.

roca

14 hours ago

What does "enforce the use of smart pointers" actually mean? Idiomatic C++ uses "this" and references ubiquitously, both of which are vulnerable to use-after-free bugs just like raw pointers.

drivebycomment

18 hours ago

> it's trivial to write memory-safe C++.

Very bold claim, and as such it needs substantial evidence, as there is practically no meaningful evidence to support this. There is some real-world, non-trivial C++ code that is known to have very few defects, but almost all of it required extremely significant effort to get there.