Automatically Translating C to Rust

115 points posted 3 months ago

(cacm.acm.org)

109 Comments

veltas

3 months ago

Automatically translating C to unsafe Rust is pointless, the resultant code is harder to read and there's no improvement in understanding how to get the code maintainable and safe, that requires tons of manual work by someone with a deep understanding of the codebase.

Generally the Rust community as well don't seem to have an answer on how to do this incrementally. In business terms we have no idea how to do work slices with demonstrable value, so there's no way to keep this on track and cut losses if it becomes too much work. This also strongly indicates you're 'stuck' with Rust when you're done: if a better and less unidiomatic C++ killer comes along later, it sounds like you're either going to have to rewrite the whole thing or give up.

I'm definitely open to wisdom on this if anyone disagrees because it is valuable to me and probably most of the readers of this comment section.

pizza234

3 months ago

> Automatically translating C to unsafe Rust is pointless, the resultant code is harder to read and there's no improvement in understanding how to get the code maintainable and safe, that requires tons of manual work by someone with a deep understanding of the codebase.

I have experience on a (nontrivial) translation of a "very unsafe" C codebase to Rust, and it's not true that there is no value in this type of work.

The first step, automatic translation from C to Rust via tools, immediately revealed bugs in the original codebase. This step alone is worth spending some time on the operation.

Ports from C to Rust aren't a binary distribution of "all safe" or no port at all. Some projects, for example ClamAV, are adopting a mixed approach - (part/most of) new code in Rust, and some translation of existing functionalities to Rust.

In general, I think that automatic porting of C to Rust is, in the real world, an academic exercise. This is because C codebases designed without safety in mind simply need to be redesigned, so the domain is not really "how to port C to Rust" - it's "how to redesign an unsafe C codebase into a safe one" first of all. Additionally, I believe that in such cases, maintaining the implementation details is impossible - unsafety is a design, after all.

I personally advocate for very precisely scoped ports, where it can be beneficial (safety and stability); where that's not possible, I agree, better abandon early.

Diggsey

3 months ago

IMO, safety and "idiomatic-ness" of Rust code are two separate concerns, with the former being easier to automate.

In most C code I've read, the lifetimes of pointers are not that complicated. They can't be that complicated, because complex lifetimes are too error prone without automated checking. That means those lifetimes can be easily expressed.

In that sense, a fairly direct C to Rust translation that doesn't try to generate idiomatic Rust, but does accurately encode the lifetimes into the type system (ie. replacing pointers with references and Box) is already a huge safety win, since you gain automatic checking of the rules you were already implicitly following.

Here's an example of the kind of unidiomatic-but-safe Rust code I mean: https://play.rust-lang.org/?version=stable&mode=debug&editio...
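As an illustration only (this is not the code behind the truncated link above), a direct, unidiomatic-but-safe translation of a C singly linked list might look like the sketch below. The C prototypes in the comment and the names `Node`, `push_front` and `sum` are hypothetical.

```rust
// Hypothetical C original, for comparison:
//   struct node { int value; struct node *next; };
//   struct node *push_front(struct node *head, int value);
//   int sum(const struct node *head);

/// An owning `struct node *` becomes `Option<Box<Node>>`; NULL maps to `None`.
pub struct Node {
    pub value: i32,
    pub next: Option<Box<Node>>,
}

/// Takes ownership of the old head and returns the new head, as the C version did.
pub fn push_front(head: Option<Box<Node>>, value: i32) -> Option<Box<Node>> {
    Some(Box::new(Node { value, next: head }))
}

/// A borrowed `const struct node *` becomes `Option<&Node>`.
/// The explicit loop mirrors the C pointer walk instead of using iterators.
pub fn sum(head: Option<&Node>) -> i32 {
    let mut total = 0;
    let mut cur = head;
    while let Some(node) = cur {
        total += node.value;
        cur = node.next.as_deref();
    }
    total
}
```

Nothing here uses `unsafe`, yet the shape of the code is still recognisably the C design: the ownership rules the C code followed implicitly are now checked by the compiler.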

If that can be automated (which seems increasingly plausible) then the need to do such a translation incrementally also goes away.

Making it idiomatic would be a case of recognising higher level patterns that couldn't be abstracted away in C, but can be turned into abstractions in Rust, and creating those abstractions. That is a more creative process that would require something like an LLM to drive, but that can be done incrementally, and provides a different kind of value from the basic safety checks.

zozbot234

3 months ago

> In that sense, a fairly direct C to Rust translation that doesn't try to generate idiomatic Rust, but does accurately encode the lifetimes into the type system (ie. replacing pointers with references and Box) is already a huge safety win, since you gain automatic checking of the rules you were already implicitly following.

Unfortunately, there's a lot of non-trivial C code that really does not come close to following the rules of existing Safe Rust, even at their least idiomatic. Giving up on idiomaticness can be very helpful at times, but it's far from a silver bullet. For example, much C code that uses "shared mutable" data makes no effort to either follow the constraints of Rust Cell<T> (which, loosely speaking, require get or set operations to be tightly self-contained, where the whole object is accessed in one go) or check for the soundness of ongoing borrows at runtime ala RefCell<T> - the invariants involved are simply implied in the flow of the C code. Such code must be expressed using unsafe in Rust. Even something as simple (to C coders) as a doubly-linked list involves a kind of fancy "static Rc" where two pointers jointly "own" a single list node. Borrowing patterns can be decoupled and/or "branded" in a way that needs "qcell" or the like in Rust, which we still don't really know how to express idiomatically, etc.

This is not to say that you can't translate such patterns to some variety of Rust, but it will be non-trivial and involve some kind of unsafe code.
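To make the two standard-library types mentioned above concrete, here is a minimal sketch of how `Cell<T>` and `RefCell<T>` behave. The function names are hypothetical and chosen for illustration.

```rust
use std::cell::{Cell, RefCell};

/// `Cell<T>`: shared mutation where each access copies the whole value in or out.
/// No reference into the cell's contents is ever handed out.
pub fn bump(counter: &Cell<u32>) -> u32 {
    counter.set(counter.get() + 1);
    counter.get()
}

/// `RefCell<T>`: borrows are tracked at runtime.
/// Returns true if a second mutable borrow is refused while the first is alive.
pub fn second_borrow_is_refused(data: &RefCell<Vec<i32>>) -> bool {
    let _first = data.borrow_mut();
    data.try_borrow_mut().is_err()
}
```

C code that keeps a long-lived pointer into shared mutable data fits neither pattern directly, which is why such code tends to end up behind `unsafe` after translation.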

zozbot234

3 months ago

> Generally the Rust community as well don't seem to have an answer on how to do this incrementally.

You can very much translate C to Rust on a function-by-function basis, the only issue is at the boundary where you're either left with unsafe interfaces or a "safe" but slow interop. But this is inherent since soundness is a global property, even a tiny bit of wrong unsafe code can spoil it all unless you do things like placing your untrusted code in a separate sandbox. So you can do the work incrementally, but much of the advantage accrues at the end.
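A rough sketch of what one step of a function-by-function port could look like: a safe Rust core plus a thin shim that keeps the C calling convention. The C prototype and both function names are hypothetical; a real export would also need a `no_mangle` attribute so the C linker can find the symbol, which is left out here.

```rust
/// Safe core: all of the logic lives here and can be tested without any C.
fn count_byte_safe(buf: &[u8], needle: u8) -> usize {
    buf.iter().filter(|&&b| b == needle).count()
}

/// Shim with the C ABI, replacing a hypothetical
/// `size_t count_byte(const uint8_t *buf, size_t len, uint8_t needle)`.
///
/// # Safety
/// If `buf` is non-null it must point to `len` readable, initialized bytes.
pub unsafe extern "C" fn count_byte(buf: *const u8, len: usize, needle: u8) -> usize {
    if buf.is_null() || len == 0 {
        return 0;
    }
    // The unchecked promise made by the C callers is confined to this one line.
    let slice = unsafe { std::slice::from_raw_parts(buf, len) };
    count_byte_safe(slice, needle)
}
```

The shim is the unsafe interface at the boundary: it stays until the callers are ported too, at which point they can call `count_byte_safe` directly.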

pizza234

3 months ago

> You can very much translate C to Rust on a function-by-function basis, the only issue is at the boundary

Absolutely not. There are many restrictions of Rust that will prevent that. Lifetimes, global state come to mind first. Think about returning pointer to some owned by the caller - this can require massive cascading changes all over the codebase to be fixed.

zozbot234

3 months ago

These are restrictions of idiomatic Safe Rust. You can use either unsafe Rust or, in many cases, less idiomatic but still Safe Rust to sidestep them. (For instance, "aliasable mutable" but otherwise valid references which can often be expressed as &Cell<T>, etc.)

You might still need a "massive cascading change" later on to make the code properly idiomatic once you have Rust on both sides of the boundary, but that's just a one-time thing and quite manageable.
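A small sketch of the `&Cell<T>` approach, assuming a hypothetical C function `void add_into(int *dst, const int *src)` whose callers sometimes pass the same pointer twice:

```rust
use std::cell::Cell;

/// Both parameters may refer to the same location; `&Cell<i32>` permits that,
/// whereas `&mut i32` plus `&i32` to the same value would be rejected.
pub fn add_into(dst: &Cell<i32>, src: &Cell<i32>) {
    dst.set(dst.get() + src.get());
}

/// Existing `&mut` data can be viewed as a cell without copying it.
pub fn double_in_place(x: &mut i32) {
    let cell = Cell::from_mut(x);
    add_into(cell, cell); // two aliases of the same location, still safe Rust
}
```

This is safe but not idiomatic: the signature advertises that aliasing is allowed instead of designing it away.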

pizza234

3 months ago

> You can use either unsafe Rust or, in many cases, less idiomatic but still Safe Rust to sidestep them. (For instance, "aliasable mutable" but otherwise valid references which can often be expressed as &Cell<T>, etc.)

There's no doubt that one can convert C into unsafe Rust - C2Rust can automatically convert an entire C codebase into unsafe Rust.

The problem is that after such step (which is certainly valuable), converting the code to safe Rust is typically a lot of work, which is the point of the academic research in question. Half baked code, using safety workarounds, doesn't provide any value to a project.

adgjlsfhk1

3 months ago

unsafe rust still has to follow invariants, you're just promising the compiler that it does

zozbot234

3 months ago

Yes, clearly it's a matter of using different facilities that may only be accessible to Unsafe Rust, and changing the interface accordingly. But to state that Rust as a whole has such restrictions is not correct.

sevensor

3 months ago

Surely if you do this, you just end up expressing your C design in different syntax?

Doing the right thing means writing different functions with different signatures. Incrementalism here is very hard, and the smallest feasible bottom up replacement for existing functionality may be uncomfortably large. Top down is easier but it tends to lock in the incumbent design.

zozbot234

3 months ago

> Surely if you do this, you just end up expressing your C design in different syntax?

Using different syntax is not pointless: the syntax allows you to express limited invariants that are expected to be comprehensively upheld by the surrounding C code. These invariants will initially be extremely broad (e.g. "this function must always get a $VALID pointer as input", for whatever values of $VALID), since they cannot be automatically checked; but they can gradually become stricter as more and more of the codebase is rewritten to be memory safe. Does this sometimes involve " cascading changes"? Yes, but much smaller than a from-scratch 100% rewrite into Safe Rust.
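One way to picture invariants becoming stricter over time, as a sketch with a hypothetical `Packet` type: the first version only documents what a valid pointer means, while the second moves the same invariant into the type so the compiler checks it.

```rust
pub struct Packet {
    pub len: u32,
}

/// Stage 1: a broad, unchecked invariant that the surrounding C code must uphold.
///
/// # Safety
/// `p` must be non-null, aligned, and point to an initialized `Packet`
/// that stays alive for the duration of the call.
pub unsafe fn packet_len_raw(p: *const Packet) -> u32 {
    unsafe { (*p).len }
}

/// Stage 2: once the callers are Rust, a `&Packet` carries the same guarantees
/// by construction, so the function no longer needs to be `unsafe`.
pub fn packet_len(p: &Packet) -> u32 {
    p.len
}
```

Moving a function from stage 1 to stage 2 only requires changing its direct callers, which is the kind of limited cascading change the comment describes.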

the__alchemist

3 months ago

My 2C: What we need isn't a translator, but painless FFI. The FFI tools available, like cc and bindgen, produce working results most of the time, but they need [manual] wrapping.

It's kind of a similar situation (although a bit more complicated) exposing Rust libs in Python; PyO3/maturin do the job, but you have to manually wrap.

So... I would like tools that call C code from rust, but with slices etc instead of pointers.

zozbot234

3 months ago

> I would like tools that call C code from rust, but with slices etc instead of pointers.

A slice is just a bundle of pointer + size. C raw interfaces vary on how they express the "size" part, so the point of wrapping is translating that information into whatever bespoke way is expected by the code you're working with.
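A minimal sketch of such a wrapper. The function `sum_i32` stands in for a hypothetical C function `int64_t sum_i32(const int32_t *data, size_t len)`; it is written in Rust here only so that the example is self-contained.

```rust
/// Stand-in for the C side: takes the pointer and the size as separate arguments.
unsafe extern "C" fn sum_i32(data: *const i32, len: usize) -> i64 {
    let mut total = 0i64;
    for i in 0..len {
        total += i64::from(unsafe { *data.add(i) });
    }
    total
}

/// Safe wrapper: the slice supplies both halves of the (pointer, size) pair.
pub fn sum(values: &[i32]) -> i64 {
    // SAFETY: `values` is a valid slice, so its pointer is readable for `len` elements.
    unsafe { sum_i32(values.as_ptr(), values.len()) }
}
```

The wrapper is trivial in this case because the C convention is exactly pointer plus length. It stops being mechanical when the size is implied by a sentinel, a struct field, or another argument.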

the__alchemist

3 months ago

Good insight! I guess I don't really understand why we can't use native types then. I don't want to keep having to write these:

  pub fn fir_q31(
      s: &mut sys::arm_fir_instance_q31,
      input: &[i32],
      output: &mut [i32],
      block_size: usize,
  ) {
      // void arm_fir_q31 (
      //     const arm_fir_instance_q31 * S,
      //     const float32_t * pSrc,
      //     float32_t * pDst,
      //     uint32_t blockSize
      // )
      // Parameters
      //     [in]  S          points to an instance of the floating-point FIR filter structure
      //     [in]  pSrc       points to the block of input data
      //     [out] pDst       points to the block of output data
      //     [in]  blockSize  number of samples to process
      // Returns
      //     none

      compiler_fence(Ordering::SeqCst);
      unsafe {
          sys::arm_fir_q31(s, input.as_ptr(), output.as_mut_ptr(), block_size as u32);
      }
  }

user

3 months ago

[deleted]

IshKebab

3 months ago

It's not pointless. For a start it frees you from the C toolchain so things like cross-compilation and WASM become much easier.

Secondly, it's a sensible first step in the tedious manual work of idiomatic porting. I'm guessing you didn't read the article but it's about automating some of this step too.

krater23

3 months ago

The big bloaty part is the Rust toolchain, not the C toolchain. But that aside, you are now free from a C toolchain and have unmaintainable, automatically generated unsafe Rust code. Don't see a win there.

IshKebab

3 months ago

The Rust toolchain is waaaay better than the C toolchain. Pretty much everyone agrees on that.

And yes the generated Rust code is going to be harder to maintain, but you aren't supposed to just run c2rust and then stop. It's a starting point for converting to idiomatic Rust.

pjmlp

3 months ago

Cross-compilation is not hard; it appears that what used to be common knowledge about compilers and linkers is nowadays considered TL;DR content for all practical learning purposes.

Animats

3 months ago

The article doesn't address the hard problem of figuring out array sizes. There's some work going on as part of the DARPA TRACTOR program to work on that. This area, of course, is the usual cause of buffer overflows.

The goal is to convert C pointers to Rust arrays, pointer arithmetic to Rust slices, and array allocations to Vec initialization. The hard problem is figuring out the sizes of arrays, which is going to require global analysis down the call chain.
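For illustration, the target of such a conversion might look like the sketch below once the sizes are known. The C originals in the comment and the function names are hypothetical, and the sketch deliberately skips the hard part, which is inferring the size in the first place.

```rust
// Hypothetical C originals:
//   int *make_squares(size_t n);                          /* malloc + fill loop */
//   int sum_tail(const int *p, size_t n, size_t skip);    /* p += skip; ...     */

/// The array allocation becomes a `Vec`, which carries its own length.
pub fn make_squares(n: usize) -> Vec<i32> {
    (0..n).map(|i| (i * i) as i32).collect()
}

/// `p + skip` becomes a subslice. `get` returns `None` when `skip` is past
/// the end, where the C version would have read out of bounds.
pub fn sum_tail(values: &[i32], skip: usize) -> Option<i32> {
    let tail = values.get(skip..)?;
    Some(tail.iter().sum())
}
```

Note that the separate `n` parameter disappears from `sum_tail`: the translator can only drop it if it has proven that `n` really was the length of the buffer behind `p`.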

If you're going to publish papers on this, please address that problem.

uecker

3 months ago

Of course, once you have identified the bounds of each pointer you could just do bounds checking in C.

AlotOfReading

3 months ago

That's not actually sufficient in the general case where the pointer may not be the type of the underlying object. You also have to respect strict aliasing even if the bounds are correct. This isn't true in the same way in Rust because memory is untyped. You only need to ensure basic memory validity (range, initialization, alignment, etc).

uecker

3 months ago

Yes, you also do not want to do random casts, but this is even easier. I do not get your point about memory validity in Rust. What if you write where a pointer is stored, or even a boolean?

AlotOfReading

3 months ago

I'm talking about type punning specifically here. There's a lot of old C code out there that stores everything in int * buffers and casts pointers back to the correct type. I'm even aware of one toolchain for a widely used MCU that typedef'd char to int (i16).

I believe this would be legal in Rust today if you respected the other rules, with the caveat that it wouldn't be remotely idiomatic or possible without unsafe.

kant2002

3 months ago

I took a look at the projects that attempt to convert C to Rust, and even though the article talks about uplifting C to idiomatic Rust, or using decompilation techniques, I do not see any of that in any existing project.

- C2Rust: https://github.com/immunant/c2rust From what I see, very limited testing of what C can be uplifted.
- Citrus: https://gitlab.com/citrus-rs/citrus Overall almost no tests. I would not even mention this project, since it seems to be working on hope. https://gitlab.com/citrus-rs/citrus/-/tree/master/tests?ref_...
- Corrode: https://github.com/jameysharp/corrode Written in Haskell, not in Rust like the others. It has potential, since it uses csmith for testing, but it still lacks tests. https://github.com/jameysharp/corrode/tree/master/scripts

I really don't see any project that attempts to reverse engineer Rust idioms from C, even in limited contexts. Maybe the goal of the article is to inspire all of us. Or maybe I missed some other existing projects?

steveklabnik

3 months ago

C2Rust is the largest project here, it was born out of corrode. The general idea of it is to do a direct C -> unsafe Rust port, and then also provide tooling to convert unsafe Rust -> safe Rust, as a secondary tool to be used as a second step. I don't think they've shipped much of that second tool yet, but I haven't checked in in a while.

kant2002

3 months ago

Yeah. After reading the article I became super interested in looking at how they uplift things and was honestly a bit disappointed. Did not try to look for Huawei work for obvious reasons.

steveklabnik

3 months ago

Rust does not have strict aliasing, that’s correct.

uecker

3 months ago

But Rust still has trap representations, or? In practice, this implies similar constraints as strict aliasing.

steveklabnik

3 months ago

That’s a related concept but not the same thing.

The “validity” rules in Rust (like C’s trap representations) are about which bit patterns represent valid values, not about which pointer types may alias the same memory.

Strict aliasing is a type-based access rule; validity is a value-based representation rule.

Rust enforces the latter (invalid values are undefined behavior), but explicitly does not enforce the former (raw pointers may alias arbitrarily). That’s the difference.

To put it into code:

  float foo(const uint32_t *xs) {
      const float *fp = (const float*)xs;
      return *fp; // UB: accesses a uint32_t object through an incompatible effective type
  }

But in Rust, the equivalent:

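  fn foo(xs: *const u32) -> f32 {
      let ptr = xs as *const f32; // the cast is fine, and so is a later access through ptr
      unsafe { *ptr } // not UB, provided xs is valid, aligned, and initialized
  }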

(I am on my phone and so may have made small errors and revised a few times, my apologies.)

It’s also true you must pay attention to validity: this works in Rust because every u32 bit pattern is also a valid f32 (IEEE-754). If you tried u8 -> bool, you’d hit UB not because of aliasing, but because most u8 patterns are invalid for bool (only 0 and 1 are valid). Creating the pointer is fine; dereferencing would only be okay if you first ensure at runtime that the stored bits are valid for the target type, and alignment is satisfied.
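The distinction can be sketched in a few lines. The function names are hypothetical; the first pun needs no check because any 32-bit pattern is a valid `f32`, while the second must check the stored byte before reading it as a `bool`.

```rust
/// Reads a `u32` location through an `f32` pointer.
pub fn read_as_f32(x: &u32) -> f32 {
    let p = x as *const u32 as *const f32;
    // SAFETY: `p` comes from a live reference, `u32` and `f32` have the same size
    // and alignment on mainstream targets, and every bit pattern is a valid `f32`.
    unsafe { *p }
}

/// Reads a `u8` location as a `bool`, but only after checking validity.
pub fn read_as_bool(x: &u8) -> Option<bool> {
    match *x {
        0 | 1 => {
            let p = x as *const u8 as *const bool;
            // SAFETY: the byte was just checked to be 0 or 1, the only valid `bool` values.
            Some(unsafe { *p })
        }
        // Reading any other byte as `bool` would be undefined behavior.
        _ => None,
    }
}
```

In everyday code `f32::from_bits` and a plain comparison do the same jobs without `unsafe`; the raw-pointer form is shown only because it is what a mechanical translation of C type punning would produce.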

uecker

3 months ago

I understand the difference and this is why I said "similar constraints". Once you have non-value (trap) representations you need to be careful when dereferencing pointers cast to different types, even if you do not have strict aliasing. The point under the discussion upthread was that in unsafe Rust this would not be a problem because it does not have strict aliasing and my argument that it still is would be because of non-value representations.

steveklabnik

3 months ago

It's true that care still needs to be taken. Many people in C are critical of strict aliasing (Linus as a major example) while not being too worried about the dangers when punning. Strict aliasing adds additional things you need to worry about on top of the representation issues. It's a worry you have in C (unless you use a flag and write non-standards conforming code) that you don't have in Rust.

zozbot234

3 months ago

Rust has pointer provenance which implies very similar constraints to the "typed memory" wording of C/C++.

AlotOfReading

3 months ago

Does it? It's very unclear to me whether something like type punning is prohibited by provenance today. The docs don't provide much clarity, and the comments I can find by ralf suggest the details are undecided. I can't imagine it won't be eventually prohibited since we already have hardware designs prohibiting it and it's a terrible code pattern to begin with, but I don't know if the language currently does so.

steveklabnik

3 months ago

This isn’t correct. Just because Rust has aliasing rules doesn’t mean they’re the same sorts of rules.

C and C++ are also looking to adopt more formal provenance rules.

pizlonator

3 months ago

The code I've seen that was autotranslated from C to Rust has an absolutely hopeless number of unsafe statements.

You're better off using Fil-C.

jrpelkonen

3 months ago

Fil-C is an innovative approach and a great technical achievement. However, I wouldn't suggest that it is a universal solution without caveats. For instance, the performance penalty of up to 4x is not acceptable in a lot of cases.

Also, the c2rust output is rough but not hopeless: There are real world success stories of rust projects that were bootstrapped via c2rust, e.g. https://tweedegolf.nl/en/blog/151/translating-bzip2-with-c2r...

pizlonator

3 months ago

bzip2 is tiny, has relatively low overhead in Fil-C (forget exactly what it is but not 4x), and last I checked this Rust version still has >100 uses of unsafe.

galangalalgol

3 months ago

Fil-C doesn't stop the data race problems the borrow checker would catch does it?

Has anyone tried pointing an agentic ai at recreating a c utility by looking only at the man page and using differential fuzzing? It isn't a port, so no licensing issues, and the code would use unsafe, and presumably be more idiomatic. I have no idea if it would ever complete, or just get stuck in an endless loop. Or even if it did succeed, how many joules it would use.

krater23

3 months ago

I'm sure when you try to get an AI to recreate such a tool, the code would be unmaintainable, bloated, slow and shitty, but in the end it would work in some way. Interesting topic, but nothing to go productive with.

pizlonator

3 months ago

> data race problems

No, Fil-C just makes races memory safe.

Also this is sort of changing the topic a bit since bzip is single threaded

galangalalgol

3 months ago

Even if it wasn't single threaded, it would probably have been fine grained OMP style multithreaded which runs into far fewer issues. I was just making sure I understood what Fil-C was doing. I hadn't heard of it. It seems like a great thing.

aitchnyu

3 months ago

Are they competing with WASM on performance?

kevincox

3 months ago

I would assume that these two use cases are basically completely separate.

Auto-translate from C to Rust would serve as a great step to starting a porting project. Now you can incrementally re-write the "basically C" auto-ported code to "proper Rust" without dealing with FFI and other pains that come from function-by-function ports.

Fil-C is great for running software that you don't want to port. (Or don't yet have the resources to port.)

Interestingly there is probably a gap between the two. When your project is pure C you can use Fil-C. However I don't think Fil-C supports Rust. So assuming that the initial C to Rust translation doesn't produce 100% safe code (I'm not aware of any current tools that do this) you have this middle state where you can no longer compile with Fil-C but have lots of unsafe Rust code. So maybe there is a use case for Fil-Rust where you compile your Rust program so that even unsafe blocks are in fact safe. This could be used until you complete the port.

ralegh

3 months ago

Wonder if it would be better to auto translate to broken rust, ie forcing the user to fix memory issues. I imagine that would lead to pretty big refactors in some cases though.

Animats

3 months ago

No. What comes out of C2Rust is awful. The Rust that comes out reads like compiler output. Basically, they have a library of unsafe Rust functions that emulate C semantics. Put in C that crashes, get Rust that crashes in the same way. Tried that on a JPEG 2000 decoder.

levodelellis

3 months ago

I find it funny AF that Fil-C is safer than languages with the unsafe keyword. Who knew C could be so safe with a proper compiler

Ar-Curunir

3 months ago

It is well known that GC allows you to solve memory safety problems

timeon

3 months ago

> proper compiler

Not just compiler but GC as sell. So it does not solve the same problem as Rust.

levodelellis

3 months ago

Would you rather have a gc or unsafe?

In just about every language I've seen, people use .clone rather than deal with problems, so I suspect in a lot of cases a GC can be just fine or faster. Although I'm comfortable with memory management and would rather use C or C++ if I'm writing fast code.

timeon

3 months ago

> Would you rather have a gc or unsafe?

Like in case where you can't use Rust? (ie.: existing codebase). Sure that is what Fil-C is good for. Point is that Fil-C does not solve the problem Rust does. It is more like band-aid. (Maybe my comment was misunderstood because of typo: sell/well)

Also I think there is huge difference between GC and fact that some people use .clone() somewhere.

levodelellis

3 months ago

[flagged]

Mond_

3 months ago

? Putting a program into a safe container or isolation boundary (this is roughly what GC is in this context) causes it to be memory safe. This is not an interesting observation. It also causes it to be significantly slower, to the point of not being competitive anymore.

uecker

3 months ago

The memory model of C is intentionally designed to allow safe implementations (still from the time of hardware-segmented methods).

CoastalCoder

3 months ago

Could you expand on that?

aw1621107

3 months ago

I believe the claim is that there's nothing in the C standard that requires implementations to be unsafe. If they wanted to, they could bounds check pointers, check allocations are still alive when pointers are dereferenced, etc. and still be conformant to the standard.

pornel

3 months ago

Nothing in the C standard requires bytes to have 8 bits either.

There's a massive gap between what C allows, and what real C codebases can tolerate.

In practice, you don't have room to store lengths along pointers without disturbing sizeof and pointer<>integer casts. Fil-C and ASAN need to smuggle that information out of band.
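Rust's own pointer types give a quick picture of this size problem: a pointer that carries its length in-band is twice as wide as a plain one, so it cannot be swapped in where code assumes pointer-sized storage. A minimal sketch (the function name is hypothetical):

```rust
use std::mem::size_of;

/// Returns (size of a thin raw pointer, size of a slice reference).
/// A slice reference stores the address and the element count side by side.
pub fn pointer_sizes() -> (usize, usize) {
    (size_of::<*const u8>(), size_of::<&[u8]>())
}
```

On a 64-bit target this yields 8 and 16 bytes. C code that assumes `sizeof(void *)` matches an integer type has no room for the second word, which is why the bounds have to be tracked somewhere else.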

uecker

3 months ago

Even more, certain rules are specifically designed to make such checks possible while being conformant to the standard.

user

3 months ago

[deleted]

stared

3 months ago

Tried Claude Code with explicit instructions to create idiomatic code and avoid unsafe statements?

procaryote

3 months ago

The other direction might be more interesting, in case rust drops in popularity in a couple of years, leaving behind a bunch of "let's rewrite in rust" efforts

speedgoose

3 months ago

I am not convinced that anyone would take a working rust project and rewrite it in C. I don’t see any good reason to do so.

When rust will lose popularity, it is going to happen eventually, I would bet it’s in favour of a newer and more promising programming language. Not C.

VBprogrammer

3 months ago

I think Rust has hit critical mass. It's now basically the default choice for something you want to perform well but want to be reasonably secure. For example, uv in the python ecosystem.

foldr

3 months ago

If you read HN you might get that impression, but the vast majority of software that needs security and good performance is being written in Java.

dana321

3 months ago

If you were building a programming language, would you write it in Java or Rust?

foldr

3 months ago

I'm not personally a fan of Java, but if I was implementing a compiler, I'd pick a language with GC. There's pretty much no downside to a GC in that context, and it gives you more flexibility when working with graph data structures.

If 'building a programming language' means writing an interpreter or VM, then I can see the attraction of Rust for that case. But writing interpreters and VMs is like 0.0001% of the programming that gets done in the world.

pjmlp

3 months ago

In those two alone, Java.

There is no reason I would care about borrow checking when implementing a compiler, and besides all the tooling, Java also has stuff like ANTLR and MPS, and naturally Graal is a good playground for compiler backend tooling.

However in general, I would rather look into OCaml, Haskell, F#, Scala.

childintime

3 months ago

Graal and Truffle make the JVM look attractive, especially for this case!

VBprogrammer

3 months ago

I wouldn't be surprised if that was closer to the truth. A heck of a lot of boring software runs on the JVM. That said, it's a slightly different niche from command line tools.

pjmlp

3 months ago

Alongside C# in more Microsoft influenced culture shops. :)

andrewmcwatters

3 months ago

[dead]

m00dy

3 months ago

Rust is the clear winner of the LLM era. With code generation being so effortless, why would you write in any other language?

throawayonthe

3 months ago

i don't use LLMs, but i've heard people complain current LLMs are not good at writing Rust

wizzwizz4

3 months ago

Current LLMs are not good at writing any language you actually understand, unless you do so much of the work that you might as well have written the whole program yourself.

They're excellent at doing things I'm not an expert at, though! https://en.wikipedia.org/wiki/Gell-Mann_amnesia_effect

galangalalgol

3 months ago

We should make calculators like this for kids to learn on. Every so often it makes mistakes that you will spot if you could have done the arithmetic yourself and are just saving time. That is where ai code is at right now.

bigstrat2003

3 months ago

This is exactly why I don't trust LLMs (and therefore why I don't use them). When dealing with something I know about I can see the many mistakes they make - I would have to be a complete fool to trust them to do better on subjects I don't know about.

m00dy

3 months ago

yeah that narrative was popular last year. You can't go wrong with LLMs on Rust.

pessimizer

3 months ago

That narrative is still popular with LLMs themselves. If you ask an LLM whether it can code Rust, it will tell you that it can but not very well.

They're good at web languages, python, and C/C++. As far as I can tell Rust works if you're already good at Rust and you can catch its screwups and strange architecture choices quickly.

morcus

3 months ago

Maybe I'm doing it wrong (using a variety of models on GitHub Copilot) but in complex tasks I often find that they give me code that doesn't quite compile (often due to lifetime errors, sometimes other issues)

_alternator_

3 months ago

Try agents like Claude code. My experience was that the initial code was conceptually correct with some type errors on the first pass. It then iterated on compile errors about 6 times, tweaking the code to resolve the issues. Then it compiled and ran correctly.

This was about 500 lines of working rust in about 10 minutes, approximately 25x my pace at writing rust. (I’m a bit of a beginner.)

pjmlp

3 months ago

The ultimate goal is for LLMs to replace languages and directly perform tasks. Why bother with Rust when we will be using agentic runtimes?

m00dy

3 months ago

I feel so safe when my Rust code compiles; it feels like the program will run forever. I'm not sure what you mean by "agentic runtimes," but if they offer the same safety standards as Rust, I wouldn't mind using them.

pjmlp

3 months ago

Beware of expecting too much:

https://edera.dev/stories/tarmageddon

m00dy

3 months ago

What you're sharing is not Rust-specific; it's the same for npm and PyPI packages.

Rust is native binary + fearless concurrency + memory safe and AI can help you to achieve these targets very fast. That's why Rust is the winner of all the languages, every software needs to be fast, secure and able to run forever.

pjmlp

3 months ago

How is Rust winning on CUDA and Khronos standards?

m00dy

3 months ago

wait until China releases their open source CUDA on pure rust code.

nacozarina

3 months ago

new chips will always have a c compiler available long before anything else

Avamander

3 months ago

I would assume that an LLVM backend is created for new chips and then C is not the only thing getting support. There's very little point in just supporting C in that sense.

nicoburns

3 months ago

That doesn't seem to have been an issue for recent new CPU architectures. RISC-V has excellent Rust support for example.

camel-cdr

3 months ago

Not really. Rust still doesn't support Arm SVE or RVV intrinsics.

nicoburns

3 months ago

I suppose so. I'd see that as more of a missing Rust language feature (SIMD support is still immature) rather than a platform support issue though.

pjmlp

3 months ago

Neither does C, the regular ISO C as defined by WG14.

pjmlp

3 months ago

Alongside a much safer C++ compiler.

In 2025 there are hardly any single-language compiler toolchains being released.

Also if the chip toolchain is based on a GCC or clang fork, there are several frontends to choose from.

ghthor

3 months ago

Compile speed may be the only one. But hopefully that keeps becoming less of a difference.

whatpeoplewant

3 months ago

Language popularity is cyclical; the hedge is to treat Rust as an implementation detail behind stable, language-agnostic boundaries (protocols, C ABI, WASM) and invest in strong tests/specs. If Rust wanes, migrate piecemeal: keep interfaces, reimplement modules elsewhere, and verify parity with property tests and benchmarks. Multi-agent, agentic LLM workflows can prototype alternatives in parallel, generate FFI/interop shims, and cross-check behavior to de-risk the swap without another “big rewrite.”

rererereferred

3 months ago

That would also help use Rust in platforms that only have a C compiler.

galangalalgol

3 months ago

People have used mrustc like that to put rust on a c64. The number of targets that make sense from a word length perspective that aren't already supported by llvm are pretty small I think? You aren't going to compile rust to some fixed point dsp where a long is 48bits. The c anything is likely to generate won't compile in whatever odd not-quite-ansi c compiler the chip maker provides.

indigoabstract

3 months ago

That could be interesting. If some new language or tool appears that automatically figures out the correct lifetime and ownership of the resources in your program, people (might be the same people) will call for rewrites from Rust into the new language, as you would no longer have to assign memory ownership manually.

jurschreuder

3 months ago

In a way this is strange because there is a huuuge new area of vulnerabilities caused by LLMs writing code that DWARFS the read/write out of array bounds issues C has.

cue_the_strings

3 months ago

I agree.

But on the other hand, let's not kid ourselves, array out of bounds, use after free, resource leaks and bad type system, all of this isn't even close to an exhaustive list of C downsides. Beyond its direct limitations, C inspires an approach that is vastly inferior even if you follow all the best practices. Even compared to (modern) C++ it's much worse. I say this and I kind of like C.

If the approaches described in the article save us 30% of the effort of translating C codebases to Rust, it's still worth trying; we're unfortunately not very close to complete automation, but that's something worthy of pursuit.

tmountain

3 months ago

I understand the issues related to LLM leaking and re-distributing "private" information, but I'm curious which category of concerns you're referring to. Would you mind giving some context (genuinely curious) ?

jurschreuder

3 months ago

You can look at the vulnerabilities found graph of github. It stays about the same for years and then skyrockets up at around the time LLMs were invented.

And they are all pretty simple vulnerabilities, exploitable even by people who know nothing about how to get root access from a binary that has an out of bounds condition somewhere in a randomly shuffled memory layout in a specific version of a C program.

pornel

3 months ago

The code needs to pass integrity checks of the safe Rust subset, which is a different challenge than writing dangerous code without feedback.

alkonaut

3 months ago

The key invention here would be to translate from idiomatic C to idiomatic - safe - Rust.

That also sounds exactly like the kind of invention that would make me fear for my job and claim AGI has all but arrived.

Just syntactically translating C code to mostly unsafe or non-idiomatic Rust seems like a pretty pointless exercise?

pantalaimon

3 months ago

I just upgraded my Ubuntu to the new version with Rust written Coreutils - this is insane

    % size /usr/bin/ls
       text    data     bss     dec     hex filename
    10086795  731540    2104 10820439  a51b57 /usr/bin/ls

    % ls -sh /usr/lib/cargo/bin/coreutils/ls
    11M /usr/lib/cargo/bin/coreutils/ls

    % du -sh /usr/bin
    1.5G /usr/bin

gpm

3 months ago

The entire rust coreutils package, as installed, is 12 MB https://packages.ubuntu.com/questing/rust-coreutils Which is nearly double the gnu coreutils package but still a complete nothing burger: https://packages.ubuntu.com/questing/gnu-coreutils

I think what's happening here is that they've all been compiled into one binary, and then that one binary hardlinked to a variety of names like /usr/bin/ls. Since they all show as having the same inode and the same size.
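A sketch of how a single binary can serve many names. This assumes the usual multi-call approach of choosing the tool from the name the program was invoked as; the applet table and function names below are hypothetical and are not taken from the coreutils source.

```rust
use std::path::Path;

/// Extracts the invoked name, e.g. "/usr/bin/ls" -> "ls".
pub fn applet_name(argv0: &str) -> &str {
    Path::new(argv0)
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or(argv0)
}

/// Picks an applet based on the invoked name. Every hard link points at the
/// same file, so the name is the only thing that differs between invocations.
pub fn dispatch(argv0: &str) -> Option<&'static str> {
    match applet_name(argv0) {
        "ls" => Some("would run the ls applet"),
        "cat" => Some("would run the cat applet"),
        _ => None,
    }
}
```

Under this scheme each name reports the full size of the shared binary, so adding up per-file sizes overstates the real disk usage.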

The other 1.5G of your 1.5G /usr/bin is unrelated to rust coreutils.

pantalaimon

3 months ago

You are absolutely right!

    % du -sh /usr/lib/cargo/bin/
    13M /usr/lib/cargo/bin/

Just a bit odd they went for hard links instead of soft links, makes it harder to tell that it's all the same file.

user

3 months ago

[deleted]