Revisiting Loop Recognition in C++ in Rust

27 points, posted 8 months ago
by todsacerdoti

13 Comments

pjmlp

8 months ago

I was expecting that the C++ code would have been updated to C++23 best practices, and what are all those std::list doing in modern CPU cache lines?

npalli

8 months ago

Not only is it not C++23, the C++ code is from the 2011 paper, so it's pre-C++11. "Rust may be better than code written before C++11" is not a strong take.

pjmlp

8 months ago

Yes, the code style has pretty much a C++98/C++ARM feel to it.

quietbritishjim

8 months ago

In fairness, updating to a range-based for loop would give the same effective code as using iterators manually in the loop as they have done. I'm not convinced any new features in C++23 (after C++11) would have given a performance improvement - was there something you had in mind?
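To make the equivalence concrete, here is a minimal sketch (the function names are made up for illustration and are not from the benchmark). Both functions walk a std::list the same way; the range-based for is specified in terms of begin()/end() iterators, so a compiler would be expected to generate essentially the same code for both.

```cpp
#include <cassert>
#include <list>

// Pre-C++11 style: explicit iterator loop.
// (Hypothetical helper, not taken from the benchmark source.)
int sum_with_iterators(const std::list<int>& values) {
    int total = 0;
    for (std::list<int>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}

// C++11 style: range-based for over the same container.
// The loop is defined in terms of begin()/end(), so the traversal is the same.
int sum_with_range_for(const std::list<int>& values) {
    int total = 0;
    for (int value : values) {
        total += value;
    }
    return total;
}
```

The second version is shorter and harder to get wrong, which is an ergonomics gain rather than a performance one.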

The choice of std::list is a bit more dubious. Looking at the code (I found a mirror [1]), it seems like it's mostly for appending and iterating but there is pop_front in one place. Maybe std::deque would be better? Or std::vector with an iterator noting the effective start? Or maybe they were deliberately choosing a poor data structure to stress the compiler (and if so the OP should've used a linked list in Rust for fairness). The article doesn't comment so we can only guess their motivation.

[1] https://github.com/hundt98847/multi-language-bench/blob/mast...
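As a rough sketch of the "std::vector with an iterator noting the effective start" idea: the class below is hypothetical (VectorQueue and its members are names invented here, not from the benchmark) and uses an index rather than an iterator, since push_back can invalidate vector iterators. It assumes the workload is append, iterate, and occasionally pop from the front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical FIFO worklist backed by contiguous storage.
// pop_front() does not erase anything; it only advances head_.
template <typename T>
class VectorQueue {
public:
    void push_back(const T& value) { items_.push_back(value); }

    bool empty() const { return head_ == items_.size(); }
    std::size_t size() const { return items_.size() - head_; }

    const T& front() const {
        assert(!empty());
        return items_[head_];
    }

    // O(1): skip the first live element instead of shifting the rest.
    void pop_front() {
        assert(!empty());
        ++head_;
    }

    // Iteration covers only the elements that have not been popped.
    typename std::vector<T>::const_iterator begin() const {
        return items_.begin() + static_cast<std::ptrdiff_t>(head_);
    }
    typename std::vector<T>::const_iterator end() const { return items_.end(); }

private:
    std::vector<T> items_;
    std::size_t head_ = 0;  // index of the effective start
};
```

The trade-off is that popped elements stay in memory until the container is destroyed, which is fine for a short-lived worklist but not for a long-running queue. std::deque avoids that at the cost of less contiguous storage.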

pjmlp

8 months ago

> was there something you had in mind?

More about how to write the code in a more ergonomic way than about performance by itself; maybe some of that stuff could even be constexpr/consteval.

npalli

8 months ago

std::list to std::vector should be the big one.

std::map to std::unordered_map could be next.

then, really ranges/constexpr/std::move could make a difference, hard to say definitely.

Beyond these, Modern C++ would have most definitely led to much shorter code as that was a metric for comparison.
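For illustration only, a small sketch of what the std::map to std::unordered_map swap looks like when the code only does lookups and insertions. count_occurrences is a hypothetical helper, not something from the benchmark. The caveat is that std::unordered_map has no defined iteration order, so the swap is only safe where the code does not rely on sorted traversal.

```cpp
#include <cassert>
#include <map>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts how often each id appears.
// Map can be std::map<int, int> or std::unordered_map<int, int>;
// both support operator[] with value-initialized (zero) defaults.
template <typename Map>
Map count_occurrences(const std::vector<int>& ids) {
    Map counts;
    for (int id : ids) {
        ++counts[id];
    }
    return counts;
}
```

Calling count_occurrences<std::map<int, int>>(ids) and count_occurrences<std::unordered_map<int, int>>(ids) gives the same counts; only the ordered version guarantees that iteration visits keys in ascending order.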

quietbritishjim

8 months ago

> std::list to std::vector should be the big one

That is not a "C++23 best practice", which is what I was replying to. It doesn't even need C++11! And one use of this type uses pop_front() so std::vector is not obviously a good choice here.

> std::map to std::unordered_map could be next.

Again, I called out C++11 as possibly making a difference - sure this could help but it doesn't need C++23.

> then, really ranges/constexpr/std::move could make a difference, hard to say definitely.

How? Ranges are a nice syntax but what would they speed up here? There doesn't seem to be anything evaluated at compile time so what's the benefit of constexpr? (Even std::move doesn't have an obvious use in the benchmark code but that's C++11 anyway.)

> Beyond these, Modern C++ would have most definitely led to much shorter code as that was a metric for comparison.

I agree that would be interesting, but I'd be surprised if the code was much shorter. I'd guess something like 10%.

kvemkon

8 months ago

> Raspberry Pi OS Bookworm

> g++ (Debian 12.2.0-14) 12.2.0

> cargo 1.87.0 (99624be96 2025-05-06)

Looks like cargo is not from the OS repo. So either use both the outdated gcc and rust from RaspiOS, or install the recent gcc-15 from experimental.

SkiFire13

8 months ago

Moreover they also differ in the optimizer being used. A more fair comparison would use Clang for C++ or rustc_codegen_gcc for Rust, possibly with the same versions of LLVM and/or GCC/libgccjit.

Joker_vD

8 months ago

The whole blogpost reads surprisingly angry, for some reason. Is it just my impression?

> And Google can go to hell.

Oh, apparently the tone is intended. Why though?..

camblomquist

8 months ago

I somehow missed the comments on this post. I think I need to respond not just to this but to the other (completely valid) criticisms in the comments here. And because you bring up what I think is a more interesting problem, you're the comment I'm replying to.

The anger, I think, stewed for a long time, but it was there very nearly from the beginning of this project two years ago (I stopped working on it for quite a while).

First, my anger was directed at C++: std::map being forced to be a Red-Black Tree, for one. And I had originally written a lot about that anger in the first draft. I had written about how, if this paper had baked in the oven a bit longer, they would've had C++11 to work with. I wanted to try to write a Modern C++ version of the code, but that goal was what kept me from touching the project for over a year. I had gotten sick of C++ as a whole, both at work and in my personal projects. And by the time I got back to it, the discussion on C++ no longer felt like it had a place in the post if it were to have the structure it ended up having.

On the rewrite, I found myself getting more and more frustrated by the original paper itself. The methodology felt flawed, yet I felt I had to keep it. I said in the aside that reading the paper gave me the impression that they wanted to shill Go with it, as if this was going to be its big debut as a language, until it failed pretty much every test thrown at it. I don't know how this turned into Google Hate. Or at least, I don't remember. Maybe something about the current state of the world, or an impression of Google company culture based on the tone of the paper. Maybe I just didn't want to be angry at Hundt specifically.

Much like the original paper, this post should've spent more time in the oven. But after spending so much time on it, it also felt like I just needed to get it out there so I could be done with it.

krona

8 months ago

By my reading, the C++ FindSet() implementation of the union/find algorithm builds a temporary list! Incredible scenes.
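For contrast, a minimal sketch of a find with path compression that allocates nothing. This is not the benchmark's FindSet(); DisjointSets and its members are hypothetical names, and nodes are assumed to be plain indices rather than the benchmark's node objects. It walks to the root once, then walks the same path again to repoint each node, so no temporary list is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical index-based union/find; every node starts as its own root.
class DisjointSets {
public:
    explicit DisjointSets(std::size_t count) : parent_(count) {
        std::iota(parent_.begin(), parent_.end(), std::size_t{0});
    }

    // Two-pass find: locate the root, then compress the path in place.
    std::size_t find(std::size_t node) {
        assert(node < parent_.size());
        std::size_t root = node;
        while (parent_[root] != root) {
            root = parent_[root];
        }
        // Second pass: point every node on the path directly at the root.
        while (parent_[node] != root) {
            const std::size_t next = parent_[node];
            parent_[node] = root;
            node = next;
        }
        return root;
    }

    // Merges the sets containing a and b (no union by rank, for brevity).
    void unite(std::size_t a, std::size_t b) {
        const std::size_t root_a = find(a);
        const std::size_t root_b = find(b);
        parent_[root_a] = root_b;
    }

private:
    std::vector<std::size_t> parent_;
};
```

The second pass replaces the temporary list: the path is still reachable through the parent links, so it can simply be traversed again.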