pizlonator
7 months ago
It’s cool to see this kind of analysis, even if it’s analyzing a totally bogus benchmark.
If you want to compare language runtimes, compilers, or CPUs, you have to pick a larger workload than just one loop. So if a microbenchmark is an experiment, it is a truly bad experiment indeed.
Reason: loops like this are easy for compilers to analyze, in a way that makes them unrepresentative of real code. The hard part of writing a good compiler is handling the hard-to-analyze cases, not loops like this. So if a runtime does well on a bullshit loop like this, it doesn't mean it'll do well on real stuff.
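For reference, the loop in question is roughly this shape; I'm paraphrasing it in Go from memory, so the constants and details may differ from the actual benchmark:

    package main

    import (
        "fmt"
        "math/rand"
        "os"
        "strconv"
    )

    func main() {
        // The divisor comes from argv so it can't be constant-folded,
        // but the inner loop is still a pure, side-effect-free function
        // of u: a compiler can collapse it without ever running it.
        u, _ := strconv.Atoi(os.Args[1])
        r := rand.Intn(10000)
        a := make([]int32, 10000)
        for i := 0; i < 10000; i++ {
            for j := 0; j < 100000; j++ {
                a[i] += int32(j % u)
            }
            a[i] += int32(r)
        }
        fmt.Println(a[r])
    }

Every outer iteration computes the same inner sum, so a sufficiently aggressive optimizer reduces the billion "iterations" to one small computation. Real code almost never hands the compiler a gift like that.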
(Source: I wrote a bunch of the JSC optimizations, including the loop reshaping and the modulo ones mentioned in this post.)
ergeysay
7 months ago
> loops like this are easy for compilers to analyze, in a way that makes them not representative of real code
Which makes it a perfectly fine benchmark to measure whether a particular compiler implements these optimisations. The benchmark also highlights fine implementation details. I did not know about Dart's interrupt checks, for instance.
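(As I understand it, the compiled loop carries a poll on the back edge, conceptually something like the sketch below. The names are made up for illustration; this is not actual Dart output.)

    import "sync/atomic"

    var interruptPending uint32 // hypothetical flag the VM sets when it needs attention

    func handleInterrupt() {} // hypothetical: service GC, stack-check, message requests

    func loopWithPoll(n int, body func(int)) {
        for j := 0; j < n; j++ {
            body(j)
            // Back-edge poll: the per-iteration price paid so a hot loop
            // can be paused for GC, stack checks, isolate messages, etc.
            if atomic.LoadUint32(&interruptPending) != 0 {
                handleInterrupt()
            }
        }
    }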
I see these microbenchmarks as genuinely useful: I can analyse them and the logic behind them, and apply the results to interpreter design. Consider [0], for example; a sketch of the idea follows the link. Any sane compiler would do this kind of optimisation, but I've seen only one production interpreter (daScript) do it.
[0] https://ergeysay.github.io/optimising-interpreters-fusion.ht...
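The gist of the fusion idea from [0], compressed into a sketch (the opcodes are invented for illustration, not daScript's actual ones):

    // A peephole pass has fused the common LOAD-then-ADD pair into a
    // single LOAD_ADD superinstruction, so the hot sequence pays for
    // one dispatch instead of two.
    const (
        opLoad    = iota // tmp = consts[operand]
        opAdd            // acc += tmp
        opLoadAdd        // fused: acc += consts[operand]
        opHalt
    )

    func run(code []int, consts []int) int {
        var acc, tmp int
        for pc := 0; pc < len(code); {
            switch code[pc] {
            case opLoad:
                tmp = consts[code[pc+1]]
                pc += 2
            case opAdd:
                acc += tmp
                pc++
            case opLoadAdd: // load and add in a single dispatch
                acc += consts[code[pc+1]]
                pc += 2
            case opHalt:
                return acc
            }
        }
        return acc
    }

Dispatch is often the dominant cost in a switch interpreter, which is why halving it on common pairs shows up so clearly in exactly this kind of microbenchmark.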
pizlonator
7 months ago
> Which makes it a perfectly fine benchmark to measure whether a particular compiler implements these optimisations.
No, because whether the optimizations are “implemented” doesn’t matter.
What matters is whether the optimizations are robust enough to trigger for real code, not just bogus dead loops.
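Toy example of what I mean, purely illustrative:

    // Any optimizer with basic induction-variable analysis collapses
    // this to the closed form n*(n-1)/2, and deletes the loop outright
    // if sum is never read.
    func triangular(n int) int {
        sum := 0
        for i := 0; i < n; i++ {
            sum += i
        }
        return sum
    }

Passing that test tells you nothing about loops with memory traffic, virtual calls, or data-dependent branches, which is where real programs spend their time.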
pkolaczk
7 months ago
And what if the runtime does poorly even on such a simple loop? Go is, surprisingly, slower here than Java and Kotlin.
I agree with the author of the blog here - microbenchmarks are just experiments and they can be very useful if you do proper analysis of the results. You can definitely learn something about the runtimes even from such a simple for loop benchmark.
pizlonator
7 months ago
As a VM implementer, I conclude nothing from the observation that Go is slower on this test. Because it’s a bullshit loop that I wouldn’t ever write and neither would anyone else unless they were trying to troll people with nonsense “experiments”.
That you’re drawing conclusions from Go being slower on this loop is just you being misled by a bad experiment.
If you want to understand Go’s performance, run some real benchmarks. My understanding is that Go is plenty fast.
pkolaczk
7 months ago
A simple loop like that running slower tells you that the compiler's optimization strength is not very good. If it misses optimizations on trivial code like looping and basic arithmetic, it will likely miss even more in complex code. And instead of getting defensive about your language of choice, the right reaction is what the Dart developers did: they improved their compiler. The benchmark actually proved useful to them.
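For example, here is one rewrite a strong compiler can apply to the benchmark's inner loop (a hand-written sketch of the transformation, not actual compiler output): because j increases by 1 each iteration, j % u can be tracked incrementally, so the hardware divide disappears from the hot path.

    // Before: one integer division per iteration.
    func sumModSlow(n, u int) int {
        s := 0
        for j := 0; j < n; j++ {
            s += j % u
        }
        return s
    }

    // After: rem tracks j % u incrementally (valid for u >= 1),
    // replacing the divide with an increment and a compare.
    func sumModFast(n, u int) int {
        s, rem := 0, 0
        for j := 0; j < n; j++ {
            s += rem
            rem++
            if rem == u {
                rem = 0
            }
        }
        return s
    }

A compiler that can't find even this has little chance on anything subtler.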
This benchmark is no less real than your “real world” benchmark. Being a simple microbenchmark, it may actually be even more useful than very complex “real world” code, because it simplifies analysis.
solarkraft
7 months ago
> The hard part of writing a good compiler is handling the hard to analyze cases, not loops like this
But it does mean the compiler can handle the easy case, and if one is already bad at that, it indicates it won’t do well on the harder ones either.
My takeaway from the 1 billion loops benchmark wasn’t exactly “JS is always fast” but “JS can be fast while Python will always be slow” (talking about their main interpreters, of course).
brabel
7 months ago
> It’s cool to see this kind of analysis, even if it’s analyzing a totally bogus benchmark.
Yeah, but that bogus benchmark has gone viral on social networks, and people are even writing their own shitty "experiments" based on it.
This post is absolutely wonderful in that it doesn't shit on the original benchmark too much while patiently explaining how to do it right. Hopefully the person who started the bogus benchmark, as well as the people following in his footsteps, will learn something important. (I myself have posted benchmarks before, of much higher quality I believe, but failed to properly analyse the results as shown here.)
Notice that the bogus benchmark ended up catching the attention of a Dart maintainer, and it looks like they found their compiler was missing some easy optimisations... so it may end up having been helpful for every Dart user in the world!