Historically, -O3 has been a bit less stable (producing incorrect code) and more experimental (doesn't always make things faster).
Flags from -O3 often flow down into -O2 as they are proven generally beneficial.
That said, I don't think -O3 has the problems it once did.
-O3 gained a reputation of being more likely to "break" code, but in reality it was almost always "breaking" code that was invalid to start with (invoked undefined behavior). The problem is that C and C++ have so many UB edge cases that a large volume of existing code may invoke UB in certain situations. -O2 thus had a reputation of being more reliable. If you're sure your code doesn't invoke undefined behavior, though, then -O3 should be fine on a modern compiler.
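To make "invalid to start with" concrete, here is a minimal sketch of one common pattern, an overflow check written in terms of the overflow itself. It is an illustration rather than anything from a real code base, and the name add_overflows is made up. The broken form appears only in a comment, since the optimizer is allowed to delete it; the function below tests the bounds before adding and is well defined at any optimization level.

```c
#include <limits.h>
#include <stdbool.h>

/* Broken pattern, shown only as a comment:
 *
 *     if (b > 0 && a + b < a) { ... handle overflow ... }
 *
 * Signed overflow is undefined behavior, so the compiler may assume that
 * a + b never wraps and drop the check entirely.
 */

/* Hypothetical helper: reports whether a + b would overflow an int,
   without ever performing the overflowing addition. */
bool add_overflows(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow when b > 0 */
    return a < INT_MIN - b;       /* INT_MIN - b cannot overflow when b <= 0 */
}
```

Code written this way should behave the same under -O0, -O2 and -O3, which is really the whole point of the "contract".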
Oh, there are also plenty of compiler bugs. And Clang still does not implement the aliasing model of C. For C, I would definitely recommend -O2 -fno-strict-aliasing.
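For anyone unsure what the aliasing rules actually bite on, a minimal sketch follows. It assumes float is a 32-bit IEEE 754 type, and the name float_bits is made up for illustration. Reading a float through a uint32_t pointer is the kind of code -fno-strict-aliasing is meant to keep working; copying the bytes with memcpy gets the same result and is defined with or without the flag.

```c
#include <stdint.h>
#include <string.h>

/* Aliasing violation, shown only as a comment:
 *
 *     uint32_t u = *(uint32_t *)&f;   // float object read as uint32_t
 *
 * This often appears to work, but under strict aliasing the compiler may
 * assume a uint32_t access never touches a float object.
 */

/* Hypothetical helper: returns the object representation of f.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* byte copy; no aliasing rule is violated */
    return u;
}
```

Compilers typically turn the memcpy into a single register move, so there is usually no cost to writing it the defined way.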
Exactly. A lot of people didn’t understand the contract between the programmer and the compiler that is required to use -O3.
That's a little vague, I'd put that more pointedly: they don't understand how the C and C++ languages are defined, have a poor grasp of undefined behaviour in particular, and mistakenly believe their defective code to be correct.
Of course, even with a solid grasp of the language(s), it's still by no means easy to write correct C or C++ code, but if your plan is to go with "this seems to work", you're setting yourself up for trouble.
Indeed, e.g. Rust uses -O3 by default for release builds.
Compiler speed matters. I will confess to not having as much practical knowledge of -O3, but -O2 is usually reasonably fast to compile.
For cases where -O2 is too slow to compile, dropping a single nasty TU down to -O1 is often beneficial. -O0 is usually not useful - while faster for tiny TUs, -O1 is still pretty fast for them, and for anything larger, the increased binary size bloat of -O0 is likely to kill your link time compared to -O1's slimness.
Also, debuggability matters. GCC's `-O2` is quite debuggable once you learn how to work past the possibility of hitting an <optimized out> (going up a frame or dereferencing a cast register is often all you need); this is unlike Clang, which every time I check still gives up entirely.
The real argument is -O1 vs -O2 (since -O1 is a major improvement over -O0 and -O3 is a negligible improvement over -O2) ... I suppose originally I defaulted to -O2 because that's what's generally used by distributions, which compile rarely but run the code often. This differs from development ... but does mean you're staying on the best-tested path (hitting an ICE is pretty common as it is); also, defaulting to -O2 means you know when one of your TUs hits the nasty slowness.
While mostly obsolete now, I have also heard of cases where 32-bit x86 inline asm has difficulty fulfilling constraints under register pressure at low optimization levels.
You have to profile for your specific use case. Some programs run slower under O3 because it inlines/unrolls more aggressively, increasing code size (which can be cache-unfriendly).
Yeah, -O3 generally performs well in small benchmarks because of aggressive loop unrolling and inlining. But in large programs that face icache pressure, it can end up being slower. Sometimes -Os is even better for the same reason, but -O2 is usually a better default.
Most people use -O2, so if you use -O3 you risk hitting some bug in the optimizer that nobody else has noticed yet. -O2 is less likely to have problems.
In my experience a team of 200 developers will see 1 compiler bug affect them every 10 years. This isn't scientific, but it is a good rule of thumb and may put the above in perspective.
Would you say that bug estimate is when using -O2 or -O3?
The estimate includes Visual Studio and other compilers that are not open source, with whatever optimization options we were using at the time. As such, your question doesn't make sense (not that it is bad, but it doesn't make sense).
In the case of open source compilers the bug was generally fixed upstream and we just needed to get on a newer release.
People keep saying "O3 has bugs," but that's not true. At least no more bugs than O2. It did and does more aggressively expose UB code, but that isn't why people avoid O3.
You generally avoid O3 because it's slower. Slower to compile, and slower to run. Aggressively unrolling loops and larger inlining windows bloat code size to the degree it impacts icache.
The optimization levels aren't "how fast do you want the code to go", they're "how aggressive do you want the optimizer to be." The most aggressive optimizations are largely unproven and left in O3 until they are generally useful, at which point they move to O2.
I would say there is a fair share of cases where programmers were told it is UB when it actually was a compiler bug - or non-conformance.
That share is a vanishingly small fraction of cases.
I am not sure. I saw quite a few of these bugs where programmers were told it is UB but it isn't.
For example, people showed me
    extern void g(int x);
    int f(int a, int b)
    {
        g(b ? 42 : 43);
        return a / b;
    }
as an example on how compilers exploit "time-travelling" UB to optimize code, but it is just a compiler bug that got fixed once I reported it:
https://developercommunity.visualstudio.com/t/Invalid-optimi...
Other compilers have similar issues.
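For readers wondering why this is a miscompile rather than UB, a minimal sketch of the usual argument: g is not required to return, so when b is 0 the division may never execute, and the compiler cannot assume b != 0 at the point of the call. The g below is hypothetical (the original only declares it extern), and observed_by_g is a made-up helper that reports what g actually received.

```c
#include <setjmp.h>

static jmp_buf escape;
static int last_seen;

/* Hypothetical g: records its argument and, for 43, jumps away instead of
   returning, so the division in f is never reached when b == 0. */
void g(int x)
{
    last_seen = x;
    if (x == 43)
        longjmp(escape, 1);
}

int f(int a, int b)
{
    g(b ? 42 : 43);
    return a / b;   /* only executed if g returned, i.e. b != 0 here */
}

/* Made-up helper: calls f(a, b) and returns the value g was given. */
int observed_by_g(int a, int b)
{
    if (setjmp(escape) == 0)
        (void)f(a, b);
    return last_seen;
}
```

With this g, f(1, 0) has fully defined behavior, so a conforming compiler has to pass 43; folding the argument to 42 changes the result of a valid program.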
You're an expert, you're overestimating the competence of the median programmer.
That's a great bug you found, and of course it is a compiler bug, not UB.
99.9% of the bugs I've dealt with of this sort were just pointer aliasing. Or just use-after-free. Or just buffer overruns.
The median programmer, especially in the good ol' days, wrote UB code about once every 6-10 hours.
Sure. All I am saying is that there are still plenty of compiler bugs related to optimization, which is reason enough for me to recommend being careful with optimization in contexts where correctness is important.
I agree that compilers have issues and that you have clearly run into some of them. I disagree that they are more common than writing UB.
Oh, I didn't mean to imply that they are more common, just that they are common enough to be careful with optimizations.
Sure, I guess? In my experience I turn on the optimizer mostly without fear, because I know that in the rare case where I need to track down an optimizer bug, the process would look the same as for identifying any other sort of crazy bug, and in this case it will at least have a straightforward resolution.
More aggressive optimization is necessarily going to be more error prone. In particular, the fact that -O3 is "the path less traveled" means that a higher number of latent bugs exist. That said, if code breaks under -O3, then either it needs to be fixed or a bug report needs to be filed.