pcwalton
19 hours ago
I notice that the paper doesn't claim to eliminate all reasoning about undefined behavior for optimizations. For example:
int f() {
    int arr[3], i = 0;
    arr[3] = 5;
    return i;
}
Optimizing this to "return 0" is relying on UB, because it's assuming that i wasn't laid out directly after arr in the stack frame. I believe this is what the paper calls "non-guardable UB".
I don't agree with the claim in the paper that their semantics offers a "flat memory model". A flat memory model would rule out the optimization above. Rather, the memory model still has the notion of object bounds; it's just simplified in some ways.
dooglius
18 hours ago
>it's assuming that i wasn't laid out directly after arr in the stack frame
The compiler isn't "assuming" that so much as choosing not to put i in the stack frame at all. And I don't think it's correct to view the lack of a register spill as an "optimization" per se. It does remain true that code writing past the end of an array will be UB in typical scenarios (i.e. when not using asan/valgrind).
(Now, if the compiler also removed the store, that could legitimately be called an optimization based on assuming no-UB)
pcwalton
17 hours ago
"Exploiting undefined behavior" occurs when a simple semantics (however one defines "simple") results in behavior A, but the compiler chooses behavior B instead based on the actual, more complex, language semantics. The code snippet in question passes that test. If I flip the declaration order of i and arr, then I get this [1] at -O0 (the "simple" semantics):
push rbp
mov rbp, rsp
mov dword ptr [rbp - 4], 0
mov dword ptr [rbp - 4], 5
mov eax, dword ptr [rbp - 4]
pop rbp
ret
Which indeed returns 5. But at -O2 clang optimizes it to this:
xor eax, eax
ret
Which returns 0. So the simple semantics produces one result, and the complex semantics produces another. Hence, it's exploiting undefined behavior.
dooglius
16 hours ago
Maybe this is just arguing semantics, but I don't agree with the definition you've given, and I don't think that your definition is what TFA means. "Exploiting undefined behavior" I think refers narrowly to the implementation assuming that undefined behavior does not occur, and acting on that assumption. Undefined behavior naturally resulting in unpredictable behavior is not exploitation in the same sense. For example,
printf("A");
bool x;
if (x) { printf("B"); } else { printf("C"); }
printf("\n");
If at -O0 "AB" is printed and at -O2 "AC" is printed (due to the vagaries of whatever was left on the stack), then that would meet your definition, but I would not regard that as "exploiting undefined behavior", merely as the manifestation of the inherent unpredictability of UB. If the compiler didn't print anything (i.e. the whole block was removed due to UB detection) then that _would_ be a case of exploiting undefined behavior.
pcwalton
16 hours ago
That example is an instance of unspecified vs. undefined behavior, but the correctness of the pointer provenance-based optimization example I gave doesn't depend on whether writing to an out-of-bounds pointer is unspecified or undefined.
artikae
16 hours ago
What about this: https://godbolt.org/z/xP9xG3Ee3
Here the compiler "register allocates" i for some reads but not for others.
i gets stack allocated, but some uses of it act as though they were register allocated.
dooglius
14 hours ago
I'm not sure quite what you're asking for exactly, given the link is for clang trunk and doesn't have the modifications discussed in TFA, and I don't dispute that clang does UB-based reasoning at -O3. But I will argue that the assembly shown can be accomplished without resorting to what I call "reasoning about UB", and within a flat memory model, supporting the claim that these sacrifices are often not necessary.
I'm going to draw a distinction between stack memory being "private", in the sense that only the compiler is allowed to alter it, and "public", where the address can be written to by something else and the compiler needs to handle that. Local variables are at first tracked privately. After the address of a variable is taken with &x, or at the point in time when an array variable is indexed, the associated memory is public. Conceptually, the use of private memory can be indirect; the compiler could encode/decode a stack variable x as (x XOR 0xDEADBEEF) on the stack and it would be fine (though in practice the compiler naturally does the simple thing).
Note that this notion of "private"/"public" stack memory is a property of addresses, not of the provenance of the accessing pointers, and so is fully compatible with a flat memory model. The compiler's ability to use private memory isn't a case of "reasoning around UB" in any meaningful sense -- otherwise you could just as well argue that returning from a function call is "reasoning about UB", because the return address can't be clobbered.
In your provided snippet, the correctness argument for the assembly in a non-UB-reasoning universe goes like this: at first, i is stored privately on the stack with value zero, and so as an optimization we can assume that value is still zero without rereading. Only later, when &i is taken, is that memory made public and the compiler has to worry about something altering it. In actual execution, the problem is that the write function alters compiler-private memory (and note again, that being private is a property of the underlying address, not the fact that it's accessed via an out-of-bounds array indexing), and this is UB and so the program breaks. But, the compiler didn't need to make _assumptions_ around UB.
jcranmer
17 hours ago
I've only briefly skimmed the paper, but on that glance, it looks like what they did was (effectively) drop all the attributes in LLVM that can indicate UB, e.g., the inbounds flags on getelementptr instructions, or the nsw flags on arithmetic operations.
As you note, it doesn't remove the more core UB behaviors in LLVM, in particular LLVM's reliance on pointer provenance.
nikic
15 hours ago
They do a bit more than that. One of the options (-disable-object-based-analysis under AA2) disables the assumption that distinct identified objects do not alias, which is disabling pointer provenance in at least one key place.
So I think this option very roughly approximates a kind of "no provenance, but with address non-determinism" model, which still permits optimizations like SROA on non-escaping objects.
jcranmer
15 hours ago
That's what I get for relying on a skimming of the paper.
Also, hi, didn't know you commented on this site.
throwawayqqq11
18 hours ago
Sorry, I don't get why the memory layout should have any effect, when it's clear in the AST that i=0 should be returned.
maartenscholl
18 hours ago
I think in the example the parent gave `arr[3]` is past the end of the 3 element array, where `i` might reside, potentially changing its value.
bregma
18 hours ago
It's clear in the AST that there is undefined behaviour and it is malformed code. It is not valid C code, so what the compiler chooses to do with it is not defined by the language.
pcwalton
18 hours ago
Note that if you change the code to this you have the same issue:
int g(int n) {
    int arr[3], i = 0;
    arr[n] = 5;
    return i;
}
Without "exploiting UB" it's incorrect to optimize this to "return 0", because of the possibility that i was allocated right after arr and n == 3.
screcth
17 hours ago
If we consider writing out of bounds to be legal, we make it impossible to reason about the behavior of programs.
pcwalton
17 hours ago
Hence why I (and many other compiler developers) are inherently skeptical whenever anyone says "just stop exploiting undefined behavior".
tialaramex
13 hours ago
I'm not a compiler developer but I'm at least as skeptical as you because there is no sign that the "just stop exploiting UB" people actually want any specific semantics, IMO they want Do What I Mean, which isn't a realizable language feature.
If you could somehow "stop exploiting UB", they'd just be angry either that you're still exploiting an actual language requirement they don't like and so have decided ought to be excluded, or that you followed the rules too literally and obviously the thing they meant should have happened even though that's not what they actually wrote. It's lose-lose for compiler vendors.
gavinhoward
12 hours ago
I am one of the "stop exploiting UB" camp. [1]
I agree that some of us are unreasonable, but I do recognize that DWIM is not feasible.
I just want compilers to treat UB the same as unspecified behavior, which cannot be assumed away.
muldvarp
6 hours ago
> I just want compilers to treat UB the same as unspecified behavior, which cannot be assumed away.
Unspecified behavior is defined as the "use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance".
Which (two or more) possibilities should the standard provide for out-of-bounds writes? Note that "do what the hardware does" wouldn't be a good specification because it would either (a) disable all optimizations or (b) be indistinguishable from undefined behavior.
tialaramex
11 hours ago
You mention that "Note that those surprised programmers are actually Rust compiler authors" but I can't figure out which of the many links is to some "surprised programmers" who are actually rustc authors, and so I don't even know if you're right.
Rust's safe subset doesn't have any UB, but the unsafe Rust can of course cause UB very easily, because the rules in Rust are extremely strict and only the safe Rust gets to have the compiler ensure it doesn't break the rules. So it seems weird for people who work on the compiler guts to be "surprised".
pcwalton
8 hours ago
I'm a Rust compiler author, and I'm fully in favor of "UB exploitation". In fact, LLVM should be doing more of it. LLVM shouldn't be holding back optimizations in memory-safe languages for edge cases that don't really matter in practice.
gavinhoward
7 hours ago
muldvarp
6 hours ago
I don't see any surprised compiler authors in that thread. The reporter immediately suggests the correct underlying reason for the bug and another compiler author even says that they wondered how long it would take for someone to notice this.
Even if you read any surprise into their messages they wouldn't be surprised that C does something completely unreasonable, they would be surprised that LLVM does something unreasonable (by default).
Dwedit
10 hours ago
There is also a completely different scenario where out-of-bounds writes aren't undefined behavior anymore. And that's when you've manually defined the arrays in an assembly source file, and exported their symbols. In that situation, you know what's before the array or after the array, so doing pointer math into an adjacent area has a well known effect.
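A sketch of that setup (GNU as syntax; the symbol names are mine, and note the C standard still formally calls `arr[3]` an out-of-bounds access -- the point is that the machine-level effect is now pinned down by the assembly file rather than left to the compiler):

```asm
# layout.s -- adjacency is guaranteed by this file, not the compiler
    .data
    .globl arr
    .globl after_arr
arr:
    .long 0, 0, 0     # int arr[3]
after_arr:
    .long 0           # known to sit immediately past arr
```

From C, with `extern int arr[3];`, a store to `arr[3]` lands on the word the assembly file named `after_arr`.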
im3w1l
17 hours ago
My thinking disagrees with yours, but I can't say I have fully made up my mind yet. To me a compiler that is not "exploiting UB" has several valid choices about how to compile this. i may be stored in memory, in a register or may be deleted entirely as it's known at compile time that it will have value 0. The store to arr[n] may go through or it may be deleted as it's a dead store.
You may say I'm "exploiting ub" when making these deductions but I would disagree. My argument is not of the form "x is ub so I can do whatever I want".
To elaborate on the example, if arr was volatile then I would expect the write to always go through. And if i was volatile then I would expect i to always be read. However it's still not guaranteed that i is stored immediately after arr, as the compiler has some discretion about where to place variables afaik. But if i is indeed put immediately after, then the function should indeed return 5 for n=3. For n>3 it should either return 0 (if writing to unused stack memory), page fault (for small n outside of the valid stack space), or stomp on random memory (for unlucky and large n). For negative n, many bad things are likely to happen.
Edit: I think I mixed up which way the stack grows but yeah.
dgrunwald
16 hours ago
> But if i is indeed put immediately after, then the function should indeed return 5 for n=3.
That's not how compilers work. The optimization changing `return i;` into `return 0;` happens long before the compiler determines the stack layout.
In this case, because `return i;` was the only use of `i`, the optimization allows deleting the variable `i` altogether, so it doesn't end up anywhere on the stack. This creates a situation where the optimization only looks valid in the simple "flat memory model" because it was performed; if the variable `i` hadn't been optimized out, it would have been placed directly after `arr` (at least in this case: https://godbolt.org/z/df4dhzT5a), so the optimization would have been invalid.
There's no infrastructure in any compiler that I know of that would track "an optimization assumed arr[3] does not alias i, so a later stage must take care not to place i at that specific point on the stack". Indeed, if the array index were a runtime value, the compiler would be prevented from ever spilling to the stack any variable that was involved in any optimizations.
So I think your general idea "the allowable behaviors of an out-of-bounds write are specified by the possible actual behaviors in a simple flat memory model for various different stack layouts" could work as a mathematical model as an alternative to UB-based specifications, but it would end up not being workable for actual optimizing compiler implementations -- unless the compiler could guarantee that a variable can always stay in a register and will never be spilled (how would the compiler do that across function calls?), it'd have to essentially treat all variables as potentially modified by basically any store-via-pointer, which would essentially disable all optimizations.