I really like these suggestions since they can be summed up in one sentence: they are what C programmers who write code with UB would already expect any reasonably sane platform would do. I think it's definitely a very positive change in attitude from the "undefined behaviour, therefore anything can happen" that resulted in compilers' optimisations becoming very surprising and unpredictable.
Rather, we are trying to rescue the predictable little language that we all know is hiding within the C standard.
Well said. I think the practice of UB-exploiting optimisation was completely against the spirit of the language, and that the majority of optimisation benefits come from the compiler backend (instruction selection, register allocation, etc.). As an Asm programmer, I can attest that instruction selection and register allocation can make a huge difference in speed and size.
The other nice point about this friendly C dialect is that it still allows a great deal of optimisation, but with a significant difference: instead of basing it on the UB assumptions granted by the standard, it can be based on proof; e.g. code that can be proved to be unneeded can be eliminated, rather than code that merely may invoke UB. I think this is the sort of optimisation most C programmers intuitively agree with.
The main motivation for relying on some of the UB assumptions is that compilers weren't able to prove things in a lot of cases where programmers expected them to, because C (especially in the face of arbitrary pointers, and esp. with incremental compilation) is quite hard to prove things about. So compilers started inferring things about variables from what programmers do with them. For example, if a pointer is used in a memcpy(), then the programmer is signalling to us that they know this is a non-null pointer, since you can't pass NULL as an address to memcpy(). So when compiled with non-debugging, high-optimization flags, the compiler trusts the programmer, and assumes this is a known-to-be-non-null pointer.
It would be interesting to see some benchmarks digging into whether turning off that kind of inference has significant performance impact on real-world code. Without adding more source-level annotations to C (like being able to flag pointers as non-NULL), it will reduce some optimization opportunities, since the compiler won't be able to infer as many things, even things that are definitely true. But it might not be enough cases to matter.