
1. -fomit-frame-pointer is implied by O3 on most platforms now

2. "too likely to cause harm, too unlikely to make a significant performance difference in most cases."

Please define "most cases". Without the assumption that signed overflow is undefined, GCC will have significant trouble deriving the bounds of most loops and, in turn, will not be able to vectorize, unroll, peel, split, etc.

Saying "unlikely to make a significant performance difference in most cases" is probably very wrong for most people. The last benchmarks I saw across a wide variety of apps showed the perf difference was 10% in most cases, and a lot more in others.



1. That's good to know. I'm all for shortening my cflags line since I don't squelch my Makefile rules.

2. I always get bitten when I try to generalize. I tested this in all of my software, and was not able to detect any performance difference with or without -fwrapv (that is to say, < 1% difference, too small to draw any conclusions from).

I know you can create extreme edge cases where there's a huge difference, just as you can probably make up one that's slower without -fwrapv if you really wanted to.

But yeah, maybe I just don't write code that lends itself to benefiting heavily from these types of assumptions. I also tend not to rely on signed integer overflow following two's complement. But all the same, I will take well-defined behavior over the crazy stuff GCC can produce any day, even at the cost of a bit of performance. Of course, going all the way to -O0 is way too extreme. So a case where I see no perceptible performance impact and gain defined behavior? Win-win.


2. You must not write software that is very amenable to these optimizations.

Fun fact btw: GCC and LLVM are the only compilers I know of to assume loops can overflow at all when optimizations are on.

Compilers like XLC will actually even assume unsigned loop induction variables will not overflow at O3, unless you give them special flags.

:)


> Compilers like XLC will actually even assume unsigned loop induction variables will not overflow at O3

That's not exactly fair. The C standard guarantees that unsigned variables overflow by wrapping, so if the compiler assumes such a variable won't wrap, it is not conformant.


That's exactly the point: they cheat, because it makes code faster except in the small percentage of code it breaks.

Let us all now bow our heads to the almighty SPEC gods ...


I'm curious if there is a way to rewrite your loops which use unsigned types to somehow communicate to the compiler that you will not overflow.


You'd have to use annotations or asserts. Or rely on literal whole-program analysis to prove upper bounds of parameters, etc. (which still may not be possible statically).

Otherwise, given something as simple as

for (unsigned i = 0; i < N; i+=2)

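You can't say it iterates N/2 times, because when N is the maximum unsigned value the increment wraps i back to 0 and the loop never terminates.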
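A minimal sketch of the problem, shrunk to an 8-bit index so the wrap is easy to reach. The helper name count_iterations and its cap parameter are hypothetical, added only so the non-terminating case can be observed without hanging:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the thread. Counts iterations of
 *     for (uint8_t i = 0; i < n; i += 2)
 * and returns -1 once `cap` iterations have run without the loop
 * exiting, which stands in for "does not terminate". */
int count_iterations(uint8_t n, int cap)
{
    assert(cap >= 0);
    int count = 0;
    for (uint8_t i = 0; i < n; i += 2) {
        if (count == cap)
            return -1;      /* still looping: i wrapped back below n */
        count++;
    }
    return count;
}
```

For n == 10 this returns 5, as the "N/2" intuition suggests. For n == 255 the index reaches 254, the next increment wraps it to 0, and the helper reports -1. A compiler that must honor that wrap cannot use a simple closed-form trip count for every n.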


I took a couple examples where gcc failed vectorization for unsigned indexes and tried to rewrite them in a way that didn't sacrifice 1/2 of the type's range just to satisfy the optimizer. In the first example I could just change a "<= n" to "!= n + 1", and the second example, which was based on your comment, could be solved by using pointers instead of indexing. I still wonder how many examples can't be solved without using signed types.

The results: http://ideone.com/7vidIs

Generated code http://tinyurl.com/le72k4o


In reality I'm not sure what kind of loop would have an index only fitting in an unsigned type.

On a 32-bit machine an unsigned array index means one object using more than half the address space. It's sensible to use unsigned 64-bit for file sizes, but I think it's quite odd that C programmers would use it for a loop or array index. Wrong, but defined, behavior is worse than undefined behavior, you know.

On a 64-bit machine, well, it shouldn't be a problem to use signed long long.


If you change <= n to != n + 1, you have just written broken code.

Think of n == UNSIGNED_MAX. n+1 will overflow.


Right, but the same is true of signed as well.


The issue I raised is that you replaced a perfectly functioning loop with one that does not work properly (it does not iterate the same number of times for all inputs).

You said "I can change <= n to != n + 1". You cannot. for n == UNSIGNED_MAX, the former loop will iterate infinitely, the latter loop will never iterate.

Both are well defined to occur. You cannot change the loop behavior and say you have the same loop :)
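To make the difference concrete, here is a minimal sketch of the two loop forms, assuming the index starts at 0. The names trips_le and trips_ne and the cap parameter are hypothetical; the cap only exists so the infinite case returns instead of hanging:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers, not from the thread. Each returns the trip
 * count, or -1 if the loop is still running after `cap` iterations. */

/* Original form: i <= n */
long trips_le(unsigned n, long cap)
{
    assert(cap >= 0);
    long count = 0;
    for (unsigned i = 0; i <= n; i++) {
        if (count == cap)
            return -1;      /* i <= UINT_MAX is always true */
        count++;
    }
    return count;
}

/* Rewritten form: i != n + 1 */
long trips_ne(unsigned n, long cap)
{
    assert(cap >= 0);
    long count = 0;
    for (unsigned i = 0; i != n + 1; i++) {
        if (count == cap)
            return -1;
        count++;
    }
    return count;
}
```

For n == 3 both return 4. For n == UINT_MAX, trips_le hits the cap and returns -1, while trips_ne returns 0 because n + 1 wraps to 0 and the condition is false on entry. Both results are well defined; they are just different loops.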


I never meant to imply that changing "<= n" to "!= n + 1" does not change the semantics of the unsigned loop. This was an example shown in some LLVM documentation as to why you should use signed loop variables. I was just showing that you can still use unsigned variables, have the same semantics as _the signed loop_, and get vectorized code.



