Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)? (2013) (stackoverflow.com)
227 points by Doublon on March 27, 2014 | 83 comments



Wow! Thank you. I didn't realize there were such legit programming podcasts around. Any other similar ones I need to know about?


Check out The Haskell Cast:

http://www.haskellcast.com


I also follow the Debug podcast, some C9 efforts, and some vendor podcasts. It's rare for me to discover the kind of high-quality audio content you used to find at places like ITConversations back in the day, mostly because people use video/YouTube these days. But I try to keep an ear out ;-)


Because, of course, floating-point addition and multiplication are not associative. This turns out to be surprisingly easy to demonstrate:

    0.1 + (0.2 + 0.3) = 0.6
    0.1 + 0.2 + 0.3 = 0.6000000000000001
and the same for multiplication:

    0.1 * 0.2 * 0.3 = 6.000000000000001e-3
    0.1 * (0.2 * 0.3) = 6.0e-3
It actually isn't "surprising" if you understand how the format works. It essentially uses scientific notation but in binary, with a set number of bits for both the mantissa and the exponent as well as a few changes thrown in for better behavior at its limits (like denormalization). This means that it can't directly express numbers which are very easy to write in decimal form, like 0.1, just like we can't express 1/3 as a finite decimal. It's designed to manage this as well as possible with the small number of bits at its disposal, but we still inevitably run into these issues.
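
To make that concrete, here's a quick way to peel the IEEE-754 fields out of a double in Python (a rough sketch; the slicing below assumes the standard binary64 layout):

    import struct

    # reinterpret the 64 bits of the double 0.1 as an unsigned integer
    bits, = struct.unpack('>Q', struct.pack('>d', 0.1))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7ff        # biased by 1023
    mantissa = bits & ((1 << 52) - 1)      # implicit leading 1 is not stored
    print(hex(bits))                       # 0x3fb999999999999a
    print(sign)                            # 0
    print(exponent - 1023)                 # -4
    print(hex(mantissa))                   # 0x999999999999a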

Of course, most programmers only have a vague idea of how floating point numbers work. (I'm certainly among them!) It's very easy to run into problems. And even with a better understanding of the format, it's still very difficult to predict exactly what will happen in more complex expressions.

A really cool aside is that there are some relatively new toys we can use to model floating point numbers in interesting ways. In particular, several SMT solvers including Z3[1] now support a "theory of floating point" which lets us exhaustively verify and analyze programs that use floating point numbers. I haven't seen any applications taking advantage of this directly, but I personally find it very exciting and will probably try using it for debugging the next time I have to work with numeric code.

A little while back, there was an article about how you can test floating point functions by enumerating every single 32-bit float. This is a great way of thinking! However, people were right to point out that this does not really scale when you have more than one float input or if you want to talk about doubles. This is why SMT solvers supporting floating point numbers is so exciting: it makes this sort of approach practical even for programs that use lots of doubles. So you can test every single double or every single pair of doubles or more, just by being clever about how you do it.

I haven't tried using the floating point theories, so I have no idea how they scale. However, I suspect they are not significantly worse than normal bitvectors (ie signed/unsigned integers). And those scale really well to larger sizes or multiple variables. Assuming the FP support scales even a fraction as well, this should be enough to practically verify pretty non-trivial functions!

[1]: http://z3.codeplex.com/
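
For the curious, here's roughly what that looks like through Z3's Python bindings (a minimal sketch; the z3-solver package and the fpAdd/fpEQ/RNE names are how I remember the z3py API, so treat it as an approximation rather than gospel):

    from z3 import FP, FPVal, Float32, RNE, Solver, fpAdd, fpEQ, sat

    # ask for a 32-bit float x that completely absorbs an added 1.0f --
    # the solver effectively searches all 2^32 bit patterns for us
    x = FP('x', Float32())
    s = Solver()
    s.add(fpEQ(fpAdd(RNE(), x, FPVal(1.0, Float32())), x))
    if s.check() == sat:
        print(s.model())   # e.g. a large-magnitude float or an infinity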


As the answer on stackoverflow mentions, there is the -ffast-math option to gcc that tells the compiler to treat fp operations as associative. I've done some testing:

  $ clang -v
  clang version 3.3 (tags/RELEASE_33/final)
  Target: x86_64-apple-darwin12.4.0

  double pow(double);
  
        LLVM bitcode                   Assembly

  // a*a*a*a*a*a                 
  %1 = fmul double %a, %a       |  movaps %xmm0, %xmm1
  %2 = fmul double %1, %a       |  mulsd  %xmm1, %xmm1
  %3 = fmul double %2, %a       |  mulsd  %xmm0, %xmm1
  %4 = fmul double %3, %a       |  mulsd  %xmm0, %xmm1
  %5 = fmul double %4, %a       |  mulsd  %xmm0, %xmm1
  ret double %5                 |  mulsd  %xmm0, %xmm1
                                |  movaps %xmm1, %xmm0

  // a*a*a*a*a*a -ffast-math    
  %1 = fmul fast double %a, %a  |  mulsd  %xmm0, %xmm0
  %2 = fmul fast double %1, %1  |  movaps %xmm0, %xmm1
  %3 = fmul fast double %1, %2  |  mulsd  %xmm1, %xmm1
  ret double %3                 |  mulsd  %xmm0, %xmm1
                                |  movaps %xmm1, %xmm0

  // (a*a*a)*(a*a*a)            
  %1 = fmul fast double %a, %a  |  movaps %xmm0, %xmm1
  %2 = fmul fast double %1, %a  |  mulsd  %xmm0, %xmm0
  %3 = fmul fast double %2, %2  |  mulsd  %xmm1, %xmm0
  ret double %3                 |  mulsd  %xmm0, %xmm0


I wrote GCC's reassociation pass.

LLVM and GCC use the same algorithm for reassociation (and as far as I know, there is still no good literature on the tradeoffs one way or the other), so you will get the same results, modulo some small implementation differences.

The reassociation passes exist mainly to promote redundancy elimination, and so the factorizations they perform are tilted towards making binary operations that look the same. Factorization and transformation into pow calls is also done, but this is just because it's easy to do :)

Note that the operations it forms may be "less than ideal" from a register pressure perspective, since it does not take this into account at the level reassociation is being performed (it assumes something will later reassociate them some other way if it wishes).


Just to make your significant digits clearer:

    0.1 + 0.2 + 0.3   = 0.6000000000000000900
    0.1 + (0.2 + 0.3) = 0.5999999999999999800

    0.1 * 0.2 * 0.3   = 0.0060000000000000010
    0.1 * (0.2 * 0.3) = 0.0060000000000000001
Using -ffast-math doesn't affect these results.


I find it illuminating to look at the exact values of the floating point numbers. This takes away an air of mystery: floating point numbers do in fact have very specific values, which we can inspect exactly.

Here's 0.1 + 0.2 + 0.3:

        0.1 (0.1000000000000000055511151231257827021181583404541015625)
      + 0.2 (0.200000000000000011102230246251565404236316680908203125)
           = 0.3000000000000000166533453693773481063544750213623046875
    rounded: 0.3000000000000000444089209850062616169452667236328125

      + 0.3 (0.299999999999999988897769753748434595763683319091796875)
           = 0.600000000000000033306690738754696212708950042724609375
    rounded: 0.600000000000000088817841970012523233890533447265625
The operands can't be represented exactly as floating point so they are rounded to the nearest floating-point number before the addition even happens. And once the addition does happen, it is rounded to the nearest representable floating point value.

Now we can look at the same for 0.1 + (0.2 + 0.3):

        0.2 (0.200000000000000011102230246251565404236316680908203125)
      + 0.3 (0.299999999999999988897769753748434595763683319091796875)
           = 0.500000000000000000000000000000000000000000000000000000
    rounded: 0.500000000000000000000000000000000000000000000000000000

      + 0.1 (0.1000000000000000055511151231257827021181583404541015625)
           = 0.6000000000000000055511151231257827021181583404541015625
    rounded: 0.59999999999999997779553950749686919152736663818359375
It's interesting to note that the first addition ends up being exact! This is a bit lucky, because 0.5 is exactly representable in floating point and the initial operands got rounded in opposite directions. It's also interesting that 0.6... only turns into 0.599999... at the point that we round the answer.

You can play around with this stuff conveniently using the Python "Decimal" module.

    $ python
    Python 2.7.5 (default, Aug 25 2013, 00:04:04)
    [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from decimal import Decimal, getcontext
    >>> getcontext().prec = 1000  # To prevent truncation/rounding
    >>> Decimal(0.1)
    Decimal('0.1000000000000000055511151231257827021181583404541015625')
    >>> a = Decimal(0.1) + Decimal(0.2)
    >>> a
    Decimal('0.3000000000000000166533453693773481063544750213623046875')
    >>> Decimal(float(a))  # get the value rounded to the nearest fp number
    Decimal('0.3000000000000000444089209850062616169452667236328125')
You can also use my tool "dumpfp" (https://github.com/haberman/dumpfp) to inspect the value of a floating-point number in detail:

    $ ./dumpfp 0.1
    Single Precision (IEEE 32-bit):
               raw = 0x3dcccccd
              sign = 0x0
          exponent = 0x7b (-4)
       significand = 0x4ccccd

       VALUE CALCULATION =
           significand   (1 + 5033165/2^23  (1.60000002384185791015625))
         * 2^exponent    (2^-4)
         = VALUE         (13421773/2^27  (0.100000001490116119384765625))

    Double Precision (IEEE 64-bit):
               raw = 0x3fb999999999999a
              sign = 0x0
          exponent = 0x3fb (-4)
       significand = 0x999999999999a

       VALUE CALCULATION =
           significand   (1 + 1351079888211149/2^51  (1.600000000000000088817841970012523233890533447265625))
         * 2^exponent    (2^-4)
         = VALUE         (3602879701896397/2^55  (0.1000000000000000055511151231257827021181583404541015625))


Note that recent Python versions use Gay's correctly-rounded conversions, so .1+(.2+.3) displays as .6, since that's equivalent when converted back to binary floating point. http://bugs.python.org/issue1580


If you want an exact representation, C99 has the %a format specifier for base-16 floating point. It seems that Python 3 doesn't support it but Lua 5.2 does.

    Lua 5.2.2  Copyright (C) 1994-2013 Lua.org, PUC-Rio
    > print(string.format("  0.1 (%a)\n+ 0.2 (%a)\n= %a\n+ 0.3 (%a)\n= %a", 0.1,0.2,0.1+0.2, 0.3, 0.1+0.2+0.3))
      0.1 (0x1.999999999999ap-4)
    + 0.2 (0x1.999999999999ap-3)
    = 0x1.3333333333334p-2
    + 0.3 (0x1.3333333333333p-2)
    = 0x1.3333333333334p-1
    > print(string.format("  0.2 (%a)\n+ 0.3 (%a)\n= %a\n+ 0.1 (%a)\n= %a", 0.2,0.3,0.2+0.3, 0.1, 0.2+0.3+0.1))
      0.2 (0x1.999999999999ap-3)
    + 0.3 (0x1.3333333333333p-2)
    = 0x1p-1
    + 0.1 (0x1.999999999999ap-4)
    = 0x1.3333333333333p-1
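
Small aside: Python won't take %a in a format string, but float.hex() spits out the same C99-style notation (and float.fromhex() parses it back):

    print((0.1).hex())            # 0x1.999999999999ap-4
    print((0.1 + 0.2).hex())      # 0x1.3333333333334p-2
    print(float.fromhex('0x1.999999999999ap-4'))   # 0.1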


I don't know if you realise that by converting a double "0.1" into the python decimal of a much larger size, you're bound to get garbage.

Edit: I see what you're doing now. I didn't appreciate enough how converting to decimals had such unpleasant effects :) Thanks.


I'm not sure what you're trying to say. 0.1000000000000000055511151231257827021181583404541015625 is the precise value of float(0.1). It is not "garbage" or an approximation in any way. It is exact.


> Of course, most programmers only have a vague idea of how floating point numbers work.

From my experience, most programmers have a vague idea of how basic integers work, so that's not so surprising. (I'm in the group who only knows a little more than average about floating-point, but for what I do, I don't use FP math much either.)


I've always thought it would be helpful to have floating point values that tracked the upper and lower bounds of the possible error. That would make analysis of the results much easier.


That exists and is called interval arithmetic:

https://en.wikipedia.org/wiki/Interval_analysis


That's called ball arithmetic, and there are many libraries for it, such as arb. http://fredrikj.net/arb/ It's useful in mathematical calculation, but usually with floating point calculations you have some error bounds coming from your algorithm which are "good enough" if the result is going to be the value of something with 5% tolerance.


But what about the accuracy of the error bounds?


It is easy to track the real result of the computation with floating-point intervals: round down when computing the lower bound, and up when computing the upper bound of each interval.
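
Python can't switch the hardware rounding mode, but you can fake a conservative version of this by widening each bound one ulp outward with math.nextafter (Python 3.9+). A rough sketch, with interval endpoints as plain tuples:

    import math

    def interval_add(a, b):
        # round-to-nearest is off by at most half an ulp, so stepping a
        # full ulp outward gives a safe (if slightly pessimistic) bound
        lo = math.nextafter(a[0] + b[0], -math.inf)
        hi = math.nextafter(a[1] + b[1], math.inf)
        return (lo, hi)

    x = (0.1, 0.1)   # degenerate intervals around the stored doubles
    y = (0.2, 0.2)
    print(interval_add(x, y))   # guaranteed to bracket the exact sum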


A common and simple way to treat FP mathematically is by bounding the result with the machine epsilon u [1]. For example, given FP numbers x and y:

x <fp_mul> y = (1+e)xy ; abs(e) <= u

The u ideally depends only on the FP format and rounding mode. I say ideally because library functions like sqrt may be less precise than the ideal (which is the result being equal to the exact value rounded according to the rounding mode). Note that the abstraction holds only as long as you don't go outside the range of normal exponents (it breaks when you reach infinity or subnormals).

EXAMPLE. Suppose we have an FP value L representing the length of something, and we want to split it into segments no longer than the FP value M. Assume both are integers small enough to be represented exactly in FP. The answer in exact arithmetic is:

N = ceil(L/M).

However, if we evaluate this in FP, we may get a result N' < N for certain edge values, which means one of the segment lengths we output will be greater than M. So the result of the algorithm is incorrect!

We can fix it by multiplying (L/M) with a number sufficiently greater than one before the ceil, at the cost of sometimes deciding to split into one more segment than mathematically necessary. The fixed FP algorithm may then be:

N' = ceil(1.00005 <fp_mul> (L <fp_div> M))

After we apply the machine epsilon formulas we come to the conclusion that:

N' = ceil( (1+e1)(1+e2)1.00005 (L/M) ) ; abs(e1)<=u, abs(e2)<=u

We then show that:

(1+e1)(1+e2)*1.00005 >= 1.

This proves that what the algorithm gives to the ceil function is not less than the exact value L/M, and that N' >= N. And ceil() can't make errors by definition, assuming the integer value is representable. QED

[1] http://en.wikipedia.org/wiki/Machine_epsilon
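
Spelled out as code, the two variants look something like this (a sketch with function names of my own):

    import math

    def segments_naive(l, m):
        # the exact-arithmetic formula evaluated in FP: for edge values it
        # can come out one too small, leaving a segment longer than m
        return math.ceil(l / m)

    def segments_guarded(l, m):
        # nudge the quotient up before ceil(): occasionally one segment
        # more than necessary, but never a segment longer than m
        return math.ceil(1.00005 * (l / m))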


The “malicious” rounding that you are attempting to protect against cannot actually occur in IEEE-754 floating-point. You are jumping at shadows, making lots of results less accurate to protect against something illusory.

Why is this the case?

In order for ceil(L/M) to produce an incorrect result, we would need to have the mathematically exact result be of the form N + epsilon, with 0 < epsilon <= ulp(N)/2, and N an exact integer. But, can that ever actually happen? No. If it did, we would have a floating-point number L = M(N+epsilon) exactly, or L = MN + delta, with 0 < delta < ulp(MN)[1], which can never happen by the definition of ulp.

This illustrates a difficulty with naive backwards-error analysis as commonly taught in numerical analysis courses. While it provides reasonable conservative bounds, which is often sufficient for engineering needs, it misses much of the subtlety of floating-point arithmetic, and leads people to take unnecessary steps to guard against “errors” that aren’t really there.

[1] Lemma: For any binary floating-point type, ulp(xy) > x ulp(y)/2.

Proof: First note that rounding of the product xy will only ever make ulp(xy) larger, so we can safely ignore it. Use the formula ulp(x) = 2^(floor(log2(x)) - P) where P is the precision in bits and expand:

    ulp(x*y)  =  2^(floor(log2(xy)) - P)
              =  2^(floor(log2(x) + log2(y)) - P)
             >=  2^(floor(log2(x)) + floor(log2(y)) - P)
              =  2^floor(log2(x)) ulp(y)
              >  2^(log2(x) - 1) ulp(y) = x ulp(y)/2.


I suspect a bug in your proof right at the beginning, where you say "exact result be of the form N + epsilon, with 0 < epsilon <= ulp(N)/2, and N an exact integer". Assuming you're referring to the result of the entire expression including ceil, can we still say that, considering that ceil() is far from being a continuous function?


I’m referring to the mathematically exact result L/M. Sorry for being unclear.


Anyway, my example was a bit too theoretical. A better version is one where L and M are integers which are not necessarily representable. Here: http://ideone.com/1Xkhjp


Yes, of course it’s possible to exhibit errors if you change the problem (though it should be noted that the source of error here is entirely in the fact that l is not representable). As soon as you allow representation error, of course, your tolerance-based approach goes out the window; what if the representation error in l and m is larger than your tolerance accounts for?

If I wanted a correct bound for your revised problem, I would require that l be an upper bound on the true value (by forcing it to be rounded up in earlier computations, for example) and that m be a lower bound on the true value. Then the simple computation delivers a correct result regardless of the magnitude of the error in l or m.


"This illustrates a difficulty naive backwards-error analysis"

It's not naive. It's conservative, the exact opposite.


I pointed part of this out:

> it provides reasonable conservative bounds

It is the technique of backwards-error analysis that is naive, not the resulting bound. A “naive” method is not “simple” or “dumb” (or “aggressive” or whatever you think the opposite of “conservative” might be). It is simply a method that doesn’t use some a priori knowledge that’s available for the specific problem at hand. Backwards-error analysis is a naive technique in that it doesn’t benefit from specialized theorems of floating-point that allow one to establish tighter bounds, like the one I proved.


Though, as nice as the examples are, FP rounding problems in practice are rarely due to binary/decimal conversions, which are only relevant when reading and writing decimal encoded numbers. These are normally absent when doing tasks such as moving motors based on sensor readings.


Are there cases where a program says a * a * a but evaluating it as a * (a * a) would be wrong?

I gather (at least in your examples) that dealing with bigger numbers first typically affords more 'real' accuracy?


There are no values of a where a * a * a differs from a * (a * a), because floating-point multiplication is commutative.

With arbitrary numbers a, b, and c, it would be a different story for a * b * c. Without further knowledge of a, b, c, there would be no particular reason to do the operations in one order or another, but the result of a * b * c and a * (b * c) could definitely differ by one ULP, probably by two in ordinary cases, by more if underflow occurred (but one of the variables would already have to be very small for this to happen).
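
Both points are easy to check with the values from the top of the thread:

    a, b, c = 0.1, 0.2, 0.3
    print((a * b) * c)                  # 0.006000000000000001
    print(a * (b * c))                  # 0.006
    print((a * a) * a == a * (a * a))   # True: same operands either way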


> floating-point multiplication is commutative.

It's not, except in the special case of one and the same number.


You have to be kidding. Otherwise, read up on the definition of “commutative” here: http://en.wikipedia.org/wiki/Commutative_property

Look for the first occurrence of the word “commutative” in this page: http://en.wikipedia.org/wiki/Floating_point

But remember, if both the Wikipedia editors and I are wrong, you only need to produce one pair of operands a and b such that a * b does not produce the same result(1) as b * a to be vindicated.

(1) I suggest we consider all NaN values to be “the same result” for the purpose of this challenge.


You're right, of course. I was thinking about associativity, not seeing what was really written. Sorry. I didn't expect commutativity to come up in this context (it's the non-associativity that doesn't allow all the "automatic" optimizations expected by some), where the formula in the same sentence has three elements and a different order of computation. Which doesn't change the fact that I've made an error.


No, fp (IEEE-754) + and * are both commutative. They aren't associative or distributive in general.


Yep, grouping can lead to better real accuracy. But then you lose exact IEEE compatibility (you get slightly different results that will fail a comparison).


Thanks, I now see more clearly why this is important for comparisons :)


Speaking of comparisons... -Wfloat-equal will warn whenever you attempt to compare floats or doubles for equality.


Rule number one of floats: don't use == on them.


Pure cargo-culting. There are circumstances in which it’s perfectly appropriate to compare floating-point data for equality. There are also circumstances in which a comparison with some tolerance is more appropriate. Understand what situation you’re in, and understand the tools that you’re using.


Comparing floats for equality is equivalent to comparing them with a tolerance that you don't control. I don't see a good reason to take that risk.


Comparing floats for equality is equivalent to a comparison with zero tolerance, not an uncontrolled tolerance.

Consider storing a timestamp as a number of seconds since the epoch. We might use == to check if the timestamp has changed.


Read the wikipedia entry on IEEE754-1985 (it has better diagrams than the main 754 entry): http://en.wikipedia.org/wiki/IEEE_754-1985#Representation_of...

Knowing is better than fearing.


No, it's not equivalent to that at all. Why would you think such a thing?


Oh, there must be a Common Lisp predicate that addresses this. prettymuchequalsp?

More seriously, I remember APL\360 SETFUZZ from long before many of you whippersnappers were born, which told the system to disregard so-many bits in an 'equals' evaluation.

FP implementations are generally evil, and errors cascade in irritating fashion. 50% of a numerical analysis class I took once was dedicated to error estimation. Stick to rationals if you can - it'll save your hairline.


Comparing floats is sometimes useful. For example, if you are testing that you have copied a data structure correctly.


Basically:

floating point is hard. Programmers get it wrong, compilers get it right.

(normally).


In the case of C compilers, they get floating-point right by re-defining what the language definition means: http://arxiv.org/abs/cs/0701192

There has been some progress since the time this report was an accurate description of the situation, but the situation is still mostly terrible: http://stackoverflow.com/questions/17663780/is-there-a-docum...

And soon, fused multiply-add will be common enough that compilers will start generating the instruction by default for code that contains additions and multiplications, causing a regression with respect to the current situation.


Using fused-multiply-add unless the user has given the compiler option to deviate from strict IEEE754 results would be a compiler bug. I'm pretty sure gcc gets this right -- it will use FMAC on ARM (for instance) only if you pass it the fast-math option.


A C99 compiler pragma, FP_CONTRACT, lets the programmer decide locally whether using an FMA instruction for multiplication and addition is allowable.

The debate is whether the pragma should be on or off by default. I am a strong proponent of the “off” default. Some argue for the “on” default. Even if the “on” camp wins at some point for some compiler, you can always use the line below to make a standard-compliant compiler that is otherwise respectful of IEEE 754 treat multiplications and additions as separate, individually rounded operations:

#pragma STDC FP_CONTRACT OFF

But the point is that you will have to use the incantation explicitly.

If you are an ordinary programmer who does not know about the incantation, you will eventually encounter a situation where, say, (a * b + c == a * b + c) evaluates to false despite the results being finite, and it will reinforce the superstition that floating-point is black magic. And since this example is remarkable, you will tell others about it, the example will be repeated and distorted, and the superstition will never die.


I think the most compelling argument for “off” being the default is x*x - y*y. If FP_CONTRACT defaults to “on”, most uses of this expression will become fma(x, x, -y*y), which will result in the expression not being exactly zero when x == y.

The most compelling argument for “on” being the default is that most users don’t care, and “on” allows compiler optimizations that save energy. Numerical consistency or independence of Russian gas[1]?

[1] Yeah, right. "Numerical consistency or 20 seconds longer playtime of flappy bird on a charge?” is a way better argument.


Slightly off-topic here, but floating point is even much harder inside the kernel, mainly because of the trapping mechanism. In fact most OSes either do not support floating-point operations inside the kernel or turn that option off by default. The good thing is that you do not really need floating-point operations inside the kernel; but when you do, and need to modify your kernel code to achieve it, it is extremely implementation-intensive and hard to get right (and the runtime cost is expensive too).


Hm, in our kernel, when we need to do floating point, we just set a flag on the thread control block which preserves the floating registers on context switch, and then do our floating point operations. Maybe there is some API abstraction hiding the messy details :).


Always a great read: "What Every Computer Scientist Should Know About Floating-Point Arithmetic"

PDF rendering: http://www.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf

HTML rendering: http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.ht...


One thing I always wondered about is why are we using floating point arithmetic at all, instead of fixed point math with explicitly specified ranges (say, "here I need 20 bits for the integer part and 44 for the fractional part")? What is the practical value of having a floating point that would justify dealing with all that complexity and conceptual problems they introduce?


Have you used fixed-point arithmetic much? You need to specify the width and the position of the dot for each intermediate variable (for instance, multiplying two numbers that have 20 bits for the integer part may require 40 bits for the integer part of the result). Since the operands in 20.44 format had only 64 bits of precision, it only makes sense to represent the result to 64 bits, so the format of this result should probably be 40.24.

If you don't do this, your program may still work but you are wasting space. For instance, if the multiplication of your two 20.44 numbers does not overflow the 20.44 format, good for you, but it means that between the two operands, you were wasting 20 bits in total for leading zeroes that carried no information.

Consider floating-point as a way not to have to make all these decisions at compile-time and still use space extremely efficiently to represent as much precision as possible.
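
To make the bookkeeping concrete, here's a toy Q20.44 multiply in Python (a sketch of my own; Python's unbounded ints stand in for the 128-bit intermediate, and the Q40.24 output format follows the comment above):

    FRAC_IN = 44    # Q20.44 operands

    def to_fixed(x, frac=FRAC_IN):
        return round(x * (1 << frac))

    def fixed_mul(a, b, frac_in=FRAC_IN, frac_out=24):
        product = a * b                               # Q40.88 raw product
        return product >> (2 * frac_in - frac_out)    # rescale to Q40.24

    a = to_fixed(3.25)
    b = to_fixed(1.5)
    print(fixed_mul(a, b) / (1 << 24))   # 4.875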


Also: when people screw with FP, 99% of the time the answer is still vaguely sane or makes it very obvious that something went wrong. When people screw up with fixed point, 99% of the time the result is indistinguishable from noise.


Because of hardware, and the fact that fixed point doesn't solve the core problems with floating point as it stands today. It's hard enough to get good, stable/fast floating-point hardware, much less add the complexities of variable-sized words with variable-sized mantissas/exponents.

Really, what you're probably asking for is decimal floating point, which has its own set of limitations when implemented in hardware.

And if you happen to have an application where what you really need is more precision there are a number of bignum libraries like GMP (https://gmplib.org/) which provide that. But, throwing more precision at a problem doesn't alleviate the need to do error estimation.


Dynamic range. A double-precision variable has the same precision at 1e+10 as at 1e-10, your proposal does not. This is not a big problem if you have a good idea about the numbers in your application (e.g., digital signal processing), but it is a big problem if you write a matrix library (or anything else which deals with real numbers) for arbitrary inputs.


But OTOH, having "the same precision" means losing accuracy. This is a common case in computer games, especially with outdoor worlds like space simulators. When you move away from the world center, you start losing significant digits on the fractional side and your calculations soon turn into nonsense. When you go far enough, the difference between two consecutive numbers can get as high as 0.1, at which point everything falls apart.

KSP scene has even a fun name for it, Deep Space Kraken [0].

This can of course be mitigated by subdividing the world (octrees and stuff) and stacking coordinate frames (though for some reason nobody seems to be doing that), but you said that floats are good "for arbitrary inputs". And if you have small and big floats in your computations, the result might quickly stop making any sense.

I admit I'm confused about the topic, any clarification is welcome.

[0] - http://wiki.kerbalspaceprogram.com/wiki/Deep_Space_Kraken
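
The effect is easy to see with math.ulp (Python 3.9+), which gives the spacing from a double to the next representable one:

    import math

    for x in (1.0, 1e6, 1e12, 6e15):
        print(x, math.ulp(x))
    # by 6e15 consecutive doubles are already a full 1.0 apart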


Any particular fixed point format will have slightly more precision/accuracy at the edges than a floating point format with the same number of bits, but then it will hit a brick wall and stop working entirely. And it's easy for it to have less precision/accuracy in practice because it's hard to keep each intermediate result at the exact right exponent.

Fixed point won't save you from the space kraken. Instead of going so deep into space that consecutive numbers gradually become .1 apart, your ship will spontaneously explode a fiftieth of that distance out.

It's probably much easier to use two coordinate frames than to properly use fixed point numbers. Then a ship can run physics relative to its center of mass or a nearby planet/moon and have very high precision/accuracy.


On one hand, FP lets you program without thinking too hard about the numerical properties of your program - until you hit a gotcha. On the other hand, FP is pretty powerful for many domains if you're an FP wizard. Like physics simulation.

There have been other approaches besides fixed that people have tried out, such as interval arithmetic and exact (rational) arithmetic, but they all have their own problems.


> Um... you know that a*a*a*a*a*a and (a*a*a)*(a*a*a) are not the same with floating point numbers, don't you?

Why would so many people upvote such a condescending comment?


The exact same reason I stopped asking/answering questions on SO years ago - too many condescending trolls trying to act smart.


Random sidenote, but I've never seen an SO answer be upvoted in realtime until now. Didn't even know they programmed such functionality.


It's a pretty impressive system. If you're interested in their architecture, here's a YouTube video worth watching: "Marco Cecconi The Architecture of Stack Overflow - Developer Conference 2013" [0]

[0] https://www.youtube.com/watch?v=OGi8FT2j8hE



About 3 years ago...


3 years is a loooooong time for the internet. I am going to put a note on my calendar to post this again in 3 years. What is the median age of an active HN account anyway?


Perhaps you'd like to tell us about FP on the Mill.


The non-associativity of addition is obvious, but for multiplication, I understand why it does not always give the same answer; I just do not see why a change of the order could change the accuracy.


The result depends on the number of multiplications, and hence roundings, that are done: a*a*a*a*a*a takes five rounded multiplications, while (a*a*a)*(a*a*a) takes only three, so less rounding error can accumulate.


How would a functional language compiler deal with this case? like GHC?


It doesn't. If we forget about FPU flags, floating-point multiplication is pure. That does not change the fact that a * (b * c) is different from (a * b) * c.

The problem is not side-effects in multiplication. The problem is that * is not associative.


You could try it out!


I already knew about the lack of associativity of IEEE floats, but TIL that `pow(x,n)` is more precise than chained multiplications.

Good to know.


Technically it is not a matter of precision but of accuracy.


Interesting, thanks for the correction.

I'm neither a native English speaker nor an engineer, and I didn't know about the distinction.

https://en.wikipedia.org/wiki/Accuracy_and_precision


Also a crap ton faster, right?


Yes, if by faster you mean slower.


Ha! Yes, for floating point. I meant my post to more communicate that moving it behind a library/method would allow you to make that decision elsewhere easily. I did not even come close to making that point. (I got distracted reading some interesting hacks that approximate this in insane speed.)


There is a discussion from today and yesterday on llvm-dev that deals with floating point guarantees and optimizations: http://thread.gmane.org/gmane.comp.compilers.clang.devel/358...


Floating point, in short: a computer representation of scientific notation(1) with a sign bit, exponent(2), and coefficient(3) crammed into the same word.

1. http://en.wikipedia.org/wiki/Scientific_notation

2. base 2, biased by 127, 8 bits (IEEE float)

3. base 2, implicit leading 1, 23 bits (IEEE float)

http://en.wikipedia.org/wiki/Single-precision_floating-point...


It actually does, if a is an integer.



