
Is this program "almost certainly wrong":

    uint32_t bytesToUnsigned(uint8_t bytes[4])
    {
      return bytes[0] |
             bytes[1] << 8 |
             bytes[2] << 16 |
             bytes[3] << 24;
    }
The behavior is undefined on a system with 32-bit int because of signed arithmetic overflow: despite the fact that all the explicit types involved are unsigned, a uint8_t gets promoted to a signed int before the left-shift operation, so bytes[3] << 24 overflows whenever bytes[3] is 128 or more.

Right now it will work on every compiler I've tried, but it would be perfectly valid (by the ANSI specification) for a compiler to assume that the result of that function can never have the highest bit set. In friendly C, the result is well defined.
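
For comparison, a minimal sketch of a variant that is well defined in standard C, assuming the only change wanted is to avoid the signed shift. The name bytes_to_unsigned_safe is hypothetical; the idea is simply to convert each byte to uint32_t before shifting, so the shift is done in unsigned arithmetic:

```c
#include <stdint.h>

/* Hypothetical rewrite of the function above: each byte is converted to
   uint32_t before the shift, so no shift is ever performed on a signed
   32-bit int and the top bit can be set without overflow. */
uint32_t bytes_to_unsigned_safe(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | (uint32_t)bytes[1] << 8
         | (uint32_t)bytes[2] << 16
         | (uint32_t)bytes[3] << 24;
}
```

With the input bytes {0x00, 0x00, 0x00, 0x80} this returns 0x80000000, which is exactly the case where the original version shifts into the sign bit.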



>Is this program "almost certainly wrong"

No, that one's a consequence of C's insane type system. The solution here isn't to change the semantics of signed integer arithmetic. The solution is to change integer promotion to use unsigned arithmetic like it should have done in the first place.


Not taking a position on this, and it's been a long while, but I seem to remember that the discussions in X3J11 on the issue of integer promotions, which mostly occurred before I joined, were long and heated.


Integer promotion will not help, because it may not go to a sufficiently wide unsigned type to cover the shift. (In practice it will, but unsigned int could be just 16 bits).

Promotion of unsigned chars to unsigned int would have problems of its own, mostly because unsigned arithmetic (modulo power of two arithmetic) is inappropriate for most uses, and error-prone: it has a large, silent discontinuity right next to zero.

Alas, in fact, unsigned chars can promote to unsigned int: on rare platforms like DSPs where you have sizeof (int) == 1. Sigh.
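
To illustrate the discontinuity next to zero mentioned above, here is a minimal sketch. The helper unsigned_diff is hypothetical and exists only to show the wraparound:

```c
#include <limits.h>

/* Hypothetical helper: difference of two unsigned values.
   When b > a the mathematical result is negative, but unsigned
   arithmetic wraps modulo UINT_MAX + 1 instead of going below zero. */
unsigned int unsigned_diff(unsigned int a, unsigned int b)
{
    return a - b;
}
```

Here unsigned_diff(3, 2) is 1, but unsigned_diff(2, 3) is UINT_MAX rather than -1, and nothing flags the jump.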


There's a simple fix to that:

    -Wsign-conversion -Werror
You've effectively just made C into a safer language by disallowing implicit conversions that change signedness.


Friendly C should also require char to be unsigned so that this code is safe:

   int my_getchar(char *foo)
   {
        if (!*foo) return -1; // Return -1 at end of string
        else return *foo; // Should never return -1
   }
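
In today's C, where plain char may be signed, one way to get the intended behavior is to convert through unsigned char explicitly. A minimal sketch, with my_getchar_portable as a hypothetical name:

```c
/* Hypothetical variant of my_getchar that does not depend on whether
   plain char is signed. */
int my_getchar_portable(const char *foo)
{
    if (!*foo) return -1;        /* Return -1 at end of string */
    return (unsigned char)*foo;  /* Non-negative on ordinary platforms where int is wider than char */
}
```

A byte such as 0xFF, which reads as -1 through a signed char, comes back as 255 here (assuming 8-bit chars), so it can no longer be confused with the end-of-string marker.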


More to the point, char should be unsigned so that this code is safe:

   #include <ctype.h>

   if (isdigit(str[i])) { ... }
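The ctype functions require an int-valued argument whose value is representable as an unsigned char or equal to EOF: it must be a non-negative value, or the negative value EOF. Where plain char is signed, str[i] can be negative for bytes above 127, and passing it directly is undefined behavior.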

This is an unfriendly pitfall in the language.
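
The usual workaround is a cast to unsigned char at the call site. A minimal sketch, assuming a NUL-terminated string; count_digits is a hypothetical helper used only for illustration:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the decimal digits in a string. */
size_t count_digits(const char *str)
{
    size_t n = 0;
    for (size_t i = 0; str[i] != '\0'; i++) {
        /* The cast keeps the argument in the range isdigit accepts,
           even where plain char is signed and str[i] is negative. */
        if (isdigit((unsigned char)str[i]))
            n++;
    }
    return n;
}
```

The cast has to be repeated at every ctype call, which is easy to forget and is the reason the pitfall keeps biting.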



