"This should never happen" is a design pattern of defensive programming; it's the same pattern that underlies assert.
The usual use is to catch errors caused by misuse of a method. There is some invariant that the method assumes but that is not enforced by the type signature of the interface. So if something goes wrong in outside code, or someone tries to use the method incorrectly, the invariant is not satisfied. When you catch such a problem, the current code context is FUBAR. The question is how aggressively to bail out: spew errors to a log and proceed with some GIGO calculation? Throw an exception? Exit the program?
It's a sub-pattern of "ain't got time for dat": the developer knows the condition should never happen, but is not inclined to prove it (by writing type checks or other handling), yet realizes it shouldn't be ignored outright (if only to document the unproven condition in code, or to quiet the compiler's incompleteness warnings).
It's also for cases when you know it can't happen. For example, Java's String class has this method:
public byte[] getBytes(String charsetName) throws UnsupportedEncodingException
Since it can throw a checked exception you have to catch it, which generally is fine, but consider this case:
someString.getBytes("UTF-8");
This call can never fail (support for UTF-8 encoding is required by the Java spec), but in my case I still have to do something with the exception in the catch block or our static code analysis tool will start complaining (and rightfully so). So that's where I'll log a "can't happen" error. It truly cannot happen.
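A minimal sketch of that catch block; the class and the wrapper name utf8Bytes are mine, but catching the checked exception and rethrowing it as an AssertionError is the standard way to record a "can't happen" in Java:

```java
import java.io.UnsupportedEncodingException;

public class CantHappen {
    public static byte[] utf8Bytes(String s) {
        try {
            return s.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Can't happen: every conforming JVM must support UTF-8.
            throw new AssertionError("UTF-8 unsupported?!", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(utf8Bytes("hello").length); // 5
    }
}
```

Wrapping the cause in the AssertionError preserves the original stack trace on the off chance the "impossible" ever occurs.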
Also in Java: switch statements or if-else chains on an `enum`. It's still good practice to include a final `else` or `default` case, even though it should really never be reached. In fact, the compiler will force you to include the `default` case (or a trailing return/throw) whenever it detects a code path that doesn't return a value. [0]
enum Whatever { FOO, BAR }

if (whatever == Whatever.FOO) {
    // handle FOO
} else if (whatever == Whatever.BAR) {
    // handle BAR
} else {
    // Should never happen!
}
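The compiler-enforced variant mentioned above shows up in value-returning methods; this is a minimal sketch (class and method names are mine):

```java
public class EnumSwitch {
    public enum Whatever { FOO, BAR }

    public static String describe(Whatever whatever) {
        switch (whatever) {
            case FOO: return "foo";
            case BAR: return "bar";
            // Without a default, javac reports "missing return statement":
            // it can't assume the switch is exhaustive at runtime (the enum
            // could gain constants, or `whatever` could even be null).
            default:
                throw new AssertionError("Should never happen: " + whatever);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Whatever.FOO)); // foo
    }
}
```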
FWIW getBytes(Charset) doesn't throw, and there's a base set of charsets in StandardCharsets (1.7+):
someString.getBytes(StandardCharsets.UTF_8);
(Charset.forName doesn't throw a checked exception either; an invalid name raises an unchecked UnsupportedCharsetException. StandardCharsets avoids stringly-typed code, but it's not available on 1.6, so if you're still stuck there, Charset.forName works.)
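Both checked-exception-free variants side by side; the class and method names are mine, a minimal sketch:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetVariants {
    // 1.7+: no checked exception, and no string literal to typo
    public static byte[] utf8New(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // 1.6 fallback: also no checked exception (a bad name raises an
    // unchecked UnsupportedCharsetException instead)
    public static byte[] utf8Old(String s) {
        return s.getBytes(Charset.forName("UTF-8"));
    }

    public static void main(String[] args) {
        System.out.println(utf8New("hello").length == utf8Old("hello").length); // true
    }
}
```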
Swift has the force-unwrap operator ! and its exception-handling variant try! for that. Sometimes you know that the exception case in the API will never arise, and the appropriate thing to do is to crash and let the programmer know that one of their assumptions is wrong. For example, you might be parsing JSON data that was generated within the program itself; normally JSON deserialization can fail for malformed JSON, but if you just constructed that JSON string within the same function and passed it directly, you know it's not gonna fail. It's pretty handy to ignore the error and turn it into an assertion in these cases, although this power should be used judiciously.
getBytes is poorly designed. In a safety-oriented language like Haskell or Rust, the set of encodings would be represented as an ADT (which forms a closed set) or a typeclass (open set). All possible type-correct encoding arguments would be safe.
An ExceptT or Maybe monad for handling encoding errors feels a lot like throwing exceptions, although they are less disruptive than exceptions. I'd probably represent a decoder as a function with type ByteString -> ErrorT ParseError m Text, which is neither an ADT nor a typeclass; it's a third solution. Either that or an Attoparsec parser, which is probably equivalent. An encoder seems like it shouldn't fail at all, but if it eventually forks out to one of the C locale functions I can see it throwing errors too.
Meanwhile, in the real world, Data.Text.Encoding uses a fourth solution: decodeUtf8With and encodeUtf8 ultimately represent the UTF-8 encoding as a pair of FFI functions.
text-icu also ultimately represents an encoding as an opaque pointer returned by the ICU library, and works in the IO monad. So it too could fail in similar ways. Errors throw an exception of type ICUError, which the caller can catch using the 'catch' function from Control.Exception.
The encoding library does use typeclasses like you suggest, but I'm not sure anybody uses it. Sometimes people drop in #haskell and complain about that library, and the response is usually "don't use that; use the one in Data.Text.Encoding instead".
I don't use Rust, but if the language is at all practical, I imagine they shuttle their equivalent of pointers and bytestrings around and depend on foreign C libraries and locales just the same. Probably they don't want to change the core library every time the Unicode Consortium publishes a new encoding scheme, so I can't imagine them exposing only a closed type.
So, looking purely at the signature and comparing it to examples from a language you suggested, it doesn't appear to be poorly designed at all. It's exactly what I would expect and want in any language, and the library consensus seems to agree. I think you're just imagining the grass being greener on the other side.
In Rust, our main two string types are String and &str, which are both UTF-8 encoded. For interoperability with other things, we have additional types that you can convert to/from. http://andrewbrinker.github.io/blog/2016/03/27/string-types-... is a recent overview in a blog post.
You still run into the problem with other functions, though. For example, in Haskell, 'tail' is a partial function - it's undefined if the list is nil. If, in your code, you write:
xs = if p x then concat [[x, "bar"], foos] else "baz" : foos
ys = tail xs
Then you know that the call to tail is not going to fail in your code, because the input has a guaranteed minimum length of either 1 or 2. You can't make that same guarantee about tail in isolation, though.
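A rough Java rendering of the same situation, keeping the whole example in the thread's main language (all names here are mine): dropFirst is "partial" in the sense that it throws on an empty list, but the caller can prove locally that its input is nonempty on every path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

public class TailExample {
    // Analog of Haskell's tail: undefined (throws) on the empty list.
    public static <T> List<T> dropFirst(List<T> xs) {
        if (xs.isEmpty()) throw new NoSuchElementException("dropFirst: empty list");
        return xs.subList(1, xs.size());
    }

    public static List<String> build(boolean p, String x, List<String> foos) {
        List<String> xs = new ArrayList<>();
        if (p) {
            xs.add(x);        // this branch yields length >= 2
            xs.add("bar");
        } else {
            xs.add("baz");    // this branch yields length >= 1
        }
        xs.addAll(foos);
        // Every path above leaves xs nonempty, so this "partial" call
        // cannot fail here, even though dropFirst in isolation can.
        return dropFirst(xs);
    }

    public static void main(String[] args) {
        System.out.println(build(true, "foo", Arrays.asList("qux"))); // [bar, qux]
    }
}
```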
Sometimes you also need a user-provided encoding (think of editors). In that case, the exception makes sense, and Haskell or Rust would need to provide an extra API for it. But generally you are right: stronger type checking would be preferable. Anyway, I dislike APIs which take a String but only support a strongly limited subset of strings. In that case, a dedicated type is a much better fit.
This one bothers me every time. Other parts of the library provide both a checked and an unchecked variant of the same operation. If you put in a hardcoded string, you know it never fails.
BTW, I've made an (almost religious) habit of ensuring that everything I touch is encoded as UTF-8, or can be converted to it, as I have been bitten hard several times by unexpected encoding issues. That's why the above problem catches me pretty often.
I often use such error checking. Usually the value of such error message is in making the code easier to reason about, to further clarify some obscure use case (which can't happen). And if it does happen anyway - well, at least we get that alert. ;)
Except that you never know how the code you wrote in a module will be used by other developers in the future. Such defensive programming will at least help others avoid mistakes when using your code.
Note that gcc and clang's __builtin_unreachable() are optimization hints, not assertions. If control actually reaches a __builtin_unreachable(), your program doesn't necessarily abort. Terrible things can happen, such as switch statements jumping to random addresses or functions running off the end without returning.
Sure, these aren't for defensive programming—they're for places where you know a location is unreachable, but your compiler can't prove it for you. The example given in the rust docs, for example, is a match clause with complete guard arms (i.e. if n < 0 and if n >= 0).
Disagree on this. It has nothing to do with efficiency in context of unlikely events. As others have noted here, it is effectively an assertion of expected language/system/operating-environment properties. Think axioms.
No, I'm primarily thinking of "ain't got time for dat", as in "there's a very real deadline, I have a lot of other things to get done, and this case isn't ever going to happen and I don't have time to prove it to the compiler."
Well, it's used in many, many ways and places, and that's one of them. Often, I'm using it after a two or more branch decision tree, where each branch returns. If someone refactors the code at some point and removes a return, or changes the condition to allow a case to fall through, it catches that.
You could be inclined to see it as "is not inclined to prove it", but I prefer to think it happens mostly because someone didn't think they changed something that could affect that (i.e. "I was sure that simple change to the boolean expression was equivalent when I made it...")
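The refactoring-guard use described in the two comments above can be sketched like this (class and method names are mine); each branch returns, and the trailing throw catches the day someone "harmlessly" changes a condition so a case falls through:

```java
public class RefactorGuard {
    public static int sign(int n) {
        if (n > 0) return 1;
        if (n < 0) return -1;
        if (n == 0) return 0;
        // Unreachable today, but it fires loudly if a later edit to one of
        // the conditions above accidentally lets a case slip past them.
        throw new AssertionError("unreachable: n = " + n);
    }

    public static void main(String[] args) {
        System.out.println(sign(-5)); // -1
    }
}
```

Note that javac accepts the trailing throw precisely because it cannot prove the three conditions are exhaustive; that's the human's unproven claim.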
> is not inclined to prove it (as represented by coding type checking or other handling)
This is a reachability problem, which in the general case cannot be decided. So "not inclined" may actually mean "can't (in any reasonable amount of time)".
It's not "in any reasonable amount of time". Proving in the general case whether a variable is unused in code is equivalent to solving the halting problem. This follows from Rice's theorem.
Yes, but it's more nuanced than that. Even if you can prove that a computation always terminates you can't necessarily prove that it yields the wanted result in any reasonable amount of time. This is the bounded halting problem, and it applies even to languages that only allow terminating computations. In those languages, the halting problem is nominally gone, but the bounded halting problem is just as bad as for Turing-complete languages. Just how bad is it? It is a time complexity class that includes all time complexity classes, i.e. it is harder (in general) than any problem that is computable and known to complete within f(n) steps, where n is the size of the input and f is any computable function.
I came to say much the same thing, but even take it a step further and point out that proofs are subject to human error as well. More powerful programming constructs are a great tool, but at some point it's turtles all the way down. The Knuth quote comes to mind: "Beware of bugs in the above code; I have only proved it correct, not tried it."
And this pattern is exactly why I prefer compile-time type safety in my languages. The pattern is still sometimes necessary, but there is a whole class of errors it gets used for that you can often eliminate.
What's interesting is that (as described in the present top comment on this article, about "CALL BRIAN"), if the abstraction of "type safety" is leaky (as it is, e.g. in the presence of memory or hardware errors), this kind of paranoia can actually have real-world benefits even though you can prove the impossibility of the code running using static analysis.
Sometimes the important artifact is the executable in the larger context of the deployed system, rather than the code you generate it from.
There is nothing preventing modeling the contextual environment within static analysis. Static analysis/type systems help the programmer draw the line between the known and the unknown. Some conditions are just not practical or efficient to check for. However for the context you use it within you can make certain guarantees about the code. This is still incredibly useful even though it doesn't guarantee an error can never take place.
You say it "is" leaky in the presence of memory corruption. However that is not necessarily true. One could model software memory verification within a type system. Meaning, you could guarantee at compile time that each time a variable is read it is verified via checksum against its last written value. This would not be particularly efficient, but the point stands that type systems can be used (and should be used) to model hardware failures.
This is no different from network link failures, etc.
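A toy sketch of the idea above, assuming we only want to turn silent corruption into a loud failure (the class name and CRC32 choice are mine): a cell that checksums its value on every write and re-verifies on every read. A sufficiently expressive type system could then force all reads through get().

```java
import java.util.zip.CRC32;

public class CheckedCell {
    private byte[] value;
    private long checksum;

    private static long crc(byte[] b) {
        CRC32 c = new CRC32();
        c.update(b);
        return c.getValue();
    }

    public void set(byte[] v) {
        value = v.clone();
        checksum = crc(value);
    }

    public byte[] get() {
        // If a bit flipped in `value` since the last set(), fail loudly
        // instead of silently propagating garbage.
        if (crc(value) != checksum)
            throw new IllegalStateException("memory corruption detected");
        return value.clone();
    }
}
```

As the comment notes, this is far from efficient, and a CRC over a single in-process array is only a demonstration, not real ECC.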
That is exactly the point, to mitigate it. Just because you can't mitigate everything doesn't mean it is less valuable. It is still extremely valuable to mitigate the things you do (or choose to) have control over.
In typed languages this is pretty often circumvented by choosing the wrong type. Java (even the standard library) is littered with APIs that take a String where the parameter does not have the semantics of a string (a bunch of characters with no inherent meaning). The worst offenders take a String, support only a very limited subset, and offer no explanation of which values are valid.
I agree that it does demonstrate defensive programming.
But can I also just add that the error message remains unhelpful and outright bad. You should absolutely have checks for "impossible" situations, but when those checks fail you need some way of determining which check failed (and you can't always assume you'll have a stack backtrace, in particular if an end user is reading you the error message).
For example you could do this: "Impossible Error in GetName(): {Exception}"