"This should never happen" is a design pattern of defensive programming; it's the same pattern that underlies assert.
The usual use is to catch errors caused by misuse of a method. There is some invariant that the method assumes but that is not enforced by the type signature of the interface. So if something goes wrong in outside code, or someone tries to use the method incorrectly, the invariant is not satisfied. When you catch such a problem, the current code context is FUBAR. The question is how aggressively to bail out: spew errors to a log and proceed with some GIGO calculation? Throw an exception? Exit the program?
It's a sub-pattern of "ain't got time for dat": the developer knows the condition should never happen, but is not inclined to prove it (by writing type checks or other handling), yet realizes it shouldn't be ignored outright (if only to document the unproven condition in code, or to quiet the compiler's incompleteness warnings).
It's also for cases when you know it can't happen. For example, Java's String class has this method:
public byte[] getBytes(String charsetName) throws UnsupportedEncodingException
Since it can throw a checked exception you have to catch it, which generally is fine, but consider this case:
someString.getBytes("UTF-8");
This call can never fail (support for UTF-8 encoding is required by the Java spec), but in my case I still have to do something with the exception in the catch block or our static code analysis tool will start complaining (and rightfully so). So that's where I'll log a "can't happen" error. It truly cannot happen.
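A minimal sketch of that catch block; the class and the wrapper name utf8Bytes are mine, but catching the checked exception and rethrowing it as an AssertionError is the standard way to record a "can't happen" in Java:

```java
import java.io.UnsupportedEncodingException;

public class CantHappen {
    public static byte[] utf8Bytes(String s) {
        try {
            return s.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Can't happen: every conforming JVM must support UTF-8.
            throw new AssertionError("UTF-8 unsupported?!", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(utf8Bytes("hello").length); // 5
    }
}
```

Wrapping the cause in the AssertionError preserves the original stack trace on the off chance the "impossible" ever occurs.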
Also in Java: switch statements or if-else chains on an `enum`. It's still good practice to include a final `else` or `default` case, even though it should really never be reached. In fact, the compiler will force you to include the `default` case (or a trailing return/throw) whenever it detects a code path that doesn't return a value. [0]
enum Whatever { FOO, BAR }

if (whatever == Whatever.FOO) {
    // handle FOO
} else if (whatever == Whatever.BAR) {
    // handle BAR
} else {
    // Should never happen!
}
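The compiler-enforced variant mentioned above shows up in value-returning methods; this is a minimal sketch (class and method names are mine):

```java
public class EnumSwitch {
    public enum Whatever { FOO, BAR }

    public static String describe(Whatever whatever) {
        switch (whatever) {
            case FOO: return "foo";
            case BAR: return "bar";
            // Without a default, javac reports "missing return statement":
            // it can't assume the switch is exhaustive at runtime (the enum
            // could gain constants, or `whatever` could even be null).
            default:
                throw new AssertionError("Should never happen: " + whatever);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Whatever.FOO)); // foo
    }
}
```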
FWIW getBytes(Charset) doesn't throw, and there's a base set of charsets in StandardCharsets (1.7+):
someString.getBytes(StandardCharsets.UTF_8);
(Charset.forName doesn't throw a checked exception either; an invalid name raises an unchecked UnsupportedCharsetException. StandardCharsets avoids stringly-typed code, but it's not available on 1.6, so if you're still stuck there, Charset.forName works.)
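Both checked-exception-free variants side by side; the class and method names are mine, a minimal sketch:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetVariants {
    // 1.7+: no checked exception, and no string literal to typo
    public static byte[] utf8New(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // 1.6 fallback: also no checked exception (a bad name raises an
    // unchecked UnsupportedCharsetException instead)
    public static byte[] utf8Old(String s) {
        return s.getBytes(Charset.forName("UTF-8"));
    }

    public static void main(String[] args) {
        System.out.println(utf8New("hello").length == utf8Old("hello").length); // true
    }
}
```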
Swift has the force-unwrap operator ! and its exception-handling variant try! for that. Sometimes you know that the exception case in the API will never arise, and the appropriate thing to do is to crash and let the programmer know that one of their assumptions is wrong. For example, you might be parsing JSON data that was generated within the program itself; normally JSON deserialization can fail for malformed JSON, but if you just constructed that JSON string within the same function and passed it directly, you know it's not gonna fail. It's pretty handy to ignore the error and turn it into an assertion in these cases, although this power should be used judiciously.
getBytes is poorly designed. In a safety-oriented language like Haskell or Rust, the set of encodings would be represented as an ADT (which forms a closed set) or a typeclass (open set). All possible type-correct encoding arguments would be safe.
An ExceptT or Maybe monad for handling encoding errors feels a lot like throwing exceptions, although they are less disruptive than exceptions. I'd probably represent a decoder as a function with type ByteString -> ErrorT ParseError m Text, which is neither an ADT nor a typeclass; it's a third solution. Either that or an Attoparsec parser, which is probably equivalent. An encoder seems like it shouldn't fail at all, but if it eventually forks out to one of the C locale functions I can see it throwing errors too.
Meanwhile, in the real world, Data.Text.Encoding uses a fourth solution: decodeUtf8With and encodeUtf8 ultimately represent the UTF-8 encoding as a pair of FFI functions.
text-icu also ultimately represents an encoding as an opaque pointer returned by the ICU library, and works in the IO monad. So it too could fail in similar ways. Errors throw an exception of type ICUError, which the caller can catch using the 'catch' function from Control.Exception.
The encoding library does use typeclasses like you suggest, but I'm not sure anybody uses it. Sometimes people drop in #haskell and complain about that library, and the response is usually "don't use that; use the one in Data.Text.Encoding instead".
I don't use Rust, but if the language is at all practical, I imagine they shuttle their equivalent of pointers and bytestrings around and depend on foreign C libraries and locales just the same. Probably they don't want to change the core library every time the Unicode Consortium publishes a new encoding scheme, so I can't imagine them exposing only a closed type.
So, looking purely at the signature and comparing it to examples from a language you suggested, it doesn't appear to be poorly designed at all. It's exactly what I would expect and want in any language, and the library consensus seems to agree. I think you're just imagining the grass being greener on the other side.
In Rust, our main two string types are String and &str, which are both UTF-8 encoded. For interoperability with other things, we have additional types that you can convert to/from. http://andrewbrinker.github.io/blog/2016/03/27/string-types-... is a recent overview in a blog post.
You still run into the problem with other functions, though. For example, in Haskell, 'tail' is a partial function - it's undefined if the list is nil. If, in your code, you write:
xs = if p x then concat [[x, "bar"], foos] else "baz" : foos
ys = tail xs
Then you know that the call to tail is not going to fail in your code, because the input has a guaranteed minimum length of either 1 or 2. You can't make that same guarantee about tail in isolation, though.
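A rough Java rendering of the same situation, keeping the whole example in the thread's main language (all names here are mine): dropFirst is "partial" in the sense that it throws on an empty list, but the caller can prove locally that its input is nonempty on every path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

public class TailExample {
    // Analog of Haskell's tail: undefined (throws) on the empty list.
    public static <T> List<T> dropFirst(List<T> xs) {
        if (xs.isEmpty()) throw new NoSuchElementException("dropFirst: empty list");
        return xs.subList(1, xs.size());
    }

    public static List<String> build(boolean p, String x, List<String> foos) {
        List<String> xs = new ArrayList<>();
        if (p) {
            xs.add(x);        // this branch yields length >= 2
            xs.add("bar");
        } else {
            xs.add("baz");    // this branch yields length >= 1
        }
        xs.addAll(foos);
        // Every path above leaves xs nonempty, so this "partial" call
        // cannot fail here, even though dropFirst in isolation can.
        return dropFirst(xs);
    }

    public static void main(String[] args) {
        System.out.println(build(true, "foo", Arrays.asList("qux"))); // [bar, qux]
    }
}
```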
Sometimes you also need a user-provided encoding (think of editors). In that case, the exception makes sense, and Haskell or Rust would need to provide an extra API for it. But generally you are right: stronger type checking would be preferable. Anyway, I dislike APIs which take a String but only support a strongly limited subset of strings. In that case, a dedicated type is a much better fit.
This one bothers me every time. Other parts of the library provide both a checked and an unchecked variant of the same operation. If you put in a hardcoded string, you know it never fails.
BTW, I've made an (almost religious) habit of ensuring that everything I touch is encoded as UTF-8, or can be converted to it, as I have been bitten hard several times by unexpected encoding issues. That's why the above problem catches me pretty often.
I often use such error checking. Usually the value of such error message is in making the code easier to reason about, to further clarify some obscure use case (which can't happen). And if it does happen anyway - well, at least we get that alert. ;)
Except that you never know how the code you wrote in a module will be used by other developers in the future. Such defensive programming will at least help others avoid mistakes when using your code.
Note that gcc and clang's __builtin_unreachable() are optimization hints, not assertions. If control actually reaches a __builtin_unreachable(), your program doesn't necessarily abort. Terrible things can happen, such as switch statements jumping to random addresses or functions running off the end without returning.
Sure, these aren't for defensive programming—they're for places where you know a location is unreachable, but your compiler can't prove it for you. The example given in the rust docs, for example, is a match clause with complete guard arms (i.e. if n < 0 and if n >= 0).
Disagree on this. It has nothing to do with efficiency in context of unlikely events. As others have noted here, it is effectively an assertion of expected language/system/operating-environment properties. Think axioms.
No, I'm primarily thinking of "ain't got time for dat", as in "there's a very real deadline, I have a lot of other things to get done, and this case isn't ever going to happen and I don't have time to prove it to the compiler."
Well, it's used in many, many ways and places, and that's one of them. Often, I'm using it after a two or more branch decision tree, where each branch returns. If someone refactors the code at some point and removes a return, or changes the condition to allow a case to fall through, it catches that.
You could be inclined to see it as "is not inclined to prove it", but I prefer to think it happens mostly because someone didn't think they changed something that could affect that (i.e. "I was sure that simple change to the boolean expression was equivalent when I made it...")
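The refactoring-guard use described in the two comments above can be sketched like this (class and method names are mine); each branch returns, and the trailing throw catches the day someone "harmlessly" changes a condition so a case falls through:

```java
public class RefactorGuard {
    public static int sign(int n) {
        if (n > 0) return 1;
        if (n < 0) return -1;
        if (n == 0) return 0;
        // Unreachable today, but it fires loudly if a later edit to one of
        // the conditions above accidentally lets a case slip past them.
        throw new AssertionError("unreachable: n = " + n);
    }

    public static void main(String[] args) {
        System.out.println(sign(-5)); // -1
    }
}
```

Note that javac accepts the trailing throw precisely because it cannot prove the three conditions are exhaustive; that's the human's unproven claim.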
> is not inclined to prove it (as represented by coding type checking or other handling)
This is a reachability problem, which in the general case cannot be decided. So "not inclined" may actually mean "can't (in any reasonable amount of time)".
It's not "in any reasonable amount of time". Proving in the general case whether a variable is unused in code is equivalent to solving the halting problem. This follows from Rice's theorem.
Yes, but it's more nuanced than that. Even if you can prove that a computation always terminates you can't necessarily prove that it yields the wanted result in any reasonable amount of time. This is the bounded halting problem, and it applies even to languages that only allow terminating computations. In those languages, the halting problem is nominally gone, but the bounded halting problem is just as bad as for Turing-complete languages. Just how bad is it? It is a time complexity class that includes all time complexity classes, i.e. it is harder (in general) than any problem that is computable and known to complete within f(n) steps, where n is the size of the input and f is any computable function.
I came to say much the same thing, but even take it a step further and point out that proofs are subject to human error as well. More powerful programming constructs are a great tool, but at some point it's turtles all the way down. The Knuth quote comes to mind: "Beware of bugs in the above code; I have only proved it correct, not tried it."
And this pattern is exactly why I prefer compile-time type safety in my languages. The pattern is still sometimes necessary, but there is a whole class of errors it gets used for that you can often eliminate.
What's interesting is that (as described in the present top comment on this article, about "CALL BRIAN"), if the abstraction of "type safety" is leaky (as it is, e.g. in the presence of memory or hardware errors), this kind of paranoia can actually have real-world benefits even though you can prove the impossibility of the code running using static analysis.
Sometimes the important artifact is the executable in the larger context of the deployed system, rather than the code you generate it from.
There is nothing preventing modeling the contextual environment within static analysis. Static analysis/type systems help the programmer draw the line between the known and the unknown. Some conditions are just not practical or efficient to check for. However for the context you use it within you can make certain guarantees about the code. This is still incredibly useful even though it doesn't guarantee an error can never take place.
You say it "is" leaky in the presence of memory corruption. However that is not necessarily true. One could model software memory verification within a type system. Meaning, you could guarantee at compile time that each time a variable is read it is verified via checksum against its last written value. This would not be particularly efficient, but the point stands that type systems can be used (and should be used) to model hardware failures.
This is no different from network link failures, etc.
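A toy sketch of the idea above, assuming we only want to turn silent corruption into a loud failure (the class name and CRC32 choice are mine): a cell that checksums its value on every write and re-verifies on every read. A sufficiently expressive type system could then force all reads through get().

```java
import java.util.zip.CRC32;

public class CheckedCell {
    private byte[] value;
    private long checksum;

    private static long crc(byte[] b) {
        CRC32 c = new CRC32();
        c.update(b);
        return c.getValue();
    }

    public void set(byte[] v) {
        value = v.clone();
        checksum = crc(value);
    }

    public byte[] get() {
        // If a bit flipped in `value` since the last set(), fail loudly
        // instead of silently propagating garbage.
        if (crc(value) != checksum)
            throw new IllegalStateException("memory corruption detected");
        return value.clone();
    }
}
```

As the comment notes, this is far from efficient, and a CRC over a single in-process array is only a demonstration, not real ECC.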
That is exactly the point, to mitigate it. Just because you can't mitigate everything doesn't mean it is less valuable. It is still extremely valuable to mitigate the things you do (or choose to) have control over.
In typed languages this is pretty often circumvented by choosing the wrong type. Java (even the standard library) is littered with APIs that take a String where the parameter does not have the semantics of a string (a bunch of characters with no inherent meaning). The worst offenders take a String, support only a very limited subset, and offer no explanation of which values are valid.
I agree that it does demonstrate defensive programming.
But can I also just add that the error message remains unhelpful and outright bad. You should absolutely have checks for "impossible" situations, but when those checks fail you need some way of determining which check failed (and you can't always assume you'll have a stack backtrace, in particular if an end user is reading you the error message).
For example you could do this: "Impossible Error in GetName(): {Exception}"