Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Exponential time complexity in the Swift type checker (cocoawithlove.com)
133 points by adamnemecek on July 17, 2016 | hide | past | favorite | 46 comments


Why does Swift use this weird constraint-based type resolution anyway? To me it seems like it, whenever it manages to found a non-trivial solution, would be one that is unexpected to the programmer and leads to unintended consequences. Anyone have a motivating example of when it would be useful?

What's wrong with a type resolver that resolves (or not, with an error!) greedily in order of precedence?

Here's how I would solve

   let a: Animal = 1 + -(2)
Ok, 2 is an int. If that gives problems down the line, too bad. -(2) is an int, if that gives problems, too bad. 1+-(2) is int+int so result should be an int. a is an Animal, try to convert int to Animal. Fail. Error!

One thing in particular that scares me with the constraint approach is that it seems to me (though I may be wrong) that whether overflow occurs or not is not immediately obvious.

    let a: Double = -(INT_MAX) + - (INT_MAX)
whats to say those INT_MAXes wont somehow be interpreted as Doubles and a will be -2 INT_MAX ?


Swift is not unique in using constraint solving to provide bidirectional type inference. Scala and Haskell provide the same level of flexibility; although not with the same compile-time problems. In the case of Haskell, this is ultimately because the language is intentionally simpler in some ways.

Swift brings together three features that can be hard on type inference:

* Overloaded literals. When overloaded literals like maps can themselves contain literals, there are actually quite a few possible overloadings to choose from.

* Subtyping. We infer not types but type bounds (ranges).

* Overloading.

Haskell has notably no overloading at all and no subtyping. Somehow Scala manages to combine all these things and a quite astonishing conversions system, too.


Haskell does have overloaded literals: integer literals have type `Num a => a`, and floating-point literals have type `Fractional a => a`. And GHC has an extension OverloadedStrings to make string literals have type `IsString a => a` (which makes it easier to work with ByteString and other different string-like types).


Sorry, I was not being clear. Since I treated "overloaded literals" and "overloading" as separate bullet points I was using "overloading" to mean only overloaded declarations -- declaring the same function multiple times with different types.


Haskell is also very happy to throw "ambiguous literal..." errors at you when it gets a little confused.


AFAIK, Haskell doesn't really have overloading in the sense that other languages use it. "Num a => a" is just one type. Contrast with Java, where a class can have two methods named "foo" with completely different arguments and return types.


“Overloaded literals” only has one meaning as far as I know, and what Haskell has is it: a literal whose type isn’t solely determined by its contents.


Essentially the parent commenters are talking about too different things, one is talking about the first bullet point and the other is talking about the third.


However, it seems that the really tricky overloads in Swift's case are the lists, which can contain other overloaded literals inside them.


Paul Philips showed me methods in Scala that took minutes to compile as it tried to solve for the return type. It has similar problems and is related to why the compiler is so slow.


Interesting, didn't know that about Swift compiler. What does constraint solving provide that Hindley-Milner doesn't? My understanding of HM was that it's reducible to the system of linear equations, therefore solvable in linear time. What is that that Swift type checker requires that HM doesn't provide?


HM has no concept of subtyping or overloading, I think. Scala has both of those and cannot do full type inference, only local inference.


It is worth considering that neither Haskell nor Scala are strictly Hindley-Milner.


Type inference for HM is actually exponential, due to generalization for let-polymorphism. You're probably thinking of inference for a language with just simple types.


Both of your questions have the same answer: Swift doesn't implicitly convert between types.

First answer: since Swift has no implicit conversions, greedily selecting types would create a situation where mathematical expressions or even function parameters that use Float instead of Double would be completely impractical. Every floating point literal would be interpreted as Double and compiler error on attempt to use as a Float. You'd need to manually construct a Float around every literal to actually get Float. The same problem would plague and UInt, Int8, etc – any value would require manual construction from a literal of type Int.

Second answer: Since Swift has no implicit conversions:

    let a: Double = -(Int.max) + -(Int.max)
would always be a compile error because the right-hand side is an Int and can't be converted to Double. This isn't a type inference or type checking problem, it's a "no implicit conversions allowed" error. You would need to write:

    let a = Double(-(Int.max) &+ -(Int.max))
for the line to compile (I've used the overflow allowed addition operator '&+' since the regular '+' will trigger an abort if it actually overflows).


The idea of having a select number of explicitly defined implicit conversions is interesting, though, but not unheard of. (Is there not something like that in Scala?)

However, in order to perform the simplistic algorithm that the parent comment imagines, we have to give up overloading. If you have f(A) and f(B) and an implicit conversion between A and B, and we type f(x) (where x is an A), both f(A) and f(B) could be selected. Since they might have different types, a larger expression it is a part of could be valid by picking one and not the other, but we cannot know which until we go through all possibilities. That's an exponential algorithm too.


Yes, Scala has that. There's a precedence, and ties are compile errors.It

It won't apply more then one level of conversion, though you can perform recursive operations in the type system. Shapeless is a library that uses this for ad-hoc typesafe programming without boilerplate.


Ok, so what if you type out int max manually?

    let a: Double = -(9223372036854775807) + - (9223372036854775807)


In that case, all literals would be interpreted as Double.


Swift supports overloading functions by return type.


Overloading with type inference is an interesting problem. Haskell has type classes, which is an elegant solution but has its own issues. Swift has more flexible overloading at the expense of compile times and corner cases.


Swift is highly inspired by Standard ML. If you look at https://en.wikipedia.org/wiki/Standard_ML, you can see the similarities:

    fun dist ((x0, y0), (x1, y1)) = let
      val dx = x1 - x0
      val dy = y1 - y0
      in
        Math.sqrt (dx * dx + dy * dy)
      end
Because of that, it also has its exponential time complexity (see for example http://spacemanaki.com/blog/2014/08/04/Just-LOOK-at-the-humo...)

What makes things especially bad for Swift seems to be that it has function overloading and uses lots of it in its standard library, as that increases the constants in the exponentiation (for example, with 6 choices to make and 10 vs 2 possible overloads per decision point you get 10^6 = a million versus 2^6 = 64 options to consider)


I had thought about that, but I am pretty sure the reason Hindley-Milner is exponential has to do with the generalization for let-polymorphism. I'm not familiar with all the details of Swift, but I do not recall seeing that it will infer polymorphic types for you.

That doesn't mean there isn't a good reason that Swift's inference is exponential – I just think HM's exponential time behavior is a red herring.


The article focusses on function overloads, which are not a part of Hindley-Milner (so yes, a red herring in this case) and not necessarily exponential. The fact that Swift is exponential here is an implementation shortcoming and little more.

Swift's type checker does also have an exponential path for protocol constrained generic parameters which is functionally equivalent to the exponential path for classical Hindley-Milner. However, there are "pseudo" linear optimizations for this too that Swift should also consider.


I'm confused, shouldn't people be supplying type signatures to make the compilers job easier? I use plenty of lambdas in Haskell, but my top level functions are always typed.


I can't believe something like this makes it into "production". How could anybody value such an asinine type inference system over compile time?

If your compiler needs a constraints solver to determine the type of an expression, it belongs in a science exhibit, not into a real-world toolchain.

Don't try to be clever! Give me a compiler that's predictable, reliable and fast. If that means I have to add some type information here and there, that's just fine!


A very large percentage of compilers for statically typed languages include algorithms with worst-case exponential time. For instance, the celebrated Hindley Milner type inference algorithm, which is "almost linear" in practice, has worst-case exponential time (the reason it rarely comes up is that programs rarely have deeply-left-nested let bindings). Anyway, type checkers that employ constraint solving can be a lot easier to reason about and extend than alternatives (and give way better error messages, which is probably the second-most-important thing a type checker does).


Lots of languages have these features and it works fine. Finding the type of an expression is one of the canonical applications of a constraint solver.


    The following line will give an error if you try to compile it in Swift 3:

    let a: Double = -(1 + 2) + -(3 + 4) + 5
Reminds me when Golang said you should avoid using numbers greater than about 20k in your program to avoid GC problems. I.e. farcical.


Go's GC has been precise for heap-allocated data since go1.1 (released 3 years ago) and precise for heap and stack-allocated data since go1.3 (released 2 years ago), completely fixing any GC problems with values that look like pointers.

I agree that the statement about avoiding large numbers was farcical, but it was not made by "Golang" but rather by someone on the go-nuts mailing list. Your sentence makes it sound like this was general advice for people programming in go from the go team, but to the best of my knowledge this is not true.


This was in a 32 bit version of a very old Go compiler (pre 1.0?) - this should not mistaken for a current Go problem.


How has this been addressed in modern Go?


> Golang said you should avoid using numbers greater than about 20k in your program to avoid GC problems

Isn't this unavoidable with conservative garbage collectors? The GC looks at your fields and believes your integer is a pointer… It cannot know better.

Swift doesn't have that issue, but reference counting has its own risks of memory leaks through mishandled weak references, and a performance overhead.


Scanning the heap looking for values that "look like pointers" in some indefinite way is a fairly unusual strategy. There are many GC'd languages -- Haskell and O'Caml come to mind -- without this limitation. Haskell used "pointer tagging" where every value had two bits reserved at the top end and thus Ints were limited to 30 bits. The two bits were used to tell the GC what it was looking at. Later developments side-stepped this limitation but I'm not sure how.

Does ARC have an overhead relative to GC? If if it does, it at least prevents latency spikes.


"Does ARC have an overhead relative to GC? If if it does, it at least prevents latency spikes"

It doesn't prevent latency spikes. As an extreme example, create a linked list or tree with a billion nodes, and let its root go out of scope. That will call free a billion times.


Do that every request, and it will be the same each time.

With GC, you can see a latency spike when every request creates such a list and they aren't freed until, due to memory pressure, many are freed all at once. Some of the requests will get stuck waiting for a linked list node and have very high latency relative to the other ones.

The strength of ARC for interactive applications is its consistency.


If you really want to use a conservative GC, you should not use the top or bottom of your address space for heap.


Yes. I would have thought this to be solved by now. What is especially scary is that very simple expressions can significantly increase compile time, and there is no easy way of knowing.

What would be hilarious is if the swift dev team releases a compile profiler that shows "hot lines" that take a long time to compile, while not solving the actual issue.


This information is currently partially available. Not line by line but definition by definition. It's hidden behind a compiler flag but there are open source tools that consume it and display it for developer consumption.


What. Can you give a link, please? Wasn't able to google that up.



Considering how likely it is that developers are running into this particular set of compiler bugs and are scratching their head figuring out what is wrong it should be fixed soon. Swift can not afford to be called unpredictable from a developer perspective.


It clearly can. The swift compiler and language have had far more fundamental problems than this in previous releases, and that hasn't stopped its rapid growth.


Apple has demonstrated again and again that developers will put up with just about anything. As long as Apple has control of the platform with the most valuable customers, it will stay that way.


Except the last time something like this happened[1] it was fixed within a day[2].

[1]: https://spin.atomicobject.com/2016/04/26/swift-long-compile-...

[2]: https://github.com/apple/swift/commit/2cdd7d64e1e2add7bcfd54...


I first, somehow, misread the title, and changed "Swift type" for "Swype". Made me wonder how an Android keyboard would be affected by an exponential time complexity algorithm and why would it be a big deal.

Everything makes much more sense now.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: