The big language design problem I think this post highlights is the flip side of Julia's composability: composing generic code with types that implement abstractions can easily expose bugs when the caller and the callee don't agree on exactly what the abstraction is.
Several of the bugs that Yuri reported are a very specific case of this: there's a lot of generic code that assumes that array indexing always starts at one, but that's not always the case since OffsetArrays allow indexing to start anywhere. The older code in the stats ecosystem is particularly badly hit by this because it often predates the existence of OffsetArrays and the APIs that were developed to allow writing efficient generic code that works with arrays that don't start at the typical index (or which might even want to be iterated in a different order).
Fixing these specific OffsetArray bugs is a fairly straightforward matter of searching for `1:length(a)` and replacing it with `eachindex(a)`. But there's a bigger issue that this general problem raises: How does one, in general, check whether an implementation of an abstraction is correct? And how can one test if generic code for an abstraction uses the abstraction correctly?
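To make the failure mode concrete, here is a minimal sketch. The `ZeroBased` type is made up purely for illustration (the real OffsetArrays package is far more general); everything else is plain Base Julia:

```julia
# Hypothetical minimal zero-based vector, standing in for OffsetArrays
# to keep the sketch self-contained:
struct ZeroBased{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(v::ZeroBased) = size(v.data)
Base.axes(v::ZeroBased) = (0:length(v.data) - 1,)
Base.IndexStyle(::Type{<:ZeroBased}) = IndexLinear()
Base.getindex(v::ZeroBased, i::Int) = v.data[i + 1]

# Buggy generic code: assumes the indices are 1:length(a).
function sum_buggy(a)
    s = zero(eltype(a))
    for i in 1:length(a)
        s += a[i]          # wrong elements, or a BoundsError
    end
    return s
end

# Fixed: iterate over the array's actual indices.
function sum_fixed(a)
    s = zero(eltype(a))
    for i in eachindex(a)  # 0:2 here, 1:3 for a plain Vector
        s += a[i]
    end
    return s
end

v = ZeroBased([10, 20, 30])
sum_fixed(v)               # 60, and the same code works for Vector too
```

Here `sum_buggy(v)` reads `a[1]`, `a[2]`, `a[3]`: the first two silently return the wrong elements, and the last one throws, since the valid indices are 0, 1 and 2.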
Many people have mentioned interfaces and seem to believe that they would solve this problem. I don't believe that they do, although they do help. Why not? Consider the OffsetArray example: nothing about `for i in 1:length(a)` violates anything about a hypothetical interface for AbstractArrays. Yes, an interface can tell you what methods you're supposed to implement. There are a couple of issues with that: 1) you might not actually need to implement all of them—some code doesn't actually use all of an interface; 2) you can find out what methods you need to implement just by running the code that uses the implementation and seeing what fails. What the interface would guarantee is that if you've implemented these methods, then no user of your implementation will hit a missing method error. But all that tells you is that you've implemented the entire surface area of the abstraction, not that you've implemented the abstraction at all correctly. And I think that covering the entire surface area of an abstraction when implementing it is the least hard part.
What you really want is a way to generically express behaviors of an abstraction in a way that can be automatically tested. I think that Clojure's spec is much closer to what's needed than statically checked interfaces. The idea is that when someone implements an abstraction, they can automatically get tests that check their implementation implements the abstraction correctly and fully, including how it behaves. If you've implemented an AbstractArray, one of the tests might be that indexing the array with each index value returned by `eachindex(a)` works and doesn't produce a bounds error.
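As a sketch of what such behavioral tests could look like in Julia (the function name `test_array_behavior` is made up; this is not an existing API):

```julia
using Test

# Hypothetical reusable behavioral checks for the AbstractArray
# abstraction; an implementer would call this on their own type.
function test_array_behavior(a::AbstractArray)
    @testset "AbstractArray behavior" begin
        # every index produced by eachindex must be in bounds
        for i in eachindex(a)
            @test checkbounds(Bool, a, i)
        end
        # length must agree with the number of indices
        @test length(a) == length(eachindex(a))
        # iteration order must match index order
        isempty(a) || @test first(a) == a[first(eachindex(a))]
    end
end

test_array_behavior(rand(3, 4))   # a plain Array passes
```

A package defining a new array type would run this alongside its own tests, instead of discovering behavioral mismatches downstream.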
On the other end, you also want some way of generating mock instances of an abstraction for testing generic code. We do a bit of this in Julia's test suite: there are GenericString and GenericSet types, which implement the minimal string/set abstraction, and use these to test generic code to verify that it doesn't assume more than it should about the string and set abstractions. For a GenericArray type, you'd want it to start at an arbitrary index and do other weird stuff that exotic array types are technically allowed to do, so that any generic code that makes invalid assumptions will get caught. You could call this type AdversarialArray or something like that.
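The existing Test stdlib types can be used like this (a small illustration; `GenericString` wraps a `String` but only promises the minimal `AbstractString` API):

```julia
using Test

# GenericString is a String in disguise that only exposes the
# generic AbstractString interface:
s = Test.GenericString("hello")

s isa AbstractString   # true
s isa String           # false: code dispatching on String won't fire
uppercase(s)           # generic string code still works
```

A hypothetical AdversarialArray would apply the same idea to arrays, additionally choosing weird-but-legal axes so that `1:length(a)`-style assumptions fail loudly in tests.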
I've personally thought quite a bit about these issues, but as Keno has said, there hasn't been time to tackle these problems in the last couple of years. But they certainly are important and worth solving.
On a personal note, Yuri, thanks for all the code and I'm sorry to see you go.
It seems to me that much of the difficulty with interfaces, whether they are made explicit or kept implicit, lies in defining the semantics that the functions are supposed to have.
As we expand the types our generic code can handle, we have to refine the semantics it relies on. For a long time, Base.length(::AbstractArray) could mean “the largest one-based index of the array”, but then we started using the same code that handles regular Arrays for OffsetArrays and this interpretation was no longer valid. I guess the alternative would have been to leave length(::OffsetArray) unimplemented and block the valid use of OffsetArrays for all generic code that understands Base.length as “the number of values”.
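Assuming the OffsetArrays.jl package is available, the two readings of `length` can be seen side by side:

```julia
using OffsetArrays   # third-party package; must be installed

a = OffsetArray([1.0, 2.0, 3.0], 0:2)  # valid indices are 0, 1, 2

length(a)            # 3: "the number of values" still holds
last(eachindex(a))   # 2: "the largest one-based index" reading breaks
firstindex(a)        # 0
```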
It can still be difficult to tell what a function like Base.length should mean if I implement it for my types. For example, should it return the number of local values or the global length for an array that is distributed between multiple processes (e.g. in an MPI program)? Perhaps some generic code will use it to allocate a buffer for intermediate values, in which case it should be the local length. Or some generic code computes an average by dividing the (global) sum by the global length.
It seems impossible to come up with a precise definition of all the semantics your generic code assumes a priori, so we can either restrict our usage of generics to a small number of concrete types that were considered when the code was written, or we have to accept that we occasionally run into these sorts of issues while we refine the semantics.
Anecdotally, it has been my experience that packages that have been made to work in many generic contexts (such as the ODE packages) are likely to work flawlessly with my custom types, while packages that have seen less such effort (e.g. iterative solvers) are more likely to cause issues. This makes me hopeful that it is possible to converge towards very general generic implementations.
It is also worth mentioning that it is very possible to use Julia without ambitious use of cross-package generic functionality, and use it “merely” as a better Fortran or Matlab.
To expand on the "interfaces are not enough" part: Defining an interface on an abstract type only tells you that an implementation exists, not that it is correct, i.e. that the specific implementation for a subtype guarantees the same properties the interface specifies.
On top of this, you really want to be alerted to when you expect more of an interface than the interface guarantees - this is what happened in the case of `1:length(A)` being assumed to give the indices into `A`, when the `AbstractArray` interface really only guarantees that a given set of methods exists.
I feel like these sorts of issues more or less require more formal models being provided & checked by the compiler. Luckily for us, nothing in this space has been implemented or attempted in & for Julia yet, while there are a lot of experiments with formal methods and proof systems being researched right now (TLA+, Coq, ...). There are of course a lot of footguns[1], but the space is moving fast and I'd love to see something that makes use of this integrated into Julia at some point.
> Defining an interface on an abstract type only gives you that a implementation exists, not that it is correct
Pretty far off topic for Julia, but the definition of Rust's Traits over semantics rather than syntax (even though of course the compiler will only really check your syntax) gives me a lot of this.
The fact that this Bunch<Doodad> claims to be IntoIterator<Item=Doodad> tells me that the person who implemented that explicitly intends that I can iterate over the Doodads. They can't accidentally be IntoIterator<Item=Doodad>; the author has to literally write the implementation naming the Trait to be implemented.
But that comes at a heavy price of course: if the author of Bunch never expected me to iterate over it, the best I can do is newtype MyBunch and implement IntoIterator using whatever ingredients are provided on the surface of Bunch. This raises the price of composition considerably :/
> you really want to be alerted to when you expect more of an interface than the interface guarantees
In the case alluded to (AbstractArray) I feel like the correct thing was not to implement the existing interface. That might have been disruptive at the time, but people adopting a new interface which explicitly warns them not to 1:length(A) are not likely to screw this up, and by now perhaps everything still popular would have upgraded.
Re-purposing existing interfaces is probably always a bad idea, even if you can persuade yourself it never specifically said it was OK to use it the way you suspect everybody was in practice using it, Hyrum's Law very much applies. That interface is frozen in place, make a new one.
I think the OP specifically complains about the use of @inbounds and that the documentation was advocating an invalid use of it. Some libraries may not have been updated to handle AbstractArray: that's normal SW rot. But the out of bound access being unreported is the actual grief of the OP.
> What you really want is a way to generically express behaviors of an abstraction in a way that can be automatically tested.
The pure FP ecosystems in Scala often accomplish this in the form of "laws", which are essentially bundles of pre-made unit tests that they ship alongside their core abstraction libraries.
Invenia's approach to interface testing ("Development with Interface Packages" on our blog) does some of the things you suggest as a standard of practice, by providing tools to check correctness that implementers can use as part of package tests. ChainRulesTestUtils.jl is a decent example (although this one doesn't come with fake test types). I think this is typically good enough, and only struggles with interface implementations with significant side effects.
One little win could be publishing interface tests like these for Base interfaces in the Test stdlib. I appreciate that the Generic* types are already exposed in the Test stdlib!
That's why interfaces are useful—they save you from that. But they don't actually solve the problem of checking that an abstraction has been implemented correctly, just that you've implemented the entire API surface area, possibly incorrectly. Note, however, that if you have a way of automatically testing the behavioral correctness of an implementation, then those tests presumably cover the entire API, so automatic testing would subsume the benefit that static interface checking provides—just run the automatic tests and it tells you what you haven't implemented as well as what you may have implemented incorrectly.
Indeed, static types don't save you from this issue; at least, structural ones don't: exactly the same issue would occur as in Julia (see: C++ templates). Static structural types have the same problem as Julia here: you gain a lot of compositional power at the expense of potential correctness issues.
However, nominal types do "solve" the problem somewhat, as there's a clear statement of intent when you do "X implements Y" that the compiler enforces. If that promise is missing, the compiler will not let you use an X where a Y is expected. And if you do say X implements Y, then you probably tested that you did it correctly.
But this would also fail at the OffsetArray problem. The only way I can see of protecting against it (statically or dynamically) is to have an "offset-index" type, different from an integer, that you must use to index an OffsetArray. That makes a[x] incompatible between regular Arrays and OffsetArrays.
I don't think anyone wants that mess though. So if your language has OffsetArrays, and they're supposed to be compatible with Arrays, and you can index both the same way, no amount of static types will help (save for dependent/refinement types but those are their own mess).
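A sketch of what that separate offset-index type could look like (all names here are made up):

```julia
# Hypothetical wrapper: the only "key" that can index an OffsetVec.
struct OffsetIndex
    i::Int
end

struct OffsetVec{T}
    data::Vector{T}
    origin::Int        # the first logical index
end

# No getindex method for plain Int: v[0] is a MethodError by design.
Base.getindex(v::OffsetVec, x::OffsetIndex) = v.data[x.i - v.origin + 1]

v = OffsetVec([10, 20, 30], -1)   # logical indices -1, 0, 1
v[OffsetIndex(0)]                 # 20
# v[0]                            # MethodError: Int and OffsetIndex
                                  # deliberately don't unify
```

This is exactly the incompatibility described above: safe, but generic code can no longer treat both array flavors uniformly.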
EDIT: I seem to have replied to the wrong comment, but the right person, so hey, no issue in the end :)
I think I agree with your claim that writing tests would be a superset in terms of covering the use cases of interfaces. But
1) Testing is such a PAIN in Julia. You HAVE to run all the tests every time. You HAVE to write tests in a separate folder. Multiple dispatch and composability prevent having confidence that your tests cover all the cases.
2) In a lot of cases, interfaces improve readability of the code. Being able to just look at code in a large code base and know which interfaces are implemented + need to be implemented is such an advantage
3) Static analysis tooling can provide linting. This doesn't even have to be implemented by the core team at the moment. The lack of interface limits any kind of tooling to be developed.
All in all, when I write Julia code, I know I have to write EXTENSIVE tests. Even more tests than I have to write with Python (with mypy). Almost an order of magnitude more tests than I have to write with Rust.
Sometimes it feels like I spend more time writing / running tests than adding features to our packages. And that is honestly just not fun for me, let alone for other scientists and researchers on my team.
Interfaces provide correctness guarantees because implementing them is a conscious decision. If your array implements GenericArray, you know about that interface, and presumably what it is used for. Its methods can also contain documentation.
The point is a common point of… trust may be the word? Two developers that don’t even know each other can use each other’s code correctly by programming against a third, hypothetical implementation that they both agree on. Here OffsetArray would simply not implement the GenericArray interface if the latter expects 1-based indexing.
In this specific case the solution would be to move the indexing question into the interface itself - it is not only an implementation detail. Make the UltraGenericArray interface have an offset() method as well, and perhaps make [] do 1-based indexing always (with auto-offsetting for offset arrays) plus a separate index-aware get() method, so that downstream usage must explicitly opt in to different indexing.
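A rough sketch of that proposal (all names here, including `getat`, are hypothetical):

```julia
# Hypothetical: `[]` is always 1-based; offset-aware access is a
# separate, explicit operation.
struct UltraVec{T}
    data::Vector{T}
    origin::Int                  # first native index
end
offset(v::UltraVec) = v.origin

# v[i] auto-offsets, so 1-based generic code keeps working:
Base.getindex(v::UltraVec, i::Int) = v.data[i]

# getat opts in to the native, offset-aware indexing:
getat(v::UltraVec, i::Int) = v.data[i - v.origin + 1]

v = UltraVec([10, 20, 30], -1)
v[1]           # 10: 1-based view, safe for legacy generic code
getat(v, -1)   # 10: explicit opt-in to offset indexing
```

The design choice is that code which never heard of offsets silently gets the safe 1-based view, while offset-aware code must say so in its source.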
I remember reading a long time ago about the 1-based array and the offset-array 'kludge'.
My first thought was that they should have replicated Ada's design instead; my second thought was to hope they have a good linter, because putting an arbitrary-offset implementation in a library is a minefield.
I don't claim to be especially smart: this is/was obvious. Unfortunately, what isn't obvious is how to fix this issue, and especially how to fix the culture which produces this kind of issue.
Offset arrays aren't a kludge and the package would exist regardless of whether zero or one has been chosen as a default base for indexing. Having arbitrary index ranges in different dimensions is extremely useful in many application domains. When working with FFTs, for example, it's natural for the indices to be symmetrical around zero. Or when doing coordinate transforms like in this example from the excellent Images.jl package: https://juliaimages.org/stable/tutorials/indexing/.
Offset arrays are a wonderful source of footguns, and this should have been obvious from the start. Either you train everyone to write loops in an official way that avoids the problem, or you are going to have bugs. And if you choose training, you have to train EVERYONE, because anyone coming in from another language will have the wrong habits.
Even Perl realized that changing the indexing of arrays was a bad idea. Which is why the special variable $[ has been deprecated every which way possible because it caused too many bugs.
Several of the most serious scientifically minded programming languages have also had arrays that can be indexed starting at different offsets, including Fortran, Chapel and Fortress. If Julia were zero-based, OffsetArrays would still exist and be highly useful. OffsetArrays is not some "kludge" just to allow indexing from zero. Frankly, indexing from zero isn't important enough for that. What really makes OffsetArrays useful is being able to do things like have indices go from -k:k, where k may depend on the particular dimension. That way the point A[0, 0, 0] is the center of the array and you can navigate up/down/back from there naturally. Of course you can simulate that with arrays that start at zero or one, but it's a major pain.
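For example, a stencil or kernel centered on zero is natural to express this way (assuming the OffsetArrays.jl package is installed):

```julia
using OffsetArrays   # third-party package

k = 2
A = OffsetArray(zeros(2k + 1, 2k + 1), -k:k, -k:k)
A[0, 0] = 1.0        # the center, with no (k + 1, k + 1) arithmetic
A[-k, -k]            # top-left corner
axes(A)              # both dimensions run over -2:2
```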