Hacker News

Interesting points.

> Implementing generics in this way breaks parametricity. Simply put, parametricity means being able to reason about functions just from their type signature. You can't do this when the function can do arbitrary computation based on the concrete type a generic type is instantiated with.

Do you mean reasoning about a function in the sense of just understanding what a function does (or can do), i.e. from the view of the practical programmer, or reasoning about the function in a typed theoretical system (e.g. typed lambda calculus, or maybe something even more exotic)? Or maybe a bit of both? There is certainly a concern from the theoretical viewpoint, but how important is that for a practical programming language?

For example, I believe C++ template programming also breaks "parametricity" by supporting template specialisation. While there are many mundane issues with C++ templates, breaking parametricity is not a very big deal in practice. On the contrary, it enables optimisations that are not otherwise possible (for templates). Consider for example std::vector<bool>: implementations can be made that actually store a single bit per vector element (instead of how a bool is normally represented, using an int or char). Maybe this is even required by the standard, I don't recall. My point is that it makes sense for C++ to allow this, I think.



In terms of implementation, you can view parametricity as meaning that within the body of a function with a generic type, the only operations that can be applied to values of that type are also arguments to that function.

This means you cannot write

fn sort<A>(elts: Vec<A>): Vec<A>

because you cannot compare values of type A within the implementation of sort with this definition. You can write

fn sort<A>(elts: Vec<A>, lessThan: (A, A) -> Bool): Vec<A>

because a comparison function is now a parameter to sort.

This helps both the programmer and the compiler. The practical upshot is that functions are modular: they specify everything they require. It follows from this that if you can compile a call to a function, there is a subset of errors that cannot occur.

In a language without parametricity, functions can work with only a subset of possible calls. If we take the first definition of sort, it means a call to sort could fail at compile-time, or worse, at run-time, because the body of the function doesn't have a case that knows how to compare elements of that particular type. This leads to a language that is full of special cases and arbitrary decisions.

Javascript / Typescript is an example of a language without parametricity. sort in Javascript has what are, to me, insane semantics: converting values to strings and comparing them lexicographically. (See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...) This in turn can lead to amazing bugs, which are only prevented by the programmer remembering to do the right thing. Remembering to do the right thing is fine in the small but it doesn't scale.

Breaking parametricity definitely has uses. The question becomes one about the tradeoffs one makes. That's why I'd rather have a discussion about those tradeoffs than just "comptime good" or "parametricity good". Better yet are neat ideas that capture the good parts of both. (E.g. type classes / implicit parameters reduce the notational overhead of calling functions with constrained generic types, but they bring their own tradeoffs around modularity and coherence.)


Do you have a blog or other site where you post your writing? Your explanations are quite good and easy to follow for someone like me, an interested/curious onlooker.


Thanks! I appreciate that. A few things:

- https://scalawithcats.com/ is a book I'm writing. There is an associated newsletter to which I post blog articles and the like.

- https://noelwelsh.com/ is my personal site, which hosts my blog.


As a former user of Scala and at times cats, I look forward to your work.


Functions can crash anyway. I don't see how what you describe is different from a function on integers that errors on inputs that are too big. The programmer has to actively choose to make a function break parametricity, and they can equally choose not to do that.


In an ideal world, a function that only works for small integers would bake that into its type signature. I.e., rather than accepting “any integer”, the function would accept a u8, or a value compile-time guaranteed to be in the range 0..10 or something.

Your point still stands though. Modern programming languages don’t constrain you much at all with their type systems.

I spent a little time in Haskell a few years ago, and the kind of reasoning you can do about functions is wild. E.g., if a function has the type signature A -> A, we know it has to be the identity function, because that's the only function that matches the signature. (Well, that or the “bottom” function, which loops forever.) Lots of functions are like that, where the type signatures alone are enough to reason about a lot of code.


> In an ideal world, a function that only works for small integers would bake that into its type signature. I.e., rather than accepting “any integer”, the function would accept a u8, or a value compile-time guaranteed to be in the range 0..10 or something.

Ada takes this approach.


Haskell has quite a few partial functions in the standard library, if I recall. Bog-standard (!!), head, and tail for lists; fromJust to unwrap Maybes; and even some more interesting examples like "all", which mightn't terminate on infinite lists. Indeed, any Turing-complete language includes partial functions of that final variety.


This is _also_ doable with the ability to constrain generics.

sort<A> where A implements Comparable

Simpler explanation IMO.


Not really an "also"; "implements" is just syntax sugar for what the GP is saying.


Sure, but it's the same thing in 10x fewer words. Having parametric generic types that accept constraints allows your functions to have the best of all worlds.

So a language _with_ that is superior. All Zig needs to do is add some way to allow for constraints.


Fair point about parametricity. A language could, during macro expansion, do the equivalent of a Scala implicit lookup for a sorting function for the type, and return an error at macro expansion time if it can't find one. That avoids the "doing the right thing requires discipline" problem, but I agree it is still less clear from the type signature alone what the type requirements are.


> For example, I believe C++ template programming also breaks "parametricity" by supporting template specialisation.

C++ breaks parametricity even with normal templates, since you can e.g. call a method that exists/is valid only on some instantiations of the template.

The issue is that the compiler can't help you check whether your template type-checks or not; you will only find out when you instantiate it with a concrete type. Things get worse when you call a templated function from within another templated function, since the error can then be arbitrarily many levels deep.

> My point is that in makes sense for C++ to allow this, I think.

Whether it makes sense or not, it's a big pain point, and some are trying to move away from it (see e.g. Carbon's approach to generics).


> C++ breaks parametricity even with normal templates

I might be wrong here, but as I understand it "parametricity" means loosely that all instantiations use the same function body. To quote wikipedia:

"parametricity is an abstract uniformity property enjoyed by parametrically polymorphic functions, which captures the intuition that all instances of a polymorphic function act the same way"

In this view, C++ does not break parametricity with "normal" (i.e. non-specialised) templates. Of course, C++ does not type check a template body against its parameters (unless concepts/traits are used), leading to the problems you describe, but that is a different thing as far as I understand.


To be parametric it needs to be the same body semantically, not just textually. Particularly in C++ with its heavy operator overloading and limited safety, you can very easily write a template whose body will do the right thing for some types and be undefined behaviour for others (e.g. if your template has comparisons in it and then you instantiate it with a pointer or something).


Hm, wouldn't any use of if constexpr break that definition?

e.g.

     // assumes an is_int<T> variable-template trait is in scope
     template<typename T>
     int f() {
       if constexpr (is_int<T>) { return 0; }
       else { return 1; }
     }


Parametricity is about behavior, not code. A function parametric in a variable should behave identically for all values of that variable. If one instance of a C++ template fails to compile and another instance of the same template does compile, it is a stretch to say they behave identically, even though they use the same code.


> Consider for example std::vector<bool>: implementations can be made that actually store a single bit per vector element (instead of how a bool normally is represented using an int or char).

Your example is considered a misfeature and demonstrates why breaking parametricity is a problem: the specialized vector<bool> is not a standard STL container even though vector<anythingelse> is. That's at best confusing, and it can lead to very confusing problems in generic code. (In this specific case, C++11's "auto" and AAA (Almost Always Auto) lessen some of the issues, but even then it can cause hard-to-diagnose performance problems even when the code compiles.)

See https://stackoverflow.com/a/17797560 for more details.


The C++ vector<bool> specialization is bad because it breaks many important implicit contracts, such as being able to take the address of vector<> elements, which makes it practically unusable where a normal vector<> is expected. But it isn't specialized incorrectly in a formally meaningful sense: all the errors are outside the class (unsatisfied expectations from client code) and implicit (particularly for C++ as it was at the time).


You're not wrong, but it is at the very least weird that a specialization doesn't conform to the concept that the general template does. That is something proper parametricity would have avoided -- if it were available.

(The Hysterical Raisins are understandable, esp. given that it wasn't even possible to specify Concepts in C++ until 2020...)


The point is exactly that the "concept" of what the template should do is informal and a careful, detailed specification in a far more expressive language than vintage C++ would be needed to elicit good compilation errors from something like vector<bool>.

Proper parametricity is only a starting point: types that specify alignment, equality and lifetimes would be needed to make it useful.


vector<bool> may not be required to store space-optimized bool values, but the interface and its guarantees are still different enough that it is largely thought of as a mistake. For one, the const reference type is bool and not bool const&. Plus other members like flip… Mostly the issue is in generic code expecting a normal vector.


One thing you can reason about a function is: does it exist at all? If you don't have parametricity, you can't even be sure of that. In Rust, as long as your type satisfies a generic function's bounds, you can be sure that instantiating the function with that type will compile; in C++ you don't have that luxury.



