Boxed values are pretty much necessary for dynamic languages — that's where the ...

pron · on Feb 18, 2012

Not necessarily. You just store a pointer to the type info inline with the data (like C++'s vtables). You can have unboxed "value-types" (const structs, essentially) in dynamic languages. In fact, you could even differentiate between a "boxed" ref type and a value type at runtime, because refs don't need all 64 bits of the pointer. So a ref is a 64 bit pointer with the first bit set to, say, 0, and a value (struct) type always begins with a 64 bit pointer to it's type information, only it's tagged with MSB of 1. Since you can't extend concrete types, you can easily store value types inline in an array, and just have the type-info pointer (which must be the same for all elements, b/c there is no inheritance) at the beginning of the array. And if your structs are aligned OK, you could easily pass them to C by skipping the type-info pointer both in the single value case and in the array case.

StefanKarpinski · on Feb 19, 2012

Ok, having read this comment again, here's a more measured response. You're assuming in this comment the value-type vs. object-type dichotomy that's used in, e.g., C#. That's one way to go, but I'm not sold that it's the best way. Deciding whether you want something to be storable inline in arrays or not when you define a type is kind of a strange thing. Maybe sometimes you do and sometimes you don't. So the bigger question is really if that's the best way to go about the matter.

It seems that in dynamically typed languages, you either need to have two kinds of objects (value types vs. object types), or two kinds of storage slots (e.g. arrays that hold values inline vs. arrays that hold references to heap-allocated values). The boxing aspect is really only part of that since you can't get the shared-reference behavior unless the storage is heap-allocated, regardless of whether there's a box header or not.

So yeah, it's a complicated issue.

StefanKarpinski · on Feb 18, 2012

This is an interesting scheme and seems like it might work, but I'm not sure. Would you be willing to pop onto julia-dev@googlegroups.com and post this suggestion there so we can have a full-blown discussion of it? Hard to do here — and some of the other developers would need to chime in on this too.

dkarl · on Feb 18, 2012

As a compromise, I think it would be helpful to be able to define structs that have a known layout in memory but no dynamic identity. They would be treated like primitive types in Java, but with the crucial difference that users could define their own. That way users could write Julia code that stores and accesses data in the same format they need for interoperating with whatever native libraries they use, instead of serializing and deserializing between Julia objects and C struct arrays (or using int or byte arrays in their Julia code and giving up most of the advantages of a modern programming language.)

StefanKarpinski · on Feb 18, 2012

We've discussed our way down that path but the design ends up being unsatisfying because "objects" and "structs" end up being so similar yet different. It may be what we have to do, but I haven't given up yet on having structs and Julia composite types be compatible somehow. pron's scheme is interesting.

JulianMorrison · on Feb 19, 2012

Take a look at how Go handles interface types, because that is basically dynamic typing, and Go can allocate data as unboxed values.