I think the parent is talking about the C pattern of having the last member of a struct be a zero-length array, which is actually a dynamically sized array that the struct is only the header to (usually with another field of the struct specifying the length of the array). It's fallen a bit out of favor, but it is a handy way to commingle the header and array with one allocation/pointer.
And interestingly, COBOL handled this in a cleaner way. I forget some of the specifics, but there was a way to tell the compiler that one field of a record specified the length of the following array, allowing the same pattern in a type-safe way.
In COBOL it wasn't dynamic, or infinite. The syntax of the array/table declaration includes a maximum, and it would statically allocate the entire array up to the max.
According to the article, it's better to use the new flexible array syntax (int arr[]) instead of the old zero-length syntax (int arr[0]), because that lets the compiler emit better warning messages.
According to the C spec, zero length arrays are explicitly illegal.
> Zero-length array declarations are not allowed, even though some compilers offer them as extensions (typically as a pre-C99 implementation of flexible array members).
However, as they say, gcc (and therefore clang) have an extension that allows it. So does MSVC but it works slightly differently.
You should be specific about which C spec you are referring to: you talk about "the C spec" and then mention a "pre-C99 implementation". Maybe you mean zero-length arrays have always been illegal in every version, but it would be clearer to say so.
In reality, who cares what the C spec says? Unless you have to port code to different compilers (which is very unlikely, unless you are building a library meant to be shared across different projects), you only care that the code works correctly with the compiler you chose to use.
I don't get all the programmers who are scandalized if you use GNU extensions. They are fine, and mostly useful, so if you are using GCC I don't see why not use -std=gnu11 instead of -std=c11... who cares if the program is not compliant?
Even if you want to be compliant to share code, the C standard is your last problem: you probably use a ton of libraries and header files specific to that implementation (such as POSIX or, even worse, Linux-specific stuff), so your code is 99% not portable anyway without rewriting most of it.
It’s all well and good until someone else wants to compile it on clang[1]. There’s a lot of value in C being a lingua franca of sorts, and that depends on a cross-implementation understanding of semantics, i.e. standards, (though that doesn’t have to be from whatever official body that produced C99 etc).
It's saying that C99 implemented flexible array members but before then some compilers introduced their own (nonstandard) implementation of flexible array members, using the (not allowed) zero sized array notation.
2. Gradually, people started using the "size-1 array at the end of a struct, but write beyond it" hack as a flexible array.
3. As an attempt to do the hack in a more ordered manner, some compilers, including GCC, started officially supporting the non-standard "size-0" array extension.
4. C99 added the flexible array with indefinite length (array[]), while prohibiting both the undefined-behavior "array[1]" hack and the non-standard "array[0]" extension.
So when people say "size-0" array, it could mean any of these three things, and it does get a bit confusing. But fundamentally the idea was the same: all three techniques are used to achieve the same practical effect.
> If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned to indicate an error, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object
So it may actually allocate (although the allocation is unusable).
If malloc(0) gets called as the first malloc in the program, the system break does not need to be moved, as 0 bytes of space are always available... but malloc does like to move the break by a large amount at a time to reduce the need for repeated calls...
I'm guessing malloc(0) does not move sysbreak and simply returns a pointer to the bottom of the heap?
Implementation defined. I've heard of returning null (under the case that your free() implementation allows nulls to be passed in) or returning a pointer to a zero length object on the heap like you're suggesting. Really just about the only requirement is that the pointer can subsequently be given to free() since dereferencing the pointer is UB.
Oh, good call. I had that backwards in my memory. The issue is that when you give out non-NULL pointers to zero-sized objects, you have to make sure the pointer bit patterns are unique, at least versus nonzero-sized objects, so that the matching calls to free don't stomp on each other.
I downvoted for the assumption that sbrk is involved in any way. It's an implementation detail of (some) historical malloc implementations, not some inherent aspect of all implementations. The entire thought experiment is faulty.
sizeof doesn't work the same for malloc() because the type of the returned value is a pointer, and the behavior of sizeof is dependent solely on the static type. For comparison, calloc() is specifically defined as "allocates space for an array of ... objects" in the Standard, but since return type is still void*, the caveat with sizeof still applies.
That is not true, as long as VLAs are in the language spec for the version you're using.
This works:

    #include <stdio.h>

    void vla_print(int n)
    {
        int foo[n];
        printf("Got %zu bytes right there!\n", sizeof foo);
    }

    int main(void)
    {
        vla_print(47);
        return 0;
    }
This prints 188 [1].
It works even if you "hide" n from the compiler, i.e. make its value something that is only known at run time (which is jumping through hoops; pretty sure the above is enough).
Also, it was jarring that the fine article kept referring to sizeof as sizeof(), a notation that C programmers typically use to identify function names. As everyone knows, sizeof is not a function. It's an operator. I really need to print up that t-shirt soon. Or maybe my first tattoo ... Hm.
In your example, sizeof works because the type of foo is int[n], so I'm not sure what point you're making. It's still true that the way sizeof works depends solely on the static type of the expression that it is applied to - if it's an array, you get the actual size, including the necessary dynamic computation if it's a VLA, and if it's a pointer, you get the size of a pointer even if it points to an array.
Well, sure, but the reason for that is that C's type system simply cannot represent functions returning arrays with known size. This is a weakness of the type system. Arrays degrade to a pointer type, lacking array length, as soon as you pass them between functions.
So what? An array has a size that sizeof can measure. What is returned from malloc is not an array, and what sizeof measures does not reflect the size of the allocation.
calloc() doesn't allocate an array either. Its purpose is to allocate blocks of memory larger than SIZE_MAX. Mostly irrelevant now, but it was an issue on 16-bit systems.
I literally quoted the Standard (C17 7.22.3.2) in my earlier comment, and it very specifically says that calloc allocates an array. This language isn't from a new standard, either - it goes all the way back to ISO C90.