15

The C11 standard says that the size of an array, whether fixed-size or variable-length, "shall have a value greater than zero." What is the justification for not allowing a length of 0?

Especially for variable-length arrays, it makes perfect sense to have a size of zero every once in a while. It is also useful for fixed-size arrays whose size comes from a macro or build configuration option.

Interestingly, GCC (and Clang) provide extensions that allow zero-length arrays. Java also allows arrays of length zero.

Kevin Cox

6 Answers

11

The issue, I would wager, is that C arrays are essentially just pointers to the beginning of an allocated chunk of memory. A size of 0 would mean that you have a pointer to... nothing? You can't point to nothing, so some arbitrary thing would have had to be chosen. You can't use null, because then your 0-length arrays would look like null pointers. And at that point every implementation is going to pick a different arbitrary behavior, leading to chaos.

Telastyn
6

Let's look at how an array is typically laid out in memory:

           +----+
arr[0]   : |    |
           +----+
arr[1]   : |    |
           +----+
arr[2]   : |    |
           +----+
            ...
           +----+
arr[n-1] : |    |
           +----+

Note that there isn't a separate object named arr that stores the address of the first element; when an array appears in an expression, C computes the address of the first element as needed.
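The difference between an array and a pointer to its first element can be observed directly. The following is only a minimal sketch; the helper names (array_bytes, pointer_bytes, same_start) are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, named here only for illustration. */

/* sizeof applied to the array itself: the array is not converted to a
   pointer here, so the result is the size of all four elements. */
size_t array_bytes(void)
{
    int arr[4] = {0};
    assert(sizeof arr == 4 * sizeof arr[0]);
    return sizeof arr;
}

/* sizeof applied to a pointer to the first element: only the size of
   the pointer object p, which is a separate object from the array. */
size_t pointer_bytes(void)
{
    int arr[4] = {0};
    int *p = arr;              /* arr is converted to &arr[0] here */
    return sizeof p;
}

/* The array object begins exactly where its first element begins;
   no extra cell holding an address sits in front of it.
   Returns 1 if the two addresses compare equal. */
int same_start(void)
{
    int arr[4] = {0};
    return (void *)&arr == (void *)&arr[0];
}
```

Calling array_bytes() yields 4 * sizeof(int), while pointer_bytes() yields sizeof(int *), which is usually a different number.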

So, let's think about this: a 0-element array would have no storage set aside for it, meaning there's nothing to compute the array address from (put another way, there's no object mapping for the identifier). It's like saying, "I want to create an int variable that takes up no memory." It's a nonsensical operation.

Edit

Java arrays are completely different animals from C and C++ arrays; they're not a primitive type, but a reference type derived from Object.

Edit2

A point brought up in the comments below: the "greater than 0" constraint only applies to arrays whose size is specified through a constant expression. Declaring a VLA with a 0-valued non-constant expression is not a constraint violation, but it does invoke undefined behavior.

It's clear that VLAs are different animals from regular arrays, and their implementation can allow for a 0 size. They cannot be declared static or at file scope, because the size of such objects must be known before the program starts.

It's also worth noting that as of C11, implementations are not required to support VLAs.
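Since a zero-sized VLA is undefined behavior rather than a compile-time error, code that takes its size from a run-time value has to rule out zero itself. A minimal sketch of such a guard, assuming a compiler that supports VLAs; the function sum_of_squares is a made-up example:

```c
#include <stddef.h>

#ifdef __STDC_NO_VLA__
#error "This sketch needs VLA support, which is optional in C11"
#endif

/* Hypothetical helper: returns 0*0 + 1*1 + ... + (n-1)*(n-1),
   using a VLA as scratch space. */
long sum_of_squares(size_t n)
{
    /* A VLA size of 0 would be undefined behavior, so return before
       the declaration below is ever reached. */
    if (n == 0)
        return 0;

    long scratch[n];                 /* n > 0 is guaranteed here */
    for (size_t i = 0; i < n; i++)
        scratch[i] = (long)(i * i);

    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
    return total;
}
```

With this guard, sum_of_squares(0) is well defined and simply returns 0 without ever creating the array.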

John Bode
2

If the declaration type name[count] is written inside a function, you tell the C compiler to reserve sizeof(type)*count bytes in the stack frame and to compute the address of the first element of the array.

If the declaration type name[count] is written outside all function and struct definitions, you tell the C compiler to reserve sizeof(type)*count bytes in the data segment and to compute the address of the first element of the array.

In most expressions, name evaluates to the constant address of the first element of the array, and any value that holds a memory address is a pointer, which is why name is treated as a pointer rather than as an array. Note that array elements in C are accessed only through pointers.

If count is a constant expression that evaluates to zero, you tell the C compiler to reserve zero bytes in the stack frame or the data segment and to return the address of the first element of the array. The problem is that the first element of a zero-length array doesn't exist, and you cannot compute the address of something that doesn't exist.

It stands to reason that element number count+1 doesn't exist in an array of length count, and this is why the C compiler forbids defining a zero-length array as a variable inside or outside a function: what would name contain? What address would name store, exactly?

If p is a pointer then the expression p[n] is equivalent to *(p + n)

Here the asterisk * in the right-hand expression is the pointer dereference operator: it accesses the memory whose address is p + n. In the pointer expression p + n, the address stored in p is increased by n multiplied by the size of the type that p points to.

Is it possible to add an address and a number?

Yes, it is possible, because on typical platforms an address is an unsigned integer, commonly written in hexadecimal notation.
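The equivalence of p[n] and *(p + n), and the scaling by the element size, can be checked with a short sketch. The helper names below are invented for illustration, and the byte distance is measured through char pointers so that nothing is assumed about how addresses are represented.

```c
#include <stddef.h>

/* Hypothetical helpers, named here only for illustration. */

/* Returns 1 if p[n] and *(p + n) designate the same element.
   The caller must make sure that element n exists. */
int subscript_matches_deref(const int *p, size_t n)
{
    return &p[n] == p + n && p[n] == *(p + n);
}

/* Distance in bytes between p + n and p. Pointer arithmetic counts
   in elements, so the result is n * sizeof *p. */
size_t byte_offset(const int *p, size_t n)
{
    return (size_t)((const char *)(p + n) - (const char *)p);
}
```

For an array int a[5], byte_offset(a, 3) equals 3 * sizeof(int), which is the "number n multiplied by the size of the type" described above.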

1

If you want a pointer to a memory address, declare one. An array actually points at a chunk of memory you have reserved. Arrays decay to pointers when passed to functions, but if the memory they are pointing at is on the heap, no problem. There is no reason to declare an array of size zero.

1

You would usually want your zero-size (in fact variable-size) array to know its size at run time. In that case, pack it in a struct and use a flexible array member, for example:

struct my_st {
   unsigned len;
   double flexarray[]; // of size len
};

Obviously the flexible array member has to be the last in its struct and you need to have something before. Often that would be something related to the actual runtime-occupied length of that flexible array member.

Of course you would allocate:

 /* needs <stdio.h>, <stdlib.h> and <math.h> */
 unsigned len = some_length_computation();
 struct my_st *p = malloc(sizeof(struct my_st) + len * sizeof(double));
 if (!p) { perror("malloc my_st"); exit(EXIT_FAILURE); }
 p->len = len;
 for (unsigned ix = 0; ix < len; ix++)
    p->flexarray[ix] = log(3.0 + (double)ix);

AFAIK, this was already possible in C99, and it is very useful.
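To show that the zero-length case needs no special handling with this approach, here is a self-contained sketch. The constructor my_st_new and its fill parameter are hypothetical additions for illustration; only struct my_st comes from the snippet above.

```c
#include <stdlib.h>

struct my_st {
    unsigned len;
    double flexarray[];   /* holds len elements */
};

/* Hypothetical constructor: allocates a my_st with room for len
   doubles, each set to fill. Returns NULL if allocation fails. */
struct my_st *my_st_new(unsigned len, double fill)
{
    /* sizeof *p covers the fixed part only; the flexible array member
       contributes nothing to it, so len == 0 still asks malloc for a
       non-zero number of bytes. */
    struct my_st *p = malloc(sizeof *p + len * sizeof p->flexarray[0]);
    if (!p)
        return NULL;
    p->len = len;
    for (unsigned ix = 0; ix < len; ix++)
        p->flexarray[ix] = fill;
    return p;
}
```

A call such as my_st_new(0, 0.0) gives a valid object whose len is 0 and whose flexarray must not be indexed; the caller releases it with free() like any other allocation.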

BTW, flexible array members don't exist in standard C++ (because it would be difficult to define when and how their elements should be constructed and destructed). See, however, std::dynarray, which was proposed for C++14 but did not end up in the standard.

1

From the days of the original C89, when a C Standard specified that something had Undefined Behavior, what that meant was "Do whatever would make an implementation on a particular target platform most suitable for its intended purpose". The authors of the Standard didn't want to try to guess what behaviors might be most suitable for any particular purpose. Existing C89 implementations with VLA extensions might have had different, but logical, behaviors when given a size of zero (e.g. some might have treated the array as an address expression yielding NULL, while others might have treated it as an address expression that could equal the address of another arbitrary variable but could safely have zero added to it without trapping). If any code relied upon such differing behavior, the authors of the Standard wouldn't have wanted to forbid compilers from continuing to support it.

Rather than trying to guess what implementations might do, or suggesting that any one behavior should be considered superior to any other, the authors of the Standard simply allowed implementers to use their judgment in handling that case as they saw fit. Implementations that use malloc() behind the scenes might treat the array's address as NULL (if size-zero malloc yields null), those that use stack-address computations might yield a pointer that matches some other variable's address, and other implementations might do other things. I don't think they expected that compiler writers would go out of their way to make the zero-size corner case behave in a deliberately useless fashion.

supercat