66

Well, I know that there are things like malloc/free for C, and new/using-a-destructor for memory management in C++, but I was wondering why there aren't "new updates" to these languages that allow the user to have the option to manually manage memory, or for the system to do it automatically (garbage collection)?

Somewhat of a newb-ish question, but I've only been in CS for about a year.

genesis
Dark Templar

16 Answers

77

Garbage collection requires data structures for tracking allocations and/or reference counting. These add overhead in memory and performance, and they add complexity to the language. C++ is designed to be "close to the metal"; in other words, it takes the higher-performance side of the tradeoff against convenience features. Other languages make that tradeoff differently. Which emphasis you prefer is one of the considerations in choosing a language.

That said, there are a lot of schemes for reference counting in C++ that are fairly lightweight and performant, but they are in libraries, both commercial and open source, rather than part of the language itself. Reference counting to manage object lifetime is not the same as garbage collection, but it addresses many of the same kinds of issues, and is a better fit with C++'s basic approach.
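As a rough illustration of library-level reference counting, here is a minimal sketch using std::shared_ptr from the standard library (standardized after many of the third-party libraries mentioned above). The Tracked type and refcount_demo function are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that records when its destructor runs.
struct Tracked {
    explicit Tracked(bool& destroyed_flag) : destroyed(destroyed_flag) {}
    ~Tracked() { destroyed = true; }
    bool& destroyed;
};

// Returns true if the object was destroyed as soon as its last owner went away.
bool refcount_demo() {
    bool destroyed = false;
    {
        auto first = std::make_shared<Tracked>(destroyed);
        assert(first.use_count() == 1);
        {
            std::shared_ptr<Tracked> second = first;  // second owner, count is 2
            assert(first.use_count() == 2);
        }  // second goes out of scope, count drops back to 1
        assert(first.use_count() == 1);
        assert(!destroyed);
    }  // last owner gone: the destructor runs here, not at some later GC pass
    return destroyed;
}
```

The count lives in a small control block next to the object, which is the kind of bookkeeping overhead the first paragraph refers to, but it is paid only for objects that opt in.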

Others
kylben
50

Strictly speaking, there is no memory management at all in the C language. malloc() and free() are not keywords in the language, but just functions that are called from a library. This distinction may be pedantic now, because malloc() and free() are part of the C standard library, and will be provided by any standard compliant implementation of C, but this wasn't always true in the past.

Why would you want a language with no standard for memory management? This goes back to C's origins as 'portable assembly'. There are many cases of hardware and algorithms that can benefit from, or even require, specialized memory management techniques. As far as I know, there is no way to completely disable Java's native memory management and replace it with your own. This is simply not acceptable in some high performance/minimal resource situations. C provides almost complete flexibility to choose exactly what infrastructure your program is going to use. The price paid is that the C language provides very little help in writing correct, bug free code.
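To make "specialized memory management" a bit more concrete, here is a minimal sketch of one common technique, a fixed-buffer arena (bump) allocator. It is written in C++ rather than C only to keep the thread's examples in one language; the Arena class and arena_demo function are hypothetical names, and the sketch assumes the alignment passed in is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical arena: hands out memory from a caller-supplied buffer and
// releases everything at once. No per-object free, no heap, no runtime.
class Arena {
public:
    Arena(unsigned char* buffer, std::size_t size)
        : base_(buffer), size_(size), used_(0) {}

    // Assumption: align is a power of two.
    void* allocate(std::size_t bytes, std::size_t align) {
        const std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(base_) + used_;
        const std::uintptr_t aligned =
            (start + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - start);
        const std::size_t remaining = size_ - used_;
        if (padding > remaining || bytes > remaining - padding) {
            return nullptr;  // out of space
        }
        used_ += padding + bytes;
        return base_ + used_ - bytes;  // start of the aligned block
    }

    void reset() { used_ = 0; }  // "frees" every allocation in one step
    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t size_;
    std::size_t used_;
};

bool arena_demo() {
    alignas(std::max_align_t) unsigned char storage[64];
    Arena arena(storage, sizeof storage);

    void* p1 = arena.allocate(sizeof(int), alignof(int));
    void* p2 = arena.allocate(sizeof(double), alignof(double));
    assert(p1 && p2);
    int* count = new (p1) int(7);
    double* ratio = new (p2) double(2.5);
    assert(*count == 7 && *ratio == 2.5);

    assert(arena.allocate(1024, 1) == nullptr);  // request larger than the arena
    arena.reset();
    return arena.used() == 0;
}
```

Nothing here depends on malloc() or free(), which is the point: the program decides where the memory comes from and when it is given back.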

38

The real answer is that the only way to make a safe, efficient garbage collection mechanism is to have language-level support for opaque references. (Or, conversely, a lack of language-level support for direct memory manipulation.)

Java and C# can do it because they have special reference types that cannot be manipulated. This gives the runtime the freedom to do things like move allocated objects in memory, which is crucial to a high-performance GC implementation.

For the record, no modern GC implementation uses reference counting, so that is completely a red herring. Modern GCs use generational collection, where new allocations are treated essentially the same way that stack allocations are in a language like C++, and then periodically any newly allocated objects that are still alive are moved to a separate "survivor" space, and an entire generation of objects is deallocated at once.

This approach has pros and cons: the upside is that heap allocations in a language that supports GC are as fast as stack allocations in a language that doesn't support GC, and the downside is that objects that need to perform cleanup before being destroyed either require a separate mechanism (e.g. C#'s using keyword) or else their cleanup code runs non-deterministically.

Note that one key to a high-performance GC is that there must be language support for a special class of references. C doesn't have this language support and never will; because C++ has operator overloading, it could emulate a GC'd pointer type, although it would have to be done carefully. In fact, when Microsoft invented their dialect of C++ that would run under the CLR (the .NET runtime), they had to invent a new syntax for "C#-style references" (e.g. Foo^) to distinguish them from "C++-style references" (e.g. Foo&).

What C++ does have, and what is regularly used by C++ programmers, is smart pointers, which are really just a reference-counting mechanism. I wouldn't consider reference counting to be "true" GC, but it does provide many of the same benefits, at the cost of slower performance than either manual memory management or true GC, but with the advantage of deterministic destruction.

At the end of the day, the answer really boils down to a language design feature. C made one choice, C++ made a choice that enabled it to be backward-compatible with C while still providing alternatives that are good enough for most purposes, and Java and C# made a different choice that is incompatible with C but is also good enough for most purposes. Unfortunately, there is no silver bullet, but being familiar with the different choices out there will help you to pick the correct one for whatever program you're currently trying to build.

28

Because, when using the power of C++, there is no need.

Herb Sutter: "I haven't written delete in years."

See "Writing modern C++ code: how C++ has evolved over the years", at 21:10.

It may surprise many experienced C++ programmers.
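For readers wondering what code without delete looks like, here is a small sketch, assuming C++14 or later for std::make_unique. The Shape, Circle and Square types and both functions are hypothetical names invented for the example, not something taken from the talk.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

struct Square : Shape {
    std::string name() const override { return "square"; }
};

// Heap-allocated, polymorphic objects, owned by the vector.
std::vector<std::unique_ptr<Shape>> make_shapes() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>());
    shapes.push_back(std::make_unique<Square>());
    return shapes;
}

std::string describe_shapes() {
    const auto shapes = make_shapes();
    assert(shapes.size() == 2);
    std::string out;
    for (const auto& shape : shapes) {
        if (!out.empty()) out += ' ';
        out += shape->name();
    }
    return out;
}  // shapes is destroyed here; each unique_ptr frees its object, no delete written
```

Calling describe_shapes() yields "circle square", and every allocation is released when the vector goes out of scope.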

Lior Kogan
15

"All" a garbage collector is, is a process that runs periodically, checks whether there are any unreferenced objects in memory and, if there are, deletes them. (Yes, I know this is a gross oversimplification.) This is not a property of the language, but of the framework.

There are garbage collectors written for C and C++ - this one for example.
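The "check for unreferenced objects and delete them" idea from the first paragraph can be sketched as a toy mark-and-sweep pass. This is a deliberately simplified illustration, not how any real collector for C or C++ is implemented; Node, ToyHeap and toy_gc_demo are hypothetical names, and roots are registered by hand here, whereas a real collector has to find them itself.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical managed object: it only holds references to other objects.
struct Node {
    std::vector<Node*> refs;
    bool marked = false;
};

class ToyHeap {
public:
    Node* allocate() {
        objects_.push_back(std::make_unique<Node>());
        return objects_.back().get();
    }

    void add_root(Node* node) { roots_.push_back(node); }

    // Mark everything reachable from the roots, then free the rest.
    // Returns the number of objects that were freed.
    std::size_t collect() {
        for (Node* root : roots_) mark(root);

        std::size_t freed = 0;
        std::vector<std::unique_ptr<Node>> live;
        for (auto& object : objects_) {
            if (object->marked) {
                object->marked = false;  // clear the mark for the next cycle
                live.push_back(std::move(object));
            } else {
                ++freed;
            }
        }
        objects_ = std::move(live);  // unmarked objects are destroyed here
        return freed;
    }

    std::size_t size() const { return objects_.size(); }

private:
    static void mark(Node* root) {
        std::vector<Node*> pending{root};
        while (!pending.empty()) {
            Node* node = pending.back();
            pending.pop_back();
            if (node->marked) continue;  // already visited, also handles cycles
            node->marked = true;
            for (Node* child : node->refs) pending.push_back(child);
        }
    }

    std::vector<std::unique_ptr<Node>> objects_;
    std::vector<Node*> roots_;
};

std::size_t toy_gc_demo() {
    ToyHeap heap;
    Node* a = heap.allocate();
    Node* b = heap.allocate();
    Node* c = heap.allocate();
    Node* d = heap.allocate();
    a->refs.push_back(b);  // a -> b, reachable from the root
    c->refs.push_back(d);  // c <-> d form a cycle that nothing else points to
    d->refs.push_back(c);
    heap.add_root(a);

    const std::size_t freed = heap.collect();
    assert(heap.size() == 2);  // a and b survive
    return freed;              // c and d were collected
}
```

Note that the unreachable cycle between c and d is collected, which is one thing tracing collectors handle that plain reference counting does not.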

One reason why one hasn't been "added" to the language could be the sheer volume of existing code that would never use it, because that code already manages memory its own way. Another reason could be that the types of applications written in C and C++ don't need the overhead associated with a garbage collection process.

ChrisF
12

I don't have the exact quotes, but both Bjarne and Herb Sutter say something along these lines:

C++ doesn't need a garbage collector, because it has no garbage.

In modern C++ you use smart pointers and therefore have no garbage.

ronag
12

C was designed in an era when garbage collection was barely an option. It was also intended for uses where garbage collection would not generally work: bare-metal, real-time environments with minimal memory and minimal runtime support. Remember that C was the implementation language for the first Unix, which ran on a PDP-11 with 64K bytes of memory. C++ was originally an extension to C; the choice had already been made, and it's very hard to graft garbage collection onto an existing language. It's the kind of thing that has to be built in from the ground floor.

ddyer
8

You ask why these languages haven't been updated to include an optional garbage collector.

The problem with optional garbage collection is that you can't mix code that uses the different models. That is, if I write code that assumes you are using a garbage collector you can't use it in your program which has garbage collection turned off. If you do, it'll leak everywhere.

Winston Ewert
8

There are various issues, including...

  • Although GC was invented before C++, and possibly before C, both C and C++ were implemented before GCs were widely accepted as practical.
  • You can't easily implement a GC language and platform without an underlying non-GC language.
  • Although GC is demonstrably more efficient than non-GC for typical applications code developed in typical timescales etc, there are issues where more development effort is a good trade-off and specialized memory management will outperform a general-purpose GC. Besides, C++ is typically demonstrably more efficient than most GC languages even without any extra development effort.
  • GC is not universally safer than C++-style RAII. RAII allows resources other than memory to be automatically cleaned up, basically because it supports reliable and timely destructors. These cannot be combined with conventional GC methods because of issues with reference cycles.
  • GC languages have their own characteristic kinds of memory leaks, particularly relating to memory that will never be used again, but where existing references existed that have never been nulled out or overwritten. The need to do this explicitly is no different in principle than the need to delete or free explicitly. The GC approach still has an advantage - no dangling references - and static analysis can catch some cases, but again, there's no one perfect solution for all cases.
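The RAII point in the list above can be shown with a short sketch. The resource here is a made-up counter standing in for a file descriptor, lock or socket; HandlePool, ScopedHandle and the two functions are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a pool of non-memory resources
// (file descriptors, locks, sockets).
struct HandlePool {
    int open = 0;
};

// Acquires a handle in the constructor and releases it in the destructor.
class ScopedHandle {
public:
    explicit ScopedHandle(HandlePool& pool) : pool_(pool) { ++pool_.open; }
    ~ScopedHandle() { --pool_.open; }

    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;

private:
    HandlePool& pool_;
};

void work_that_fails(HandlePool& pool) {
    ScopedHandle handle(pool);
    assert(pool.open == 1);
    throw std::runtime_error("simulated failure");
}  // handle's destructor runs during stack unwinding

int raii_demo() {
    HandlePool pool;
    try {
        work_that_fails(pool);
    } catch (const std::runtime_error&) {
        // The handle has already been released by the time we get here.
    }
    return pool.open;
}
```

The release happens at a known point, even on the error path, which is the "reliable and timely" property that finalizers in a typical GC language do not give you.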

Basically, partly it's about the age of the languages, but there will always be a place for non-GC languages anyway - even if it is a bit of a nichey place. And seriously, in C++, the lack of GC isn't a big deal - your memory is managed differently, but it isn't unmanaged.

Microsoft's Managed C++ has at least some ability to mix GC and non-GC in the same application, allowing a mix-and-match of the advantages from each, but I don't have the experience to say how well this works in practice.

7

Can you imagine writing a device handler in a language with garbage collection? How many bits could come down the line while the GC was running?

Or an operating system? How could you start the garbage collection running before you even start the kernel?

C is designed for low-level, close-to-the-hardware tasks. The problem is that it is such a nice language that it's a good choice for many higher-level tasks as well. The language czars are aware of these uses, but they need to support the requirements of device drivers, embedded code and operating systems as a priority.

7

The short and boring answer to this question is that there needs to be a non-garbage collected language out there for the people that write the garbage collectors. It's not conceptually easy to have a language that at the same time allows for very precise control over the memory layout and has a GC running on top.

The other question is why C and C++ don't have garbage collectors. Well, I know C++ has a couple of them around, but they aren't really popular because they are forced to deal with a language that wasn't designed to be GC-ed in the first place, and the people who still use C++ in this age aren't really the kind who miss a GC.

Also, instead of adding GC to an old non-GC-ed language, it is actually easier to create a new language that has most of the same syntax while supporting a GC. Java and C# are good examples of this.

hugomg
5

Garbage collection is fundamentally incompatible with a systems language used for developing drivers for DMA-capable hardware.

It's entirely possible that the only pointer to an object would be stored in a hardware register in some peripheral. Since the garbage collector wouldn't know about this, it would think the object was unreachable and collect it.

This argument holds double for compacting GC. Even if you were careful to maintain in-memory references to objects used by hardware peripherals, when the GC relocated the object, it wouldn't know how to update the pointer contained in the peripheral config register.

So now you'd need a mixture of immobile DMA buffers and GC-managed objects, which means you have all the disadvantages of both.

Ben Voigt
3

Because C and C++ are relatively low-level, general-purpose languages, meant to run even on, for example, a 16-bit processor with 1MB of memory in an embedded system, which can't afford to waste memory on GC.

Petruza
2

There are garbage collectors in C++ and C. Not sure how this works in C, but in C++ you can leverage RTTI to dynamically discover your object graph and use that for garbage collection.

To my knowledge, you cannot write Java without a garbage collector. A little search turned up this.

The key difference between Java and C/C++ is that in C/C++ the choice is always yours, whereas in Java you're often left without options by design.

back2dos
2

It's a trade off between performance and safety.

There is no guarantee that your garbage will be collected in Java, so it may hang around using up space for a long time, and scanning for unreferenced objects (i.e. garbage) also takes longer than explicitly deleting or freeing an unused object.

The advantage is, of course, that one can build a language without pointers or without memory leaks, so one is more likely to produce correct code.

There can be a slight 'religious' edge to these debates sometimes - be warned!

2

Here is a list of inherent problems of GC, which make it unusable in a system language like C:

  • The GC has to run below the level of the code whose objects it manages. There is simply no such level in a kernel.

  • A GC has to stop the managed code from time to time. Now think about what would happen if it did that to your kernel. All processing on your machine would stop for, say, a millisecond, while the GC scans all existing memory allocations. This would kill all attempts to create systems that operate under strict real-time requirements.

  • A GC needs to be able to distinguish between pointers and non-pointers. That is, it must be able to look at every memory object in existence, and be able to produce a list of offsets where its pointers can be found.

    This discovery must be perfect: The GC must be able to chase all the pointers it discovers. If it dereferenced a false positive (a value that only looks like a pointer), it would likely crash. If it missed a real pointer (a false negative), it would likely destroy an object that's still in use, crashing the managed code or silently corrupting its data.

    This absolutely requires that type information is stored in every single object in existence. However, both C and C++ allow for plain old data objects which contain no type information.

  • GC is an inherently slow business. Programmers that have been socialized with Java may not realize this, but programs can be orders of magnitude faster when they are not implemented in Java. And one of the factors that make Java slow is GC. This is what precludes GCed languages like Java from being used in supercomputing. If your machine costs a million a year in power consumption, you don't want to pay even 10% of that for garbage collection.
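To illustrate the pointer-versus-non-pointer problem from the third point above, here is a small sketch showing that, without type information, a plain integer and a real pointer can be indistinguishable in memory. It assumes a mainstream platform where pointers and std::uintptr_t have the same size and representation (the static_assert guards the size assumption); Record and words_look_identical are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// A plain-old-data struct: no type tags, no vtable, just two words.
struct Record {
    std::uintptr_t id;  // an ordinary integer
    int* target;        // a real pointer
};

bool words_look_identical() {
    static int value = 42;

    Record record;
    record.target = &value;
    // An integer that merely happens to hold the same bits as the address.
    record.id = reinterpret_cast<std::uintptr_t>(&value);
    assert(*record.target == 42);

    // What a collector without type information sees: raw machine words.
    std::uintptr_t words[2];
    static_assert(sizeof(Record) == sizeof(words),
                  "sketch assumes two pointer-sized words with no padding");
    std::memcpy(words, &record, sizeof record);

    // Nothing in the bits says which word is the pointer.
    return words[0] == words[1];
}
```

On the platforms this sketch assumes, the function returns true: the two words are bit-for-bit equal, so a scanner has to either guess or be told by the language which one to follow.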

C and C++ are languages that are created to support all possible use cases. And, as you see, many of these use cases are precluded by garbage collection. So, in order to support these use cases, C/C++ cannot be garbage collected.