3

I think I generally know what the Garbage Collector in Java does, but it's praised a lot, so I thought maybe I'm missing something about its functionality.

What I know is that the GC takes care of erasing from memory any objects that have no references to them, and are thus unreachable by the programmer.

For example, if inside a loop I'm constantly creating a new Object(), the previous ones will eventually get deleted by the GC (correct me if I'm wrong).
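
A minimal sketch of that loop, assuming a hypothetical class name `AllocationLoop` and arbitrary sizes. Each iteration overwrites the only reference to the previous array, so at most one array is reachable at a time even though the loop requests about 650 MB in total. Whether and when the collector actually runs is up to the JVM.

```java
final class AllocationLoop {
    // Allocates `iterations` arrays one after another and returns the total
    // number of bytes requested. Only the most recent array stays reachable.
    static long allocateRepeatedly(int iterations, int bytesPerObject) {
        long totalRequested = 0;
        byte[] current = null;
        for (int i = 0; i < iterations; i++) {
            current = new byte[bytesPerObject]; // previous array becomes garbage
            current[0] = (byte) i;              // touch it so it is really used
            totalRequested += current.length;
        }
        return totalRequested;
    }

    public static void main(String[] args) {
        long total = allocateRepeatedly(10_000, 64 * 1024);
        System.out.println("Requested " + total + " bytes in total");
    }
}
```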

This is very useful for me as a Java programmer, and spares me a lot of the headaches that I assume C++ programmers have to deal with: manually deleting objects from memory (again, correct me if I'm wrong).

This feature is awesome, but is there anything more to the Garbage Collector? Am I missing something?

Kilian Foth
Aviv Cohn

5 Answers

12

For the visually minded, here's a parable by Kieron Briggs:

Picture a big empty room with a big furnace/incinerator type thing at one end. Hanging from the roof are a number of ropes, called Threads. Attached to the various threads are little sparkly Objects, and those Objects can have other Objects attached to them in turn by little rods called References. This creates a (hopefully) beautiful structure of Objects all attached (either directly or indirectly) to a Thread. You can even have Objects which link to more than one Thread.

When an Object isn't needed any more, the Reference to it disappears. If that leaves the Object (or a whole collection of Objects) unattached to any Thread, it will fall down onto the floor and shatter. This gradually builds up a layer of broken objects lying around the floor. In a language like C, eventually the floor would become so full that there was no room for more Objects, and Bad Things would happen.

But in Java, we have a Garbage Collector. This is a little dude in overalls that climbs down a special Staff Only Thread, sweeps up all the broken Objects on the floor, and shovels them into the incinerator. You never know exactly when he's going to come along, but he's always there keeping an eye on the mess on the floor to make sure that it doesn't fill up too much...

  • more formal description: mark-and-sweep algorithm

    ...each object in memory has a flag (typically a single bit) reserved for garbage collection use only. This flag is always cleared, except during the collection cycle. The first stage of collection does a tree traversal of the entire 'root set', marking each object that is pointed to as being 'in-use'. All objects that those objects point to, and so on, are marked as well, so that every object that is ultimately pointed to from the root set is marked. Finally, all memory is scanned from start to finish, examining all free or used blocks; those with the in-use flag still cleared are not reachable by any program or data, and their memory is freed. (For objects which are marked in-use, the in-use flag is cleared again, preparing for the next cycle.)
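
The two phases described above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not how the JVM's collectors are implemented: `ToyHeap`, `Cell`, and all method names are hypothetical, the "heap" is just a list of cells, and the root set is registered by hand.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ToyHeap {
    // A heap cell: the mark flag plus outgoing references to other cells.
    static final class Cell {
        final String name;
        final List<Cell> refs = new ArrayList<>();
        boolean marked; // cleared except during a collection cycle

        Cell(String name) { this.name = name; }
    }

    private final List<Cell> cells = new ArrayList<>(); // "all memory"
    private final List<Cell> roots = new ArrayList<>(); // the root set

    Cell allocate(String name) {
        Cell cell = new Cell(name);
        cells.add(cell);
        return cell;
    }

    void addRoot(Cell cell) { roots.add(cell); }

    int size() { return cells.size(); }

    // Mark phase: flag everything reachable from a root. The marked check
    // stops the traversal from looping forever on cyclic references.
    private void mark(Cell cell) {
        if (cell.marked) return;
        cell.marked = true;
        for (Cell ref : cell.refs) mark(ref);
    }

    // Runs one full cycle and returns how many cells were freed.
    int collect() {
        for (Cell root : roots) mark(root);
        int freed = 0;
        // Sweep phase: scan every cell, free the unmarked, reset the rest.
        for (Iterator<Cell> it = cells.iterator(); it.hasNext(); ) {
            Cell cell = it.next();
            if (cell.marked) {
                cell.marked = false;
            } else {
                it.remove();
                freed++;
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        Cell a = heap.allocate("a");
        Cell b = heap.allocate("b");
        Cell c = heap.allocate("c");
        Cell d = heap.allocate("d");
        heap.addRoot(a);
        a.refs.add(b);   // a -> b is reachable from the root
        c.refs.add(d);   // c <-> d is an unreachable cycle
        d.refs.add(c);
        // prints "freed 2, live 2"
        System.out.println("freed " + heap.collect() + ", live " + heap.size());
    }
}
```

Note that the unreachable cycle `c <-> d` is freed even though each cell is still referenced by the other: reachability from the roots is what counts, not whether any reference exists. The recursive `mark` keeps the sketch short; a deep object graph would need an explicit work list instead.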

https://i.sstatic.net/7Meuu.jpg

gnat
7

No, you're not missing anything about garbage collection.

Garbage collection frees developers from the minutiae of allocating and de-allocating memory themselves.

Because it is part of the platform, there is a whole class of problems that Java simply does not have (and neither do .NET and other garbage-collected runtimes).

What you may be missing is just how much trouble it really is to deal with memory allocation and de-allocation yourself, and with the bugs that come from not getting them right ;)

Oded
3

You're not missing anything. However, I would like to point out something that I don't think any of the other answers have addressed clearly. Garbage collection is about program correctness and security first and foremost; convenience is secondary (although it is very convenient).

If the programmer manages memory manually, making a mistake means corrupting the program's state. The program keeps going, but execution has gone off the rails and no longer has any meaning. It could fail catastrophically right away, but it will more likely keep going in a way that seems correct yet is very subtly wrong, possibly for days or weeks before anyone notices the problem. Even worse, memory bugs can be exploited by attackers. If the buggy program runs with administrator privileges and the bug allows an attacker to run arbitrary code, the potential for damage is almost unlimited. This sort of bug is very hard to detect and impossible to avoid; sooner or later you will introduce a memory bug, because we're all human. A language that outright forbids manual memory management is guaranteed to be free of this type of bug (though memory bugs are definitely not the only way to introduce a security vulnerability).

Lack of garbage collection generally precludes complicated object structures. For the most part you're limited to creating objects that either always exist or have a single owner that's responsible for deleting them. That rules out, for example, persistent data structures, because any given node could be shared by many objects and there's no reliable way of managing that. Using smart pointers with reference counting is not a solution: their performance is bad, and cyclic references (two objects pointing to each other) won't be freed. You can introduce weak references to avoid the latter problem, but now you've made manual memory management even harder to get right (and the performance is still bad). Garbage collection enables these complicated data structures to be cleaned up correctly and in an efficient manner.
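
To make the "shared node" point concrete, here is a minimal sketch of a persistent (immutable) singly linked list. The class `PersistentList` and its methods are hypothetical names chosen for illustration. Prepending creates a new head and reuses the existing tail, so several lists end up sharing nodes with no single owner; under garbage collection a shared node is reclaimed only once no list can reach it any more.

```java
final class PersistentList<T> {
    final T head;
    final PersistentList<T> tail; // null marks the end of the list

    private PersistentList(T head, PersistentList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    // Returns a new list with `value` in front; `rest` is shared, not copied.
    static <T> PersistentList<T> cons(T value, PersistentList<T> rest) {
        return new PersistentList<>(value, rest);
    }

    static <T> int length(PersistentList<T> list) {
        int n = 0;
        for (PersistentList<T> p = list; p != null; p = p.tail) n++;
        return n;
    }

    public static void main(String[] args) {
        PersistentList<String> shared = cons("b", cons("c", null));
        PersistentList<String> first = cons("a", shared);
        PersistentList<String> second = cons("x", shared);
        // Both lists point at the very same tail nodes.
        System.out.println(first.tail == second.tail); // prints "true"
        System.out.println(length(first) + " " + length(second)); // prints "3 3"
    }
}
```

Dropping `first` must not free the `b` and `c` nodes while `second` still uses them, and nothing in the code has to track that.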

Ironically, being deprived of persistent data structures makes the program harder to reason about, because changes to the data structure are destructive and now you have to be very careful about who has references to it and who's making changes to it. So on top of memory bugs, you have to worry about additional state/aliasing bugs as well.

Doval
0

An analogy similar to the one in gnat's answer is to imagine someone sitting at a desk in a paper-based office. The person performs tasks using paper documents (objects) and needs to have these bits of paper on their desk. When they have finished and no longer need to reference a document, they throw it in the bin.

The desk and the bin have a finite capacity (memory). The GC is the janitor who comes along occasionally and empties the bin so that capacity isn't exceeded*.

*Of course you can always run out of memory by putting too many documents on your desk. There's nothing the janitor can do about this.

Qwerky
0

There is one more thing you can do with the GC besides simply offloading your memory management to it: weak references, which can be used to offload your cache management onto the GC as well.

Weakly referenced objects can still be reached and used until they are collected, but you're telling the GC that it's OK to get rid of them at any moment, and that you don't really mind losing them.
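
A minimal sketch of such a cache, built on `java.lang.ref.WeakReference`. The class `WeakCache`, its `loader` function, and the `loads` counter are hypothetical names for illustration. A value stays cached only as long as something else holds a strong reference to it or the collector has not yet cleared it; after that it is simply rebuilt on the next request.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class WeakCache<K, V> {
    private final Map<K, WeakReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;
    int loads; // how many times the loader ran, for illustration

    WeakCache(Function<K, V> loader) { this.loader = loader; }

    V get(K key) {
        WeakReference<V> ref = entries.get(key);
        V value = (ref == null) ? null : ref.get(); // null once collected
        if (value == null) {
            value = loader.apply(key); // rebuild the lost (or missing) entry
            loads++;
            entries.put(key, new WeakReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        WeakCache<Integer, byte[]> cache = new WeakCache<>(size -> new byte[size]);
        byte[] held = cache.get(1024);      // loaded; we hold a strong reference
        byte[] again = cache.get(1024);     // still strongly reachable: cache hit
        System.out.println(held == again);  // prints "true"
        System.out.println(cache.loads);    // prints "1"
    }
}
```

The sketch never removes map entries whose values have been collected, so the keys themselves can pile up; a real cache would need to purge them. For memory-sensitive caches the JDK also offers `SoftReference`, which the collector clears less eagerly than a weak reference.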

SK-logic