89

As a long-time C# programmer, I have recently come to learn more about the advantages of Resource Acquisition Is Initialization (RAII). In particular, I have discovered that the C# idiom:

using (var dbConn = new DbConnection(connStr)) {
    // do stuff with dbConn
}

has the C++ equivalent:

{
    DbConnection dbConn(connStr);
    // do stuff with dbConn
}

meaning that remembering to enclose the use of resources like DbConnection in a using block is unnecessary in C++! This seems to be a major advantage of C++. This is even more convincing when you consider a class that has an instance member of type DbConnection, for example:

class Foo {
    DbConnection dbConn;

    // ...
};

In C#, I would need to have Foo implement IDisposable, like so:

class Foo : IDisposable {
    DbConnection dbConn;

    public void Dispose()
    {
        dbConn.Dispose();
    }
}

and what's worse, every user of Foo would need to remember to enclose Foo in a using block, like:

using (var foo = new Foo()) {
    // do stuff with "foo"
}

Now looking at C# and its Java roots I am wondering... did the developers of Java fully appreciate what they were giving up when they abandoned the stack in favor of the heap, thus abandoning RAII?

(Similarly, did Stroustrup fully appreciate the significance of RAII?)

JoelFan
  • 7,121

11 Answers

62

Yes, the designers of C# (and, I'm sure, Java) specifically decided against deterministic finalization. I asked Anders Hejlsberg about this multiple times circa 1999-2002.

First, the idea of different semantics for an object based on whether it's stack- or heap-based is certainly counter to the unifying design goal of both languages, which was to relieve programmers of exactly such issues.

Second, even if you acknowledge that there are advantages, there are significant implementation complexities and inefficiencies involved in the book-keeping. You can't really put stack-like objects on the stack in a managed language. You are left with saying "stack-like semantics" and committing to significant work (value types are already hard enough; think about an object that is an instance of a complex class, with references coming in and going back into managed memory).

Because of that, you don't want deterministic finalization on every object in a programming system where "(almost) everything is an object." So you do have to introduce some kind of programmer-controlled syntax to separate a normally-tracked object from one that has deterministic finalization.

In C#, you have the using keyword, which came in fairly late in the design of what became C# 1.0. The whole IDisposable thing is pretty wretched, and one wonders whether it would have been more elegant to have using work with the C++ destructor syntax ~, marking those classes to which the boiler-plate IDisposable pattern could be automatically applied.
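
For reference, the boiler-plate in question looks roughly like the standard Dispose pattern sketched below (not taken from any particular codebase; the IntPtr field is just a stand-in for whatever unmanaged state a class might own):

using System;

class Foo : IDisposable {
    IntPtr nativeHandle;      // stand-in for some unmanaged resource
    bool disposed;

    public void Dispose() {
        Dispose(true);
        GC.SuppressFinalize(this);    // the finalizer below is no longer needed
    }

    protected virtual void Dispose(bool disposing) {
        if (disposed) return;
        if (disposing) {
            // release other managed IDisposable members here
        }
        // release nativeHandle (unmanaged state) here
        disposed = true;
    }

    ~Foo() {                          // C#'s ~ syntax: a finalizer run by the GC, not a destructor
        Dispose(false);
    }
}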

Stephen C
  • 25,388
Larry OBrien
  • 5,037
43

Keep in mind that Java was developed in 1991-1995 when C++ was a much different language. Exceptions (which made RAII necessary) and templates (which made it easier to implement smart pointers) were "new-fangled" features. Most C++ programmers had come from C and were used to doing manual memory management.

So I doubt that Java's developers deliberately decided to abandon RAII. It was, however, a deliberate decision for Java to prefer reference semantics instead of value semantics. Deterministic destruction is difficult to implement in a reference-semantics language.

So why use reference semantics instead of value semantics?

Because it makes the language a lot simpler.

  • There is no need for a syntactic distinction between Foo and Foo* or between foo.bar and foo->bar.
  • There is no need for overloaded assignment, when all assignment does is copy a pointer.
  • There is no need for copy constructors. (There is occasionally a need for an explicit copy function like clone(). Many objects just don't need to be copied. For example, immutables don't.)
  • There is no need to declare private copy constructors and operator= to make a class noncopyable. If you don't want objects of a class copied, you just don't write a function to copy it.
  • There is no need for swap functions. (Unless you're writing a sort routine.)
  • There is no need for C++0x-style rvalue references.
  • There is no need for (N)RVO.
  • There is no slicing problem.
  • It's easier for the compiler to determine object layouts, because references have a fixed size.

The main downside to reference semantics is that when every object potentially has multiple references to it, it becomes hard to know when to delete it. You pretty much have to have automatic memory management.

Java chose to use a non-deterministic garbage collector.

Can't GC be deterministic?

Yes, it can. For example, the C implementation of Python uses reference counting, and later added a tracing GC to handle the cyclic garbage where refcounts fail.

But refcounting is horribly inefficient. Lots of CPU cycles spent updating the counts. Even worse in a multi-threaded environment (like the kind Java was designed for) where those updates need to be synchronized. Much better to use the null garbage collector until you need to switch to another one.

You could say that Java chose to optimize the common case (memory) at the expense of non-fungible resources like files and sockets. Today, in light of the adoption of RAII in C++, this may seem like the wrong choice. But remember that much of the target audience for Java was C (or "C with classes") programmers who were used to explicitly closing these things.

But what about C++/CLI "stack objects"?

They're just syntactic sugar for Dispose, much like C# using. However, they don't solve the general problem of deterministic destruction, because you can create an anonymous gcnew FileStream("filename.ext"), and C++/CLI won't auto-Dispose it.
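
For what it's worth, the C# using statement is specified to expand to roughly the following try/finally (shown here with the question's hypothetical DbConnection class), which is the sense in which both it and the C++/CLI stack syntax are sugar for Dispose rather than real destruction:

{
    DbConnection dbConn = new DbConnection(connStr);
    try {
        // do stuff with dbConn
    }
    finally {
        if (dbConn != null)
            ((IDisposable)dbConn).Dispose();
    }
}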

Glorfindel
  • 3,167
dan04
  • 3,957
40

Now looking at C# and its Java roots I am wondering... did the developers of Java fully appreciate what they were giving up when they abandoned the stack in favor of the heap, thus abandoning RAII?

(Similarly, did Stroustrup fully appreciate the significance of RAII?)

I am pretty sure Gosling did not get the significance of RAII at the time he designed Java. In his interviews he often talked about reasons for leaving out generics and operator overloading, but never mentioned deterministic destructors and RAII.

Funnily enough, even Stroustrup wasn't aware of the importance of deterministic destructors at the time he designed them. I can't find the quote, but if you are really into it, you can find it among his interviews here: http://www.stroustrup.com/interviews.html

19

Java 7 introduced something similar to the C# using: The try-with-resources Statement

a try statement that declares one or more resources. A resource is an object that must be closed after the program is finished with it. The try-with-resources statement ensures that each resource is closed at the end of the statement. Any object that implements java.lang.AutoCloseable, which includes all objects which implement java.io.Closeable, can be used as a resource...

So I guess they either didn't consciously choose not to implement RAII, or they changed their minds in the meantime.

gnat
  • 20,543
Patrick
  • 1,873
19

Java intentionally does not have stack-based objects (a.k.a. value objects). Those are necessary for an object to be automatically destroyed at the end of a method like that.

Because of this, and the fact that Java is garbage-collected, deterministic finalization is more-or-less impossible (e.g., what if my "local" object became referenced somewhere else? Then, when the method ends, we don't want it destroyed).

However, this is fine with most of us, because there's almost never a need for deterministic finalization, except when interacting with native (C++) resources!


Why does Java not have stack-based objects?

(Other than primitives...)

Because stack-based objects have different semantics than heap-based references. Imagine the following code in C++; what does it do?

return myObject;
  • If myObject is a local stack-based object, the copy-constructor is called (if the result is assigned to something).
  • If myObject is a local stack-based object and we're returning a reference, the result is undefined.
  • If myObject is a member/global object, the copy-constructor is called (if the result is assigned to something).
  • If myObject is a member/global object and we're returning a reference, the reference is returned.
  • If myObject is a pointer to a local stack-based object, the result is undefined.
  • If myObject is a pointer to a member/global object, that pointer is returned.
  • If myObject is a pointer to a heap-based object, that pointer is returned.

Now what does the same code do in Java?

return myObject;
  • The reference to myObject is returned. It doesn't matter if the variable is local, member, or global; and there are no stack-based objects or pointer cases to worry about.

The above shows why stack-based objects are a very common cause of programming errors in C++. Because of that, the Java designers took them out; and without them, there is no point in using RAII in Java.
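
Incidentally, C# kept both kinds of semantics, which makes the difference easy to see; a minimal sketch with made-up PointValue/PointRef types:

struct PointValue { public int X; }   // value type: assignment copies the whole value
class  PointRef   { public int X; }   // reference type: assignment copies only the reference

class Demo {
    static void Main() {
        var v1 = new PointValue { X = 1 };
        var v2 = v1;       // independent copy
        v2.X = 99;         // v1.X is still 1

        var r1 = new PointRef { X = 1 };
        var r2 = r1;       // same object, two names
        r2.X = 99;         // r1.X is now 99
    }
}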

17

Your description of the holes in using is incomplete. Consider the following problem:

interface Bar {
    // ...
}
class Foo : Bar, IDisposable {
    // ... (implements Dispose(), among other things)
}

Bar b = new Foo();

// Where's the Dispose?

In my opinion, not having both RAII and GC was a bad idea. When it comes to closing files in Java, it's malloc() and free() all over again.
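
To spell out the hole: given the Foo/Bar above, the best a caller can do is a runtime check, something like this sketch:

Bar b = new Foo();
try {
    // do stuff with b
}
finally {
    IDisposable d = b as IDisposable;
    if (d != null)
        d.Dispose();       // nothing in Bar's contract even hints this is needed
}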

DeadMG
  • 36,914
15

I'm pretty old. I've been there and seen it and banged my head about it many times.

I was at a conference in Hursley Park where the IBM boys were telling us how wonderful this brand-new Java language was, when someone asked: why isn't there a destructor for these objects? He didn't mean the thing we know as a destructor in C++, but there was no finaliser either (or it had finalisers, but they basically didn't work). This was way back, and we decided Java was a bit of a toy language at that point.

Now they added finalisers to the language spec and Java saw some adoption.

Of course, later everyone was told not to put finalisers on their objects because they slowed the GC down tremendously (it had to not only lock the heap but move the to-be-finalised objects to a temporary area, since finaliser methods could not be called while the GC had the app paused; instead they would be called immediately before the next GC cycle). And worse, sometimes the finaliser would never get called at all when the app was shutting down. Imagine never having your file handle closed.

Then we had C#, and I remember the discussion forum on MSDN where we were told how wonderful this new C# language was. Someone asked why there was no deterministic finalisation, and the MS boys told us how we didn't need such things, then told us we needed to change our way of designing apps, then told us how amazing GC was and how all our old apps were rubbish and never worked because of all the circular references. Then they caved in to pressure and told us they'd added this IDisposable pattern to the spec that we could use. I thought it was pretty much back to manual memory management for us in C# apps at that point.

Of course, the MS boys later discovered that all they'd told us was... well, they made IDisposable a bit more than just a standard interface, and later added the using statement. W00t! They realised that deterministic finalisation was something missing from the language after all. Of course, you still have to remember to put it in everywhere, so it's still a bit manual, but it's better.

So why did they do it when they could have had using-style semantics automatically placed on each scope block from the start? Probably efficiency, but I like to think that they just didn't realise. Just as they eventually realised you still need smart pointers in .NET (google SafeHandle), they thought that the GC really would solve all problems. They forgot that an object is more than just memory, and that GC is primarily designed to handle memory management. They got caught up in the idea that the GC would handle this, and forgot that you put other stuff in there: an object isn't just a blob of memory that doesn't matter if you don't delete it for a while.
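
For the curious, the "smart pointer" being referred to is used roughly as sketched below; the NativeFileHandle wrapper and the omitted native close call are made up for illustration:

using System;
using System.Runtime.InteropServices;

// SafeHandle gives the GC a reliable finalizer as a backstop,
// while Dispose()/using still releases the handle deterministically.
sealed class NativeFileHandle : SafeHandle {
    public NativeFileHandle() : base(IntPtr.Zero, true) { }

    public override bool IsInvalid {
        get { return handle == IntPtr.Zero; }
    }

    protected override bool ReleaseHandle() {
        // a real wrapper would call the native close function here
        return true;
    }
}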

But I also think that the lack of a finalise method in the original Java had a bit more to it - that the objects you created were all about memory, and if you wanted to delete something else (like a DB handle or a socket or whatever) then you were expected to do it manually.

Remember, Java was designed for embedded environments where people were used to writing C code with lots of manual allocations, so not having automatic free wasn't much of a problem - they'd never had it before, so why would they need it in Java? The issue wasn't anything to do with threads, or stack vs. heap; the GC was probably just there to make memory allocation (and therefore de-allocation) a bit easier. All in all, the try/finally statement is probably a better place to handle non-memory resources.

So IMHO, the way .NET simply copied Java's biggest flaw is its biggest weakness. .NET should have been a better C++, not a better Java.

gbjbaanb
  • 48,749
11

Bruce Eckel, author of "Thinking in Java" and "Thinking in C++" and a member of the C++ Standards Committee, is of the opinion that, in many areas (not just RAII), Gosling and the Java team didn't do their homework.

...To understand how the language can be both unpleasant and complicated, and well designed at the same time, you must keep in mind the primary design decision upon which everything in C++ hung: compatibility with C. Stroustrup decided -- and correctly so, it would appear -- that the way to get the masses of C programmers to move to objects was to make the move transparent: to allow them to compile their C code unchanged under C++. This was a huge constraint, and has always been C++'s greatest strength ... and its bane. It's what made C++ as successful as it was, and as complex as it is.

It also fooled the Java designers who didn't understand C++ well enough. For example, they thought operator overloading was too hard for programmers to use properly. Which is basically true in C++, because C++ has both stack allocation and heap allocation and you must overload your operators to handle all situations and not cause memory leaks. Difficult indeed. Java, however, has a single storage allocation mechanism and a garbage collector, which makes operator overloading trivial -- as was shown in C# (but had already been shown in Python, which predated Java). But for many years, the party line from the Java team was "Operator overloading is too complicated." This and many other decisions where someone clearly didn't do their homework is why I have a reputation for disdaining many of the choices made by Gosling and the Java team.

There are plenty of other examples. Primitives "had to be included for efficiency." The right answer is to stay true to "everything is an object" and provide a trap door to do lower-level activities when efficiency was required (this would also have allowed for the hotspot technologies to transparently make things more efficient, as they eventually would have). Oh, and the fact that you can't use the floating point processor directly to calculate transcendental functions (it's done in software instead). I've written about issues like this as much as I can stand, and the answer I hear has always been some tautological reply to the effect that "this is the Java way."

When I wrote about how badly generics were designed, I got the same response, along with "we must be backwards compatible with previous (bad) decisions made in Java." Lately more and more people have gained enough experience with Generics to see that they really are very hard to use -- indeed, C++ templates are much more powerful and consistent (and much easier to use now that compiler error messages are tolerable). People have even been taking reification seriously -- something that would be helpful but won't put that much of a dent in a design that is crippled by self-imposed constraints.

The list goes on to the point where it's just tedious...

gnat
  • 20,543
Gnawme
  • 1,333
11

The best reason is much simpler than most of the answers here.

You can't pass stack allocated objects to other threads.

Stop and think about that. Keep thinking... Now, C++ didn't have threads when everyone got so keen on RAII. Even Erlang (separate heaps per thread) gets icky when you pass too many objects around. C++ only got a memory model in C++11; now you can almost reason about concurrency in C++ without having to refer to your compiler's "documentation".

Java was designed from (almost) day one for multiple threads.

I've still got my old copy of "The C++ Programming Language" where Stroustrup assures me I won't need threads.

The second painful reason is to avoid slicing.

5

In C++, you use more general-purpose, lower-level language features (destructors automatically called on stack-based objects) to implement a higher-level one (RAII), and this approach is something the C# / Java folks seem not to be too fond of. They'd rather design specific high-level tools for specific needs, and provide them to the programmers ready-made, built into the language. The problem with such specific tools is that they are often impossible to customize (in part that's what makes them so easy to learn). When building from smaller blocks, a better solution may come around with time, while if you only have high-level, built-in constructs, this is less likely.

So yeah, I think (I wasn't actually there...) it was a conscious decision, with the goal of making the languages easier to pick up, but in my opinion it was a bad decision. Then again, I generally prefer the C++ give-the-programmers-a-chance-to-roll-their-own philosophy, so I'm a bit biased.

imre
  • 151
-2

You already called out the rough equivalent to this in C# with the Dispose method. Java also has finalize. NOTE: I realize that Java's finalize is non-deterministic and different from Dispose; I am just pointing out that they both have a method of cleaning up resources alongside the GC.

If anything, C++ becomes more of a pain, though, because an object has to be physically destroyed. In higher-level languages like C# and Java we depend on a garbage collector to clean it up when there are no longer references to it. There is no guarantee that a DBConnection object in C++ doesn't have rogue references or pointers to it.

Yes, the C++ code can be more intuitive to read, but it can be a nightmare to debug, because the boundaries and limitations that languages like Java put in place rule out some of the more aggravating and difficult bugs, as well as protecting other developers from common rookie mistakes.

Perhaps it comes down to preference: some like the low-level power, control, and purity of C++, whereas others like myself prefer a more sandboxed language that is much more explicit.

maple_shaft
  • 26,570