Why did the team at LMAX use Java and design the architecture to avoid GC at all cost?

Question

Why did the team at LMAX write the LMAX Disruptor in Java when their whole design points to minimizing GC use? If one does not want the GC to run, why use a garbage-collected language?

Their optimizations, their level of hardware knowledge, and the thought they put into it are just awesome, but why Java?

I'm not against Java or anything, but why a GC language? Why not use something like D, or any other language that has no GC but still allows efficient code? Is it that the team is most familiar with Java, or does Java possess some unique advantage that I am not seeing?

Say they developed it in D with manual memory management: what would be the difference? They would have to think low level (which they already do), but they could squeeze the best performance out of the system, as it's native.

score 19 · Accepted Answer · edited Jun 12 '14 at 17:00

Because there is a huge difference between optimizing performance and completely turning off a safety feature.

By reducing the number of garbage collections, their framework is more responsive and can (presumably) run faster. Now, optimizing for the garbage collector doesn't mean they never do a garbage collection. It just means they do it less often, and when they do, it runs really fast. Those kinds of optimization include:

Minimizing the number of objects that move to a survivor space (i.e. that survived at least one garbage collection) by using small throw-away objects. Objects that have moved to the survivor space are harder to collect, and a garbage collection there sometimes implies freezing the whole JVM.
Not allocating too many objects to begin with. This can backfire if you're not careful, as young-generation objects are super cheap to allocate and collect.
Ensuring that new objects point to old ones (and not the other way around) so that young objects are easy to collect, since no reference to them will cause them to be kept.

When you tune for performance, you usually tune a few very specific "hot spots" while ignoring code that doesn't run often. If you do that in Java, you can still let the garbage collector take care of those dark corners (since it won't make much difference) while optimizing very carefully the areas that run in a tight loop. So you can choose where you optimize and where you don't, and you can thus focus your effort where it matters.
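To make the hot spot / dark corner split concrete, here is a minimal sketch. It is not code from the Disruptor; `PriceAverager` and its methods are hypothetical names made up for illustration. The hot method reuses one preallocated array and creates no garbage, while the rarely called report method allocates strings freely and leaves the cleanup to the GC.

```java
// Hypothetical example: names are illustrative, not taken from the Disruptor.
final class PriceAverager {
    // Allocated once and reused on every call, so the hot path creates no garbage.
    private final long[] window;
    private int next;     // slot that will be overwritten next
    private int filled;   // number of valid samples currently in the window
    private long sum;     // running sum of the samples in the window

    PriceAverager(int windowSize) {
        this.window = new long[windowSize];
    }

    // Hot path: called in a tight loop; primitive arithmetic only, no allocation.
    long onPrice(long price) {
        if (filled == window.length) {
            sum -= window[next];   // drop the oldest sample
        } else {
            filled++;
        }
        window[next] = price;
        sum += price;
        next = (next + 1) % window.length;
        return sum / filled;
    }

    // Cold path: called rarely, so it allocates freely and lets the GC clean up.
    String dailyReport() {
        return "samples=" + filled + ", average=" + (filled == 0 ? 0 : sum / filled);
    }
}
```

In a language without a GC, both methods would need the same care about ownership; here only `onPrice` does.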

Now, if you turn off garbage collection completely, then you can't choose. You must manually dispose of every object, always. That method gets called at most once per day? In Java, you can let it be, as its performance impact is negligible (it may be OK to let a full GC occur every month). In C++, you are still leaking resources, so you must take care of even that obscure method. So you must pay the price for resource management in every single part of your application, while in Java you can focus.

But it gets worse.

What if you have a bug, let's say in a dark corner of your application that is only accessed on a Monday with a full moon? Java has strong safety guarantees. There is little to no "undefined behavior". If you use something wrong, an exception is thrown, your program stops, and no data corruption occurs. So you can be pretty sure that nothing wrong can happen without you noticing.

But in something like D, you can have a bad pointer access or a buffer overflow, and you can corrupt your memory, but your program won't know (you turned the safety off, remember?) and will keep running with its incorrect data, and do some pretty nasty things and corrupt your data, and you don't know, and as more corruption happens, your data gets more and more wrong, and then suddenly it breaks, and it was in a life-critical application, and some error happened in the computation for a rocket, and so it doesn't work, and the rocket explodes, and someone dies, and your company is on the front page of every newspaper and your boss points his finger at you saying "You are the engineer who suggested we use D to optimize performance, how come you didn't think of safety?". And it is your fault. You killed those people with your foolish attempt at performance.

OK, OK, most of the time it is much less dramatic than that. But even a business-critical application, or just a GPS app, or, let's say, a government healthcare website can suffer some pretty negative consequences if you have bugs. Using a language that either prevents them completely or fails fast when they happen is usually a very good idea.
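As a small illustration of the fail-fast behaviour described above, here is a sketch (`FailFastDemo` is a made-up name) of what happens in Java when code writes one element past the end of an array. The JVM checks the index and throws before anything is written, so the neighbouring memory is never touched.

```java
// Hypothetical demo class, written only to illustrate bounds checking.
final class FailFastDemo {
    // Returns true if the out-of-bounds write was rejected with an exception.
    static boolean writePastEnd(int[] buffer) {
        try {
            buffer[buffer.length] = 42;   // one element past the end
            return false;                 // not reached: the JVM checks the index first
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] buffer = new int[4];
        System.out.println("rejected: " + writePastEnd(buffer));   // prints "rejected: true"
    }
}
```

With unchecked pointer arithmetic in a native language, the same mistake is undefined behaviour: it may crash, or it may silently overwrite whatever happens to sit next to the buffer.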

There is a cost to turning off a safety feature. Going native doesn't always make sense. Sometimes it is much simpler and safer to optimize a safe language a bit than to go all in on a language where you can shoot yourself in the foot big-time. Correctness and safety in a lot of cases trump the few nanoseconds you would have scraped off by eliminating the GC completely. The Disruptor can be used in those situations, so I think LMAX-Exchange made the right call.

But what about D in particular? You do have a GC if you want one for the dark corners, and the SafeD subset (which I didn't know of before the edit) removes undefined behavior (if you remember to use it!).

Well, in that case it's simply a question of maturity. The Java ecosystem is full of well-written tools and mature libraries (better for development). Many more developers know Java than D (better for maintenance). Going for a new and not-so-popular language for something as critical as a financial application would not have been a good idea. With a less-known language, if you have a problem, few people can help you, and the libraries you find tend to have more bugs since they have been exposed to fewer people.

So my last point still holds: if you want to avoid problems with dire consequences, stick with safe choices. At this point in the life of D, its customers are the little start-ups ready to take crazy risks. If a problem can cost millions, you are better off staying further along the innovation bell curve.

score 4 · Answer 2 · answered Dec 24 '13 at 16:32

It seems the reason it's written in Java is that they have Java expertise in-house and it was probably written (although it's still in active development) before C++ got its act together with C++0x/11.

Their code is really Java in name only: they use sun.misc.Unsafe quite a bit, which kind of defeats the point of Java and the safety it supposedly gives. I have written a C++ port of the Disruptor and it outperforms the Java code they ship (I did not spend a lot of time tuning the JVM).

That said, the principles that the Disruptor follows are not language-specific, e.g. don't expect low latency from C++ code that allocates or frees from the heap.

score 4 · Answer 3 · answered Jun 23 '16 at 16:23

This question states an incorrect premise as fact, then makes an argument about that incorrect premise.

Let's dig into this: "all their design points to minimizing GC use" simply isn't true. The innovation in the Disruptor has little to do with GC. The Disruptor performs well because its design cleverly considers how modern computers work, something that's much less common than one might expect. See Cliff Click's talk http://www.azulsystems.com/events/javaone_2009/session/2009_J1_HardwareCrashCourse.pdf for a discussion.

It's well known that LMAX are customers of Azul. I know firsthand that with Azul, GCs are simply a non-issue, even with heaps of 175 GB.
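One small example of that kind of hardware-aware thinking, as a hedged sketch rather than the Disruptor's actual source: a ring whose size is a power of two can map an ever-increasing sequence number to a slot with a single bitwise AND instead of a division. `RingIndex` and `slotFor` are names I made up for illustration.

```java
// Hypothetical helper, not the Disruptor's real class.
final class RingIndex {
    private final int mask;

    RingIndex(int size) {
        // A power of two has exactly one bit set, so size - 1 is an all-ones mask.
        if (size <= 0 || Integer.bitCount(size) != 1) {
            throw new IllegalArgumentException("size must be a power of two");
        }
        this.mask = size - 1;
    }

    // Maps a sequence number to a slot; equivalent to sequence % size for sequence >= 0.
    int slotFor(long sequence) {
        return (int) (sequence & mask);
    }
}
```

For a ring of size 8, sequences 0 to 7 map to slots 0 to 7, and sequence 8 wraps back to slot 0. None of this has anything to do with garbage collection, which is the point.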

gnat · Answer 4 · 2013-12-23T15:56:04.773

They would have to think low level

The above makes up half of the answer you're looking for. You can find the other half, completing the reasoning, no farther than the LMAX blog:

While very efficient, it can lead to a number of errors as it is very easy to screw up...

As the LMAX developers admit, code like that can be quite difficult to develop, understand and debug, even in Java. Going to an even lower level than where they are now would only exacerbate this problem, as pointed out in the Wikipedia article on low-level programming languages:

A program written in a low-level language can be made to run very quickly, and with a very small memory footprint; an equivalent program in a high-level language will be more heavyweight. Low-level languages are simple, but are considered difficult to use, due to the numerous technical details which must be remembered.

By comparison, a high-level programming language isolates the execution semantics of a computer architecture from the specification of the program, which simplifies development...

score 3 · Answer 5 · answered Jun 12 '14 at 16:47

If you use Java as a syntax language and avoid its JDK libraries it can be as fast as a compiled non-GC language. GC is not suitable for real-time systems, but it is possible to develop systems in Java that do not leave any garbage behind. As a result the GC never triggers.

We believe that the Java language and platform have many advantages over C/C++ and we have developed and benchmarked some ultra-low-latency Java components to prove it. We talk about the techniques to do so in this article: Java Development without GC.
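A minimal sketch of what "leaving no garbage behind" can look like, assuming a single producer thread and ignoring consumers entirely. `TradeEvent` and `EventRing` are hypothetical names, not classes from any real library. Every event object is allocated once at startup, and publishing overwrites the fields of an existing object instead of creating a new one.

```java
// Hypothetical classes for illustration; single-threaded, no consumer coordination.
final class TradeEvent {
    long orderId;
    long price;
}

final class EventRing {
    private final TradeEvent[] slots;
    private long nextSequence;

    EventRing(int size) {
        slots = new TradeEvent[size];
        for (int i = 0; i < size; i++) {
            slots[i] = new TradeEvent();   // all allocation happens here, up front
        }
    }

    // Overwrites an existing slot in place; nothing is allocated per event.
    TradeEvent publish(long orderId, long price) {
        TradeEvent e = slots[(int) (nextSequence++ % slots.length)];
        e.orderId = orderId;
        e.price = price;
        return e;
    }
}
```

Once the ring has wrapped around, `publish` hands back the same objects again and again, so in steady state the collector has nothing new to look at.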

score 1 · Answer 6 · answered Dec 23 '13 at 15:59

The LMAX Disruptor is a high-performance inter-thread messaging library.

To be useful, someone else has to write the code that gets each thread to do useful work. Given that this code is most likely to be in Java or C#, there are very few choices of language that interface well with them.

Using C or C++ is not a good option unless you wish to limit your users to a single OS, as there is no threading model defined in them.

Java is the standard for a lot of software development these days, so unless you have a good reason to do otherwise, it tends to be the best choice. (When in Rome, do as the Romans…)

Writing High Performance software in Java (or C#) is often done to prove a point…
