296

I used to code in Python a lot. Now, for work reasons, I code in Java. The projects I do are rather small, and possibly Python would work better, but there are valid non-engineering reasons to use Java (I can't go into details).

Java syntax is no issue; it is just another language. But apart from the syntax, Java has a culture, a set of development methods, and practices that are considered "correct". And for now I am completely failing to "grok" that culture. So I would really appreciate explanations or pointers in the right direction.

A minimal complete example is available in a Stack Overflow question that I started: https://stackoverflow.com/questions/43619566/returning-a-result-with-several-values-the-java-way/43620339

I have a task - parse (from a single string) and handle a set of three values. In Python it is a one-liner (tuple), in Pascal or C a 5-liner record/struct.

According to the answers, the equivalent of a struct is available in Java syntax and a triple is available in a widely-used Apache library - yet the "correct" way of doing it is actually by creating a separate class for the value, complete with getters and setters. Someone was very kind to provide a complete example. It was 47 lines of code (well, some of these lines were blanks).

I understand that a huge development community is likely not "wrong". So this is a problem with my understanding.

Python practices optimize for readability (which, in that philosophy, leads to maintainability) and after that, development speed. C practices optimize for resource usage. What do Java practices optimize for? My best guess is scalability (everything should be in a state ready for a millions-LOC project), but it is a very weak guess.

16 Answers

242

The Java Language

I believe all these answers are missing the point by trying to ascribe intent to the way Java works. Java's verbosity does not stem from it being object oriented: Python and many other languages are object oriented too, yet have terser syntax. Java's verbosity doesn't come from its support of access modifiers either. Instead, it's simply how Java was designed and has evolved.

Java was originally created as a slightly improved C with OO. As such Java has 70s-era syntax. Furthermore, Java is very conservative about adding features in order to retain backward compatibility and to allow it to stand the test of time. Had Java added trendy features like XML literals in 2005 when XML was all the rage the language would have been bloated with ghost features that nobody cares about and that limit its evolution 10 years later. Therefore Java simply lacks a lot of modern syntax to express concepts tersely.

However, there's nothing fundamental preventing Java from adopting that syntax. For example, Java 8 added lambdas and method references, greatly reducing verbosity in many situations. Java could similarly add support for compact data type declarations such as Scala's case classes. But Java simply hasn't done so. Do note that custom value types are on the horizon and this feature may introduce a new syntax for declaring them. I suppose we will see.
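To make the lambda point concrete, here is a small sketch that sorts words by length three ways. The class and method names are made up for illustration; only the standard `List.sort` and `Comparator` APIs are assumed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
final class SortByLength {
    // Pre-Java 8 style: an anonymous inner class.
    static List<String> withAnonymousClass(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    // Java 8: the same comparator as a lambda.
    static List<String> withLambda(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort((a, b) -> Integer.compare(a.length(), b.length()));
        return copy;
    }

    // Java 8: shorter still, using a method reference.
    static List<String> withMethodReference(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Comparator.comparingInt(String::length));
        return copy;
    }
}
```

All three methods do the same thing; the difference is purely how much ceremony the language demands.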


The Java Culture

The history of enterprise Java development has largely led us to the culture we see today. In the late 90s/early 00s, Java became an extremely popular language for server-side business applications. Back then those applications were largely written ad-hoc and incorporated many complex concerns, such as HTTP APIs, databases, and processing XML feeds.

In the 00s it became clear that many of these applications had a lot in common and frameworks to manage these concerns, like the Hibernate ORM, the Xerces XML parser, JSPs and the servlet API, and EJB, became popular. However, while these frameworks reduced the effort to work in the particular domain that they set to automate, they required configuration and coordination. At the time, for whatever reason, it was popular to write frameworks to cater to the most complex use case and therefore these libraries were complicated to set up and integrate. And over time they grew increasingly complex as they accumulated features. Java enterprise development gradually became more and more about plugging together third party libraries and less about writing algorithms.

Eventually the tedious configuration and management of enterprise tools became painful enough that frameworks, most notably the Spring framework, came along to manage the management. You could put all your configuration in one place, the theory went, and the configuration tool would then configure the pieces and wire them together. Unfortunately these "framework frameworks" added more abstraction and complexity on top of the whole ball of wax.

Over the past few years more lightweight libraries have grown in popularity. Nonetheless an entire generation of Java programmers came of age during the growth of heavy enterprise frameworks. Their role models, those developing the frameworks, wrote factory factories and proxy configuration bean loaders. They had to configure and integrate these monstrosities day-to-day. And as a result the culture of the community as a whole followed the example of these frameworks and tended to badly over-engineer.

75

I believe I have an answer for one of the points you raise that hasn't been raised by others.

Java optimizes for compile-time detection of programmer errors by never making any assumptions

In general, Java tends to only infer facts about the source code after the programmer has already explicitly expressed his intent. The Java compiler never makes any assumptions about the code and will only use inference to reduce redundant code.

The reason behind this philosophy is that the programmer is only human. What we write is not always what we actually intend the program to do. The Java language tries to mitigate some of those issues by forcing developers to always explicitly declare their types. That's just a way of double checking that the code that was written actually does what was intended.

Some other languages push that logic even further by checking pre-conditions, post-conditions and invariants (though I'm not sure they do it at compile time). These are even more extreme ways for the programmer to have the compiler double check his own work.

In your case, this means that in order for the compiler to guarantee that you are actually returning the types that you think you are returning, you need to provide that information to the compiler.

In Java there are two ways to do that:

  1. Use Triplet<A, B, C> as return type (which really should be in java.util, and I can't really explain why it isn't. Especially since JDK8 introduced Function, BiFunction, Consumer, BiConsumer, etc... It just seems that Pair and Triplet at least would make sense. But I digress)

  2. Create your own value type for that purpose, where each field is named and typed properly.

In both of those cases, the compiler can guarantee that your function returns the types it declared, and that the caller realizes what is the type of each returned field and uses them accordingly.
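A rough sketch of the two options side by side. Everything here is hypothetical: `Triple` stands in for the Triplet<A, B, C> above, `ParsedDuration` is an invented value type, and the toy input format ("<1ms", "250ms" or "5s") is an assumption, not the format from the linked question.

```java
// Hypothetical generic triple, in the spirit of Triplet<A, B, C>.
final class Triple<A, B, C> {
    final A first;
    final B second;
    final C third;

    Triple(A first, B second, C third) {
        this.first = first;
        this.second = second;
        this.third = third;
    }
}

// Hypothetical named value type: each field says what it means.
final class ParsedDuration {
    final long millis;
    final boolean inSeconds;
    final boolean underOneMilli;

    ParsedDuration(long millis, boolean inSeconds, boolean underOneMilli) {
        this.millis = millis;
        this.inSeconds = inSeconds;
        this.underOneMilli = underOneMilli;
    }
}

final class DurationParser {
    // Assumed toy input format: "<1ms", "250ms" or "5s".
    static ParsedDuration parse(String text) {
        if (text.equals("<1ms")) {
            return new ParsedDuration(0, false, true);
        }
        if (text.endsWith("ms")) {
            long millis = Long.parseLong(text.substring(0, text.length() - 2));
            return new ParsedDuration(millis, false, false);
        }
        if (text.endsWith("s")) {
            long seconds = Long.parseLong(text.substring(0, text.length() - 1));
            return new ParsedDuration(seconds * 1000, true, false);
        }
        throw new IllegalArgumentException("Unrecognized duration: " + text);
    }

    // Same data, positional: the types are checked, the meaning is not.
    static Triple<Long, Boolean, Boolean> parseAsTriple(String text) {
        ParsedDuration p = parse(text);
        return new Triple<>(p.millis, p.inSeconds, p.underOneMilli);
    }
}
```

In both variants the compiler rejects a caller that treats the first component as a boolean. Only the named type also tells the reader which boolean is which.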

Some languages do provide static type checking AND type inference at the same time, but that leaves the door open for a subtle class of type mismatch issues: the developer intends to return a value of a certain type but actually returns another, and the compiler STILL accepts the code because, by coincidence, both the function and the caller only use methods that apply to both the intended and the actual types.

Consider something like this in TypeScript (or Flow) where type inference is used instead of explicit typing.

function parseDurationInMillis(csvLine){
    // here the developer intends to return a Number, 
    // but actually it's a String
    return csvLine.firstField();
}

// Compiler infers that parseDurationInMillis is String, so it does
// string concatenation and infers that plusTwoSeconds is String
// Developer actually intended Number
var plusTwoSeconds = 2000 + parseDurationInMillis(csvLine);

This is a silly trivial case, of course, but it can be a lot more subtle and lead to difficult-to-debug issues, because the code looks right. This kind of issue is entirely avoided in Java, and that is what the whole language is designed around.


Note that following proper Object Oriented principles and domain-driven modeling, the duration parsing case in the linked question could also be returned as a java.time.Duration object, which would be a lot more explicit than both of the above cases.
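As a minimal sketch of what java.time.Duration buys you: the helper class and its method names below are invented for illustration, while the `Duration` calls themselves (`ofMillis`, `plus`, `compareTo`) are standard Java 8 API.

```java
import java.time.Duration;

// Hypothetical helpers built on the standard java.time.Duration type.
final class DurationExample {
    // Summing needs no unit bookkeeping: Duration tracks the scale itself.
    static Duration total(Duration... parts) {
        Duration sum = Duration.ZERO;
        for (Duration part : parts) {
            sum = sum.plus(part);
        }
        return sum;
    }

    // The "less than one millisecond" flag becomes a simple comparison.
    static boolean isUnderOneMilli(Duration d) {
        return d.compareTo(Duration.ofMillis(1)) < 0;
    }
}
```

With this, `DurationExample.total(Duration.ofSeconds(2), Duration.ofMillis(500))` yields a single value, and neither boolean flag from the original tuple has to be carried around.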

LordOfThePigs
50

Java and Python are the two languages I use the most but I'm coming from the other direction. That is, I was deep in the Java world before I started using Python so I might be able to help. I think the answer to the larger question of "why are things so heavy" comes down to 2 things:

  • The costs around development between the two are like the air in a long balloon-animal balloon. You can squeeze one part of the balloon and another part swells. Python tends to squeeze the early part. Java squeezes the later part.
  • Java still lacks some features that would remove some of this weight. Java 8 made a huge dent in this but the culture hasn't fully digested the changes. Java could use a few more things such as yield.

Java 'optimizes' for high-value software that will be maintained for many years by large teams of people. I've had the experience of writing stuff in Python, looking at it a year later and being baffled by my own code. In Java I can look at tiny snippets of other people's code and instantly know what it does. In Python you can't really do that. It's not that one is better, as you seem to realize; they just have different costs.

In the specific case you mention, there are no tuples. The easy solution is to create a class with public values. When Java first came out, people did this fairly regularly. The first problem with doing that is that it's a maintenance headache. If you need to add some logic or thread safety or want to use polymorphism, you will at the very least need to touch every class that interacts with that 'tuple-esque' object. In Python there are solutions for this such as __getattr__ etc. so it's not so dire.

There are some bad habits (IMO) around this though. In this case if you want a tuple, I question why you would make it a mutable object. You should only need getters (on a side note, I hate the get/set convention but it is what it is.) I do think a bare class (mutable or not) can be useful in a private or package-private context in Java. That is, by limiting the references in the project to the class, you can refactor later as needed without changing the public interface of the class. Here's an example of how you can create a simple immutable object:

public class Blah 
{
  public static Blah blah(long number, boolean isInSeconds, boolean lessThanOneMillis)
  {
    return new Blah(number, isInSeconds, lessThanOneMillis);
  }

  private final long number;
  private final boolean isInSeconds;
  private final boolean lessThanOneMillis;

  public Blah(long number, boolean isInSeconds, boolean lessThanOneMillis)
  {
    this.number = number;
    this.isInSeconds = isInSeconds;
    this.lessThanOneMillis = lessThanOneMillis;
  }

  public long getNumber()
  {
    return number;
  }

  public boolean isInSeconds()
  {
    return isInSeconds;
  }

  public boolean isLessThanOneMillis()
  {
    return lessThanOneMillis;
  }
}

This is a kind of pattern I use. If you aren't using an IDE, you should start. It will generate the getters (and setters if you need them) for you so this isn't so painful.

I would feel remiss if I didn't point out that there is already a type that would appear to meet most of your needs here. Setting that aside, the approach you are using is not well suited to Java as it plays to its weaknesses, not its strengths. Here's a simple improvement:

public class Blah 
{
  public static Blah fromSeconds(long number)
  {
    return new Blah(number * 1_000_000_000L); // seconds -> nanoseconds
  }

  public static Blah fromMillis(long number)
  {
    return new Blah(number * 1_000_000L); // milliseconds -> nanoseconds
  }

  public static Blah fromNanos(long number)
  {
    return new Blah(number);
  }

  private final long nanos;

  private Blah(long nanos)
  {
    this.nanos = nanos;
  }

  public long getNanos()
  {
    return nanos;
  }

  public long getMillis()
  {
    return getNanos() / 1_000_000L; // or round, whatever your logic is
  }

  public long getSeconds()
  {
    return getMillis() / 1000; // or round, whatever your logic is
  }

  /* I don't really know what this is about but I hope you get the idea */
  public boolean isLessThanOneMillis()
  {
    return getMillis() < 1;
  }
}
JimmyJames supports Canada
41

Before you get too mad at Java, please read my answer in your other post.

One of your complaints is the need to create a class just to return some set of values as an answer. This is a valid concern that I think shows your programming intuitions are right on track! However, I think other answers are missing the mark by sticking with the primitive obsession anti-pattern that you've committed to. And Java doesn't have the same ease of working with multiple primitives as Python has, where you can return multiple values natively and assign them to multiple variables easily.

But once you start thinking about what an ApproximateDuration type does for you, you realize that it's not scoped so narrowly to "just a seemingly needless class in order to return three values". The concept represented by this class is actually one of your core domain business concepts—the need to be able to represent times in an approximate way, and compare them. This needs to be part of the ubiquitous language of the core of your application, with good object and domain support for it, so that it can be tested, modular, reusable, and useful.

Is your code that sums approximate durations together (or durations with a margin of error, however you represent that) wholly procedural or is there any object-ness to it? I would propose that good design around summing approximate durations together would dictate doing it outside of any consuming code, within a class that itself can be tested. I think using this kind of domain object will have a positive ripple effect in your code that helps steer you away from line-by-line procedural steps to accomplish a single high-level task (though with many responsibilities), toward single-responsibility classes that are free from the conflicts of differing concerns.

For example, let's say that you learn more about what precision or scale is actually required for your duration summing and comparison to work correctly, and you find out that you need an intermediate flag to indicate "approximately 32 milliseconds error" (close to the square root of 1000, so halfway logarithmically between 1 and 1000). If you've bound yourself to code that uses primitives to represent this, you'll have to find every place in the code where you have is_in_seconds,is_under_1ms and change it to is_in_seconds,is_about_32_ms,is_under_1ms. Everything would have to change all over the place! Making a class whose responsibility is to record the margin of error so that it can be consumed elsewhere frees your consumers from knowing details of what margins of error matter or anything about how they combine, and lets them just specify the margin of error that is relevant at the moment. (That is, no consuming code whose margin of error is correct is forced to change when you add a new margin of error level in the class, since all the old margins of error are still valid).
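One way to picture such a domain object is a range of milliseconds that knows how to sum and compare itself. This is only a sketch under assumptions: the low/high representation follows the ApproximateDuration class shown in the addendum of this answer, while `plus`, `definitelyShorterThan` and the rule that sums add both bounds are my own illustrative choices.

```java
// Sketch of a domain type that owns its summing and comparison logic.
final class ApproximateDuration {
    private final long lowMilliseconds;
    private final long highMilliseconds;

    ApproximateDuration(long lowMilliseconds, long highMilliseconds) {
        if (lowMilliseconds > highMilliseconds) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.lowMilliseconds = lowMilliseconds;
        this.highMilliseconds = highMilliseconds;
    }

    // Assumed rule: the sum of two ranges adds the bounds pairwise.
    ApproximateDuration plus(ApproximateDuration other) {
        return new ApproximateDuration(
                lowMilliseconds + other.lowMilliseconds,
                highMilliseconds + other.highMilliseconds);
    }

    // True only when the ranges cannot overlap.
    boolean definitelyShorterThan(ApproximateDuration other) {
        return highMilliseconds < other.lowMilliseconds;
    }

    long getLowMilliseconds() {
        return lowMilliseconds;
    }

    long getHighMilliseconds() {
        return highMilliseconds;
    }
}
```

Consumers call `plus` and never learn how margins of error combine, so adding a new precision level later changes this class and nothing else.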

Closing Statement

The complaint about Java's heaviness then seems to go away as you move closer to the principles of SOLID and GRASP, and more advanced software engineering.

Addendum

I will add completely gratuitously and unfairly that C#'s automatic properties and ability to assign get-only properties in constructors help clean up even further the somewhat messy code that "the Java way" will require (with explicit private backing fields and getter/setter functions):

// Warning: C# code!
public sealed class ApproximateDuration {
   public ApproximateDuration(int lowMilliseconds, int highMilliseconds) {
      LowMilliseconds = lowMilliseconds;
      HighMilliseconds = highMilliseconds;
   }
   public int LowMilliseconds { get; }
   public int HighMilliseconds { get; }
}

Here is a Java implementation of the above:

public final class ApproximateDuration {
  private final int lowMilliseconds;
  private final int highMilliseconds;

  public ApproximateDuration(int lowMilliseconds, int highMilliseconds) {
    this.lowMilliseconds = lowMilliseconds;
    this.highMilliseconds = highMilliseconds;
  }

  public int getLowMilliseconds() {
    return lowMilliseconds;
  }

  public int getHighMilliseconds() {
    return highMilliseconds;
  }
}

Now that is pretty darn clean. Note the very important and intentional use of immutability--this seems crucial for this particular kind of value-bearing class.

For that matter, this class is also a decent candidate for being a struct, a value type. Some testing will show whether switching to a struct has a run-time performance benefit (it could).

ErikE
24

Both Python and Java are optimized for maintainability according to the philosophy of their designers, but they have very different ideas about how to achieve this.

Python is a multi-paradigm language which optimizes for clarity and simplicity of code (easy to read and write).

Java is (traditionally) a single-paradigm class-based OO language which optimizes for explicitness and consistency - even at the cost of more verbose code.

A Python tuple is a data structure with a fixed number of fields. The same functionality can be achieved by a regular class with explicitly declared fields. In Python it is natural to provide tuples as alternative to classes because it allows you to simplify code greatly, especially due to the built-in syntax support for tuples.

But this does not really match the Java culture to provide such shortcuts, since you can already use explicit declared classes. No need to introduce a different kind of data structure just to save some lines of code and avoid some declarations.

Java prefers a single concept (classes) consistently applied with a minimum of special-case syntactic sugar, while Python provides multiple tools and lots of syntactic sugar to allow you to choose the most convenient for any particular purpose.

JacquesB
16

Don't search for practices; it is usually a bad idea, as said in Best practices BAD, patterns GOOD?. I know you're not asking for best practices, but I still think you will find some relevant elements in there.

Searching for a solution to your problem is better than searching for a practice, and your problem is that Java has no tuple for quickly returning three values:

  • There are Arrays
  • You can return an array as a list in a one-liner: Arrays.asList(...)
  • If you want to keep the object side with the least boilerplate possible (and without Lombok):

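class MyTuple {
    public final long value_in_milliseconds;
    public final boolean is_in_seconds;
    public final boolean is_under_1ms;

    // Final fields must all be assigned exactly once, here in the constructor.
    public MyTuple(long value_in_milliseconds, boolean is_in_seconds, boolean is_under_1ms) {
        this.value_in_milliseconds = value_in_milliseconds;
        this.is_in_seconds = is_in_seconds;
        this.is_under_1ms = is_under_1ms;
    }
}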

Here you have an immutable object containing just your data, with public fields, so there is no need for getters. Note, however, that serialization tools and persistence layers such as ORMs commonly use getters and setters (and may accept a parameter to use fields instead). That is why these practices are used so much. So if you want to know about practices, it's better to understand why they exist so that you can apply them well.
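For comparison, here is roughly what the Arrays.asList one-liner from the list above looks like in practice. The class and method names are hypothetical; the point is only to show where the cost moves to.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example of the "list as tuple" shortcut.
final class ListResult {
    // Producing the result really is a one-liner...
    static List<Object> parse() {
        return Arrays.<Object>asList(500L, true, false);
    }

    // ...but every consumer has to know the positions and cast.
    // A wrong position or type only fails at run time (ClassCastException).
    static long millisOf(List<Object> result) {
        return (Long) result.get(0);
    }

    static boolean isInSecondsOf(List<Object> result) {
        return (Boolean) result.get(1);
    }
}
```

The compiler cannot help if someone builds the list in a different order, which is the trade-off against the small class above.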

Finally: I use getters because I use a lot of serialisation tools, but I don't write them by hand either; I use lombok: I use the shortcuts provided by my IDE.

Walfrat
11

About Java idioms in general:

There are various reasons why Java has classes for everything. As far as my knowledge goes, the main reason is:

Java should be easy to learn for beginners. The more explicit things are, the harder it is to miss important details. Less magic happens that would be hard for beginners to grasp.


As to your specific example: the line of argument for a separate class is this: if those three things are strongly-enough related to each other that they are returned as one value, it's worth naming that "thing". And introducing a name for a group of things that are structured in a common way means defining a class.

You can reduce the boilerplate with tools like Lombok:

@Value
class MyTuple {
    long value_in_milliseconds;
    boolean is_in_seconds;
    boolean is_under_1ms;
}
marstato
7

There are lots of things that could be said about the Java culture, but I think that in the case you're confronted with right now, there are a few significant aspects:

  1. Library code is written once but used much more often. While it's nice to minimize the overhead of writing the library, it's probably more worthwhile in the long run to write in a way that minimizes the overhead of using the library.
  2. That means that self-documenting types are great: method names help make it clear what's happening and what you're getting out of an object.
  3. Static typing is a very useful tool for eliminating certain classes of errors. It certainly doesn't fix everything (people like to joke about Haskell that once you get the type system to accept your code, it's probably correct), but it makes it very easy to make certain kinds of wrong things impossible.
  4. Writing library code is about specifying contracts. Defining interfaces for your argument and result types makes the boundaries of your contracts more clearly defined. If something accepts or produces a tuple, there's no saying whether it's the kind of tuple you should actually receive or produce, and there's very little in the way of constraints on such a generic type (does it even have the right number of elements? are they of the type that you were expecting?).

"Struct" classes with fields

As other answers have mentioned, you can just use a class with public fields. If you make these final, then you get an immutable class, and you'd initialize them with the constructor:

   class ParseResult0 {
      public final long millis;
      public final boolean isSeconds;
      public final boolean isLessThanOneMilli;

      public ParseResult0(long millis, boolean isSeconds, boolean isLessThanOneMilli) {
         this.millis = millis;
         this.isSeconds = isSeconds;
         this.isLessThanOneMilli = isLessThanOneMilli;
      }
   }

Of course, this means that you're tied to a particular class, and anything that ever needs to produce or consume a parse result has to use this class. For some applications, that's fine. For others, that can cause some pain. Much Java code is about defining contracts, and that will typically take you into interfaces.

Another pitfall is that with a class based approach, you're exposing fields and all of those fields must have values. E.g., isSeconds and millis always have to have some value, even if isLessThanOneMilli is true. What should the interpretation of the value of the millis field be when isLessThanOneMilli is true?

"Structures" as Interfaces

With the static methods allowed in interfaces, it's actually relatively easy to create immutable types without a whole lot of syntactic overhead. For instance, I might implement the kind of result structure you're talking about as something like this:

   interface ParseResult {
      long getMillis();

      boolean isSeconds();

      boolean isLessThanOneMilli();

      static ParseResult from(long millis, boolean isSeconds, boolean isLessThanOneMill) {
         return new ParseResult() {
            @Override
            public boolean isSeconds() {
               return isSeconds;
            }

            @Override
            public boolean isLessThanOneMilli() {
               return isLessThanOneMill;
            }

            @Override
            public long getMillis() {
               return millis;
            }
         };
      }
   }

That's still a lot of boilerplate, I absolutely agree, but there are a couple of benefits, too, and I think those start to answer some of your main questions.

With a structure like this parse result, the contract of your parser is very clearly defined. In Python, one tuple isn't really distinct from another tuple. In Java, static typing is available, so we already rule out certain classes of errors. For instance, if you're returning a tuple in Python, and you want to return the tuple (millis, isSeconds, isLessThanOneMilli), you can accidentally do:

return (True, 500, False)

when you meant:

return (500, True, False)

With this kind of Java interface, you can't compile:

return ParseResult.from(true, 500, false);

at all. You have to do:

return ParseResult.from(500, true, false);

That's a benefit of statically typed languages in general.

This approach also starts to give you the ability to restrict what values you can get. For instance, when calling getMillis(), you could check whether isLessThanOneMilli() is true, and if it is, throw an IllegalStateException (for instance), since there's no meaningful value of millis in that case.

Making it Hard to Do the Wrong Thing

In the interface example above, you still have the problem that you could accidentally swap the isSeconds and isLessThanOneMilli arguments, though, since they have the same type.

In practice, you really might want to make use of TimeUnit and duration, so that you'd have a result like:

   interface Duration {
      TimeUnit getTimeUnit();

      long getDuration();

      static Duration from(TimeUnit unit, long duration) {
         return new Duration() {
            @Override
            public TimeUnit getTimeUnit() {
               return unit;
            }

            @Override
            public long getDuration() {
               return duration;
            }
         };
      }
   }

   interface ParseResult2 {

      boolean isLessThanOneMilli();

      Duration getDuration();

      static ParseResult2 from(TimeUnit unit, long duration) {
         Duration d = Duration.from(unit, duration);
         return new ParseResult2() {
            @Override
            public boolean isLessThanOneMilli() {
               return false;
            }

            @Override
            public Duration getDuration() {
               return d;
            }
         };
      }

      static ParseResult2 lessThanOneMilli() {
         return new ParseResult2() {
            @Override
            public boolean isLessThanOneMilli() {
               return true;
            }

            @Override
            public Duration getDuration() {
               throw new IllegalStateException();
            }
         };
      }
   }

That's getting to be a lot more code, but you only need to write it once, and (assuming you've properly documented things), the people who end up using your code don't have to guess at what the result means, and can't accidentally do things like result[0] when they mean result[1]. You still get to create instances pretty succinctly, and getting data out of them isn't all that hard either:

  ParseResult2 x = ParseResult2.from(TimeUnit.MILLISECONDS, 32);
  ParseResult2 y = ParseResult2.lessThanOneMilli();

Note that you could actually do something like this with the class based approach, too. Just specify constructors for the different cases. You still have the issue of what to initialize the other fields to, though, and you can't prevent access to them.
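A sketch of that class-based variant, to show where the leftover problem sits. The class name `ParseResult1` is made up, and the placeholder values (null and 0) are one arbitrary choice among several.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical class-based version with one constructor per case.
class ParseResult1 {
    public final boolean isLessThanOneMilli;
    public final TimeUnit unit;
    public final long duration;

    // Case 1: a real measurement.
    ParseResult1(TimeUnit unit, long duration) {
        this.isLessThanOneMilli = false;
        this.unit = unit;
        this.duration = duration;
    }

    // Case 2: "less than one millisecond". The other fields still need
    // some value, and nothing stops callers from reading them.
    ParseResult1() {
        this.isLessThanOneMilli = true;
        this.unit = null;   // placeholder, meaningless in this case
        this.duration = 0;  // placeholder, meaningless in this case
    }
}
```

Creating instances stays short (`new ParseResult1(TimeUnit.MILLISECONDS, 32)` or `new ParseResult1()`), but a caller who forgets to check the flag silently reads a placeholder instead of getting an exception.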

Another answer mentioned that the enterprise-type nature of Java means that much of the time, you're composing other libraries that already exist, or writing libraries for other people to use. Your public API shouldn't require lots of time consulting the documentation to decipher result types if it can be avoided.

You only write these structures once, but you create them many times, so you still do want that concise creation (which you get). The static typing makes sure that the data you're getting out of them is what you expect.

Now, all that said, there are still places where simple tuples or lists can make a lot of sense. There may be less overhead in returning an array of something, and if that's the case (and that overhead is significant, which you'd determine with profiling), then using a simple array of values internally may make lots of sense. Your public API should still probably have clearly defined types.

7

The problem is that you compare apples to oranges. You asked how to simulate returning more than a single value, giving a quick-and-dirty Python example with an untyped tuple, and you actually received a practically one-line answer.

The accepted answer provides a correct business solution: not a quick temporary workaround that you would have to throw away and reimplement the first time you needed to do anything practical with the returned value, but a POJO class that is compatible with a large set of libraries, including persistence, serialization/deserialization, instrumentation and more.

This is also not long at all. The only things you need to write are the field definitions. Setters, getters, hashCode and equals can be generated. So your actual question should be why the getters and setters are not auto-generated, but that is a syntax issue (a syntactic sugar issue, some would say), not a cultural issue.
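To illustrate, here is approximately what such a POJO looks like once the IDE has done its part. The class and field names are hypothetical, and the exact shape of generated code varies between IDEs; only the three field declarations would be typed by hand.

```java
import java.util.Objects;

// Hypothetical DTO; everything below the fields is typical generated code.
class DurationDto {
    private long valueInMilliseconds;
    private boolean inSeconds;
    private boolean underOneMilli;

    public long getValueInMilliseconds() { return valueInMilliseconds; }
    public void setValueInMilliseconds(long valueInMilliseconds) {
        this.valueInMilliseconds = valueInMilliseconds;
    }

    public boolean isInSeconds() { return inSeconds; }
    public void setInSeconds(boolean inSeconds) { this.inSeconds = inSeconds; }

    public boolean isUnderOneMilli() { return underOneMilli; }
    public void setUnderOneMilli(boolean underOneMilli) {
        this.underOneMilli = underOneMilli;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof DurationDto)) return false;
        DurationDto other = (DurationDto) o;
        return valueInMilliseconds == other.valueInMilliseconds
                && inSeconds == other.inSeconds
                && underOneMilli == other.underOneMilli;
    }

    @Override
    public int hashCode() {
        return Objects.hash(valueInMilliseconds, inSeconds, underOneMilli);
    }
}
```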

And finally, you're overthinking trying to speed up something that's not important at all. The time spent writing DTO classes is insignificant compared to time spent maintaining and debugging the system. Therefore nobody optimizes for less verbosity.

5

There are three different factors that contribute to what you are observing.

Tuples versus named fields

Perhaps the most trivial - in the other languages you used a tuple. Debating whether tuples are a good idea is not really the point - but in Java you used a heavier structure, so it's a slightly unfair comparison: you could have used an array of objects and some type casting.

Language syntax

Could it be easier to declare the class? I'm not talking about making the fields public or using a map but something like Scala's case classes, which provide all the benefits of the setup you described but much more concise:

case class Foo(duration: Int, unit: String, tooShort: Boolean)

We could have that - but there's a cost: the syntax becomes more complicated. Of course it might be worth it for some cases, or even for most cases, or even for most cases for the next 5 years - but it needs to be judged. By the way, this is one of the nice things with languages you can modify yourself (e.g. Lisp) - and note how this becomes possible due to the simplicity of syntax. Even if you don't actually modify the language, a simple syntax enables more powerful tools; for example a lot of times I miss some refactoring options available for Java but not for Scala.

Language philosophy

But the most important factor is that a language should enable a certain way of thinking. Sometimes it might feel oppressive (I've often wished for support for a certain feature) but removing features is just as important as having them. Could you support everything? Sure, but then you might as well just write a compiler that compiles every single language. In other words, you won't have a language - you'll have a superset of languages and every project would basically adopt a subset.

It is of course possible to write code that goes against the language's philosophy, and, as you observed, the results are often ugly. Having a class with just a couple of fields in Java is akin to using a var in Scala, degenerating prolog predicates to functions, doing an unsafePerformIO in haskell etc. Java classes are not meant to be light - they are not there to pass data around. When something seems hard, it's often fruitful to step back and see if there's another way. In your example:

Why have the duration separate from the units? There are plenty of time libraries that let you declare a duration - something like Duration(5, seconds) (syntax will vary), which will then let you do whatever you want in a far more robust way. Maybe you want to convert it - why check whether result[1] (or [2]?) is 'hour' and then multiply by 3600? And for the third argument - what's its purpose? I'd guess that at some point you'll have to print "less than 1ms" or the actual time - that's some logic that naturally belongs with the time data. I.e. you should have a class like this:

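class TimeResult {
    private final long duration;
    private final String unit;
    private final boolean tooShort;

    public TimeResult(long duration, String unit, boolean tooShort) {
        this.duration = duration;
        this.unit = unit;
        this.tooShort = tooShort;
    }

    // The formatting logic lives next to the data it describes.
    public String getResult() {
        return tooShort ? "too short" : duration + " " + unit;
    }
}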

or whatever you actually want to do with the data, hence encapsulating the logic.
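As for keeping the duration and its unit together: since Java 8 the JDK ships java.time.Duration, so no third-party library is needed. A minimal sketch follows; the describe helper and its one-millisecond threshold are assumptions for illustration, not something taken from the question.

```java
import java.time.Duration;
import java.time.temporal.ChronoUnit;

final class DurationDemo {

    // Hypothetical helper: the one-millisecond threshold is an assumption.
    static String describe(Duration d) {
        if (d.compareTo(Duration.ofMillis(1)) < 0) {
            return "less than 1ms";
        }
        return d.toMillis() + " ms";
    }

    public static void main(String[] args) {
        Duration fiveSeconds = Duration.of(5, ChronoUnit.SECONDS);
        Duration oneHour = Duration.ofHours(1);

        // Unit conversion is done by the library, not by multiplying by 3600 by hand.
        System.out.println(oneHour.getSeconds());             // prints 3600
        System.out.println(describe(fiveSeconds));            // prints 5000 ms
        System.out.println(describe(Duration.ofNanos(500)));  // prints less than 1ms
    }
}
```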

Of course, there might be a case where this way won't work - I'm not saying that this is the magic algorithm for converting tuple results to idiomatic Java code! And there might be cases where it's very ugly and bad and perhaps you should have used a different language - that's why there are so many after all!

But my view on why classes are "heavy structures" in Java is that you are not meant to use them as data containers but as self-contained cells of logic.

thanos
5

To my understanding, the core reasons are

  1. Interfaces are the basic Java way to abstract away classes.
  2. Java can only return a single value from a method - an object, an array, or a primitive value (int/long/double/float/boolean, etc.).
  3. Interfaces cannot contain instance fields, only methods (and constants). If you want to access a field, you must go through a method - hence getters and setters.
  4. If a method returns an interface, you must have an implementing class to actually return.

This gives you the rule "you must write a class to return any non-trivial result", which in turn is rather heavy. If you used a class instead of the interface, you could just have fields and use them directly, but that ties you to a specific implementation.
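The four points above can be sketched as follows. All names here (ParseResult, SimpleParseResult, ParseDemo) are hypothetical and only illustrate the pattern.

```java
// The interface exposes the values through methods only.
interface ParseResult {
    int getDuration();
    String getUnit();
}

// An implementing class is needed so that there is something to return.
final class SimpleParseResult implements ParseResult {
    private final int duration;
    private final String unit;

    SimpleParseResult(int duration, String unit) {
        this.duration = duration;
        this.unit = unit;
    }

    @Override public int getDuration() { return duration; }
    @Override public String getUnit() { return unit; }
}

final class ParseDemo {
    // Callers see only the interface, so the implementing class can be swapped later.
    static ParseResult parse() {
        return new SimpleParseResult(5, "hours");
    }

    public static void main(String[] args) {
        ParseResult result = parse();
        System.out.println(result.getDuration() + " " + result.getUnit());
    }
}
```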

3

I agree with JacquesB's answer that

Java is (traditionally) a single-paradigm class-based OO language which optimizes for explicitness and consistency

But explicitness and consistency are not the end goals to optimize for. When you say 'python is optimized for readability', you immediately mention that the end goal is 'maintainability' and 'development speed'.

What do you achieve when you have explicitness and consistency, done the Java way? My take is that it has evolved as a language that claims to provide a predictable, consistent, uniform way to solve any software problem.

In other words, Java culture is optimized for making managers believe that they understand software development.

Or, as one wise guy put it a very long time ago,

The best way to judge a language is to look at the code written by its proponents. "Radix enim omnium malorum est cupiditas" - and Java is clearly an example of a money oriented programming (MOP). As the chief proponent of Java at SGI told me: "Alex, you have to go where the money is." But I do not particularly want to go where the money is - it usually does not smell nice there.

artem
3

(This answer is not an explanation for Java in particular, but instead addresses the general question of “What might [heavy] practices optimize for?”)

Consider these two principles:

  1. It's good when your program does the right thing. We should make it easy to write programs that do the right thing.
  2. It's bad when your program does the wrong thing. We should make it harder to write programs that do the wrong thing.

Trying to optimize one of these goals may sometimes get in the way of the other (i.e. making it harder to do the wrong thing may also make it harder to do the right thing, or vice-versa).

Which tradeoffs are made in any particular case depends on the application, the decisions of the programmers or team in question, and the culture (of the organization or language community).

For example, if a bug or a few hours' outage in your program could result in loss of lives (medical systems, aeronautics) or even merely money (like millions of dollars in, say, Google's ads systems), you would make different tradeoffs (not just in your language but also in other aspects of engineering culture) than you would for a one-off script: you would likely lean towards the "heavy" side.

Other examples that tend to make your system more "heavy":

  • When you have a large codebase worked on by many teams over many years, one big concern is that someone may use someone else's API incorrectly. A function getting called with arguments in the wrong order, or being called without ensuring some preconditions/constraints that it expects, could be catastrophic.
  • As a special case of this, say your team maintains a particular API or library, and would like to change or refactor it. The more “constrained” your users are in how they could be using your code, the easier it is to change it. (Note that it would be preferable here to have actual guarantees that no one could be using it in an unusual way.)
  • If the development is split among multiple people or teams, it might seem a good idea to have one person or team “specify” the interface, and have others actually implement it. For it to work, you need to be able to gain a degree of confidence when the implementation is done, that the implementation actually matches the specification.
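As one illustration of the first point, here is a sketch of how "heavier" code can turn wrong-order arguments and missed preconditions into errors that surface early. The types UserId, OrderId and Orders are hypothetical and exist only for this example.

```java
// Hypothetical wrapper types: each one checks its own precondition once, on construction.
final class UserId {
    final long value;

    UserId(long value) {
        if (value <= 0) throw new IllegalArgumentException("user id must be positive");
        this.value = value;
    }
}

final class OrderId {
    final long value;

    OrderId(long value) {
        if (value <= 0) throw new IllegalArgumentException("order id must be positive");
        this.value = value;
    }
}

final class Orders {
    // With two plain long parameters, swapping the arguments would compile and misbehave silently.
    // With distinct types, cancel(orderId, userId) is rejected by the compiler.
    static String cancel(UserId user, OrderId order) {
        return "user " + user.value + " cancelled order " + order.value;
    }

    public static void main(String[] args) {
        System.out.println(cancel(new UserId(7), new OrderId(42)));
    }
}
```

The cost is two extra classes for what could have been two long parameters, which is exactly the tradeoff described above.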

These are just some examples, to give you an idea of cases where making things “heavy” (and making it harder for you to just write out some code quickly) may genuinely be intentional. (One might even argue that if writing code requires a lot of effort, it may lead you to think more carefully before writing code! Of course this line of argument quickly gets ridiculous.)

An example: Google's internal Python system tends to make things “heavy” such that you cannot simply import someone else's code; you have to declare the dependency in a BUILD file, the team whose code you want to import needs to have their library declared as visible to your code, etc.


Note: All the above is just about when things tend to get "heavy". I absolutely do not claim that Java or Python (either the languages themselves, or their cultures) make optimum tradeoffs for any particular case; that's for you to think about. Two related links on such tradeoffs:

2

The Java culture has evolved over time with heavy influences from both open source and enterprise software backgrounds--which is a strange mix if you really think about it. Enterprise solutions demand heavy tools, and open source demands simplicity. The end result is that Java is somewhere in the middle.

Part of what influences the recommendation is that what is considered readable and maintainable in Python and in Java is very different.

  • In Python, the tuple is a language feature.
  • In both Java and C#, the tuple is (or would be) a library feature.

I only mention C# because the standard library has a set of Tuple<A,B,C,..n> classes, and it's a perfect example of how unwieldy tuples are if the language doesn't support them directly. In almost every instance, your code becomes more readable and maintainable if you have well chosen classes to handle the issue. In the specific example in your linked Stack Overflow question, the other values would be easily expressed as calculated getters on the return object.
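A rough sketch of that contrast in Java. Both classes are hypothetical: Triple is written here only to show what call sites look like, and the field names and the "amount below 1" rule in ParsedDuration are assumptions, not taken from the linked question.

```java
// A library-style tuple: positions carry no meaning of their own.
final class Triple<A, B, C> {
    final A first;
    final B second;
    final C third;

    Triple(A first, B second, C third) {
        this.first = first;
        this.second = second;
        this.third = third;
    }
}

// A dedicated class: names document the values, and the third value is derived.
final class ParsedDuration {
    private final long amount;
    private final String unit;

    ParsedDuration(long amount, String unit) {
        this.amount = amount;
        this.unit = unit;
    }

    long getAmount() { return amount; }
    String getUnit() { return unit; }

    // Calculated getter: computed from the other fields rather than stored.
    boolean isTooShort() { return amount < 1; }
}

final class TupleVersusClassDemo {
    public static void main(String[] args) {
        Triple<Long, String, Boolean> t = new Triple<>(5L, "hours", false);
        System.out.println(t.first + " " + t.second + " " + t.third);

        ParsedDuration d = new ParsedDuration(5, "hours");
        System.out.println(d.getAmount() + " " + d.getUnit() + " " + d.isTooShort());
    }
}
```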

An interesting solution from the C# platform, which provides a happy middle ground, is the idea of anonymous types (introduced in C# 3.0), which scratch this itch quite well. Unfortunately, Java doesn't have an equivalent yet.

Until Java's language features are amended, the most readable and maintainable solution is to have a dedicated object. That's due to constraints in the language that date back to its beginnings in 1995. The original authors had many more language features planned that never made it, and backwards compatibility is one of the main constraints surrounding Java's evolution over time.

0

I think one of the core things about using a class in this case is that what goes together should stay together.

I've had this discussion the other way around, about method arguments: Consider a simple method that calculates BMI:

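static void calculateBMI(double weight, double height)
{
  // Imperial formula: pounds divided by inches squared, times 703.
  System.out.println("BMI: " + (weight / (height * height) * 703));
}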

In this case I would argue against this style because weight and height are related. The method "communicates" that those are two separate values when they're not. When would you calculate a BMI with the weight of one person and the height of another? It would not make sense.

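static void calculateBMI(Person person)
{
  // Person is assumed to expose weight (pounds) and height (inches) fields.
  System.out.println("BMI: " + (person.weight / (person.height * person.height) * 703));
}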

Makes a lot more sense because now you clearly communicate that the height and weight come from the same source.

The same goes for returning multiple values. If they are clearly connected, return a neat little package and use an object; if they're not, return multiple values.

Pieter B
0

To be frank, the culture is that Java programmers originally tended to come out of universities where object-oriented principles and sustainable software design principles were taught.

As ErikE says in more words in his answer, you don't seem to be writing sustainable code. What I see from your example is there's a very awkward entanglement of concerns.

In Java culture you will tend to be aware of what libraries are available, and that will allow you to achieve much more than your off-the-cuff programming. So you would be trading your idiosyncrasies for design patterns and styles that have been tried and tested in hard-core industrial settings.

But as you say, this isn't without downsides: today, having used Java for over 10 years, I tend to use either Node/JavaScript or Go for new projects, because both allow quicker development, and with microservice-style architectures these often suffice. Judging by the fact that Google first used Java heavily but later originated Go, I guess they might be doing the same. But even though I use Go and JavaScript now, I still use many of the design skills I got from years of using and understanding Java.

Tom