
In Eric Lippert's article What's Up With Hungarian Notation?, he states that the purpose of Hungarian Notation (the good kind) is to

extend the concept of "type" to encompass semantic information in addition to storage representation information.

A simple example would be prefixing a variable that represents an X-coordinate with "x" and a variable that represents a Y-coordinate with "y", regardless of whether those variables are integers or floats or whatever, so that when you accidentally write xFoo + yBar, the code clearly looks wrong.

But I've also been reading about Haskell's type system, and it seems that in Haskell, one can accomplish the same thing (i.e. "extend the concept of type to encompass semantic information") using actual types that the compiler will check for you. So in the example above, xFoo + yBar in Haskell would actually fail to compile if you designed your program correctly, since they would be declared as incompatible types. In other words, it seems like Haskell's type system effectively supports compile-time checking equivalent to Hungarian Notation.
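
Something like this is what I have in mind (I'm still learning Haskell, so take XCoord, YCoord and addX as names I've invented purely for illustration):

    -- Same underlying representation, but a distinct type per axis.
    newtype XCoord = XCoord Double deriving (Show, Eq)
    newtype YCoord = YCoord Double deriving (Show, Eq)

    -- Addition is only defined within one axis.
    addX :: XCoord -> XCoord -> XCoord
    addX (XCoord a) (XCoord b) = XCoord (a + b)

    xFoo :: XCoord
    xFoo = XCoord 3.0

    yBar :: YCoord
    yBar = YCoord 4.0

    good :: XCoord
    good = addX xFoo (XCoord 1.0)

    -- bad = addX xFoo yBar   -- rejected: couldn't match YCoord with XCoord

Both values are still just a Double underneath, but the compiler refuses to mix them.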

So, is Hungarian Notation just a band-aid for programming languages whose type systems cannot encode semantic information? Or does Hungarian Notation offer something beyond what a static type system such as Haskell's can offer?

(Of course, I'm using Haskell as an example. I'm sure there are other languages with similarly expressive (rich? strong?) type systems, though I haven't come across any.)


To be clear, I'm not talking about annotating variable names with the data type, but rather with information about the meaning of the variable in the context of the program. For example, a variable may be an integer or float or double or long or whatever, but maybe the variable's meaning is that it's a relative x-coordinate measured in inches. This is the kind of information I'm talking about encoding via Hungarian Notation (and via Haskell types).
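
For instance (again, Inches, Millimetres and RelX are names I'm making up just to illustrate), the representation can stay a plain Double while the unit and the role travel with the type:

    newtype Inches      = Inches Double      deriving (Show, Eq)
    newtype Millimetres = Millimetres Double deriving (Show, Eq)

    -- A relative x-coordinate, explicitly measured in inches.
    newtype RelX = RelX Inches deriving (Show, Eq)

    -- The only way to build a RelX from millimetres is an explicit conversion.
    mmToInches :: Millimetres -> Inches
    mmToInches (Millimetres mm) = Inches (mm / 25.4)

    nudge :: RelX -> RelX -> RelX
    nudge (RelX (Inches a)) (RelX (Inches b)) = RelX (Inches (a + b))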

Glorfindel

7 Answers


I would say "Yes".

As you say, the purpose of Hungarian Notation is to encode information in the name that cannot be encoded in the type. However, there are basically two cases:

  1. That information is important.
  2. That information is not important.

Let's start with case 2 first: if that information is not important, then Hungarian Notation is simply superfluous noise.

The more interesting case is number 1, but I would argue that if the information is important, it should be checked, i.e. it should be part of the type, not the name.

Which brings us back to the Eric Lippert quote:

extend the concept of "type" to encompass semantic information in addition to storage representation information.

Actually, that's not "extending the concept of type"; that is the concept of type! The whole purpose of types (as a design tool) is to encode semantic information! Storage representation is an implementation detail that doesn't usually belong in the type at all. (And in an OO language specifically, it cannot belong in the type, since representation independence is one of the major prerequisites for OO.)

Jörg W Mittag

The whole purpose of types (as a design tool) is to encode semantic information!

I liked this answer and wanted to follow up on it ...

I don't know anything about Haskell, but you can accomplish something like the example of xFoo + yBar in any language that supports some form of type safety such as C, C++ or Java. In C++ you could define XDir and YDir classes with overloaded '+' operators that only take objects of their own type. In C or Java, you would need to do your addition using an add() function/method instead of the '+' operator.

I have always seen Hungarian Notation used for type information, not semantics (except insofar as semantics might be represented by type). It was a convenient way to remember the type of a variable back in the days before "smart" programming editors that display the type for you, one way or another, right in the editor.

BHS

Hungarian notation was invented for BCPL, a language which didn't have types at all. Or rather, it had exactly one data type, the word. A word could be a pointer or it could be a character or boolean or a plain integer number depending on how you used it. Obviously this made it very easy to make horrible mistakes like dereferencing a character. So Hungarian notation was invented so the programmer could at least perform manual type checking by looking at the code.

C, a descendant of BCPL, has distinct types for integers, pointers, chars etc. This made the basic Hungarian notation superfluous to some extent (you didn't need to encode in the variable name whether it was an int or a pointer), but semantics beyond this level still couldn't be expressed as types. This led to the distinction between what has been called "Systems" and "Apps" Hungarian. You didn't need to express that a variable was an int, but you could use code letters to indicate whether the int was, say, an x or y coordinate or an index.

More modern languages allow definitions of custom types, which means you can encode the semantic constraints in the types, rather than in the variable names. For example, a typical OO language will have specific types for coordinate-pairs and areas, so you avoid adding an x coordinate to a y coordinate.

For example, in Joel's famous article praising Apps Hungarian, he uses the example of the prefix us for an unsafe string and s for a safe (HTML-encoded) string, in order to prevent HTML injection. The developer can prevent HTML-injection mistakes by carefully inspecting the code and ensuring that the variable prefixes match up. His example is in VBScript, a now obsolete language which didn't initially allow custom classes. In a modern language the problem can be fixed with a custom type, and indeed this is what ASP.NET does with the HtmlString class. This way the compiler will automatically find the error, which is much safer than relying on human eyeballing. So clearly a language with custom types eliminates the need for "Apps Hungarian" in this case.
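
To sketch the same idea in the question's Haskell terms (UnsafeString, SafeHtml, escape and writeHtml are names invented here; a real HtmlString is of course more elaborate): if the module only exports escape and writeHtml, an unescaped string simply cannot reach the output.

    -- UnsafeString holds raw user input; SafeHtml holds escaped text.
    newtype UnsafeString = UnsafeString String
    newtype SafeHtml     = SafeHtml String

    -- The only way to obtain a SafeHtml is to go through the escaper.
    escape :: UnsafeString -> SafeHtml
    escape (UnsafeString s) = SafeHtml (concatMap esc s)
      where
        esc '<' = "&lt;"
        esc '>' = "&gt;"
        esc '&' = "&amp;"
        esc '"' = "&quot;"
        esc c   = [c]

    -- Output only accepts SafeHtml, so passing an UnsafeString directly
    -- is a type error, not something a reviewer has to spot.
    writeHtml :: SafeHtml -> IO ()
    writeHtml (SafeHtml s) = putStrLn s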

JacquesB

I realize that the phrase "Hungarian Notation" has come to mean something different from the original, but I'll answer "no" to the question. Naming variables with either semantic or computational type does not do the same thing as SML- or Haskell-style typing. It's not even a band-aid. Taking C as an example, you could name a variable gpszTitle, but that variable might not have global scope, and it might not even be a pointer to a null-terminated string.

I think the more modern Hungarian notations diverge even further from a strong type-deduction system, because they mix "semantic" information (like "g" for global or "f" for flag) with the computational type ("p" for pointer, "i" for integer, etc.). That just ends up as an unholy mess where variable names bear only a vague resemblance to their computational type (which changes over time) and all look so similar that you can't use "next match" to find a variable in a particular function - they're all the same.

Bruce Ediger

Remember, there was a time when IDEs didn't have popup hints telling you what the type of a variable is. There was a time when IDEs didn't understand the code they were editing, so you couldn't jump from usage to declaration easily. There was also a time when you couldn't refactor a variable name without manually going through the whole codebase, making the change by hand and hoping you didn't miss one. You couldn't use search & replace, because searching for Customer also gets you CustomerName...

Back in those dark days, it was helpful to know what type a variable was where it was being used. If properly maintained (a BIG if, because of the lack of refactoring tools), Hungarian notation gave you that.

These days, the cost of the horrible names it produces is too high, but that's a relatively recent development. A lot of code still exists that predates the IDE developments I've described.

mcottle

Yes, though many languages which otherwise have strong enough type systems still have a problem - the expressibility of new types that are based on, or similar to, existing types.

I.e. in many languages where we could use the type system more, we don't, because the overhead of making a new type - one that is basically the same as an existing type apart from its name and a couple of conversion functions - is too great.

Essentially, we need some sort of strongly typed typedef to thoroughly kill Hungarian notation in these languages (F#-style units of measure could also do it).
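
In Haskell, for instance, such a typedef is roughly this cheap (a sketch; Metres and Seconds are invented names, and GeneralizedNewtypeDeriving is a GHC extension that derives the numeric instances for you):

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}

    -- Zero runtime cost, and deriving covers most of the boilerplate.
    newtype Metres  = Metres  Double deriving (Show, Eq, Ord, Num)
    newtype Seconds = Seconds Double deriving (Show, Eq, Ord, Num)

    distance :: Metres
    distance = Metres 100 + Metres 23   -- fine: each type has its own Num

    -- oops = Metres 100 + Seconds 5    -- rejected at compile time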

jk.

Correct!

Outside of totally untyped languages such as Assembler, Hungarian notation is superfluous and annoying. Doubly so when you consider that most IDEs check type safety as you, er, type.

The extra "i", "d" and "?" prefixes just make the code less readable, and can be truly misleading - as when a "cow-orker" changes the type of iaSumsItems from Integer to Long but doesn't bother refactoring the field name.