43

If I want to compare two numbers (or other well-ordered entities), I would do so with x < y. If I want to compare three of them, the high-school algebra student will suggest trying x < y < z. The programmer in me will then respond with "no, that's not valid, you have to do x < y && y < z".

Most languages I've come across don't seem to support this syntax, which is odd given how common it is in mathematics. Python is a notable exception. JavaScript looks like an exception, but it's really just an unfortunate by-product of operator precedence and implicit conversions; in node.js, 1 < 3 < 2 evaluates to true, because it's really (1 < 3) < 2 === true < 2 === 1 < 2.
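To make the two readings concrete, here is a minimal sketch in Python, where the chained form has the mathematical meaning and the explicitly parenthesized form mimics the left-associative reading that JavaScript applies. The variable names are only for illustration.

```python
# Chained form: Python reads this as (1 < 3) and (3 < 2).
chained = 1 < 3 < 2

# Explicitly grouped form: (1 < 3) is True, and True compares as the
# integer 1, so this becomes 1 < 2, just like the JavaScript case above.
grouped = (1 < 3) < 2

print(chained)  # False
print(grouped)  # True
```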

So, my question is this: Why is x < y < z not commonly available in programming languages, with the expected semantics?

Aaron Hall
  • 6,003
JesseTG
  • 657

10 Answers

43

Why is x < y < z not commonly available in programming languages?

In this answer I conclude that

  • although this construct is trivial to implement in a language's grammar and creates value for language users,
  • the primary reasons that this does not exist in most languages are its low importance relative to other features and the unwillingness of the languages' governing bodies to either
    • upset users with potentially breaking changes, or
    • move to implement the feature (i.e., laziness).

Introduction

I can speak from a Pythonist's perspective on this question. I am a user of a language with this feature and I like to study the implementation details of the language. Beyond this, I am somewhat familiar with the process of changing languages like C and C++ (the ISO standards are governed by committee and versioned by year), and I have watched both Ruby and Python implement breaking changes.

Python's documentation and implementation

From the docs/grammar, we see that we can chain any number of expressions with comparison operators:

comparison    ::=  or_expr ( comp_operator or_expr )*
comp_operator ::=  "<" | ">" | "==" | ">=" | "<=" | "!="
                   | "is" ["not"] | ["not"] "in"

and the documentation further states:

Comparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y and y <= z, except that y is evaluated only once (but in both cases z is not evaluated at all when x < y is found to be false).

Logical Equivalence

So

result = (x < y <= z)

is logically equivalent to the following expansion in terms of how x, y, and z are evaluated, except that the expansion evaluates y twice:

x_lessthan_y = (x < y)
if x_lessthan_y:       # z is evaluated contingent on x < y being True
    y_lessequal_z = (y <= z)
    result = y_lessequal_z
else:
    result = x_lessthan_y

Again, the difference is that y is evaluated only one time with (x < y <= z).

(Note, the parentheses are completely unnecessary and redundant, but I used them for the benefit of those coming from other languages, and the above code is quite legal Python.)
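The evaluation rules quoted from the documentation can be observed directly. The following is a small sketch, not anything from the Python docs: the `traced` helper and the `calls` list are names I made up to record the order in which operands are evaluated.

```python
calls = []

def traced(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value

# Chained form: y is evaluated once; z is evaluated because x < y holds.
calls.clear()
chained_result = traced("x", 1) < traced("y", 2) <= traced("z", 3)
chained_calls = list(calls)      # ['x', 'y', 'z']

# Expanded form: y appears twice in the source, so it is evaluated twice.
calls.clear()
expanded_result = (traced("x", 1) < traced("y", 2)
                   and traced("y", 2) <= traced("z", 3))
expanded_calls = list(calls)     # ['x', 'y', 'y', 'z']

# Short circuit: x < y is false, so z is never evaluated.
calls.clear()
short_result = traced("x", 5) < traced("y", 2) <= traced("z", 3)
short_calls = list(calls)        # ['x', 'y']
```

The difference only matters when evaluating y is expensive or has side effects, but that is exactly the case where the chained form is safer than the hand-written expansion.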

Inspecting the parsed Abstract Syntax Tree

We can inspect how Python parses chained comparison operators:

>>> import ast
>>> node_obj = ast.parse('"foo" < "bar" <= "baz"')
>>> ast.dump(node_obj)
"Module(body=[Expr(value=Compare(left=Str(s='foo'), ops=[Lt(), LtE()],
 comparators=[Str(s='bar'), Str(s='baz')]))])"

So we can see that this really isn't difficult for Python or any other language to parse.

>>> ast.dump(node_obj, annotate_fields=False)
"Module([Expr(Compare(Str('foo'), [Lt(), LtE()], [Str('bar'), Str('baz')]))])"
>>> ast.dump(ast.parse("'foo' < 'bar' <= 'baz' >= 'quux'"), annotate_fields=False)
"Module([Expr(Compare(Str('foo'), [Lt(), LtE(), GtE()], [Str('bar'), Str('baz'), Str('quux')]))])"

And contrary to the currently accepted answer, the "ternary" operation is really a generic comparison operation that takes the first expression, an iterable of specific comparison operators, and an iterable of expression nodes to evaluate as necessary. Simple.
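To illustrate how little machinery such a node needs, here is a rough sketch of an evaluator for a Compare node. It is my own simplification, not CPython's implementation: `OPS` and `eval_compare` are made-up names, operands are restricted to literals so that `ast.literal_eval` can evaluate them, and the result is always a plain bool. (Recent Python versions also report string literals as Constant nodes rather than Str, so the dumps above will look slightly different there.)

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
OPS = {
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}

def eval_compare(node):
    """Evaluate an ast.Compare node whose operands are all literals."""
    left = ast.literal_eval(node.left)
    for op, comparator in zip(node.ops, node.comparators):
        right = ast.literal_eval(comparator)  # evaluated only when reached
        if not OPS[type(op)](left, right):
            return False                      # short-circuit the rest
        left = right                          # reuse the value, no re-evaluation
    return True

compare = ast.parse("'foo' < 'bar' <= 'baz'", mode="eval").body
print(type(compare).__name__)                 # Compare
print([type(op).__name__ for op in compare.ops])  # ['Lt', 'LtE']
print(eval_compare(compare))                  # False, since 'foo' > 'bar'
```

The loop is the whole trick: one left operand, then a list of (operator, operand) pairs walked in order.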

Conclusion on Python

I personally find the range semantics to be quite elegant, and most Python professionals I know would encourage the usage of the feature, instead of considering it damaging - the semantics are quite clearly stated in the well-reputed documentation (as noted above).

Note that code is read much more than it is written. Changes that improve the readability of code should be embraced, not discounted by raising generic specters of Fear, Uncertainty, and Doubt.

So why is x < y < z not commonly available in programming languages?

I think there is a confluence of reasons that center around the relative importance of the feature and the relative momentum/inertia of change allowed by the governors of the languages.

Similar questions can be asked about other more important language features

Why isn't multiple inheritance available in Java or C#? There is no good answer here to either question. Perhaps the developers were too lazy, as Bob Martin alleges, and the reasons given are merely excuses. And multiple inheritance is a pretty big topic in computer science. It is certainly more important than operator chaining.

To quote James Gosling, who gives no further explanation:

JAVA omits many rarely used, poorly understood, confusing features of C++ that in our experience bring more grief than benefit. This primarily consists of operator overloading (although it does have method overloading), multiple inheritance, and extensive automatic coercions.

And these words attributed to Chris Brumme, after citing the amount of work to determine the right way to do it, user complexity, and difficulties in implementing:

It's not at all clear that this feature would pay for itself. It's something we are often asked about. It's something we haven't done due diligence on. But my gut tells me that, after we've done a deep examination, we'll still decide to leave the feature unimplemented.

These aren't great answers. Python has had multiple inheritance for a long time, it's well studied - it seems to me these are just implementations that need working out now. Is the conclusion right for the language? Maybe. It does limit the expressiveness of the languages, though.

Simple workarounds exist

Comparison operator chaining is elegant, but by no means as important as multiple inheritance. And just as Java and C# have interfaces as a workaround, so does every language for multiple comparisons - you simply chain the comparisons with boolean "and"s, which works easily enough.

Most languages are governed by committee

Most languages are evolving by committee (rather than having a sensible Benevolent Dictator For Life like Python has). And I speculate that this issue just hasn't seen enough support to make it out of its respective committees.

Can the languages that don't offer this feature change?

If a language already accepts x < y < z without the expected mathematical semantics, giving it those semantics would be a breaking change. If the language didn't allow the expression in the first place, the feature would be almost trivial to add.

Breaking changes

Regarding the languages with breaking changes: we do update languages with breaking behavior changes - but users tend to not like this, especially users of features that may be broken. If a user is relying on the former behavior of x < y < z, they would likely loudly protest. And since most languages are governed by committee, I doubt we would get much political will to support such a change.

Aaron Hall
  • 6,003
41

These are binary operators, which when chained, normally and naturally produce an abstract syntax tree like:

[Figure: normal abstract syntax tree for binary operators]

When evaluated (which you do from the leaves up), this produces a boolean result from x < y, then you get a type error trying to do boolean < z. In order for x < y < z to work as you discussed, you have to create a special case in the compiler to produce a syntax tree like:

[Figure: special case syntax tree]

Not that it isn't possible to do this. It obviously is, but it adds some complexity to the parser for a case that doesn't really come up that often. You're basically creating a symbol that sometimes acts like a binary operator and sometimes effectively acts like a ternary operator, with all the implications of error handling and such that entails. That adds a lot of space for things to go wrong that language designers would rather avoid if possible.

Karl Bielefeldt
  • 148,830
14

Computer languages try to define the smallest possible units and let you combine them. The smallest possible unit would be something like x < y which gives a boolean result.

You may ask for a ternary operator. An example would be x < y < z. Now what combinations of operators do we allow? Obviously x > y > z or x >= y >= z or x > y >= z or maybe x == y == z should be allowed. What about x < y > z ? x != y != z ? What does the last one mean, x != y and y != z or that all three are different?
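For what it's worth, Python answers these questions by always taking the pairwise reading, which can be surprising for != and for mixed directions. A short sketch with made-up values:

```python
x, y, z = 1, 2, 1

# Pairwise reading: (x != y) and (y != z). x and z are never compared.
pairwise = x != y != z

# "All three are different" needs a separate test, e.g. via a set.
all_distinct = len({x, y, z}) == 3

a, b, c = 1, 5, 2

# Mixed directions are accepted too: (a < b) and (b > c).
peak = a < b > c

print(pairwise)      # True, even though x == z
print(all_distinct)  # False
print(peak)          # True
```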

Now argument promotions: In C or C++, arguments would be promoted to a common type. So what does x < y < z mean if x is double but y and z are long long int? All three promoted to double? Or y is taken as double once and as long long int the other time? What happens if in C++ one or both of the operators are overloaded?

And last, do you allow any number of operands? Like a < b > c < d > e < f > g ?

Well, it all gets very complicated. Now what I wouldn't mind is x < y < z producing a syntax error. Because the usefulness of it is small compared to the damage done to beginners who can't figure out what x < y < z actually does.

gnasher729
  • 49,096
10

In many programming languages, x < y is a binary expression that accepts two operands and evaluates to a single boolean result. Therefore, if chaining multiple expressions, true < z and false < z won't make sense, and if those expressions successfully evaluate, they're likely to produce the wrong result.

It's much easier to think of x < y as a function call that takes two parameters and produces a single boolean result. In fact, that's how many languages implement it under the hood. It's composable, easily compilable, and it just works.

The x < y < z scenario is much more complicated. Now the compiler, in effect, has to fashion three functions: x < y, y < z, and the result of those two values anded together, all within the context of an arguably ambiguous language grammar.

Why did they do it the other way? Because it is unambiguous grammar, much easier to implement, and much easier to get correct.

Robert Harvey
  • 200,592
8

Most mainstream languages are (at least partially) object-oriented. Fundamentally, the underlying principle of OO is that objects send messages to other objects (or themselves), and the receiver of that message has complete control over how to respond to that message.

Now, let's see how we would implement something like

a < b < c

We could evaluate it strictly left-to-right (left-associative):

a.__lt__(b).__lt__(c)

But now we call __lt__ on the result of a.__lt__(b), which is a Boolean. That makes no sense.

Let's try right-associative:

a.__lt__(b.__lt__(c))

Nah, that doesn't make sense either. Now, we have a < (something that's a Boolean).

Okay, what about treating it as syntactic sugar? Let's make a chain of n operands joined by < send an (n-1)-ary message. This could mean we send the message __lt__ to a, passing b and c as arguments:

a.__lt__(b, c)

Okay, that works, but there is a strange asymmetry here: a gets to decide whether it is less than b, but b doesn't get to decide whether it is less than c; instead, that decision is also made by a.

What about interpreting it as an n-ary message send to this?

this.__lt__(a, b, c)

Finally! This can work. It means, however, that the ordering of objects is no longer a property of the object (e.g. whether a is less than b is neither a property of a nor of b) but instead a property of the context (i.e. this).

From a mainstream standpoint that seems weird. However, e.g. in Haskell, that's normal. There can be multiple different implementations of the Ord typeclass, for example, and whether or not a is less than b, depends on which typeclass instance happens to be in scope.

But actually, it is not that weird at all! Both Java (Comparator) and .NET (IComparer) have interfaces that allow you to inject your own ordering relation into e.g. sorting algorithms. Thus, they fully acknowledge that an ordering is not something that is fixed to a type but instead depends on context.

As far as I know, there are currently no languages that perform such a translation. There is a precedent, however: both Ioke and Seph have what their designer calls "trinary operators" – operators which are syntactically binary, but semantically ternary. In particular,

a = b

is not interpreted as sending the message = to a passing b as argument, but rather as sending the message = to the "current Ground" (a concept similar but not identical to this) passing a and b as arguments. So, a = b is interpreted as

=(a, b)

and not

a =(b)

This could easily be generalized to n-ary operators.

Note that this is really peculiar to OO languages. In OO, we always have one single object which is ultimately responsible for interpreting a message send, and as we have seen, it is not immediately obvious for something like a < b < c which object that should be.
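For reference, Python (which does support chaining) does not pick a single receiver at all: it sends one ordinary binary message per adjacent pair and combines the results. The sketch below makes that visible; the `Tracked` class and its `log` are hypothetical names used only for this illustration.

```python
class Tracked:
    """Wraps a value and records every __lt__ message it receives."""
    log = []

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __lt__(self, other):
        Tracked.log.append(f"{self.name}.__lt__({other.name})")
        return self.value < other.value

a, b, c = Tracked("a", 1), Tracked("b", 2), Tracked("c", 3)
result = a < b < c

print(Tracked.log)  # ['a.__lt__(b)', 'b.__lt__(c)']
print(result)       # True
```

So a decides whether it is less than b, and b decides whether it is less than c; no object ever sees all three operands.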

This doesn't apply to procedural or functional languages though. For example, in Scheme, Common Lisp, and Clojure, the < function is n-ary, and can be called with an arbitrary number of arguments.

In particular, < does not mean "less than", rather these functions are interpreted slightly differently:

(<  a b c d) ; the sequence a, b, c, d is monotonically increasing
(>  a b c d) ; the sequence a, b, c, d is monotonically decreasing
(<= a b c d) ; the sequence a, b, c, d is monotonically non-decreasing
(>= a b c d) ; the sequence a, b, c, d is monotonically non-increasing
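
The same "monotonic sequence" reading can be sketched as an ordinary n-ary function in Python. The names `monotonic`, `strictly_increasing` and `non_decreasing` are my own and not part of any library; note also that, as with the Lisp functions, all arguments are evaluated before the call, so there is no short-circuiting of operand evaluation.

```python
import operator

def monotonic(op, *xs):
    """True if op holds between every adjacent pair of xs."""
    return all(op(a, b) for a, b in zip(xs, xs[1:]))

def strictly_increasing(*xs):
    return monotonic(operator.lt, *xs)

def non_decreasing(*xs):
    return monotonic(operator.le, *xs)

print(strictly_increasing(1, 2, 3, 4))  # True
print(strictly_increasing(1, 2, 2))     # False
print(non_decreasing(1, 2, 2))          # True
```
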
Jörg W Mittag
  • 104,619
4

It's simply because the language designers didn't think of it or didn't think it was a good idea. Python does it as you described with a simple (almost) LL(1) grammar.

Neil G
  • 448
4

The following C++ program compiles with nary a peep from clang, even with warnings set to the highest possible level (-Weverything):

#include <iostream>
int main () { std::cout << (1 < 3 < 2) << '\n'; }

The gnu compiler suite on the other hand nicely warns me that comparisons like 'X<=Y<=Z' do not have their mathematical meaning [-Wparentheses].

So, my question is this: why is x < y < z not commonly available in programming languages, with the expected semantics?

The answer is simple: Backwards compatibility. There is a vast amount of code out in the wild that uses the equivalent of 1<3<2 and expects the result to be true-ish.

A language designer has but one chance at getting this "right", and that is the point in time the language is first designed. Getting it "wrong" initially means that other programmers will rather quickly take advantage of that "wrong" behavior. Getting it "right" the second time around will break that existing code base.

David Hammen
  • 8,391
1

The short answer: Because C did not have it.

The majority of today's mainstream languages have inherited the set of operators and the rules for operator precedence from C. This is the case for C++, Java, C#, JavaScript and many others.

Python, on the other hand, is not directly derived from C syntax. And several other non-C-derived languages support comparison chaining similar to Python, for example Perl, Raku and Julia. SQL supports a limited version with the between operator.

Of course this just raises the question of why C didn't have this syntax. I don't know if the designers of C even considered the syntax, but I doubt it. x > y > z would basically be syntactic sugar over x > y && y > z and would compile down to the same machine code. C was designed to be minimal and doesn't generally have a lot of syntactic sugar. (It might seem that way for e.g. the x++ operator, which is equivalent to x += 1, but processors tend to have dedicated increment instructions, which are faster than a general addition, so a distinct operator is justified in this case.)

But the descendant languages like C++ or C# do not have the same focus on minimalism and could have added the syntax along the way. And a bit of googling shows it has indeed been considered by several languages:

C#: Proposal: Chained Comparisons #2954 (https://github.com/dotnet/csharplang/issues/2954).

C++ Proposal: Chaining Comparisons (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0893r0.html)

I'm sure there are others.

The discussions show there is no fundamental objection to the feature; it is just a question of whether the benefit outweighs the cost of adding a new feature. For C++ there is also the issue of backwards compatibility, since x > y > z is already a legal expression.


I also want to add a non-reason: Some other answers claim that it is somehow especially complicated to parse an expression with three or more operands. This is not an explanation. All parsers, even for a simple language like C, can parse a function call with an argument list of arbitrary length no problem.
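To back this up, here is a rough sketch of a parser for the rule `comparison ::= operand (comp_operator operand)*`, where operands are limited to bare identifiers or numbers to keep it short. Everything here (`tokenize`, `parse_comparison`, the tuple shape of the result) is invented for the illustration and not taken from any real compiler.

```python
import re

TOKEN = re.compile(r"\s*(<=|>=|==|!=|<|>|\w+)")
COMPARISON_OPS = {"<", ">", "<=", ">=", "==", "!="}

def tokenize(text):
    """Split the input into operand and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_comparison(tokens):
    """Collect `operand (op operand)*` into one flat 'compare' node."""
    if not tokens or tokens[0] in COMPARISON_OPS:
        raise SyntaxError("expected an operand")
    first, ops, rest = tokens[0], [], []
    pos = 1
    while pos < len(tokens):
        if tokens[pos] not in COMPARISON_OPS:
            raise SyntaxError(f"expected a comparison operator, got {tokens[pos]!r}")
        if pos + 1 >= len(tokens) or tokens[pos + 1] in COMPARISON_OPS:
            raise SyntaxError("expected an operand after the operator")
        ops.append(tokens[pos])
        rest.append(tokens[pos + 1])
        pos += 2
    return ("compare", first, ops, rest)

print(parse_comparison(tokenize("x < y <= z")))
# ('compare', 'x', ['<', '<='], ['y', 'z'])
```

It is the same loop a parser would use for a comma-separated argument list, with the operators collected alongside the operands.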

JacquesB
  • 61,955
0

Stated simply: x < y < z will be interpreted either as (x < y) < z or x < (y < z).

In either case, the (parenthesized subexpression) will be evaluated first and will produce a boolean result: "either it is less or it isn't."

Leaving either: (boolean) < z or x < (boolean) ... both of which would be considered wrong.

The trivially equivalent "right way" works just as well, so that is how we say it: (x < y) && (y < z)

0

In languages with operator overloading, this is possible.

So to see what it would be like, I implemented this in Swift:

precedencegroup ComparisonPrecedence {
    associativity: left
    higherThan: AdditionPrecedence
}

infix operator < : ComparisonPrecedence

// Start the chain, (C, C) -> ComparableWrapper<C>
func < <C: Comparable>(lhs: C, rhs: C) -> ComparableWrapper<C> {
    return ComparableWrapper<C>.startChain(lhs: lhs, rhs: rhs)
}

// Extend the chain, (ComparableWrapper<C>, C) -> ComparableWrapper<C>
func < <C: Comparable>(lhs: ComparableWrapper<C>, rhs: C) -> ComparableWrapper<C> {
    return ComparableWrapper<C>.extendChain(lhs: lhs, rhs: rhs)
}

// Terminate the chain, (ComparableWrapper<C>, C) -> Bool
func < <C: Comparable>(lhs: ComparableWrapper<C>, rhs: C) -> Bool {
    return ComparableWrapper<C>.terminateChain(lhs: lhs, rhs: rhs)
}

// A wrapper object which represents a node in a left-associative chain of `a < b < ... < z`
// Evaluating from left-to-right, it stores the largest value seen so far, and whether or not
// the elements so far have been strictly-increasing
struct ComparableWrapper<C: Comparable> {
    // The largest value in the chain so far, unless isTrueSoFar is false,
    // in which case it won't be needed anymore, so it's some arbitrary value in the chain,
    // to prevent needless further comparisons
    let value: C
    let isTrueSoFar: Bool   

    init(largerValue value: C, isTrueSoFar: Bool) {
        self.value = value
        self.isTrueSoFar = isTrueSoFar
    }

    static func startChain(lhs: C, rhs: C) -> ComparableWrapper<C> {
        if lhs < rhs {
            return ComparableWrapper(largerValue: rhs, isTrueSoFar: true)
        }
        else {
            return ComparableWrapper(largerValue: lhs, isTrueSoFar: false)
        }
    }

    static func extendChain(lhs: ComparableWrapper<C>, rhs: C) -> ComparableWrapper<C> {
        if lhs.isTrueSoFar {
            // The chain stays true only if the previous value is strictly less than rhs.
            // The explicit Bool annotation selects the standard (C, C) -> Bool overload of <.
            let stillIncreasing: Bool = lhs.value < rhs
            return ComparableWrapper(largerValue: rhs, isTrueSoFar: stillIncreasing)
        }
        else {
            // Don't even bother comparing, the result will be false eventually anyway.
            return ComparableWrapper(largerValue: rhs, isTrueSoFar: false)
        }
    }

    static func terminateChain(lhs: ComparableWrapper<C>, rhs: C) -> Bool {
        return lhs.isTrueSoFar && lhs.value < rhs
    }
}



let x = 3

if 1 < 2 < x < 4 < 5 {
    print("true!")
}

let start = ComparableWrapper<Int>.startChain
let extend = ComparableWrapper<Int>.extendChain
let terminate = ComparableWrapper<Int>.terminateChain


let boolResult = terminate(extend(extend(start(1, 2), 3), 4), 5)
print(boolResult)

I define a new infix < operator that shadows the built-in one. Rather than giving it just one type, (Value, Value) -> Bool, I define 3 separate overloads for it. I make the 3 overloads call 3 differently named methods, so as to be able to talk about them unambiguously.

The above code parses like:

terminate(extend(extend(start(1, 2), 3), 4), 5)

There are several problems with this code:

  1. If the standard library were to do this, 1 < 2 would become ambiguous without context. The result could either be Bool or ComparableWrapper<Int>.

    • It could parse as the regular comparison operator (Int, Int) -> Bool
    • Or it could parse as startChain(1, 2), a function of type (Int, Int) -> ComparableWrapper<Int>

    Special rules could be introduced to disambiguate it, but adding special cases to a language always has trade-offs.

  2. This is already kind of hairy, and it only supports <, not <=, >, >= and ==.

  3. It introduces a lot of overloads, which slows down type checking.
Alexander
  • 5,185