308

I have no idea what these are actually called, but I see them all the time. The Python implementation is something like:

x += 5 as a shorthand notation for x = x + 5.

But why is this considered good practice? I've run across it in nearly every book or programming tutorial I've read for Python, C, R, and so on. I get that it's convenient, saving three keystrokes including spaces. But they always seem to trip me up when I'm reading code, and at least to my mind, make it less readable, not more.

Am I missing some clear and obvious reason these are used all over the place?

J. Mini
  • 1,005
Fomite
  • 2,656

16 Answers

609

It's not shorthand.

The += symbol appeared in the C language in the 1970s and - in keeping with the C idea of a "smart assembler" - corresponds to a clearly different machine instruction and addressing mode:

Things like "i=i+1", "i+=1" and "++i", although at an abstract level they produce the same effect, correspond at a low level to different ways of working of the processor.

In particular, consider those three expressions, assuming that the i variable resides at the memory address stored in a CPU register (let's name it D - think of it as a "pointer to int") and that the ALU of the processor takes a parameter and returns a result in an "accumulator" (let's call it A - think of it as an int).

With these constraints (very common in the microprocessors of that period), the translation will most likely be:

;i = i+1;
MOV A,(D); //Move in A the content of the memory whose address is in D
ADD A, 1;  //The addition of an inlined constant
MOV (D) A; //Move the result back to i (this is the '=' of the expression)

;i+=1;
ADD (D),1; //Add an inlined constant to a memory address stored value

;++i;
INC (D); //Just "tick" a memory located counter

The first way of doing it is suboptimal, but it is more general when operating with variables instead of constants (ADD A, B or ADD A, (D+x)) or when translating more complex expressions (they all boil down to pushing low-priority operations onto a stack, calling the high-priority ones, popping, and repeating until all the arguments have been consumed).

The second is more typical of a "state machine": we are no longer "evaluating an expression", but "operating on a value": we still use the ALU, but avoid moving values around, since the result is allowed to replace the parameter. This kind of instruction cannot be used where a more complicated expression is required: i = 3*i + i-2 cannot be operated in place, since i is needed more than once.

The third - even simpler - does not even consider the idea of "addition", but uses a more "primitive" (in the computational sense) circuit: a counter. The instruction is shorter, loads faster and executes immediately, since the combinatorial network required to retrofit a register into a counter is smaller, and hence faster, than that of a full adder.

With contemporary compilers (still speaking of C) and optimization enabled, the correspondence can be swapped based on convenience, but there is still a conceptual difference in the semantics.

x += 5 means

  • Find the place identified by x
  • Add 5 to it

But x = x + 5 means:

  • Evaluate x+5
    • Find the place identified by x
    • Copy x into an accumulator
    • Add 5 to the accumulator
  • Store the result in x
    • Find the place identified by x
    • Copy the accumulator to it

Of course, optimization can do things like:

  • if "finding x" has no side effects, the two "findings" can be done once (and x becomes an address stored in a pointer register)
  • the two copies can be elided if the ADD is applied to &x instead of to the accumulator

thus making the optimized code coincide with the x += 5 one.

But this can be done only if "finding x" has no side effects, otherwise

*(x()) = *(x()) + 5;

and

*(x()) += 5;

are semantically different, since the side effects of x() (assuming x() is a function doing weird things and returning an int*) will be produced twice or once, respectively.

The equivalence between x = x + y and x += y is hence due to the particular case where += and = are applied to a direct l-value.
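Python gives the same single-evaluation guarantee for the target of an augmented assignment, so the difference can be observed directly. A small sketch (the idx helper is a hypothetical side-effecting function, standing in for x()):

```python
calls = []

def idx():
    # hypothetical side-effecting function, playing the role of x()
    calls.append(1)
    return 0

d = [10]
d[idx()] += 5                # augmented form: idx() is evaluated once
once = len(calls)

calls.clear()
d[idx()] = d[idx()] + 5      # spelled-out form: idx() is evaluated twice
twice = len(calls)

print(once, twice)  # 1 2
```

Both statements leave d == [20]; they differ only in how many times the target expression is evaluated.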

Moving on to Python: it inherited the syntax from C, but since there is no translation/optimization step before execution in interpreted languages, things are not necessarily so intimately related (there is one less parsing step). However, an interpreter can dispatch to different execution routines for the three types of expression, taking advantage of different machine code depending on how the expression is formed and on the evaluation context.
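In CPython specifically, the two spellings do compile to distinct bytecode, which is easy to check with the standard dis module (the function names here are just for illustration):

```python
import dis

def plain(x):
    x = x + 5
    return x

def augmented(x):
    x += 5
    return x

# Older CPython versions emit BINARY_ADD vs INPLACE_ADD; 3.11+ emits
# BINARY_OP with a different argument ("+" vs "+=") - either way, the
# instruction sequences differ.
plain_ops = [(i.opname, i.argrepr) for i in dis.get_instructions(plain)]
aug_ops = [(i.opname, i.argrepr) for i in dis.get_instructions(augmented)]

print(plain_ops != aug_ops)  # True
```

For plain ints both routes produce the same value; the distinction only becomes observable for types whose __iadd__ mutates in place.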


For those who like more detail...

Every CPU has an ALU (arithmetic-logic unit) that is, in its very essence, a combinatorial network whose inputs and outputs are "plugged" into the registers and/or memory depending on the opcode of the instruction.

Binary operations are typically implemented as "modifiers of an accumulator register, with an input taken from somewhere", where somewhere can be:

  • inside the instruction flow itself (typical for a manifest constant: ADD A 5)
  • inside another register (typical for expression computation with temporaries, e.g. ADD A B)
  • inside the memory, at an address given by a register (typical of data fetching, e.g. ADD A (H)) - H, in this case, works like a dereferencing pointer

With this pseudocode, x += 5 is

ADD (X) 5

while x = x+5 is

MOVE A (X)
ADD A 5
MOVE (X) A

That is, x+5 gives a temporary that is later assigned. x += 5 operates directly on x.

The actual implementation depends on the real instruction set of the processor: if there is no ADD (.) c opcode, the first form has to be compiled as the second - there is no other way.

If there is such an opcode, and optimizations are enabled, the second expression - after eliminating the redundant moves and adjusting the register opcodes - becomes the first.

mt3
  • 101
294

Depending on how you think about it, it's actually easier to understand because it's more straightforward. Take, for example:

x = x + 5 invokes the mental processing of "take x, add five to it, and then assign that new value back to x"

x += 5 can be thought of as "increase x by 5"

So, it's not just shorthand, it actually describes the functionality much more directly. When reading through gobs of code, it's much easier to grasp.

Eric King
  • 11,008
53

At least in Python, x += y and x = x + y can do completely different things.

For example, if we do

a = []
b = a

then a += [3] will result in a == b == [3], while a = a + [3] will result in a == [3] and b == []. That is, += modifies the object in-place (well, it might do, you can define the __iadd__ method to do pretty much anything you like), while = creates a new object and binds the variable to it.

This is very important when doing numerical work with NumPy, as you frequently end up with multiple references to different parts of an array, and it is important to make sure you don't inadvertently modify part of an array that there are other references to, or needlessly copy arrays (which can be very expensive).
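The example above can be restated as a runnable snippet:

```python
a = []
b = a            # b and a name the same list object
a += [3]         # list.__iadd__ mutates that object in place
assert a == [3] and b == [3] and a is b

a = []
b = a
a = a + [3]      # list.__add__ builds a new list; a is rebound, b is not
assert a == [3] and b == [] and a is not b
```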

James
  • 101
44

It is called an idiom. Programming idioms are useful because they are a consistent way of writing a particular programming construct.

Whenever someone writes x += y you know that x is being incremented by y and not some more complex operation (as a best practice, typically I wouldn't mix more complicated operations and these syntax shorthands). This makes the most sense when incrementing by 1.

Joe
  • 299
42

To put @Pubby's point a little more clearly, consider someObj.foo.bar.func(x, y, z).baz += 5

Without the += operator, there are two ways to go:

  1. someObj.foo.bar.func(x, y, z).baz = someObj.foo.bar.func(x, y, z).baz + 5. This is not only awfully redundant and long, it's also slower (the chain is evaluated twice). Therefore one would have to
  2. Use a temporary variable: tmp := someObj.foo.bar.func(x, y, z); tmp.baz = tmp.baz + 5. This is OK, but it's a lot of noise for a simple thing. This is actually really close to what happens at runtime, but it's tedious to write, and just using += shifts the work to the compiler/interpreter.

The advantage of += and other such operators is undeniable, while getting used to them is only a matter of time.

back2dos
  • 30,140
26

It's true that it's shorter and easier, and it's true that it was probably inspired by the underlying assembly language, but the reason it's best practice is that it prevents a whole class of errors, and it makes it easier to review the code and be sure what it does.

With

RidiculouslyComplexName += 1;

Since there's only one variable name involved, you're sure what the statement does.

With RidiculouslyComplexName = RidiculosulyComplexName + 1;

There's always doubt that the two sides are exactly the same. Did you see the bug? It gets even worse when subscripts and qualifiers are present.

Jamie Cox
  • 101
18

While the += notation is idiomatic and shorter, these are not the reasons why it is easier to read. The most important part of reading code is mapping syntax to meaning, and so the closer the syntax matches the programmer's thought processes, the more readable it will be (this is also the reason why boilerplate code is bad: it is not part of the thought process, but still necessary to make the code function). In this case, the thought is "increment variable x by 5", not "let x be the value of x plus 5".

There are other cases where a shorter notation is bad for readability, for example when you use a ternary operator where an if statement would be more appropriate.

tdammers
  • 52,936
15

For some insight to why these operators are in the 'C-style' languages to begin with, there's this excerpt from K&R 1st Edition (1978), 34 years ago:

Quite apart from conciseness, assignment operators have the advantage that they correspond better to the way people think. We say "add 2 to i" or "increment i by 2," not "take i, add 2, then put the result back in i." Thus i += 2. In addition, for a complicated expression like

yyval[yypv[p3+p4] + yypv[p1+p2]] += 2

the assignment operator makes the code easier to understand, since the reader doesn't have to check painstakingly that two long expressions are indeed the same, or wonder why they're not. And an assignment operator may even help the compiler to produce more efficient code.

I think it's clear from this passage that Brian Kernighan and Dennis Ritchie (K&R), believed that compound assignment operators helped with code readability.

It's been a long time since K&R wrote that, and a lot of the 'best practices' about how people should write code have changed or evolved since then. But this programmers.stackexchange question is the first time I can recall someone voicing a complaint about the readability of compound assignments, so I wonder whether many programmers find them to be a problem. Then again, as I type this the question has 95 upvotes, so maybe people do find them jarring when reading code.

9

Besides readability, they actually do different things: += doesn't have to evaluate its left operand twice.

For instance, expr = expr + 5 would evaluate expr twice (assuming expr is impure).

Pubby
  • 3,390
6

It's concise.

It's much shorter to type. It involves fewer operators. It has less surface area and less opportunity for confusion.

It uses a more specific operator.

This is a contrived example, and I'm not sure if actual compilers implement this. x += y actually uses one argument and one operator and modifies x in place. x = x + y could have an intermediate representation of x = z where z is x + y. The latter uses two operators, addition and assignment, and a temporary variable. The single operator makes it super clear that the value side can't be anything other than y and doesn't need to be interpreted. And there could theoretically be some fancy CPU that has a plus-equals operator that runs faster than a plus operator and an assignment operator in series.

Mark Canlas
  • 4,004
6

Besides the obvious merits that other people have described very well, when you have very long names it is more compact.

  MyVeryVeryVeryVeryVeryLongName += 1;

or

  MyVeryVeryVeryVeryVeryLongName =  MyVeryVeryVeryVeryVeryLongName + 1;

5

It is a nice idiom. Whether it is faster or not depends on the language. In C, it is faster because it translates to an instruction to increase the variable by the right hand side. Modern languages, including Python, Ruby, C, C++ and Java all support the op= syntax. It's compact, and you get used to it quickly. Since you will see it a whole lot in other peoples' code (OPC), you may as well get used to it and use it. Here is what happens in a couple of other languages.

In Python, typing x += 5 still causes the creation of a new integer object for the result (although it may be drawn from a pool) and the orphaning of the integer object that x previously referred to.

In Java, it causes a tacit cast to occur. Try typing

int x = 4;
x = x + 5.2; // This causes a compiler error
x += 5.2;    // This is not an error; an implicit cast is done.
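Python has an analogous asymmetry, though it runs in the opposite direction: the augmented form accepts operands that the plain form rejects, because list.__iadd__ behaves like extend and takes any iterable:

```python
x = [1]
x += (2, 3)          # works: list.__iadd__ extends from any iterable
assert x == [1, 2, 3]

try:
    x = x + (2,)     # list.__add__ only concatenates another list
    raised = False
except TypeError:
    raised = True
assert raised
```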
4

Operators such as += are very useful when you're using a variable as an accumulator, i.e. a running total:

x += 2;
x += 5;
x -= 3;

Is a lot easier to read than:

x = x + 2;
x = x + 5;
x = x - 3;

In the first case, conceptually, you're modifying the value in x. In the second case, you're computing a new value and assigning it to x each time. And while you'd probably never write code that's quite that simple, the idea remains the same... the focus is on what you're doing to an existing value instead of creating some new value.

Caleb
  • 39,298
1

Consider this

(some_object[index])->some_other_object[more] += 5

Do you really want to write

(some_object[index])->some_other_object[more] = (some_object[index])->some_other_object[more] + 5
S.Lott
  • 45,522
1

Say it once and only once: in x = x + 1, I say 'x' twice.

But do not ever write, 'a = b +=1' or we will have to kill 10 kittens, 27 mice, a dog and a hamster.


You should never change the value of a variable, as immutability makes it easier to prove the code is correct — see functional programming. However, if you do, then it is better to say things only once.

1

The other answers cover the more common cases, but there is another reason: in some programming languages, += can be overloaded separately from +; e.g. in Scala.


Small Scala lesson:

var j = 5 // Creates a variable
j += 4    // Compiles

val i = 5 // Creates a constant
i += 4    // Doesn't compile

If a class only defines the + operator, then x += y is indeed a shortcut for x = x + y.

If a class overloads +=, however, they are not:

var a = ""
a += "This works. a now points to a new String."

val b = ""
b += "This doesn't compile, as b cannot be reassigned."

val c = new StringBuilder() // scala.collection.mutable.StringBuilder defines “+=(c: Char)”
c += '!' // This works even on a val, because += here is a method call, not a reassignment.

Additionally, + and += are two separate operators (and not the only ones: +a, ++a, a++, a+b and a += b are all different operators as well); in languages where operator overloading is available this can create interesting situations. Just as described above, if you overload the + operator to perform addition, bear in mind that += will have to be overloaded as well.
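Python draws the same distinction through its data model: a += b calls __iadd__ if the class defines it, and otherwise falls back to a = a + b via __add__. A minimal sketch with hypothetical classes:

```python
class OnlyAdd:
    """Defines only +; += falls back to rebinding the name."""
    def __init__(self, v):
        self.v = v
    def __add__(self, other):
        return OnlyAdd(self.v + other)

class InPlace(OnlyAdd):
    """Also defines +=; it mutates the object and returns self."""
    def __iadd__(self, other):
        self.v += other
        return self

a = OnlyAdd(1)
before = a
a += 2                      # fallback: a new object is created, a is rebound
assert a.v == 3 and a is not before

b = InPlace(1)
before = b
b += 2                      # __iadd__: the same object, mutated in place
assert b.v == 3 and b is before
```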