32

Currently, I'm reading "Code Complete" by Steve McConnell, 24.3 chapter "Specific Refactorings". I've become kind of confused with 4th paragraph, that looks like:

Move an expression inline Replace an intermediate variable that was assigned the result of an expression with the expression itself.

Does that inline expression decrease readability of code compared to a well-named (even if it intermediate) variable? When will it be recommended to use inline expressions and when intermediate variables?

Christophe
  • 81,699
CoderDesu
  • 1,015

6 Answers6

61

Personally I prefer having temporary variables with explicit names (but don't abuse them either).

For me:

void foo()
{
    int const number_of_elements = (function_1() + function_2()) / function_3(); // Or more complex expression
write(number_of_elements);

}

is clearer than:

void foo()
{
    write((function_1() + function_2()) / function_3());
}

But if you use temporary variables for something like:

void foo()
{
    int const number_of_elements = get_number_of_elements();
write(number_of_elements);

}

This is less readable than:

void foo()
{
    write(get_number_of_elements());
}

I am not the author so I can't know what he was thinking when writing this line but it is a possibility.

The book being quite old (18 years for the second edition) there might also be historical reasons for that...

Robert Harvey
  • 200,592
f222
  • 1,040
26

I think there are two key factors here:

  • How complex is the expression?
  • How meaningful is the intermediate variable?

Take an example where we have three elements to get a final price:

total = price + tax - discount;

An intermediate variable used only once is probably a bad idea:

taxed_price = price + tax;
total = taxed_price - discount;

But if the "taxed_price" variable is actually used repeatedly in the function, then naming and reusing it might make sense.

Meanwhile, if the individual elements are all themselves expressions, the full expression becomes unwieldy inline:

total = product.get_price() + calculate_tax(jurisdiction, product) - session.get_active_discount();

In this case, there are few enough operators that we can put a line break next to each one, which helps, but it's still hard to see at a glance what the overall calculation is doing

total = product.get_price() 
    + calculate_tax(jurisdiction, product)
    - session.get_active_discount();

In this case, introducing single-use variables makes it easier to read:

price = product.get_price();
tax = calculate_tax(jurisdiction, product);
discount = session.get_active_discount();
total = price + tax - discount;

But the variables are chosen to be meaningful, not just arbitrary break points - this doesn't really help, if taxed_price is only used once:

taxed_price = product.get_price() - calculate_tax(jurisdiction, product);
discount = session.get_active_discount();
total = taxed_price - discount;

And this is almost certainly nonsensical (it's hard to even name the variable, which is always a bad smell):

price = product.get_price();
tax_minus_discount = calculate_tax(jurisdiction, product) - session.get_active_discount();
total = price + tax_minus_discount;
IMSoP
  • 5,947
16

“Refactoring” means changing what the code looks like, without changing what it does. A book about refactoring will tell you about changes you can make and how to make them without affecting what your code does.

You will often find a description how to change A to B, followed by a description how to change B to A - and your job is to find out which is actually an improvement. Sometimes A is better, sometimes B. The book tells you how to make the change, not what’s better.

gnasher729
  • 49,096
7

Local immutable variables (const in C++, final inJava, etc.) can add useful context-dependent information for the reader of the code and are useful when debugging interactively (stepping statement-by-statement). They are a trivial for the optimizer to get rid of, so thus can generally be considered a useful “zero cost abstraction”.

Otherwise, when the context is clear and debugger ergonomics is not of a concern, then they are just a syntactic noise (especially if syntactically mutable), and more of a disturbance.

So overall — a matter of taste, I’d say.

bobah
  • 718
5

No language was specified, so I'll give a few examples.

  • In languages with const correctness (e.g., C or C++) I make more intermediate variables
    • Always make intermediate variables final or const using any/all tools you have
    • in C++, use references (const Type&) if appropriate (be careful with ownership)
    • in C, use pointers (const Type* const ) if appropriate (maybe don't make something a pointer that wasn't already one)
  • Languages without that, like C# or Python, I think there's diminishing returns to having a lot of intermediate variables

Consider creating an inline variable if:

You can add context to the return value of a generic function

  • What image are we reading and why is it important?
// C#
byte[] trainingImage = ReadImage("car1.png");
byte[] validationImage = ReadImage("car2.png");

You want to cache the result of a slow or unreliable operation

  • If you really only need to query something once, do it only once.
  • Don't sprinkle extra slow or unreliable operations throughout your code, then you have to spread that dependency and its error handling throughout your code too.
  • Also remember that remote resources could "change" their answer the next time you contact them, which could lead to unexpected results.
// C#
Response telescopeResponse = RequestHubbleTelescopeData();
// do something with response

If it helps you provide better error messages

Inline expressions can make you tempted to lump all of them into the same recovery strategy even when that might not be a good fit.

Instead of one line that opens a file and parses the data,

  • Open the file in one of the lines, and be prepared to handle errors/exceptions about file I/O, file not found, etc.
  • In a separate line, process the data, and be prepared to handle exceptions about data in the wrong format, missing information, etc.

To document the code (better than comments)

From my answer to Windows API dialogs without using resource files:

Here hopefully it's clear why I am adding 1 to the length.

// C
int textLength_WithNUL = GetWindowTextLength(txtEditHandle) + 1;

From my answer to Get rid of the white space around matlab figure's pdf output

Here I unpack width/height from OldAxisPosition, so hopefully people don't need to look up what each index is.

% matlab

%% Get old axis position and inset offsets %Note that the axes and title are contained in the inset OldAxisPosition = get(AxisHandle, 'Position'); OldAxisInset = get(AxisHandle, 'TightInset');

OldAxisWidth = OldAxisPosition(3); OldAxisHeight = OldAxisPosition(4);

OldAxisInsetLeft = OldAxisInset(1); OldAxisInsetBottom = OldAxisInset(2); OldAxisInsetRight = OldAxisInset(3); OldAxisInsetTop = OldAxisInset(4);

To point out intermediate variables that somebody might want to monitor for debugging

I may want to know where the player was prior to the collision, so I can see whether SimulateCollision is working properly.

// C#
Point3D oldPosition = player.Position;
Point3D newPosition = SimulateCollision(player, otherObject);
player.Position = newPosition;

Consider not creating an inline variable if:

  • There isn't a descriptive name for the variable, or the name is very long.
    • Long names can be more annoying in languages without good linting / static type systems.
  • The name of the variable is more or less just repeating the function name and doesn't provide any other value-add
// C#
string productPrice = GetProductPrice();
  • The class property/field/etc. that you're reading from is already contextually appropriate and clear, and the property isn't slow or unreliable
// C#
string customerAddress = customer.Address;
  • You are doing math but the intermediate calculation doesn't make any sense on its own

This is a bit of a judgment call because sometimes I do this when dividing.

// C++
const float squareRootOfXPlus5 = sqrt(x + 5.0);
  • Making an intermediate variable involves an extremely expensive copy operation and there's no practical way to avoid it

  • In duck typed languages (e.g., Python), it may not be worth storing the value of a dict instead of just accessing it by key

# python
customer_name = customer['name']
jrh
  • 230
1
Customer customer = Customer.builder()
  .name("John Doe")
  .age(15)
  .address(Address.builder()
    .number(15)
    .street("Maple street")
    .build()
  )
  .build();

vs.

Address address = Address.builder()
  .number(15)
  .street("Maple street")
  .build();

Customer customer = Customer.builder() .name("John Doe") .age(15) .address(address) .build();

Imagine going overboard with the first case (deep builders) - I've seen that, what a mess.

Just treat each case separately, don't inline everything because the book said you can.

Shadov
  • 139