Using compound statements ("{" ... "}" blocks) to enforce variable locality

Question

Introduction

Many "C-like" programming languages use compound statements (code blocks delimited by "{" and "}") to define a variable's scope.

Here is a simple example.

for (int i = 0; i < 100; ++i) {
    int value = function(i); // Here 'value' is local to this block
    printf("function(%d) == %d\n", i, value);
}

This is good because it limits the scope of the value to where it is used. It is hard for programmers to use value in ways that they are not meant to because they can only access it from within its scope.

I am sure almost all of you are aware of this and agree that it is good practice to declare variables in the block where they are used, to limit their scope.

But even though it is an established convention to declare variables in their smallest possible scope, it is not very common to use a naked compound statement (that is, a compound statement that is not attached to an if, for, or while statement).

Swapping the values of two variables

Programmers often write the code like this:

int x = ???
int y = ???
// Swap x and y
int tmp = x;
x = y;
y = tmp;

Would it not be better to write the code like this:

int x = ???
int y = ???
// Swap x and y
{
    int tmp = x;
    x = y;
    y = tmp;
}

It looks quite ugly but I think that this is a good way to enforce variable locality and make the code safer to use.

This does not only apply to temporaries

I often see similar patterns where a variable is used once in a function

Object function(ParameterType arg) {
    Object obj = new Object(arg);
    File file = File.open("output.txt", "w+");
    file.write(obj.toString());
// `obj` is used more here but `file` is never used again.
...

}

Why don't we write it like this?

Object function(ParameterType arg) {
    Object obj = new Object(arg);
    {
       File file = File.open("output.txt", "w+");
       file.write(obj.toString());
    }
// `obj` is used more here but `file` is never used again.
...

}

Summary of question

It is hard to come up with good examples. I am sure that there are better ways to write the code in my examples, but that is not what this question is about.

My question is why we do not use "naked" compound statements more to limit the scope of variables.

What do you think about using a compound statement like this

{
    int tmp = x;
    x = y;
    y = tmp;
}

to limit the scope of tmp?

Is it good practice? Is it bad practice? Explain your thoughts.

score 72 · Answer 1 · answered Nov 10 '15 at 14:19

It is indeed a good practice to keep your variable's scope small. However, introducing anonymous blocks into large methods only solves half the problem: the scope of the variables shrinks, but the method (slightly) grows!

The solution is obvious: what you wanted to do in an anonymous block, you should be doing in a method. The method gets its own block and its own scope automatically, and with a meaningful name you get better documentation out of it, too.

score 22 · Answer 2 · answered Nov 10 '15 at 14:21

Often if you find places to create such a scope it's an opportunity to extract out a function.

In a language with pass-by-reference you would instead call swap(x,y).
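As a rough sketch of what that could look like in C++ (the name swap_ints is made up for illustration; the standard library already provides std::swap in <utility>), the temporary lives only inside the function body:

```cpp
// Hypothetical helper for illustration; in real code, prefer std::swap.
void swap_ints(int& a, int& b) {
    int tmp = a;  // tmp is local to this function and invisible to callers
    a = b;
    b = tmp;
}
```

A call site then reads swap_ints(x, y); and never sees tmp at all.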

For writing the file, it would be advisable to use the block so that RAII closes the file and frees up the resources as soon as possible.
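A minimal C++ sketch of the RAII point, assuming a hypothetical helper write_then_read and a caller-supplied path (neither is from the question): the std::ofstream destructor runs at the closing brace, so the file is flushed and closed before it is read back.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: writes `text` to `path` inside a naked block,
// then reads the file back once the block has closed it.
std::string write_then_read(const std::string& path, const std::string& text) {
    {
        std::ofstream file(path);  // file is opened here
        file << text;
    }  // ~ofstream runs here: the buffer is flushed and the file is closed

    std::ifstream in(path);
    std::stringstream contents;
    contents << in.rdbuf();  // copy the whole file into the string stream
    return contents.str();
}
```

Without the block, the output stream would stay open until the end of the function, and the read could see an empty or partial file.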

score 13 · Answer 3 · edited Nov 10 '15 at 21:15

I assume you do not know, yet, of Expression-Oriented languages?

In Expression-Oriented languages, (nearly) everything is an expression. This means, for example, that a block can be an expression as in Rust:

// A typical function displaying x^3 + x^2 + x
fn typical_print_x3_x2_x(x: i32) {
    let y = x * x * x + x * x + x;
    println!("{}", y);
}

You may be worried that the compiler will redundantly compute x * x, though, and decide to memoize the result:

// A memoizing function displaying x^3 + x^2 + x
fn memoizing_print_x3_x2_x(x: i32) {
    let x2 = x * x;
    let y = x * x2 + x2 + x;
    println!("{}", y);
}

However, now x2 clearly outlives its usefulness. Well, in Rust a block is an expression which returns the value of its last expression (not of its last statement, so avoid a closing ;):

// An expression-oriented function displaying x^3 + x^2 + x
fn expr_print_x3_x2_x(x: i32) {
    let y = {
        let x2 = x * x;
        x * x2 + x2 + x // <- missing semi-colon, this is an expression
    };
    println!("{}", y);
}

Thus I would say that newer languages have recognized the importance of limiting the scope of variables (makes things cleaner), and are increasingly offering facilities to do so.

Even notable C++ experts such as Herb Sutter are recommending anonymous blocks of this kind, "hacking" lambdas to initialize constant variables (because immutable is great):

int32_t const y = [&]{
    int32_t const x2 = x * x;
    return x * x2 + x2 + x;
}(); // do not forget to actually invoke the lambda with ()

score 8 · Answer 4 · edited May 23 '17 at 12:40

Congratulations, you've got scope isolation of some trivial variables in a large and complex function.

Unfortunately, you've got a large and complex function. Instead of creating a scope within the function for the variable, the proper thing to do is to extract that code into its own function. This encourages reuse of the code, and it lets the values from the enclosing scope be passed in as parameters instead of acting as pseudo-global variables for that anonymous scope.

With an unnamed block, everything outside of the block is still in scope inside the block. You are effectively programming with globals and unnamed functions run serially, without any way to reuse code.

Furthermore, consider that in many runtimes, all variables declared within anonymous scopes are allocated at the top of the method.

Let's look at some C# (https://dotnetfiddle.net/QKtaG4):

using System;

public class Program
{
    public static void Main()
    {
        string h = "hello";
        string w = "world";
        {
            int i = 42;
            Console.WriteLine(i);
        }
        {
            int j = 4;
            Console.WriteLine(j);
        }
        Console.WriteLine(h + w);
    }
}

And when you start digging into the IL for the code (with dotnetfiddle, under 'tidy up' there is also 'View IL'), right there at the top it shows what has been allocated for this method:

.class public auto ansi beforefieldinit Program
    extends [mscorlib]System.Object
{
    .method public hidebysig static void Main() cil managed
    {
      //
      .maxstack 2
      .locals init (string V_0,
          string V_1,
          int32 V_2,
          int32 V_3)

This allocates the space for two strings and two integers at the initialization of the method Main, even though only one int32 is in scope at any given time. You can see a more in-depth analysis of this, on the tangential topic of initializing variables, in the question "Foreach loop and variable initialization".

Let's look at some Java code instead. This should look rather familiar.

public class Main {
    public static void main(String[] args) {
        String h = "hello";
        String w = "world";
        {
            int i = 42;
            System.out.println(i);
        }
        {
            int j = 4;
            System.out.println(j);
        }
        System.out.println(h + w);
    }
}

Compile this with javac, then run javap -v Main.class to get the output of the Java class file disassembler.

Right there at the top, it tells you how many slots it needs for local variables (the question "output of javap command" explains the parts of this output in a bit more detail):

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=4, args_size=1

It needs four slots. Even though two of these local variables are never in the scope at the same time, it still allocates room for four local variables.

On the other hand, extracting short methods (such as a 'swap') gives the compiler the option of inlining them, and it will then do a better job of making optimal use of variables.

Laurent LA RIZZA · Answer 5 · 2015-11-11T10:47:34.493

Good practice. Period. Full stop. You make it explicit to future readers that the variable is not used anywhere else, enforced by the rules of the language.

This is provisional politeness for the future maintainers, because you tell them something you know. These facts could take dozens of minutes to determine in a sufficiently lengthy or convoluted function.

The seconds you'll spend documenting this powerful fact by adding a pair of braces will pay off minutes for each maintainer who will not have to determine this fact.

The future maintainer can be yourself, years later (because you're the only one who knows this code), with other things on your mind (because the project has long been over and you've been reassigned since), and under pressure (because we're doing this graciously for the customer, you know, they've been long-time partners, you know, and we had to smooth out our commercial relationship with them because the last project we delivered was not that up to expectations, you know, and oh, it is just a small change and it should not take that long for an expert like you, and we've got no budget so don't do "overquality").

You'd hate yourself for not doing it.

score 3 · Answer 6 · answered Nov 10 '15 at 21:39

Yes, naked blocks may limit variable scopes further than you would normally limit them. However:

1. The only gain is an earlier end of the variable's lifetime; you can easily and entirely unobtrusively limit the beginning of its lifetime by moving the declaration down to the appropriate place. This is common practice, so we are already halfway to where naked blocks can get you, and this half comes for free.
2. Typically, each variable has its own place where it ceases to be useful, just like each variable has its own place where it should be declared. So, if you wanted to limit all variables' scopes as much as possible, you would end up with a block for every single variable in many cases.
3. A naked block introduces additional structure to the function, making its operation harder to grasp. Together with point 2, this can render a function with fully restricted scopes pretty near unreadable.

So, I guess, the bottom line is that naked blocks simply don't pay off in the vast majority of cases. There are cases where a naked block may make sense because you need to control where a destructor gets called, or because you have several variables with nearly identical end of use. Yet, especially in the latter case, you should definitely think about factoring the block out into its own function. The situations that are best solved by a naked block are extremely rare.
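To illustrate the destructor-control case, here is a small C++ sketch. The names pending_mutex, pending and push_and_count are invented for the example; the point is only that std::lock_guard releases the mutex at the closing brace of the naked block, not at the end of the function.

```cpp
#include <mutex>
#include <vector>

std::mutex pending_mutex;   // hypothetical shared state, for illustration only
std::vector<int> pending;

// Appends a value under the lock and returns the size seen while locked.
int push_and_count(int value) {
    int size_after_push = 0;
    {
        std::lock_guard<std::mutex> lock(pending_mutex);
        pending.push_back(value);
        size_after_push = static_cast<int>(pending.size());
    }  // ~lock_guard runs here, so the mutex is released at this brace

    // Anything placed here runs without holding the mutex.
    return size_after_push;
}
```

Factoring the locked part out into its own function would release the mutex at the same point, which is the alternative this answer recommends considering.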

score 3 · Answer 7 · answered Nov 11 '15 at 00:13

Yes, naked blocks are very rarely seen, and I think that's because they are very rarely needed.

One place I do use them myself is in switch statements:

case 'a': {
  int x = getAmbientTemperature();
  int y = getBackgroundIllumination();
  setVasculosity(x * y);
  break;
}
case 'b': {
  int x = getUltrification();
  int y = getMendacity();
  setVasculosity(x + y);
  break;
}

I tend to do this whenever I'm declaring variables within the branches of a switch, because it keeps the variables out of scope of subsequent branches.

score 2 · Answer 8 · answered Nov 10 '15 at 16:01

Additional blocks/scopes add additional verbiage. While the idea carries some initial appeal, you'll soon run into situations where it becomes unfeasible anyway because you have temporary variables with overlapping scope that cannot be properly reflected by a block structure.

So since you cannot make the block structure consistently reflect variable lifetimes anyway as soon as things get slightly more complex, bending over backwards for the simple cases where it does work seems like an exercise in eventual futility.

For data structures with destructors that lock up significant resources during their lifetime, there is the possibility to call the destructor explicitly. As opposed to using a block structure, this does not mandate a particular order of destruction for variables introduced at different points of time.
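A hedged C++ sketch of overlapping lifetimes that no block nesting can express. One caveat: in C++, calling a destructor directly on an automatic variable would make it run a second time at scope exit, so this sketch swaps in std::unique_ptr::reset() as the explicit release. Resource, events and overlapping_lifetimes are names made up for the example.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> events;  // hypothetical log of acquire/release order

struct Resource {
    std::string name;
    explicit Resource(std::string n) : name(std::move(n)) {
        events.push_back("acquire " + name);
    }
    ~Resource() { events.push_back("release " + name); }
};

void overlapping_lifetimes() {
    auto a = std::make_unique<Resource>("a");  // a acquired first
    auto b = std::make_unique<Resource>("b");  // b acquired while a is alive
    a.reset();  // a released first, which nested blocks could not express
    b.reset();  // b released afterwards
}
```

With nested blocks, the resource acquired first would necessarily be released last.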

Of course blocks have their uses when they are tied with logical units: particularly when using macro programming, the scope of any variable not explicitly named as macro argument intended for output is best restricted to the macro body itself in order not to cause surprises when using a macro several times.
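For the macro case, a sketch along these lines shows why the block matters. SWAP_INT, swap_tmp and demo_swap_twice are hypothetical names, and the do { ... } while (0) wrapper is the usual idiom for making a macro body behave like a single statement.

```cpp
// The temporary is confined to the macro body's block, so the macro can be
// expanded several times in one function without redeclaration errors.
#define SWAP_INT(a, b)      \
    do {                    \
        int swap_tmp = (a); \
        (a) = (b);          \
        (b) = swap_tmp;     \
    } while (0)

// Uses the macro twice in the same scope.
int demo_swap_twice() {
    int x = 1, y = 2, z = 3;
    SWAP_INT(x, y);  // x = 2, y = 1
    SWAP_INT(y, z);  // y = 3, z = 1
    return x * 100 + y * 10 + z;
}
```

Without the block, the second expansion would declare swap_tmp again in the same scope and fail to compile.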

But as lifetime markers for variables in sequential execution, blocks tend to be overkill.

Mike Nakis · Accepted Answer · 2015-12-13T01:12:26.960

The following Java code shows what I believe to be one of the best examples of how naked blocks can be useful.

As you can see, the compareTo() method has three comparisons to make, and the results of the first two need to be temporarily stored in a local. The local is just a 'difference' in both cases, but re-using the same local variable is a bad idea, and as a matter of fact on a decent IDE you can configure local variable reuse to cause a warning.

class MemberPosition implements Comparable<MemberPosition>
{
    final int derivationDepth;
    final int lineNumber;
    final int columnNumber;

    MemberPosition( int derivationDepth, int lineNumber, int columnNumber )
    {
        this.derivationDepth = derivationDepth;
        this.lineNumber = lineNumber;
        this.columnNumber = columnNumber;
    }

    @Override
    public int compareTo( MemberPosition o )
    {
        /* first, compare by derivation depth, so that all ancestor methods will be executed before all descendant methods. */
        {
            int d = Integer.compare( derivationDepth, o.derivationDepth );
            if( d != 0 )
                return d;
        }

        /* then, compare by line number, so that methods will be executed in the order in which they appear in the source file. */
        {
            int d = Integer.compare( lineNumber, o.lineNumber );
            if( d != 0 )
                return d;
        }

        /* finally, compare by column number.  You know, just in case you have multiple test methods on the same line.  Whatever. */
        return Integer.compare( columnNumber, o.columnNumber );
    }
}

Note how in this particular case you cannot offload the work to a separate function, as Kilian Foth's answer suggests. So, in cases like this, naked blocks are always my preference. But even in cases where you can in fact move the code to a separate function, I prefer a) keeping things in one place so as to minimize the scrolling necessary in order to make sense of a piece of code, and b) not bloating my code with lots of functions. Definitely a good practice.

(Side note: one of the reasons why egyptian curly bracket style truly sucks is that it does not work with naked blocks.)

(Another side note that I remembered now, a month later: the code above is in Java, but I actually picked up the habit from my days of C++, where the closing curly bracket causes destructors to be invoked. This is the RAII bit that rachet freak also mentions in his answer, and it is not just a good thing, it is pretty much indispensable.)
