176

This is a minor niggle, but every time I have to code something like this, the repetition bothers me, but I'm not sure that any of the solutions aren't worse.

if(FileExists(file))
{
    contents = OpenFile(file); // <-- prevents inclusion in if
    if(SomeTest(contents))
    {
        DoSomething(contents);
    }
    else
    {
        DefaultAction();
    }
}
else
{
    DefaultAction();
}
  • Is there a name for this kind of logic?
  • Am I a tad too OCD?

I'm open to evil code suggestions, if only for curiosity's sake...

Deduplicator
  • 9,209
Benjol
  • 3,737

24 Answers24

106

Extract it to separate function (method) and use return statement:

if(FileExists(file))
{
    contents = OpenFile(file); // <-- prevents inclusion in if
    if(SomeTest(contents))
    {
        DoSomething(contents);
        return;
    }
}

DefaultAction();

Or, maybe better, separate getting contents and its processing:

contents_t get_contents(name_t file)
{
    if(!FileExists(file))
        return null;

    contents = OpenFile(file);
    if(!SomeTest(contents)) // like IsContentsValid
        return null;

    return contents;
}

...

contents = get_contents(file)
contents ? DoSomething(contents) : DefaultAction();

Upd:

Why not exceptions, why OpenFile doesn't throw IO exception:
I think that it's really generic question, rather than question about file IO. Names like FileExists, OpenFile can be confusing, but if to replace them with Foo, Bar, etc, - it would be clearer that DefaultAction may be called as often as DoSomething, so it may be non-exceptional case. Péter Török wrote about this at end of his answer

Why there is ternary conditional operator in 2nd variant:
If there would be [C++] tag, I'd wrote if statement with declaration of contents in its condition part:

if(contents_t contents = get_contents(file))
    DoSomething(contents);
else
    DefaultAction();

But for other (C-like) languages, if(contents) ...; else ...; is exactly the same as expression statement with ternary conditional operator, but longer. Because the main part of the code was get_contents function, I just used the shorter version (and also omitted contents type). Anyway, it's beyond this question.

Deduplicator
  • 9,209
Abyx
  • 1,445
58

If the programing language you're using (0) short circuits binary comparisons (i.e. if doesn't call SomeTest if FileExists returns false) and (1) assignment returns a value (the result of OpenFile is assigned to contents and then that value is passed as an argument to SomeTest), you can use something like the following, but it would still be advised to you to comment the code noting that the single = is intentional.

if( FileExists(file) && SomeTest(contents = OpenFile(file)) )
{
    DoSomething(contents);
}
else
{
    DefaultAction();
}

Depending on how convoluted the if is, it might be better to have a flag variable (which separates the testing of success/failure conditions with the code that handles the error DefaultAction in this case)

Deduplicator
  • 9,209
frozenkoi
  • 349
27

More seriously than the repetition of the call to DefaultAction is the style itself because the code is written non-orthogonal (see this answer for good reasons for writing orthogonally).

To show why non-orthogonal code is bad consider the original example, when a new requirement that we should not open the file if it is stored on a network disk is introduced. Well, then we could just update the code to the following:

if(FileExists(file))
{
    if(! OnNetworkDisk(file))
    {
        contents = OpenFile(file); // <-- prevents inclusion in if
        if(SomeTest(contents))
        {
            DoSomething(contents);
        }
        else
        {
            DefaultAction();
        }
    }
    else
    {
        DefaultAction();
    }
}
else
{
    DefaultAction();
}

But then there also comes a requirement that we should not open large files over 2Gb either. Well, we just update again:

if(FileExists(file))
{
    if(LessThan2Gb(file))
    {
        if(! OnNetworkDisk(file))
        {
            contents = OpenFile(file); // <-- prevents inclusion in if
            if(SomeTest(contents))
            {
                DoSomething(contents);
            }
            else
            {
                DefaultAction();
            }
        }
        else
        {
            DefaultAction();
        }
    else
    {
        DefaultAction();
    }
}
else
{
    DefaultAction();
}

It should be very clear that such code style will be a huge maintenance pain.

Among the answers here which is written properly orthogonally are Abyx' second example and Jan Hudec's answer, so I will not repeat that, just point out that adding the two requirements in those answers would just be

if(! LessThan2Gb(file))
    return null;

if(OnNetworkDisk(file))
    return null;

(or goto notexists; instead of return null;), not affecting any other code than those lines added. E.g. orthogonal.

When testing, the general rule should be to test exceptions, not the normal case.

Deduplicator
  • 9,209
hlovdal
  • 238
  • 3
  • 13
25

Obviously:

Whatever(Arguments)
{
    if(!FileExists(file))
        goto notexists;
    contents = OpenFile(file); // <-- prevents inclusion in if
    if(!SomeTest(contents))
        goto notexists;
    DoSomething(contents);
    return;
notexists:
    DefaultAction();
}

You said you are open even to evil solutions, so using evil goto counts, no?

In fact depending on context this solution might well be less evil than either evil doing the action twice or evil extra variable. I wrapped it in a function, because it would definitely not be OK in the middle of long function (not least due to the return in the middle). But than long function is not OK, period.

When you have exceptions, they will be easier to read, especially if you can have the OpenFile and DoSomething simply throw exception if the conditions are not satisfied, so you don't need explicit checks at all. On the other hand in C++, Java and C# throwing an exception is a slow operation, so from performance point, the goto is still preferable.


Note on "evil": C++ FAQ 6.15 defines "evil" as:

It means such and such is something you should avoid most of the time, but not something you should avoid all the time. For example, you will end up using these "evil" things whenever they are "the least evil of the evil alternatives."

And that applies to goto in this context. Structured flow control constructs are better most of the time, but when you get in the situation where they accumulate too many of their own evils, like assignment in condition, nesting more than about 3 levels deep, duplicating code or long conditions, goto may simply end up being less evil.

Deduplicator
  • 9,209
Jan Hudec
  • 18,410
12

One possibility:

boolean handled = false;

if(FileExists(file))
{
    contents = OpenFile(file); // <-- prevents inclusion in if
    if(SomeTest(contents))
    {
        DoSomething(contents);
        handled = true;
    }
}
if (!handled)
{
    DefaultAction();
}

Of course, this makes the code slightly more complex in a different way. So it is largely a style question.

A different approach would be using exceptions, e.g.:

try
{
    contents = OpenFile(file); // throws IO exception if file not found
    DoSomething(contents); // calls SomeTest() and throws exception on failure
}
catch(Exception e)
{
    DefaultAction();
    // and the exception should be at least logged...
}

This looks simpler, however it is only applicable if

  • we know precisely what kind of exceptions to expect, and DefaultAction() fits to each
  • we expect the file processing to succeed, and a missing file or a failing SomeTest() is clearly an erroneous condition, thus it is suitable to throw an exception on it.
Deduplicator
  • 9,209
12
function FileContentsExists(file) {
    return FileExists(file) ? OpenFile(file) : null;
}

...

contents = FileContentExists(file);
if(contents && SomeTest(contents))
{
    DoSomething(contents);
}
else
{
    DefaultAction();
}
Deduplicator
  • 9,209
herby
  • 2,774
  • 1
  • 20
  • 25
11

Functions should do one thing. They should do it well. They should do it only.
— Robert Martin in Clean Code

Some people find that approach a little extreme, but it's also very clean. Allow me to illustrate in Python:

def processFile(self):
    if self.fileMeetsTest():
        self.doSomething()
    else:
        self.defaultAction()

def fileMeetsTest(self):
    return os.path.exists(self.path) and self.contentsTest()

def contentsTest(self):
    with open(self.path) as file:
        line = file.readline()
        return self.firstLineTest(line)

When he says functions should do one thing, he means one thing. processFile() chooses an action based on the result of a test, and that's all it does. fileMeetsTest() combines all the conditions of the test, and that's all it does. contentsTest() transfers the first line to firstLineTest(), and that's all it does.

It seems like a lot of functions, but it reads practically like straight English:

To process the file, check if it meets the test. If it does, then do something. Otherwise, take the default action. The file meets the test if it exists and passes the contents test. To test the contents, open the file and test the first line. The test for the first line ...

Granted, that's a little wordy, but note that if a maintainer doesn't care about the details, he can stop reading after just the 4 lines of code in processFile(), and he will still have good high level knowledge of what the function does.

Deduplicator
  • 9,209
Karl Bielefeldt
  • 148,830
11

This is at a higher level of abstraction:

if (WeCanDoSomething(file))
{
   DoSomething(contents);
}
else
{
   DefaultAction();
} 

And this fills in the details.

boolean WeCanDoSomething(file)
{
    if FileExists(file)
    {
        contents = OpenFile(file);
        return (SomeTest(contents));
    }
    else
    {
        return FALSE;
    }
}
Deduplicator
  • 9,209
7

This conforms to DRY, no-goto and no-multiple-returns rules, is scalable and readable in my opinion:

success = FileExists(file);
if (success)
{
    contents = OpenFile(file);
    success = SomeTest(contents);
}
if (success)
{
    DoSomething(contents);
}
else
{
    DefaultAction();
}
Deduplicator
  • 9,209
mouviciel
  • 15,491
6

With regards to what this is called, it can easily develop into the arrowhead anti pattern as your code grows to handle more requirements (as shown by the answer provided at https://softwareengineering.stackexchange.com/a/122625/33922) and then falls into the trap of having huge sections of codes with nested conditional statements that resemble a arrow.

See Links such as;

http://codinghorror.com/blog/2006/01/flattening-arrow-code.html

http://lostechies.com/chrismissal/2009/05/27/anti-patterns-and-worst-practices-the-arrowhead-anti-pattern/

There's plenty more on this and other anti patterns to be found on Google.

Some great tips Jeff provides on his blog regarding this are;

1) Replace conditions with guard clauses.

2) Decompose conditional blocks into seperate functions.

3) Convert negative checks into positive checks

4) Always opportunistically return as soon as possible from the function.

See some of the comments on Jeff's blog regarding Steve McConnells suggestions on early returns also;

"Use a return when it enhances readability: In certain routines, once you know the answer, you want to return it to the calling routine immediately. If the routine is defined in such a way that it doesn't require any further cleanup once it detects an error, not returning immediately means that you have to write more code."

...

"Minimize the number of returns in each routine: It's harder to understand a routine when, reading it at the bottom, you're unaware of the possibility that it returned somewhere abouve. For that reason, use returns judiciously--only when they improve readability. "

I always subscribed to the 1 entry/exit per function theory due to what I was taught 15 years or so ago. I feel this just makes code so much easier to read and as you mention more maintainable

Mr Moose
  • 375
3

I'd extract it to a separate method and then:

if(!FileExists(file))
{
    DefaultAction();
    return;
}

contents = OpenFile(file);
if(!SomeTest(contents))
{
    DefaultAction();
    return;
}

DoSomething(contents);

which also allows for

if(!FileExists(file))
{
    DefaultAction();
    return Result.FileNotFound;
}

contents = OpenFile(file);
if(!SomeTest(contents))
{
    DefaultAction();
    return Result.TestFailed;
}

DoSomething(contents);
return Result.Success;            

then possibly you could remove the the DefaultAction calls and leave executing the DefaultAction for the caller:

Result OurMethod(file)
{
    if(!FileExists(file))
    {
        return Result.FileNotFound;
    }

    contents = OpenFile(file);
    if(!SomeTest(contents))
    {
        return Result.TestFailed;
    }

    DoSomething(contents);
    return Result.Success;            
}

void Caller()
{
    // something, something...

    var result = OurMethod(file);
    // if (result == Result.FileNotFound || result == Result.TestFailed), or just
    if (result != Result.Success)        
    {
        DefaultAction();
    }
}

I like Jeanne Pindar's approach as well.

Deduplicator
  • 9,209
3

For this particular case, the answer is easy enough...

There is a race condition between FileExists and OpenFile: what happens if the file is removed?

The only sane way to deal with this particular case is to skip FileExists:

contents = OpenFile(file);
if (!contents) // open failed
    DefaultAction();
else (SomeTest(contents))
    DoSomething(contents);

This neatly solves this problem and makes the code cleaner.

In general: Try to rethink the problem and devise another solution which avoids the issue entirely.

Deduplicator
  • 9,209
2

The case shown in the sample code can usually be reduced to a single if statement. On many systems, the file-open function will return an invalid value if the file doesn't already exist. Sometimes this is the default behavior; other times, it must be specified via an argument. This means the FileExists test can be dropped, which can also help with race conditions resulting from file deletion between the existence test and file opening.

file = OpenFile(path);
if(isValidFileHandle(file) && SomeTest(file)) {
    DoSomething(file);
} else {
    DefaultAction();
}

This doesn't directly address the abstraction-level-mixing issue as it sidesteps the issue of multiple, unchainable tests entirely, though doing away with the file existence test is not incompatible with separating abstraction levels. Assuming that invalid file handles are equivalent to "false" and file handles close when they go out of scope:

OpenFileIfSomething(path:String) : FileHandle {
    file = OpenFile(path);
    if (file && SomeTest(file)) {
        return file;
    }
    return null;
}

...

if ((file = OpenFileIfSomething(path))) {
    DoSomething(file);
} else {
    DefaultAction();
}
Deduplicator
  • 9,209
outis
  • 1,171
2

I'm in agreement with frozenkoi, however, for C# anyways, I thought it would help to follow the syntax of the TryParse methods.

if(FileExists(file) && TryOpenFile(file, out contents))
    DoSomething(contents);
else
    DefaultAction();
bool TryOpenFile(object file, out object contents)
{
    try{
        contents = OpenFile(file);
    }
    catch{
        //something bad happened, computer probably exploded
        return false;
    }
    return true;
}
Deduplicator
  • 9,209
2

Another possibility if you don't like to see too many the else's is to drop the use of else altogether and throw in an extra return statement. Else is kind of superfluous unless you require more complex logic to determine if there are more than just two action possibilities.

Thus your example might become:

void DoABunchOfStuff()
{
    if(FileExists(file))
    {
        DoSomethingWithFileContent(file);
        return;
    }

    DefaultAction();
}

void DoSomethingWithFileContent(file)
{        
    var contents = GetFileContents(file)

    if(SomeTest(contents))
    {
        DoSomething(contents);
        return;
    }

    DefaultAction();
}

AReturnType GetFileContents(file)
{
    return OpenFile(file);
}

Personally I don't mind using the else clause as it states explicitly how the logic is supposed to work, and so improves readability of your code. Some code beautification tools however prefer to simplify to a single if statement to discourage nesting logic.

Deduplicator
  • 9,209
S.Robins
  • 11,505
1

Of course you can only go so far in scenarios like these, but here's a way to go:

interface File<T> {
    function isOK():Bool;
    function getData():T;
}

var appleFile:File<Apple> = appleStorage.get(fileURI);
if (appleFile.isOK())
    eat(file.getData());
else
    cry();

You may want additional filters. Then do this:

var appleFile = appleStorage.get(fileURI, isEdible);
//isEdible is of type Apple->Bool and will be used internally to answer to the isOK call
if (appleFile.isOK())
    eat(file.getData());
else
    cry();

Although this might make sense just as well:

function eat(apple:Apple) {
     if (isEdible(apple)) 
         digest(apple);
     else
         die();
}
var appleFile = appleStorage.get(fileURI);
if (appleFile.isOK())
    eat(appleFile.getData());
else
    cry();

Which is the best? That depends on the real world problem you're facing.
But the thing to take away is: you can do a lot with composition and polymorphism.

Deduplicator
  • 9,209
back2dos
  • 30,140
1

What's wrong with the obvious

if(!FileExists(file)) {
    DefaultAction();
    return;
}
contents = OpenFile(file);
if(!SomeTest(contents))
{
    DefaultAction();
    return;
}        
DoSomething(contents);

It seems pretty standard to me? For that kind of big procedure where lots of little things have to happen, failure of any of which would prevent the latter. Exceptions make it a bit cleaner if that's an option.

Deduplicator
  • 9,209
1

Your code is ugly because you are doing too much in a single function. You want to either process the file or take the default action, so start by saying that:

if (!ProcessFile(file)) { 
  DefaultAction(); 
}

Perl and Ruby programmers write processFile(file) || defaultAction()

Now go write ProcessFile:

if (FileExists(file)) { 
  contents = OpenFile(file);
  if (SomeTest(contents)) {
    processContents(contents);
    return true;
  }
}
return false;
Deduplicator
  • 9,209
kevin cline
  • 33,798
0

How about this solution:

content = NULL; //I presume OpenFile returns a pointer 
if(FileExists(file))
    contents = OpenFile(file);
if(content != NULL && SomeTest(contents))
    DoSomething(contents);
else
    DefaultAction();

I made the assumption that OpenFile returns a pointer, but this could work also with value type return by specifying some default value not returnable (error codes or something like that).

Of course I don't expect some possible action via the SomeTest method on NULL pointer (but you never know), so this could be also seen as an extra check for NULL pointer for the SomeTest(contents) call.

Deduplicator
  • 9,209
chedi
  • 1
0

Clearly, the most elegant and concise solution is to use a preprocessor macro.

#define DOUBLE_ELSE(CODE) else { CODE } } else { CODE }

Which allows you to write beautiful code like this:

if(FileExists(file))
{
    contents = OpenFile(file);
    if(SomeTest(contents))
    {
        DoSomething(contents);
    }
    DOUBLE_ELSE(DefaultAction();)

It might be difficult to rely on automatic formatting if you use this technique often, and some IDEs might yell at you a little bit about what it erroneously supposes is malformed. And as the saying goes, everything is a tradeoff, but I suppose it's not a bad price to pay to avoid the evils of repeated code.

Deduplicator
  • 9,209
Peter Olson
  • 1,062
0

I saw a lot of examples with "return" which I use too but sometimes I want to avoid creating new functions and use a loop instead:

while (1) {
    if (FileExists(file)) {
        contents = OpenFile(file);
        if (SomeTest(contents)) {
           DoSomething(contents);
           break;
        } 
    }
    DefaultAction();
    break;
}

If you want to write less lines or you hate infinite loops as me, you can change the loop type to "do ... while(0)" and avoid the last "break".

Deduplicator
  • 9,209
XzKto
  • 101
  • 3
0

To reduce nested IF:

1/ early return;

2/ compound expression (short-circuit aware)

So, your example may be refactored like this:

if( FileExists(file) && SomeTest(contents = OpenFile(file)) )
{
    DoSomething(contents);
    return;
}
DefaultAction();
Deduplicator
  • 9,209
DQ_vn
  • 1
0

I realse that this is an old question, but I noticed a pattern that has not been mentioned; mainly, setting a variable to later determine the method/s you would like to call (outside of the if...else...).

This is just as another angle to look at to make the code easier to work with. It also allows for when you may want to add another method to be called or change the appropriate method that needs to be called in certain situations.

Rather than having to replace all mentions of the method (and maybe missing some scenarios), they are all listed at the end of the if...else... block and are simpler to read and alter. I tend to use this when for example, several methods may be called, but within the nested if...else... a method may be called in several matches.

If you set a variable that defines the state, you could have many deeply nested options, and update state when something is to be (or not to be) performed.

This could be used as in the example asked in the question where we are checking if 'DoSomething' has occurred, and if not, perform the default action. Or you could have state for each method you may want to call, set when applicable, then call the applicable method outside of the if...else...

At the end of the nested if...else... statements, you check state and act accordingly. This means you only need a single mention of a method instead of all the locations that it should be applied.

bool ActionDone = false;

if (Method_1(object_A)) // Test 1
{
    result_A = Method_2(object_A); // Result 1

    if (Method_3(result_A)) // Test 2
    {
        Method_4(result_A); // Action 1
        ActionDone = true;
    }
}

if (!ActionDone)
{
    Method_5(); // Default Action
}
Deduplicator
  • 9,209
-1

Since you asked out of curiosity, and your question is not tagged with a specific language (even though it's clear you had imperative languages in mind), it may be worth adding that languages supporting lazy evaluation allow for a complete different approach. In those languages, expressions are only evaluated when needed, so you can define "variables" and use them only when it makes sense to do so. For example, in a fictional language with lazy let/in structures you forget about flow control and write:

let
  contents = ReadFile(file)
in
  if FileExists(file) && SomeTest(contents) 
    DoSomething(contents)
  else 
    DefaultAction()
Deduplicator
  • 9,209
tokland
  • 131