106

I'm working through the book "Head First Python" (it's my language to learn this year) and I got to a section where they argue about two code techniques:
Checking First vs Exception handling.

Here is a sample of the Python code:

# Checking First
for eachLine in open("../../data/sketch.txt"):
    if eachLine.find(":") != -1:
        (role, lineSpoken) = eachLine.split(":",1)
        print("role=%(role)s lineSpoken=%(lineSpoken)s" % locals())

# Exception handling        
for eachLine in open("../../data/sketch.txt"):
    try:
        (role, lineSpoken) = eachLine.split(":",1)
        print("role=%(role)s lineSpoken=%(lineSpoken)s" % locals())
    except:
        pass

The first example deals directly with a problem in the .split function. The second one just lets the exception handler deal with it (and ignores the problem).

They argue in the book to use exception handling instead of checking first. The argument is that the exception code will catch all errors, where checking first will only catch the things you think about (and you miss the corner cases). I have been taught to check first, so my initial instinct was to do that, but their idea is interesting. I had never thought of using exception handling to deal with such cases.

Which of the two is generally considered the better practice?

Deduplicator
  • 9,209
jmq
  • 6,108

9 Answers

82

In Python in particular, it is usually considered better practice to catch the exception. This style tends to get called Easier to Ask for Forgiveness than Permission (EAFP), as opposed to Look Before You Leap (LBYL). There are cases where LBYL will give you subtle bugs, for example when the check and the operation disagree about what counts as valid input.

However, do be careful of bare except: statements as well as overbroad except statements, since they can both also mask bugs - something like this would be better:

for eachLine in open("../../data/sketch.txt"):
    try:
        role, lineSpoken = eachLine.split(":",1)
    except ValueError:
        pass
    else:
        print("role=%(role)s lineSpoken=%(lineSpoken)s" % locals())
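As an illustration of the kind of subtle bug LBYL can produce, here is a minimal sketch. The function names `parse_lbyl` and `parse_eafp` are made up for this example and are not from the book. The point is that the pre-check (`str.isdigit`) and the real operation (`int`) do not agree on what is valid:

```python
def parse_lbyl(text):
    """Look before you leap: check first, then convert."""
    if text.isdigit():
        return int(text)
    return None


def parse_eafp(text):
    """Ask forgiveness: let int() decide what is valid."""
    try:
        return int(text)
    except ValueError:
        return None


# "-5" is a valid integer literal, but isdigit() is False for it,
# so the LBYL version silently rejects good input.
print(parse_lbyl("-5"), parse_eafp("-5"))

# The superscript "²" passes isdigit(), but int() refuses it,
# so the LBYL version raises even though it "checked first".
print(parse_eafp("²"))
```

The EAFP version cannot drift out of sync with `int`, because `int` itself is the check.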
lvc
  • 1,016
  • 6
  • 3
75

In .NET, it is common practice to avoid the overuse of Exceptions. One argument is performance: in .NET, throwing an exception is computationally expensive.

Another reason to avoid their overuse is that it can be very difficult to read code that relies too much on them. Joel Spolsky's blog entry does a good job of describing the issue.

At the heart of the argument is the following quote:

The reasoning is that I consider exceptions to be no better than "goto's", considered harmful since the 1960s, in that they create an abrupt jump from one point of code to another. In fact they are significantly worse than goto's:

1. They are invisible in the source code. Looking at a block of code, including functions which may or may not throw exceptions, there is no way to see which exceptions might be thrown and from where. This means that even careful code inspection doesn't reveal potential bugs.

2. They create too many possible exit points for a function. To write correct code, you really have to think about every possible code path through your function. Every time you call a function that can raise an exception and don't catch it on the spot, you create opportunities for surprise bugs caused by functions that terminated abruptly, leaving data in an inconsistent state, or other code paths that you didn't think about.

Personally, I throw exceptions when my code can't do what it is contracted to do. I tend to use try/catch when I'm about to deal with something outside of my process boundary, for instance a SOAP call, a database call, file IO, or a system call. Otherwise, I attempt to code defensively. It's not a hard and fast rule, but it is a general practice.

Scott Hanselman also writes about exceptions in .NET here. In this article he describes several rules of thumb regarding exceptions. My favourite?

You shouldn't throw exceptions for things that happen all the time. Then they'd be "ordinaries".

Kyle
  • 2,793
30

A Pragmatic Approach

You should be defensive but to a point. You should write exception handling but to a point. I'm going to use web programming as an example because this is where I live.

  1. Assume all user input is bad and write defensively only to the point of data type verification, pattern checks, and checks for malicious injection. Defensive programming should cover things that can happen very often and that you cannot control.

  2. Write exception handling for networked services that may fail at times, and handle those failures gracefully so the user gets feedback. Exception handling should be used for networked things that may fail from time to time but are usually solid AND you need to keep your program working.

  3. Don't bother to write defensively within your application after the input data has been validated. It's a waste of time and bloats your app. Let it blow up because it's either something very rare that isn't worth handling or it means you need to look at steps 1 and 2 more carefully.

  4. Never write exception handling within your core code that is not dependent on a networked device. Doing so is bad programming and costly to performance. For example, writing a try-catch for an out-of-bounds array access in a loop means you didn't program the loop correctly in the first place.

  5. Let everything else be handled by central error logging that catches exceptions in one place after following the above procedures. You cannot catch every edge case, as there may be infinitely many; you only need to write code that handles expected operation. That's why you use central error handling as the last resort.

  6. TDD is nice because, in a way, it does the try-catching for you without the bloat, giving you some assurance of normal operation.

  7. A bonus point: use a code coverage tool (for example, Istanbul is a good one for Node), as this shows you where you aren't testing.

  8. The caveat to all of this is developer-friendly exceptions. For example, a language would throw if you used the syntax wrong and explain why. So should your utility libraries that the bulk of your code depends on.

This is from experience working in large team scenarios.
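To make steps 1, 3 and 5 concrete, here is a rough sketch in Python (the question's language, not the web stack this answer is based on). Every name in it (`validate_age`, `years_to_retirement`, `handle_request`, the `"app"` logger) is invented for illustration:

```python
import logging

logger = logging.getLogger("app")  # hypothetical application logger


def validate_age(raw):
    """Step 1: defensive check at the boundary, on raw user input only."""
    if not raw.strip().isdecimal():
        raise ValueError(f"age must be a whole number, got {raw!r}")
    age = int(raw)
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return age


def years_to_retirement(age, retirement_age=65):
    """Step 3: core logic trusts validated input and does no re-checking."""
    return max(retirement_age - age, 0)


def handle_request(raw_age):
    """Step 5: one central place that turns failures into a response."""
    try:
        age = validate_age(raw_age)
        return {"ok": True, "years": years_to_retirement(age)}
    except ValueError as exc:
        # Expected: bad user input, reported back to the user.
        return {"ok": False, "error": str(exc)}
    except Exception:
        # Unexpected: log with traceback, keep the program running.
        logger.exception("unhandled error while handling request")
        return {"ok": False, "error": "internal error"}
```

The core function stays free of try/except and re-validation; only the entry point knows about errors.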

An Analogy

Imagine if you wore a spacesuit inside the ISS ALL the time. It would be hard to go to the bathroom or eat, at all. It would be super bulky inside the space module to move around. It would suck. Writing a bunch of try-catches inside your code is kind of like that. You have to have some point where you say, hey I secured the ISS and my astronauts inside are OK so it's just not practical to wear a spacesuit for every scenario that could possibly happen.

Milan
  • 117
17

The book's main argument is that the exception version of the code is better because it will catch anything that you might have overlooked if you tried to write your own error checking.

I think this statement is true only in very specific circumstances - where you don't care if the output is correct.

There is no doubt that raising exceptions is a sound and safe practice. You should do so whenever you feel there's something in the current state of the program that you (as a developer) cannot, or don't want to, deal with.

Your example, however, is about catching exceptions. If you catch an exception, you're not protecting yourself from scenarios you might have overlooked. You are doing precisely the opposite: you assume that you haven't overlooked any scenario that might have caused this type of exception, and therefore you're confident that it's alright to catch it (and thus prevent it from causing the program to exit, as any uncaught exception would).

Using the exception approach, if you see a ValueError exception, you skip a line. Using the traditional non-exception approach, you count the number of values returned by split, and if it's less than 2, you skip a line. Should you feel more secure with the exception approach, since you may have forgotten some other "error" situations in your traditional error check, and except ValueError would catch them for you?

This depends on the nature of your program.

If you're writing, for example, a web browser or a video player, a problem with inputs should not cause it to crash with an uncaught exception. It's far better to output something remotely sensible (even if, strictly speaking, incorrect) than to quit.

If you're writing an application where correctness matters (such as business or engineering software), this would be a terrible approach. If you forgot about some scenario that raises ValueError, the worst thing you can do is to silently ignore this unknown scenario and simply skip the line. That's how very subtle and costly bugs end up in software.

You might think that the only way you can see ValueError in this code is if split returned only one value (instead of two). But what if your print statement later starts using an expression that raises ValueError under some conditions? This will cause you to skip some lines not because they lack a :, but because print fails on them. This is an example of the subtle bug I was referring to earlier: you would not notice anything, just lose some lines.

My recommendation is to avoid catching (but not raising!) exceptions in the code where producing incorrect output is worse than exiting. The only time I'd catch an exception in such code is when I have a truly trivial expression, so I can easily reason what may cause each of the possible exception types.

As to the performance impact of using exceptions, it is trivial (in Python) unless exceptions are encountered frequently.

If you do use exceptions to handle routinely occurring conditions, you may in some cases pay a huge performance cost. For example, suppose you remotely execute some command. You could check that your command text passes at least the minimum validation (e.g., syntax). Or you could wait for an exception to be raised (which happens only after the remote server parses your command and finds a problem with it). Obviously, the former is orders of magnitude faster. Another simple example: you can check whether a number is zero ~10 times faster than trying to execute the division and then catching ZeroDivisionError exception.

These considerations only matter if you frequently send malformed command strings to remote servers or receive zero-valued arguments which you use for division.
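If you want to measure the division example on your own machine, a sketch along these lines should do. The function names are invented here, and no timings are quoted because they depend on the interpreter and hardware:

```python
import timeit


def divide_lbyl(a, b):
    """Check for zero before dividing."""
    if b == 0:
        return None
    return a / b


def divide_eafp(a, b):
    """Divide and catch the exception if the divisor is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


if __name__ == "__main__":
    # b == 0 exercises the exception path; b == 3 exercises the normal path.
    for b in (0, 3):
        t_check = timeit.timeit(lambda: divide_lbyl(1, b), number=100_000)
        t_catch = timeit.timeit(lambda: divide_eafp(1, b), number=100_000)
        print(f"b={b}: check first {t_check:.4f}s, catch exception {t_catch:.4f}s")
```

Comparing the two rows shows how much of the cost comes from the exception actually being raised, as opposed to merely having a try block in place.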

Note: I assume you would use except ValueError instead of just except; as others pointed out, and as the book itself says a few pages later, you should never use a bare except.

Another note: the proper non-exception approach is to count the number of values returned by split, rather than search for :. The latter is far too slow, since it repeats the work done by split and may nearly double the execution time.
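A minimal sketch of that counting approach, with `parse_lines` as a made-up helper name. Each line is split once, and the number of parts decides whether the line is used:

```python
def parse_lines(lines):
    """Return (role, line_spoken) pairs, skipping lines without a colon."""
    parsed = []
    for each_line in lines:
        parts = each_line.split(":", 1)
        if len(parts) != 2:
            # No ":" in this line, so split returned a single value.
            continue
        role, line_spoken = parts
        parsed.append((role, line_spoken))
    return parsed


print(parse_lines(["Man: Hello", "(pause)", "a:b:c"]))
```

Because nothing here is wrapped in try/except, a ValueError raised by any later processing would surface instead of silently dropping the line.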

max
  • 1,115
8

As a general rule, if you know a statement could generate an invalid result, test for that and deal with it. Use exceptions for things you do not expect; stuff that is "exceptional". It makes the code clearer in a contractual sense ("should not be null" as an example).

Ian
  • 5,452
1

Use whatever works well in:

  • your chosen programming language in terms of code readability and efficiency
  • your team and the set of agreed code conventions

Both exception handling and defensive programming are different ways of expressing the same intent.

Sri
  • 129
0

TBH, it doesn't matter if you use the try/except mechanic or an if statement check. You commonly see both EAFP and LBYL in most Python code bases, with EAFP being slightly more common. Sometimes EAFP is much more readable/idiomatic, but in this particular case I think it's fine either way.

However...

I'd be careful using your current reference. A couple of glaring issues with their code:

  1. The file descriptor is leaked. Modern versions of CPython (a specific Python interpreter) will actually close it, since it's an anonymous object that's only in scope during the loop (gc will nuke it after the loop). However, other interpreters do not have this guarantee. They may leak the descriptor outright. You almost always want to use the with idiom when reading files in Python: there are very few exceptions. This isn't one of them.
  2. Pokemon exception handling is frowned upon as it masks errors (i.e. bare except statement that doesn't catch a specific exception)
  3. Nit: You don't need parens for tuple unpacking. Can just do role, lineSpoken = eachLine.split(":",1)

lvc has a good answer about this and EAFP, but the code in that answer also leaks the descriptor.
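Putting the three points together, a version of the loop might look roughly like this. The function name `read_roles` and the UTF-8 encoding are assumptions made for the sketch, not something taken from the book:

```python
def read_roles(path):
    """Return (role, line_spoken) pairs from a sketch file."""
    results = []
    # The with block closes the file on every interpreter, even on error.
    with open(path, encoding="utf-8") as sketch:
        for each_line in sketch:
            try:
                # No parens needed for tuple unpacking.
                role, line_spoken = each_line.split(":", 1)
            except ValueError:
                # Only the "no colon in this line" case is swallowed.
                continue
            results.append((role, line_spoken.strip()))
    return results
```

Usage would be something like `read_roles("../../data/sketch.txt")`, using the path from the question.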

The LBYL version is not necessarily as performant as the EAFP version, so saying that throwing exceptions is "expensive in terms of performance" is categorically false. It really depends on the type of strings you're processing:

In [33]: def lbyl(lines):
    ...:     for line in lines:
    ...:         if line.find(":") != -1:
    ...:             # Nuke the parens, do tuple unpacking like an idiomatic Python dev.
    ...:             role, lineSpoken = line.split(":",1)
    ...:             # no print, since output is obnoxiously long with %timeit
    ...:

In [34]: def eafp(lines):
    ...:     for line in lines:
    ...:         try:
    ...:             # Nuke the parens, do tuple unpacking like an idiomatic Python dev.
    ...:             role, lineSpoken = eachLine.split(":",1)
    ...:             # no print, since output is obnoxiously long with %timeit
    ...:         except:
    ...:             pass
    ...:

In [35]: lines = ["abc:def", "onetwothree", "xyz:hij"]

In [36]: %timeit lbyl(lines)
100000 loops, best of 3: 1.96 µs per loop

In [37]: %timeit eafp(lines)
100000 loops, best of 3: 4.02 µs per loop

In [38]: lines = ["a"*100000 + ":" + "b", "onetwothree", "abconetwothree"*100]

In [39]: %timeit lbyl(lines)
10000 loops, best of 3: 119 µs per loop

In [40]: %timeit eafp(lines)
100000 loops, best of 3: 4.2 µs per loop
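Note that eafp as typed above splits eachLine rather than its own loop variable line, and uses a bare except. If you want to rerun the comparison yourself, here is a sketch with the loop variable used consistently and only ValueError caught. It uses the standard timeit module instead of IPython's %timeit, and the returned counts are an addition of this sketch so the two functions can be checked against each other. No timings are shown, since they will vary by machine and Python version:

```python
import timeit


def lbyl(lines):
    """Check for ':' first, then split. Returns how many lines parsed."""
    parsed = 0
    for line in lines:
        if line.find(":") != -1:
            role, line_spoken = line.split(":", 1)
            parsed += 1
    return parsed


def eafp(lines):
    """Split and catch ValueError. Returns how many lines parsed."""
    parsed = 0
    for line in lines:
        try:
            role, line_spoken = line.split(":", 1)
        except ValueError:
            continue
        parsed += 1
    return parsed


SHORT = ["abc:def", "onetwothree", "xyz:hij"]
LONG = ["a" * 100000 + ":" + "b", "onetwothree", "abconetwothree" * 100]

if __name__ == "__main__":
    runs = 1000
    for name, data in (("short", SHORT), ("long", LONG)):
        for func in (lbyl, eafp):
            seconds = timeit.timeit(lambda: func(data), number=runs)
            print(f"{name:5} {func.__name__:4} {seconds / runs * 1e6:.2f} us per call")
```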
-5

Basically, exception handling is supposed to be more appropriate for OOP languages.

The second point is performance, because you don't have to execute eachLine.find for every line.

Elalfer
  • 99
-6

I think defensive programming hurts performance. You should also catch only the exceptions you are going to handle; let the runtime deal with the exceptions you don't know how to handle.

Manoj
  • 141
  • 2