43

When you create an extension method you can, of course, call it on null. But, unlike an instance method call, calling it on null doesn't have to throw a NullReferenceException; you have to check and throw manually.

For the implementation of the LINQ extension method Any(), Microsoft decided that it should throw an ArgumentNullException (https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/AnyAll.cs).
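
To make the mechanics concrete, here is a minimal sketch of two extension methods, one that tolerates a null receiver and one that checks and throws by hand. The names SequenceExtensions, IsMissing and HasElements are made up for illustration; this is not the actual LINQ source.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Hypothetical extension that tolerates a null receiver. Nothing throws,
    // because the call compiles to SequenceExtensions.IsMissing(source).
    public static bool IsMissing<T>(this IEnumerable<T> source)
    {
        return source == null;
    }

    // Hypothetical extension that rejects a null receiver.
    // The check and the throw are written by hand.
    public static bool HasElements<T>(this IEnumerable<T> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }

        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            return enumerator.MoveNext();
        }
    }
}

public static class ExtensionOnNullDemo
{
    public static void Main()
    {
        int[] numbers = null;

        // The call itself succeeds even though numbers is null.
        Console.WriteLine(numbers.IsMissing());

        try
        {
            numbers.HasElements();
        }
        catch (ArgumentNullException ex)
        {
            // Reports the name of the parameter that was null.
            Console.WriteLine(ex.ParamName);
        }
    }
}
```

Whether a null receiver is accepted is therefore purely a decision of whoever writes the extension method.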

It irks me to have to write if( myCollection != null && myCollection.Any() )

Am I wrong, as a client of this code, to expect that e.g. ((int[])null).Any() should return false?

Buh Buh
  • 303

13 Answers

162

I have a bag with five potatoes in it. Are there .Any() potatoes in the bag?

"Yes," you say. <= true

I take all of the potatoes out and eat them. Are there .Any() potatoes in the bag?

"No," you say. <= false

I completely incinerate the bag in a fire. Are there .Any() potatoes in the bag now?

"There is no bag." <= ArgumentNullException

Dan Wilson
  • 3,143
55

First off, it appears that that source code will throw ArgumentNullException, not NullReferenceException.

Having said that, in many cases you already know that your collection is not null, because this code is only called from code that knows that the collection already exists, so you won't have to put the null check in there very often. But if you don't know that it exists, then it does make sense to check before calling Any() on it.

Am I wrong, as a client of this code, to expect that e.g. ((int[])null).Any() should return false?

Yes. The question that Any() answers is "does this collection contain any elements?" If this collection does not exist, then the question itself is nonsensical; it can neither contain nor not-contain anything, because it doesn't exist.

Mason Wheeler
  • 83,213
22

Null means missing information, not no elements.

You might consider more broadly avoiding null — for example, use one of the built-in empty enumerables to represent a collection with no elements instead of null.

If you are returning null in some circumstances, you might change that to return the empty collection. (Otherwise, if you're finding nulls returned by library methods that are not yours, that's unfortunate, and I would wrap them to normalize.)

See also https://stackoverflow.com/questions/1191919/what-does-linq-return-when-the-results-are-empty

Erik Eidt
  • 34,819
14

Aside from the null-conditional syntax, there is another technique to alleviate this problem: don't let your variable ever remain null.

Consider a function that accepts a collection as a parameter. If for the purposes of the function, null and empty are equivalent, you can ensure that it never contains null at the beginning:

public MyResult DoSomething(int a, IEnumerable<string> words)
{
    words = words ?? Enumerable.Empty<string>();

    if (!words.Any())
    {
        ...

You can do the same when you fetch collections from some other method:

var words = GetWords() ?? Enumerable.Empty<string>();

(Note that in cases where you have control over a function like GetWords and null is equivalent to the empty collection, it is preferable to just return the empty collection in the first place.)

Now you may perform any operation you wish on the collection. This is especially helpful if you need to perform many operations that would fail when the collection is null. In cases where looping over or querying an empty enumerable gives the same result, it lets you eliminate the if conditions entirely.

jpmc26
  • 5,489
13

Am I wrong, as a client of this code, to expect that e.g. ((int[])null).Any() should return false?

Yes, simply because you're in C# and that behavior is well defined and documented.

If you were making your own library, or if you were using a different language with a different exception culture, then it would be more reasonable to expect false.

Personally I feel that returning false is the safer approach and makes your program more robust, but that is debatable at least.

Telastyn
  • 110,259
11

If the repeated null-checks annoy you, you could create your own 'IsNullOrEmpty()' extension method, to mirror the String method by that name, and wrap both the null-check and the .Any() call into a single call.
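
A minimal sketch of such a helper, assuming you are happy to treat null and empty as the same thing. The class name EnumerableNullExtensions is made up; the framework does not ship this method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableNullExtensions
{
    // Mirrors string.IsNullOrEmpty: true when the sequence is null or has no elements.
    public static bool IsNullOrEmpty<T>(this IEnumerable<T> source)
    {
        return source == null || !source.Any();
    }
}

public static class IsNullOrEmptyDemo
{
    public static void Main()
    {
        List<string> myCollection = null;

        // Replaces: if (myCollection != null && myCollection.Any())
        if (!myCollection.IsNullOrEmpty())
        {
            Console.WriteLine("has elements");
        }
        else
        {
            Console.WriteLine("null or empty");
        }
    }
}
```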

Otherwise, the solution, mentioned by @17 of 26 in a comment under your question, is shorter than the 'standard' method as well, and reasonably clear to anyone familiar with the new null-conditional syntax.

if(myCollection?.Any() == true)
9

Am I wrong, as a client of this code, to expect that e.g. ((int[])null).Any() should return false?

If you wonder about expectations you have to think about intentions.

null means something very different from Enumerable.Empty<T>

As mentioned in Erik Eidt's answer, there is a difference in meaning between null and an empty collection.

Let's first glance at how they are supposed to be used.

The book Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries, 2nd Edition written by Microsoft architects Krzysztof Cwalina and Brad Abrams states the following best practice:

X DO NOT return null values from collection properties or from methods returning collections. Return an empty collection or an empty array instead.

Consider calling a method that ultimately gets its data from a database: if you receive an empty array or Enumerable.Empty<T>, this simply means your sample space is empty, i.e. your result is an empty set. Receiving null in this context, however, would signify an error state.

In the same line of thinking as Dan Wilson's potato analogy, it makes sense to ask questions about your data even if it is an empty set. But it makes a lot less sense if there is no set.

3

There are many answers explaining why null and empty are different, and plenty of opinions about whether they should be treated differently. However, you're asking:

Am I wrong, as a client of this code, to expect that e.g. ((int[])null).Any() should return false?

It's a perfectly reasonable expectation. You're as right as someone advocating for the current behavior. I agree with the current implementation's philosophy, but the driving factors are not based only on out-of-context considerations.

Given that Any() without a predicate is essentially Count() > 0, what do you expect from this snippet?

List<int> list = null;
if (list.Count > 0) {}

Or a generic:

List<int> list = null;
if (list.Foo()) {}

I suppose you expect NullReferenceException.

  • Any() is an extension method. It should integrate smoothly with the extended object, so throwing an exception is the least surprising behavior.
  • Not every .NET language supports extension methods.
  • You can always call Enumerable.Any(null), and there you definitely expect ArgumentNullException. It's the same method, and it has to be consistent with almost everything else in the Framework.
  • Accessing a null object is a programming error; the framework should not enforce null as a magic value. If you use it that way, then it's your responsibility to deal with it.
  • It's opinionated: you think one way and I think another. The framework should be as unopinionated as possible.
  • If you make a special case, then you must be consistent and take more and more highly opinionated decisions about all the other extension methods. If Count() seems an easy decision, Where() is not. What about Max()? It throws an exception for an EMPTY list; shouldn't it also throw for a null one?
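
The behaviors mentioned in the points above can be checked with a small program. This is only a sketch: NullBehaviorDemo and ThrownBy are hypothetical names, and the Max() case shown is the one for a non-nullable value type such as int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullBehaviorDemo
{
    // Runs the action and reports the name of the exception type it throws, if any.
    public static string ThrownBy(Action action)
    {
        try
        {
            action();
            return "no exception";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        List<int> list = null;

        // Instance member on a null reference: NullReferenceException.
        Console.WriteLine(ThrownBy(() => { int count = list.Count; }));

        // Extension syntax and static syntax are the same call: ArgumentNullException.
        Console.WriteLine(ThrownBy(() => list.Any()));
        Console.WriteLine(ThrownBy(() => Enumerable.Any<int>(null)));

        // Max() on an empty sequence of int: InvalidOperationException.
        Console.WriteLine(ThrownBy(() => new List<int>().Max()));
    }
}
```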

Before LINQ, library designers introduced explicit methods for the cases where null is a valid value (for example String.IsNullOrEmpty()), so they HAD to stay consistent with the existing design philosophy. That said, even though they are trivial to write, two methods EmptyIfNull() and EmptyOrNull() might be handy.

1

Jim is supposed to leave potatoes in every bag. Otherwise I'm going to kill him.

Jim has a bag with five potatoes in it. Are there .Any() potatoes in the bag?

"Yes," you say. <= true

Ok so Jim lives this time.

Jim takes all of the potatoes out and eats them. Are there .Any() potatoes in the bag?

"No," you say. <= false

Time to kill Jim.

Jim completely incinerates the bag in a fire. Are there .Any() potatoes in the bag now?

"There is no bag." <= ArgumentNullException

Should Jim live or die? Well we didn't expect this so I need a ruling. Is letting Jim get away with this a bug or not?

You can use annotations to signal that you're not putting up with any null shenanigans this way.

public bool Any( [NotNull] List bag ) 

But your tool chain has to support it. Which means you likely will still end up writing checks.

candied_orange
  • 119,268
0

If this bothers you so much, I suggest a simple extension method.

public static IEnumerable<T> NullToEmpty<T>(this IEnumerable<T> source)
{
    // Substitute an empty sequence for null so later LINQ calls never see null.
    return source ?? Enumerable.Empty<T>();
}

Now you can do this:

List<string> list = null;
var flag = list.NullToEmpty().Any( s => s == "Foo" );

...and flag will be set to false.

John Wu
  • 26,955
0

This is a question about C#'s extension methods and their design philosophy, so I think that the best way to answer this question is to quote MSDN's documentation on the purpose of extension methods:

Extension methods enable you to "add" methods to existing types without creating a new derived type, recompiling, or otherwise modifying the original type. Extension methods are a special kind of static method, but they are called as if they were instance methods on the extended type. For client code written in C#, F# and Visual Basic, there is no apparent difference between calling an extension method and the methods that are actually defined in a type.

In general, we recommend that you implement extension methods sparingly and only when you have to. Whenever possible, client code that must extend an existing type should do so by creating a new type derived from the existing type. For more information, see Inheritance.

When using an extension method to extend a type whose source code you cannot change, you run the risk that a change in the implementation of the type will cause your extension method to break.

If you do implement extension methods for a given type, remember the following points:

  • An extension method will never be called if it has the same signature as a method defined in the type.
  • Extension methods are brought into scope at the namespace level. For example, if you have multiple static classes that contain extension methods in a single namespace named Extensions, they will all be brought into scope by the using Extensions; directive.

To summarize, extension methods are designed to add instance methods to a particular type, even when the developers cannot do so directly. And because instance methods will always override extension methods if present (if called using instance method syntax), this should only be done if you cannot directly add a method or extend the class.*

In other words, an extension method should act just like an instance method would, because it may end up being made an instance method by some client. And because an instance method should throw if the object it's being called on is null, so should the extension method.


*As a side note, this is exactly the situation that the designers of LINQ faced: when C# 3.0 was released, there were already millions of clients using System.Collections.IEnumerable and System.Collections.Generic.IEnumerable<T>, both in their collections and in foreach loops. These interfaces hand out IEnumerator objects, which offer little beyond Current and MoveNext, so adding any further required instance methods, such as Count, Any, etc., would have broken these millions of clients. So, in order to provide this functionality (especially since it can be implemented in terms of Current and MoveNext with relative ease), they released it as extension methods, which can be applied to any existing IEnumerable instance and can also be implemented by classes in more efficient ways. Had C#'s designers decided to release LINQ on day one, it would have been provided as instance methods of IEnumerable, and they probably would have designed some kind of system to provide default interface implementations of those methods.

0

I think the salient point here is that by returning false instead of throwing an exception, you are obfuscating information that may be relevant to future readers/modifiers of your code.

If it's possible for the list to be null, I would suggest having a separate logic path for that, as otherwise in the future someone may add some list based logic (like an add) to the else{} of your if, resulting in an unwanted exception that they had no reason to predict.
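
As an illustration of the separate logic path, here is a minimal sketch with null, empty, and non-empty each handled in its own branch. PotatoBag and EnsureStocked are made-up names, and creating a new list for null is just one possible decision.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PotatoBag
{
    // Hypothetical helper: makes sure a bag holds at least one potato.
    // Null gets its own branch, so code added to the "empty" branch later
    // can rely on the list existing.
    public static List<string> EnsureStocked(List<string> bag)
    {
        if (bag == null)
        {
            // Explicit decision for the missing bag: create a new one.
            return new List<string> { "potato" };
        }

        if (!bag.Any())
        {
            // Safe: bag is known to be non-null here.
            bag.Add("potato");
        }

        return bag;
    }

    public static void Main()
    {
        Console.WriteLine(EnsureStocked(null).Count);
        Console.WriteLine(EnsureStocked(new List<string>()).Count);
    }
}
```

Had the method used if (bag?.Any() == true) with the Add call in the else branch, a null bag would reach Add and throw a NullReferenceException there.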

Readability and maintainability trump 'I have to write an extra condition' every time.

0

As a general rule, I write most of my code to assume that the caller is responsible for not handing me null data, mostly for the reasons outlined in Say No to Null (and other similar posts). The concept of null is often not well understood, and for that reason, it's advisable to initialize your variables ahead of time, as much as practical. Assuming you're working with a reasonable API, you should ideally never get a null value back, so you should never have to check for null. The point of null, as other answers have stated, is to make sure that you have a valid object (e.g. a "bag") to work with before continuing. This isn't a feature of Any, but instead a feature of the language itself. You can't do most operations on a null object, because it does not exist. It's like if I ask you to drive to the store in my car, except I don't own a car, so you have no way to use my car to drive to the store. It's nonsensical to perform an operation on something that literally does not exist.

There are practical cases for using null, such as when you literally do not have the requested information available (e.g. if I asked you what my age is, your most likely answer at this point is "I don't know", which is what a null value is). However, in most cases, you should at least know whether your variables are initialized. If you don't, then I would recommend tightening up your code. The very first thing I do in any program or function is to initialize a value before I use it. It is rare that I need to check for null values, because I can guarantee that my variables are not null. It's a good habit to get into, and the more consistently you initialize your variables, the less frequently you need to check for null.