7

return this (or a similar construct) allows method chaining. Its absence is painful, because you have to write code like this (C#):

var list = new List<string>();
list.Add("hello");
list.Add("world");

instead of

list.Add("hello").Add("world");

Elixir solves this nicely for function chaining: instead of relying on the callee, it relies on the caller (forgive my mistakes, I don't know Elixir):

list |> add("hello") |> add("world");

But I have just read this sentence on Wikipedia:

Returning an object of built-in type from a function usually carries little to no overhead, since the object typically fits in a CPU register.

On one hand, the callee does not know whether the result will be used; on the other hand, the caller cannot stop the callee from setting the result value. So I am skeptical about this "little", but "no overhead"?

Thus MY QUESTION for this very particular pattern (i.e return this with method chaining) -- can it be optimized with no overhead? How?

Question by example -- say I write a framework and sprinkle return this over every possible method just to enable method chaining. The question arises -- will a user who does not use method chaining pay the price of lowered performance? How could a compiler optimize the code so that this feature has zero cost?

Update after the first 2 comments -- "cheap" != "free", so here is another perspective on my question: why the difference between "little" and "no cost"? If it can be guaranteed to be at no cost, we write "no cost", period. So I assume it cannot be guaranteed, thus "little".

Clarification: I am not asking how to create another Elixir-like syntax in another language. I am asking how a compiler can optimize the callee-caller interaction for return this + method chaining (or the lack of it, when not used).

greenoldman
  • 1,533

8 Answers

11

First off,

var list = new List<string>();
list.Add("hello");
list.Add("world");

is just as readable as, if not more readable than,

var list = new List<string>().Add("hello").Add("world");

Line count is not in any way a proxy for code's cleanliness or simplicity.

Thus my question for this very particular pattern (i.e "return this" for method chaining) -- can it be optimized with no overhead? How?

So theoretically, sure. This would be similar to tail call optimization: you wouldn't need to completely unwind the stack frame when moving between functions, and you could leave the this argument where it is when you're done (or just peek rather than pop it in a stack-based model). The challenge is that you can only do that if you know the next call will be part of a chain. Since functions don't really know about their callers, they don't know whether they should clean up after themselves or not. You could have some internal flag that the function checks to know which path to take, but that would be more expensive than just passing in this every time.

The compiler could also make two versions of the function and pick the right one at compile time. That would bloat the resulting executable and slow compilation, but would probably shave an instruction or two from the actual runtime.
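Done by hand rather than by a compiler, the two-version idea might look roughly like the C# sketch below. ChainableList<T>, AddNoReturn, and Add are hypothetical names introduced only for illustration, and nothing here reflects how any real compiler emits code.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper type; not part of the .NET base class library.
public sealed class ChainableList<T>
{
    private readonly List<T> _items = new List<T>();

    public int Count => _items.Count;

    // Version 1: does the work and returns nothing.
    public void AddNoReturn(T item) => _items.Add(item);

    // Version 2: same work, then hands the receiver back so calls can be chained.
    public ChainableList<T> Add(T item)
    {
        AddNoReturn(item);
        return this;
    }
}
```

A caller that ignores the result would use list.AddNoReturn("hello"), while a chaining caller would write list.Add("hello").Add("world"). The cost of this approach is visible even in the sketch: every chainable method exists twice.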

Telastyn
  • 110,259
3

Returning an object of built-in type from a function usually carries little to no overhead, since the object typically fits in a CPU register.

The point of this comment from Wikipedia isn't that returning the value has no cost. Overhead is additional cost, and here it means cost in addition to that of the return-value mechanism itself. The article is contrasting this with the additional work required when objects have to be copied as they are returned.

Thus MY QUESTION for this very particular pattern (i.e return this with method chaining) -- can it be optimized with no overhead? How?

In general, no. In particular cases (such as where the function can be inlined) it can be.

Winston Ewert
  • 25,052
3

can it be optimized with no overhead? How?

As Telastyn wrote, one approach is to have a compiler providing two function versions. If I were in the role of a compiler designer, I would handle it this way:

  • I would build a compiler with inlining support. Such a tool can obviously optimize the return this statement out when it is not used.

  • and if this hypothetical compiler decides not to inline a specific function, optimizing out the machine code equivalent of return this is definitely not worth it (if it would be, the compiler would inline).

So this does not lead to "zero overhead" in a pure mathematical sense, but to "zero overhead" for any practical purpose.
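As a rough illustration of the inlining argument in C#, the sketch below marks a chainable method with the real MethodImplOptions.AggressiveInlining hint. Counter and Increment are hypothetical names, and the attribute is only a request: whether the JIT actually inlines the call and drops the unused return value is an assumption, not a guarantee.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical example type used only to illustrate the inlining argument.
public sealed class Counter
{
    public int Value { get; private set; }

    // AggressiveInlining is only a hint; the JIT may still decline to inline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public Counter Increment()
    {
        Value++;
        // If the call is inlined, this becomes dead code at call sites that ignore the result.
        return this;
    }
}
```

Both counter.Increment(); and counter.Increment().Increment(); compile against the same method; only the first leaves the returned reference unused.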

Doc Brown
  • 218,378
2

At least in C#, your desired construct is entirely unnecessary.

You can write something like

var mylist = new List<string>(new[] {"Hello", "World"});
...
mylist.AddRange(new[] {"Add", "Some", "More", "Items"});

which is about as concise as you can get, while still being perfectly readable.

If you really want method chaining in a class, even one that you don't have the source for, you can use extension methods.

public static class ListExtender
{
    public static IList<T> ChainedAdd<T>(this IList<T> list, T item)
    {
        list.Add(item);
        return list;
    }
}

then

 mylist.ChainedAdd("x").ChainedAdd("y").ChainedAdd("z");
Peregrine
  • 1,246
2

If the implementation uses a calling convention that passes the this pointer to a function in the same register it uses to return a value from the function, then return this is a zero-cost operation (because the return value is already there).

dan04
  • 3,957
2

A compiler could guarantee that the return this + function chaining pattern never performs worse than calling multiple methods on the same object, by always rewriting the former pattern into the latter.

But I think it would be a relatively complicated rewrite, and I'm not sure it would be worth the effort, especially since I believe the return this pattern has the potential to be more efficient: this doesn't have to be remembered by both the caller and the callee, so the caller can forget the object while the chain is executing.

Note that the compiler is unlikely to recognize this. So, to take advantage of this (theoretical) optimization, instead of writing:

var list = new List<string>();
list.Add("hello").Add("world");
foo(list);

You would write:

foo(new List<string>().Add("hello").Add("world"));
svick
  • 10,137
1

Dart has this feature. So you can do

StringList myList = new StringList()
    ..Add("hello")
    ..Add("world");

Here we can treat F(expr..m(...)) as syntax sugar for:

var _generated_temp = expr;
_generated_temp.m(...);
F(_generated_temp);

You can see Dart also allows setting properties to be chained in the same way:

Person myPerson = new Person()
    ..Name = "Jane Doe"
    ..Age  = 31;

C# gives you a subset of these features with initialization syntax. So you can translate the above into

StringList myList = new StringList { "hello", "world" };

Person myPerson = new Person{ Name = "Jane Doe", Age = 31 };

However, this syntax sugar is available only at initialization and not available for arbitrary method calls, so it is strictly less expressive than Dart's version.

So it is possible to offer fluent interfaces as a language feature, which would remove any performance cost. However, the performance cost of a fluent interface is so negligible as to be effectively nil, so one should not argue for the feature on that basis. (Note that it might in theory be possible for non-virtual methods that just return this to be optimized away. This optimization would not be worth performing.) Rather, the benefit would be the opportunity to use a fluent interface even with a class that was not designed for one.
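Short of a language feature, a caller-side approximation can be written in C# today. The sketch below uses a hypothetical extension method, Also, that runs an action on any receiver and then returns it; the name and the class CascadeExtensions are made up for illustration. Unlike a built-in cascade, it pays for a delegate invocation per step, so it is not a zero-cost substitute.

```csharp
using System;
using System.Collections.Generic;

public static class CascadeExtensions
{
    // Hypothetical helper: runs an action on the receiver, then returns the receiver.
    public static T Also<T>(this T self, Action<T> action)
    {
        action(self);
        return self;
    }
}
```

With this in scope, new List<string>().Also(l => l.Add("hello")).Also(l => l.Add("world")) chains calls on a type whose Add method returns void.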

walpen
  • 3,241
0

In theory, yes. C++ has a term for it: copy elision, which the C++ standard allows. The concept you are asking about, in fact, has its own three-letter acronym: RVO (return-value optimization).

Essentially, the compiler can (but, unfortunately, is not required to) decide on the space where the return value will reside before the function is called. This avoids pushing the return value on the stack and then copying it to this space. The compiler can generate the function code in such a way that instead of changing the value to be pushed on the stack it gets written directly to the space where that stack value will get copied after the function returns.

The reason why this cannot always be done (i.e., by default) is that the space where the value is to be copied may contain information that is used during the execution of the function. But there are well-known methods to get around that, too. For example, copy-on-write is what operating systems do to blocks of memory shared by multiple processes. Compilers can, in the same way, do copy-on-write of the space that is to be populated when a function returns.