Python: Is returning self in method chaining a violation of Demeter's law?

Question

In Python it is very common to see code that uses method chaining. The main difference from code elsewhere is that each method returns a new, modified object of the same type. This approach usually assumes that objects are immutable and that only new instances are returned.

The question is whether that violates the Law of Demeter.

Some example code from the popular framework PySpark (the Python API for Apache Spark):

from pyspark.sql import functions as F
from pyspark.sql.types import StructType

# assumes an existing SparkSession named `spark`
dataframe = (
    spark.createDataFrame([], StructType([]))
         .withColumn("a", F.rand())
         .withColumn("b", F.rand())
         .withColumn("c", F.col("a") + F.col("b"))
)

Another way to write this would be:

from pyspark.sql import functions as F
from pyspark.sql.types import StructType

# assumes an existing SparkSession named `spark`
dataframe = spark.createDataFrame([], StructType([]))
dataframe = dataframe.withColumn("a", F.rand())
dataframe = dataframe.withColumn("b", F.rand())
dataframe = dataframe.withColumn("c", F.col("a") + F.col("b"))

This other approach is hard to follow because the same variable is overwritten over and over, and keeping track of its value can get confusing. However, creating a new variable for each step pollutes the namespace with many variables that are only used once. Hence the method chaining approach that is ubiquitous in the Spark framework.

Equivalent code without any framework would look like this:

from dataclasses import dataclass
from typing import Callable


@dataclass
class Transformer:
    data: list[int]

    def transform(self, function: Callable[[list[int]], list[int]]) -> "Transformer":
        # Return a new instance instead of mutating this one
        return Transformer(data=function(self.data))


transformer = (
    Transformer([1, 4, 3])
    .transform(sorted)
    .transform(lambda xs: [x**2 for x in xs])
)

or equivalently:

transformer = Transformer([1, 4, 3])
transformer = transformer.transform(sorted)
transformer = transformer.transform(lambda xs: [x**2 for x in xs])

score 13 · Accepted Answer · answered Oct 29 '22 at 05:19

The Law of Demeter isn't fundamentally about method chaining, even though a violation often shows up as a chain. It is about limiting access to the internals of an object and using well-defined behaviors instead of reaching into fields directly. A classic example is that if you wanted money from a person, you wouldn't reach into their wallet directly (person.getWallet().removeMoney()) but would instead ask the person to do it for you (person.requestPayment()).

In your example, you are simply using method chaining to improve readability (also known as a fluent interface), and this is not a violation of the Law of Demeter. It is just a form of syntactic sugar used to initialize complex data structures or otherwise chain existing behavior to make it more readable.

It is important to note that an actual Demeter violation can exist even without method chaining. For example, rewriting the wallet use case like this:

wallet = person.getWallet()
wallet.removeMoney()

does not actually fix the problem even though we've apparently "removed" method chaining by introducing an extra variable.

candied_orange · Answer 2 · 2022-11-15T16:21:22.367

Perhaps you’ve heard something like this:

the law can be stated simply as "use only one dot". That is, the code a.m().n() breaks the law where a.m() does not.

wikipedia.com - Law of Demeter

This is a lie. Well, a lie to children anyway. A gross oversimplification, if you’re willing to be rude about it.

When you see this, don’t think Demeter violation. Think potential violation, because this isn’t enough evidence to convict.

Despite the fact that this is often how it’s taught, that isn’t what the law says:

More formally, the Law of Demeter for functions requires that a method m of an object a may only invoke the methods of the following kinds of objects:

a itself;

m's parameters;

any objects instantiated within m;

a's attributes;

global variables accessible by a in the scope of m.

wikipedia.com - Law of Demeter
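
As a rough illustration of the five rules quoted above, here is a minimal Python sketch. Every class and name in it (Car, Engine, Key, Receipt, AUDIT_LOG) is hypothetical and invented for this example; the comments mark which rule each call falls under.

```python
class Engine:
    def start(self) -> str:
        return "engine started"


class Key:
    def is_valid(self) -> bool:
        return True


class Receipt:
    def __init__(self, text: str) -> None:
        self.text = text

    def render(self) -> str:
        return f"receipt: {self.text}"


class Logger:
    def __init__(self) -> None:
        self.messages: list[str] = []

    def log(self, message: str) -> None:
        self.messages.append(message)


# Hypothetical module-level object, standing in for a global variable.
AUDIT_LOG = Logger()


class Car:
    def __init__(self) -> None:
        self.engine = Engine()

    def _has_fuel(self) -> bool:
        return True

    def drive(self, key: Key) -> str:
        if not self._has_fuel():        # rule 1: a itself
            return "no fuel"
        if not key.is_valid():          # rule 2: m's parameters
            return "bad key"
        status = self.engine.start()    # rule 4: a's attributes
        receipt = Receipt(status)       # rule 3: object instantiated within m
        AUDIT_LOG.log(status)           # rule 5: global accessible in m's scope
        return receipt.render()


result = Car().drive(Key())
```

None of the calls inside drive reaches through one collaborator to get at another, which is what the rules are meant to rule out.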

If you think about it critically, you’ll realize there are all sorts of dot chains that don’t violate the rules above. Rather than panic when you see a dot chain, look up these rules and do some thinking.

In this particular case carefully consider the phrase above: “kinds of objects”.

Going from object a to b through m has different structural knowledge implications when objects a and b are the same type. See rule 1.
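
To make the same-type case concrete, here is a small sketch with a hypothetical immutable Query class (the class, its fields, and its method names are assumptions made up for this example). Every chained call returns another Query, so the caller only ever needs to know about one type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Query:
    """Hypothetical immutable query; each method returns a new Query."""
    table: str
    filters: tuple[str, ...] = ()
    limit_to: Optional[int] = None

    def where(self, condition: str) -> "Query":
        # New instance with one more filter; self is left untouched.
        return Query(self.table, self.filters + (condition,), self.limit_to)

    def limit(self, n: int) -> "Query":
        return Query(self.table, self.filters, n)

    def to_sql(self) -> str:
        sql = f"SELECT * FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        if self.limit_to is not None:
            sql += f" LIMIT {self.limit_to}"
        return sql


base = Query("users")
# Three dots, but every intermediate object is a Query.
adults = base.where("age >= 18").where("active = 1").limit(10)
```

The chain never exposes a second class to the caller, so there is no extra structural knowledge to couple to, however many dots it contains.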

What we’re really trying to avoid is someone diving into a large code base and randomly chaining together things never meant to know about each other. This is bad because now when they change, they must change together. Which is a shame, because one of the reasons we made them separate classes is so they wouldn’t need to change together.

Now things that were designed to change together are fine. You aren’t destroying carefully built separations when you use them together. See Java 8 streams, JOOQ, and StringBuilder.

Demeter tells you to talk to your friends. Not friends of friends. Demeter’s formal rules are one way to define your friends. Being explicitly told by the designers who your friends are is another. And so is sitting around talking to yourself. Don’t call it a violation unless you’ve checked what you’re really talking to.

gnasher729 · Answer 3 · 2022-10-29T23:46:18.760

To violate Demeter's law, you start with an object x, then get an object y from x (by calling a method or reading a property or attribute of x), then get an object z from y, and so on, and use the last object to do the actual work.

This is considered bad because it exposes that x can give you y, that y can give you z, and so on. Often it would be better if x had a method that does what you want to achieve.

But what you do here is different: You start with x. Then you call a method that does something and by convention returns the same x. Then you call another method on x doing the same. You do this by using the return value of the first method, but you could have just used the same x again. It's just that this pattern allows you to call any number of methods on the same object. That is shorter and more readable than using the same object again and again. Also, if you want to call the same ten methods on x, then on y, then your source code for doing this will be identical except the very first x gets changed to y in your source code.
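
A minimal sketch of that pattern, using a hypothetical Report class whose methods mutate the object and return self (all names here are made up for illustration). The chained form and the repeated-receiver form do exactly the same thing.

```python
class Report:
    """Hypothetical mutable builder; each method acts on self and returns self."""

    def __init__(self, title: str) -> None:
        self.lines = [title]

    def add_line(self, text: str) -> "Report":
        self.lines.append(text)
        return self

    def add_separator(self) -> "Report":
        self.lines.append("---")
        return self

    def render(self) -> str:
        return "\n".join(self.lines)


# Chained: every call returns the same x, so the chain never leaves x.
x = Report("Sales")
returned = x.add_line("Q1: 10").add_separator().add_line("Q2: 12")

# Unchained: identical calls, with the receiver repeated each time.
y = Report("Sales")
y.add_line("Q1: 10")
y.add_separator()
y.add_line("Q2: 12")
```

Here returned is the very same object as x, so the chain tells the caller nothing about any other object.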

The pattern is fine as long as the method name doesn't indicate that it returns an object, but rather which action it performs on the object. With Demeter's law violations, the methods that you call would be named so that the caller knows which object is returned. (At least I hope they are named that way, or Demeter is not your biggest problem.)

PS. What's the problem with person.getWallet().removeMoney()? Suppose class Person gains a new method getBankaccount(). Suddenly the rules for getting money from the person become complicated. For example, I may want to buy an item, but not if it means overdrawing my bank account, while there are other payments that I must make even if I end up overdrawn. Or consider a rule for taking money out of my wallet: I always want a pound coin in my wallet, which you need in the UK to get a shopping trolley. So normally you can't take that last pound coin, except when I need it to get a trolley.

Those are all rules that the Person object should understand, not the caller. On the other hand, Person could have a member "finances" that handles everything about money, so person.getFinances().getMoney() could be OK.
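
As a sketch of what keeping those rules inside Person might look like, here is a hypothetical Python version. The method name request_payment, its must_pay and for_trolley flags, and the integer-pound amounts are all assumptions made for illustration, and the rules are simplified.

```python
class Person:
    """Hypothetical Person that owns its own payment rules."""

    def __init__(self, wallet: int, bank_balance: int) -> None:
        self._wallet = wallet              # cash, in whole pounds
        self._bank_balance = bank_balance  # may go negative when overdrawn

    def request_payment(self, amount: int, must_pay: bool = False,
                        for_trolley: bool = False) -> bool:
        # Rule: keep one pound coin in the wallet unless it is for a trolley.
        reserve = 0 if for_trolley else 1
        spendable_cash = max(self._wallet - reserve, 0)
        if amount <= spendable_cash:
            self._wallet -= amount
            return True
        # Rule: only overdraw the bank account for payments that must be made.
        if amount <= self._bank_balance or must_pay:
            self._bank_balance -= amount
            return True
        return False


person = Person(wallet=5, bank_balance=20)
paid_from_cash = person.request_payment(4)             # leaves the last coin
trolley = person.request_payment(1, for_trolley=True)  # may use the last coin
refused = person.request_payment(50)                   # would overdraw
forced = person.request_payment(50, must_pay=True)     # overdraws anyway
```

The caller only ever asks for a payment; if the rules change, only Person changes.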
