If a function mutates outer state during execution but reverts the outer state into original state after execution, does it still contain side effect?

Question

According to What is a "side effect?", I know that a side effect means changing the outside world. But what about a function that changes outside state during execution and then reverts it to the original state afterwards? For example, I originally had a physics-simulation function that does not modify the outer variable (it copies the data and simulates on the copy):

int predictNumberOfStoppedMarbles(std::vector<Marble> marbles){
    //some physics simulation that modifies x,y,vx,vy of marbles but not add,remove marbles or change sequence of marbles
    int count=0;
    for(Marble& marble : marbles){
        count += marble.vx==0 && marble.vy==0?1:0;
    }
    return count;
}

However, I found this method too slow, because it has to copy all the marbles every time the function executes, so I modified it as follows, which mutates the incoming data directly:

int predictNumberOfStoppedMarbles(std::vector<Marble>& marbles){
    std::vector<std::vector<float> > originalDataArray;
    for(Marble& marble : marbles){ //backup original x,y,vx,vy
        originalDataArray.push_back({marble.x,marble.y,marble.vx,marble.vy});
    }
    //some physics simulation that modifies x,y,vx,vy of marbles but not add,remove marbles or change sequence of marbles
    int count=0;
    for(Marble& marble : marbles){
        count+= marble.vx==0 && marble.vy==0?1:0;
    }
    for(size_t i=0;i<marbles.size();i++){ //restore original x,y,vx,vy
        marbles[i].x=originalDataArray[i][0];
        marbles[i].y=originalDataArray[i][1];
        marbles[i].vx=originalDataArray[i][2];
        marbles[i].vy=originalDataArray[i][3];
    }
    return count;
}

Now it modifies the outer data source (marbles from the outer world) directly during the simulation, but after execution the function reverts the data. Is the function still considered to have no side effect?

Note: In the real code, the physics engine needs to accept the Marble type as a parameter, and it is not easy to copy or modify the physics logic so that it operates on float arrays instead of Marble objects, so a solution that modifies a copied array is not suitable for me.

Doc Brown · Accepted Answer · 2024-03-12T13:34:59.057

Well, at least it is a temporary side effect.

You may notice the difference from the fully side-effect-free version when you run your program in a multi-threaded context. Since your case involves performance optimization, it is quite possible you will want to use multiple threads. Now imagine another thread trying to read data from the marbles vector in parallel while predictNumberOfStoppedMarbles is executing.

Good luck with debugging such a program!

Of course, from a practical point of view, in a single-threaded context, you can treat the optimized version of predictNumberOfStoppedMarbles as if it has no side effect, provided you are 100% sure all exceptions inside the physics simulation are caught (thanks to @GregBurkhardt for pointing out that exceptions can cause trouble here).
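One way to make the restore step robust against exceptions is an RAII guard whose destructor writes the backup back. The following is only a sketch under assumptions: the `Marble` struct, the `MarbleStateGuard` class, the `simulate` stand-in and the `fail` parameter are all hypothetical, since the real types and physics code are not shown in the question.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical minimal Marble; the real type is not shown in the question.
struct Marble {
    float x, y, vx, vy;
};

// Hypothetical RAII guard: snapshots x, y, vx, vy on construction and
// writes them back on destruction, including during stack unwinding.
class MarbleStateGuard {
public:
    explicit MarbleStateGuard(std::vector<Marble>& marbles) : marbles_(marbles) {
        backup_.reserve(marbles.size());
        for (const Marble& m : marbles) {
            backup_.push_back({m.x, m.y, m.vx, m.vy});
        }
    }

    ~MarbleStateGuard() {
        // Assumes the simulation never adds, removes or reorders marbles.
        for (std::size_t i = 0; i < backup_.size() && i < marbles_.size(); ++i) {
            marbles_[i].x = backup_[i].x;
            marbles_[i].y = backup_[i].y;
            marbles_[i].vx = backup_[i].vx;
            marbles_[i].vy = backup_[i].vy;
        }
    }

    MarbleStateGuard(const MarbleStateGuard&) = delete;
    MarbleStateGuard& operator=(const MarbleStateGuard&) = delete;

private:
    struct State { float x, y, vx, vy; };
    std::vector<Marble>& marbles_;
    std::vector<State> backup_;
};

// Stand-in for the real physics step (assumption): stops every marble,
// then optionally throws to imitate a failure inside the simulation.
void simulate(std::vector<Marble>& marbles, bool fail) {
    for (Marble& m : marbles) {
        m.vx = 0.0f;
        m.vy = 0.0f;
    }
    if (fail) {
        throw std::runtime_error("simulation failed");
    }
}

int predictNumberOfStoppedMarbles(std::vector<Marble>& marbles, bool fail = false) {
    MarbleStateGuard guard(marbles);  // restores on every exit path
    simulate(marbles, fail);
    int count = 0;
    for (const Marble& m : marbles) {
        count += (m.vx == 0 && m.vy == 0) ? 1 : 0;
    }
    return count;
}
```

The guard only restores the data if the exception is caught somewhere up the call stack, and it does nothing about other threads reading the vector while the simulation runs.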

As an alternative, why not use a combination of your two approaches, like

int predictNumberOfStoppedMarbles(const std::vector<Marble>& marbles){
    std::vector<std::vector<float> > workArray;
    for(const Marble& marble : marbles){
       workArray.push_back({marble.x,marble.y,marble.vx,marble.vy});
    }
    // some physics simulation that modifies x,y,vx,vy inside workArray,
    // but leaves marbles unchanged !!!
    int count=0;
    for(size_t i=0;i<workArray.size();i++){
        count+= workArray[i][2]==0 && workArray[i][3]==0 ?1:0;
    }
    return count;
}

(For the sake of simplicity, I omitted introducing a struct for x,y,vx,vy, but I guess you get the idea).
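With the struct spelled out, the same idea might look like the sketch below. The `Marble` and `MarbleState` structs and the `simulate` stand-in (one step, with a wall at x = 10) are hypothetical, and the approach only applies if the physics code can be made to run on the reduced struct, which the question says is not easy.

```cpp
#include <vector>

// Hypothetical minimal Marble; the real type is not shown in the question.
struct Marble {
    float x, y, vx, vy;
};

// Hypothetical work struct holding only the fields the simulation changes.
struct MarbleState {
    float x, y, vx, vy;
};

// Stand-in for the real physics (assumption): advance one step and stop
// any marble that reaches a wall at x >= 10.
void simulate(std::vector<MarbleState>& states) {
    for (MarbleState& s : states) {
        s.x += s.vx;
        s.y += s.vy;
        if (s.x >= 10.0f) {
            s.vx = 0.0f;
            s.vy = 0.0f;
        }
    }
}

int predictNumberOfStoppedMarbles(const std::vector<Marble>& marbles) {
    // Copy only the simulated fields; the caller's marbles are never written.
    std::vector<MarbleState> work;
    work.reserve(marbles.size());
    for (const Marble& m : marbles) {
        work.push_back({m.x, m.y, m.vx, m.vy});
    }

    simulate(work);

    int count = 0;
    for (const MarbleState& s : work) {
        count += (s.vx == 0 && s.vy == 0) ? 1 : 0;
    }
    return count;
}
```

Because the parameter is a const reference, the compiler rejects any accidental write to the caller's data, so there is nothing to restore afterwards.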

score 5 · Answer 2 · answered Mar 11 '24 at 22:54

It is not side-effect free anymore.

Even if you try to clean up the "temporary modifications", it is a visible side effect. You may get away with it in some situations, of course, but it can cause hard-to-find concurrency bugs or security issues later.

In fact, if you have followed CPU news, you may have seen the "SPECTRE" vulnerability being discussed, along with lots of follow-ups to it. That issue crept into the system exactly because of such a temporary modification of global CPU state that leaked out of the system. The designers thought speculative execution was basically side-effect free, but they were wrong.

So if there is any chance that your "temporary" modifications may be visible to concurrent or later operations, you can cause trouble and bugs.

score 3 · Answer 3 · answered Mar 11 '24 at 13:17

There are a number of ways the existence of "side effects" can be analysed.

The term is now cemented in the lexicon, and it probably originates from a similar meaning as the term "side channel" (cross-pollinated with the idea in functional programming that function arguments and results are the "main" channel of data flow in functional programs). It was a poor choice of term, though, because most people already know "side effects" from a medical context, where it means something adverse and to be avoided, and that is definitely not the case in programming.

In fact, "side effects" in programming - meaning a flow of data occurring other than through function arguments and results - are often the main intended effect of a program, as much as intoxication is the main intended effect of consuming alcohol rather than a side effect of the consumption.

Only disorderly flows of data are undesirable, like alcohol consumption to excess.

That said, a function which changes "outside" data, and then changes it back before its own conclusion, may be interpreted as having "side effects". The question might be whether the effects could be observed - that is, whether any other part of the program (when working normally) could actually read the altered data in the meantime, and whether its execution or results could be affected by it.

There isn't a universal definition of "side effects" that would distinguish the two cases.

It's certainly possible for a function to use non-local storage - that is, storage whose allocation is neither controlled internally by the function nor passed directly as an argument - yet the overall program design could still make it intentionally impossible in practice for there to be any effect outside the function.

However, if you have enough storage and performance available to duplicate the contents of non-local storage, alter its contents, then set everything back at the end, the real question might be why you don't just operate on the local duplicate, discard it at the end, and leave the non-local original untouched throughout.

score 2 · Answer 4 · answered Mar 11 '24 at 20:28

Your question is a good one and it reminded me of an interesting feature of Clojure. The term used for this (at least in Clojure) is 'Transients':

If a tree falls in the woods, does it make a sound?

If a pure function mutates some local data in order to produce an immutable return value, is that ok?

The upshot: this is absolutely done in Clojure and I would consider it one of the more 'pure' functional languages in general use.

I am not a functional programming expert and I am wary of making claims related to it, but I'll hazard a claim that you can do this and still be 'pure' with the caveat that you never leak that state to anything else. Well, maybe for debugging but nothing else.

score 0 · Answer 5 · answered Mar 12 '24 at 20:44

If there is any possibility that some other code acts on the temporarily modified state, then it is a side effect. If it is impossible for other code to act on it, then it is not a side effect. This is the case, for example, if the resource is protected by a mutex, or if it is marked as invalid while it is temporarily modified.
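A minimal sketch of the mutex variant, assuming all access to the marbles goes through one owning object: the `MarbleStore` class, its `snapshot` method, the `Marble` struct and the stand-in simulation rule are hypothetical names and logic introduced only for illustration.

```cpp
#include <cmath>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical minimal Marble; the real type is not shown in the question.
struct Marble {
    float x, y, vx, vy;
};

// Hypothetical owner of the marbles: every access takes the same mutex,
// so no other thread can observe the temporarily modified state.
class MarbleStore {
public:
    explicit MarbleStore(std::vector<Marble> marbles)
        : marbles_(std::move(marbles)) {}

    // Readers lock too, so they only ever see the restored state.
    std::vector<Marble> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return marbles_;
    }

    int predictNumberOfStoppedMarbles() {
        std::lock_guard<std::mutex> lock(mutex_);
        // Backup of the whole vector for brevity; with this four-float
        // Marble it is the same data the question backs up.
        const std::vector<Marble> backup = marbles_;

        // Stand-in for the real physics (assumption): marbles slower than
        // 1 unit per step on both axes come to rest.
        for (Marble& m : marbles_) {
            if (std::fabs(m.vx) < 1.0f && std::fabs(m.vy) < 1.0f) {
                m.vx = 0.0f;
                m.vy = 0.0f;
            }
        }

        int count = 0;
        for (const Marble& m : marbles_) {
            count += (m.vx == 0.0f && m.vy == 0.0f) ? 1 : 0;
        }

        marbles_ = backup;  // restore before the lock is released
        return count;
    }

private:
    mutable std::mutex mutex_;
    std::vector<Marble> marbles_;
};
```

The guarantee holds only as long as no code keeps a reference to the vector outside the lock. As written, the restore is also skipped if the simulation throws.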

score -1 · Answer 6 · answered Mar 13 '24 at 11:59

Can you guarantee that your current function is the only thing touching the data right now? Multithreading exists. Something, someday, may be operating on the very same data, and you have just changed that data.
Only if you can guarantee atomicity of the operation can you consider it side-effect free. (Atomicity = until all operations are complete, nothing else can touch the data.)
Secondly, you do not guarantee that the data is restored if an exception happens. If something breaks during the calculation, the data will be left scrambled.
