
According to Why is Global State so Evil?, there are already many answers about why "global state" is bad, and I'm not raising new reasons to oppose it. Instead, my question is about the currently top answer, https://softwareengineering.stackexchange.com/a/148109/432039, which uses a term that seems unique among the answers: "starting state". What I don't understand is the relationship between "starting state is changed" and "unpredictable".

Over the past years I misunderstood that answer in several ways. The answer says "global state" is bad because it makes the program unpredictable, and I took "unpredictable" to mean that something interacting with other things is logically too complex to understand. The answer also doesn't distinguish "global variables" from "global state", which led me to read it as: "Don't write any programs with global state that interacts with other code". Hence I often wanted to ask the authors of the answers: Should I only write stateless apps? How can I write a shopping app with shopping carts that store the selected items? Or should I write all the code in a single class so that no other class exists and hence there is no "global state"? And even if that is the reason, why would passing the "global state" around with "dependency injection" solve the problem?

After reading https://softwareengineering.stackexchange.com/a/440481/432039, I finally understand that the answer is not opposing all global state, only "global variables", and that it suggests accessing global state with a better method: dependency injection.

Also, after reading https://softwareengineering.stackexchange.com/a/450923/432039, I finally understand that "unpredictable" actually means "hidden dependency". When you use global variables:

public float getCost(float price, int quantity){
    return UserData.discount*price*quantity;
}

you may not realize that UserData.discount also affects the output, besides price and quantity. This is called a "hidden dependency", and that is the so-called "unpredictable". So the function is modified to

public float getCost(UserData userData, float price, int quantity){
    return userData.discount*price*quantity;
}

so that all dependencies appear in the parameters, and hence I can find all dependencies more easily.

However, what I don't understand is the second and third paragraphs, which roughly say: even with a global variable, if the global variable is initialised but not yet changed, the function is still predictable; but after the initial value of the global variable is changed, the function becomes unpredictable. Then a question comes to mind: why isn't the function unpredictable regardless of whether the global variable is changed?

Why does the answer mention "starting state is changed"? Take the getCost() example above: I think that even if UserData.discount is never changed, it is still a hidden dependency that affects the output. Even if UserData.discount is never changed after the program starts, that doesn't seem to help me become aware of the code "UserData.discount*". So I think whether the starting state is changed isn't relevant to the drawbacks of global variables. So my question is: why does the answer attribute "unpredictable" to "global state is changed"? Why should I specially care about the initial value of the global variable? And even if it is true, why would modifying the initial value by passing parameters (dependency injection) solve the problem? Or am I still misunderstanding the answers (e.g. misunderstanding what "starting state" refers to)?

9 Answers


The mental model here is: when a global variable is set to a well-known value (and then never changed), we can predict the outcome of a function or process which depends on that variable. However, if a global variable is initialized by some black-box process, where we don't know what value the global variable is set to, we may not be able to predict the outcome.

For example, when UserData.discount is set to a starting state of zero (and never changed afterwards), we can predict that getCost will always return 0. When UserData.discount is set to one, we can predict that getCost will always return price * quantity. But when UserData.discount is initialized with an unknown value, maybe by some black-box process (or worse, by some concurrent process, or maybe different processes), then we cannot easily predict what getCost(float price, int quantity) will return, at least not without inspecting and analyzing all other processes which may modify UserData.discount. That's what is meant by "unpredictability".
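A minimal Java sketch of this, assuming a hypothetical UserData class with a static discount field (mirroring the question) and a hypothetical blackBoxInit method that stands in for code the caller has not read:

```java
// Hypothetical global holder, mirroring the UserData.discount of the question.
final class UserData {
    static float discount = 1.0f; // well-known starting state
}

final class Pricing {
    // Depends on the global in addition to its two parameters.
    static float getCost(float price, int quantity) {
        return UserData.discount * price * quantity;
    }
}

final class StartingStateDemo {
    // Stand-in for a black-box process: callers of getCost cannot see what it writes.
    static void blackBoxInit(float value) {
        UserData.discount = value;
    }

    public static void main(String[] args) {
        // Known starting state of one: the result is always price * quantity.
        System.out.println(Pricing.getCost(10.0f, 2)); // 20.0

        // After the black box has run, the same call gives a different answer.
        blackBoxInit(0.5f);
        System.out.println(Pricing.getCost(10.0f, 2)); // 10.0
    }
}
```

Nothing at the call site of getCost reveals that the second result differs from the first; to predict it, one has to find and read blackBoxInit.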

Doc Brown

I finally understand that the answer is not opposing all global state, only "global variables", and that it suggests accessing global state with a better method: dependency injection.

True. The more formal term for what you're wrestling with is:

Shared Mutable State.

Shared mutable state works as follows: If two or more parties can change the same data (variables, objects, etc.). And if their lifetimes overlap. Then there is a risk of one party's modifications preventing other parties from working correctly.

exploringjs.com - shared mutable state

Global only comes into this because it's a way to share. There are other ways to share state that aren't global that cause the same problem.

In contrast some globals are fine. I've never been bothered at all by Math.PI being shared across threads. That's because while it is global, it's constant, that is, immutable. So I don't care what other threads have been up to with it since it doesn't change.

I finally understand that "unpredictable" actually means "hidden dependency". When you use global variables:

Sure, suppose some application has the idea of a circle constant. When it was written it assumed the value of Math.PI would back that circle constant. Then some fan of Tau day decides 2*Math.PI is the better circle constant. Suddenly the application is misbehaving because its assumption has been violated. These decisions are called conventions. They are global, and for the most part immutable. Right up until they aren't. This kind of hidden dependency is difficult to avoid because repeatedly passing Math.PI to your circle functions is a pain.

That reluctance to repeatedly pass Math.PI is how it ends up baked into what you're calling "starting state".

circle.circumference()

This will return different values for the same circle depending on what the circle constant is configured to be. Whether the result is correct depends on whether the configuration is consistent with the code's assumptions.

So I think whether the starting state is changed isn't relevant to the drawbacks of global variables. So my question is: why does the answer attribute "unpredictable" to "global state is changed"? Why should I specially care about the initial value of the global variable? And even if it is true, why would modifying the initial value by passing parameters (dependency injection) solve the problem? Or am I still misunderstanding the answers (e.g. misunderstanding what "starting state" refers to)?

A function is considered deterministic if repeated calls with the same arguments always return the same value. For that to work, anything that would change the returned value must be something passed in or must be constant.

Since "global state" affects the return value, and isn't passed in, it must remain constant for the return to reliably be the same. That's all that is meant by unpredictable. It certainly doesn't mean incalculable, because if it were, the computer couldn't calculate it.
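To make the distinction concrete, here is a small Java sketch. CircleConfig.circleConstant is a hypothetical mutable global invented for illustration, as are the Circles helper methods; Math.PI is the real immutable constant.

```java
final class CircleConfig {
    // Hypothetical shared mutable state: anyone can reassign it.
    static double circleConstant = Math.PI;
}

final class Circles {
    // Not deterministic on its own: the result also depends on CircleConfig.circleConstant.
    static double circumferenceHidden(double radius) {
        return 2 * CircleConfig.circleConstant * radius;
    }

    // Deterministic: everything that affects the result is passed in.
    static double circumference(double circleConstant, double radius) {
        return 2 * circleConstant * radius;
    }
}

final class DeterminismDemo {
    public static void main(String[] args) {
        double before = Circles.circumferenceHidden(1.0);
        CircleConfig.circleConstant = 2 * Math.PI; // a Tau fan changes the convention
        double after = Circles.circumferenceHidden(1.0);
        System.out.println(before == after); // false: same argument, different result

        // The explicit version returns the same value for the same arguments.
        double first = Circles.circumference(Math.PI, 1.0);
        double second = Circles.circumference(Math.PI, 1.0);
        System.out.println(first == second); // true
    }
}
```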

Global has another problem. It isn't simply a way to share. It's a sneaky way to share. When I pass something in I'm loudly saying "Hey, that stuff down there depends on this stuff here". Which is nice. Makes reading the code easy. Conversely, I'm not going to be happy if spotting that CircleConstant is used requires checking every line of circle.circumference() (which, please no, might be hundreds of lines long). I greatly prefer to use arguments to show what my code depends on when I can.

When I can't, I try to define the interesting things at the top of the method so I don't surprise you later.

But there are limits. How often do you want to see LOG passed in? Use args to avoid surprising me. Don't use them to bore me.

These terms and ideas come from the functional programming discipline. They turn out to be fairly useful to know even when working in OOP languages.

candied_orange

The idea that global state is bad or unpredictable is a useful simplification.

The base idea is very general. If you write code which does "something," you should know what that something is that it does. If you wrote code to print a line of Shakespeare, and it accidentally writes a line of Tolkien, that's bad. But if we're really honest, you never truly know what code will do. If we did, then we wouldn't need debugging. Understanding what the code does is a gradient, ranging from "I dunno, I wrote something that compiles" through "This code has undergone independent review and qualification testing."

To understand what some code does, you need to understand its inputs. This is where the global variable story starts. It's expensive to know what code does. It costs time to understand it. It costs time to review it. If you intentionally design your software to have easy-to-understand variables, it's more likely that you've written cost-efficient code.

The issue with global variables is not that they're hard to understand as a rule. It's that they're cloyingly easy to use in ways which make it hard to understand later. Over many years we've noticed that the unlimited accessibility of global variables means that if you want to understand the state of that global variable, you really need to understand the entire behavior of the entire program. As a general rule, this is expensive.

It doesn't have to be. Your project can maintain meticulous documentation that clarifies the precise behavior of the global variable. This can cut down on the cost of achieving predictability substantially. And, in such cases, the advantages of global variables may outweigh the downsides. An example of where such global variables do indeed outweigh the downsides is when interacting with hardware. The realities of hardware make globals a very natural pattern because the hardware itself is global.

However, in practice, we find that most development teams do not have such discipline. In my opinion, the worst case scenario is when one writes code using a global variable, does it correctly, goes through the reviews and verifications, and everything is good. Then, six months or a year later, someone does something new with that global variable which violates one of the small assumptions made in that earlier code review. Suddenly the code breaks because the global variable is doing something different than when you did the comprehensive review. In theory the second change should have flagged the issue; in practice it's really hard to keep track of enough information to realize the issue is an issue.

This applies to things like dependency injection as well. You can create a variable that isn't a global variable in the language sense, but is global in every other sense of the word because it got injected into absolutely everything. Such a variable will have the same issues. However, in practice the extra level of effort required to do dependency injection is often sufficient to encourage developers to do the right thing. The lack of extra effort required by global variables makes them too tempting to misuse for many teams.

Cort Ammon

@CortAmmon has the correct answer, but it is too close to the OP's existing understanding, so it may not help much. I'd like to illustrate the described problem in more concrete and different terms.

State machine

A component is a state machine. Possible states and transitions between them are defined by all the data the component has access to.

An isolated component is one where all available data is local to the component.

The complexity of analyzing and modifying an isolated component grows linearly with its state and transition space (not in a strict sense: for numbers, for example, the effective complexity increases only with the count of edge cases, not with the bit count; still, this is good enough for my explanation).

If a component A accesses data from a component B of similar size (and nothing else), the combined isolated component AB is exponentially harder to analyze, because its state space size is now the product of the original components' state spaces.

The complexity of such combined systems is reduced by reducing cross-access, with limits imposed on the part of each component's state that is publicly observable by other components. This limits potential state transitions, as every component can only make decisions based on the state visible to it.

Global state is a component that every other component has access to. Readable global state increases the state visible to all otherwise isolated components. Any application that has writable global state has a single isolated component, the whole application, where any component is affected by any other component.

Sure, the limits imposed on the global state may limit the observable state, but the product of all other states is just too large to risk even a single bit of global state. Also, the limitations imposed on publicly visible state break down.

Example

A simple application

Consider an application of three isolated components:

  • Component A has public states A1, A2, A3 and iterates over them in order each tick unconditionally.
  • Component B has states B1, B2, B3 and does the same.
  • Component C has states C1, C2, C3 and does the same.

The total count of application states is 3:

A1B1C1, A2B2C2, A3B3C3

The total count of potential state transitions is 3:

A1B1C1 -> A2B2C2 -> A3B3C3 -> A1B1C1.

Introducing global state

Now make component C globally readable. Components A and B can now make conditional transitions based on the state of component C.

The total count of application states is 27. The total count of potential state transitions is 243:

Potential state transition table https://docs.google.com/spreadsheets/d/1uTqdkaqanAOKdA_nlegFrbAE_UROm4eqN3SGVasaLKc/edit?usp=sharing

Were component C globally writable, there would be no limits on state transitions left, resulting in 729 potential state transitions.

Just listing all states and transitions is hard work. Their analysis even more so. Real applications have many more states, so the impact of any additional global shared state would be much more drastic.
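The counts above can be reproduced by brute force. The following Java sketch is one reading of the counting model, not taken from the linked spreadsheet: with C readable, A and B may each move to any of their three states while C keeps following its own cycle; with C writable, any state may follow any state. The class and method names are made up for illustration.

```java
final class StateSpaceCount {
    static final int STATES_PER_COMPONENT = 3;
    // A full application state is a triple (a, b, c), encoded as a * 9 + b * 3 + c.
    static final int APPLICATION_STATES =
            STATES_PER_COMPONENT * STATES_PER_COMPONENT * STATES_PER_COMPONENT;

    // Counts the (from, to) pairs that are allowed as potential transitions.
    static int countTransitions(boolean cIsWritable) {
        int count = 0;
        for (int from = 0; from < APPLICATION_STATES; from++) {
            for (int to = 0; to < APPLICATION_STATES; to++) {
                int fromC = from % STATES_PER_COMPONENT;
                int toC = to % STATES_PER_COMPONENT;
                // A read-only C keeps stepping C1 -> C2 -> C3 -> C1 on its own.
                boolean cFollowsItsCycle = toC == (fromC + 1) % STATES_PER_COMPONENT;
                if (cIsWritable || cFollowsItsCycle) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(APPLICATION_STATES);      // 27
        System.out.println(countTransitions(false)); // 243
        System.out.println(countTransitions(true));  // 729
    }
}
```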

Basilevs

I think part of the reason that there's a lot of confusion around why the idea that "global state is evil" became a mantra of development is that the actual problematic aspect of their use is associated with a style of programming that is no longer widely practiced. It might very well be that the reason for this is that this mantra has trained a new generation of developers to avoid it.

Basically, in my experience, the badness of global variables would look something like this:

  1. A routine/function/method sets a global variable's value in anticipation of what follows.
  2. Some other routine (often invoked indirectly or independently) reads that variable. Does some work and then sets another global variable.
  3. A third routine (again often invoked indirectly or independently) reads that variable (or both) and does something else.

And so on and so forth. In my first full-time professional programming role, I was often maintaining a lot of code written in this style. The general term for this and related styles is 'spaghetti code'. It is extremely hard to follow and very brittle. Simply attempting to add features on top of it is painfully error prone. Trying to extract part of it into a reusable module is extremely challenging and generally requires refactoring. Introducing concurrency and/or multi-threading is nearly impossible without refactoring. Worst of all, it's easy to introduce subtle bugs into such code that only appear under very particular circumstances and therefore slip through testing. When discovered, it can be very challenging to reproduce those bugs.
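For readers who have never met this style, here is a deliberately small Java sketch of the three-step chain. All names (Globals, prepare, compute, report, the "wholesale" mode and the prices) are invented for illustration; real code of this kind spreads the steps over many files.

```java
final class Globals {
    static String mode = "";     // set by step 1, read by step 2
    static int pendingTotal = 0; // set by step 2, read by step 3
}

final class Spaghetti {
    // Step 1: sets a global in anticipation of what follows.
    static void prepare() {
        Globals.mode = "wholesale";
    }

    // Step 2: reads that global, does some work, sets another global.
    static void compute(int quantity) {
        int unitPrice = Globals.mode.equals("wholesale") ? 8 : 10;
        Globals.pendingTotal = unitPrice * quantity;
    }

    // Step 3: reads the second global and does something else.
    static String report() {
        return "total=" + Globals.pendingTotal;
    }
}

final class SpaghettiDemo {
    public static void main(String[] args) {
        Spaghetti.prepare();
        Spaghetti.compute(5);
        System.out.println(Spaghetti.report()); // total=40

        // Skipping step 1 still compiles and runs, but silently changes the result.
        Globals.mode = "";
        Spaghetti.compute(5);
        System.out.println(Spaghetti.report()); // total=50
    }
}
```

None of the method signatures says that the calls must happen in this order, which is what makes the style brittle.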

Based on that experience, the phrase 'global state is evil' had an obvious meaning to me. It meant, don't do what I describe above. The context is that such approaches were relatively common when the phrase was coined. Over time, languages have improved as well as techniques. But if you've never encountered such a monstrosity, I can understand why it's not so obvious what the issue is.

I caught some downvotes on this and I can only guess that it's because I didn't address some of the other potential issues with using global variables. In my opinion, outside of the above, the use of global values can be problematic but calling them 'evil' in general is extreme. In addition, the term 'global variable' is a bit fuzzy. Most of the languages in popular use today don't have anything called a 'global' variable, and in Python, which does have something called a 'global' scope, it is really a module-level scope. You can't declare variables at an application level. For example, in Java, what's a 'global'? A static variable? A singleton? Even those are scoped under a classloader. It's all a bit loosey-goosey. Often, I've seen DI proponents assert that any sort of non-injected dependency is a 'global' (and therefore 'evil'), but I think that whether DI is necessary depends a lot on the language you are using and what kind of application you are writing. DI is a powerful and useful technique but using it everywhere often results in over-engineered solutions.

The important thing is to understand what constraints you are putting on your design when using various scopes. There are costs and benefits to using broader scopes. I like candied_orange's answer for that discussion and see no need to reiterate.

JimmyJames supports Canada

If you have a global variable with a value of 1, and change the value to 2, then anybody reading the value afterwards will get a value of 2.

But if you might change the value to any new value you want, or might not change it at all, and I don’t know what you are doing, then to me the result of reading the variable is unpredictable. And if I act differently depending on the value, to anyone else it is unpredictable what my code will do.

So “predictable” and “unpredictable” depend on what you know. And global variables can be changed without anyone knowing, so things will be unpredictable.

gnasher729

I think when you say "global state" you mean something a bit different to everyone else.

Here we are talking about global variables. They make a program unpredictable, because they introduce race conditions and side effects.

For example

{
  var global = 1;
  for (var i = 0; i < 10; i++)
  {
     // Both tasks modify the shared variable and may run in either order.
     var task1 = globalAdd(1);
     var task2 = globalmultiply(2);
     await Task.WhenAll(task1, task2);
  }
  Console.WriteLine(global);
}

The output depends on how long task1 and task2 take on each iteration. That could change on every run, so the output is unpredictable.

A non race condition

{
  var global = 1;
  for (var i = 0; i < 10; i++)
  {
     globalAdd(1);
     globalmultiply(2);
     someFunctionYouDidntWrite();
  }
  Console.WriteLine(global);
}

Now the output is unpredictable, because you don't know if someFunctionYouDidntWrite modifies global or not. If you have a large program with many functions, then you might forget which ones change global variables and under what conditions.

Ewan

The arguments against globals are very much overdone.

It's unusual nowadays to put any fields formally into the global scope, because it's just too easy not to with modern languages. Old languages had fewer alternatives and had to use the global scope more often for essential tasks. With some really old compilers, return values from functions had to pass as globals!

But it's still very common to have global access - static fields and methods, singletons, and other more elaborate patterns that disguise the essential reality of global accessibility.

Almost anyone who wants to use Intellisense for example, will make a static class called Common and then put his "globals" there, because now you can get up a neat list of just the globals, by remembering to type just the word Common. Look ma, no globals!

An excessive number of globals accessed in a chaotic way (i.e. an absence of sufficient orderliness and modularity) is characteristic of an incompetent (or novice) programmer, but it is the chaotic manner of access that is the root of the problem, not the global accessibility per se, and it is quite easy to be just as chaotic but with more ceremony and patterns about it.

At times there has been a belief that forcing people to strictly control access either sorts the men from the boys by raising the bar of difficulty, or it naturally forces the programmer to impose order on chaos, but it really doesn't. It just turns them both into circus performers when trying to get simple things done, and the boys now fall from more serious heights (i.e. spending even longer to produce even more crap code) instead of just tripping over their shoelaces.

The reason is because the incompetent can more easily grasp that he must work via the trapeze, than he can grasp the motor skills whose absence previously caused him to fall over shoelaces and now causes him to fall from the trapeze. Similarly, the incompetent grasps "no using global scope" more easily than he grasps "orderly and modular design".

The competent programmer fits only as much access control to fields as necessary in the circumstances, and making a reasonable judgement about it is part of professional expertise and learning from experience of the particular circumstances of development.

Writing a 100-line CLI utility by yourself is very different from say a 50-man, 3-generation, million-line application, and overheads and programming styles that have been found necessary for the largest teams can be baroque and grossly wasteful at smaller scales.

Steve

First, let me say that globals are not bad or evil, they are expensive: they require extra thought, extra design and extra maintenance.

So, you ask…

Why does the answer mention "starting state is changed"?

That is referring specifically to the starting state of a METHOD that is being called.

If you have a method and its caller

public float getCost(float price, int quantity){
    return UserData.discount*price*quantity;
}
…
float cost = getCost(1.04f, 12);

You have no idea what cost is going to be after calling getCost with those values, because the starting state is undefined. If UserData.discount can be changed, and if you have either multi-threading or events, then it may be literally impossible to know what that method will return without executing it.

Even if this is the only method that uses the discount, changing the enclosing class so that the discount is injected when the class is created makes it easier to reason about the class. If it’s used in multiple places or is itself a calculation, making that explicit via making it an argument to the getCost method means that you can reason about and test getCost without reference to the global. For instance if you discover that you are having rounding issues (because you aren’t using BigDecimal), you wouldn’t want discount changing as you try to figure out how best to fix the problem.
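A minimal sketch of the first option, injecting the discount when the class is created. CostCalculator is a hypothetical class name, not something from the question; the point is only that the discount is fixed for the lifetime of the object and visible at the construction site.

```java
final class CostCalculator {
    private final float discount; // injected once, never changes afterwards

    CostCalculator(float discount) {
        this.discount = discount;
    }

    float getCost(float price, int quantity) {
        return discount * price * quantity;
    }
}

final class InjectionDemo {
    public static void main(String[] args) {
        // The starting state is chosen here, in plain sight of the caller.
        CostCalculator calculator = new CostCalculator(0.5f);
        System.out.println(calculator.getCost(10.0f, 2)); // 10.0
    }
}
```

A test can now construct a CostCalculator with whatever discount it wants and no other code can change that value while the test is investigating, for example, a rounding problem.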

All of that said, it’s not much of a problem to have a global, not even a mutable global; problems arise when you have too many to keep track of. How many is too many to keep track of? That will depend on whether they are mutable, how many places they are mutated or initialized in, how many of them there are, and how many devs you have making changes to them.

jmoreno