2

According to some answers to "How are globals any different from a database?", which explain how a database differs from global state:

As I understand it, the answers roughly explain that accessing a database through "a consistent, understandable interface" in code is different from accessing a global variable. However, what I don't understand is how "a consistent, understandable interface" differs from accessing global variables. For example, when accessing global variables:

public void myMethod(){
   AppData.someValue = x;
}

the global state is evil because you can paste "AppData.someValue = x" into any other method, so everyone can modify the state.

When switching the access from global state to a database:

public void myMethod(){
   MyConnector mc=...;
   Session s=mc.beginTransaction();
   AppData appData=mc.getObjectFromTable(...);
   appData.someValue=x;
   s.save(appData);
   s.endTransaction();
   mc.close();
}

When you paste the wall of code:

MyConnector mc=...;
Session s=mc.beginTransaction();
AppData appData=mc.getObjectFromTable(...);
appData.someValue=x;
s.save(appData);
s.endTransaction();
mc.close();

into any other method, that method can also change the value of someValue.

Just because the code that lets everyone modify the state is longer, or even has better names than "AppData.someValue = x", doesn't change its nature: it still lets everyone modify the state.

To differentiate a database from global state, I think we should wrap the database driver and apply dependency injection to it:

public void myMethod(MyConnector mc){
   Session s=mc.beginTransaction();
   AppData appData=mc.getObjectFromTable(...);
   appData.someValue=x;
   s.save(appData);
   s.endTransaction();
   mc.close();
}

so that not every method can access the database, just as with global variables, where the object is injected only when a method needs to change it.

So my question is: why does "a consistent, understandable interface" differentiate a database from global state, even though I can paste the "consistent, understandable interface" code everywhere to let anyone modify the state, just like with global variables?

Doc Brown
  • 218,378

7 Answers

4

Don’t use either a database or global variables in your application code.

Create a class whose instances represent settings or global state and use that. There is more than just the state: There is reading the state, there is changing the state, there is permission to change the state, there is displaying the state in your UI. There is a default state. There is persisting the state.

Now an instance can be created in different ways. It can have a state that is hardcoded in the app, or set via #define, via configuration files, or via an MDM (controlled by the user's company IT).

You can have an instance depending on global state. Reset every time you launch the app. Or stored in a database. Or controlled by the user outside the app. So you have multiple subclasses but your application handles them all in an identical way.
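A minimal sketch of that idea, assuming made-up names (`AppSettings`, `HardcodedSettings`, `InMemorySettings`, and the "theme" key are all hypothetical). It only covers reading, changing, and permission to change; a database-backed or MDM-backed subclass would follow the same shape:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction: the application only ever sees this type.
abstract class AppSettings {
    abstract String get(String key);             // reading the state
    abstract boolean canChange(String key);      // permission to change the state
    protected abstract void store(String key, String value);

    // Changing the state goes through one place that checks permission.
    final void set(String key, String value) {
        if (!canChange(key)) {
            throw new IllegalStateException("setting is read-only: " + key);
        }
        store(key, value);
    }
}

// State that is hardcoded in the app and cannot be changed.
class HardcodedSettings extends AppSettings {
    private final Map<String, String> defaults = Map.of("theme", "light");
    String get(String key) { return defaults.get(key); }
    boolean canChange(String key) { return false; }
    protected void store(String key, String value) { /* never reached */ }
}

// State that starts from defaults and is reset every time the app launches.
class InMemorySettings extends AppSettings {
    private final Map<String, String> values = new HashMap<>(Map.of("theme", "light"));
    String get(String key) { return values.get(key); }
    boolean canChange(String key) { return true; }
    protected void store(String key, String value) { values.put(key, value); }
}

class SettingsDemo {
    // Application code: identical for every subclass.
    static String describeTheme(AppSettings settings) {
        return "theme=" + settings.get("theme");
    }

    public static void main(String[] args) {
        AppSettings session = new InMemorySettings();
        session.set("theme", "dark");
        System.out.println(describeTheme(session));
        System.out.println(describeTheme(new HardcodedSettings()));
    }
}
```

`describeTheme` never learns where the value came from, which is the point: swapping the subclass changes the source of the state without touching the code that uses it.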

gnasher729
  • 49,096
4

I think you are correct - a "consistent, understandable interface" is not what makes the difference. This description is misleading, and I think it does not even describe the handling of global state in general correctly, regardless of whether it is global state in global variables or in a database.

To deal with global state in a fail-safe manner, there should be an explicit interface to this state, with a replaceable implementation. The purpose of the interface is to separate your database code (for example) from the core logic of your application, which allows exactly what you wrote here:

To differentiate database from global states, I think we should wrap and apply dependency injection to the database driver.

For databases, such interfaces are often called "repositories" or "data access layers". They don't come automatically from introducing a DB, but most educated devs are used to separating database code from the core logic using such a layer. Hence, when a database gets introduced into a code base, people typically use techniques which they sometimes forget to use when dealing with global variables. Part of the issue is surely that most popular programming languages (as long as they are not purely functional) make it very easy to access global variables in code directly, much easier than to avoid this kind of access.

Of course, the separation techniques for databases can also be applied to in-memory globals. It should be obvious that if your code already uses global variables in the core logic, it does not make much sense to introduce a full-fledged database and repository logic to get rid of the globals, just because devs are better trained to introduce a layer of separation for it. Getting the globals out of the core logic can be achieved by much simpler means, leading to a similar kind of data access layer (with explicit parameters and injected interfaces or callbacks), but with lean implementations and no extra dependencies.
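As a rough sketch of such a lean layer for an in-memory value (the names `SomeValueStore`, `InMemorySomeValueStore`, and `VisitCounter` are made up for illustration, not taken from any library):

```java
// Hypothetical interface: the only way core logic sees the shared state.
interface SomeValueStore {
    int load();
    void save(int value);
}

// Lean in-memory implementation; no database, no extra dependencies.
class InMemorySomeValueStore implements SomeValueStore {
    private int value;
    public int load() { return value; }
    public void save(int value) { this.value = value; }
}

// Core logic receives the store explicitly instead of reaching for a global.
class VisitCounter {
    private final SomeValueStore store;

    VisitCounter(SomeValueStore store) { this.store = store; }

    int increment() {
        int next = store.load() + 1;
        store.save(next);
        return next;
    }
}

class RepositoryDemo {
    public static void main(String[] args) {
        SomeValueStore store = new InMemorySomeValueStore();
        VisitCounter counter = new VisitCounter(store);
        counter.increment();
        counter.increment();
        System.out.println(store.load()); // prints 2
    }
}
```

A database-backed class could implement the same interface later; `VisitCounter` would not need to change, and a test can hand it a fresh store each time.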

Doc Brown
  • 218,378
3

Well in one way a DB has the same problem that a global does:

Shared mutable state.

"a consistent, understandable interface" code differentiates from accessing global variables?

This can discourage the sharing part of this. The code base can carve up responsibility for talking to the DB so a table is only updated by one repository. Not really enforced but can eliminate many problems.

But a DB can be updated by many applications. Which brings us to source of truth. There should be a single source of truth. In some designs the DB is the source of truth. What it says goes. In others, the DB is simply a report of the truth of the application. What it was at some point in time.

This isn't strictly a DB thing. Measure the temperature with some instrument and report that to anything that asks and you have a shared mutable state. It's simply more obvious that it's the source of truth and not a model of it that can be made inconsistent. You also don't have many writers. Unless someone is playing with matches.

Oh sure, the DB might let you enforce transactions and its ACID goodness. But that only defends against some technical oopsies. There are plenty of other gotchas that shared mutable state exposes you to.

But even that isn't what bugs me the most about a global variable. Ya see, it's a hidden dependency. I don't like code that knows too much about other code. Code that has an opinion about where this global var is will get upset if I move it. I prefer to ask to be handed what I need and let something else tell me what it is.

Now sure, a DB query has this same issue. But that's what repositories take care of. With that I can isolate what knows how to find this thing so moving it later isn't a search through the whole code base.

So yeah, there's a lot that's the same. Still some differences.

candied_orange
  • 119,268
3

"Globals considered harmful" is an idea we hear from time to time, and many of the answers you've linked repeat the usual canards.

We had a previous Q&A not so many months ago: https://softwareengineering.stackexchange.com/a/446415/292095. There were also some additional comments beneath that question, and the context was about DI which you also mention in your question here.

Globals are, quite obviously, different in many ways from database engines. A global is simply a field in memory when a program runs. A database engine is a considerably more complicated facility for interacting with durable storage.

I think the similarity you are perceiving is that the storage they provide is somehow open and shareable (in the most abstract way).

It's like suggesting a banking corporation is very similar to a wallet you put in your pocket, in that both may hold your money. But you've gone somewhere where the oxygen is too thin, once you start suggesting that the similarities between banking corporations and pocket wallets are more remarkable than their differences.

We can certainly say a banking corporation and a pocket wallet differ in their "interface", but again few sensible people would consider the "interface" to be the main difference.

And often, this kind of thinking leads to an even more fantastical journey of trying to describe what the "interface" actually is. For example, asking a bank teller for money they are holding, is not appreciably different from asking your spouse for money they are holding.

But you can travel around with your spouse, and they can carry the money around with them, in a way a bank clerk will not/cannot. The essence of the difference is not merely in how you "interface" with the money, but everything about how the money is stored, where it is stored, how much security it has, at what geographic place a demand can be made for the money, the nature of the relationship with the person dispensing the money, and so on.

In my view, the difference between a global and a database is not merely their interface, but the whole form of their workings (including both innards and outer surfaces) which are really nothing alike at all in any conventional understanding. They would not be used for the same things or under the same circumstances, and the manner and conditions under which they store data is sufficiently different that they are not typically interchangeable in an application design.

Steve
  • 12,325
2

This question relies on some significant oversimplification of the concepts it's referring to.

The issue with global state is one of unavoidably having a singular reference to the same state. The advice being given to not use global state is not saying "you should never refer to the same thing twice", it's saying "it's not great if you can never refer to something else". These are two very different statements.

When you access global state, you are making a hardcoded reference to a specific global field. If you want to make use of another state, you have to update your code to refer to the other global field.
A database, however, is abstracted by a connection string at a bare minimum. If you change the connection string, you change the underlying content that you receive, even if you launch the exact same query.

The point of the advice is not that it's never allowed for multiple actors to share the same source of truth if they so choose, the point is that it would be bad if these actors were unable to choose different sources of truth, even when they wanted to.

The latter can be achieved via databases, as you can trivially change the connection string without needing to change the logic itself (if you hardcode your connection string in your code, that's a different problem).
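To illustrate the contrast, here is a small sketch. `FakeDatabases` is a purely hypothetical stand-in for a real driver, and the `db://...` strings are invented; the only point is that the same logic can be aimed at a different source of truth by changing a string:

```java
import java.util.HashMap;
import java.util.Map;

// Hardcoded global: every caller is tied to this one field.
class AppData {
    static int someValue = 1;
}

// Stand-in for a database server: named stores looked up by a
// connection string. Purely illustrative; no real driver is involved.
class FakeDatabases {
    private static final Map<String, Map<String, Integer>> STORES = new HashMap<>();

    static Map<String, Integer> connect(String connectionString) {
        return STORES.computeIfAbsent(connectionString, k -> new HashMap<>());
    }
}

class Report {
    // Tied to one source of truth; pointing it elsewhere means editing code.
    static int fromGlobal() {
        return AppData.someValue * 2;
    }

    // Same logic, but the caller chooses the source of truth.
    static int fromDatabase(String connectionString) {
        Map<String, Integer> table = FakeDatabases.connect(connectionString);
        return table.getOrDefault("someValue", 0) * 2;
    }
}

class SourceOfTruthDemo {
    public static void main(String[] args) {
        FakeDatabases.connect("db://production").put("someValue", 10);
        FakeDatabases.connect("db://test").put("someValue", 3);

        System.out.println(Report.fromDatabase("db://production")); // 20
        System.out.println(Report.fromDatabase("db://test"));       // 6
        System.out.println(Report.fromGlobal());                    // 2
    }
}
```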

I also want to point out that it's very easy to respond to this with "but you could also refer to different global states by doing [..]" which is not necessarily factually incorrect, but it's usually an unreasonably contrived approach, and the recipients of the "don't use global state like that" advice aren't even doing that contrived solution in the first place anyway.

At the end of the day, global fields and a database are two different things with very different use cases, and you can't just take advice about one and blindly apply it to the other, even if it sometimes sounds semantically ambiguous in English.

Flater
  • 58,824
1

@DocBrown is correct, but I'd like to offer a condensed revision with practical focus.

If a database is accessed via connection URI from multiple components, it is a shared mutable state. But the database handle is usually explicitly passed around, making it less shared.

Shared variables can be easily made non-shared by reducing their visibility and explicitly passing references to them, but this is never done. Instead, they are sometimes wrapped in a Singleton, which is also a shared (and often mutable) state and achieves nothing.

Basilevs
  • 3,896
1

Before going into why there is a difference, I think it’s useful to understand what globals represent, and the way to do that is to compare them to their polar opposite: pure functions. And not just pure functions, but the purest of the pure - functions that have no parameters or where the parameters are just values (so the equivalent of byte arrays) without any pointers or references or functions in them.

If you look at a function FortyTwo which just returns the value, 42, you know what the result will be when you see it called, there is no ambiguity about the results, the result WILL be 42. It’s consistent and understandable in its entirety.

If you look at the function Square(n) which returns n*n, you again understand it in its entirety. Other than possible overflow values, you know the results.

Such functions can be either memoized or possibly replaced with precomputed values in the compiled output.

So goes, the purest of the pure...but what happens when you make them less pure? Introduce parameters with external references (pointers, references or even worse, functions). It’s now no longer so easy to understand the function, because it’s harder to understand the input. It is no longer so consistent, the result depends not just on the raw input, but what that input does. Take Map(values, fn), what will it return? From looking at Map, you don’t know, because you don’t know what fn does. You don’t even know what, if any, side effects it has. It may erase your hard drive, send an email, or do anything really.

Ok, so it’s not technically pure, but if you know what values it is called with, you can still sorta understand what result it will give…which leads us to globals.

Globals are hidden parameters that the PROGRAMMER provides, not the calling function. Why is the fact that it’s the programmer providing the value important? Because it may not be the correct parameter. Globals are infamous for being duplicated, IsThis versus NotThat where the two variables represent the same conceptual value, but are named differently and set at different points in the program flow. Not such a big deal if the value is immutable and both get set to the same value. If it’s mutable, well, now you have the potential for a big problem.

With globals, the caller can’t have a reliable understanding of what the result should be, because they can’t know the input.
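A tiny sketch of the "hidden parameter" point (the `scale` field and `scaledSquare` are hypothetical, invented only for this illustration):

```java
class PurityDemo {
    // Hypothetical global, standing in for any mutable static field.
    static int scale = 1;

    // Pure: the result depends only on the argument.
    static int square(int n) {
        return n * n;
    }

    // Impure: the static `scale` is a hidden parameter the caller never passed.
    static int scaledSquare(int n) {
        return scale * n * n;
    }

    // The hidden parameter made explicit; pure again.
    static int scaledSquare(int n, int scale) {
        return scale * n * n;
    }

    public static void main(String[] args) {
        System.out.println(square(3));          // always 9
        System.out.println(scaledSquare(3));    // 9 for now...
        scale = 2;                              // ...until some distant code does this
        System.out.println(scaledSquare(3));    // 18 for the same argument
        System.out.println(scaledSquare(3, 2)); // 18, and the call site says why
    }
}
```

The one-argument `scaledSquare(3)` looks identical at both call sites, yet returns different results; nothing at the call site tells the reader which one to expect.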

Globals are spooky action at a distance.

So, how does this differ from a database? Mainly in two ways. First, it is explicit: you have some kind of database querying parameter, which returns a value that isn’t known to the caller, but the caller has explicitly set it up or inherited it from one of its own parameters. It’s really no worse than passing in an Object and the function using Object.Property. Secondly, you are much less likely to have two different values that represent the same thing, but don’t. Adding to the database typically involves a process that is more rigorous than adding a new global.

In theory, the two are not really that different at all. In practice, the differences are huge.

jmoreno
  • 11,238