Dependency promotion strategies: siloed or orchestrated?

Question

We have a lot of apps and web services (some public-facing products, some internal and part of a private "backend") that depend on one another. Each of these components has 4 environments (clusters of servers/nodes serving specific purposes):

Non-Production
- DEV - Integrated development environment where CI builds push changes; useful for engineers to troubleshoot hard-to-find bugs that are not locally reproducible
- QA - Isolated QA/Testing environment
- DEMO - Stable UAT environment for business stakeholders
Production
- LIVE - Our live/production environment

Code promotion goes: LOCAL (developer's machine) => DEV => QA => DEMO => LIVE.

Say we have an application called myapp that is backed by a RESTful web service called myws, which itself is backed by a DB called mydb.

Currently, we have what I would call "orchestrated" promotion amongst these dependencies: the myapp-dev points to myws-dev which uses mydb-dev. Similarly, myapp-qa points to myws-qa which uses mydb-qa. Same for DEMO and LIVE.

The problem with this is that anytime I make a change to, say, myapp, this requires me to make changes to myws and mydb as well. But because each DEV environment points to its dependencies' DEV environments, it means I have to schedule and roll out these changes all at the same time. Furthermore, if one build becomes unstable/broken, it often brings down other upstream components; for instance if a developer breaks something when changing mydb-dev, the myws-dev and myapp-dev clusters usually also become unstable.

To solve this, I am putting together a proposal for what I would call a "siloed" promotion strategy: all inter-component dependencies follow this guideline:

- Upstream dependencies depend on the DEMO environment of their downstream dependencies for all of their non-production environments (DEV, QA and DEMO); and
- Upstream dependencies depend on the LIVE environment of their downstream dependencies for their production environment

Using this convention, myapp-dev would actually point to myws-demo, which would use mydb-demo. Similarly, myapp-qa would also point to myws-demo and mydb-demo.
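To make the difference concrete, here is a minimal Python sketch of how each strategy would pick a dependency's environment. The function names and the component-env naming pattern are only illustrative assumptions, not our real configuration.

```python
# Hypothetical sketch: which environment of a dependency does a component call?
NON_PROD = {"DEV", "QA", "DEMO"}


def orchestrated_target(env: str) -> str:
    """Current setup: every environment calls the same environment downstream."""
    return env


def siloed_target(env: str) -> str:
    """Proposal: all non-production environments call DEMO; LIVE calls LIVE."""
    if env == "LIVE":
        return "LIVE"
    if env in NON_PROD:
        return "DEMO"
    raise ValueError(f"unknown environment: {env}")


def endpoint(component: str, env: str) -> str:
    """Build a name such as 'myws-demo' (naming pattern assumed for illustration)."""
    return f"{component}-{env.lower()}"


for env in ("DEV", "QA", "DEMO", "LIVE"):
    caller = endpoint("myapp", env)
    print(caller, "-> orchestrated:", endpoint("myws", orchestrated_target(env)),
          "| siloed:", endpoint("myws", siloed_target(env)))
```

With the siloed rule, myapp-dev, myapp-qa and myapp-demo all resolve to myws-demo, and only myapp-live resolves to myws-live.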

The advantage I see here is build stabilization: it is much less likely that the DEMO environment for a particular component will become unstable, because code can't make it to DEMO without rigorous testing on both DEV and QA.

The only disadvantage I can find to this method is that, if DEMO does break for a particular component, all the non-production environments for all upstream dependencies will suddenly be broken. But I would counter that this should happen extremely rarely because of the testing performed on DEV and QA.

This has got to be a problem that many developers (much smarter and more experienced than myself) have solved, and I wouldn't be surprised if this problem and its solutions already have names (besides what I am calling orchestrated/siloed). So I ask: Do the merits of a siloed promotion strategy outweigh any cons, and what are the cons that I may be overlooking here?

score 7 · Accepted Answer · answered Jan 17 '15 at 21:21

If I'm reading your post right, it doesn't seem like this proposal actually solves either of the alleged problems.

anytime I make a change to, say, myapp, this requires me to make changes to myws and mydb as well. But because each DEV environment points to its dependencies' DEV environments, it means I have to schedule and roll out these changes all at the same time

The "siloed promotion strategy" seems like it would only make this worse. If myapp v2, myws v2 and mydb v2 are only on DEV, and myapp v2 relies on mydb v2 to not crash, then when I try to run myapp v2 on DEV I'll hit mydb v1 DEMO and it crashes. You'd essentially be forced to either constantly override the silos, or deploy mydb v2 all the way to prod before you can even start working on myapp v2. More importantly, you would never be testing mydb v2 on DEV, so if it is broken you don't even find out until it breaks on DEMO, and then you're back at square one.

To some extent, the problem you describe is inevitable no matter how your workflow is set up, but it can be minimized. The trick is to ensure that the interface between mydb and myws (and the interface between myws and myapp) is strictly defined, and require all updates to that interface to be fully backwards compatible. At my job we have an XML schema defining the interface between our apps and services, and many of our internal tools simply won't let us make incompatible updates to those schemas.
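As a rough illustration of the kind of check I mean (this is not our actual tooling, and it models an interface as a plain dict of request fields rather than a real XML schema), a backwards-compatibility gate could look something like this in Python. The function name and the field-to-required mapping are assumptions made up for the sketch.

```python
def incompatible_changes(old: dict[str, bool], new: dict[str, bool]) -> list[str]:
    """List the ways `new` would break a client written against `old`.

    Each interface maps a request field name to True (required) or
    False (optional). An empty result means the update is treated as
    backwards compatible under these simplified rules.
    """
    problems = []
    # Rule 1: a field that old clients may send must still exist.
    for field in old:
        if field not in new:
            problems.append(f"removed field: {field}")
    # Rule 2: old clients cannot be forced to send something new.
    for field, required in new.items():
        if required and not old.get(field, False):
            problems.append(f"field is now required: {field}")
    return problems


v1 = {"id": True, "name": False}
v2_ok = {"id": True, "name": False, "email": False}  # only adds an optional field
v2_bad = {"id": True, "email": True}                 # drops name, requires email

print(incompatible_changes(v1, v2_ok))
print(incompatible_changes(v1, v2_bad))
```

If a check like this runs before any interface change is accepted, myws v2 can be deployed to DEV while myapp v1 keeps working against it, so the two no longer need to be rolled out in lockstep.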

Furthermore, if one build becomes unstable/broken, it often brings down other upstream components; for instance if a developer breaks something when changing mydb-dev, the myws-dev and myapp-dev clusters usually also become unstable.

This sounds to me like a serious problem, but not a deployment problem. A broken database will certainly prevent the service and app from working properly, but they shouldn't become "unstable". They should be returning error messages of some kind, so that anyone running myapp on DEV sees "We're sorry, our database is having problems today" instead of the app simply crashing.
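A minimal sketch of what I mean, in Python. The exception class, the stand-in database client and the handler are all hypothetical; the only point is that the failure is caught at the service boundary and turned into an error response.

```python
class DatabaseUnavailable(Exception):
    """Raised by the (hypothetical) data layer when mydb cannot be reached."""


class BrokenDb:
    """Stand-in for a mydb client whose backing database is down."""

    def get_user(self, user_id):
        raise DatabaseUnavailable("mydb-dev is not responding")


def handle_get_user(db, user_id):
    """Return a response dict; a database failure becomes an error message."""
    try:
        return {"status": 200, "body": db.get_user(user_id)}
    except DatabaseUnavailable:
        # 503 tells the caller the problem is downstream and likely temporary.
        return {
            "status": 503,
            "body": "We're sorry, our database is having problems today",
        }


response = handle_get_user(BrokenDb(), 42)
print(response["status"], response["body"])
```

myapp can apply the same pattern one level up: when myws answers with an error status, show the message rather than falling over.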

If the issue is that a broken database causes these problems at all, then what you can do is set up some "temporary siloing" system that lets you say "mydb DEV is broken now, please route all myws DEV queries to mydb DEMO for the time being". But this should only be a way to perform temporary fixes until mydb DEV is working normally again. If everything is "siloed" that way by default, then you're back at the problems I described above because nobody ever runs against mydb DEV at all.
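One way to picture that "temporary siloing" switch is a small override table consulted when a dependency is resolved. This is only a sketch under my own assumptions: the `resolve` function, the override dict and the component-env naming are invented for illustration.

```python
def resolve(component: str, env: str, overrides: dict) -> str:
    """Pick the dependency to call, honouring any temporary override.

    `overrides` maps (component, env) to the environment to use instead.
    With no entry, the dependency's own environment is used as normal.
    """
    target_env = overrides.get((component, env), env)
    return f"{component}-{target_env.lower()}"


overrides = {}
print(resolve("mydb", "DEV", overrides))   # mydb-dev (normal routing)

# mydb DEV is broken: route myws DEV queries to mydb DEMO for now.
overrides[("mydb", "DEV")] = "DEMO"
print(resolve("mydb", "DEV", overrides))   # mydb-demo

# mydb DEV is fixed: remove the override so DEV gets exercised again.
del overrides[("mydb", "DEV")]
print(resolve("mydb", "DEV", overrides))   # mydb-dev
```

The important design choice is that the default is the normal same-environment routing and the override has to be added and removed deliberately.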

I feel like I'm probably misunderstanding your proposal somehow, but hopefully this answer at least makes it obvious what was being misunderstood and how best to rephrase it.
