
A significant part of the activity of software engineers working on automated testing, continuous delivery, service integration and monitoring, and building SDLC infrastructure aims to reduce the operational risks bearing on the systems they work on. Estimating the economic value of these efforts requires a risk model. As a first approach, it seems reasonable to put together an ad hoc model in which each risk occurrence is governed by a Bernoulli variable (roughly speaking, a coin that can be weighted to fall more or less often on a given side). But how do we find sensible parameters for this model?
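To make the kind of ad hoc model I have in mind concrete, here is a minimal sketch in Python. The probabilities and the outage cost below are placeholders I made up for illustration; finding defensible values for them is exactly the part I am stuck on:

```python
import random

def expected_loss(p, cost):
    """Expected yearly loss from a risk modelled as a Bernoulli variable."""
    return p * cost

def mitigation_value(p_before, p_after, cost):
    """Value of a risk-reduction effort: the drop in expected loss."""
    return expected_loss(p_before, cost) - expected_loss(p_after, cost)

def simulated_average_loss(p, cost, years=100_000, seed=1):
    """Monte Carlo sanity check: average realised loss over many simulated years."""
    rng = random.Random(seed)
    return sum(cost for _ in range(years) if rng.random() < p) / years

# Placeholder numbers, for illustration only.
p_without_drills = 0.05   # assumed 5% yearly chance of an unrecoverable outage
p_with_drills = 0.01      # assumed 1% yearly chance once monthly drills are run
outage_cost = 500_000     # assumed cost of such an outage, in dollars

print(expected_loss(p_without_drills, outage_cost))                    # ≈ 25,000 per year
print(mitigation_value(p_without_drills, p_with_drills, outage_cost))  # ≈ 20,000 per year avoided
print(simulated_average_loss(p_without_drills, outage_cost))           # should land near 25,000
```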

To take a concrete example: if I develop a disaster recovery procedure and run drills each month to prove that it works, I would need to estimate the odds of my company going bankrupt because it is unable to recover quickly enough from a disaster, and I was unable to find any sensible data on this.

Is there any sensible risk model, together with its parameters, that can be used to estimate the value of efforts invested in risk-reduction?

Dan Cornilescu
Michaël Le Barbier

2 Answers


You are probably unable to find data on this for two main reasons. Firstly, because it varies from system to system. Does an outage cost you money due to contractual obligations, or only in lost reputation and user perception? How much does that cost? It varies from organization to organization. But I think that might be what you are getting at with the Bernoulli variable...

...Which brings me to the second thing. In order to be able to model risk, you need research in which someone has already statistically proved or measured the efficacy of DevOps methodologies (which means you have to define exactly what DevOps is first, a notoriously tricky issue).

You also need to be able to compare that to old methods - meaning that someone must also have measured data for the segmented Dev/Ops model. I'm not sure that research exists or is settled, probably because the efficacy and cost of Dev/Ops also vary widely from organization to organization. In that respect, there may be an opportunity to measure the old methodology and then show the average improvement from changing methodologies, if you can find enough research subjects.

In short, it is doubtful that such a formula for risk modeling exists. If it does, it likely exists as paid market research from a consulting firm and is unlikely to be available to the public; I would personally also be very dubious of the accuracy of its results, given the subjectivity of the input variables being measured.

James Shewey

Is there any sensible risk model, together with its parameters, that can be used to estimate the value of efforts invested in risk-reduction?

Yes, of course. For non-life-threatening risks (say, your common bug which just incurs some cost to fix): take the chance of the problem happening, multiply it by the cost, and there you go. This is easily done by going through your bug ticket system, where you have hopefully recorded the effort that went into fixing each issue.
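As a rough sketch of that back-of-the-envelope arithmetic (the bug categories, counts, and hourly rate below are hypothetical stand-ins for whatever your ticket system actually records):

```python
# Hypothetical export from a ticket system: (category, occurrences last year, avg hours to fix)
tickets = [
    ("deploy rollback needed", 6, 3.0),
    ("data import failure", 2, 8.0),
    ("flaky CI pipeline", 12, 1.5),
]

HOURLY_RATE = 100  # assumed fully loaded engineering cost per hour, in dollars

for category, occurrences, hours in tickets:
    expected_yearly_cost = occurrences * hours * HOURLY_RATE
    print(f"{category}: roughly ${expected_yearly_cost:,.0f} per year")
```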

Then there are the life-threatening risks, where you do not actually need to factor in the cost, but where there is no question that you do your utmost to avoid them. For example, if you are a small company with one product and you lose your source code, then it is not a question of how much that costs: you can just pack it in. You will want a backup strategy that is, to all intents and purposes, unbeatable (for example, the boss takes a current set of physical backup tapes home each weekend; yes, I have worked in companies where that was done, in addition to all the usual measures like snapshotted file systems, distributed git repositories, on-site backups, off-site online backups, and so on).

Nothing about this is particularly related to DevOps. DevOps just happens to bring a few tools to the table (like CI/CD pipelines, relatively easy HA through read-only redundant containers/pods etc.).

AnoE