
When Git is used to store documents in a distributed, decentralized way, it can be regarded as a database.

How would the ACID properties and the CAP theorem apply to Git in this case?

I think one has to distinguish between a single repository and the whole network of repositories.

ACID - for a single repository:

  • A would be OK, since changes are recorded as commits
  • C depends on the use case and is thus not relevant here
  • I is the big question:
    • if only one branch is used, it should be fine
    • if multiple branches are considered, I would be fine as long as a rebase (or merge) onto a master branch (or even onto all other branches) is possible without conflicts, but not if a rebase would result in a conflict
  • D depends on your hard disk but should generally be fine
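
To make the isolation criterion above more concrete, here is a minimal Python sketch of "two branches are isolated if they merge without conflict". It is only a model under stated assumptions: snapshots are plain dictionaries from path to content, and the conflict check works on whole files, which is coarser than Git's actual hunk-level merge. The names `Snapshot`, `changed_paths`, and `merges_cleanly` are hypothetical and not part of Git.

```python
from typing import Dict, Set

# Hypothetical model: a snapshot maps file paths to file contents.
Snapshot = Dict[str, str]


def changed_paths(base: Snapshot, tip: Snapshot) -> Set[str]:
    """Paths added, removed, or modified between base and tip."""
    paths = set(base) | set(tip)
    return {p for p in paths if base.get(p) != tip.get(p)}


def merges_cleanly(base: Snapshot, ours: Snapshot, theirs: Snapshot) -> bool:
    """File-level three-way check: a conflict exists only if both
    branches changed the same path to different results.
    Assumption: real Git merges per hunk, so it may succeed in cases
    that this coarse model reports as conflicting."""
    both = changed_paths(base, ours) & changed_paths(base, theirs)
    return all(ours.get(p) == theirs.get(p) for p in both)


base = {"a.txt": "1", "b.txt": "1"}
branch_x = {"a.txt": "2", "b.txt": "1"}   # edits a.txt only
branch_y = {"a.txt": "1", "b.txt": "2"}   # edits b.txt only
branch_z = {"a.txt": "3", "b.txt": "1"}   # edits a.txt differently

independent = merges_cleanly(base, branch_x, branch_y)
conflicting = merges_cleanly(base, branch_x, branch_z)
```

In this model, `branch_x` and `branch_y` behave like isolated transactions, while `branch_x` and `branch_z` do not, which corresponds to the rebase-with-conflict case.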

ACID - for a distributed setup:

  • this would be the same as above, except that I has to be considered with respect to all other clones in the network of repositories

CAP:

  • C
    • for a single repository this would be ok
    • in a distributed setup this would only hold if every read operation is preceded by a pull
  • A: this would always be true, since the local copy is always available
  • P: if a pull is not possible, then one has to decide whether to sacrifice A or C (as in the PACELC theorem). One could also check whether a majority of the remote repositories is available and use a quorum approach.
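
The quorum idea mentioned under P could be sketched as follows. This is a hypothetical illustration, not an existing Git feature: it assumes each clone reports the commit id of its branch head, with `None` standing for an unreachable clone, and the function name `quorum_head` is made up for this example.

```python
from collections import Counter
from typing import Optional, Sequence


def quorum_head(heads: Sequence[Optional[str]]) -> Optional[str]:
    """Return the head commit id reported by a strict majority of ALL
    clones (reachable or not), or None if no such majority exists.
    Assumption: None marks a clone that could not be contacted."""
    reachable = [h for h in heads if h is not None]
    if not reachable:
        return None
    commit, votes = Counter(reachable).most_common(1)[0]
    return commit if votes > len(heads) // 2 else None


# Three clones, one unreachable, the other two agree: quorum reached.
agreed = quorum_head(["abc123", "abc123", None])
# Three clones, one unreachable, the other two disagree: no quorum.
split = quorum_head(["abc123", "def456", None])
```

A read would then be treated as consistent only when `quorum_head` returns a commit id. When it returns `None`, one either refuses the read (giving up A) or serves the local copy (giving up C).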

(I understand the CAP-C more like the ACID-A, see https://dba.stackexchange.com/a/202125/147543)

Is this a valid interpretation of ACID and CAP as they are used in the database domain? Or is there already an ongoing discussion about Git with regard to ACID and CAP?

white_gecko