66

Quite often in my work the idea of 2-way data synchronisation between database systems crops up. The classic example is two slightly different CRM systems (say, Raiser's Edge and Salesforce) and the need to have a two-way sync of Contact data between them.

API considerations aside, assuming you have a shared key to sync on, and purely thinking of the algorithm/pattern to be employed, this is a task that is often underestimated by non-techies.

For example, you have to watch out for:

  • Can you easily detect which records have changed in both systems (or will you have to compare all records between the systems to detect changes)
  • If you're going for a once-every-N-hours sync, how to deal with conflicts where the same record changes at more-or-less the same time in both systems
  • If you're going for a real-time sync (ie an update in one system immediately triggers an update to the other system) how to handle divergence over time due to bugs or system crashes

Personally I can think of ways to tackle all this but I'm wondering if there are any well known patterns, literature or best practices that I could refer to.

codeulike
  • 849

2 Answers2

10

Yes, a hard problem, easily underestimated. And could be a lot of work. If you are on Microsoft technologies, you may want to have a look at Microsoft Sync Framework here and here.

0

There are many theories about remote site DB synchronization. First start with INSERT. handling this one is easy - as you can create a unique ID for every site (for example an initial of the site name + ID (number): site_a_177 vs. site_b_53)

So insert should not create any conflicts. the problem is the update. I don't believe that there's a 100% failure proof method, but you can start an update by "locking" the record in the remote DB, and only after you got the handle - continue with the update, and finish by syncing the update and only then release the lock.