How to Sync MySQL Databases when offline?

Question

My application that is running on a client uses a MySQL database running on a server. So multiple clients are connected to the same server. That works well when the server is online. But now I would like to enhance my application to be able to run in an offline mode.

                +--------------+
                |              |
    +-----------+   SERVER     +----------+
    |           |              |          |
    |           +-------+------+          |
    |                   |                 | 
+------+-------+   +-------+------+  +-------+------+
|              |   |              |  |              |
|  Client 1    |   |  Client 2    |  |  Client X    |
|              |   |              |  |              |
+--------------+   +--------------+  +--------------+

Now comes the problem: what happens when the client is offline? I need a copy of my MySQL database on each client too. By default the application interacts with the MySQL on the server. If this server is not accessible (for what reason ever: server is offline or client has no internet connection) it should use the MySQL running on the client. If the client/server connection is available again the databases need to be synched automatically.

My question is now: how to achieve this? First of all I checked the MySQL-replication, but in my scenario I have multiple "masters" and an unknown number of clients. So I afraid that replication is not my solution. Is it possible to solve my problem with MaxScale? I never worked with that so I really appreciate any help.

J.D. · Answer 1 · 2021-09-19T17:03:59.800

I think you're looking at your scenario the wrong way. Firstly, is your client application a mobile application? Because then that makes more sense why you want to implement an offline mode. It's unusual to implement offline database support for a desktop application where network connection is usually relatively reliable.

If it is a mobile application, then you need to use the right tool for the right job. MySQL isn't supported on mobile devices, but SQLite is and usually the general choice of a local database system that can handle offline mode. There are also a series of other mobile supported database systems that can be used instead of SQLite as well.

You would still want your main remote database system as the place to consolidate all the data once a device is back online, which can indeed be MySQL. But as mentioned in the comments by dbdemon, you need a way to handle resolving conflicts, especially if you allow multiple client applications to modify the same records of the database. Likely this would need to be handled in the application layer or a mix of the application layer and the database layer, but you would need to codify it yourself to a degree. If MySQL has some sort of change tracking feature then that could help greatly in determining what data is new since the last time a client device has synced. Perhaps this StackOverflow question regarding change tracking has some answers helpful to you.

If you are talking about a desktop application you're trying to support offline mode for then we'd probably need more information on why and what problem you're trying to solve in doing so, since it is an unusual use case.

Bill Karwin · Answer 2 · 2021-09-19T17:03:17.857

You commented:

So there is no really "simple" solution to implement the scenario described? It's so strange because I don't want to believe that my scenario is so unusual...

It's not an unusual requirement, but being common doesn't make it simple or easy to solve.

The problem every project that has this requirement may have a different idea about how it should work. So there's no way for an out-of-the-box solution to be right for everyone.

Suppose a couple of your clients are offline, and they both write changes to their respective offline databases. Then the internet is restored and you want them to get back into sync. But they each updated some of the same rows in a given table. Which one should take priority? Should it be first come, first served? Should it be the last update wins? Should it be a merge conflict, requiring manual intervention to resolve (like a git merge conflict)? You might be able to answer this for your own project, but not for every other project.

Products like Galera handle this by allowing nodes to accept writes only if they are connected to the cluster. Offline nodes automatically become read-only.

I don't know much about MaxScale, but it apparently is a proxy to a traditional MariaDB replica set, with one writer and N read-only replicas.

score 0 · Answer 3 · answered Oct 04 '21 at 07:54

You might want to think outside the box here. Your main issue isn't related to a database solution that will somehow replicate the clients that are unable to connect to the central database, but more of a conceptual issue of how to replicate database clients that can be OFFLINE (not connected to the main server) or ONLINE (connected to the main server) and/or the data that is modified OFFLINE or ONLINE.

The (Basic) Requirements

For the pseudo concept I'm thinking of implementing a CRM (customer relationship management) tool, where salespeople can insert and modify data on their laptops. The data has to be synced with a master database (looks like what you are trying to achieve).

Locations

So I'll have a centralized database and databases on the clients. The clients are located all over the world. Some report directly with the main headquarters in California (USA), other salespeople will be located in different locations like London (UK), Paris (FR), etc.

Questions to Ask

Will the data be stored in the decentralized locaitons?
Will the data be stored only in the central locaiton?
Will people be modifying data when on the road?
Will people at the decentralized locations be modifying data directly?
On which database/system?
What if the link to California goes down?
How will the assistants work with the CRM data?

Data Sovereignty

If changes are made to the data, who takes precedence? Is the salesperson the master of modifications? The assistant? The person at Headquarters in California?

Could the data be split so as to allow for multi-side modification? HQ is allowed to change the address of the customer, but the salesperson is responsible for the actual (personal) contact information of the company representatives.

Do local decentralized locations (e.g. London, Paris) know more about the local companies than the HQ in California? Would it be better for them to have a decentralized country database? If so should certain countries be linked to the decentralized databases, rather than allowing HQ to be the master of all data?

Online / Offline

Is OFFLINE actually OFFLINE or just OFFLINE from a (de-)centralized main database? Is ONLINE the connection to the (de-)centralized main database?

Possible Concepts

Could you envision a main / satellite / client setup where each location has its own database? (Could be anything from MySQL to Microsoft SQL Server to Centura/Gupta).
You could have markers on certain tables (two columns) that determine when a record was last modified. This could be a date column lastmodified and a changedby column 00000000000000000000000..... which contains the position of the last person to modify the data.

Contents of changedby Column in Contact Person Table

111000000000000000000100000000000000010000000000....
|||                  |               |
|||                  |               +---------: Changed by Salesperson Levrat in Paris
|||                  |
|||                  |
|||                  +-------------------------: Changed by Salesperson Murry at HQ
|||
|||
|||
|||
||+--------------------------------------------: Changed in Paris
|+---------------------------------------------: Changed in London
+----------------------------------------------: Changed at HQ

When the decentralized server in Paris syncs with the HQ it overwrites the changes made at the HQ, because Paris knows better what Contact Person information is relevant for the customer in Bordeaux.

When the salesperson Levrat comes back into the office, then his information overwrites the data in Paris. Which subsequently is resynced with the database at HQ.

Each synchronization resets the flags.

Building a Solution

Now this is just a basic example and if you add the lastmodified column, then you could possible build a synchronisation the permits the last changed record to overwrite anything up or down the line, depending on your requirements.

Add another column and you have could have certain data only synchronized to the satellite locations.

Add a pulldown button to the applicaiton login and the client can login to the central (HQ), decentralized (PARIS) or local database.

Add a push-button in the client applicaiton and the client is asked to synchronize with Paris (default) or HQ (optional).

Add the split between main customer data and customer representatives and you can sync some data in one direction and other data in the other direction depending on who has data sovereignty.

Requirements Engineering

First you need to know what your current requirements are and what your future requirement could be. Then you can start to think about implementing a tool (out of the box), a complex solution or your own custom tailored build.

It's a long journey. If you take the time to analyze your requirements first and to invest a lot of time before you even start programming or looking for a solution, then you'll pass the finishing line. Otherwise you may fail to reach the finishing line....