
I’m currently building a dashboard to view some analytics about the data generated by my company's product. We use MySQL as our database. The SQL queries that generate the analytics from the raw live data can be fairly complicated and take a long time to process, so I put in place batches that run every day or every hour: they query the live data, compute the analytics, and store the results in special tables that only the dashboard queries. It works well, but the drawback is that the analytics are not real-time.
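One common way to shrink the batch cost enough that it can run every minute is to make the batch incremental: track the last processed row id and aggregate only the rows added since the previous run. A minimal sketch of that idea, using SQLite as a stand-in for MySQL (the `events`, `daily_stats`, and `batch_state` table names and columns are hypothetical, not from the post):

```python
import sqlite3

# Stand-in schema: a live "events" table, a summary table the dashboard
# reads, and a one-row state table remembering the last processed id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INT, amount REAL);
    CREATE TABLE daily_stats (metric TEXT PRIMARY KEY, value REAL);
    CREATE TABLE batch_state (last_id INT);
    INSERT INTO batch_state VALUES (0);
""")

def run_incremental_batch(conn):
    """Aggregate only rows added since the last run, so each run stays
    cheap enough to be scheduled every minute instead of every hour."""
    (last_id,) = conn.execute("SELECT last_id FROM batch_state").fetchone()
    delta, new_last = conn.execute(
        "SELECT COALESCE(SUM(amount), 0), COALESCE(MAX(id), ?) "
        "FROM events WHERE id > ?", (last_id, last_id)).fetchone()
    # Upsert: add the increment to the running total the dashboard reads.
    conn.execute(
        "INSERT INTO daily_stats (metric, value) VALUES ('revenue', ?) "
        "ON CONFLICT(metric) DO UPDATE SET value = value + excluded.value",
        (delta,))
    conn.execute("UPDATE batch_state SET last_id = ?", (new_last,))
    conn.commit()

# Simulate live writes interleaved with two batch runs.
conn.executemany("INSERT INTO events (user_id, amount) VALUES (?, ?)",
                 [(1, 10.0), (2, 5.0)])
run_incremental_batch(conn)
conn.execute("INSERT INTO events (user_id, amount) VALUES (3, 2.5)")
run_incremental_batch(conn)
total = conn.execute(
    "SELECT value FROM daily_stats WHERE metric = 'revenue'").fetchone()[0]
print(total)  # 17.5
```

This only works cleanly for additive metrics (sums, counts); aggregates like medians or distinct counts still need a full recompute or a more elaborate scheme.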

So I would like to know the best practice for my requirements. I don’t need strict real-time, but near real-time with a lag of one or a few minutes.

I would like to know whether replicating the live data from MySQL to something like Hadoop or Elasticsearch would be a good solution.

Julien

1 Answer


There are multiple things to consider in your case. What tool are you using to build your dashboard? Some tools, such as Tableau, can help by creating extracts, though this may cost money.

You could also try PostgreSQL: it is free, and it often performs better than MySQL on complex analytical queries. You could tighten your batch process toward near real-time, cleaning the data and saving it in PostgreSQL, and point your dashboard there.

If you have the budget and the flexibility to store data in the cloud, look into Google BigQuery. You could set up a batch process within BigQuery, which distributes the work across many nodes and processes data very quickly (we used it at my company to cut our processing time from 22 hours to 2 hours).

Try optimizing your queries with indexes as well. This may not improve processing times dramatically, but it should help if your batch queries filter or join on unindexed columns.
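To see whether an index would actually change the query plan, you can compare the plan before and after creating one. A small illustration using SQLite as a stand-in for MySQL (the `events` table and `created_at` column are hypothetical; in MySQL you would use `EXPLAIN` instead of `EXPLAIN QUERY PLAN`, and the same `CREATE INDEX` statement applies):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, user_id INT)")

query = ("EXPLAIN QUERY PLAN "
         "SELECT COUNT(*) FROM events WHERE created_at >= '2024-01-01'")

# Without an index, the filter on created_at forces a full table scan.
plan_before = conn.execute(query).fetchall()
print(plan_before[0][-1])  # e.g. a SCAN of the events table

# Index the column the batch queries filter on.
conn.execute("CREATE INDEX idx_events_created_at ON events (created_at)")
plan_after = conn.execute(query).fetchall()
print(plan_after[0][-1])  # the plan now mentions idx_events_created_at
```

The general habit this sketches carries over directly: run the query planner on your slow batch queries, and add indexes on the columns in `WHERE`, `JOIN`, and `GROUP BY` clauses that the plan shows being scanned.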

Hope this helps.

madham