
I'm building an app that lets the user export their data. When they click the Export button, we need to run a SELECT * FROM ... WHERE user_id = ... query, save the result to a CSV file, and serve it as a download. There may be hundreds of thousands of entries.
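To make it concrete, here is a minimal sketch of the export in Python with psycopg2 (the entries table, the export_cursor name, and the batch size are placeholders, not my real schema). The named cursor is a server-side cursor, so the driver fetches rows in batches instead of buffering the whole result set in app memory:

    import csv
    import psycopg2

    def export_user_data(dsn, user_id, out_path):
        # Sketch only: "entries" is a stand-in for the real table.
        conn = psycopg2.connect(dsn)
        try:
            # A named (server-side) cursor streams rows in batches
            # instead of materializing the full result set in RAM.
            with conn.cursor(name="export_cursor") as cur:
                cur.execute(
                    "SELECT * FROM entries WHERE user_id = %s", (user_id,)
                )
                with open(out_path, "w", newline="") as f:
                    writer = csv.writer(f)
                    rows = cur.fetchmany(10000)
                    # For named cursors, cur.description is populated
                    # only after the first fetch.
                    writer.writerow([col.name for col in cur.description])
                    while rows:
                        writer.writerows(rows)
                        rows = cur.fetchmany(10000)
        finally:
            conn.close()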

If the user edits their data while the export is in progress, the export may or may not include the edits. Either is fine, as long as the export is a consistent snapshot taken at any single point in time during the export.
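As I understand it, both Postgres and InnoDB take their snapshot at the start of a REPEATABLE READ transaction, so wrapping the export in one should give exactly this behavior even if the export runs more than one statement. A sketch of the Postgres side:

    conn = psycopg2.connect(dsn)
    # Everything in this transaction reads from one snapshot; concurrent
    # edits neither block the export nor show up partway through it.
    conn.set_session(isolation_level="REPEATABLE READ", readonly=True)
    with conn.cursor(name="export_cursor") as cur:
        cur.execute("SELECT * FROM entries WHERE user_id = %s", (user_id,))
        ...  # stream rows to CSV as in the sketch above
    conn.commit()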

The question is: how do I do this without killing performance for all other queries? I don't mind if the export takes longer. Can I somehow tell the database to run this query at a low priority?

Assume a MySQL or Postgres database.

The only solution I know of is to set up a read replica just for exports. Exports can slow down the replica, but they won't slow down the primary, which keeps the other queries fast. Is there a better solution to this problem?
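On the application side, the replica setup would amount to routing export connections to a different host, along these lines (the host names are made up):

    import psycopg2

    # Hypothetical DSNs; in practice these come from configuration.
    PRIMARY_DSN = "host=db-primary.example.com dbname=app"
    REPLICA_DSN = "host=db-replica.example.com dbname=app"

    def get_connection(for_export=False):
        # Interactive queries stay on the primary; only exports are
        # allowed to load down the replica.
        return psycopg2.connect(REPLICA_DSN if for_export else PRIMARY_DSN)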

1 Answer


The comments above indicate that several of those who read your question are skeptical that a single query could cause such a problem on a healthy database server, even if the query has a large result set. The query would of course use quite a bit of I/O and network bandwidth to move that many rows, but MySQL should be able to handle it.

This makes me think something else is competing for resources, and when your query runs, it pushes the server over some threshold. I've seen this happen when the server hardware is old and underpowered for the scale of data you're working with.

Or it could be that the MySQL server, or some other process on the same host, is overallocating resources such as RAM, and your query pushes usage past a threshold that leaves the server overtaxed. I've seen this happen, for example, when the MySQL server's RAM usage grows to the point where the operating system starts using swap space to supplement RAM.

You may need to watch operating system statistics while the query is running to spot the problem. I'd use top, iostat, or vmstat, for example.
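For example, on the database host while the export query is in flight (standard Linux tools; the one-second intervals are arbitrary):

    vmstat 1       # memory, swap activity (si/so columns), and CPU
    iostat -x 1    # extended per-device I/O stats: %util, await, queue
    top            # per-process CPU and resident memory, interactively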

Bill Karwin