
I have set up a Change Data Capture (CDC) pipeline using PeerDB to mirror tables from a PostgreSQL standby read replica to ClickHouse.
• The PostgreSQL database contains terabytes of data.
• The initial snapshot of the existing data needs to be loaded into ClickHouse.
• PeerDB is configured to pull from the standby read replica.
Questions:

  1. How long will the initial snapshot take? Are there any benchmarks or estimates based on database size?
  2. Will the initial snapshot affect the standby PostgreSQL server’s performance?
    • Since it is a read replica, will PeerDB’s snapshot queries (e.g., COPY or full-table SELECT * scans) put significant load on it?
    • Could the snapshot increase the standby’s replication lag behind the primary database?
  3. Are there any best practices to optimize the initial snapshot process to minimize impact on the standby server?
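
To make question 1 concrete, this is the kind of back-of-envelope arithmetic I am using for now. It is only a sketch: the function name, the 2 TB size, and the throughput figures are placeholders I picked for illustration, not measurements from PeerDB, PostgreSQL, or ClickHouse.

```python
def estimate_snapshot_hours(db_size_tb: float, throughput_mb_per_s: float) -> float:
    """Rough snapshot duration in hours, assuming a constant sustained throughput.

    Ignores parallelism, compression, network limits, and ClickHouse ingest cost.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    size_mb = db_size_tb * 1024 * 1024  # binary units: TiB -> MiB
    seconds = size_mb / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical sustained rates in MB/s; real numbers are what I am asking for.
    for rate in (50, 200, 500):
        hours = estimate_snapshot_hours(2, rate)
        print(f"{rate} MB/s -> about {hours:.1f} h for 2 TB")
```

What I am missing is a realistic throughput value to plug in for a standby source, which is why I am asking about benchmarks.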
