I've re-read this article several times but still can't understand some points...

  1. I have a prod cluster with 3 DCs, and the table has RF=3 in each DC.
  2. I've taken a snapshot on each node simultaneously with Ansible.

2.1) I've created the same schema (keyspace and table) on the test cluster.

  3. Next step: do I have to copy each snapshot to each test server? Example: prod1 --> test1:path_to_table, prod2 --> test2:path_to_table?
  4. Final step: where exactly should I run sstableloader? Must I run it on each server in the test cluster, or only on one of them? Does the -d option mean the source node or the target node? Please help.
Erick Ramirez
jetjo

1 Answer

You don't have to copy the snapshots to the destination nodes in your "step 3". You can place the snapshots on any server or node where (1) Cassandra is installed; (2) Cassandra does NOT have to be running there.

The idea is that you need the Cassandra binaries to run the sstableloader tool. We (3) don't recommend running sstableloader on the source nodes particularly if they're part of a production cluster because it can affect the performance of your production system.

You can run sstableloader on the destination nodes (your test servers) but be aware that (4) if the snapshots are on the same disk as the installed Cassandra data directories, (5) they will compete for the same available IO and you'll likely find that the bulk-load will be slow. Cheers!
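On the `-d` question: `-d` (long form `--nodes`) takes a comma-separated list of initial hosts in the destination cluster, which sstableloader contacts to get ring information. It does not name the source nodes. The tool reads the keyspace and table names from the last two components of the directory you pass it, so the snapshot files need to sit in a directory that ends in `<keyspace>/<table>`. Below is a minimal sketch of how the command could be assembled. The host names `test1`, `test2` and the path `/staging/my_ks/my_table` are hypothetical placeholders, and the helper `build_sstableloader_cmd` is only for illustration, not part of Cassandra.

```python
import shlex
from pathlib import PurePosixPath


def build_sstableloader_cmd(target_hosts, sstable_dir):
    """Build the argument list for one sstableloader run.

    target_hosts: hosts in the DESTINATION cluster (passed to -d).
    sstable_dir:  directory holding the copied snapshot files; its last
                  two components must be <keyspace>/<table>.
    """
    if not target_hosts:
        raise ValueError("at least one destination host is required for -d")

    path = PurePosixPath(sstable_dir)
    if len(path.parts) < 3:
        raise ValueError("sstable_dir must end in <keyspace>/<table>")

    # sstableloader derives these names from the path, so fail early
    # if the directory layout is obviously wrong.
    keyspace, table = path.parts[-2:]
    if not keyspace or not table:
        raise ValueError("could not derive keyspace/table from path")

    return ["sstableloader", "-d", ",".join(target_hosts), str(path)]


# Hypothetical hosts and staging path, for illustration only.
cmd = build_sstableloader_cmd(["test1", "test2"], "/staging/my_ks/my_table")
print(shlex.join(cmd))
```

You would run the resulting command once per directory of snapshot files, from whichever machine holds them and has the Cassandra binaries installed.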

Erick Ramirez