
I need some help! Our current Cassandra version is 3.11.6, and we want to migrate to new Cassandra cluster nodes running version 4.0.13.

  1. We took a snapshot of one of our databases from the current 6 nodes.
  2. We created the schemas and tables in the new 6-node Cassandra cluster.
  3. Restored the snapshots onto their respective new Cassandra nodes.
  4. Ran sstableupgrade (the commands we ran are sketched after this list).
  5. Ran nodetool repair on each node, one at a time (our Cassandra version is 4.x, so by default it performs incremental repair).
  6. We ran nodetool repair -vd and verified that all nodes are in sync.
  7. But when we compared the row counts, they didn't match. For example, for one of the tables the count on the current Cassandra node was 140k, while on the new Cassandra nodes it was 85k. We checked a couple of tables and the data mismatch was there for all of them.
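For clarity, this is roughly the sequence of commands we ran; the keyspace/table names and paths below are placeholders, not our real ones:

    # On each existing (3.11.6) node: snapshot the keyspace
    nodetool snapshot -t migration_snap my_keyspace

    # Copy the snapshot SSTables into the matching table directory on the new node
    # (illustrative path; adjust to your data_file_directories)
    rsync -av /var/lib/cassandra/data/my_keyspace/my_table-<id>/snapshots/migration_snap/ \
          new-node:/var/lib/cassandra/data/my_keyspace/my_table-<id>/

    # On each new (4.0.13) node: pick up the copied files, rewrite them to the
    # 4.0 format, then repair and validate
    nodetool refresh my_keyspace my_table
    nodetool upgradesstables my_keyspace      # or offline: sstableupgrade my_keyspace my_table
    nodetool repair my_keyspace
    nodetool repair -vd my_keyspace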

We tried running a full repair instead of an incremental repair in step 5. We checked both the debug logs and the system logs, but didn't find anything specific that would indicate a problem. We tried copying all the .db files from the existing nodes to the new nodes and followed the same steps again (steps 4-6). We tried running nodetool rebuild (even though it seems irrelevant). We compared the configuration of the existing nodes with the new nodes and they are the same.
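For reference, these are roughly the extra things we tried (again with placeholder keyspace/table names):

    # Full (non-incremental) repair instead of the default incremental repair
    nodetool repair --full my_keyspace

    # Rebuild, streaming from another DC (probably only meaningful if the new
    # nodes were part of the same cluster as a source DC)
    nodetool rebuild -- <source-dc-name>

    # How we compared the row counts on old vs. new nodes
    cqlsh -e "SELECT COUNT(*) FROM my_keyspace.my_table;"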

Any suggestions on what we can do further?

Rohit Gupta

2 Answers


So something you could try would be to upgrade the current 3.11.6 cluster to 4.0.13. Then wipe the new 4.0.13 nodes and join them to that cluster as a new, logical data center. Run nodetool rebuild on each of the new nodes, and they should be all set.
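A rough outline of that approach, assuming the existing nodes form a data center named DC1, the new nodes are configured as DC2, and the keyspaces use NetworkTopologyStrategy (all names here are placeholders):

    # 1. Upgrade the existing cluster to 4.0.13 node by node (binaries + config),
    #    then rewrite the SSTables on each upgraded node
    nodetool upgradesstables

    # 2. Wipe the new nodes' data directories and join them to the same cluster
    #    as a new DC (e.g. dc=DC2 in cassandra-rackdc.properties, same
    #    cluster_name and seeds as the existing cluster)

    # 3. Replicate the keyspace to the new DC
    cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication =
              {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};"

    # 4. On each new node, stream the data over from the original DC
    nodetool rebuild -- DC1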

Aaron

Based on your description, it sounds like you've attempted to clone the application data from the existing Cassandra 3.11 cluster to a new 4.0 cluster by copying the data files from one node to another. This will only work if the configuration of the new cluster is identical to that of the source cluster; otherwise, the files you copied to a node may contain data that is not owned by that node, so the data is effectively "lost".

For the cluster configuration to be considered identical between the existing cluster and the new one, (1) they have to have identical topologies (same number of DCs, same number of nodes in each DC), AND (2) the token assignments have to be identical.
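One way to check that (a sketch; the keyspace name is a placeholder) is to compare token ownership on both clusters, and to pin tokens explicitly when building the new cluster:

    # List token-to-node assignments on each cluster and compare the token columns
    nodetool ring my_keyspace     # run on an existing node, then on a new node

    # When building the new cluster to match, set each new node's tokens
    # explicitly in cassandra.yaml instead of letting them be allocated:
    #   num_tokens: <same as the source node>
    #   initial_token: <comma-separated token list copied from the source node>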

For example, if you copied one of the SSTables to a node that is assigned token range [100..199] but the SSTable contains data that ranges from token value 110 to 230, the node will only serve requests for data it owns, so any partitions in the SSTable that map to token values 200-230 are not readable/retrievable because the node doesn't own them.
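If you want to see where a given partition lands, something like this shows the token a partition key hashes to and which nodes own it (keyspace, table, and key are placeholders):

    # Token that the partition key hashes to
    cqlsh -e "SELECT token(id), id FROM my_keyspace.my_table WHERE id = 42;"

    # Nodes that own (and will serve) that partition
    nodetool getendpoints my_keyspace my_table 42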

I've previously written procedures for cloning data from one cluster to another. For cloning data to an identical cluster, see How to restore snapshots to a cluster with identical configuration. Otherwise, see How to clone application tables to a new cluster.

Repair isn't necessarily relevant when cloning data to another cluster, so it's not part of the migration procedure. But you will definitely need to run nodetool upgradesstables to force a rewrite of the C* 3.11 SSTables into the new 4.0 format. Note that the sstableupgrade utility does the same thing, except it's designed for offline use. Cheers!
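As a sketch of that last step (keyspace/table names are placeholders):

    # Online, on a running node
    nodetool upgradesstables my_keyspace my_table

    # Force a rewrite even for SSTables already on the current format
    nodetool upgradesstables -a my_keyspace my_table

    # Offline equivalent (node stopped), shipped in the Cassandra tools
    sstableupgrade my_keyspace my_table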

Erick Ramirez