How do I restore a schema in Cassandra?

Question

This is a hypothetical scenario: we want to understand whether it can be recovered, and to understand the schema better.

Take a single-node cluster running Cassandra 3.11, with 1 keyspace and 1 table:

root@dd85fa9a3c41:/# cqlsh -k cycling -e "describe tables;"
rank_by_year_and_name

Now I reset my schema and restart Cassandra (there are no other nodes to replicate it from):

root@dd85fa9a3c41:/# nodetool resetlocalschema

With the new schema, I no longer "see" my keyspace+table:

root@dd85fa9a3c41:/# cqlsh -e "describe keyspaces;"
system_traces  system_schema  system_auth  system  system_distributed

I lost my original schema, which contained my keyspace+table. But they are still on disk:

root@dd85fa9a3c41:/# ls -l /var/lib/cassandra/data/cycling/
total 0
drwxr-xr-x 1 root root 14 Nov 22 11:32 rank_by_year_and_name-4eedbbf0

How could I restore that keyspace in this scenario?

I know that with sstableloader I could recreate the keyspace+table and import the data.

Is it possible to restore this scenario without sstableloader?

score 1 · Answer 1 · answered Jan 23 '23 at 05:37

I realise it's a hypothetical scenario, but running resetlocalschema on a single-node cluster is a bad idea. The node is supposed to drop its copy of the schema and request the latest copy from other nodes, but in a single-node cluster there are no other nodes to get the schema from.

You really shouldn't run resetlocalschema on a single-node cluster unless you're doing some specific test or edge case activity as discussed in CASSANDRA-5094.

Now to your question on how you would restore the schema: most enterprises keep a copy of their schema, usually in a Change Management system (or CI/Config Management system). Before an update is made to the schema in production, it usually goes through testing, peer review, and staging/pre-production validation, and is finally deployed to production through an approved Change Request (terms might differ between organisations but the net intent is the same).
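As an illustration of keeping such a copy, the schema of a keyspace can be dumped to a file with cqlsh and then committed to whatever system you use. This is only a minimal sketch: the function name export_keyspace_schema and the output file name are made up for this example, and it assumes cqlsh can connect to the node without extra options.

```shell
#!/bin/sh
# Hypothetical helper: dump one keyspace's schema to a file so it can be
# kept under version control / change management.
export_keyspace_schema() {
    keyspace="$1"   # e.g. cycling
    out_file="$2"   # e.g. cycling_schema.cql (name chosen for illustration)
    # DESCRIBE KEYSPACE prints the CREATE statements for the keyspace
    # and the objects it contains.
    cqlsh -e "DESCRIBE KEYSPACE $keyspace" > "$out_file"
}
```

For the keyspace in the question this would be called as export_keyspace_schema cycling cycling_schema.cql, while the schema is still intact.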

Similarly when you perform regular backups, the nodetool snapshot command stores a copy of the schema together with the SSTable backups. In this example I posted in How do I migrate data in tables to a new Cassandra cluster?, you can see that the snapshots/ folder contains both a manifest.json (inventory of SSTables included in the snapshot) and a schema.cql (the schema at the time of the snapshot):

data/
  community/
    users-6140f420a4a411ea9212efde68e7dd4b/
      snapshots/
        1591083719993/
          manifest.json
          mc-1-big-CompressionInfo.db
          mc-1-big-Data.db
          mc-1-big-Digest.crc32
          mc-1-big-Filter.db
          mc-1-big-Index.db
          mc-1-big-Statistics.db
          mc-1-big-Summary.db
          mc-1-big-TOC.txt
          schema.cql
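
If you are not sure which snapshots exist on a node, a small helper like the one below can list every schema.cql under the data directory. It is a sketch, not an official tool: the function name find_snapshot_schemas is invented here, and the default path /var/lib/cassandra/data is an assumption taken from the question (your data_file_directories setting may differ).

```shell
#!/bin/sh
# Hypothetical helper: list all schema.cql files stored inside snapshots.
# Expected layout (as in the listing above):
#   <data_dir>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>/schema.cql
find_snapshot_schemas() {
    # Default data directory is an assumption; pass another path as $1.
    data_dir="${1:-/var/lib/cassandra/data}"
    find "$data_dir" -type f -path '*/snapshots/*/schema.cql' | sort
}
```

Calling find_snapshot_schemas with no arguments searches the assumed default location; pass a directory to search elsewhere.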

From the above you should be able to see that you have two options available:

recreate the schema from a copy that's been submitted/peer-reviewed in your Change Management System, or
recreate the schema from the snapshot.
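
Either way, the restore itself could look roughly like the sketch below: replay the CQL file with cqlsh -f, then run nodetool refresh so the node picks up the SSTables already on disk. Treat it as an outline under assumptions rather than a tested procedure. The function name restore_table and the RUN dry-run switch are invented for this example. It also assumes the keyspace already exists or is created by the CQL file, and that the SSTables sit in the directory the recreated table actually uses; if the table comes back with a different ID suffix in its directory name, the files would first have to be copied from the old directory into the new one.

```shell
#!/bin/sh
# Dry run by default: with RUN=echo each command is printed, not executed.
# Set RUN="" (empty) in the environment to really run against a live node.
RUN="${RUN-echo}"

# Hypothetical helper: recreate a table from a CQL file, then load the
# SSTables that are already present in its data directory.
restore_table() {
    schema_file="$1"   # schema.cql from a snapshot, or a copy from change management
    keyspace="$2"
    table="$3"
    # 1. Replay the CQL statements to recreate the schema.
    $RUN cqlsh -f "$schema_file"
    # 2. Load SSTables found in the table's data directory without a restart.
    $RUN nodetool refresh "$keyspace" "$table"
}
```

For the question's example, restore_table /path/to/schema.cql cycling rank_by_year_and_name prints the two commands so they can be reviewed before running them for real (the path is a placeholder).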

The choice depends on what you're trying to achieve. Cheers!
