I'm assuming that you mean you intend to "freeze" a collection because you can't freeze a column. In any case, updating a frozen collection does not generate a tombstone.
Setting the value of a non-frozen collection generates a tombstone because Cassandra has to completely erase the previous contents of the collection. Cassandra does not do a read-before-write (except in the case of lightweight transactions), so it does not know whether there are existing cells which hold elements of the collection. It therefore writes a tombstone so that any older cells are not returned by a read request.
In contrast, frozen collections are serialised into a single value instead of individual elements so the entire value is updated. Since the value of a frozen collection is stored in one cell, a tombstone is not required to invalidate older pre-existing cells.
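To make the reasoning above concrete, here is a toy model in Python of last-write-wins merging. It is only a sketch and not Cassandra's actual storage engine: the `Cell`, `CollectionTombstone`, `read` and `set_elements` names and the integer timestamps are all made up for illustration.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    column: str
    path: Optional[str]  # set element for a non-frozen set, None for a single-cell value
    value: object
    ts: int              # write timestamp (illustrative integers)


class CollectionTombstone(NamedTuple):
    column: str
    ts: int              # shadows every cell of `column` written at or before ts


def read(writes):
    """Merge writes as a read would: no read-before-write, newest timestamp wins."""
    tombstones = {}
    for w in writes:
        if isinstance(w, CollectionTombstone):
            tombstones[w.column] = max(tombstones.get(w.column, -1), w.ts)
    latest = {}
    for w in writes:
        if isinstance(w, Cell):
            if w.ts <= tombstones.get(w.column, -1):
                continue  # older cell hidden by the collection tombstone
            key = (w.column, w.path)
            if key not in latest or w.ts > latest[key].ts:
                latest[key] = w
    return latest


def set_elements(merged, column):
    """Return the visible elements of a non-frozen set column."""
    return sorted(c.path for c in merged.values() if c.column == column)


# Non-frozen set: one cell per element.
old = [Cell("setcol", "avocado", "", 10), Cell("setcol", "blueberries", "", 10)]
new = [Cell("setcol", "grapes", "", 20), Cell("setcol", "strawberries", "", 20)]

# Overwriting without a tombstone leaves the old element cells visible.
without_tombstone = set_elements(read(old + new), "setcol")

# A tombstone written just below the new cells' timestamp hides the old ones.
with_tombstone = set_elements(
    read(old + [CollectionTombstone("setcol", 19)] + new), "setcol"
)

# Frozen set: the whole value lives in one cell, so the newer cell simply wins.
frozen = read([
    Cell("frozenset", None, ("apple", "banana"), 10),
    Cell("frozenset", None, ("oranges",), 20),
])
frozen_value = frozen[("frozenset", None)].value

print(without_tombstone)
print(with_tombstone)
print(frozen_value)
```

In this model the tombstone sits one tick below the new cells. That mirrors the SSTable dumps further down, where `marked_deleted` is one microsecond earlier than the timestamp of the cells written by the same statement.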
You would have been able to work this out yourself if you had run a quick test of your own. For completeness, I will illustrate with this example table that contains both a frozen set and a regular set collection:
CREATE TABLE community.freeze_test (
pkey int PRIMARY KEY,
frozenset frozen<set<text>>,
setcol set<text>
);
First, we create a new partition with:
INSERT INTO freeze_test (pkey, frozenset, setcol)
VALUES (1, {'apple', 'banana'}, {'avocado', 'blueberries'});
Then flush the memtable to disk with nodetool flush and dump the contents of the SSTable:
$ sstabledump nb-1-big-Data.db
[
{
"partition" : {
"key" : [ "1" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 18,
"liveness_info" : { "tstamp" : "2024-07-05T06:58:41.192323Z" },
"cells" : [
{ "name" : "frozenset", "value" : ["apple", "banana"] },
{ "name" : "setcol", "deletion_info" : { "marked_deleted" : "2024-07-05T06:58:41.192322Z", "local_delete_time" : "2024-07-05T06:58:41Z" } },
{ "name" : "setcol", "path" : [ "avocado" ], "value" : "" },
{ "name" : "setcol", "path" : [ "blueberries" ], "value" : "" }
]
}
]
}
]
Notice that frozenset is just a single cell value:
{ "name" : "frozenset", "value" : ["apple", "banana"] }
but the non-frozen collection has a tombstone marker and each set element is stored in its own cell:
{ "name" : "setcol", "deletion_info" : { "marked_deleted" : "2024-07-05T06:58:41.192322Z", "local_delete_time" : "2024-07-05T06:58:41Z" } },
{ "name" : "setcol", "path" : [ "avocado" ], "value" : "" },
{ "name" : "setcol", "path" : [ "blueberries" ], "value" : "" }
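If you would rather not eyeball the dump, the same check can be scripted. The following is a minimal sketch that counts, per column, the cells with and without a `deletion_info` entry. `summarise_cells` is a hypothetical helper of my own, and it relies only on the fields visible in the output above, so treat the JSON layout as an assumption for other versions of sstabledump.

```python
import json

# Trimmed copy of the sstabledump output shown above.
DUMP = """
[
  {
    "partition" : { "key" : [ "1" ], "position" : 0 },
    "rows" : [
      {
        "type" : "row",
        "position" : 18,
        "liveness_info" : { "tstamp" : "2024-07-05T06:58:41.192323Z" },
        "cells" : [
          { "name" : "frozenset", "value" : ["apple", "banana"] },
          { "name" : "setcol", "deletion_info" : { "marked_deleted" : "2024-07-05T06:58:41.192322Z", "local_delete_time" : "2024-07-05T06:58:41Z" } },
          { "name" : "setcol", "path" : [ "avocado" ], "value" : "" },
          { "name" : "setcol", "path" : [ "blueberries" ], "value" : "" }
        ]
      }
    ]
  }
]
"""


def summarise_cells(dump_json):
    """Count, per column, the plain cells and the entries carrying deletion_info."""
    summary = {}
    for partition in json.loads(dump_json):
        for row in partition.get("rows", []):
            for cell in row.get("cells", []):
                counts = summary.setdefault(cell["name"], {"cells": 0, "tombstones": 0})
                if "deletion_info" in cell:
                    counts["tombstones"] += 1
                else:
                    counts["cells"] += 1
    return summary


summary = summarise_cells(DUMP)
for column, counts in sorted(summary.items()):
    print(column, counts)
```

For the dump above this reports one cell and no tombstone for frozenset, and two cells plus one tombstone for setcol.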
If we set frozenset using UPDATE, it overwrites the cell with a new value and no tombstone:
UPDATE freeze_test SET frozenset = {'oranges'} WHERE pkey = 1;
"cells" : [
{ "name" : "frozenset", "value" : ["oranges"], "tstamp" : "2024-07-05T07:02:11.064353Z" }
]
If we update the non-frozen collection with:
UPDATE freeze_test SET setcol = {'grapes','strawberries'} WHERE pkey = 1;
notice that a tombstone is also generated along with the new cells:
"cells" : [
{ "name" : "setcol", "deletion_info" : { "marked_deleted" : "2024-07-05T07:03:42.962302Z", "local_delete_time" : "2024-07-05T07:03:42Z" } },
{ "name" : "setcol", "path" : [ "grapes" ], "value" : "", "tstamp" : "2024-07-05T07:03:42.962303Z" },
{ "name" : "setcol", "path" : [ "strawberries" ], "value" : "", "tstamp" : "2024-07-05T07:03:42.962303Z" }
]
Finally, if I add an element to the non-frozen collection with the += operator:
UPDATE freeze_test SET setcol += {'mango'} WHERE pkey = 1;
the resulting SSTable contains just the one new cell, with no tombstone:
"cells" : [
{ "name" : "setcol", "path" : [ "mango" ], "value" : "", "tstamp" : "2024-07-05T07:07:41.407898Z" }
]
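Adding an element to a non-frozen set collection does not generate a tombstone because Cassandra does not need to clear out the existing contents of the collection. Removing an element with the -= operator also leaves the rest of the collection untouched, although it does write a tombstone for the one cell that holds the removed element. Cheers!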