I've been using Cassandra and have run into various problems with tombstones¹. They cause serious issues when I later run a query. For example, if I overwrite the same few rows over and over again, then even though I still have only 5 valid rows in my database, it can take minutes to read them because Cassandra has to scan over all the tombstones before it reaches the useful data.

What I'm wondering is whether Google Bigtable has a similar anti-pattern?

In my situation, I will be using Bigtable to do many writes, up to 10,000 a minute, and once an hour another app reads the data to bring its caches up to date. Many of the writes are actually updates to existing rows (same key).

¹ Tombstones: markers left behind when deleting or updating a row in a Cassandra database. They get removed from the database when a compaction of the table occurs, assuming it is given time to run said compaction.

1 Answer

In my experience, use cases which suffer from tombstone issues are almost always due to storing queues or queue-like data.

Consider the scenario where packages still to be delivered by a driver are stored in this table:

CREATE TABLE pending_packages_by_driver_name (
    driver_name text,
    package_id int,
    ...
    PRIMARY KEY (driver_name, package_id)
) WITH CLUSTERING ORDER BY (package_id ASC);

For each delivery driver in the table, there are one or more rows keyed by package_id and sorted in ascending order (to keep it simple). As packages are delivered, they are deleted from the table so that only undelivered packages remain.

Let's say for the purposes of this example that Alice has already delivered 5 packages whose package_id are 121, 122, 123, 124 and 125 with 3 packages 126, 127 and 128 remaining.
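
In CQL, marking a package as delivered is a single-row delete along these lines (just a sketch against the table above, using the IDs from this example):

-- one delete per delivered package (121 through 125 in this example)
DELETE FROM pending_packages_by_driver_name
WHERE driver_name = 'Alice' AND package_id = 121;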

Here is an illustration of the partition where driver_name = 'Alice', the deleted packages with a tombstone marker [d] (for "deleted") and the remaining packages (live rows):

+---------+---------+---------+---------+---------+---------+-----+-----+-----+
| 'Alice' | 121 [d] | 122 [d] | 123 [d] | 124 [d] | 125 [d] | 126 | 127 | 128 |
+---------+---------+---------+---------+---------+---------+-----+-----+-----+

When reading this partition, Cassandra has to iterate over all the tombstoned rows (packages 121-125) to get to the live rows (packages 126-128). As more packages are delivered, the number of tombstones keeps growing until the read eventually fails with a TombstoneOverwhelmingException because there are too many of them.
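
For example, a simple read of Alice's pending packages like the one below has to step over the five tombstoned rows before it can return the three live ones:

-- returns only 126, 127 and 128, but scans the five tombstones first
SELECT * FROM pending_packages_by_driver_name
WHERE driver_name = 'Alice';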

As a side note, one could argue for modifying the data model so that the packages are ordered in reverse (WITH CLUSTERING ORDER BY (package_id DESC)), which puts the live rows at the "start" of the partition like this:

+---------+-----+-----+-----+---------+---------+---------+---------+---------+
| 'Alice' | 128 | 127 | 126 | 125 [d] | 124 [d] | 123 [d] | 122 [d] | 121 [d] |
+---------+-----+-----+-----+---------+---------+---------+---------+---------+

In this very specific case the data model works, since the deleted rows sit at the end of the partition, but only because my example is over-simplified. In practice, the packages wouldn't be delivered in that order.
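
For reference, the reversed variant is the same table definition with the clustering order flipped (a sketch mirroring the schema above; the ... again stands for the remaining columns):

CREATE TABLE pending_packages_by_driver_name (
    driver_name text,
    package_id int,
    ...
    PRIMARY KEY (driver_name, package_id)
) WITH CLUSTERING ORDER BY (package_id DESC);  -- highest package_id first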

In most cases, queue and queue-like use cases operate on a first-in, first-out basis, so iterating over the tombstones (the processed items in the queue) is unavoidable. This is why queues are an anti-pattern for both Cassandra and Bigtable.

If you haven't already seen it, this blog post talks about it in a bit more detail -- Cassandra Anti-Patterns: Queues and Queue-like Datasets.

Ryan Svihla's post Understanding Deletes is also an excellent resource as he talks about an alternative data modelling technique that may work for some use cases. Cheers!
