4

Imagine I have graph data that is beyond the size of a single machine.

How would you shard a graph database?

I asked on Hacker News and people suggested sharding based on a hash of the predicate-subject-object, ala RDF triples.

I could partition the graph across replicas and query all replicas to aggregate answers.

But what about joins across machines?

Do I need to colocate join keys on all replicas?

Or do I just send the query to all replicas and in parallel and aggregate the responses?

0 Answers0