Hash, Replication, and Round Robin Distributions - Need more clarification

Question

I will give a few examples of the tables we have:

Table1 - Most queries are State Code and Year, has about 1000 rows, will grow by 1000 rows a year, used Round Robin
Table2 - Most queries are by State Code, Year, Column 1 (String), has about 1000 rows, will grow by 1000 rows a year, used Round Robin
Table3 - Most queries are by State Code and Year, has about 100,000 rows, will grow by about 25000 rows a year, used Hash
Table4 - Most queries are by Year, has about 100,000 rows, will grow by about 25000 rows a year, plan to use Hash
Lookup Table 5 - Most queries are by State Code and Look up Id, has about 10 rows, used Replication
Lookup Table 6 - Most queries are by State Code and Look pup Id, has about 500 rows, used Replication

Did we use the correct types of distribution? Can someone give a more concrete/better example/clarification of when/why you should use each type of distribution? Microsoft's documentation/guidance wasn't very helpful.

score 1 · Answer 1 · answered Dec 20 '22 at 22:52

For tables of this size you shouldn't use Synapse Dedicated SQL pool. Per the Synapse guidance all those tiny tables should be Replicated, which is a strong indication that your data is too small for a Massively Parallel database system like Synapse SQL Pool.

Use Azure SQL Database instead, scales down to less than 1 core, and up to around 80 cores, plus additional readable replicas, and which supports columnstore tables, and partitioning.

Hash, Replication, and Round Robin Distributions - Need more clarification

1 Answers1