8

I want to use TABLESAMPLE to randomly sample from rows that meet a certain condition with PostgreSQL.

This runs fine:

select * from customers tablesample bernoulli (1);

But I cannot figure out how to embed the condition in the script. This for example

select * from customers where last_name = 'powell' tablesample bernoulli (1);

throws this error:

SQL Error [42601]: ERROR: syntax error at or near "tablesample"
Position: 71

Reveille
  • 189
  • 1
  • 5

2 Answers2

10

tablesample is an "attribute" of a table, not a query. So you need to write it write after the table name:

select * 
from customers tablesample system (1)
where last_name = 'powell';

Note that the where clause will be applied after the table has been sampled. It will not return 1% of all customers that have the last name 'powell'. But instead it will apply the filter on the sampled table.

4

I was also wondering if this was possible and stumbled on this post. I didn't do much research on the tablesample attribute, so technically the other answers are more correct to your direct question.

However you can work around this if you really want to use the tablesample attribute by creating a temporary table (or similar) based on your conditional query.

create temp table powell_customers as select * from customers where last_name = 'powell';
SELECT * from powell_customers tablesample system(1); 
ZachSand
  • 41
  • 2