DuckDB’s SAMPLE clause lets us work with a random subset of our data. This is particularly useful with large datasets, where processing every row might be time-consuming or unnecessary for exploratory data analysis, testing queries, or creating representative samples.
When we use this clause, we can specify either an absolute number of rows to return or a percentage of rows. We can also choose which sampling method to use (reservoir, bernoulli, or system).
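As a minimal sketch of these options, the queries below sample from a hypothetical `events` table (the table name, its `id` column, and the seed value are assumptions made purely for illustration). Because sampling is random, the rows returned will differ between runs unless a seed is supplied, and percentage-based samples return only approximately the requested share of rows.

```sql
-- Hypothetical demo table with 100,000 rows (not from the original article)
CREATE TABLE events AS
SELECT range AS id FROM range(100000);

-- Absolute number of rows: return 5 randomly chosen rows
SELECT * FROM events USING SAMPLE 5;

-- Same request, with the unit written out
SELECT * FROM events USING SAMPLE 5 ROWS;

-- Percentage of rows: return roughly 10% of the table
SELECT * FROM events USING SAMPLE 10%;

-- Percentage with an explicitly chosen sampling method
SELECT * FROM events USING SAMPLE 10 PERCENT (bernoulli);

-- Fixed row count with the reservoir method and a seed (42 is arbitrary)
SELECT * FROM events USING SAMPLE reservoir(50 ROWS) REPEATABLE (42);
```

When no method is named, DuckDB picks one based on the sample size: reservoir sampling for a fixed row count and system sampling for a percentage.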