When working with large datasets in DuckDB, the SAMPLE clause offers an efficient way to query a subset of your data. By default, however, the sample is drawn afresh on every run, so the same query returns a different set of rows each time it is executed.
But we can change that. By giving the sample a fixed seed, we can write our query to return the same random result set every time we run it.
This article explores how to achieve consistent, reproducible result sets when using the SAMPLE clause in DuckDB.
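As a quick preview of the idea, here is a minimal sketch. The table name numbers, its column n, and the seed value 42 are all hypothetical choices made for illustration. The sketch assumes reservoir sampling with a REPEATABLE seed; other sampling methods accept a seed too.

```sql
-- Hypothetical demo table: 1,000 rows holding the integers 0..999.
CREATE TABLE numbers AS
SELECT range AS n FROM range(1000);

-- Without a seed, each run may return a different set of 5 rows.
SELECT n FROM numbers USING SAMPLE reservoir(5 ROWS);

-- With REPEATABLE (seed), the sample is intended to be the same on every run.
-- The seed value 42 is arbitrary; any fixed integer plays the same role.
SELECT n FROM numbers USING SAMPLE reservoir(5 ROWS) REPEATABLE (42);
```

Depending on the DuckDB version and the sampling method, repeatable results may also require single-threaded execution (for example, SET threads = 1), so it is worth checking the documentation for the version you are running.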