3 Ways to Remove Duplicate Rows from Query Results in SQL

Sometimes when we run a SQL query, we only want to see unique rows. But depending on the columns we’re selecting, we might end up with duplicate rows. And this could happen without even knowing it, especially with large data sets.

But it doesn’t have to be this way.

Fortunately most SQL databases provide us with an easy way to remove duplicates.

Continue reading

How NULL Values Can Affect your Results When Using the COUNT() Function in SQL

The SQL COUNT() function is a handy tool for telling us how many rows would be returned in a query. We can pass a column name to the function or we can pass the asterisk (*) wildcard to indicate all columns.

If a column contains NULL values, we could get different results, depending on whether we use the column name or the asterisk (*).

Continue reading

How the SQL UNION Operator Deals with NULL Values

The SQL UNION operator concatenates the results of two queries into a single result set. By default it returns distinct rows (i.e. it removes any redundant duplicate rows from the result set). But we can also use UNION ALL to return non-distinct rows (i.e. retain duplicates).

When it comes to NULL values, it’s pretty straight forward. SQL treats two NULL values as non distinct values. In other words, they’re duplicates.

Continue reading

How to Limit the Rows Returned by a SQL Query

When writing SQL queries, we’ll often use a WHERE clause or HAVING clause to narrow the results down to just those rows that we’re interested in.

But sometimes we might want to reduce the number of rows returned without adding extra filtering criteria. Sometimes we might just want to see a handful of rows, without hundreds, thousands or even millions of rows being returned.

Continue reading

How to Force a Guaranteed Minimum Value When Selecting a Column in SQL Server

Suppose you’re trying to query column, but you need to set a minimum value to be returned, even if the column contains values that are less than that minimum. For example, you want a minimum value of 50 to be returned, even if the column contains values that are less than 50.

We can use the GREATEST() function to build such a query.

Continue reading

Understanding the DISTINCT ON Option in PostgreSQL

Most major RDBMSs support the DISTINCT clause, which allows us to get unique – or “distinct” – rows from our SQL queries. But PostgreSQL’s implementation of this clause has an extra option that most other RDBMSs don’t include.

PostgreSQL allows us to include the ON() option when using DISTINCT. This enables us to specify exactly which columns should be evaluated by the DISTINCT modifier, while simultaneously allowing us to return columns that aren’t evaluated.

Continue reading