How the SQL Server DIFFERENCE() Function Works

This article aims to help you understand the DIFFERENCE() function, which is a T-SQL function available in SQL Server, Azure, etc.

The key to understanding the DIFFERENCE() function is to understand how Soundex works (or in the context of SQL Server, how the SOUNDEX() function works). This is because the DIFFERENCE() function returns the difference between the Soundex values between two strings.

A Soundex value is four characters long. If two words sound the same, they will share the same four character Soundex value. If they sound similar (but not the same), their Soundex values might share some characters but not all. For example, their Soundex values might have two characters the same and two that are different. If two words sound completely different, none of the characters in their respective Soundex values will be the same.

The DIFFERENCE() function returns a value that ranges from 0 through 4.  This value represents the number of characters in the Soundex values that are the same. 0 indicates weak or no similarity, and 4 indicates strong similarity or the same values.

Syntax

First, here’s the syntax:

DIFFERENCE ( character_expression , character_expression )

As indicated, this function accepts two arguments. The Soundex codes of each character expression is compared, and the result is returned.

Example – Exact Match

Here’s an example of what happens when both strings sound the same (i.e. they have the same Soundex value):

SELECT DIFFERENCE('Two','Too');

Result:

4

The result is 4, which is the highest possible value. This means that there’s a strong similarity between the two strings (as far as their Soundex values go).

If you’re wondering where the number 4 came from, take a look at the following code:

SELECT 
    SOUNDEX('Two') AS Two,
    SOUNDEX('Too') AS Too;

Result:

Two   Too 
----  ----
T000  T000

In this example the SOUNDEX() function is used to return the Soundex codes for each word. As it turns out, both words have the same Soundex codes (T000).  And because all four characters of the Soundex codes are the same, DIFFERENCE() returns 4.

Example – Low Similarity

Here’s an example of how it works when the strings have a lower similarity:

SELECT DIFFERENCE('Tea','Coffee');

Result:

2

The result is 2, which is in the middle. The strings don’t sound exactly the same, but they do have some similarities.

And as with the previous example, we can see where this result came from by running the following code:

SELECT 
    SOUNDEX('Tea') AS Tea,
    SOUNDEX('Coffee') AS Coffee;

Result:

Tea   Coffee
----  ------
T000  C100

So we can see that only two of the Soundex characters are the same between these words. Therefore the DIFFERENCE() result is 2.

Example – No Similarity

Here’s an example of how it works when the strings have a weak or no similarity:

SELECT DIFFERENCE('Tree','Captivated');

Result:

0

The result is 0, which is means that the strings have a very weak or no similarity.

And here are the Soundex codes for these two words:

SELECT 
    SOUNDEX('Tree') AS Tree,
    SOUNDEX('Captivated') AS Captivated;

Result:

Tree  Captivated
----  ----------
T600  C131

So in this case, none of the characters are the same, hence a DIFFERENCE() result of 0.