If you need to find rows that contain lowercase letters in SQL Server, you could try one of the following options.
Sample Data
Suppose we have a table with the following data:
SELECT c1 FROM t1;
Result:
+----------------+ | c1 | |----------------| | CAFÉ | | Café | | café | | 1café | | eCafé | | James Bond 007 | | JB 007 | | 007 | | NULL | | | | É | | É 123 | | é | | é 123 | | ø | | Ø | +----------------+
We can use the following methods to return the rows that contain lowercase letters.
Option 1: Compare to the UPPER()
String
We can use the UPPER()
function to compare the original value to its uppercase equivalent:
SELECT * FROM t1
WHERE UPPER(c1) COLLATE Latin1_General_CS_AS <> c1;
Result:
+----------------+ | c1 | |----------------| | Café | | café | | 1café | | eCafé | | James Bond 007 | | é | | é 123 | | ø | +----------------+
By using the not equal to (<>
) operator (you can alternatively use !=
instead of <>
if you prefer), we only return those rows that are different to their uppercase equivalents. The reason we do this is because, if a value is the same as its uppercase equivalent, then it was already uppercase to begin with (and we don’t want to return it).
We also use COLLATE Latin1_General_CS_AS
to explicitly specify a case-sensitive (and accent-sensitive) collation. Without this, you could get unexpected results, depending on the collation being used on your system.
Option 2: Compare to the Actual Characters
We can alternatively use the LIKE
operator, and specify the actual lowercase characters that we want to match:
SELECT * FROM t1
WHERE c1 LIKE '%[abcdefghijklmnopqrstuvwxyz]%'
COLLATE Latin1_General_CS_AS;
Result:
+----------------+ | c1 | |----------------| | Café | | café | | 1café | | eCafé | | James Bond 007 | +----------------+
In this case, less rows are returned than in the previous example. This is because I didn’t specify characters like é
and ø
, which were returned in the previous example. Although one row does contain é
, that row was only returned because it also contains other lowercase characters that do match.
Therefore, this example is more limited than the previous one, but it does provide you with more control over the characters you want to match.
Option 3: Compare to a Range of Characters
We can alternatively specify the range of characters we want to match:
SELECT * FROM t1
WHERE c1 LIKE '%[a-z]%'
COLLATE Latin1_General_100_BIN2;
Result:
+----------------+ | c1 | |----------------| | Café | | café | | 1café | | eCafé | | James Bond 007 | +----------------+
In this case, I used a binary collation (Latin1_General_100_BIN2
). I did this because binary collations sort each case separately (like this: AB....YZ...ab...yz
).
Other collations tend to intermingle the uppercase and lowercase letters (like this: AaBb...YyZz
), which would therefore match both uppercase and lowercase characters.
Option 4: Find the First Instance of a Lowercase Character
Another way to do it is to use the PATINDEX()
function:
SELECT * FROM t1
WHERE PATINDEX('%[abcdefghijklmnopqrstuvwxyz]%', c1
COLLATE Latin1_General_CS_AS) > 0;
Result:
+----------------+ | c1 | |----------------| | Café | | café | | 1café | | eCafé | | James Bond 007 | +----------------+
In this example, we specify the exact characters that we want to match, and so in this instance, we didn’t get the rows with characters like é
and ø
(other than the one that also contains other characters that were matched).
One benefit of this technique is that we can use it to ignore the first character (or specified number of characters) if we so wish:
SELECT * FROM t1
WHERE PATINDEX('%[abcdefghijklmnopqrstuvwxyz]%', c1
COLLATE Latin1_General_CS_AS) > 1;
Result:
+----------------+ | c1 | |----------------| | Café | | 1café | | James Bond 007 | +----------------+
Therefore, we can return all rows that contain lowercase characters, but where the first character is not lowercase.
This is because PATINDEX()
returns the starting position of the first occurrence of the pattern (in our case, the pattern is a list of lowercase characters). If the starting position of the first occurrence is greater than 1, then the first character isn’t in our list of lowercase characters.
While this technique could be used to ignore the first character being uppercase, it doesn’t rule out that the first character could be another character, such as a number. We can see this in the second row, which contains 1café
.
Option 5: Find the First Instance Based on a Range
We can also use PATINDEX()
with a range:
SELECT * FROM t1
WHERE PATINDEX('%[a-z]%', c1
COLLATE Latin1_General_100_BIN2) > 1;
Result:
+----------------+ | c1 | |----------------| | Café | | 1café | | James Bond 007 | +----------------+
I again used a binary collation (like with the other range example).