Below are five options for returning rows that contain uppercase letters in SQL Server.
Sample Data
Suppose we have a table with the following data:
SELECT c1 FROM t1;
Result:
+----------------+ | c1 | |----------------| | CAFÉ | | Café | | café | | 1café | | eCafé | | James Bond 007 | | JB 007 | | 007 | | NULL | | | | É | | É 123 | | é | | é 123 | | ø | | Ø | +----------------+
We can use the following methods to return the rows that contain uppercase letters.
Option 1: Compare to the LOWER()
String
We can use the LOWER()
function to compare the original value to its lowercase equivalent:
SELECT c1 FROM t1
WHERE LOWER(c1) COLLATE Latin1_General_CS_AS <> c1;
Result:
+----------------+ | c1 | |----------------| | CAFÉ | | Café | | eCafé | | James Bond 007 | | JB 007 | | É | | É 123 | | Ø | +----------------+
By using the not equal to (<>
) operator (you can alternatively use !=
instead of <>
if you prefer), we only return those rows that are different to their lowercase equivalents. The reason we do this is because, if a value is the same as its lowercase equivalent, then it was already lowercase to begin with (and we don’t want to return it).
We also use COLLATE Latin1_General_CS_AS
to explicitly specify a case-sensitive (and accent-sensitive) collation. Without this, you could get unexpected results, depending on the collation being used on your system.
Option 2: Compare to the Actual Characters
Another option is to use the LIKE
operator and specify the actual uppercase characters that we want to match:
SELECT c1 FROM t1
WHERE c1 LIKE '%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%'
COLLATE Latin1_General_CS_AS;
Result:
+----------------+ | c1 | |----------------| | CAFÉ | | Café | | eCafé | | James Bond 007 | | JB 007 | +----------------+
In this case, less rows are returned than in the previous example. This is because I didn’t specify characters like É
and Ø
, which were returned in the previous example. Our result does contain É
but that row was only returned because it also contains other uppercase characters that do match.
Therefore, this option is more limited than the previous one, but it does provide you with more control over the characters you want to match.
Option 3: Compare to a Range of Characters
We can alternatively specify the range of characters we want to match:
SELECT * FROM t1
WHERE c1 LIKE '%[A-Z]%'
COLLATE Latin1_General_100_BIN2;
Result:
+----------------+ | c1 | |----------------| | CAFÉ | | Café | | eCafé | | James Bond 007 | | JB 007 | +----------------+
In this case, I used a binary collation (Latin1_General_100_BIN2
). I did this because binary collations sort each case separately (like this: AB....YZ...ab...yz
).
Other collations tend to intermingle the uppercase and lowercase letters (like this: AaBb...YyZz
), which would therefore match both uppercase and lowercase characters.
Option 4: Find the First Instance of a Uppercase Character
Another way to do it is to use the PATINDEX()
function:
SELECT * FROM t1
WHERE PATINDEX('%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%', c1
COLLATE Latin1_General_CS_AS) > 0;
Result:
+----------------+ | c1 | |----------------| | CAFÉ | | Café | | eCafé | | James Bond 007 | | JB 007 | +----------------+
In this example, we specify the exact characters that we want to match, and so in this instance, we didn’t get the rows with characters like É
and Ø
(other than the one that also contains other characters that were matched).
One benefit of this technique is that we can use it to ignore the first character (or specified number of characters) if we so wish:
SELECT * FROM t1
WHERE PATINDEX('%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%', c1
COLLATE Latin1_General_CS_AS) > 1;
Result:
Time: 0.472s +-------+ | c1 | |-------| | eCafé | +-------+
Therefore, we can return all rows that contain uppercase characters, but where the first character is not uppercase.
This is because PATINDEX()
returns the starting position of the first occurrence of the pattern (in our case, the pattern is a list of uppercase characters). If the starting position of the first occurrence is greater than 1, then the first character isn’t in our list of uppercase characters.
Option 5: Find the First Instance Based on a Range
We can also use PATINDEX()
with a range:
SELECT * FROM t1
WHERE PATINDEX('%[A-Z]%', c1
COLLATE Latin1_General_100_BIN2) > 1;
Result:
+-------+ | c1 | |-------| | eCafé | +-------+
I again used a binary collation (like with the other range example).