Redis LCS Command Explained

In Redis, the LCS command implements the longest common subsequence algorithm.

The longest common subsequence algorithm finds the longest subsequence common to all sequences in a set of sequences. Note that this is different to the longest common string algorithm (also known as the longest common substring algorithm), which requires that matching characters in the string are contiguous. The longest common subsequence algorithm, on the other hand, doesn’t require matching characters to be contiguous.

The LCS command was introduced in Redis 7.0.0.

Syntax

LCS key1 key2 [LEN] [IDX] [MINMATCHLEN len] [WITHMATCHLEN]

When no arguments are specified, the string representing the longest common subsequence is returned.

The arguments work as follows:

LENReturns the length of the longest common subsequence.
IDXReturns an array with the LCS length and all the ranges in both the strings, start and end offset for each string, where there are matches.
MINMATCHLENWhen used with IDX, limits the results to only those matches of a given minimal length.
WITHMATCHLENWhen used with IDX, returns the length of each match.

These arguments are demonstrated in the following examples.

Example

Let’s set two keys:

MSET firstname "Rick" lastname "Richardson"

Result:

OK

Now let’s use LCS to find the longest common subsequence between the two keys:

LCS firstname lastname

Result:

"Ric"

Only the first three characters matched.

Also, in this case, all three characters were contiguous (i.e. consecutive). However, LCS doesn’t require them to be contiguous.

Let’s use values that contain some non-contiguous common values:

MSET firstname "Randi" lastname "Rancid"

Result:

OK

Now let’s run LCS against those two keys again:

LCS firstname lastname

Result:

"Rani"

We can see that the Rani sequence is common to both values, even though not all characters are contiguous.

We can also see that, although both keys contain the letter d, that character is out of sequence, and so it’s not included in the result.

The LEN Argument

We can use the LEN argument to get the length of the resulting string:

LCS firstname lastname LEN

Result:

(integer) 4

Here we get an integer reply of 4 because that’s the length of the resulting LCS operation.

The IDX Argument

We can use the IDX argument to get the position of each match:

LCS firstname lastname IDX

Result:

1) "matches"
2) 1) 1) 1) (integer) 4
         2) (integer) 4
      2) 1) (integer) 4
         2) (integer) 4
   2) 1) 1) (integer) 0
         2) (integer) 2
      2) 1) (integer) 0
         2) (integer) 2
3) "len"
4) (integer) 4

Matches are returned from the last one to the first one.

The above array means that the first match (second element of the array) is between positions 0 and 2 of the first string and 0 and 2 of the second. This is reporting on the Ran match.

Then there is another match at position 4 in the first string and 4 in the second. This is reporting on the i match.

The MINMATCHLEN Argument

We can use the MINMATCHLEN argument to specify a minimum length:

LCS firstname lastname IDX MINMATCHLEN 3

Result:

1) "matches"
2) 1) 1) 1) (integer) 0
         2) (integer) 2
      2) 1) (integer) 0
         2) (integer) 2
3) "len"
4) (integer) 4

In this case I specified a minimum length of 3, and so only those matches with a length of three or more are returned.

The WITHMATCHLEN Argument

We can also use the WITHMATCHLEN argument to return the actual length of each match:

LCS firstname lastname IDX MINMATCHLEN 1 WITHMATCHLEN

Result:

1) "matches"
2) 1) 1) 1) (integer) 4
         2) (integer) 4
      2) 1) (integer) 4
         2) (integer) 4
      3) (integer) 1
   2) 1) 1) (integer) 0
         2) (integer) 2
      2) 1) (integer) 0
         2) (integer) 2
      3) (integer) 3
3) "len"
4) (integer) 4

Here, I lowered the MINMATCHLEN to 1 in order to better demonstrate WITHMATCHLEN.

Case Sensitivity

Be mindful of case sensitivity when using LCS:

MSET firstname "Rick" lastname "ricK"

Result:

OK

Now let’s use LCS to find the longest common subsequence between the two keys:

LCS firstname lastname

Result:

"ic"

Only sequences with matching case were returned.