I have some data like this:
| metaphone | lag |
|---|---|
| FLKSW | [null] |
| PPS | FLKSW |
| PPS | PPS |
| PSP | PPS |
And I want to compare the string values in both columns on the folowing condition: they're similar (assign some value, like 1) if they share at least 2 chars. Otherwise, they're not similar.
So in the example, PPS and PSP would be similar.
How can this substring comparison be achieved?
I know one approach would be to extract substrings and manually compare them, but it feels hacky and I don't know the maximum number of chars that can occur.