30

When matching patients based on demographic data, are there any recommendations on which fields should match for the patient to be considered the "Same Patient"?

I know the algorithms will be different for different implementations, I'm just curious if there are any best practices or recommendations around this process.

First Name
Last Name
Date of Birth
SSN
Address
City
State
Zip

etc?

ConcernedOfTunbridgeWells

7 Answers

20

There's a great essay (in Spanish, sorry) by Pablo Pazos, a CS engineer from Uruguay who has been working in healthcare IT since 2006 and has made some great contributions to the field, in which he describes an algorithm for doing this.

You can run the article through a translator, but the gist is that the basic information needed to determine a person's identity is their given and family names (both paternal and maternal), sex, and date of birth. Interestingly, he specifically excludes ID numbers like SSN from his identity-matching algorithm, since "any kind of identifier is NOT part of his identity" (I guess this point is debatable, though). He also excludes attributes like street address, phone numbers, etc., since they aren't really related to someone's identity; they aren't associated with "who someone actually is".

He also assigns a different "weight" to each of these attributes, like this:

  • First name: 17.5%
  • Middle name: 17.5%
  • Family name (father): 17.5%
  • Family name (mother): 17.5%
  • Sex: 10%
  • DOB: 20%

From the matches found on each of these attributes, he describes a methodology for computing a composite "concordancy match index" that makes it possible to compare records. "Partial" matches on the name attributes are also possible, using algorithms like Levenshtein distance.
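
As a rough illustration of how such a weighted index could be computed, here is a minimal Python sketch. The weights are the ones listed above; everything else (the field names, the `match_index` and `name_similarity` functions, scoring a missing field as 0, and scaling Levenshtein distance by the longer name's length) is my own assumption and not taken from the essay.

```python
# Weights from the list above; field names are hypothetical.
WEIGHTS = {
    "first_name": 0.175,
    "middle_name": 0.175,
    "father_family_name": 0.175,
    "mother_family_name": 0.175,
    "sex": 0.10,
    "dob": 0.20,
}
NAME_FIELDS = {"first_name", "middle_name",
               "father_family_name", "mother_family_name"}


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def name_similarity(a: str, b: str) -> float:
    """Partial-match score in [0, 1]; a missing name scores 0 (assumption)."""
    a, b = a.strip().casefold(), b.strip().casefold()
    if not a or not b:
        return 0.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def match_index(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of per-field scores; 1.0 means every field agrees."""
    total = 0.0
    for field, weight in WEIGHTS.items():
        va, vb = rec_a.get(field) or "", rec_b.get(field) or ""
        if field in NAME_FIELDS:
            score = name_similarity(va, vb)
        else:
            # Sex and DOB are compared exactly in this sketch.
            score = 1.0 if va and va == vb else 0.0
        total += weight * score
    return total
```

Two records that differ only by one letter in a family name would still score close to 1.0 here, while a different date of birth costs the full 20%.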

Good read, IMO. Sorry it's in Spanish, but I hope I was able to convey its main ideas.

13

There is no single magic algorithm for patient matching, and I doubt there ever will be.

For starters, there are regional variances. As MMattoli pointed out, what works well in an urban United States hospital probably won't fit well in a rural Australian clinic treating Aborigines.

Also, individual sites have differing views on fault tolerance. If you only matched when you were absolutely sure, you'd get a lot of missed matches. This causes duplicate patient records, which creates a whole other set of problems. Most sites will be willing to settle for pretty sure, but how sure is sure enough? Ask 10 people and you'll get 12 answers.

Therefore the "best" algorithm will be configurable, so your customers can tune it to fit their needs.
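
To make "configurable" concrete, here is a small Python sketch of per-site thresholds. The `MatchPolicy` class, the threshold values, and the idea of a middle "review" band for human confirmation are all assumptions for illustration, not something any particular product does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchPolicy:
    """Site-tunable thresholds on a 0..1 confidence score (hypothetical values)."""
    auto_match_at: float = 0.95  # at or above this, link the records
    review_at: float = 0.80      # at or above this, ask a human to confirm

    def classify(self, score: float) -> str:
        if score >= self.auto_match_at:
            return "match"
        if score >= self.review_at:
            return "review"
        return "no_match"


# A site that tolerates fewer false matches can simply raise the bar.
default_site = MatchPolicy()
strict_site = MatchPolicy(auto_match_at=0.99, review_at=0.90)
```

The same score can then lead to different outcomes at different sites, which is the point: the matching logic stays fixed and only the policy changes.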

When considering a match, different fields offer varying degrees of confidence.

Healthcare-specific identifiers offer the most confidence, since their whole purpose is to uniquely identify the person within the health system. Hospitals usually take pains to make sure these do not get duplicated.

Examples:

  • National Health ID (e.g. UK NHS Number)
  • Hospital-assigned Medical Record Number.

Other patient identifiers may offer high confidence as well, depending on the system. For instance, a Military ID is probably very relevant in a military hospital.

Examples:

  • Military ID
  • Insurance ID
  • Social Security Number (In the US, Social Security Number is generally not considered a high-confidence match, due to rampant insurance fraud.)

In the absence of unique identifiers, one must resort to demographic information. It is ill-advised to match on any one field, but the more demographic fields match, the more confident the match.

Things about a person that don't often change are good for matching:

  • Name
  • Gender
  • Date of Birth

But even more malleable information can be considered in the match to boost confidence:

  • Address
  • Phone Number
  • Email Address
Lynn
7

It is also worth checking previous last names, as these often change.

Andy Judson
4

Apart from the obvious combinations of the following fields given in your question:

First Name
Last Name
Date of Birth
City
State
ZIP/Pin Code

I would think of adding phone number (home and/or cell) to the list. These days phones are quite common and nearly everyone will have a unique number. Even if people sometimes change their phone numbers, most people remember their older numbers, so these can come in handy.

We found that address often suffers from multiple spellings and multiple ways of rendering, especially in countries like India, where people use a local language and patient management software 'still' uses English.

Jamess
3

The gender in the records often seems to be derived from the first name. I have seen increased variance in gender for foreigners, where we can't derive the gender from the name.

In Germany we have some further variance with names containing umlauts ('Umlaute') like 'äöü', which are sometimes replaced by 'ae oe ue'.
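
One way to cope with this is to compare names on a normalized key rather than on the raw text. A minimal Python sketch, where `name_key` is a hypothetical helper and the key is meant for comparison only, never for display or storage:

```python
import unicodedata

# Map each umlaut to its common two-letter transcription.
UMLAUT_MAP = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def name_key(name: str) -> str:
    """Comparison key under which 'Müller' and 'Mueller' are equal."""
    # NFC first, so 'u' + combining diaeresis becomes a single 'ü'.
    composed = unicodedata.normalize("NFC", name)
    # casefold() lowercases, so 'Ü' is handled by the same map.
    return composed.strip().casefold().translate(UMLAUT_MAP)
```

With this, `name_key("Müller") == name_key("Mueller")` holds, so either spelling finds the same patient.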

bernd_k
1

My thought is to try the following combinations, in this order:

1). SSN, last name, and first 5 chars of first name
2). SSN, birthdate, and first 5 chars of first name
3). SSN, birthdate, and last name
4). SSN, gender, and birthdate
5). Last name, first 5 chars of first name, city, and zip
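
A cascade like this can be sketched in Python as an ordered list of field combinations, tried until one matches. The record layout, the `first5` pseudo-field, and the rule that an empty value never counts as a match are assumptions made for this example.

```python
# Each rule is a tuple of fields that must all agree; order matters.
RULES = [
    ("ssn", "last_name", "first5"),
    ("ssn", "dob", "first5"),
    ("ssn", "dob", "last_name"),
    ("ssn", "gender", "dob"),
    ("last_name", "first5", "city", "zip"),
]


def _key(record: dict, field: str) -> str:
    """Normalized value for one field; 'first5' is derived from first_name."""
    if field == "first5":
        value = str(record.get("first_name") or "").strip()[:5]
    else:
        value = str(record.get(field) or "").strip()
    return value.casefold()


def first_matching_rule(a: dict, b: dict):
    """Return the 1-based number of the first rule that matches, else None."""
    for number, fields in enumerate(RULES, start=1):
        pairs = [(_key(a, f), _key(b, f)) for f in fields]
        # Empty values never match (assumption): missing data is not agreement.
        if all(x and x == y for x, y in pairs):
            return number
    return None
```

Returning the rule number rather than a plain yes/no lets the caller treat a rule 1 hit as stronger evidence than a rule 5 hit, which involves no SSN at all.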

1

This is a really tough problem in the US. Names are not unique, and they often change during a person's lifetime or are presented differently (Rob versus Robert, for instance), so they can never be used to identify the patient except in conjunction with some more reliable information. Health insurance number and provider change much more frequently and may be the same for multiple members of a family. SSN is supposedly unique, but there is fraud around it. The same goes for driver's license number, which of course not everyone will have.

Personally, I would start with the combination of insurance policy number, date of birth, and name, then the combination of SSN, date of birth, and name. I would check address and phone to give me additional assurance when they match, but not give them much weight if they don't. Additionally, I would use blood type as a rule-out factor if it is known (and we all know the hospital vampires will be taking blood samples), as that doesn't change. Name matching would have to be fuzzy because of the name variation problem. For other fields, I would generally look for an exact match first, then a fuzzy match if the name confidence is really high (there could have been a typo when entering the SSN).
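
Two of these ideas, the blood type rule-out and the "bonus but no penalty" treatment of address and phone, can be sketched in a few lines of Python. The field names, the `SOFT_BONUS` value, and the function names are hypothetical; `base` stands for whatever confidence the identifier and name checks already produced.

```python
SOFT_FIELDS = ("address", "phone")
SOFT_BONUS = 0.05  # hypothetical bump per agreeing soft field


def ruled_out_by_blood_type(a: dict, b: dict) -> bool:
    """True only when both blood types are known and they differ."""
    ta, tb = a.get("blood_type"), b.get("blood_type")
    if not ta or not tb:
        return False  # unknown on either side: cannot rule anything out
    return ta.strip().upper() != tb.strip().upper()


def adjust_confidence(base: float, a: dict, b: dict) -> float:
    """Apply the rule-out, then add a bonus for each soft field that agrees."""
    if ruled_out_by_blood_type(a, b):
        return 0.0
    bonus = sum(SOFT_BONUS for f in SOFT_FIELDS
                if a.get(f) and a.get(f) == b.get(f))
    # A mismatched address or phone deliberately costs nothing.
    return min(1.0, base + bonus)
```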

HLGEM