When you load data into Senzing, entity resolution and matching occur immediately as each record is observed. There are three main matching outcomes.
- Match or Duplicates
Records that represent the same person or organization, also referred to as duplicates. Most matches share attributes such as:
- Identical or similar names
- Address, phone or email address
- Birth date and identifiers if supplied
There will be no important differences at this level, such as conflicting unique identifiers (SSN, account number, passport number, etc.).
- Possible Match or Possible Duplicates
Records have matching values that would ordinarily represent the same person or organization, yet there are also important non-matching attributes holding them apart. They likely have:
- Possibly weaker name match
- Matching address, phone, email or identifier
- A different birth date, gender or identifier
- Possibly Related
Records share only one or two pieces of information with an entity. Usually these are matches on address, phone, or email alone, and the other record often belongs to a spouse or child. We include them because names are sometimes entered so badly that the records could still be the same person, and because interesting correlations can be gleaned from them.
When records meet none of these outcomes, they are considered singleton entities. That is, entity resolution has not determined that they are the same as, or similar enough to, any other record currently loaded.
Duplicate is used when considering records within the same data source. Match is used when considering across data sources.
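The naming rule above can be sketched in a few lines of Python. This is only an illustration of the terminology, not Senzing's API: the `Outcome` enum, the `outcome_label` function, and the data source names are hypothetical, and applying the same within-source rule to "Possible Duplicate" versus "Possible Match" is an assumption drawn from the outcome names above.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical names for the three outcomes described in this article."""
    SAME = "same"
    POSSIBLY_SAME = "possibly same"
    POSSIBLY_RELATED = "possibly related"


def outcome_label(outcome: Outcome, source_a: str, source_b: str) -> str:
    """Pick the term for an outcome from the two records' data sources."""
    # "Duplicate" applies within one data source, "Match" across data sources.
    within_one_source = source_a == source_b
    if outcome is Outcome.SAME:
        return "Duplicate" if within_one_source else "Match"
    if outcome is Outcome.POSSIBLY_SAME:
        # Assumption: the possible-level terms follow the same rule.
        return "Possible Duplicate" if within_one_source else "Possible Match"
    # The article uses a single term for this outcome.
    return "Possibly Related"
```

For example, two records from a hypothetical `CUSTOMERS` source that resolve together would be labeled a duplicate, while a `CUSTOMERS` record and a `WATCHLIST` record that resolve together would be labeled a match.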