When reviewing entities, you may notice that the MATCH_KEY for a related entity in the RELATED_ENTITIES section sometimes lists not only the attributes contributing to the relationship but also the term (Ambiguous).
"RELATED_ENTITIES": [
  {
    "ENTITY_ID": 3,
    "LENS_ID": 1,
    "MATCH_LEVEL": 2,
    "MATCH_KEY": "+NAME+PHONE (Ambiguous)",
    "MATCH_SCORE": "12",
    "ERRULE_CODE": "CNAME_CFF",
    "REF_SCORE": 6,
    "IS_DISCLOSED": 0,
    "IS_AMBIGUOUS": 1,
    "ENTITY_NAME": "PAT SMITH",
    "RECORD_SUMMARY": [
      {
        "DATA_SOURCE": "AMBIG1",
        "RECORD_COUNT": 1,
        "FIRST_SEEN_DT": "2019-01-29 00:02:42.602",
        "LAST_SEEN_DT": "2019-01-29 00:02:42.602"
      }
    ],
    "LAST_SEEN_DT": "2019-01-29 00:02:42.602"
  }
]
What does this mean? The relationship was created because the name and phone number match, but the related entity could resolve either to the entity returned in this response (one side of the relationship) or to another entity.
ENTITY_ID: 3 above is related both to the entity whose response contains this relationship (ENTITY_ID: 1) and to one other entity (ENTITY_ID: 2), yet entities 1 and 2 cannot resolve together because they have conflicting attributes.
Because ENTITY_ID: 3 could resolve equally well to either of the other entities, it is marked ambiguous; rather than resolving to one of them, it keeps relationships to both.
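As a minimal sketch of how a client might pick out such relationships, the snippet below filters a RELATED_ENTITIES list on the IS_AMBIGUOUS flag. The JSON is an abridged copy of the fragment shown earlier, wrapped in braces so it parses on its own (an assumption about the enclosing message), and the helper name ambiguous_relationships is hypothetical rather than part of any Senzing API.

```python
import json

# Abridged copy of the RELATED_ENTITIES fragment from this article.
# The outer braces are an assumption so the fragment parses standalone.
response_fragment = """
{
  "RELATED_ENTITIES": [
    {
      "ENTITY_ID": 3,
      "MATCH_KEY": "+NAME+PHONE (Ambiguous)",
      "ERRULE_CODE": "CNAME_CFF",
      "IS_AMBIGUOUS": 1,
      "ENTITY_NAME": "PAT SMITH"
    }
  ]
}
"""


def ambiguous_relationships(related_entities):
    """Return the related entities whose IS_AMBIGUOUS flag is 1 (hypothetical helper)."""
    return [rel for rel in related_entities if rel.get("IS_AMBIGUOUS") == 1]


related = json.loads(response_fragment)["RELATED_ENTITIES"]
ambiguous = ambiguous_relationships(related)
for rel in ambiguous:
    print(rel["ENTITY_ID"], rel["ENTITY_NAME"], rel["MATCH_KEY"])
```

Checking the IS_AMBIGUOUS field is likely to be sturdier than searching the MATCH_KEY text for "(Ambiguous)", since the flag is a dedicated field in the fragment above.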
Consider these 3 records as a simple example:
RECORD_ID, PRIMARY_NAME_FULL, AKA_NAME_FULL, PHONE_NUMBER, DATE_OF_BIRTH
1        , "Patrick Smith"  , "Paddy Smith", 787-767-2688, 1/12/1990
2        , "Patricia Smith" ,              , 787-767-2688, 5/4/1994
3        , "Pat Smith"      ,              , 787-767-2688,
Patrick and Patricia are not the same entity: although they have similar names and the same phone number, their dates of birth differ.
Which entity should Pat belong to? Pat is a common short form of both Patrick and Patricia, and all three phone numbers are the same, so Pat could resolve to either Patrick or Patricia. For now, Pat becomes a separate entity with relationships to both Patrick and Patricia, and is marked as an ambiguous entity.
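The reasoning above can be illustrated with a toy compatibility check. This is a sketch under loud assumptions and not Senzing's resolution logic: names are called "similar" when they share a surname and the first three letters of the given name, phone numbers must be equal, and dates of birth conflict only when both are present and differ. The functions name_keys, could_resolve, and candidates are hypothetical.

```python
def name_keys(record):
    """Crude name keys: (first 3 letters of given name, surname) for each name on the record."""
    keys = set()
    for field in ("PRIMARY_NAME_FULL", "AKA_NAME_FULL"):
        name = record.get(field)
        if name:
            parts = name.lower().split()
            keys.add((parts[0][:3], parts[-1]))
    return keys


def could_resolve(a, b):
    """Toy rule: similar name, same phone, and no conflicting date of birth."""
    if not name_keys(a) & name_keys(b):
        return False
    if a["PHONE_NUMBER"] != b["PHONE_NUMBER"]:
        return False
    dob_a, dob_b = a.get("DATE_OF_BIRTH"), b.get("DATE_OF_BIRTH")
    # A missing date of birth cannot conflict with anything.
    return not (dob_a and dob_b and dob_a != dob_b)


def candidates(record, others):
    """RECORD_IDs of the records this record could resolve to under the toy rule."""
    return [o["RECORD_ID"] for o in others if could_resolve(record, o)]


patrick = {"RECORD_ID": 1, "PRIMARY_NAME_FULL": "Patrick Smith", "AKA_NAME_FULL": "Paddy Smith",
           "PHONE_NUMBER": "787-767-2688", "DATE_OF_BIRTH": "1/12/1990"}
patricia = {"RECORD_ID": 2, "PRIMARY_NAME_FULL": "Patricia Smith", "AKA_NAME_FULL": None,
            "PHONE_NUMBER": "787-767-2688", "DATE_OF_BIRTH": "5/4/1994"}
pat = {"RECORD_ID": 3, "PRIMARY_NAME_FULL": "Pat Smith", "AKA_NAME_FULL": None,
       "PHONE_NUMBER": "787-767-2688", "DATE_OF_BIRTH": None}

print(could_resolve(patrick, patricia))   # False: the dates of birth differ
matches = candidates(pat, [patrick, patricia])
print(matches)                            # [1, 2]
print("ambiguous" if len(matches) > 1 else "not ambiguous")
```

Because record 3 is compatible with two records that are not compatible with each other, the toy rule cannot choose between them, which mirrors why Pat is kept as a separate, ambiguous entity.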
If you graphed these 3 entities, Pat would appear as a separate entity with an ambiguous relationship to both Patrick and Patricia.
This could be a husband and wife, siblings, or a similar pair, with Pat as the ambiguous entity. Without additional attributes for these records, it is difficult for a human or for Senzing to determine which entity Pat belongs to.
Remember that Senzing works in real time: as new information is learned about records and entities, prior assertions are adjusted to reflect it. Assume the data source these 3 records came from was updated with new attributes and values, and the changes were immediately streamed to Senzing, so that the records now look like:
RECORD_ID, PRIMARY_NAME_FULL, AKA_NAME_FULL  , PHONE_NUMBER, DATE_OF_BIRTH
1        , "Patrick Smith"  , "Paddy Smith"  , 787-767-2688, 1/12/1990
2        , "Patricia Smith" ,                , 787-767-2688, 5/4/1994
3        , "Pat Smith"      , "Patrick Smith", 787-767-2688, 1/12/1990
With the additional also-known-as (AKA) name and date of birth for Pat, there is greater confidence about which of the 2 entities Pat belongs to. The new AKA 'Patrick Smith' matches the primary name on RECORD_ID: 1, as do the new date of birth and the existing phone number. A new decision is asserted, and graphing the new results would show two entities instead of three.
RECORD_ID: 1 and RECORD_ID: 3 now resolve together into a single entity (ENTITY_ID: 1), which has a relationship to ENTITY_ID: 2, and the ambiguity is removed. The RELATED_ENTITIES section for Patrick Smith's relationship to Patricia Smith is now:
"RELATED_ENTITIES": [
  {
    "ENTITY_ID": 2,
    "LENS_ID": 1,
    "MATCH_LEVEL": 2,
    "MATCH_KEY": "+NAME+PHONE-DOB",
    "MATCH_SCORE": "12",
    "ERRULE_CODE": "CNAME_CFF_DEXCL",
    "REF_SCORE": 5,
    "IS_DISCLOSED": 0,
    "IS_AMBIGUOUS": 0,
    "ENTITY_NAME": "PATRICIA SMITH",
    "RECORD_SUMMARY": [
      {
        "DATA_SOURCE": "AMBIG1",
        "RECORD_COUNT": 1,
        "FIRST_SEEN_DT": "2019-01-28 23:47:08.445",
        "LAST_SEEN_DT": "2019-01-28 23:47:08.445"
      }
    ],
    "LAST_SEEN_DT": "2019-01-28 23:47:08.445"
  }
]
When a relationship is described as ambiguous, there is an entity that could resolve to more than one other entity. Additional data is needed for the entity resolution process to resolve the entity rather than relate it ambiguously. Until then, the relationship(s) remain ambiguous.
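For readers who want to inspect match keys programmatically, here is a small sketch that splits the two MATCH_KEY values shown in this article into their parts. It assumes that a leading "+" marks an attribute contributing to the relationship and that a leading "-" marks a conflicting one; the second reading is inferred from the "-DOB" example, where the dates of birth differ. The function parse_match_key is hypothetical, and the pattern assumes attribute names are upper-case letters and underscores.

```python
import re


def parse_match_key(match_key):
    """Split a match key such as '+NAME+PHONE-DOB' into its signed parts (hypothetical helper)."""
    ambiguous = "(Ambiguous)" in match_key
    core = match_key.replace("(Ambiguous)", "").strip()
    # Each token is a sign followed by an attribute name, e.g. ('+', 'NAME').
    tokens = re.findall(r"([+-])([A-Z_]+)", core)
    return {
        "contributing": [name for sign, name in tokens if sign == "+"],
        "conflicting": [name for sign, name in tokens if sign == "-"],
        "ambiguous": ambiguous,
    }


before = parse_match_key("+NAME+PHONE (Ambiguous)")
after = parse_match_key("+NAME+PHONE-DOB")
print(before)
print(after)
```

The first key yields NAME and PHONE with the ambiguous marker set; the second yields the same contributing attributes plus DOB as a conflict and no ambiguous marker.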