In order to perform entity resolution, Senzing needs your data mapped. As noted in the tutorial video, entities are composed of features, and features are composed of attributes. Mapping is the process of annotating the fields in a data source with a set of common terms Senzing understands and uses when performing Entity Resolution.
For instance, the column that contains a person's first name may be named fname in one data source and firstName in another. The corresponding term for a first name in Senzing is NAME_FIRST. Reducing the different column names to common terms allows entity resolution to work.
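The idea of reducing different column names to one common term can be sketched in a few lines of Python. This is only an illustration of the concept, not how the app works internally: the two mapping dictionaries, the apply_mapping helper, and the sample records are all hypothetical, and NAME_FIRST is the only term taken from this article.

```python
# Hypothetical column-to-term mappings for two data sources.
# NAME_FIRST is the term named in this article; the source names are made up.
CRM_MAPPING = {"fname": "NAME_FIRST"}
WEB_MAPPING = {"firstName": "NAME_FIRST"}


def apply_mapping(record, mapping):
    """Rename mapped columns to their common term; leave other columns as-is."""
    return {mapping.get(column, column): value for column, value in record.items()}


# Two records with different column names end up sharing the same term.
crm_record = apply_mapping({"fname": "Ann", "customer_since": "2019"}, CRM_MAPPING)
web_record = apply_mapping({"firstName": "Ann"}, WEB_MAPPING)
```

After mapping, both records carry the first name under NAME_FIRST, so they can be compared directly even though the original column names differed.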
In the "Mapped to" attribute pull-down list for each column, you will find a list of terms to select from. It’s a good idea to familiarize yourself with this list, as it tells you what to look for in the columns of your data sources.
You will see "Feature" below the "Mapped to" attribute selector.
The "Mapped to" pull-down list is grouped by category. There are common terms for names, addresses, phones, email addresses, identifiers such as driver's license and passport numbers, and other attributes such as date of birth and gender. An ideal mapping would include:
- A name
- An address, phone, and/or email
- A date of birth and/or identifier
Unfortunately, not every data source has all of these. Each one should, however, have something besides a name, because a name alone is not enough to perform Entity Resolution. You can still use the search function on a full name once your data is loaded.
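As a rough way to picture the checklist above, the following Python sketch classifies a set of mapped terms as ideal, partial, or name only. It is a hypothetical helper, not part of the app. The term names other than NAME_FIRST are assumptions used for illustration, so check them against the "Mapped to" list before relying on them.

```python
# Illustrative groupings. Only NAME_FIRST comes from this article;
# the remaining term names are assumptions for the sake of the example.
NAME_TERMS = {"NAME_FIRST", "NAME_LAST", "NAME_FULL"}
CONTACT_TERMS = {"ADDR_FULL", "PHONE_NUMBER", "EMAIL_ADDRESS"}
IDENTITY_TERMS = {"DATE_OF_BIRTH", "PASSPORT_NUMBER", "DRIVERS_LICENSE_NUMBER"}


def review_mapping(mapped_terms):
    """Classify a mapping against the checklist: name, contact, DOB/identifier."""
    terms = set(mapped_terms)
    if not terms:
        return "empty"
    has_name = bool(terms & NAME_TERMS)
    has_contact = bool(terms & CONTACT_TERMS)
    has_identity = bool(terms & IDENTITY_TERMS)
    if has_name and has_contact and has_identity:
        return "ideal"
    if terms <= NAME_TERMS:
        # Only name terms are mapped, which is not enough to resolve entities.
        return "name only"
    return "partial"
```

For example, review_mapping(["NAME_FIRST"]) returns "name only", which is the case the paragraph above warns about.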
You will also find that the app maps many data source columns automatically. Your task is to accept or correct the ones it mapped and to manually assign the ones it did not.
One final thought: not every column in every data source is useful to the entity resolution process! You don't need to map every column. An unmapped column that is left included will still be ingested for your subsequent reference, which is useful for displaying in reports, for example.
To include or exclude a data source column, use the Included checkbox at the top of the column. In this example, setting the customer_since column to None/Included indicates you'd like to ingest this column and make it available for reference, but it will not be used for Entity Resolution because it is not mapped to a valid Senzing attribute term.
In addition to using the Included checkbox to exclude a column from being loaded, you can set the "Mapped to" to None/Suppressed. This tells Senzing not to include the column during loading. As a result, it will not show up in any reports or exports and is not used for Entity Resolution.
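The three outcomes for a column (mapped, None/Included, None/Suppressed) can be pictured with a small Python sketch. This models the behavior described above and is not the app's implementation: the split_row function, the settings dictionary, and the fname and notes columns are hypothetical, while customer_since and NAME_FIRST come from this article.

```python
INCLUDED = "None/Included"      # ingested for reference, not used for matching
SUPPRESSED = "None/Suppressed"  # not loaded at all


def split_row(row, column_settings):
    """Split a row into mapped features and reference-only values."""
    features, reference = {}, {}
    for column, value in row.items():
        setting = column_settings[column]
        if setting == SUPPRESSED:
            continue  # dropped: absent from reports, exports, and matching
        if setting == INCLUDED:
            reference[column] = value  # kept for display only
        else:
            features[setting] = value  # mapped to a common term
    return features, reference


# Hypothetical settings for a three-column source.
settings = {
    "fname": "NAME_FIRST",
    "customer_since": INCLUDED,
    "notes": SUPPRESSED,
}
features, reference = split_row(
    {"fname": "Ann", "customer_since": "2019", "notes": "long free text"},
    settings,
)
```

Here only fname contributes to matching, customer_since is carried along for reference, and notes is never loaded.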
Ideally, you wouldn't include every unmapped column from a source, especially columns containing a large amount of text, as they will make your reports harder to read. Remember, you can always go back to the source system to view the full details of any fields that are not relevant to, or used by, Entity Resolution.
That’s all there is to mapping! We suggest you add your data sources one by one and review the matches Senzing determined each time. See the troubleshooting article if you didn’t get the results you were expecting.