Eventually, large or advanced users will want to tune the configuration to some degree. This could be as simple as changing a feature from an ID comparison to an EXACT MATCH, or from exclusive to not exclusive. Once the need is identified and the configuration change is made, the next question is "What do I do with the data I've already loaded?".
The answer is fairly straightforward, but it starts with "It depends". How much data do you have? What is the velocity of data coming in? Is there a compliance requirement that demands the configuration take full effect immediately?
Note: If all you did was add new data sources, or features for new data sources, there is no need to worry. Just update your configuration and move on, as the existing data is not impacted.
Option 1: Let nature take its course
For large deployments this is almost always the right choice. Configuration changes tend to impact a small percentage of records, although identifying which ones can be hard or expensive, and the velocity of data is high. As new data comes in, the system applies the new configuration, so it adjusts over time with no action needed from the customer. If a user happens upon an entity that is impacted by the change but has yet to be adjusted, a re-evaluation message can be sent just for that record.
If the changes impact the data mapping of existing records (e.g. we ignored an attribute that we now use, or we now ignore one we used to use), or impact a significant number of records in an operational way, this option is not suitable.
Option 2: Reprocess the records (mapping changes)
If there are mapping changes (e.g. we ignored an attribute that we now use, or we now ignore one we used to use), then the easiest solution is to just reload the original records into the live system. The system will optimize processing on any records that did not change and trigger processing only for the changed records.
Option 3: Reprocess the records (Entity Resolution changes)
If there are Entity Resolution (ER) changes (e.g. making same NAME+same DOB a possible match instead of a match), just reloading the same records will do nothing as the system will see the records haven't changed and not trigger processing.
Senzing has the ability to force a re-evaluation of a record. Please contact email@example.com for details, as it is a low-level function at this time. We plan to make it a higher-level API function in the future.
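To see why a reload helps with mapping changes but not with ER changes, here is a minimal Python sketch of change detection by fingerprinting the mapped record. It is only an illustration of the idea and not how Senzing implements it internally. The functions map_v1, map_v2, fingerprint and needs_processing, the raw field names and the mapped attribute names are all hypothetical.

```python
import hashlib
import json


def fingerprint(mapped_record: dict) -> str:
    """Return a stable hash of a mapped record (key order does not matter)."""
    canonical = json.dumps(mapped_record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def map_v1(raw: dict) -> dict:
    # Hypothetical original mapping: the phone number is ignored.
    return {"NAME_FULL": raw["name"], "DATE_OF_BIRTH": raw["dob"]}


def map_v2(raw: dict) -> dict:
    # Hypothetical revised mapping: the phone number is now used when present.
    mapped = map_v1(raw)
    if raw.get("phone"):
        mapped["PHONE_NUMBER"] = raw["phone"]
    return mapped


def needs_processing(stored_fingerprint: str, reloaded_record: dict) -> bool:
    """A reloaded record triggers work only if its mapped form has changed."""
    return fingerprint(reloaded_record) != stored_fingerprint


with_phone = {"name": "Pat Smith", "dob": "1980-02-14", "phone": "555-0100"}
without_phone = {"name": "Lee Jones", "dob": "1975-07-01"}

# Mapping change: a record that carries the newly mapped attribute now differs.
mapping_change_triggers = needs_processing(
    fingerprint(map_v1(with_phone)), map_v2(with_phone)
)

# Mapping change: a record without that attribute is still identical.
untouched_record_triggers = needs_processing(
    fingerprint(map_v1(without_phone)), map_v2(without_phone)
)

# ER change only: the mapping is the same, so the reloaded record is identical.
er_change_triggers = needs_processing(
    fingerprint(map_v1(with_phone)), map_v1(with_phone)
)

print(mapping_change_triggers, untouched_record_triggers, er_change_triggers)
```

In this sketch only the first case reports a change. That mirrors the behavior described above: reloading after a mapping change touches just the affected records, while reloading after an ER-only change finds nothing to do.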
Option 4: Purge and reload
In development/test systems or in smaller deployments, this is the easiest and most expedient way to resolve this. It doesn't matter what the change is; a purge and reload will fully address it, and on a small system it will probably be faster to turn around a reload than to decide which of the above options to apply.
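The choice between the options above can be summarized as a small decision helper. This is a hedged Python sketch of the reasoning in this article, not a Senzing tool. The ConfigChange fields and the choose_options function are hypothetical names, and the ordering of the checks is one reasonable reading of the guidance.

```python
from dataclasses import dataclass

NO_ACTION = "No action: existing data is not impacted"
LET_NATURE = "Option 1: let nature take its course"
RELOAD = "Option 2: reload the original records"
FORCE_REEVALUATION = "Option 3: force re-evaluation of records"
PURGE_AND_RELOAD = "Option 4: purge and reload"


@dataclass
class ConfigChange:
    """Hypothetical description of a configuration change and its context."""
    only_new_sources: bool = False        # only new data sources/features added
    mapping_changed: bool = False         # an attribute is newly used or ignored
    er_changed: bool = False              # Entity Resolution rules changed
    small_or_test_system: bool = False    # dev/test or small deployment
    must_apply_immediately: bool = False  # e.g. a compliance requirement
    widespread_impact: bool = False       # many records affected operationally


def choose_options(change: ConfigChange) -> list[str]:
    if change.only_new_sources:
        return [NO_ACTION]
    if change.small_or_test_system:
        # On a small system a full reload is usually quicker than deciding.
        return [PURGE_AND_RELOAD]
    options = []
    if change.mapping_changed:
        options.append(RELOAD)
    if change.er_changed and (
        change.must_apply_immediately or change.widespread_impact
    ):
        options.append(FORCE_REEVALUATION)
    # With nothing urgent, incoming data applies the new configuration over time.
    return options or [LET_NATURE]


print(choose_options(ConfigChange(er_changed=True)))
print(choose_options(ConfigChange(mapping_changed=True, er_changed=True,
                                  must_apply_immediately=True)))
```

The helper returns a list because a change can touch both mapping and ER rules, in which case a reload and a forced re-evaluation may both be needed.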