Entity Resolution 2.0
Following a few decades of experience building some of the largest commercially available, real-time Entity Resolution systems in existence, a ground-up skunkworks project began within IBM in 2009.
Drawing on best practices we have picked up over the years and on exciting new breakthroughs, what we have today is unlike any other Entity Resolution technology on the market. Senzing is nothing short of Entity Resolution 2.0.
The following outlines a few unique features and capabilities.
Computationally Efficient Self-learning & Self-tuning
As Senzing accumulates additional context, it becomes smarter, so Entity Resolution results improve over time. Furthermore, when Senzing makes a specific discovery (e.g., (800) 555-1212 is a common phone number), it not only takes this into account in the future, it also re-evaluates all previous assertions involving this phone number. The ability to use new observations to reverse earlier assertions (also referred to as “Sequence Neutrality”) in real time and at scale is non-trivial.
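The behavior described above can be illustrated with a deliberately small sketch. It is not Senzing's implementation: the `ToyResolver` class, the single "same phone" match rule, and the `GENERIC_THRESHOLD` value are all assumptions made for illustration, and the toy recomputes its groupings on demand for brevity rather than efficiently. It only shows an earlier assertion being reversed once a phone number turns out to be common.

```python
from collections import defaultdict

# Assumption: a phone seen with more than 3 distinct names is "common".
GENERIC_THRESHOLD = 3


class ToyResolver:
    """Toy resolver: records sharing a non-generic phone number are one entity."""

    def __init__(self):
        self.records = {}                 # record_id -> (name, phone)
        self.by_phone = defaultdict(set)  # phone -> ids of records carrying it

    def add(self, record_id, name, phone):
        self.records[record_id] = (name, phone)
        self.by_phone[phone].add(record_id)

    def is_generic(self, phone):
        names = {self.records[r][0] for r in self.by_phone[phone]}
        return len(names) > GENERIC_THRESHOLD

    def entities(self):
        """Group records into entities using only phones that are not generic."""
        parent = {r: r for r in self.records}

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        for phone, ids in self.by_phone.items():
            if self.is_generic(phone):
                continue  # a generic phone is no longer evidence of a match
            first, *rest = sorted(ids)
            for other in rest:
                parent[find(other)] = find(first)

        groups = defaultdict(set)
        for r in self.records:
            groups[find(r)].add(r)
        return {frozenset(g) for g in groups.values()}


resolver = ToyResolver()
resolver.add("r1", "Ann Lee", "(800) 555-1212")
resolver.add("r2", "A. Lee", "(800) 555-1212")
merged_early = resolver.entities()  # r1 and r2 are asserted to be one entity

resolver.add("r3", "Bo Chan", "(800) 555-1212")
resolver.add("r4", "Cy Diaz", "(800) 555-1212")
split_later = resolver.entities()   # the phone is now generic; the match is reversed
```

Because the final grouping depends only on the set of records seen, loading the same four records in any other order ends in the same result, which is the property the term Sequence Neutrality refers to.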
Business Value: This is exceptionally important for two reasons:
- Experts are no longer required to train and tune the Entity Resolution system, so the barrier to entry is significantly reduced in terms of time, cost and experience.
- Automatically fixing the past as new discoveries are made keeps the system current in real time, something no other Entity Resolution system is capable of doing. While traditional Entity Resolution methods (without Sequence Neutrality) have to re-train and re-load on a periodic basis to correct for “error drift” in the data, Senzing self-corrects historical false positives and false negatives as additional data records are continuously processed.
Related Information: Smart Systems Flip-Flop
Entity Centric Learning
Entity Resolution benefits from all of the data features contained within a resolved entity. This is in contrast to traditional 1:1 record matching methods and techniques.
Business Value: Required to effectively perform Entity Resolution in 'weak signal' environments. Examples of weak signal environments include professional adversaries attempting to obfuscate their identities, low-fidelity data sources, and seemingly incompatible data sources.
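To make the contrast with 1:1 record matching concrete, here is a minimal sketch. The function names, the feature tuples, and the `MIN_SHARED` rule (two shared features make a match) are assumptions for illustration only, not Senzing's scoring logic.

```python
MIN_SHARED = 2  # assumption: two shared features are needed to call a match


def record_level_match(new_record, records, min_shared=MIN_SHARED):
    """1:1 matching: the new record must match some single existing record."""
    return any(len(new_record & existing) >= min_shared for existing in records)


def entity_level_match(new_record, records, min_shared=MIN_SHARED):
    """Entity-centric matching: compare against every feature the entity holds."""
    entity_features = set().union(*records)
    return len(new_record & entity_features) >= min_shared


# Two records already resolved into one entity (they share name and date of birth).
record_a = {("name", "liz jones"), ("phone", "555-0100"), ("dob", "1980-01-01")}
record_b = {("name", "liz jones"), ("dob", "1980-01-01"), ("email", "liz@example.com")}
entity = [record_a, record_b]

# A weak-signal record: it shares one feature with each record, never two with either.
record_c = {("phone", "555-0100"), ("email", "liz@example.com")}

pairwise = record_level_match(record_c, entity)      # False: no single record suffices
entity_centric = entity_level_match(record_c, entity)  # True: the entity as a whole does
```

The third record carries too little signal to match either earlier record on its own, yet it matches the accumulated entity, which is the situation the weak-signal examples describe.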
Senzing supports both disclosed and derived relationships. When taken together, these relationships allow Senzing to fully capture and manage the entire resolved entity graph. Disclosed relationships are those known to exist between entities that are reported as observations to Senzing (e.g., family members, employment affiliations, and business hierarchies). Derived relationships are discovered at run-time as new data records are ingested and analyzed (e.g., two distinct entities sharing a physical address could be roommates, twins, etc.).
Business Value: Comprehending relationships between resolved entities improves context – with this added context comes higher quality business decisions. For example, if a loan application contains a personal reference who is roommates with Billy the Kid, this might warrant further investigation before finalizing a credit decision.
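The two kinds of relationship can be sketched as edges in one graph. This is an illustrative toy under stated assumptions: the entity ids, the tuple layout `(entity_a, entity_b, kind, detail)`, and the helper names `derive_relationships` and `build_graph` are hypothetical, and a shared address is the only derivation rule shown.

```python
from collections import defaultdict
from itertools import combinations


def derive_relationships(addresses_by_entity):
    """Derive a relationship for every pair of distinct entities sharing an address."""
    derived = []
    for a, b in combinations(sorted(addresses_by_entity), 2):
        for address in sorted(addresses_by_entity[a] & addresses_by_entity[b]):
            derived.append((a, b, "shares_address", address))
    return derived


def build_graph(disclosed, derived):
    """Merge disclosed and derived relationships into one undirected graph."""
    graph = defaultdict(set)
    for a, b, kind, _detail in list(disclosed) + list(derived):
        graph[a].add((b, kind))
        graph[b].add((a, kind))
    return graph


# Hypothetical resolved entities and their known addresses.
addresses = {
    "e1": {"12 Elm St"},
    "e2": {"12 Elm St", "9 Oak Ave"},
    "e3": {"4 Pine Rd"},
}
# A disclosed relationship arrives as an observation from a source system.
disclosed = [("e2", "e3", "employment", "reported by source system")]

derived = derive_relationships(addresses)
graph = build_graph(disclosed, derived)
```

Looking up `graph["e2"]` then returns both the disclosed employment link to `e3` and the derived shared-address link to `e1`, so a decision about `e2` can take both neighbors into account.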
Real Time Transactional Entity Resolution
Senzing delivers low-latency, high-transaction-rate Entity and Relationship Resolution.
Business Value: Real-time, entity-resolved and graphed data allows Senzing to be deployed into operational and mission-critical systems. Transactional Entity Resolution incrementally integrates new observations in real time, eliminating the need to periodically refresh the entire data store. This has huge performance implications: there is a big difference between integrating the latest 10k additions or changes transactionally (each using insignificant compute) and a batch-based system that must reprocess the entire data set (re-boil the ocean) to integrate the latest 10k transactions.
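The difference in workload can be sketched with a small inverted index. This is only an illustration of the incremental idea, not Senzing's storage design: `TransactionalStore`, its method names, and the sample data are assumptions.

```python
from collections import defaultdict


class TransactionalStore:
    """Toy store that integrates one record at a time via a feature index."""

    def __init__(self):
        self.features_by_record = {}                # record_id -> set of features
        self.records_by_feature = defaultdict(set)  # feature -> record ids

    def add(self, record_id, features):
        """Integrate one record; return the existing records it had to examine."""
        candidates = set()
        for feature in features:
            candidates |= self.records_by_feature.get(feature, set())
        self.features_by_record[record_id] = set(features)
        for feature in features:
            self.records_by_feature[feature].add(record_id)
        return candidates

    def batch_refresh_workload(self):
        """Records a full reprocess would have to read again."""
        return len(self.features_by_record)


store = TransactionalStore()
for i in range(1000):
    store.add(f"r{i}", {("phone", f"555-{i:04d}")})

# One new observation only touches records that share a feature with it.
examined = store.add("new", {("phone", "555-0042"), ("name", "pat doe")})
full_refresh = store.batch_refresh_workload()
```

Here the new record examines a single existing record, while a full refresh would re-read all 1,001, and that gap widens as the data store grows.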
Senzing integrates IBM’s Global Name Management (GNM) technology for culturally-aware and linguistically sensitive name comparisons. GNM is the only government certified name comparison algorithm on the market.
Business Value: Leveraging the world-class GNM library, Senzing is able to compare names for similarity across cultures and scripts – an essential feature that delivers higher quality Entity Resolution.
Related Information: IBM InfoSphere Global Name Management
Selective Anonymization - A Privacy by Design (PbD) feature
Senzing provides the ability to anonymize entity attributes (including geospatial data) at the system of record, before data is transferred to Senzing for Entity Resolution. The ability to perform high quality (fuzzy) Entity and Relationship Resolution using only these anonymized features allows for data sharing and insight in situations and use cases where the release and sharing of sensitive data was previously impossible.
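One common way to anonymize attributes at the source so that they can still be compared is a keyed one-way hash over normalized values. The sketch below uses HMAC-SHA256 from the Python standard library. It is an assumption about how such a scheme could look, not a description of Senzing's technique, and `SHARED_KEY`, `normalize`, `anonymize` and `anonymize_location` are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical key agreed between the data owners; never sent with the data.
SHARED_KEY = b"example-key-agreed-out-of-band"


def normalize(value):
    """Reduce formatting differences before hashing (case, spaces, punctuation)."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def anonymize(kind, value, key=SHARED_KEY):
    """Return a keyed one-way token for an attribute value."""
    message = f"{kind}:{normalize(value)}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def anonymize_location(lat, lon, key=SHARED_KEY, places=3):
    """Snap a coordinate to a grid cell, then tokenize the cell, not the point."""
    cell = f"{round(lat, places):.{places}f},{round(lon, places):.{places}f}"
    return hmac.new(key, f"geo:{cell}".encode("utf-8"), hashlib.sha256).hexdigest()


# The same phone number, formatted differently at two systems of record.
token_a = anonymize("phone", "(702) 555-0100")
token_b = anonymize("phone", "702.555.0100")

# Two nearby points that fall into the same grid cell.
geo_a = anonymize_location(36.1699, -115.1398)
geo_b = anonymize_location(36.1701, -115.1401)
```

The tokens can be compared for equality without the receiving side ever seeing the raw values. Normalizing before hashing, and snapping coordinates to a grid, is what gives this toy a limited form of fuzziness; points near a cell boundary can still land in different cells, so a fuller scheme would need more than this.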
Business Value: This differentiating technique increases the willingness of data owners to participate in information sharing ecosystems because it greatly reduces the risk of unintended disclosure.
Related Information: To Anonymize or Not Anonymize, That is the Question
Principle-Based Entity Resolution
A breakthrough technique that blends fuzzy matching, deterministic and probabilistic methods into a high-performance, self-tuning, self-correcting, entity agnostic (e.g., people, organizations, cars, boats, planes) Entity Resolution and Relationship discovery engine. No longer does Entity Resolution require experts, training data and manual tuning.
Business Value: No experts and no training means users can start loading observations (data) almost immediately, significantly decreasing time to value. New data sources, new kinds of entity classes, and new entity features (attributes) can be added with ease, without experts or preliminary training steps.
Related Information: Principle-based Entity Resolution
The Senzing technology does much, much more.
With a suite of Privacy by Design (PbD) features, feedback loop support, and an underlying schema designed for linear scale on elastic compute infrastructures, Senzing is unique and, we're confident, a game changer.