G2Export - How to Consume Resolved Entity Data provided an overview of using G2Export to extract resolved entity information for loaded source data from G2, and of how to interpret the output. That article focused on the default CSV output format; here we look at how the same information is represented in JSON.
If you'd like to load the sample demo data into G2 as used in this article:-
- python G2Loader.py -P -p demo/sample/project.csv
- Note:- The -P option purges the current G2 repository, so any data you have previously loaded will be removed
A very useful tool for managing and parsing JSON data on Linux is jq. The output shown in this article was pretty-printed using jq. To install it on CentOS:-
- sudo yum install jq -y
To export the resolved entity information in JSON format to the default file g2export.json:-
- python G2Export.py -F JSON
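The export file can also be read programmatically. The following is a minimal Python sketch, not part of G2Export itself. It assumes that g2export.json holds one JSON record per line, which this article implies but does not state outright, and the helper names read_export and pretty are hypothetical.

```python
import json
from pathlib import Path


def read_export(path="g2export.json"):
    """Yield one parsed JSON record per non-empty line.

    Assumption: the export file contains one JSON document per line.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def pretty(record):
    """Return an indented string for a record, similar to jq's default output."""
    return json.dumps(record, indent=2)


if __name__ == "__main__":
    export_file = Path("g2export.json")
    if export_file.exists():
        # Print only the first record to keep the output short.
        for record in read_export(export_file):
            print(pretty(record))
            break
```

Reading the file line by line with a generator avoids holding the whole export in memory, which may matter for large repositories.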
JSON Example Output
The following example output is the single JSON record for RESOLVED_ID 1 (Robert Jones), taken from the g2export.json file and pretty-printed using the jq command:-
- jq . <<< '<JSON_RECORD>'
The JSON document represents the same information as the CSV example. Its main components are:-
- At the root is the single RESOLVED_ENTITY; in this case RESOLVED_ID is 1 (Robert Jones)
- RESOLVED_ID 1 consists of 2 source record objects contained within the RECORD array
- Both came from the PEOPLE DATA_SOURCE and their unique identifiers from the source (RECORD_ID) are 1001 and 1002
- The RECORD object for 1002 contains the MATCH_KEY and MATCH_LEVEL information detailing why these 2 records resolved together
- RESOLVED_ID 1 contains a RELATED_ENTITY array indicating the entity has possible matches and/or relationships to 1 or more other distinct entities
- There are 3 RELATED_ID objects describing the relationships; RELATED_ID is the RESOLVED_ID of the related entity
- RESOLVED_ID 1 is related to RESOLVED_ID 3, 4 and 1001
- Each RELATED_ID object describes the relationship details with MATCH_LEVEL and MATCH_KEY
- Each RECORD array contains 1 or more objects describing the record(s) constituting the related entity indicated in the RELATED_ID object
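The components listed above can be walked in a few lines of Python. This is only a sketch under stated assumptions: the exact nesting of the JSON is inferred from the description (RECORD and RELATED_ENTITY arrays inside RESOLVED_ENTITY, with RELATED_ID, MATCH_LEVEL and MATCH_KEY as plain fields of each related object). The SAMPLE record is a hypothetical stand-in rather than real G2Export output, values in angle brackets are placeholders for details not given in the text, and summarize is a hypothetical helper name.

```python
import json

# Hypothetical record shaped after the description above; NOT real export output.
# Values in angle brackets are placeholders for details the article does not give.
SAMPLE = {
    "RESOLVED_ENTITY": {
        "RESOLVED_ID": 1,
        "RECORD": [
            {"DATA_SOURCE": "PEOPLE", "RECORD_ID": "1001"},
            {
                "DATA_SOURCE": "PEOPLE",
                "RECORD_ID": "1002",
                "MATCH_KEY": "<MATCH_KEY>",
                "MATCH_LEVEL": "<MATCH_LEVEL>",
            },
        ],
        "RELATED_ENTITY": [
            {
                "RELATED_ID": related_id,
                "MATCH_KEY": "<MATCH_KEY>",
                "MATCH_LEVEL": "<MATCH_LEVEL>",
                "RECORD": [
                    {"DATA_SOURCE": "<DATA_SOURCE>", "RECORD_ID": "<RECORD_ID>"}
                ],
            }
            for related_id in (3, 4, 1001)
        ],
    }
}


def summarize(record):
    """Collect the source records and related entities of one resolved entity."""
    entity = record.get("RESOLVED_ENTITY", {})
    return {
        "resolved_id": entity.get("RESOLVED_ID"),
        # (DATA_SOURCE, RECORD_ID) pairs for the records that resolved together
        "records": [
            (rec.get("DATA_SOURCE"), rec.get("RECORD_ID"))
            for rec in entity.get("RECORD", [])
        ],
        # One entry per related entity, with its relationship details
        "related": [
            {
                "related_id": rel.get("RELATED_ID"),
                "match_level": rel.get("MATCH_LEVEL"),
                "match_key": rel.get("MATCH_KEY"),
                "record_count": len(rel.get("RECORD", [])),
            }
            for rel in entity.get("RELATED_ENTITY", [])
        ],
    }


if __name__ == "__main__":
    print(json.dumps(summarize(SAMPLE), indent=2))
```

Using .get with defaults keeps the sketch tolerant of entities that have no RELATED_ENTITY array, since not every resolved entity will have relationships.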