Overview
This article is for advanced use only. Most users will only add data sources with the G2Config API or G2Loader.py; the same principles apply, but that is a much simpler problem to solve.
The Senzing API externalizes the configuration into a JSON file. This allows for management in a standard configuration management tool (e.g. Git, CVS) as well as promotion of new configurations through Dev, Test, Staging, UAT, and Prod.
At the same time, separating the configuration from the database requires some coordination so the two don't get out of sync. The JSON format allows a lot of flexibility and works well with standard practices, but if those practices aren't in place you will run into errors.
Basic Practice
Do not remove or change IDs once data is loaded
Senzing normalizes types (features, data sources, etc.) into IDs in the database. The g2config.json configuration file maps those IDs to what they actually are: FTYPE_ID 1 is a NAME, DSRC_ID 2 is SEARCH, etc. If you add new types, whether manually, via the API, via G2Loader.py, or with G2ConfigTool.py, and then use that configuration to load data, you can't remove those types in the future.
For instance, if you add a new feature called MYCOMPANY (FTYPE_ID 1001) and load data, you will now have features with ID 1001 in the system. If you subsequently use a configuration that does not define FTYPE_ID 1001, or that changes it from an Identifier to an Address, you will get errors such as:
0000E|Invalid const lookup identifier in Key-Index Container [PK19KeyIndexedContainerI6DsrcID10DsrcConfigE]. Identifier not found.
If that happens, you need to either revert to the previous config that had the IDs defined, or purge your repository and start with a new config.
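One way to catch this before loading is to compare a candidate configuration against the one last used to load data. The following is a minimal sketch, not a Senzing tool. It assumes g2config.json has a top-level "G2_CONFIG" object containing "CFG_FTYPE" and "CFG_DSRC" lists whose entries carry FTYPE_ID and DSRC_ID fields. The function name, the list of checked sections, and the sample data are hypothetical, so adjust them to match your actual file.

```python
import json

# Assumed layout (verify against your own g2config.json):
# {"G2_CONFIG": {"CFG_FTYPE": [{"FTYPE_ID": 1, ...}], "CFG_DSRC": [{"DSRC_ID": 1, ...}]}}
# Section name -> field that holds the ID. Extend this mapping as needed.
CHECKED_SECTIONS = {
    "CFG_FTYPE": "FTYPE_ID",
    "CFG_DSRC": "DSRC_ID",
}


def find_id_problems(old_config, new_config):
    """Return messages for IDs in old_config that new_config removes or redefines."""
    problems = []
    for section, id_field in CHECKED_SECTIONS.items():
        old_rows = old_config.get("G2_CONFIG", {}).get(section, [])
        new_rows = new_config.get("G2_CONFIG", {}).get(section, [])
        new_by_id = {row[id_field]: row for row in new_rows}
        for row in old_rows:
            row_id = row[id_field]
            if row_id not in new_by_id:
                problems.append(f"{section}: {id_field} {row_id} was removed")
            elif new_by_id[row_id] != row:
                problems.append(f"{section}: {id_field} {row_id} definition changed")
    return problems


if __name__ == "__main__":
    # Hypothetical sample data; in practice, json.load() the two config files.
    old = {"G2_CONFIG": {"CFG_FTYPE": [
        {"FTYPE_ID": 1, "FTYPE_CODE": "NAME"},
        {"FTYPE_ID": 1001, "FTYPE_CODE": "MYCOMPANY"},
    ]}}
    new = {"G2_CONFIG": {"CFG_FTYPE": [
        {"FTYPE_ID": 1, "FTYPE_CODE": "NAME"},
    ]}}
    for message in find_id_problems(old, new):
        print(message)
```

New IDs that appear only in the candidate configuration are not reported, since adding types is safe; only removals and redefinitions of existing IDs are flagged.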
Use Configuration Management
Please use Configuration Management!
Please use Configuration Management!
Please use Configuration Management!
The format of g2config.json makes it easy to review changes in the configuration. This is the best way to audit for someone removing or changing previous configuration items. It also allows you to revert to a previous version and coordinate modifications. Tagging and promotion in the CM tool ensure you know which configuration was last run in each system.
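In practice your CM tool's own diff (for example, git diff) is the usual way to review changes. As an illustration of the same idea outside a CM tool, here is a minimal sketch that formats two configurations consistently and produces a unified diff using Python's standard library. The function name and sample data are hypothetical.

```python
import difflib
import json


def config_diff(old_config, new_config, old_name="old", new_name="new"):
    """Return unified-diff lines between two configs after consistent formatting."""
    # Sorting keys and fixing the indent keeps the diff limited to real changes,
    # not differences in key order or whitespace.
    old_lines = json.dumps(old_config, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new_config, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
        )
    )


if __name__ == "__main__":
    # Hypothetical sample data; in practice, json.load() two versions of g2config.json.
    old = {"G2_CONFIG": {"CFG_FTYPE": [
        {"FTYPE_ID": 1, "FTYPE_CODE": "NAME"},
        {"FTYPE_ID": 1001, "FTYPE_CODE": "MYCOMPANY"},
    ]}}
    new = {"G2_CONFIG": {"CFG_FTYPE": [
        {"FTYPE_ID": 1, "FTYPE_CODE": "NAME"},
    ]}}
    print("\n".join(config_diff(old, new)))
```

Lines prefixed with "-" in the output are items present in the old configuration but missing from the new one, which is exactly what to look for when auditing removals.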
Upgrading the Configuration
Starting with Senzing API release 1.2, we included migration scripts and upgrade SQL to help move your project to the latest version. There are two ways to upgrade the configuration: 1) [advanced] make the current configuration compatible with the new version, or 2) [basic] use the current configuration and simply re-add your custom Data Sources.
Advanced: This is preferred by most larger customers, as they tend to have the most customization. The benefit is that it reduces the work of upgrading to a new Senzing version by minimizing changes to configuration and results. The drawback is that it won't pick up newer configuration-based tuning and additions.
Basic: This is preferred by most smaller customers with fairly stock configurations where only new Data Sources have been added. It allows the user to upgrade quickly and take full advantage of newer configuration-based tuning and additions.
As we move forward, we expect to add the ability to incorporate more configuration customizations in upgrades, so that even Advanced users can easily take advantage of configuration tuning and additions.
If you have specific enhancement requests for configuration upgrades and management, please send an email to support@senzing.com to help us prioritize this work.