Out of the box, Senzing is configured to use an embedded SQLite database for the entity repository to accelerate getting started. This article describes the steps to configure Senzing to use Db2 as the entity repository.
Db2 V11.5 Server (evaluation) was used in testing the steps outlined herein. This article assumes you are already a Db2 user or are familiar with Db2; it covers the installation of Db2 only briefly.
Install and Basic Db2 Setup
The following is a brief overview of the steps required to install an evaluation version of Db2 as a root user installation. For full details on installing Db2 on Linux please see the provided links above. This example installation assumes Senzing, Db2 Server and the Db2 Client are installed on the same machine.
- Create users and groups for Db2; modify the group and user IDs if required for your system
sudo groupadd -g 999 db2iadm1
sudo groupadd -g 998 db2fsdm1
sudo groupadd -g 997 dasadm1
sudo useradd -u 1004 -g db2iadm1 -m -d /home/db2inst1 db2inst1
sudo useradd -u 1003 -g db2fsdm1 -m -d /home/db2fenc1 db2fenc1
sudo useradd -u 1002 -g dasadm1 -m -d /home/dasusr1 dasusr1
sudo passwd db2inst1
sudo passwd db2fenc1
sudo passwd dasusr1
- Untar the Db2 installation package
tar xvf v11.5_linuxx64_dec.tar.gz
- Run the Db2 installer
During the installation you may be warned that certain prerequisites for Db2 are not met. Please refer to the Db2 installation documentation for further details.
- Follow the prompts; the defaults are usually acceptable for testing
- For the product to install select SERVER
- Enter NO when asked to install DB2 pureScale Feature
- Create a Db2 instance (for a root install)
sudo <db2_install_path>/instance/db2icrt -u db2fenc1 db2inst1
- <db2_install_path> = /opt/ibm/db2/Vxx.x if you accepted the default install location
- Create the Db2 Sample database to test with
su - db2inst1
db2start
db2sampl
- Test Db2 and access to the SAMPLE database
db2 connect to sample
db2 "select * from employee"
db2 connect reset
Make Your Senzing User a Db2 Instance Admin
If you already have a Db2 instance configured, or your DBA is going to set one up for Senzing, they will know the privileges and authorizations your Senzing user requires. For the purposes of this article, adding the Senzing user to the db2iadm1 group keeps things simple.
- Make your Senzing user a Db2 instance admin
sudo usermod -aG db2iadm1 senzing
- db2iadm1 = The default Db2 instance group
- senzing = Userid used for running Senzing
- Log out and log in again
On a test system, if you are familiar with Db2, you may be tempted to source the db2profile at login to set up the Db2 paths and variables. Doing so clashes with the paths and environment used by the Db2 client in the following steps. See the IBM Db2 documentation.
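After logging back in, you can confirm that the group change took effect. The following is a minimal sketch, assuming the senzing user and db2iadm1 group names used in this article; user_in_group is a hypothetical helper name introduced here for illustration only.

```shell
# Hypothetical helper: succeeds if user $1 is a member of group $2.
user_in_group() {
    # id -nG prints the user's group names separated by spaces.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

# Names below are the ones used in this article; adjust them for your system.
if user_in_group senzing db2iadm1; then
    echo "senzing is a member of db2iadm1"
else
    echo "senzing is NOT a member of db2iadm1 (or the user does not exist)"
fi
```

If the check reports that the user is not a member, the usual cause is that the session was started before the usermod command was run.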
Create New Database & Add Senzing Schema
- Create the database to use as the Senzing entity repository
sudo su - db2inst1
db2 create db g2 using codeset utf-8 territory us
db2 connect to g2
db2 -tvf <project_path>/resources/schema/g2core-schema-db2-create.sql
- -t = Sets statement terminator (; by default)
- -f = File to process
- -v = Verbose output
- <project_path> = Your Senzing project path
db2 connect reset
The above commands should be run as the Db2 instance owner (db2inst1). The instance owner must be able to read the g2core-schema-db2-create.sql script: either make the script readable to the Db2 instance owner, or copy it to a location that the instance owner can read.
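One way to give the instance owner a readable copy of the schema script is sketched below. It assumes a staging directory under /tmp is acceptable on your system; stage_schema and the /tmp/senzing_schema path are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: copy a schema file into a directory and make the copy
# readable by all users, so the Db2 instance owner can process it.
stage_schema() {
    src="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    chmod 644 "$dest_dir/$(basename "$src")"
}

# Example usage (run as a user who can read the Senzing project):
#   stage_schema "<project_path>/resources/schema/g2core-schema-db2-create.sql" /tmp/senzing_schema
# Then, as db2inst1 and while connected to g2:
#   db2 -tvf /tmp/senzing_schema/g2core-schema-db2-create.sql
```

The staging directory itself must also be accessible to the instance owner; a directory created under /tmp with the default umask normally is.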
Install & Configure IBM CLI/ODBC Driver
The IBM CLI/ODBC Driver package provides lightweight connectivity for applications accessing Db2 through the CLI or ODBC interfaces.
- Download the IBM Data Server Driver for ODBC and CLI (64-bit) and make it available to the db2inst1 user
- Perform the following as the Db2 instance user (db2inst1 in this article)
su - db2inst1
- Create the target directory and extract the downloaded package into it
mkdir -p /home/db2inst1/db2_cli_odbc_driver/
tar xvf v11.1.4fp5_linuxx64_odbc_cli.tar.gz -C /home/db2inst1/db2_cli_odbc_driver/
- Create the db2dsdriver.cfg file; it will be created in /home/db2inst1/sqllib/cfg/
db2dsdcfgfill -i db2inst1
- You can validate a db2dsdriver.cfg file using the db2cli command
db2cli validate -dsn g2
- g2 = Name of the database created earlier and populated with the Senzing schema
- Add the required variables to the Senzing user's environment
These environment variables must be set for the Senzing user at each login when accessing Db2. You can add them to the end of the setupEnv script located in the root of your Senzing project path; they must come after all of the default commands in that file.
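As an illustration, the variables typically needed to point applications at the CLI/ODBC driver can be set as sketched below. This is an assumption-laden example, not a definitive list: it assumes the package extracted into an odbc_cli/clidriver subdirectory of the directory used earlier, so check the actual layout on your system before using these paths.

```shell
# ASSUMPTION: the driver tarball unpacked to odbc_cli/clidriver under the
# extraction directory used in this article. Verify the path on your system.
export DB2_CLI_DRIVER_INSTALL_PATH="/home/db2inst1/db2_cli_odbc_driver/odbc_cli/clidriver"

# Put the driver libraries first on the library path, preserving any existing value.
export LD_LIBRARY_PATH="${DB2_CLI_DRIVER_INSTALL_PATH}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Put the driver tools (for example db2cli) first on the command path.
export PATH="${DB2_CLI_DRIVER_INSTALL_PATH}/bin:${PATH}"
```

The Senzing user needs read and execute access to the driver directory for these settings to be usable.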
Configure G2Project.ini & G2Module.ini
Modify the ini files to reference the new Db2 database and schema. These files are located in <project_path>/etc/. You can comment out the current lines by prefixing them with # and add the modified ones below them.
- G2Project.ini - Change or add a new G2Connection entry
- G2Module.ini - Change or add a new CONNECTION entry
- db2inst1 = Connection user
- password4g2 = Password for user
- G2 = Senzing entity repository
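The two entries combine the three values listed above into one connection string. The sketch below assumes the string takes the URI form db2://user:password@database; treat that form as an assumption and check it against the Senzing documentation for your version. The variable names are hypothetical.

```shell
# Values from this article; replace them with your own.
DB_USER="db2inst1"        # connection user
DB_PASS="password4g2"     # password for that user
DB_NAME="G2"              # Senzing entity repository

# ASSUMPTION: the connection string has the form db2://user:password@database.
CONNECTION_URI="db2://${DB_USER}:${DB_PASS}@${DB_NAME}"

# Print the lines as they would appear in each ini file.
printf 'G2Project.ini:  G2Connection=%s\n' "$CONNECTION_URI"
printf 'G2Module.ini:   CONNECTION=%s\n' "$CONNECTION_URI"
```

Because the password is stored in plain text in these files, restricting read access to the ini files to the Senzing user is a sensible precaution.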
Configure Database Parameters
Set the database parameters for the Senzing workload. Be sure to stop and start Db2 afterwards so that the changes take effect.
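The general shape of this step can be sketched as follows, run as the Db2 instance owner. The set_db_cfg helper and its DRY_RUN switch are hypothetical, and the parameter and value in the usage comment are purely illustrative, not Senzing's recommended settings; use the values recommended for your workload.

```shell
DB_NAME="g2"   # database created earlier in this article

# Hypothetical helper: print (DRY_RUN=1, the default here) or run a single
# "db2 update db cfg" command for the database named in DB_NAME.
set_db_cfg() {
    param="$1"
    value="$2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "db2 update db cfg for ${DB_NAME} using ${param} ${value}"
    else
        db2 update db cfg for "$DB_NAME" using "$param" "$value"
    fi
}

# Example usage (ILLUSTRATIVE parameter and value only):
#   DRY_RUN=0 set_db_cfg LOCKTIMEOUT 30
# Restart Db2 afterwards so that the changes take effect:
#   db2 connect reset
#   db2stop
#   db2start
```

Running with the default DRY_RUN=1 first lets you review the generated commands before applying them.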
Run a Test Load and Export
Load the supplied sample data and then run an export to test the new database setup.
- Source setupEnv
- Load the sample data
python3 G2Loader.py -P -p demo/sample/project.csv
- Create the export file
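The steps above can be sketched as a single function. Several details are assumptions: that setupEnv sits in the project root, that G2Loader.py and G2Export.py live in a python subdirectory of the project, and that G2Export.py accepts -o to name the output file. PROJECT_PATH, EXPORT_FILE and run_test_load_and_export are hypothetical names; check the tools' help output for your Senzing version.

```shell
# ASSUMPTIONS: project layout and the G2Export.py -o option (see the text above).
PROJECT_PATH="${PROJECT_PATH:-/path/to/senzing/project}"   # hypothetical path
EXPORT_FILE="${EXPORT_FILE:-/tmp/g2_export.csv}"           # hypothetical output file

run_test_load_and_export() {
    (
        # Run in a subshell so the caller's directory and environment are untouched.
        cd "$PROJECT_PATH" || exit 1
        . ./setupEnv || exit 1                 # source the Senzing environment
        cd python || exit 1
        python3 G2Loader.py -P -p demo/sample/project.csv || exit 1   # load the sample data
        python3 G2Export.py -o "$EXPORT_FILE"  # create the export file
    )
}

# Example usage:
#   PROJECT_PATH=<project_path> run_test_load_and_export
```

A non-empty export file after the run is a reasonable sign that Senzing is reading from and writing to the new Db2 repository.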