You may have arrived here due to perceived poor performance in general or a warning from Senzing your entity repository database is performing slowly; for example during the startup of G2Loader. The Senzing checkDBPerf() function tests how many auto-commit inserts can be performed on the Senzing entity repository within a few seconds. The faster this completes, the higher scalability you can expect from you system. This isn't the only performance area impacting Senzing but is the #1 outside of data mapping.
Database performance with Senzing is highly related to latency. Issues with performance, in order, tend to focus around:
- Disk IO performance of the database server - see the Disk Performance article
- Lack of database tuning for an auto-commit OLTP workload
- Network bottlenecks preventing high speed communication between the Senzing API and the database
- Latency between the database server and non-direct attached storage subsystems
The database parameters outlined here are the minimim required and not an exhaustive list for optimal database performance. We are happy to assist in making further recommendations but you should always include your Database Administrator in helping to monitor and tune your underlying Senzing database.
For each database system the following parameters should be set for the Senzing auto-commit workload.
The Senzing engine automatically sets a couple of SQLite pragmas suited to the Senzing workload. If your database is small and you have RAM to spare you could consider using tmpfs to improve performance. See the Disk Performance article.
The pragmas set for SQLite:
- synchronous = 0
- secure_delete = 0
- journal_mode = WAL
- journal_size_limit = 1000000
db2 update db cfg for <database_name> using PAGE_AGE_TRGT_MCR 10
Assure that the ec2 instances running the Senzing API are in the same AZ, VPC, and subnet as the Aurora PostgreSQL server. This will greatly reduce latency and increase performance.
If you have an Aurora PostgreSQL read replica in another AZ, all commits will be synchronous. Either disable replicas for the large historical/initial loads or assure the replicas are in the same AZ.
For MS SQL 2019, make sure the database is set to a UTF-8 character set. If your database is not in UTF-8 (e.g. running MS SQL 2017) it is critical that your odbc.ini set "AutoTranslate = No" for the configuration so data is not corrupted.
ALTER DATABASE <G2 DB> SET DELAYED_DURABILITY = Forced;
ALTER DATABASE <G2 DB> SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE <G2 DB> SET AUTO_UPDATE_STATISTICS_ASYNC ON;
MS SQL also has some other special case items.
- It doesn't update the query plans automatically so the DBA will need to be involved to update statistics and flush the procedure cache.
- On RedHat, unixODBC is not built with --enable-fastvalidate which severely limits scaling per process. Microsoft explains the issue here. Ubuntu and Debian ship with this flag enabled.
On Ubuntu, Microsoft supplies a version of unixODBC that does NOT contain --enable-fastvalidate for the build. This creates a 10x performance problem. You must override its versions by downgrading to the Ubuntu-built versions. Currently, that is:
sudo apt install libodbc1=2.3.6-0.1build1 unixodbc=2.3.6-0.1build1
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
On larger Senzing systems running a high number of threads you may see errors that additional prepared statements can't be created:
42000 Can't create more than max_prepared_stmt_count statements (current value: 16382)
If you know you will be running Senzing on a larger system with a higher than default number of threads, or you receive an error similar to the above, you will need to increase the number of prepared statements for the database. For example, on MySQL by increasing max_prepared_stmt_count:
max_prepared_stmt_count = 100000
100,000 is a suggested value but this will vary depending on threads and speed of system. A suggested formula is 2000 * number of Senzing threads + 500
Networking used to be very straightforward, easy to monitor and manage. There was the exceptional case of devices hitting max packets per second or driver/OS/PCI bottlenecks but those were rare.
The popularity of cloud environments extracts much of the network topology from the user. Your application and database could be on different sides of a data center (or continent) or even communicating through very limited virtual systems when they are co-located. Cloud networking is often a black box and the hardest component to troubleshoot.
We have seen systems scale from 30 records per second to 1000 records per second simply by switching the type of underlying cloud network fabric:
- Make sure your systems are co-located as close as possible when provisioned
- If there are options between virtual and physical switches, test with both options
- Ask about both bandwidth and packets per second limits