Overview
You may have arrived here due to perceived poor performance in general or a warning from Senzing your entity repository database is performing slowly; for example during the startup of G2Loader. The Senzing checkDBPerf() function tests how many auto-commit inserts can be performed on the Senzing entity repository within a few seconds. The faster this completes, the higher scalability you can expect from you system. This isn't the only performance area impacting Senzing but is the #1 outside of data mapping.
Database performance with Senzing is highly related to latency. Issues with performance, in order, tend to focus around:
- Disk IO performance of the database server - see the Disk Performance article
- Lack of database tuning for an auto-commit OLTP workload
- Network bottlenecks preventing high-speed communication between the Senzing API and the database
- Latency between the database server and non-direct attached storage subsystems
The database parameters outlined here are the minimim required and not an exhaustive list for optimal database performance. We are happy to assist in making further recommendations but you should always include your Database Administrator in helping to monitor and tune your underlying Senzing database.
Our customers have systems running as fast as low 10s of milliseconds for searches and mid 10s of milliseconds for loads all while running 100s or 1000s of operations in parallel -- performance depends on data, config, architecture, and hardware. If you aren't seeing the performance you are looking for, reach out so we can help you get the most from your setup.
Auto-Commit Tuning
For each database system the following parameters should be set for the Senzing auto-commit workload.
SQLite
The Senzing engine automatically sets a couple of SQLite pragmas suited to the Senzing workload. If your database is small and you have RAM to spare you could consider using tmpfs to improve performance. See the Disk Performance article.
The pragmas set for SQLite:
- synchronous = 0
- secure_delete = 0
- journal_mode = WAL
- journal_size_limit = 1000000
DB2
db2set DB2_LOGGER_NON_BUFFERED_IO=ON
db2set DB2_SKIP_LOG_WAIT=YES
db2 update db cfg for <database_name> using PAGE_AGE_TRGT_MCR 10
db2set DB2_USE_ALTERNATE_PAGE_CLEANING=ON
PostgreSQL
synchronous_commit=off
wal_writer_delay=1000
enable_seqscan=off
Aurora PostgreSQL
Ensure that the ec2 instances running the Senzing API are in the same AZ (availability zone), VPC, and subnet as the Aurora PostgreSQL server. This will greatly reduce latency and increase performance.
If you have an Aurora PostgreSQL read replica in another AZ, all commits will be synchronous. Either disable replicas for the large historical/initial loads or ensure the replicas are in the same AZ.
Note: Aurora PostgreSQL Server-less v2 is showing significantly degraded performance compared to v1.
The following settings are per cluster database group:
autovacuum_max_workers: 5 enable_seqscan: 0 pglogical.synchronous_commit: 0 synchronous_commit: "off"
MS SQL and Azure Hyperscale SQL
If you are running on Azure, ensure that your Azure Hyperscale database is in the same proximity zone as your applications/services that have integrated the Senzing API.
For MS SQL 2019, make sure the database is set to a UTF-8 character set. If your database is not in UTF-8 (e.g. running MS SQL 2017) it is critical that your odbc.ini set "AutoTranslate = No" for the configuration so data is not corrupted.
ALTER DATABASE <G2 DB> SET DELAYED_DURABILITY = Forced;
ALTER DATABASE <G2 DB> SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE <G2 DB> SET AUTO_UPDATE_STATISTICS_ASYNC ON;
MS SQL also has some other special case items.
- It doesn't update the query plans automatically so the DBA will need to be involved to update statistics and flush the procedure cache.
- On RedHat, unixODBC is not built with --enable-fastvalidate which severely limits scaling per process. Microsoft explains the issue here. Ubuntu and Debian ship with this flag enabled.
-
On Ubuntu, Microsoft supplies a version of unixODBC that does NOT contain --enable-fastvalidate for the build. This creates a 10x performance problem. You must override its versions by downgrading to the Ubuntu-built versions. Currently, that is:
sudo apt install libodbc1=2.3.6-0.1build1 unixodbc=2.3.6-0.1build1
MySQL
Note: Aurora MySQL v3 does not support turning off flush on commit so it will not perform well with Senzing.
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = O_DIRECT
innodb_file_per_table = 1
innodb_doublewrite=0
innodb_flush_neighbors=0
skip-log-bin
On larger Senzing systems running a high number of threads you may see errors that additional prepared statements can't be created:
42000 Can't create more than max_prepared_stmt_count statements (current value: 16382)
If you know you will be running Senzing on a larger system with a higher than default number of threads, or you receive an error similar to the above, you will need to increase the number of prepared statements for the database. For example, on MySQL by increasing max_prepared_stmt_count:
max_prepared_stmt_count = 100000
100,000 is a suggested value but this will vary depending on threads and speed of system. A suggested formula is 2000 * number of Senzing threads + 500
Network Bottlenecks
Networking used to be very straightforward, easy to monitor and manage. There was the exceptional case of devices hitting max packets per second or driver/OS/PCI bottlenecks but those were rare.
The popularity of cloud environments extracts much of the network topology from the user. Your application and database could be on different sides of a data center (or continent) or even communicating through very limited virtual systems when they are co-located. Cloud networking is often a black box and the hardest component to troubleshoot.
We have seen systems scale from 30 records per second to 1000 records per second simply by switching the type of underlying cloud network fabric:
- Make sure your systems are co-located as close as possible when provisioned
- If there are options between virtual and physical switches, test with both options
- Ask about both bandwidth and packets per second limits
Comments
0 comments
Please sign in to leave a comment.