What are resource pools?
Semarchy xDM makes use of "connection pools" when accessing the database as these connection pools allow connections to be re-used. This is important as the process of connecting to a database is the most "costly" aspect of getting data from a database. When a connection can be re-used rather than re-created, time can be saved. As in Semarchy xDM, all aspects of the end-user's UI is driven by data, the performance of database interactions is paramount and the use of resource pools is part of what gives xDM its performance.
You can find out more about connection pools and why they are used here: https://dev.to/digi0ps/connection-pooling-what-and-why-bom
Where are they configured?
When Semarchy xDM deployed in Tomcat, connection pools are defined and configured in the
semarchy.xml configuration file in stanzas which start with:
To accomplish any work in xDM you will need at least 2 JNDI Data Sources to be defined in
semarchy.xml. These datasource definitions are used to create the connections pools needed by Semarchy xDM. The first one is the pool called
"jdbc/SEMARCHY_REPOSITORY" which is where the xDM application can find the repository. The 2nd one can be named as you wish, though convention means the name starts with
jdbc/. One example of this is
<Resource name="jdbc/SEMARCHY_CUSTOMER_B2C_MDM" auth="Container" factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" type="javax.sql.DataSource" driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/postgres" username="semarchy_customer_b2c_mdm" password="semarchy_customer_b2c_mdm" maxActive="8" maxIdle="8" minIdle="0" maxWait="15000" initialSize="0" defaultAutoCommit="false" rollbackOnReturn="true" validationQuery="select 1" testOnBorrow="true" timeBetweenEvictionRunsMillis="45000" validationInterval="60000" />
The full explanation of the stanza is beyond the scope of this article. However the items that can affect the performance of an xDM installation are these:
maxActive="8" maxIdle="8" minIdle="0"
Using this configuration, Tomcat will create a connection pool named
jdbc/SEMARCHY_CUSTOMER_B2C_MDM and will set its size (i.e. the maximum number of active connections) at 8, the minimum number of idle connections to 0 and maximum number of idle connections to 8.
With the above configuration you can have up to 8 connections in use where (in simplistic terms and assuming the code is well-behaved and relinquishes connections once the SQL has completed) "in use" means that the database is executing some SQL. If a 9th request came along it would have to wait as there are no free (or "idle") connections for it to use. The pool will keep the request waiting up to 15 seconds (the
maxWait parameter above - 15000ms = 15 secs) and if it still can't satisfy the request, an exception will be thrown and the request will fail.
What are good or reasonable values to use?
There are a number of schools of thought. Personally, I take the view that the pool should have 1 more connection than is needed. So if your user activity causes 120 connections to be used, you should set the connection pool to have
maxActive = 121.
Some detractors would argue that this could cause load on the database and on the whole, I would agree and urge them to tune (or size) the database as part of this exercise! In fact, in setting
maxActive, you should consider how many simultaneous connections your database can support.
The other value that I would tune is
minIdle which I suggest should be set to a low number like 2. To understand the logic behind this is that having some idle connections one must consider that when there are no idle connections an incoming request would cause a connection to be created at which point the overheads of creating the connection would come into play. With
minIdle set to, say, 2 then a connection could be supplied immediately and Tomcat can go and create a "spare" connection in line with the minIdle parameter without making the xDM code wait for it.
How do I monitor the connection pools?
Many Java monitoring tools have the ability to query JMX (See Wikipedia) and if you have access to a monitoring tool, we would suggest that you consider using it to track not just connection pools but other metrics too.
If you don't have a monitoring tool, Tomcat has made available, as part of the Manager application that's distributed with Tomcat, a feature called jmxproxy (see Tomcat's documentation) which lets you query JMX beans through a browser. You will need to give a user the
manager-jmx role to them to have access to jmxproxy. Also you will need to know what bean to query, so below is an example:
In the above example:
192.168.178.200:8080is the host/port on which Tomcat can be found,
Catalina:type=DataSource,host=localhost,context=/semarchy,class=javax.sql.DataSource,name=%22is the name of the bean (the '%22' represent double-quotes)
idleis the name of the attribute you want to get
Not only can this be used in a browser, but you can also query it from command-lines (as long as you have curl or something similar available:
[root@cs77 ~]# curl "http://semadmin:firstname.lastname@example.org:8080/manager/jmxproxy/?get=Catalina:type=DataSource,host=localhost,context=/semarchy,class=javax.sql.DataSource,name=%22jdbc/SEMARCHY_CUSTOMER_B2C_MDM%22&amp;att=maxIdle" OK - Attribute get 'Catalina:type=DataSource,host=localhost,context=/semarchy,class=javax.sql.DataSource,name="jdbc/SEMARCHY_CUSTOMER_B2C_MDM"' - maxIdle = 8 [root@cs77 ~]#
I have attached a little script which queries the attributes for the repository's pool as well as another one) whose name is passed on the command-line as well as the JVM heap memory:
[root@cs77 ~]# ./getPoolMetrics.sh SEMARCHY_STG Resource jdbc/SEMARCHY_REPOSITORY: maxActive = 16 active = 0 minIdle = 0 maxIdle = 10 idle = 2 Resource jdbc/SEMARCHY_STG: maxActive = 8 active = 0 minIdle = 0 maxIdle = 8 idle = 0 JVM Heap Memory: init = 4294967296 max = 4150263808 committed = 4150263808 used = 677195984 [root@cs77 ~]#
I have zipped and attached the script.