JDBC Federator

JDBC Federator is a data connector that can combine multiple single-tenant databases or schemas into a single multi-tenant ElastiCube. This is useful, for example, for performing cross-analyses between different tenants. If you have multiple clients with their own database schemas, creating an ElastiCube for each will be difficult to manage, and writing a Union statement is not scalable. Using JDBC Federator solves these problems.

JDBC Federator is built on the JVM Connector framework and is conceptually similar to the Deploying a Custom Connector. For JDBC Federator to work, it requires a JDBC driver for the databases you want to connect to. You can download JDBC drivers from here.

This document contains the following main sections:

Setting Up JDBC Federator

Setting up the JDBC Federator requires several steps:

Editing the description.json File

To edit the description.json file:

  1. In Sisense, click Admin, search for and select System Management, and click File Management.

    Note:

    The File Management feature must be enabled in the Feature Management section in order for it to appear here.

    Note:

    On your local Linux machine, this maps to /opt/sisense/storage.

    Important:

    Do not add or remove any files or folders from the root directory.

  2. Navigate to connectors > federator > description.json.

  3. Append the following to the payload section:

    "additionalClassPaths": ["/opt/sisense/storage/connectors/federator/jdbcdrivers/*"], "jvmParameters" : ["-Xmx8G"]

    Note:

    Set the RAM allocation (-Xmx8G ) in GB appropriately. For example, 4G, 8G, or 12G.

  4. The description.json file is now as follows:

            {
            "fileVersion": "config://sisense/connector/federator#1",
            "configurationType": "json",
            "payload": {
            "provider": "federator",
            "displayName": "Federator",
            "version": "1.1.0",
            "category": "Database",
            "jdbcClass": "com.sisense.connectors.jdbc.JayDBCQueryFederator",
            "isAccumulativeSupported": true,
            "isWebECMSupported": true,
            "startAsUser": false,
            "additionalClassPaths": ["/opt/sisense/storage/connectors/federator/jdbcdrivers/*"],
            "jvmParameters" : ["-Xmx8G"]
            }
            }

 

Editing the usedConnectors.json File

To edit the usedConnectors.json file:

  1. In Sisense , click Admin, search for and select System Management, and click File Management. Note that the File Management feature must be enabled in the Feature Management section in order for it to appear here.

  2. Navigate to connectors > usedConnectors.json.

  3. Append "federator" to both of the arrays:

    "availableConnectors": [..., "federator"]
    "displayConnectors": [..., "federator"]

The usedConnectors.json file should now look like this:

      {
      "customConnectors":true,
      "availableConnectors":[
      "athena",
      "EC2EC",
      "CSV",
      "GoogleAnalytics",
      "GenericJDBC",
      "Google Spreadsheets",
      "sql",
      "Excel",
      "GoogleAdwords",
      "SalesforceJDBC"
      "federator"
      ],
      "displayConnectors":[
      "athena",
      "EC2EC",
      "CSV",
      "GoogleAnalytics",
      "GenericJDBC",
      "Google Spreadsheets",
      "sql",
      "Excel",
      "GoogleAdwords",
      "SalesforceJDBC"
      "federator"
      ],
      "externalConnectorsPath":"/opt/sisense/storage/connectors"
      }

 

Setting Up the jdbcdrivers and tenants Folders

To set up the jdbcdrivers and tenants folders:

  1. Download the following files:

  2. Save the downloaded files to System Management > File Management > connectors > federator.

  3. In that location, create two new folders:

    • tenants
    • jdbcdrivers
  4. Add the relevant JDBC driver JAR file to the jdbcdrivers folder.

 

Restarting the Connector Pod

Restart the Kubernetes Connector Pod. Note that a restart is required when:

  • You updated the descriptor.json file
  • You add or update the jdbc driver JAR file

You do not have to restart the Connector Pod when you update the tenantdetails.csv and jdbc_query.properties files (next steps).

Configuring the tenantdetails.csv File

The tenantdetails.csv file lists the details for each tenant and is saved here:

/opt/sisense/storage/connectors/federator/tenantdetails.csv.

The CSV file must contain at least one tenant column, which typically maps to your database schemas. The tenant column must contain unique values.

If your tenants are located across multiple database servers with different usernames and passwords, you can add additional columns for each.

You can add as many columns as you like, based on the variables across tenants.

The following is an example of a tenantdetails.csv file:

Configuring the jdbc_queries.properties File

  1. Click Admin, search for and select System Management, and click File Management > connectors > federator.

    Note:

    The File Management feature must be enabled in the Feature Management section in order for it to appear here.

  2. Open the jdbc_query.properties file for editing.

  3. In the table.names property, add a comma-delimited list of tables.

  4. For each value in the table.names list, add a corresponding entry for its query. In the example, below, there's a table called table.names=Customer,Invoice and so there are corresponding query entries for Customer.query and Invoice.query.

    Note:

    • The table names are logical names and do not need to match the database names
    • do not leave any spaces in the tables.names list
    • Any value from the tenantdetails.csv file is reverenced using the <column name> format. For example: &lt;tenant&gt;
  5. Make sure that the details for each of the properties are correct.

    # tenant csv file location
    tenant.config.file=/opt/sisense/storage/connectors/federator/tenantdetails.csv
    # folder where the incremental state of the tables is stored
    #tenant.incr.folder=/opt/sisense/storage/connectors/federator/tenants
    # table names as a comma delimited list, do NOT leave spaces
    table.names=Customer,Invoice
    # sql queries for each table
    Customer.query=SELECT '<tenant>' as "Tenant", * from <tenant>.Customer
    Invoice.query=SELECT '<tenant>' as "Tenant", * from <tenant>.Invoice
    # column names for incremental loads (optional)
    #xxxx.increment.col=CreatedDate
    #yyyy.increment.col=ID
    # misc settings
    incr.date.format=yyyy-MM-dd HH:mm:ss.SSS
    fault.tolerance=Y
    connection.test.query=SELECT 1
    #For DB2
    #connection.test.query=SELECT * from SYSIBM.SYSDUMMY1
    #For PostGres
    #record.fetch.size=100000

Setting Up Incremental Load

When you run a build, the ElastiCube reloads the entire data set. If you have a lot of data, it could take a long time to load. An incremental load, however, only loads new data. This configuration instructs JDBC Federator to look at specific columns (such as CustomerID or InvoiceDate) to determine what the latest data is, and to only load that data.

To set up incremental load:

  1. Click Admin, search for and select System Management, and click File Management > connectors > federator.

    Note:

    The File Management feature must be enabled in the Feature Management section in order for it to appear here.

  2. Open the jdbc_query.properties file for editing.

  3. Uncomment the tenant.incr.folder property.

  4. Specify the incremental columns for each table you want to load incrementally. For example:

    customer.increment.col=CustomerID
    invoice.increment.col=InvoiceDate

  5. In Sisense, in the ElastiCube , set the table build behavior to Append.

Setting the Federator Connection URL

Federator appears in the list as an available connector.

Set the connection URL for Federator as follows:

JDBC URL|ConfigFile=&lt;path&gt;/jdbc_query.properties

Example:

jdbc:redshift://redshift.aws.com/sales|ConfigFile=/opt/sisense/storage/connectors/federator/jdbc_query.properties

Note:

The connection URL can contain dynamic elements from the tenantdetails.csv file.

For example:

jdbc:redshift://redshift.aws.com/&lt;tenant&gt;|ConfigFile=/opt/sisense/storage/connectors/federator/jdbc_query.properties

Troubleshooting

Connection Error - Could Not Connect to DB. java.lang.ClassNotFoundException:

Check the following

  • The JDBC driver JAR file for the database is uploaded to the jdbcdrivers folder
  • additionalClassPaths entry is added to the descriptor.json file
  • The connector pod was restarted after the settings were made

Can connect but do not see any tables.

Check the following:

  • The ConfigFile property is appended to the JDBC URL
  • The tenantdetails.csv file and the jdbc_query.properties file exist
  • The SQL queries in jdbc_query.properties are valid.
  • Detailed error message can be found on the logs for the connector pod

Viewing Logs

Logs are located in:

/var/log/sisense/sisense/&lt;connector pod logs&gt;

You can also check the logs within the connector pod:

kubectl -n sisense logs &lt;connector pod name&gt; --since=5m

FAQs

  1. Is there a limitation on the number of data sources or schemas I can connect to?

    There is no limit, the only limit is the total number of rows across all the tenants.

  2. Where does the unioned data reside before creating the table inside the ElastiCube? Is it cached on the server?

    There is no UNION in the Federator. The Federator runs individual queries in sequence and passes the data to the ElastiCube , row by row.

    There is no caching or buffering in memory or on disc.

  3. All my tenants are on the same database. Should I UNION the tables in the import query or use the Federator? What would be the advantage and disadvantage of using the Federator?

    The Federator supports incremental loads at the tenant level. A single UNION query will not support this.

    Management and setup is also much easier in the Federator than doing a UNION for each table. For example, if you add 20 schemas and then want to add the 21st schema, for the UNION method you will have to go to the ElastiCube , change every table, and rebuild the entire cube. In the Federator, you only need to add a single line in the tenantdetails.csv file.

  4. Can I Federate data across different database vendors (e.g. SQL Server and Postgres)?

    No, you cannot federate data across different vendors in a single table. You can have multiple federations in an ElastiCube . For example, Table 1 (multiple SQL Server databases) Table 2 (multiple Postgres databases).
    For this you need to create multiple sets of property and CSV files.

  5. If my ElastiCubes get too large, can I split the Federations into multiple ElastiCubes?

    Yes, you can. Create multiple sets of property and CSV files, and then use the Dynamic ElastiCube plugin from the Sisense Market Place to have your dashboards switch from one ElastiCube to another, based on the tenant.

  6. Is the Federator data size-sensitive?

    The Federator is only sensitive to data size as far as the data size limits of the ElastiCube are concerned.

  7. What data sources can I use in the Federator?

    The following databases (via their JDBC Drivers) are supported:

    • "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    • "oracle.jdbc.driver.OracleDriver"
    • "com.ibm.as400.access.AS400JDBCDriver"; "com.ibm.db2.jcc.DB2Driver"
    • "com.simba.athena.jdbc.Driver"
    • "org.postgresql.Driver"
    • "com.mysql.jdbc.Driver"
    • "com.amazon.redshift.jdbc42.Driver"
    • "net.snowflake.client.jdbc.SnowflakeDriver"