Connecting to Hadoop via Hive
The Sisense Hive connector is a certified connector that allows you to import data from the Apache Hadoop Hive API into Sisense via the Sisense generic JDBC connector. The connector offers the most natural way to connect to Apache Hadoop Hive data and provides additional powerful features.
Support for the connector is provided by Sisense, with assistance from the certification partner's support team if needed. For any support issues or additional functionality requests, contact your Sisense representative or open a request through the Sisense Help Center. For advanced inquiries specific to driver functionality, you can also contact the certification partner’s support directly via support@cdata.com.
After you have downloaded the driver, you can connect through a connection string in Sisense. The connection string is used to authenticate users who connect to the Hive APIs. Once you have connected to Hive, you can import a variety of tables from the Hive API.
This page describes how to download and deploy the Hive driver, how to connect to Hive with a connection string, how to add Hive tables to your ElastiCube, and where to find information about the Hive data model.
Downloading the Hive JDBC Driver
You can download the Hive JDBC driver here.
For a short video of downloading the driver, see below (the video uses the Box driver as an example).
Note:
- The driver is certified for Sisense v7.2 and above.
- Sisense v7.4 and above: Click the above link to download a ready-to-use driver.
- Sisense prior to v7.4: Click the above link to download a 30-day free trial of the driver. Contact Sisense for the fully licensed version.
Deploying the Driver
Prerequisite: The install file (setup.jar) is a Java Application that requires Java 6 (J2SE) or above to run.
To install the driver, double-click the setup.jar file and proceed with the instructions in the installation wizard.
Depending on the machine on which you are accessing the Sisense application, install the driver in one of the following locations:
- When Sisense is installed on your local machine, deploy the driver locally.
- For a non-local installation (when accessing Sisense on a remote Windows server, or accessing the Sisense hosted cloud environment), select one of the following methods:
  - Deploy the driver on the Sisense server machine, and then perform all the authentication on the server machine.
  OR
  - Deploy the driver on your local machine (or any other machine, as convenient), perform all the authentication on that machine, and then copy the JAR file to the remote server.
Note:
The default location of the JAR file: C:\Program Files\CData\CData JDBC Driver for <Driver Name> 2019\lib
For a short video of the process, see below (the video uses the Box driver as an example).
Java Troubleshooting
If you do not have Java 6 installed, you may download it from here.
If your system is not set up to run Java applications, execute the following command:
java -jar setup.jar
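As a quick check before running the installer, the Java 6 prerequisite can be verified from a script. The following is a minimal sketch, not part of the Sisense or CData tooling: the function names are hypothetical, and it assumes that `java` is on the PATH and prints a line such as `java version "1.8.0_281"` or `openjdk version "11.0.2"` when called with `-version`.

```python
import re
import subprocess


def java_major_version(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Releases before Java 9 report themselves as 1.x (1.6.0_45 is Java 6).
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def meets_installer_requirement(version_output, minimum=6):
    """True when the reported Java version is at least `minimum`."""
    major = java_major_version(version_output)
    return major is not None and major >= minimum


def installed_java_output():
    """Run `java -version`; the JVM writes its version banner to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr + result.stdout


if __name__ == "__main__":
    try:
        output = installed_java_output()
    except OSError:
        print("Java was not found on the PATH.")
    else:
        print("Java requirement met:", meets_installer_requirement(output))
```

If the check reports that the requirement is not met, install Java as described above and then run the installer again.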
Connecting to Hive
Sisense uses connection strings to connect to Hive and import data into Sisense. Each connection string contains authentication parameters that the data source uses to verify your identity and to determine what information you can export to Sisense.
To create the connection string:
- Open the lib directory for the connector. This is the default path:
  C:\Program Files\CData\CData JDBC Driver for <Driver Name> 2019\lib
- Double-click the jar file in the lib directory.
  Alternatively, to open the jar file from the command line, enter the following command in the command prompt (change the driver name to your driver):
  cd C:\Program Files\CData\CData JDBC Driver for <Driver Name> 2019\lib
  Press Enter, and then enter the following command (change the driver name to your driver):
  "C:\Program Files\Sisense\infra\jre\bin\java.exe" -jar cdata.jdbc.<Driver Name>.jar
  Press Enter again.
  The Connection String Builder opens.
- Enter the values for the following connection properties (click in the Value column to enter a value or to modify an existing value):
  - Server: The host name or IP address of the server hosting HiveServer2.
  - Port: The port for the connection to the HiveServer2 instance:
    - When using BINARY TransportMode, set this to the value of the 'hive.server2.thrift.port' property in the Hive configuration file (hive-site.xml).
    - When using HTTP TransportMode, set this to the value of the 'hive.server2.thrift.http.port' property in the Hive configuration file (hive-site.xml).
  - TransportMode: Set this to the 'hive.server2.transport.mode' value specified in your Hive configuration file (hive-site.xml).
  - AuthScheme: The authentication scheme used to authenticate with Hive:
    - Set to NOSASL, if the hive.server2.authentication property is set to NOSASL.
    - Set to LDAP, if the hive.server2.authentication property is set to LDAP.
    - Set to KERBEROS, if the hive.server2.authentication property is set to KERBEROS.
    - Set to PLAIN, if the hive.server2.authentication property is set to NONE (uses PLAIN SASL), PAM, or CUSTOM.
- If the Connection String Builder has an InitiateOAuth property, set it to OFF to avoid entering the OAuth authorization process.
  Note:
  This property may not appear for some connectors.
- Press Enter to add all the connection properties to the connection string.
  An example of the connection string:
  jdbc:hive:Server=127.0.0.1;Port=10000;TransportMode=BINARY
- Click Test Connection. A new browser tab opens where you need to log in to your application in order to grant access. (Each application will display a different window and messages.)
  Close the Authorization Successful! message that opens.
- Go back to the Connection String Builder dialog, and click OK in the Test Connection Successful message to close it.
- Click Copy to Clipboard to obtain the connection string.
For a short video of the process, see below (the video uses the XML driver as an example):
You need to follow the above instructions only the first time you connect, and again whenever your credentials for the application change.
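The property rules above (which port to use for each TransportMode, and which AuthScheme matches each hive.server2.authentication value) can be sketched in a few lines of Python. This is an illustration only; the Connection String Builder remains the documented way to produce the string. The function name and the `hive_site` dictionary (the relevant hive-site.xml properties, already read into memory) are hypothetical, and the `jdbc:hive:` prefix is taken from the example string on this page.

```python
# AuthScheme value for each hive.server2.authentication setting listed above.
AUTH_SCHEME_BY_HIVE_SETTING = {
    "NOSASL": "NOSASL",
    "LDAP": "LDAP",
    "KERBEROS": "KERBEROS",
    "NONE": "PLAIN",    # NONE uses PLAIN SASL
    "PAM": "PLAIN",
    "CUSTOM": "PLAIN",
}

# hive-site.xml property that holds the port for each transport mode.
PORT_PROPERTY_BY_MODE = {
    "BINARY": "hive.server2.thrift.port",
    "HTTP": "hive.server2.thrift.http.port",
}


def build_connection_string(server, hive_site, prefix="jdbc:hive:"):
    """Assemble a connection string from hive-site.xml values (hypothetical helper)."""
    mode = hive_site["hive.server2.transport.mode"].upper()
    if mode not in PORT_PROPERTY_BY_MODE:
        raise ValueError("Unsupported transport mode: " + mode)

    authentication = hive_site["hive.server2.authentication"].upper()
    if authentication not in AUTH_SCHEME_BY_HIVE_SETTING:
        raise ValueError("Unsupported authentication setting: " + authentication)

    properties = [
        ("Server", server),
        ("Port", hive_site[PORT_PROPERTY_BY_MODE[mode]]),
        ("TransportMode", mode),
        ("AuthScheme", AUTH_SCHEME_BY_HIVE_SETTING[authentication]),
    ]
    return prefix + ";".join(f"{name}={value}" for name, value in properties)


if __name__ == "__main__":
    example_site = {
        "hive.server2.transport.mode": "binary",
        "hive.server2.thrift.port": "10000",
        "hive.server2.authentication": "NONE",
    }
    print(build_connection_string("127.0.0.1", example_site))
```

Comparing the output with the string copied from the Connection String Builder is a simple way to confirm that the values in hive-site.xml were read correctly.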
Adding Hive Tables to your ElastiCube
- Open Sisense.
  For a non-local installation, open Sisense on the hosted cloud environment.
- In the Data page, open an ElastiCube or create a new ElastiCube.
- In the Model Editor, click the button for adding data. The Add Data dialog box is displayed.
- Click Generic JDBC to open the JDBC settings.
- In Connection String, enter the Hive URL. See Connecting to Hive for more information.
- In JDBC JARs Folder, enter the path of the directory where the Hive JAR file is located.
- In Driver's Class Name, enter the following class name:
  cdata.jdbc.apachehive.ApacheHiveDriver
- In User Name and Password, enter your Hive credentials. These fields are not required if the user name and password were provided in the connection string.
- Click Next. All tables and views associated with the database are listed in a new window.
- From the Tables list, select the relevant table or view you want to work with. You can click the icon next to the relevant table, or click Preview, to see a preview of the data inside it.
- (Optional) Click + to customize the data you want to import with SQL. See Importing Data with Custom Queries for more information.
- After you have selected all the relevant tables, click Done. The tables are added to your data model.
For a short video of the process, see below (the video uses the XML driver as an example):
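Before filling in the Generic JDBC dialog, it can help to collect the values in one place and confirm that the JARs folder really contains a driver JAR. The sketch below is a hypothetical pre-flight check, not a Sisense API: the function name and the returned structure are assumptions, while the field names and the driver class name come from the steps above.

```python
from pathlib import Path

# Driver class name given in the steps above.
DRIVER_CLASS_NAME = "cdata.jdbc.apachehive.ApacheHiveDriver"


def generic_jdbc_settings(connection_string, jars_folder, user_name="", password=""):
    """Collect the Generic JDBC field values and list any obvious problems."""
    folder = Path(jars_folder)
    jar_names = sorted(p.name for p in folder.glob("*.jar")) if folder.is_dir() else []

    problems = []
    if not connection_string.strip():
        problems.append("Connection String is empty.")
    if not jar_names:
        problems.append("No .jar file found in the JDBC JARs Folder.")

    settings = {
        "Connection String": connection_string,
        "JDBC JARs Folder": str(folder),
        "Driver's Class Name": DRIVER_CLASS_NAME,
        # Optional when the credentials are already part of the connection string.
        "User Name": user_name,
        "Password": password,
    }
    return settings, problems


if __name__ == "__main__":
    settings, problems = generic_jdbc_settings(
        "jdbc:hive:Server=127.0.0.1;Port=10000;TransportMode=BINARY",
        ".",  # replace with the folder that holds the Hive driver JAR
    )
    for message in problems:
        print("Check:", message)
```

An empty problem list only means that the values look complete; the connection itself is still verified by Sisense when you click Next.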
Additional Resources
For the full documentation set for the Hive connector, click here.
For connection string options, click here.
For information on the Hive data model, click here.