Choosing the Right Data Model

Customers often run into the question of which type of data model they should use, ElastiCube (see Introducing ElastiCubes), Live (see Introducing Live Models), or B2D (see Build to Destination). This topic covers some of the key aspects to consider when choosing between these options.

Note:

The common use cases listed below are generic guidelines. Sisense recommends that you discuss your needs and requirements with Sisense's technical team to help ensure your business expectations are met.

Common Use Cases

This section covers common use cases and some of the most relevant factors to consider when choosing between the Live, ElastiCube, B2D, and "hybrid" types of data models.

Live

Good use cases for Live include: trading and stock market analysis, banking, ecommerce and high-volume sales analytics, and call center and customer service analytics.

Use Live when:

Data models have more than 1 billion records. (ElastiCube data models are limited to 1 billion records.)
Frequent data updates are required and can be supported by the pace of the customer ETL schedule.
Data is sourced from a cloud data warehouse (CDW), updated frequently and/or in real time, and the query cost is low, (examples: Redshift, BigQuery).
Data is already structured neatly and extensive transformation is not required, (there are no custom tables in Live).
Data source must be adjusted according to the user or group that is accessing the analytics. (Use the Dynamic ElastiCubes plugin to support this.)
There is a preference to work with the dialect / syntax of the underlying data source.
There are lots of tenant specific models. With Live you can leverage dynamic live connections, (see Managing Live Dynamic Connections) and do not need to worry about build scheduling.
System resources are constrained, (CPU, RAM).

ElastiCube

Good candidates for ElastiCube include analytics for the entertainment, healthcare, education, and marketing industries.

Use ElastiCube when:

There isn't a CDW and there are multiple data sources, (that is, a “data mashup” is required).
Data transformations and cleansing are required, but there isn’t already a data preparation solution in your data stack.
There is a CDW, but there are significant fees to the CDW vendor based on the volume of queries.
The dataset is growing, but it is currently less than 1 billion records per data model. When the amount of data grows to exceed 1 billion records for any single data model, then move to B2D or the Live model.
Data refresh cadence is infrequent, or tied to an ETL pipeline.
The table build option Append, or the table build option Accumulate By (incremental), is required. (See Defining ElastiCube Build Settings.)

Build to Destination

Benchmarking is a good use case for B2D.

Move to B2D when:

Upsert is required. For more information, see Build to Destination Benefits and Considerations.
Existing ElastiCube holds a very large amount of data.
Data preparation is required (by a Sisense end user) on top of an existing CDW.
Customer has access to Amazon S3 storage and a CDW (like Snowflake). Note that B2D works like a Live model, with all queries going through the CDW - which increases the costs associated with the CDW and Amazon S3 storage.

Hybrid

Use "Hybrid" (combination of ElastiCube and Live) when:

Dashboards must include both historical and current data. In many cases the ElastiCube data model is used to provide historical data while the Live model is used to provide the current data.
The ElastiCube is for a data set with varied slices (at a high level of granularity), while the Live data model supports visuals for deep-dive transactional level or historical (archived) insights.
A single dashboard includes both widgets that require real-time data, and other widgets that do not.

Important Considerations

In addition to the use cases found above, there are other relevant factors to consider. The following side-by-side summary table lists and describes many of the most important characteristics of the ElastiCube, Live, and B2D data model types.

	ElastiCube	Live	B2D	Considerations
Data Refresh Rate Determines the age of the data in your dashboards	Determined by the frequency of the ElastiCube builds	Determined by the customer update frequency	Determined by the frequency of the B2D builds on the cloud data warehouse (CDW)	How frequently do I need to pull new data from the source database?
Data Transformation Options Includes custom columns, custom tables, importing from multiple data sources, and custom code	Extensive transformations are supported on the ElastiCube, (custom columns, custom tables, importing from multiple data sources, custom code)	Limited - transformations must be run on the cloud database	Extensive transformations are supported on the ElastiCube, (custom columns, custom tables, importing from multiple data sources, custom code) Custom transformations can also be performed using Sisense SQL or the destination SQL.	Does my data require enriching or pre-conditioning? Can my data be pre-aggregated? Note: ElastiCubes should NOT be treated as an alternative to an ETL tool or a CDW, they are meant to be used in parallel. Operations should be completed in source, and imported in an ElastiCube when the data is ready for reporting.
Load on the Cloud Data Warehouse Any application that increases the load on the CDW should be examined closely	Not applicable during queries. During build only if the CDW is the data source.	Constant queries to the CDW on an ongoing basis	Constant queries to the CDW on an ongoing basis. Build should run on a small resource.	Does the analytical system stress my CDW? Can the query load be avoided by using a database replica?
Database Availability The level of availability is a critical factor for your analytical system	Data sources must be connected to the ElastiCube during build	Crucial – Live data model must be accessible at all times	Crucial – Data model must be accessible at all times	How frequently are the data sources offline? How critical is my analytical system? Is being offline (or showing out-of-date information) acceptable?
Additional Vendor Costs Various database vendors use different charging models	Data is queried once from the data sources per ElastiCube build. No additional vendor costs for users to access the ElastiCube after the build.	Database vendors may charge for data queries to the CDW	Database vendors may charge for data queries to the CDW. Build can run on small resource to reduce costs.	How many users are using my dashboards? What is my build frequency? Which data model will result in lower costs? What is the tipping point? Is it worthwhile to pay more for better performance?
Database Size Limitations Sisense ElastiCubes and CDWs have different limitations	Support for up to 1 billion records in a single ElastiCube	Limitations are according to the CDW vendor	Limitations are according to the CDW vendor	What is the amount of data I need in my data model? How much historical data do I need to store? Can I reduce the amount of data? (For example, by trimming historical data, reducing the number of columns, and so on.)
Query Performance Depends on the underlying work required to fetch and process data	Query performance depends on the hardware and the complexity of the data. Good performance is provided by multiple layers of in-memory caching.	Query performance depends on the CDW size	Query performance depends on the CDW size	How sensitive is the client to delays in receiving their query results? How much latency is acceptable?
Available Connector Types	Wide range of connectors, both native and generic JDBC (see CData)	N/A	Wide range of connectors, both native and generic JDBC (see CData)	Sisense supports hundreds of data connectors, (see Data Connectors).
Caching Optimization The caching options differ for ElastiCubes and CDWs	Sisense supports results based caching and reusing subqueries caching (Recycler)	Results based caching in Sisense	Results based caching in Sisense. Optional use of Sisense Acceleration.	Do I want to leverage Sisense's query caching? How long do I want the caching to store data?
Dashboard Design Limitations	Supports all widget types	Supports all widget types	Supports all widget types	All widget types are supported, but not all formulas and functions, (such as mode, standard deviation, R formulas)

Choosing the Right Data Model

Common Use Cases

Live

ElastiCube

Build to Destination

Hybrid

Important Considerations

Data Refresh Rate

Data Transformation Options

Load on the Cloud Data Warehouse

Database Availability

Additional Vendor Costs

Database Size Limitations

Query Performance

Available Connector Types

Caching Optimization

Dashboard Design Limitations