Datasets in QuickSight run either in SPICE or Direct Query mode. The query mode can be seen in the bottom left corner, with SPICE and Direct Query as options. You can switch between these two modes as needed. Additionally, there is a Refresh Now option, which fully refreshes the data source and updates the dashboard with any new records.
Spice
SPICE (Super-fast Parallel In-memory Calculation Engine) is a high-performance, in-memory engine that uses columnar storage technologies to extract data from tables of data sources defined in the dataset, on a scheduled basis, and store it in memory. Additionally, SPICE datasets can handle any number of requests made through the QuickSight dashboard, as the engine automatically scales internally to ensure a better user experience.
QuickSight manages the entire workload without placing a strain on the underlying data source. Therefore, it makes no difference if 10 users or 10,000 users access a dataset through one or more dashboards. SPICE layer in Quicksight was developed with performance in mind and offers lightning-fast dashboard loads. It sits between the user interface and the data source and is specifically designed for analytical queries in Quicksight.
Importing data directly from the S3 data source or uploading files such as CSV or XLS, are not allowed for querying the required field to create a new dataset for analysis, instead, all of the data will be moved to spice.
Spice Schedule
Spice datasets can be scheduled to refresh from once every 15 minutes to a year (15 minutes, 30 minutes, 1 hour, week, month, year). During the refresh schedule, the data from the source is moved to spice to hold the updated records which then produce the data on the dashboard.
Spice Capacity
Each AWS Region has a specific allocation of SPICE capacity. All QuickSight users within a particular AWS Region share the SPICE capacity for that AWS account.
QuickSight administrators can view the available SPICE capacity in each AWS Region and monitor how much of it is being utilized. Additionally, administrators can add more SPICE capacity or release unused capacity as needed.
For the Enterprise Edition, a SPICE dataset can store up to 500 million rows or 500 GB, while for the Standard Edition, it can hold up to 25 million rows or 25 GB.
SPICE supports data ingestion from both queries and files. When ingesting data via files:
- Up to 2,000 columns can be included.
- Each column name can have up to 127 Unicode characters.
- A manifest for importing from Amazon S3 can reference up to 1,000 files.
Features of Spice
- Highly scalable as it can enhance resources internally based on the number of requests.
- Data in spice is highly available even if the data source failed to ingest data.
- Super fast dashboard loads because the data is stored in the memory itself.
- Incremental or full data source refresh can be set for spice datasets to have updated records on schedule.
Direct query
When using the Direct Query option, queries are executed against the backend data source each time a dashboard or analysis utilizing the dataset is loaded.
The primary advantage of using Direct Query is that it eliminates the need to schedule a refresh for new records. The dashboard automatically retrieves all the latest records from the data source whenever users view it.
QuickSight also optimizes performance by generating smart queries in Direct Query mode. It joins only the necessary tables rather than querying all tables defined in the dataset. Additionally, you can specify unique keys for tables when defining joins, enabling QuickSight to determine whether a table can be excluded from the query if no fields from that table are being fetched.
Custom SQL Query:
You can insert your query as Custom SQL in the dataset, for example, if the dataset already pulls data from 25 tables and you want to avoid remapping every relationship in the UI. (Joining datasets through the user interface in QuickSight can be done by selecting Add Data in the QuickSight Data Prep.)
Custom SQL can also be combined with other tables in the dataset and functions as a regular table within it. This approach is highly flexible, allowing you to write complex queries, such as sub-queries and datasets with multiple joins.