What Are Full and Incremental Refresh in QuickSight SPICE?

QuickSight SPICE lets us schedule data refreshes in two ways: incremental refresh and full refresh. Which one to use depends on how the source data changes and on the business requirements.

QuickSight offers two modes for creating datasets: Direct Query and SPICE. In Direct Query mode, data is fetched directly from the data source in real time whenever users open the dashboards, eliminating the need for scheduling. SPICE, on the other hand, is a high-performance engine that requires periodic scheduling to refresh and update its data from sources like RDS, Athena, SQL Server, and others. The “Add New Schedule” option enables users to configure these periodic refreshes to keep the SPICE engine data up-to-date.

Incremental Refresh

Incremental refresh updates only a portion of the dataset, selected by a date field configured on the dataset. It suits data whose historical records change only within a recent, limited period, rather than anywhere in the dataset.

Time zone

The time zone option lets you select the time zone for the schedule. The refresh starts at the start time set for the dataset, interpreted in the selected time zone. Timestamp data in the dataset itself is treated as UTC.

Start time

The start time is the date and time at which the refresh schedule takes effect, interpreted in the selected time zone. The first incremental refresh runs at this time, and later refreshes follow at the configured frequency. The start time does not decide which records are retrieved; that is controlled by the Date column and window size settings covered under Configure incremental refresh.

For instance, if your dataset has a column accounting_date that is configured as the Date column, and you set the start time to 8:00 AM, the first refresh runs at 8:00 AM and retrieves the records whose accounting_date falls inside the configured window.

Frequency

The frequency option sets how often the incremental refresh runs on the scheduled dataset. The available options are every 15 minutes, every 30 minutes, hourly, daily, weekly, and monthly.
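The three schedule settings above (time zone, start time, frequency) can be sketched as a single payload. This is a minimal sketch, not a definitive implementation: build_incremental_schedule is a hypothetical helper, and the key and interval names are assumed to mirror the QuickSight CreateRefreshSchedule API, so check them against the current API reference before use.

```python
from datetime import datetime

# Assumption: these names mirror the interval enum of the QuickSight
# CreateRefreshSchedule API. Verify against the current API reference.
INCREMENTAL_INTERVALS = {
    "MINUTE15", "MINUTE30", "HOURLY", "DAILY", "WEEKLY", "MONTHLY",
}


def build_incremental_schedule(schedule_id, start_after, timezone, interval):
    """Build a schedule payload for an incremental refresh (hypothetical helper)."""
    if interval not in INCREMENTAL_INTERVALS:
        raise ValueError(f"unsupported interval: {interval}")
    return {
        "ScheduleId": schedule_id,
        "RefreshType": "INCREMENTAL_REFRESH",
        # Start time: when the schedule takes effect.
        "StartAfterDateTime": start_after,
        "ScheduleFrequency": {
            "Interval": interval,   # Frequency setting
            "Timezone": timezone,   # Time zone setting
        },
    }


# Illustrative values only; the ID, date, and time zone are made up.
schedule = build_incremental_schedule(
    schedule_id="sales-incremental",
    start_after=datetime(2030, 1, 1, 8, 0),
    timezone="Asia/Kolkata",
    interval="MINUTE15",
)
```

The payload is built as a plain dictionary so it can be inspected or validated before being handed to whatever client actually creates the schedule.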

Configure incremental refresh

Incremental refresh uses the Date column, Window size (number), and Window size (unit) settings to determine which portion of the dataset is refreshed each time the refresh runs.

Date column

The Date column option lists all the date-type fields in the dataset associated with the refresh configuration. From these fields, we must select the one that the incremental refresh will use to identify which records to refresh.

Window size (number)

The Window size (number) is specified as a numerical value that indicates the number of hours, days, or weeks.

For example, if you set the number to 2 and select “days” in the Window size unit, it refreshes data for the last 2 days, i.e., yesterday and the day before yesterday.

The window size considers the two most recent full days before the current day. Similarly, for the “weeks” and “hours” window size units, it considers the most recent full weeks or full hours, excluding the current week and current hour.
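Taking the description above at face value, the window boundaries can be worked out with a little date arithmetic. The sketch below models this article's reading of the window (complete units only, current unit excluded), not a verified specification of QuickSight internals. lookback_window is a hypothetical helper, and treating Monday as the first day of the week is an assumption.

```python
from datetime import datetime, timedelta


def lookback_window(now, size, unit):
    """Return (start, end) covering the `size` most recent full units before `now`.

    `end` is exclusive: it is the start of the current, still incomplete unit.
    """
    if size < 1:
        raise ValueError("size must be a positive integer")
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hours":
        end = now.replace(minute=0, second=0, microsecond=0)
        step = timedelta(hours=1)
    elif unit == "days":
        end = midnight
        step = timedelta(days=1)
    elif unit == "weeks":
        # Assumption: a week starts on Monday (weekday() == 0).
        end = midnight - timedelta(days=now.weekday())
        step = timedelta(weeks=1)
    else:
        raise ValueError(f"unsupported unit: {unit}")
    return end - size * step, end


now = datetime(2024, 5, 15, 10, 30)  # a Wednesday
start, end = lookback_window(now, 2, "days")
print(start, "to", end)  # 2024-05-13 00:00:00 to 2024-05-15 00:00:00
```

With a window of 2 days and a refresh on May 15, the window covers May 13 and May 14 in full and leaves out the partial current day.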

Window size (unit)

The Window size (unit) option lets us choose hours, days, or weeks as the unit. Combined with the value in Window size (number), it defines the range to refresh, such as the last 4 hours, last 2 days, or last 3 weeks.
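For readers who configure datasets programmatically, the Date column and the two window size settings can be collected into one structure. This is a sketch under assumptions: build_lookback_properties is a hypothetical helper, the nested key names are assumed to match the QuickSight PutDataSetRefreshProperties API, and the units are the hours, days, and weeks named in the Window size (number) section.

```python
# Assumption: the API spells the units as singular, upper-case enum values.
SIZE_UNITS = {"hours": "HOUR", "days": "DAY", "weeks": "WEEK"}


def build_lookback_properties(date_column, size, unit):
    """Build the incremental refresh settings for a dataset (hypothetical helper)."""
    if unit not in SIZE_UNITS:
        raise ValueError(f"unit must be one of {sorted(SIZE_UNITS)}")
    if not isinstance(size, int) or size < 1:
        raise ValueError("size must be a positive integer")
    return {
        "RefreshConfiguration": {
            "IncrementalRefresh": {
                "LookbackWindow": {
                    "ColumnName": date_column,     # Date column
                    "Size": size,                  # Window size (number)
                    "SizeUnit": SIZE_UNITS[unit],  # Window size (unit)
                }
            }
        }
    }


props = build_lookback_properties("accounting_date", 2, "days")
window = props["RefreshConfiguration"]["IncrementalRefresh"]["LookbackWindow"]
print(window)
```

Rejecting unknown units up front keeps a typo from surfacing later as a failed refresh configuration.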

Full Refresh

Full refresh reloads the entire dataset rather than a portion of it. It is the right choice when records can change long after they were created, even months or years later, because every refresh picks up all changes regardless of how old the affected records are.

Time zone

The time zone in full refresh is similar to that in incremental refresh. Users can set their preferred time zone to define the refresh time accordingly.

Start time

The start time is the scheduled time of the first refresh. After that, the refresh repeats at the frequency set for the dataset.

Frequency

The frequency for a full refresh can be set to hourly, daily, weekly, or monthly.
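The two refresh types accept different sets of frequencies, which is easy to get wrong when schedules are defined in scripts or configuration files. The sketch below encodes the options listed in this article in a lookup table. The labels and the is_valid_frequency helper are illustrative, not QuickSight identifiers.

```python
# Frequencies per refresh type, as listed in this article.
ALLOWED_FREQUENCIES = {
    "incremental": ("15 minutes", "30 minutes", "hourly", "daily", "weekly", "monthly"),
    "full": ("hourly", "daily", "weekly", "monthly"),
}


def is_valid_frequency(refresh_type, frequency):
    """Return True if `frequency` is offered for `refresh_type` (hypothetical helper)."""
    if refresh_type not in ALLOWED_FREQUENCIES:
        raise ValueError(f"unknown refresh type: {refresh_type}")
    return frequency in ALLOWED_FREQUENCIES[refresh_type]


print(is_valid_frequency("incremental", "15 minutes"))  # True
print(is_valid_frequency("full", "15 minutes"))         # False
```

The sub-hourly options are available only for incremental refresh, so a full refresh cannot run more often than hourly.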

Refresh now

The Refresh Now option performs a full refresh of the dataset immediately. It does not create a schedule, so the dataset is not refreshed periodically afterward.
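A one-off refresh can also be described as a single request with no schedule attached. The sketch below only builds the request parameters. build_refresh_now_request is a hypothetical helper, the account and dataset IDs are placeholders, and the parameter names are assumed to match the QuickSight CreateIngestion API, so confirm them before passing the result to a real client.

```python
import uuid


def build_refresh_now_request(account_id, dataset_id, ingestion_id=None):
    """Build parameters for a one-off full refresh (hypothetical helper)."""
    return {
        "AwsAccountId": account_id,
        "DataSetId": dataset_id,
        # Each ingestion needs its own ID; generate one if none is supplied.
        "IngestionId": ingestion_id or f"manual-{uuid.uuid4()}",
        "IngestionType": "FULL_REFRESH",
    }


# Placeholder IDs for illustration only.
request = build_refresh_now_request("111122223333", "sales-dataset")
print(request["IngestionType"])  # FULL_REFRESH
```

Nothing in the request refers to a frequency or start time, which reflects the point above: the refresh runs once and leaves no schedule behind.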
