Guest Post
- Author – Saurabh Gupta
In this post we will go through ROLLUP function which is one of the Extended grouping function.
ROLLUP is used to aggregate data along all the levels of hierarchy within a single dimension.
For example –
In time dimension data can be aggregated along different levels like day, week, month and year. So if you need to display sale of a store for a particular day, month and year ( with grand totals ) , then you can first aggregate data on day and then roll up that data to month and then finally to year. In this case year is the highest level of roll up.
Let’s consider an example of one level of hierarchy:
Below is the STORE_QTY table which stores quantity of products sold on a particular day, where products are identified by PRODUCT_ID and Stores are identified by STORE_ID.
CREATE VOLATILE TABLE STORE_QTY
( STORE_ID INTEGER, SOLD_DATE DATE, PRODUCT_ID INTEGER, QUANTITY INTEGER ) ON COMMIT PRESERVE ROWS;If you want to calculate the number of products of same kind sold from the table, along with the total number of products sold, then you can use a ROLLUP function as :
SELECT PRODUCT_ID ,SUM(QUANTITY) AS TOTAL_QUANTITY FROM STORE_QTY GROUP BY ROLLUP (PRODUCT_ID) ORDER BY 1;Here along with all Product_Id you get a row which has ? in PRODUCT_ID column.
This ? Isn’t null but instead indicates the grand total of all the product_id.
To separate actual null data and grand totals you can use a GROUPING function:
SELECT CASE GROUPING (PRODUCT_ID) WHEN 1 THEN ‘TOTAL’ ELSE PRODUCT_ID END PRODUCT_ID , SUM(QUANTITY) AS TOTAL_QUANTITY FROM STORE_QTY GROUP BY ROLLUP (PRODUCT_ID) ORDER BY 1;Now if you see by using GROUPING function we have indicated grand total with Total
in above output instead of ?
Let’s consider another example where you have to ROLLUP data on two levels:
You have a store which sells different products and you want data to be displayed as:
- Number of products of the same kind sold in a particular store
- Number of all the products sold within a store
- Finally Number of products sold by all the stores
So basically here you need a ROLLUP first on PRODUCT_ID and then on STORE_ID.
SELECT STORE_ID, PRODUCT_ID , SUM(QUANTITY) AS TOTAL_QUANTITY FROM STORE_QTY GROUP BY ROLLUP (STORE_ID,PRODUCT_ID) ORDER BY 1 , 2;If you see the above output shows :
- Number of individual products sold in a particular store from rows 3 to 5 for store 100 and rows 7 to 8 for store 101.
- Number of all the products sold (indicated by ? In PRODUCT_ID column, you can use GROUPING to indicate Total Products ) in store 100 is shown in row 2 ( sum of row 3 to 5 ) whereas for store 101 it is shown in 6 ( sum of rows 7 to 8 ).
- Grand total of all the products sold across all the stores is indicated in row 1.
If you reverse the order of STORE_ID and PRODUCT_ID in your query then it will give you data firstly rolled up on STORE_ID and then on PRODUCT_ID
SELECT PRODUCT_ID , STORE_ID ,SUM (QUANTITY) AS TOTAL_QUANTITY FROM STORE_QTY GROUP BY ROLLUP (PRODUCT_ID , STORE_ID) ORDER BY 1 , 2;To summarize you can use ROLLUP function where you need data to be aggregated along all the levels of hierarchy.
Feel free to write your thoughts/queries/doubts about the post in the comment section.
If you like the post do share it on your social networking sites with the below share button 🙂
2 pings