Show the top 3 regions by total sales in a stacked bar graph

Problem: I want to show total sales by multiple dimensions (i.e. region and shipping mode). However, I only want to display the top 3 region by total sales.

44%20PM

This isn’t possible if I use the Top K option since that will only show the top 3 total sales by region-shipping mode combinations.

13%20PM

Solution: This requires multi-pass SQL to achieve the desired result. First we must use an Analytic (Window) function to construct a rank based on total sales, save our rank function as a Derived Data field in our Visual, and then use our rank function to filter our visual to only show the Top 3 regions based on rank of total sales.

Step 1: Construct your subquery (Derived Data column) that will calculate and rank your regions by total sales.

The best way to get started is to use a table visual to create the data structure that be will used to reference your regions ranked by total sales.

To apply a RANK() function to your total sales, add sales again to the Measure shelf, click on the element, select the “Analytic Functions” dropdown and then select “Rank”:

56%20PM

In this case we want to rank sales across the entire result set without any partitioning (entities), and we also want to rank in ascending order by total sales, which means we can leave the basic configurations alone.

45%20PM

If you explore the code behind the rank function, you should see something like this:

31%20PM

After applying the rank function, you should now see total sales by region ranked in ascending order.

Step 2: Save your subquery expression as a Derived Data column

If you click on the “rank” element in your Measure shelf, and then click “Save Expression” you’ll be given an option to save this subquery to the Visual builder. Give your rank function a unique name to ensure there’s no reference name overlaps in your Dataset (i.e. total_sales_rank)

39%20PM

Now you should be able to see your Derived Data column in green on just below the Dimension shelf title:

50%20PM

Step 3: Clear your shelves, add your original dimensions (region, ship_code), your measure (sales), and filter by your Derived Data “rank” column.

Since your previous shelf elements were used to construct your subquery, its best to remove everything and give yourself a clean slate before we move onto your main query:

Once you clear your shelves, add back your dimensions and measures:

And finally add your “total_sales_rank” Derived Data column to filter shelf and create a custom expression to filter down to the top 3 ranked regions:

Now you should only see 3 regions in your query:

Step 4: Update your visual to use the original stacked bar chart

Now that you’ve got the query constructed properly, we can display the dimensions and measures like we originally intended through a Stacked Bar chart.

Hi!!

Is there any way in that you can show the top 5 regions on a trellis bar chart with their top 5 branches details?

Thanks!!

@Carlos_Bocanegra I would use “Dataset from SQL” which calculates the rank across both the region & branches.

Here’s an SQL:

WITH T1 as (SELECT TA_0.`product_sub_category` as `Product Sub Category`, TA_0.`region` as `Region`, sum((TA_0.`sales`)) as `sum(Sales)`, rank() over (partition by (TA_0.`region`) order by sum((TA_0.`sales`)) desc) as `rank_product_in_region`, sum(sum((TA_0.`sales`))) over (partition by (TA_0.`region`)) as `region_sales`
  FROM `robb`.`Superstore_LV` TA_0 
  GROUP BY 1, 2
  ORDER BY 5 DESC, 4 ASC
 )
 SELECT 
 T1.`Product Sub Category`,
 T1.`Region`,
 T1.`rank_product_in_region`,
 dense_rank() over (order by sum(T1.`region_sales`) desc) as `region_rank`
 FROM T1
 GROUP BY 1,2,3
 ORDER BY 4 ASC, 3 ASC

You can use the SQL visual type to calculate the results:

Then create a “Dataset from SQL” as shown here: Creating a Dataset based on UNION of tables using custom SQL query

Then build your visual & filter accordingly.

1 Like