Creating Accurate Percentile Measures in DAX – Part II

June 20, 2012 at 9:11 pm

I am working with products that have many tied ranks and are broken down into about 100 different sets (we call them departments). Since so many products have tied ranks, instead of using the COUNTROWS() function in calculating the 25thPctRank_INC I am thinking it would be wiser for me to simply find the max value from the rank column, within each item’s respective set.

I receive the following error message when I replace COUNTROWS() with MAX: “Cannot find column ‘Rank’ in table ‘FactSales’.”

Here is what I currently have as my formula:

25thPctRank_INC
=CALCULATE(
( MAX( FactSales[Rank] ) - 1 ) * 25 / 100 + 1,
ALL( DimItem[SKU] )
)

How can I replace the portion of your formula for 25thPctRank_INC that counts the rows in a table listing all products with a formula that simply returns the max Rank value calculated for each product, while still respecting each product’s set/department?

Any help you could offer, or simply a link to additional resources to get me going in the right direction would be much appreciated.

-Kyle

June 21, 2012 at 8:46 am

Kyle,

Ties don’t make a difference to the results. However, there’s an error in one of my calculations, which is discussed here: https://powerpivotpro.com/2012/05/percentile-measures-in-dax-errata/

[Rank] is not a column in the fact table – it is a measure. That’s why you’re getting the error message. In my sample dataset, I do show a rank column, but that’s there for illustrative purposes only – it’s not part of the “fact” table used in PowerPivot.
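As a rough illustration of the point above (a sketch, not part of the original post): MAX() with a single argument expects a column reference, so the largest value of a measure has to come from an iterator such as MAXX(), which evaluates the measure once per row of a table. The measure name MaxRank and the choice of ALL( DimItem[SKU] ) as the table to iterate are assumptions; in practice the table would probably need to be limited to the item’s own set.

```dax
-- Hypothetical measure: the largest value of the [Rank] measure across SKUs.
MaxRank =
MAXX(
    ALL( DimItem[SKU] ),  -- table to iterate: every SKU, ignoring the current SKU filter
    [Rank]                -- measure evaluated once per iterated SKU
)
```

This only shows the syntax difference between MAX() and MAXX(). Whether the max rank is a suitable input for the percentile calculation is a separate question.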

Let me know if I can help further.

June 21, 2012 at 5:40 pm

Colin,

Thanks for your help, and for this excellent series of posts on percentile ranking as well.

What I am trying to create is a measure that calculates each product’s rank (by units sold right now… eventually I’ll want to do the same to rank by gross margin earned, total sales $, and # of sale transactions) within its set, using a moving annual total of units sold. I intend to use this measure to ultimately illustrate the distribution of our investments in inventory over time (% of $ invested in “A” items, “B” items, etc. by week, for example).

I am, however, getting different Rank values than you because I have included the optional “Dense” argument in my RANKX() function. This way, after each tie, the next rank value is [previous rank] + 1. For example, if 5 values are tied with a rank of 8, the next value will receive a rank of 9 (instead of 13). I could be wrong, but I think that this is why the COUNTROWS() function in your 25thPctRank_INC formula is giving me some trouble.

Broken down, here are my formulas:

Moving Annual Total Sales Units (“MATSalesUnits”) =
CALCULATE(
    SUM( FactSales[UnitsSold] ),
    DATESBETWEEN(
        DimDate[DateKey],
        NEXTDAY( SAMEPERIODLASTYEAR( LASTDATE( DimDate[DateKey] ) ) ),
        LASTDATE( DimDate[DateKey] )
    )
)

My MATSalesUnits formula seems to be behaving properly. Now onto my formula for Rank:

Rank =
RANKX(
    CALCULATETABLE( DimItem, ALLEXCEPT( DimItem, DimItem[Set] ) ),
    [MATSalesUnits], , 1, Dense
)

I think my Rank measure behaves correctly as well. When I show all SKUs and this measure in a PowerPivot table and slice by any value in my date table, I am getting each product’s rank within its respective department using moving annual sales units totals.

What I want to show next for each item is which percentile band (25th, 50th, 75th, etc.) of its department (set) its rank falls in. My logic is to find the rank that corresponds to each breakpoint from the highest rank value in each set: if the item’s rank is equal to or less than 0.25 * the max rank within the set, then the product falls in the lowest percentile bucket (a “D” item).

This (I believe) is where my process breaks down. How can I find the max rank for each item’s set and use that value in a formula to calculate the set’s 25thPctRank_INC? COUNTROWS() is yielding inaccurate breakpoints for me because I am using the optional “Dense” argument in my RANKX() function, which I feel is better suited to my project.

Thanks again for taking the time to post excellent material. This blog has been a tremendous resource to me over the last year.

Kyle

June 21, 2012 at 10:51 pm

If you need to show a dense rank measure, it should be calculated separately from the rank measure used for the percentile calculation. The dense option cannot be used in the rank measure for calculating percentiles – you’re guaranteed to get incorrect results.
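A minimal sketch of keeping the two ranks separate, reusing the table and measure names from Kyle’s formulas above. The measure names RankForPercentile and RankDense are hypothetical and do not come from the original post. The only difference between them is the last RANKX() argument: with Skip, five items tied at rank 8 are followed by rank 13, so rank values keep pace with the number of rows; with Dense, the next rank is 9, so they no longer do.

```dax
-- Hypothetical measure: rank used only as input to the percentile calculation.
-- Skip (the default tie handling) leaves gaps after ties.
RankForPercentile =
RANKX(
    CALCULATETABLE( DimItem, ALLEXCEPT( DimItem, DimItem[Set] ) ),
    [MATSalesUnits], , 1, Skip
)
```

```dax
-- Hypothetical measure: rank used only for display.
-- Dense assigns consecutive rank values after ties.
RankDense =
RANKX(
    CALCULATETABLE( DimItem, ALLEXCEPT( DimItem, DimItem[Set] ) ),
    [MATSalesUnits], , 1, Dense
)
```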

June 22, 2012 at 1:30 pm

Thanks, Colin. I see that now.

Is there any way to bake the context of each product’s “set” into the rank measure itself? The Rank values change once I remove the Set dimension from the powerpivot table. I’d like for the measure to return the same values regardless of whether the “set” dimension is present in the query.

Sorry for all the questions, but this is a REALLY big help.
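One pattern that may help here, offered as a sketch under assumptions rather than as the author’s method: ALLEXCEPT() only preserves filters placed directly on DimItem[Set], so once Set is removed from the PivotTable there is nothing left to preserve. Replacing it with ALL( DimItem ) plus VALUES( DimItem[Set] ) reads the set(s) of the items currently visible and reapplies them as a filter. The measure name RankInSet is hypothetical, and the sketch assumes each SKU belongs to exactly one set.

```dax
-- Hypothetical measure: rank within the item's own set,
-- whether or not Set appears in the PivotTable.
RankInSet =
RANKX(
    CALCULATETABLE(
        DimItem,
        ALL( DimItem ),          -- remove every filter on DimItem
        VALUES( DimItem[Set] )   -- reapply the set(s) visible in the current context
    ),
    [MATSalesUnits], , 1, Skip
)
```

At a grand total, or wherever several sets are visible at once, VALUES( DimItem[Set] ) returns all of them, so the ranking would then span those sets together.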

June 22, 2012 at 7:28 pm

If I understand you correctly, you want to maintain the set rankings in the PivotTable even though the set dimension is not present to provide context for these rankings?

June 22, 2012 at 7:36 pm

Hi Colin,

Yes, that’s what I am trying to do.

Thanks.

Kyle

June 23, 2012 at 10:21 am

Without the “set” dimension to add context to the rankings, how would the user know what the rankings relate to, when the rankings are displayed in the PivotTable?

June 23, 2012 at 1:07 pm

The user won’t actually see the rank for each item individually. I really just want to show, for example, the value of our store’s inventory at replacement cost over time, with items grouped by their percentile rank. I’m trying to graphically answer questions such as “What percentage of our investments were in ‘A’ items in March? How does that compare to the following month?”

I guess you could say that I am trying to generate a fact table containing each product’s percentile rank for each day. If I were to literally generate a fact table and store each product’s rank for 365 dates, my .xlsx file would be enormous (I have about 65,000 products). I am hoping to be able to avoid having to create such a table by using a measure, but perhaps it is not actually possible to do what I am after.

Thanks.

June 24, 2012 at 2:46 pm

Measures are entirely dependent on data from the source, so they cannot store historical values. It’s a typical function of a data warehouse to store historical inventory balances at a frequency dependent on the grain required for analysis. In the absence of a formal data warehouse, you will need some tool that can copy data from your transactional database into a historical data store. Even so, trying to analyze the ranks for 65K products every day of the year would be quite overwhelming. What action would you take if, for example, a product ranked 60000th today, 60001st tomorrow, and 59999th the day after? It would be easier if you could analyze your product ranks monthly, and easier still if, in addition, you could focus on your top x and bottom x products. I understand, though, that in your scenario neither of these options might be possible.

September 5, 2012 at 6:10 am

I don’t know. Imagine that, in an online shop, all the laser printers sit past the 15,000th position in the ranking. The owners might want to know that when deciding whether to keep offering them.

May 3, 2013 at 5:00 pm

Hi, I realize this is a fairly old article, but hopefully someone can help. I am looking to do something similar to the above, providing a rank when an additional dimension level is added (i.e. “Set” above), but I also want to be able to provide the rank of the individual row against the entire population (at a certain grouping level).

July 26, 2013 at 1:32 pm

You saved my life with that rankX(Calculatetable(….))) trick. I have been researching how to do this for 2 days!

Thanks 🙂

May 27, 2014 at 10:58 pm

Thank you so much for the rankx(calculatetable(…))) formula. Just what I needed to finish a project. Thank you, thank you!

January 7, 2015 at 5:05 am

Thank you for the great tutorials. Just one thing I want to ask:

“5. After you create the chart from the template, the whiskers will not appear. This occurs because the whiskers are created using error bars, and external ranges are used to specify the data for the error bars. Unlike the rest of the chart, the template doesn’t know which ranges to use for error bars. You must add these manually, using the technique detailed in Jon’s tutorial.”

I cannot find a way to add the whiskers to the chart. I read Jon’s tutorial and I can make a box plot in normal Excel. But in PowerPivot Chart I cannot make Error Bars recognize the measures.

Specifically, I click on [1-25th percentile] series, add Error Bars —> More Options, check on Minus, check on Cap, choose Custom —> Specify Value, here I get stuck and cannot make the chart understand the [4 – Minimum] measure.

I downloaded your file, but it’s a .crtx file and I cannot open it.

Could you please help on this? Thanks a lot.

January 7, 2015 at 1:14 pm

How are you opening the .crtx file? You must place the file in the Templates folder, and access the template from the Templates tab in the Insert Chart dialog box. To see the path to the Templates folder (where you put the file), open the Insert Chart dialog box (click the dialog launcher button on the bottom right of the Charts group in the Ribbon’s Insert tab). In the Insert Chart dialog box, select the Templates tab. Next, click the Manage Templates button. The file dialog box that appears points to the Templates folder.

It’s hard to say why the custom value doesn’t work. Make sure to enter 0 for the Positive Error Value, and to clear the Negative Error Value entry before pointing to a worksheet cell (otherwise the default value {1} will be added to the value you point to and generate an error when you click OK).

December 21, 2016 at 4:47 am

Hi all,

Can anyone tell me how to download the file? I am stuck trying to create the box plot through the pivot chart.
When I click download, it asks me for a login, but there is no section for creating an account, so I don’t know how to log into the website.

Cheers,
Manuel
