Guest Post: Nth Occurrence DAX Formula

David Hager is At It Again

A short two weeks after his truly creative Measure Catalog Macro post, David Hager is back with a heavyweight DAX post.

This time he is exploring an area of the product that I have very little experience with, namely the EARLIER() function.  On the DAX spicy scale, EARLIER() is a 5 in my opinion, but the other reason I haven’t done much with it is because I’m pretty sure it’s only useful in calc columns.  At Pivotstream we do all of our calc columns in SQL (for several reasons I won’t go into here).

But not everyone has colleagues ready to write calc columns in SQL for them now do they?  In fact, the Great PowerPivot Survey is reinforcing that for me (I’m going to wait until next week before I start summarizing results, but it sure has been interesting so far).

OK David, take it away.

CREATING AN Nth OCCURRENCE DAX FORMULA

By David Hager

CLICK HERE TO DOWNLOAD THE WORKBOOK IN THIS POST

When you have a growing database of customer records (such as purchases at a store), it is useful to have a method for tracking the previous visit for each customer.

A formula of the following type can be used for this.

=CALCULATE(MAX(Table1[DATE]),FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE]))

(See http://gavinrussell.wordpress.com/2010/09/23/powerpivot-dax-session-notes-2-previous-row/ for the original idea behind this post).

image

However, note that the previous visit date is returned for every customer visit (except the initial one) for each customer. In some cases, this may be the desired result, but usually only the last customer visit record would add the previous visit as useful information.

While trying to make a formula that would do this, I thought that it would also be nice if I could expand that concept to find an Nth previous visit. Since I did not want to hard-code an Nth value that would have to be updated manually in the PowerPivot cube, I had to invent a method for creating a dynamic constant.

In order to create a dynamic constant for PowerPivot (which would simply be an updatable cell value in Excel), the first step is to create an Excel table. The table consists of a Date field which can be used to create a relationship with the fact table Date field and a constant field. In each row of the constant field, the same formula needs to point to the update cell (=$D$2).

image

Now, the table can be linked to PowerPivot, using the Create Linked Table feature.

Note from Rob:  Here’s another great example of a feature that I never use (Linked Table), because that linkage doesn’t refresh on the server.  But it has tremendous utility in desktop PowerPivot, so I’m glad David is showing it off here.

Two more steps remain before the dynamic constant can be used in a calculated column formula. By clicking Create Relationship in the PowerPivot window Design ribbon, a relationship can be established between the Date field in Table1 with the same in Table2. Finally, a calculated column named NTH is added to Table1 with the formula =RELATED(Table2[NTH]).

Now it is time to create the formula. The entire version of this formula is too complex to be viewed in a completely displayed form, so two parts of it are entered as calculated columns and are used as inputs into the formula shown below.

=IF(Table1[IsLastRecord],CALCULATE(MAX(Table1[DATE]),FILTER(Table1,EARLIER(Table1[Customer])=

Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE] &&Table1[IsNthPreviousRecord])),BLANK())

where

Table1[IsLastRecord]

=COUNTROWS(FILTER(Table1,EARLIER(Table1[Customer])=Table1[Customer]))=

(COUNTROWS(FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE]))+1)

and

Table1[IsNthPreviousRecord]

=(COUNTROWS(FILTER(Table1,EARLIER(Table1[Customer])=Table1[Customer]))-

(COUNTROWS(FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] &&

EARLIER(Table1[DATE])>Table1[DATE]))+1)=Table1[NTH])

IsLastRecord counts the numbers of records for each customer and compares that number to the number of records previous to the last record plus 1. If the numbers are the same, the expression is true and the last record in row context for each customer will have a date depending on whether it had an Nth previous visit (otherwise a blank).

IsNthPreviousRecord is the 3rd filter in the main DAX formula. This Boolean expression is true if the numbers of records for each customer minus number of records previous to the last record plus 1 is equal to the desired Nth previous visit.

The following picture shows the returned records for the table example when Nth is equal to 2.

image

Note that only customers a & d have 3 or more visits, so those are the only dates returned.

I have not gone into great detail into explaining exactly how this DAX formula works mainly because if you work through this yourself you will gain a greater understanding of how to construct your own formulas. Note from Rob:  Heh heh, now I STILL don’t understand EARLIER().  Nor have I built a pivot table that exploits the power of this formula. I will also leave that as an exercise to the reader.

As you might imagine, there are many ways that this formula methodology for Nth occurrence can be used. Also, the technique for using a dynamic constant is pretty neat. I hope this helps in your future PowerPivot model-building projects!

4 Responses to Guest Post: Nth Occurrence DAX Formula

  1. Rob, David, first of all, thanks for sharing.
    Your idea is a nice one but I cannot resist to try to make it somehow a better one with some more insights taken from my personal experience.
    The first thing to note is that you don’t need a relationship to get the value of a parameter table. In DAX, if you have a column which has only one value, you can link directly the table containing the parameter (which I named “Param”) in PowerPivot and use VALUES (Param[Nth Number]), which will evaluate to your parameter anywhere in DAX. It clearly works if the column has a single value, otherwise it will raise an error. This somehow simplify your formulas.
    Then, to solve the scenario, I would have chosen a different path (which does not mean mine is better, but a different idea might add some info to your solution).
    I have defined a calc column in Table1 which is:
    CustVisitNumber=CALCULATE (COUNTROWS (Table1), Table1[Customer] = EARLIER (Table1[Customer]), Table1[DATE] < EARLIER (Table1[DATE]), ALL (Table1)) + 1
    This is an increasing number that starts from 1 and increases at each customer visit of the same customer.
    Then, to get the Nth previous visit it is enough to write another calc column like:
    PrevNthVisit=CALCULATE (MAX (Table1[DATE]), Table1[Customer]=EARLIER (Table1[Customer]), Table1[CustVisitNumber] = EARLIER(Table1[CustVisitNumber]) – VALUES (Param[Nth Number]), ALL (Table1))
    Finally, if you want the info for the last row only, then you just need to BLANK it when necessary, as you have already done with IsLastRecord.
    The final model is somehow easier even if the DAX code is a bit harder to understand. Moreover, I tried to avoid FILTER and preferred to leverage CALCULATE with filters, which usually result in faster calculations.

    • …from one EARLIER() calc column practitioner to another. I’m sitting this one out :)

      Thanks for weighing in Alberto. I suspect you and David could write formulas together that span pages :)

  2. David Hager says:

    After adding an IsLastRecord blank to Alberto’s formula, I used it to make another formula that sums the purchases for the last n+1 visits.

    =IF(ISBLANK(Table1[PrevNthVisit]),BLANK(),CALCULATE(SUM(Table1[AMOUNT]), DATESBETWEEN(Table1[DATE],EARLIER(Table1[PrevNthVisit]), EARLIER(Table1[DATE])),Table1[Customer]=EARLIER(Table1[Customer])
    ,ALL(Table1)))

  3. niri says:

    I am using isfiltered & hasonefilter in my Calculation – it’s not giving subtotals at highest level, is there any way to avoid this limitation?

Leave a Comment or Question