A Generalized “New Customers in Time Period” Solution, Inspired by Tuesday’s Post
David Hager’s post on Tuesday really planted a seed in my brain. And then a comment on that post from Charlie got me thinking further.
How can we extend the “New Customers per Day” concept to become “New Customers in <Any Period of Time>?” New Customers per Month for instance.
Step One: Make That Calendar Table!
I think it’s a general habit we should all get into: if you’re using the Time Intelligence functions in DAX (like DATESBETWEEN), it is going to be MUCH safer all the way around if you create a separate Calendar/Date table first.
Quick technique for doing this is to use a flat pivot:
Create a Flat Pivot, Use All Date-Related Columns from Your Original Table on Rows,
Turn Off Subtotals, then Copy the Pivot
Paste as New Table in the PowerPivot Window
Yielding…
Calendar Table in PowerPivot
Which I then relate to my original Table1…
Step Two: Measures!
[DistinctCustomers] =
DISTINCTCOUNT(Table1[CustomerID])
[PreExisting Customers] =
CALCULATE([DistinctCustomers],
DATESBETWEEN(Calendar[Date],
0,
FIRSTDATE(Calendar[Date])-1
)
)
The first measure is just a distinct count of customer ID, and the second one calculates how many such disctinct customers “existed” from date 0 (the beginning of time, heh heh) up through the last day before the selected time period.
(Note how we do not need to use the ALL function in the second measure, unlike in Tuesday’s post. This is because of the separate Calendar table we’re using – but that’s not the only benefit of the separate Calendar table).
So if I am looking at March 2009 on my pivot, FIRSTDATE(Calendar[Date]) returns March 1, 2009 and then I subtract 1 to get Feb 28, 2009 – the last day before the selected month started.
Here it is:
The number of pre-existing customers can never go DOWN over time, so the fact that this shows a nice smooth increase is a good sign.
Now I add another measure:
[Customers Life to Date] =
CALCULATE([DistinctCustomers],
DATESBETWEEN(Calendar[Date],
0,
LASTDATE(Calendar[Date])
)
)
It’s just like the [PreExisting Customers] measure except for that highlighted portion – we extend up through the last date in the selected time period.
And then our last measure:
[New Customers in Period] =
[Customers Life to Date] – [PreExisting Customers]
Portable!
If I want to use Days or Years on rows instead, I just rearrange the pivot, and the formulas still work:
Or, say, Year:
Always want to emphasize this as a HUGE benefit of PowerPivot: PowerPivot formulas are portable. They adapt to new report shapes with no additional effort.



Very Neat!
This is cool.The table really look very nice with conditional formatting.
Every time I try this I get an error of “MdxScript(Model)(7, 11) Calculation error in measure ‘Table1′[Preexisting Customers]: an invalid numeric representation of a date value was encountered”.
My date values are coming directly from SQL Server, so I believe they should be valid, and I have copied and pasted the formula directly from your example. Any idea on what could be causing this?
In the powerpivot window have you verified that the data type is indeed date/time?
Thanks for the response!
Yes, the date column in both my fact table and my date dimension table are both set to ‘Data Type: Date’. My date dimension table is also set as the ‘Date Table’, and other time based functions like TOTALYTD/TOTALMTD are working correctly with it.
Well at this point I’d suggest sending me the workbook, if that is an option. rob. at a place called pivotstream. dot com.
I think I may have got this working. It appears there was an issue with my Calendar table and missing dates (it was built off of distinct dates in my fact table). I pulled in a date dimension from an un-related DW that has contiguous dates between 1900-2032, set it up to use that, and I no longer get the error. That said, when I try to display distinct/pre-existing for actual days (there are about 1000 distinct days) Excel then eats up 6-7GB of memory and doesn’t seem to get anywhere after 15 minutes of processing. The same thing works for display those measures for months though.
Ah nevermind my previous comment then about sending me the workbook. Yes I have noticed the same problem – with a lot of different values on rows (like dates), it can be very slow. DISTINCTCOUNT is much better than COUNTROWS(DISTINCT()) in this regard but it’s still doing a *lot* of work.
The dates need to be distinct too. Chances are you might have customers that were created at the same time.
A date column containing duplicate dates was specified in the call to function ‘DATESBETWEEN’. This is not supported.
With regard to the distinct dates, I think that’s why he used a pivot of the existing dates – which would be distinct. I did run into an additional step that was required though, as our column is a datetime – so I had to drop the time portion.
By the way, Rob – I tried ordering your book through MrExcel the other day, both the ebook and physical copies. Were you aware that they charge a minimum of $135 for international shipping?!?
That was definitely it, I added another date field without the time and it worked wonderfully.
Thanks!