A Generalized “New Customers in Time Period” Solution, Inspired by Tuesday’s Post

David Hager’s post on Tuesday really planted a seed in my brain.  And then a comment on that post from Charlie got me thinking further.

How can we extend the “New Customers per Day” concept to become “New Customers in <Any Period of Time>?”  New Customers per Month for instance.

### Step One:  Make That Calendar Table!

I think it’s a general habit we should all get into:  if you’re using the Time Intelligence functions in DAX (like DATESBETWEEN), it is going to be MUCH safer all the way around if you create a separate Calendar/Date table first.

Quick technique for doing this is to use a flat pivot:

Create a Flat Pivot, Use All Date-Related Columns from Your Original Table on Rows,
Turn Off Subtotals, then Copy the Pivot

Paste as New Table in the PowerPivot Window

Yielding…

Calendar Table in PowerPivot

Which I then relate to my original Table1…

### Step Two:  Measures!

[DistinctCustomers] =
DISTINCTCOUNT(Table1[CustomerID])

[PreExisting Customers] =
CALCULATE([DistinctCustomers],
DATESBETWEEN(Calendar[Date],
0,
FIRSTDATE(Calendar[Date])-1
)
)

The first measure is just a distinct count of customer ID, and the second one calculates how many such disctinct customers “existed” from date 0 (the beginning of time, heh heh) up through the last day before the selected time period.

(Note how we do not need to use the ALL function in the second measure, unlike in Tuesday’s post.  This is because of the separate Calendar table we’re using – but that’s not the only benefit of the separate Calendar table).

So if I am looking at March 2009 on my pivot, FIRSTDATE(Calendar[Date]) returns March 1, 2009 and then I subtract 1 to get Feb 28, 2009 – the last day before the selected month started.

Here it is:

The number of pre-existing customers can never go DOWN over time, so the fact that this shows a nice smooth increase is a good sign.

[Customers Life to Date] =
CALCULATE([DistinctCustomers],
DATESBETWEEN(Calendar[Date],
0,
LASTDATE(Calendar[Date])
)
)

It’s just like the [PreExisting Customers] measure except for that highlighted portion – we extend up through the last date in the selected time period.

And then our last measure:

[New Customers in Period] =
[Customers Life to Date] – [PreExisting Customers]

### Portable!

If I want to use Days or Years on rows instead, I just rearrange the pivot, and the formulas still work:

Or, say, Year:

Always want to emphasize this as a HUGE benefit of PowerPivot:  PowerPivot formulas are portable.  They adapt to new report shapes with no additional effort.

Workbook for this post is available here.

#### Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology.

## This Post Has 22 Comments

1. Siraj Samsudeen says:

This is cool.The table really look very nice with conditional formatting.

2. Kevin says:

Every time I try this I get an error of “MdxScript(Model)(7, 11) Calculation error in measure ‘Table1′[Preexisting Customers]: an invalid numeric representation of a date value was encountered”.

My date values are coming directly from SQL Server, so I believe they should be valid, and I have copied and pasted the formula directly from your example. Any idea on what could be causing this?

1. powerpivotpro says:

In the powerpivot window have you verified that the data type is indeed date/time?

1. Kevin says:

Thanks for the response!

Yes, the date column in both my fact table and my date dimension table are both set to ‘Data Type: Date’. My date dimension table is also set as the ‘Date Table’, and other time based functions like TOTALYTD/TOTALMTD are working correctly with it.

1. powerpivotpro says:

Well at this point I’d suggest sending me the workbook, if that is an option. rob. at a place called pivotstream. dot com.

2. Kevin says:

I think I may have got this working. It appears there was an issue with my Calendar table and missing dates (it was built off of distinct dates in my fact table). I pulled in a date dimension from an un-related DW that has contiguous dates between 1900-2032, set it up to use that, and I no longer get the error. That said, when I try to display distinct/pre-existing for actual days (there are about 1000 distinct days) Excel then eats up 6-7GB of memory and doesn’t seem to get anywhere after 15 minutes of processing. The same thing works for display those measures for months though.

1. powerpivotpro says:

Ah nevermind my previous comment then about sending me the workbook. Yes I have noticed the same problem – with a lot of different values on rows (like dates), it can be very slow. DISTINCTCOUNT is much better than COUNTROWS(DISTINCT()) in this regard but it’s still doing a *lot* of work.

3. Shawn says:

The dates need to be distinct too. Chances are you might have customers that were created at the same time.

A date column containing duplicate dates was specified in the call to function ‘DATESBETWEEN’. This is not supported.

1. Kevin says:

With regard to the distinct dates, I think that’s why he used a pivot of the existing dates – which would be distinct. I did run into an additional step that was required though, as our column is a datetime – so I had to drop the time portion.

By the way, Rob – I tried ordering your book through MrExcel the other day, both the ebook and physical copies. Were you aware that they charge a minimum of \$135 for international shipping?!?

1. Shawn says:

That was definitely it, I added another date field without the time and it worked wonderfully.

Thanks!

4. Jude Hoffner says:

I’m getting the same error message as Kevin in January, Every time I try this I get an error of “MdxScript(Model)(7, 11) Calculation error in measure ‘Table1′[Preexisting Customers]: an invalid numeric representation of a date value was encountered”.

Everything on my date table checks out. Cannot for the life of me figure out what is driving this. Does the date table have to be of the exact same date range as my Orders table? Any help is appreciated.

1. powerpivotpro says:

Just from reading the resolutions in the comment thread above, between Kevin and Shawn, it sounds like you should:

1) Make sure you calendar table is indeed using a Date data type
2) Remove any time component of your dates
3) Make sure there are no gaps and no duplicates in your Calendar table
4) Make sure you are using fields from your Calendar table on the pivot, and NOT date related fields from your data table.

2. powerpivotpro says:

Failing that you can also send me the workbook and I will take a look. rob. at a place called powerpivotpro. dot com.

1. Curtis says:

I know the comments were posted quite a while ago but I came up against this same problem and tried all of the suggestions listed above, None of these resolved my issue “an invalid numeric representation of a date value was encountered”. I managed to find that using BLANK() instead of 0 as the StartDate argument resolved this issue. This maybe of use to someone else.

=CALCULATE([DistinctCustomers],
DATESBETWEEN(Calendar[Date],
BLANK(),
LASTDATE(Calendar[Date])))

1. Kathy says:

I had the same problem. By using your Blank() method, it works. Thank you!

5. Is it possible to calculate the sales contributed by new/ returning customers then?

6. The Date Table is key for this solution but most of the time you use a predefine Date Table not one extracted from your data.

To avoid calculation of [PreExisting Customers] and [Customers Life to Date] for periodes where there is no data it may be useful to check if there are data first.

I’m suggesting these two new fomulas:

PreExisting Customers:=if(COUNTROWS(Table1)>0,CALCULATE([DistinctCustomers],
DATESBETWEEN(DimDate[Date],
0,
FIRSTDATE(DimDate[Date])-1
)
),blank())

Customer Life to Date:=if(COUNTROWS(Table1)>0,CALCULATE([DistinctCustomers],
DATESBETWEEN(DimDate[Date],
0,
LASTDATE(DimDate[Date])
)
),[PreExisting Customers])

With these formulas you can use a “standard” Date Table.

7. Evan says:

Is there a way to find out who the new customers are? This works out how many there are, but not who they are.

8. Arjun says:

Hi Everyone,

Can we do this using a Standard Pivot Table by any way? I tried this but came to know that we need a either a helper column with a powerful formula or a calculated field to capture this. But I feel the same inefficiency leads to the use of PowerPivot.

Arjun

9. Hi All

We have a measure in our Tabular model that uses the DATESBETWEEN function but unfortunately we are seeing the following error when a user selects a date for which there is no data held against that date:

“An invalid numeric representation of a date value was encountered”

Our measure is coded as follows:

Measure :=

CALCULATE (

SUM ( ‘FactTable'[Amount] ),

DATESBETWEEN (

‘Date'[Calendar Date],

IF (

MONTH ( FIRSTDATE ( ‘Date'[Calendar Date] ) ) >= 7,

DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 1, 07, 01 ),

DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 2, 07, 01 )

),

IF (

MONTH ( FIRSTDATE ( ‘Date'[Calendar Date] ) ) >= 7,

DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ), 06, 30 ),

DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 1, 06, 30 )

)

),

‘TableX’,

‘TableY’,

‘TableZ’

)

I can get this to run in SSMS by introducing an IF else BLANK() statement around the calculation, using the following code:

EVALUATE

(

SUMMARIZE (

‘Date’,

‘Date'[Financial Year],

“Test”, IF (

COUNTROWS ( ‘FactTable’ ) > 0,

CALCULATE (

SUM ( ‘FactTable'[Amount] ),

DATESBETWEEN (

‘Date'[Calendar Date],

IF (

MONTH ( FIRSTDATE ( ‘Date'[Calendar Date] ) ) >= 7,

DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 1, 07, 01 ),

DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 2, 07, 01 )

),

IF (

MONTH ( FIRSTDATE ( ‘Date'[Calendar Date] ) ) >= 7,

DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ), 06, 30 ),

DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 1, 06, 30 )

)

),

‘TableX’,

‘TableY’,

‘TableZ’

),

BLANK ()

)

)

)

However, I am unsure as to how to implement this fix into the measure code so that it can be evaluated in various different contexts. Please can anyone advise?

Many Thanks,

Matt

10. Daniel maycock says:

I’m only seeing the number of customers in the month itself, and not a running total or cumulative count to date. What would I be doing wrong?