A Generalized New Customers (or unique visitors) in Time Period - per Month, Year, Etc. in PowerPivot

A Generalized “New Customers in Time Period” Solution, Inspired by Tuesday’s Post

David Hager’s post on Tuesday really planted a seed in my brain.  And then a comment on that post from Charlie got me thinking further.

How can we extend the “New Customers per Day” concept to become “New Customers in <Any Period of Time>?”  New Customers per Month for instance.


Step One:  Make That Calendar Table!

I think it’s a general habit we should all get into:  if you’re using the Time Intelligence functions in DAX (like DATESBETWEEN), it is going to be MUCH safer all the way around if you create a separate Calendar/Date table first.

Quick technique for doing this is to use a flat pivot:

image  image

Create a Flat Pivot, Use All Date-Related Columns from Your Original Table on Rows,
Turn Off Subtotals, then Copy the Pivot

image

Paste as New Table in the PowerPivot Window

Yielding…

image

Calendar Table in PowerPivot

Which I then relate to my original Table1…

image

Step Two:  Measures!

[DistinctCustomers] =
DISTINCTCOUNT(Table1[CustomerID])

[PreExisting Customers] =
CALCULATE([DistinctCustomers],
          DATESBETWEEN(Calendar[Date],
                       0,
                       FIRSTDATE(Calendar[Date])-1
                      )
         )

The first measure is just a distinct count of customer ID, and the second one calculates how many such disctinct customers “existed” from date 0 (the beginning of time, heh heh) up through the last day before the selected time period.

(Note how we do not need to use the ALL function in the second measure, unlike in Tuesday’s post.  This is because of the separate Calendar table we’re using – but that’s not the only benefit of the separate Calendar table).

So if I am looking at March 2009 on my pivot, FIRSTDATE(Calendar[Date]) returns March 1, 2009 and then I subtract 1 to get Feb 28, 2009 – the last day before the selected month started.

Here it is:

A Generalized New Customers (or unique visitors) in Time Period - per Month, Year, Etc. in PowerPivot

The number of pre-existing customers can never go DOWN over time, so the fact that this shows a nice smooth increase is a good sign.

Now I add another measure:

[Customers Life to Date] =
CALCULATE([DistinctCustomers],
          DATESBETWEEN(Calendar[Date],
                       0,
                       LASTDATE(Calendar[Date])
                      )
         )

It’s just like the [PreExisting Customers] measure except for that highlighted portion – we extend up through the last date in the selected time period.

A Generalized New Customers (or unique visitors) in Time Period - per Month, Year, Etc. in PowerPivot

And then our last measure:

[New Customers in Period] =
[Customers Life to Date] – [PreExisting Customers]

A Generalized New Customers (or unique visitors) in Time Period - per Month, Year, Etc. in PowerPivot

Portable!

If I want to use Days or Years on rows instead, I just rearrange the pivot, and the formulas still work:

A Generalized New Customers (or unique visitors) in Time Period - per Month, Year, Etc. in PowerPivot

Or, say, Year:

A Generalized New Customers (or unique visitors) in Time Period - per Month, Year, Etc. in PowerPivot

Always want to emphasize this as a HUGE benefit of PowerPivot:  PowerPivot formulas are portable.  They adapt to new report shapes with no additional effort.

Download the New Workbook

Workbook for this post is available here.

Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology.

This Post Has 23 Comments

  1. Every time I try this I get an error of “MdxScript(Model)(7, 11) Calculation error in measure ‘Table1′[Preexisting Customers]: an invalid numeric representation of a date value was encountered”.

    My date values are coming directly from SQL Server, so I believe they should be valid, and I have copied and pasted the formula directly from your example. Any idea on what could be causing this?

  2. The dates need to be distinct too. Chances are you might have customers that were created at the same time.

    A date column containing duplicate dates was specified in the call to function ‘DATESBETWEEN’. This is not supported.

  3. I’m getting the same error message as Kevin in January, Every time I try this I get an error of “MdxScript(Model)(7, 11) Calculation error in measure ‘Table1′[Preexisting Customers]: an invalid numeric representation of a date value was encountered”.

    Everything on my date table checks out. Cannot for the life of me figure out what is driving this. Does the date table have to be of the exact same date range as my Orders table? Any help is appreciated.

  4. The Date Table is key for this solution but most of the time you use a predefine Date Table not one extracted from your data.

    To avoid calculation of [PreExisting Customers] and [Customers Life to Date] for periodes where there is no data it may be useful to check if there are data first.

    I’m suggesting these two new fomulas:

    PreExisting Customers:=if(COUNTROWS(Table1)>0,CALCULATE([DistinctCustomers],
    DATESBETWEEN(DimDate[Date],
    0,
    FIRSTDATE(DimDate[Date])-1
    )
    ),blank())

    Customer Life to Date:=if(COUNTROWS(Table1)>0,CALCULATE([DistinctCustomers],
    DATESBETWEEN(DimDate[Date],
    0,
    LASTDATE(DimDate[Date])
    )
    ),[PreExisting Customers])

    With these formulas you can use a “standard” Date Table.

  5. Hi Everyone,

    Can we do this using a Standard Pivot Table by any way? I tried this but came to know that we need a either a helper column with a powerful formula or a calculated field to capture this. But I feel the same inefficiency leads to the use of PowerPivot.

    Arjun

  6. Hi All

    We have a measure in our Tabular model that uses the DATESBETWEEN function but unfortunately we are seeing the following error when a user selects a date for which there is no data held against that date:

    “An invalid numeric representation of a date value was encountered”

    Our measure is coded as follows:

    Measure :=

    CALCULATE (

    SUM ( ‘FactTable'[Amount] ),

    DATESBETWEEN (

    ‘Date'[Calendar Date],

    IF (

    MONTH ( FIRSTDATE ( ‘Date'[Calendar Date] ) ) >= 7,

    DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 1, 07, 01 ),

    DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 2, 07, 01 )

    ),

    IF (

    MONTH ( FIRSTDATE ( ‘Date'[Calendar Date] ) ) >= 7,

    DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ), 06, 30 ),

    DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 1, 06, 30 )

    )

    ),

    ‘TableX’,

    ‘TableY’,

    ‘TableZ’

    )

    I can get this to run in SSMS by introducing an IF else BLANK() statement around the calculation, using the following code:

    EVALUATE

    (

    SUMMARIZE (

    ‘Date’,

    ‘Date'[Financial Year],

    “Test”, IF (

    COUNTROWS ( ‘FactTable’ ) > 0,

    CALCULATE (

    SUM ( ‘FactTable'[Amount] ),

    DATESBETWEEN (

    ‘Date'[Calendar Date],

    IF (

    MONTH ( FIRSTDATE ( ‘Date'[Calendar Date] ) ) >= 7,

    DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 1, 07, 01 ),

    DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 2, 07, 01 )

    ),

    IF (

    MONTH ( FIRSTDATE ( ‘Date'[Calendar Date] ) ) >= 7,

    DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ), 06, 30 ),

    DATE ( YEAR ( FIRSTDATE ( ‘Date'[Calendar date] ) ) – 1, 06, 30 )

    )

    ),

    ‘TableX’,

    ‘TableY’,

    ‘TableZ’

    ),

    BLANK ()

    )

    )

    )

    However, I am unsure as to how to implement this fix into the measure code so that it can be evaluated in various different contexts. Please can anyone advise?

    Many Thanks,

    Matt

  7. I’m only seeing the number of customers in the month itself, and not a running total or cumulative count to date. What would I be doing wrong?

  8. Thank you Rob, this was perfect! Using this pattern I was able to calculate New Clients, Returning Clients, and Lost Clients. Thanks again!

Leave a Comment or Question