The Greatest Formula in the World, Part 3

December 13, 2011

 
Our time machine that we developed in part one and part two still needs some explanation, so let’s dive in.

First Things First:  The MAXX Wasn’t Needed

Looking back, I noticed that I was overcomplicating things needlessly.  The final formula I had involved the following FILTER() term, in which I used a MAXX:

FILTER(ALL(Periods),
       Periods[NextYear Period]<=
          MAXX(VALUES(Sales[Period Num]), Sales[Period Num])
)

Turns out that MAX() works just fine, so let’s replace that clause and simplify things a bit.  Here’s the new formula:

CALCULATE([Total Sales],
   ALL(Periods),
   FILTER(ALL(Periods),
      Periods[Year]=MAX(Periods[Year])-1
   ),
   FILTER(ALL(Periods),
          Periods[NextYear Period]<=MAX(Sales[Period Num])
   ),       
   VALUES(Periods[MerchPeriod])
)

OK, with that, we can move on to explanation:  How does this formula work??

I used to call this technique “expand then filter”

Well actually I still do, in my own head.  It’s just that the GFITW is a catchier title.

OK, so the “expand” part is just that first ALL():

CALCULATE([Total Sales],
          ALL(Periods),

The first step in the formula, then, is basically just telling the calc engine to throw away all filters on the Periods table.  In other words, “forget all concept of time, pretend the pivot is not filtered at all with respect to time.”

We do this so that we have a clean slate.  Then, in the subsequent FILTER clauses, we build up a new filter context for time.

One more time for clarity:  the way this formula works is to first throw out all time filters, and then in subsequent steps, we build up new filters to match the time period that we want, which in this case is last year.

Once you understand that, this overall formula starts to get pretty simple.  Each piece of the formula is quite straightforward in its own right.  ALL() is pretty straightforward for sure, and so are the subsequent FILTERS().

OK, we’ve expanded.  Now on to the filters!

The first filter says, “hey, now that we’ve thrown out all time filters, let’s filter time back down to just be last year.”

FILTER(ALL(Periods),
       Periods[Year]=MAX(Periods[Year])-1
)

Let’s talk about the FILTER() function itself for a moment.

How does FILTER() Work?

Honestly this function has deserved its own post for a long time.  I’ll give a brief explanation here.

The syntax for the FILTER function is FILTER(TableToFilter, FilterExpression).   Pretty simple.  Here’s some more detail:

  1. FILTER() takes a TableToFilter and a FilterExpression, and returns all rows from that TableToFilter that match the FilterExpression.
    1. In the example above, TableToFilter is ALL(Periods)
    2. and FilterExpression is Periods[Year]=MAX(Periods[Year])-1
  2. FILTER() steps through the TableToFilter one row at a time. 
    1. And for each row, it evaluates the FilterExpression.  If the expression evaluates to true, the row is “kept.”  If not, it is filtered out.
    2. Because FILTER() goes one row at a time, it can be quite slow if you use it against a large table.  When I say “large” that is of course subjective.  A few thousand rows is fine in my experience.  A million is not.  Do not use FILTER() against your fact table.
  3. The FilterExpression typically takes the form of Table[Column] = <expression>
    1. The comparison operator doesn’t have to be “=”.  It can also be <, >, <=, >=, <>
    2. The expression on the right hand side of FilterExpression can be “rich.”  This is VERY useful.  In a simple CALCULATE, the right side of each filter expression has to be simple, like a literal number (9) or a string (“Standard”).  The fact that FILTER() allows for rich expressions here is one of the most common reasons I use FILTER().
    3. The Table[Column] in the filter expression is a column in the TableToFilter.  If you are filtering the Periods table, it makes sense that you are testing some property of each row in Periods.  I can’t think of a sensible reason to use a column here that is NOT from TableToFilter.  (Insert “boot signal” here, maybe the Italians can address this).
  4. FILTER() ignores everything else going on in your formula and acts completely on its own.
    1. For example, our overall formula sets ALL(Periods) as the first argument to CALCULATE.
    2. The FILTER()’s that come after that do NOT pay any attention to other arguments however, including that ALL(Periods).
    3. In other words, the FILTER() functions are still operating against the original filter context from the pivot!  If the pivot is sliced to Year=2009, then the FILTER() function starts with the Periods table already pre-filtered to just 2009.
    4. This is why each of my FILTER()’s uses ALL(Periods) for TableToFilter.  I have to repeat the “expand” step so that my FILTER() is also working from a clean slate.
  5. Even though each FILTER() operates on its own, their results then “stack up” in the overall formula.
    1. Even though FILTER() RETURNS a set of rows that matched the FilterExpression, it actually REMOVES rows from the overall filter context.
    2. This sounds tricky but really it isn’t.
    3. Let’s say our TableToFilter contains 6 rows:  A, B, C, D, E, and F.
    4. And our overall formula contains two FILTER() clauses that both operate on the same TableToFilter, just like our overall formula near the beginning of this post.
    5. Let’s also say that the first FILTER() returns rows A, B, C, and D.
    6. And the second FILTER() returns rows C, D, E, and F.
    7. The net result is that only rows C and D are left “alive” in the overall filter context of the formula.
    8. So one way to think of this is that FILTER()s “stack up” on top of each other.
    9. Another way to think of it is that even though the first filter RETURNED rows A, B, C, and D, its real effect was to REMOVE all other rows (E and F) from consideration.

OK, back to that first filter!

Here it is again:

FILTER(ALL(Periods),
       Periods[Year]=MAX(Periods[Year])-1
)

Let’s revisit points 1-5 above for this FILTER expression to see how it all works.  And let’s examine just a single cell of the pivot to see how this FILTER operates for that one cell:

image

Focusing on The Orange-Circled LASTYRSALES Cell As An Example
With Its Period Filter Context Highlighted in Green

In that picture above, the orange cell we are looking at has a filter context “coming in” from the pivot.  It has Period[Year] set to 2011 and Period[MerchPeriod] set to 1, as highlighted in green.

Given the detailed description of FILTER() from points 1-5 above, we can see that:

  1. We set ALL(Periods) as our TableToFilter so that we are starting from a clean slate with respect to time.  So our Periods table now has “all rows alive.”
  2. Then our FilterExpression tests against the Periods[Year] column.
  3. MAX(Periods[Year]) – 1 still operates independently!, so it still picks up Periods[Year]=2011 from the pivot.  Therefore it returns 2011 – 1 = 2010!
  4. Since we started with ALL(Periods) as the TableToFilter, and the FilterExpression only “keeps” rows where Year=2010, we are left with all 2010 rows “alive” after evaluating this FILTER().
  5. If we didn’t do ALL(Periods) for TableToFilter, and instead just used Periods without the ALL(), our FILTER would start out with only rows from 2011 (since that is what the pivot is telling us).
    1. And then in the next step when we go back a year to 2010, FILTER() would find no rows.  There are no rows that match Periods[Year] = 2010 and Periods[Year]=2011. 
    2. So our FILTER would return no rows, which means it would have the effect of REMOVING all rows from Periods in the overall formula, and our measure would return blank for all cells in the pivot.

That’s a lot of explanation, I know.  Walk through it a few times.  It’s actually pretty intuitive once you’ve done it a few times.  The tricky part, for me, was discovering all of these details for myself.  And since I’ve done all of that, you don’t have to.

That’s enough for this time.  I think you can probably figure out how the second FILTER() evaluates based on the above, but I will step through it next time. 

I will also explain why we use that last VALUES() in the formula, and probably also share some of the answers I got from the Italians, and from David Churchward, in response to my question “did I need to add that calc column in the Periods table?”


The Greatest Formula in the World, Part Two

December 7, 2011

image

Our Time Machine From Part 1 Needs a Couple of Fixes

OK, if you recall from part one, we had a “year over year” time machine calculation going on, and it was built against a data set that lacked a “real” calendar table.  Very cool.  Very resourceful.  And very necessary.

But the time machine had two problems.  Let’s fix them.

Problem #1:  Meaningless Grand Total

Let’s start with the easy one:

Year over year in PowerPivot - Meaningless Grand Total

Grand Total is Meaningless for Last Year Sales

It is meaningless to have a “grand total” value for a measure that returns “last year’s sales.”  What year would that BE, actually?  It’s nonsense.  So we use an old trick, one of the many flavors of IF(VALUES()):

=IF(COUNTROWS(VALUES(Periods[Year]))>1,BLANK(),
   CALCULATE([Total Sales],
      ALL(Periods),
      FILTER(ALL(Periods), Periods[Year]=MAX(Periods[Year])-1),
      VALUES(Periods[MerchPeriod])
   )
)

Where the new IF “wrapper” is highlighted and the original formula is in normal font.

For more on that “IF VALUES” technique, please see this post.

The results are as desired, the grand total is now blank:

image

Problem solved.  Moving on…

Problem #2:  2011 isn’t complete but we’re getting all of 2010 sales

Remember, our sales data for 2011 only goes through the first 6 months.  So we do NOT want “last year sales” for months 7-12:

image

That can be solved a number of ways.  The simplest is just to add another IF(), and have the Last Year Sales measure return BLANK() whenever there are no Sales:

=IF(COUNTROWS(VALUES(Periods[Year]))>1,BLANK(),
   IF([Total Sales] = BLANK(), BLANK(),
      CALCULATE([Total Sales],
         ALL(Periods),
         FILTER(ALL(Periods),
            Periods[Year]=MAX(Periods[Year])-1
         ),
         VALUES(Periods[MerchPeriod])
      )
  )
)

That trims the results to just the months desired:

image

But note that the subtotal for 2011 is still too high:

image

“Last Year Sales” for 2011 Should Only be Returning
2010 Sales Through the First Six Months

This one is trickier.  I struggled to find a good answer before coming up with the following:

Step 1:  Add a “Next Year Period Num” Column to the Periods Table

Recall that my Sales table has a “Period Num” column that is the basis for the relationship with the Periods table:

image

And that matches up with a similar column in the Periods table:

image

Note that Period Num does NOT reset to 1 with each new year (unlike MerchPeriod).  So that is the absolute unique ID for a given Month/Year combo.

Well, I created a new column in the Periods table that tells me, for a given Period Num, what the equivalent Period Num will be NEXT year:

image

Now, in my FILTER statements, I can choose to match on THAT column instead.

Bringing it Home

OK, now that I have that column, I can add a new FILTER clause to my measure:

=IF(COUNTROWS(VALUES(Periods[Year]))>1,BLANK(),
   CALCULATE([Total Sales],
      ALL(Periods),
      FILTER(ALL(Periods),
         Periods[Year]=MAX(Periods[Year])-1
      ),
      FILTER(ALL(Periods),
         Periods[NextYear Period]<=
            MAXX(
                VALUES(PeriodSales[Period Num]),
                PeriodSales[Period Num]
                )
      ),
       
      VALUES(Periods[MerchPeriod])
   )
)

OK, what does that do?  Simply put, it further filters the Periods table to NOT go beyond any periods for which we currently have records in the Sales table.

And the results:

Last Year Sales Measure in PowerPivot With a Custom Calendar

Last Year Sales Measure Doing the Right Thing

And now if you want YOY Growth, it’s straightforward:

=([Total Sales] – [LY Sales Finished]) / [Total Sales]

Which yields:

Year over Year / Year on Year Growth Percentage Measure in PowerPivot With a Custom Calendar

Year over Year / Year on Year Growth Percentage
Measure in PowerPivot With a Custom Calendar

OK, I cheated and wrapped another IF(COUNTROWS(VALUES)) around the outside of the formula so that the grand total cell is blank again, because again, that’s a meaningless value to report.  Details.

Isn’t that a lot of work?

Depends on how you look at it.  First of all, it’s a lot harder to grasp the first time than it is as you get used to it, trust me.

But even better, this investment pays off forever.  Normal Excel formulas may be easier to write the first time, but then you have to re-write and adjust them forever, every time your data changes.  And every time your desired report shape changes.

This one is a portable formula.  It goes wherever you want it to go, and eats whatever data you feed it.  Forever.  It’s worth a little extra effort.

OK but do I really have to understand all of that?

No, not really.  Not right away.  Think of it this way:  this is what I do all day, every day.  And I can tell you that figuring this out is not something you want to do in your spare time.

But once someone gives you the pattern?  Wash, rinse, repeat.  Make small adjustments as needed.

Honestly, that whole formula above could be wrapped up in a function provided by Microsoft, and all of the nasty stuff hidden.  I told them as much last time I visited Redmond.  And if it were a function rather than a formula, we wouldn’t care at all how it worked.

So… treat this as a pattern.  Copy/paste and modify to fit your needs.  Seriously.

Unfinished Business

I actually WILL retrace my steps here though and explain a number of things though, so that you CAN understand.  I have a couple of posts in the queue aimed at just that.

Were there alternate ways to write this measure?

Yes, there were, especially in that last step where I needed to get the “Last year sales” measure correct for the 2011 subtotal.  I could have used SUMX to make the year subtotal equal to the sum of its underlying months.  That is worthy of a post for sure.

But I was wondering if there was still another way.  A way that doesn’t use SUMX and doesn’t require a new calc column.

And for that, we need the Italians.  So I’m gonna try out my new toy, the Boot Signal:

image

Calling Marco and Alberto

Get it?  *Boot* Signal?


The Greatest Formula in the World, Part One

December 6, 2011

 
image

Play Write the greatest song formula in the world, or I’ll eat your souls.”
-From “Tribute,” by Tenacious DAX

A Do-It-Yourself Time Machine

You remember the Great Precedence Project?  Well, I have shelved it for now, for two reasons:

  1. Creating an exhaustive list of all the precedence rules in PowerPivot formulas was going to be…  exhausting.
  2. I didn’t really need to boil the philosophical ocean in search of deep underlying meaning, because on further inspection, I already had the Greatest Formula in the World, and needed no other.

So what is the GFITW about?  It’s about time navigation.  It’s useful when you have a custom calendar and the time intelligence functions are therefore not so helpful.

The GFITW is a do-it-yourself time machine.

Year on Year Sales With a Custom Calendar?  Yes We Can!

In that post linked above, which happened to be the last installment of the precedence project, I was calculating a running total, and doing so without benefit of the fancier functions like DATESYTD etc. (which rely on the “real” calendar, not your business calendar).

OK, so now let’s do a “Year on Year” sales measure, sometimes called a “Year over Year” measure.  And again, our “calendar” table isn’t a true calendar at all:

Semantically this IS a calendar but functions like PREVIOUSYEAR can't use it

A Table of Periods (Not Dates) Means You Cannot Use the Built-In
Time Intelligence Functions to Calculate Year-on-Year Sales

And the Sales table is very simple:

Sales Table linked to a period number, not linked to a date

Sales Table is Linked to the Periods Table by the Period Num Column

And I have this basic pivot already set up:

image

Year and MerchPeriod on Rows

So, how do I write a measure that “fetches” the sales from last year?

Let’s Skip Some Steps

I am going to be honest with you:  it took me a very long time to figure this out.  The formula looks reasonably simple, and it is, but I had to try a million variations before I got it right.  Literally, it took me about a week to refine it.

So rather than walk you through that painful process, let’s skip ahead and show you the GFITW in all its glory.  Then in the next post I can show you what NOT to do, and we can also fine tune it.  OK?  OK.

So here it is, the Greatest Formula.  In the World:

=CALCULATE([Total Sales],
   ALL(Periods),
   FILTER(ALL(Periods),
      Periods[Year]=MAX(Periods[Year])-1),
   VALUES(Periods[MerchPeriod])
)

And its results:

image

No, it’s not perfect yet.  We will need to calculate percentage growth of course, rather than merely fetching last year’s sales.  And that given that 2011 is not yet complete, I’m a little squeamish about Last Year Sales returning ALL of 2010’s sales (the $5.9M number).

But for now, let’s count our blessings.  This formula DOES fetch the sales from last year, and it works (mostly) both at the year level and the MerchPeriod (aka Month) level.

Dissecting the Formula

Here is the formula again:

=CALCULATE([Total Sales],
   ALL(Periods),
   FILTER(ALL(Periods),
      Periods[Year]=MAX(Periods[Year])-1),
   VALUES(Periods[MerchPeriod])
)

But really, the GFITW is not a formula.  It is a pattern that you can re-use and modify.  So let’s look at it that way:

=CALCULATE(Original Measure,

   ALL(Your Custom Calendar Table),
  
   FILTER(ALL(Your Custom Calendar Table),
      Year Column Or Similar =
        
Expression that “moves” the Year Column
  
),
  
   VALUES(
Time Column That is More Granular
          Than Year, Like Month or MerchPeriod

  
)

)

Where everything in blue is “fixed” as part of the pattern, and everything in italics is something you can change.

Like I said, this is running a bit long for one post.  Come back Thursday for the rest Smile


DAX Autocomplete Stops Working–A Workable “Fix”

September 27, 2011

 
image

“There, good as new.”

By Now, You’ve Probably Seen This

So you’re typing along, writing a DAX measure.  And suddenly, poof!  Autocomplete stops working.

This happens to me all day, every day.  Especially when I use the [Measure]( filter expression ) syntax.  Here’s a quick “before and after” example:

PowerPivot autocomplete is working

Autocomplete is Working for Table/Column Names

PowerPivot autocomplete is NOT working

Same Table Name, But Autocomplete is NOT Working

Grab Your Duct Tape…

A guy named Scott Kaylie showed me this about six months ago and I am just now getting around to posting it.  Yes, my backlog of posts has in fact grown to be at least six months long Sad smile

Anyway, here’s the trick.  Just put a * in between the measure and the open parenthesis, as if you were going to multiply the measure by another expression:

Tricking autocomplete into working with *

OK, so then just go ahead and finish writing the measure, leaving the * in there:

Tricking autocomplete into working with *

And then, at the very end, go back and remove the *…

Tricking autocomplete into working with *

Voila.  Ugly yes, but much better than having no workaround.  Thanks Scott.

Any other questions please consult the following home repair manual:

From GraphJam


Detecting ALL() – Detecting When Not Filtered

September 20, 2011

 
image

 

“I like to carry it, you never know when you’re gonna need it.”

-The  much-missed John Candy as Uncle Buck

 

 

A technique that you may need someday

File this under “you may never need this, but when you do, you’ll know it immediately.”  I’d call this one a solid 4 on the DAX Spicy Scale.

Why I needed this:  I had two different calendar tables, one at the Date level and one that was a custom calendar, Periods table.  Most of the measures in this model are written to be used with one table or the other, and I never have to “cross the streams.”

But then I ran into a case where a measure I had written to be used with the Periods table, suddenly needed to also be used on a pivot that was only filtered by Dates.  And I didn’t want to write a new version of this measure (for reasons that are mostly irrelevant here).

Of course, when I put the Period-focused measure on a Date-focused pivot, and there were no fields from Periods on the pivot, well…  the Period-focused measure returned junk.

What I decided to do, then, was detect if Periods was the aptly-named “Sir Not Appearing in This Film” and then assign a Period in that case.  In other words, detect if the Periods table was not on the pivot, and if not, FORCE a Period value into the evaluation of my measure.

Detecting ALL(), or the Absence of Filter on a Field or Table

Here is the final measure formula I used, color-coded for identify its parts:

IF(COUNTROWS(Periods) = COUNTROWS(ALL(Periods))-1,
   CALCULATE(My Original Measure,
      FILTER(Periods, Periods[Period] = [LatestPeriod])
   ),
  
My Original Measure
)

Let’s go part-by-part:

The Detector

IF(COUNTROWS(Periods) = COUNTROWS(ALL(Periods))-1

The “detector” counts the rows of Periods in the current pivot context, and compares that to the number of rows in the Periods table with all filters removed by ALL().  Pretty straightforward right?

So…  why the –1 at the end?  The reasons for that are slightly academic…  academic enough that I don’t really want to understand in great depth.  Let’s go with  the short version:  in this case, when I counted the rows of ALL(Periods), ALL() was kind enough to include the “blank” row of the Periods table.

What’s that?  You say you don’t HAVE a blank row in your Periods table???  Well, neither do I.  But you MIGHT have some rows in your Sales table that have a blank value for the [PeriodID] column.  Or maybe you have rows in the Sales table that DO have a [PeriodID] value, but that value does not appear in your Periods table.  Either way, you implicitly DO have blanks in your Periods table, and COUNTROWS(ALL()) decides to tell you about it.  So you’ll subtract one and like it, soldier!

(I have not tested, honestly, whether you always need – 1 in this detector.  If your Sales table is perfectly clean, maybe the – 1 is not needed, and maybe it is.  Someone let me know OK?  And I bet three-to-one that the answer comes from Italy.)

The Original Measure

My Original Measure

OK, this part IS straightforward.  Whatever my original measure was named, or perhaps its full original formula, appears here.  Moving on…

The Filter for the “No Periods Selected” Case

FILTER(Periods, Periods[Period] = [LatestPeriod])

OK, in the case that Periods is absent from the pivot, I take the original measure and then use the FILTER function to pretend that the Periods[Period] column IS on the pivot, and filtered to a value matching the most recent period, as calculated by my [LatestPeriod] measure.

But really, this part is going to be VERY different based on the circumstance.  Maybe you want to use a completely different measure, for instance.  Or set Periods to the first period this month.  Or the period corresponding to the current filter context from the Date table.

I merely included the “meat” of what I did here to drive home the intent.

Two Notes

One – note that there is NO difference between “the Periods table is not used on the pivot at all” and “the Periods table IS on the pivot but unfiltered in the current context.”  So if you have Periods on a slicer but nothing is selected, the detector will “go off.”  And if you have Periods on rows, the detector will still “go off” in the grand total cell of the pivot (and maybe in certain subtotal cells as well).

Two – I was messing around in this area when I discovered the need for the long-simmering Precedence Project.  In other words, when you start messing around with overriding filter context like I did with the FILTER() function above, and you’ve got a number of tables and relationships in play, every now and then you see something you don’t expect.  For that reason, I plan to return to the Precedence Project shortly.


Running Totals Without a Traditional Calendar Table

June 30, 2011

PowerPivot provides a host of great functions like DATESYTD, DATESMTD, DATESBETWEEN, etc. that are useful for calculating many things, including a running total.

But with the exception of DATESBETWEEN, I seldom get to use any of those “time intelligence” functions, for the simple reason that our clients almost never operate on a traditional calendar.

I very often find myself working with a “calendar” that looks like this:

Non Traditional Calendar

Can’t Use Time Intelligence Functions With a Calendar Like This

And sometimes I even find myself with “calendars” that don’t have any date columns in them at all:

Another Non Traditional Calendar

This Table DOES Truly Represent the Business Calendar But Has No Dates in It!

So what do you do when you still need a “Year to Date” Total?

Silly humans.  They don’t care that the data is structured one way or another under the hood.  They just keep insisting on seeing useful things, like Year to Date totals.  They don’t want to hear how the blender is constructed, they just want their daiquiri.

So, what’s a report designer to do?  Give up?  No way.  We make daiquiris anyway.

I’m going to use that second calendar above, the one that has no dates in it.  In fact this is the same data set I have been using for the Precedence Project.  (Really, this is Part Three in disguise).  So let’s return to a familiar pivot:

image

Familiar Starting Point

It has Year and MonthNum on rows, and the simple measure Total Sales.  I want to end up with this:

image

Desired Result

Getting to that desired result, in my experience, is something you either stumble upon quickly or flail around forever and never find.  In fact, finding myself in a situation like this is what triggered me to start the Precedence Project in the first place.

Cutting to the chase:  this formula works

To keep this post short and sweet, I’m just going to share a working formula.  I hesitate to call it “the” formula, because there are multiple variations that work, and some fit certain biz requirements better than others.

So here is one that works.  It has some quirks that I will iron out in the next post.

[YTD Sales]=
[Total Sales](
                FILTER(
                  ALL(Periods),
                  Periods[PeriodNum]<=MAX(Periods[PeriodNum])
                ),
                VALUES(Periods[Year])
             )

In short, this  measure clears all filters on the entire Periods table, then adds back two filters – restores the current context for Year (using VALUES), and limits the PeriodNum to be less than or equal to the current context (I will explain later why I used MAX and not just VALUES).   Note that I am using PeriodNum (which keeps increasing across years and is unique to a given Month/Year combo), not MonthNum (which is on the pivot and “resets” to 1 for January of each year).

Next Post:  Explaining the Formula, and Showing How NOT to do it

Explaining that relatively simple formula, if I do a thorough job, will consume its own post.  And I also want to cover some other approaches – ones that seem like they should work but do not.  Because the things that DON’T work are even more educational than the things that do.


Precedence Part 3: ALLEXCEPT vs. ALL w/ VALUES

June 22, 2011

 
Snatch - A Damn Fine Movie

 

“However…  you do have ALL() the characteristics of a dog, Gary.  ALLEXCEPT() loyalty…”

 

 

 

This will be a quick one.  I think.

In part two, we left off with the observation that VALUES() can “trump” ALL() even when VALUES() is applied to a column that is not on the pivot:

[ALL then VALUES of field not on pivot] =
     [Total Sales](ALL(Periods),VALUES(Periods[Period Num]))

image_thumb5

One thing I have often puzzled over is this:  how does a combination of ALL(Table), VALUES(Table1[ColumnA]) compare to using ALLEXCEPT(Table1, Table1[ColumnA])?

In essence, those should do the same things right?  In one case, you set the whole table to ALL(), but then “restore” ColumnA to its pivot context using VALUES().  In the other case, you set every field in the table except ColumnA to ALL(), leaving ColumnA in the context set by the pivot.

They should result in the same thing, right?  Let’s check.

ALLEXCEPT() using columns that are on the pivot

[ALLEXCEPT fields on Pivot]=
[Total Sales](ALLEXCEPT(Periods,Periods[Year],
              Periods[FiscalQuarter]))

There are two fields on the pivot from the Periods table – Year and FiscalQuarter, and both of those are excluded from being “all’d” in the measure.  Results:

image

OK, good news.  Those are indeed the same results as the unfiltered base measure, and that’s what we got from ALL() plus VALUES() as well.  So from this, we can say that ALL() plus VALUES() is the same as ALLEXCEPT().

Not so fast though…

ALLEXCEPT() using columns that are NOT on the pivot

Let’s try ALLEXCEPT() using a field that is NOT on the pivot.  Remember, VALUES() trumped ALL() in this case, too, and the results were the same as the unfiltered base measure.  Will ALLEXCEPT follow suit?

[ALLEXCEPT field NOT on Pivot]=
[Total Sales](ALLEXCEPT(Periods,Periods[Period Num]))

image

OK, that was a bit unexpected.  ALLEXCEPT() returned the same results as pure ALL().  So ALLEXCEPT() does *not* preserve the pivot context of columns NOT on the pivot, whereas VALUES() *does* pick those up.

I’m not sure if that’s intentional on the part of the PowerPivot team, or if it’s just a quirk.

Summary

1) ALLEXCEPT() does behave precisely like ALL(), VALUES() as long as the “exempted” columns are columns on the pivot.

2) ALLEXCEPT() does NOT preserve pivot context, however, on columns that are not on the pivot.

Oh, and one more…

No need to use ALLEXCEPT() instead of ALL() when using VALUES()

In the past, whenever I wanted to do an ALL() with a VALUES(), I typically used an ALLEXCEPT() instead of the ALL().

For instance, I would do something like this:

[Total Sales](ALLEXCEPT(Periods,Periods[ColumnA]),
              VALUES(Periods[ColumnA]))

That turns out to have been an overly-careful habit.  As we’ve seen here, and in part two, VALUES() trumps the ALL(), and does not need the “help” of ALLEXCEPT() instead.

Yes, I’m sure many of you will say that’s obvious from everything covered so far, but just in case you find yourself tempted to use ALLEXCEPT() to exempt the column you are preserving with VALUES(), there’s no need.


Precedence Project Part Two: VALUES()

June 16, 2011

 
OK, picking up from part one

Let’s start with a simplified version of last post’s pivot – remove one of the row fields, and all of the measures but the base Total Sales measure:

image

Now let’s add a measure that sets Year to ALL:

[ALL Year Sales] = [Total Sales](ALL(Periods[Year]))

PowerPivot ALL Year Measure

Note that all of the orange cells have the same value – they are all Q1 cells and differ only on Year, so the all Year measure means each orange cell will display the total sales for Q1 across all years.  So the green cells all add up to the value in each orange cell.

Rule #3:  VALUES() Selectively “Shuts Off” ALL()

OK, we’ve played with VALUES() a few times on the blog, most notably here and here.  But previously, we’ve never used it like we are going to use it this time.  We are going to use it to preserve the original filter context and “fight back” against ALL().  Exciting huh?

Let’s add another of our previous measures back to the pivot – the one that applied ALL() to the entire Periods table:

[ALL Period Sales] = [Total Sales](ALL(Periods))

image

OK, now let’s add VALUES() to the mix in a new measure that has ALL applied to the whole Periods table, and VALUES applied to just one column in Periods:

[ALL with VALUES] =
     [Total Sales](ALL(Periods), VALUES(Periods[FiscalQuarter]))

ALL Against Whole Table, VALUES Against One Field

This time, again the Q1 cells are all the sum of every Q1 in the base Sales measure.  So the VALUES function, applied to the FiscalQuarter column in this measure, overrode the ALL which was applied to every column in the Periods table (including FiscalQuarter!)

So this new measure differs from the pure [ALL Period Sales] measure at the Quarter level.  But note the blue cells.  It’s still identical to [ALL Period Sales] at the Year level.

So is ALL() still “winning” at the Year level?  No.  It’s just that, at the Year level, VALUES(Periods[FiscalQuarter]) returns  the list of all four quarter values, because all four Quarter values truly due correspond to that cell’s coordinates in the pivot (the subtotal for the Year does indeed imply all quarters).

Neat huh?

Clarification/Correction

Derek pointed out in a comment that the two values in blue actually are NOT equal.  They are off by $281.  This is because there are blank values for year – some noise in the data – and I have filtered those out using the little Row Labels dropdown:

image

If I clear that filter and allow blanks to show up in the pivot, I get:

image

Note that the two values in blue are still off by $281.  But the mysterious missing $281 is indeed accounted for by the blank year.

So…  the pure ALL(Periods) measure catches that $281.  That’s expected.  What’s unexpected, at least initially, is that the ALL(Periods), VALUES(Periods[Fiscal Quarter]) measure MISSES that $281. 

Here’s why:  note that there’s only a blank Quarter under the blank year.  There are no blank quarters associated with the blue cells.  So that $281 is omitted.

Great catch Derek, this was awesome.  I probably should spend a post just explaining where those blanks Years and Quarters came from.

Back to the original post…

We can even confirm that VALUES() beats ALL() in a heads-up fight.  Rather than use ALL() against a whole table and VALUES() against one column, let’s use them both against the same single column:

[All vs VALUES Heads Up] =
      [Total Sales](ALL(Periods[Year]),VALUES(Periods[Year]))

VALUES Beats ALL Heads Up

Hey look!  Our new measure returns the EXACT same results as the base sales measure.  VALUES completely shut off ALL() in a head to head fight.

Rule #4:  VALUES() Wins Even if the “Targeted” Field Is Not on the Pivot!

Maybe this one isn’t so much a precedence rule as it is surprisingly cool.  Let’s try using VALUES() against a column from the Periods table that we don’t have on the pivot…

Periods Table

Let’s try that [Period Num] column.  Notice how it uniquely identifies each period, regardless of Year?  It’s the unique key for this table, so if VALUES() works, that *should* beat ALL() even if ALL() is applied to the entire table.

Let’s try it:

[ALL then VALUES of field not on pivot] =
     [Total Sales](ALL(Periods),VALUES(Periods[Period Num]))

image

…and the result is IDENTICAL to the base sales measure.  The ALL() is completely overridden.  Cool huh?  VALUES operates against the underlying tables, and merely takes into account the filters from the current pivot cell.

Good thing to know.  I’ve been taking advantage of that for a long time, it is very helpful.

OK, this post has run long so I will save ALLEXCEPT for the next one.


The Great Precedence Project (Filter Rules for DAX)

June 14, 2011

 
Richard Pryor wrote most of Blazing Saddles, explaining why it's the best movie Mel Brroks ever made

“Wait a minute. There might be legal precedent. Of course. Land snatching. Let’s see, land, la-land…see snatch. snatch…snatch…ah! Haley vs. United states. Haley 7, United States nothing. You see, it can be done!”

 

 

Keeping filters straight as they pile up

Some example questions to ponder:

  1. Within a FILTER or a CALCULATE, does the order of the filter arguments ever make a difference?
  2. Does using an ALL inside of a FILTER yield different results than using an ALL inside of a CALCULATE?
  3. Measures that are used inside of a FILTER or DATESBETWEEN are never impacted by external filter-setting forces are they?  Or are they?

I went through several months without learning anything new about DAX, and that seemed to me like a Good Sign.  A sign that perhaps I now knew everything I needed to know – not everything I could know, but everything I would ever really need to know.

But then a series of new scenarios presented by clients opened up my eyes to some very “squishy” places in my understanding.  I got everything working for the clients, but I took a note to dig in later and see if I could capture everything into a series of new “rules.”

In the meantime I fired off a few emails – specifically to the product team and “The Italians” – and got some initial advice (the Italians were particularly helpful).  But both of those parties understand this stuff almost TOO well – and what Excel people like me need, I think, is to learn from someone who struggled initially.

So I propose to be that person – who struggles with questions like the above, then triumphs, then explains.  I am going to post my findings as I go, but rest assured that it’s not an entirely selfless act – forcing myself to explain things is the way I best learn things myself Smile

It’s gonna take a series of posts to cover this.  And frankly, I’m not done discovering all the rules yet.  Let’s dig in.

Setting the stage:  two simple measures

We need a base to start from.  So, here’s a simple pivot:

Starting Point for PowerPivot Filter Precedence

The measure there is nothing fancy – just a simple SUM.  All of the fields on rows are from the Periods table:

Periods Table

Now let’s add an ALL(Periods) version of the measure:

[ALL Period Sales] = [Total Sales](ALL(Periods))

Starting Point for PowerPivot Filter Precedence with ALL

And if the syntax I am using, [Measure](<filters>), makes you scratch your head, remember that it is just shorthand syntax for CALCULATE([Measure], <filters>) – they are 100% the same, I just prefer the shorthand syntax for readability.  (This is a controversial topic within the DAX community however, and others disagree with me, but I’m right, damnit! Smile)

Rule #1:  “ALL()” and “Table[Col]=Val” Work Together

Let’s add a third measure, one that has the ALL() from above, but also sets Year=2009:

[ALL Period Sales Set to Year 2009] =

     [Total Sales](ALL(Periods),Periods[Year]=2009)

And the results:

ALL and Simple Filters Work Together

Neither clause “overruled” the other.  They were both applied.  We can think of this as “all Periods filters were completely removed, but then Year was set to 2009.”

Of course, thinking of it that way makes you wonder…  what if I reverse the clauses and have the ALL second?  Will that then override the simple filter for 2009?  I mean, I hope that doesn’t matter.  Marco and Alberto say it doesn’t matter.  But I need to see it myself…

Rule #2:  Filter order does NOT make a difference within a CALCULATE

OK, let’s reverse the order:

[ALL Period Sales Set to Year 2009] =

     [Total Sales](Periods[Year]=2009, ALL(Periods))

And yes, same result:

Order does NOT matter in filter clauses

OK, that’s reassuring.  But it gets more interesting from here.  In the next post, I’m going to show you that when ALL() meets VALUES(), there arguably IS a winner, and that ALLEXCEPT() and VALUES() yield outcomes that I did not expect.


Guest Post: Nth Occurrence DAX Formula

March 31, 2011

David Hager is At It Again

A short two weeks after his truly creative Measure Catalog Macro post, David Hager is back with a heavyweight DAX post.

This time he is exploring an area of the product that I have very little experience with, namely the EARLIER() function.  On the DAX spicy scale, EARLIER() is a 5 in my opinion, but the other reason I haven’t done much with it is because I’m pretty sure it’s only useful in calc columns.  At Pivotstream we do all of our calc columns in SQL (for several reasons I won’t go into here).

But not everyone has colleagues ready to write calc columns in SQL for them now do they?  In fact, the Great PowerPivot Survey is reinforcing that for me (I’m going to wait until next week before I start summarizing results, but it sure has been interesting so far).

OK David, take it away.

CREATING AN Nth OCCURRENCE DAX FORMULA

By David Hager

CLICK HERE TO DOWNLOAD THE WORKBOOK IN THIS POST

When you have a growing database of customer records (such as purchases at a store), it is useful to have a method for tracking the previous visit for each customer.

A formula of the following type can be used for this.

=CALCULATE(MAX(Table1[DATE]),FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE]))

(See http://gavinrussell.wordpress.com/2010/09/23/powerpivot-dax-session-notes-2-previous-row/ for the original idea behind this post).

image

However, note that the previous visit date is returned for every customer visit (except the initial one) for each customer. In some cases, this may be the desired result, but usually only the last customer visit record would add the previous visit as useful information.

While trying to make a formula that would do this, I thought that it would also be nice if I could expand that concept to find an Nth previous visit. Since I did not want to hard-code an Nth value that would have to be updated manually in the PowerPivot cube, I had to invent a method for creating a dynamic constant.

In order to create a dynamic constant for PowerPivot (which would simply be an updatable cell value in Excel), the first step is to create an Excel table. The table consists of a Date field which can be used to create a relationship with the fact table Date field and a constant field. In each row of the constant field, the same formula needs to point to the update cell (=$D$2).

image

Now, the table can be linked to PowerPivot, using the Create Linked Table feature.

Note from Rob:  Here’s another great example of a feature that I never use (Linked Table), because that linkage doesn’t refresh on the server.  But it has tremendous utility in desktop PowerPivot, so I’m glad David is showing it off here.

Two more steps remain before the dynamic constant can be used in a calculated column formula. By clicking Create Relationship in the PowerPivot window Design ribbon, a relationship can be established between the Date field in Table1 with the same in Table2. Finally, a calculated column named NTH is added to Table1 with the formula =RELATED(Table2[NTH]).

Now it is time to create the formula. The entire version of this formula is too complex to be viewed in a completely displayed form, so two parts of it are entered as calculated columns and are used as inputs into the formula shown below.

=IF(Table1[IsLastRecord],CALCULATE(MAX(Table1[DATE]),FILTER(Table1,EARLIER(Table1[Customer])=

Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE] &&Table1[IsNthPreviousRecord])),BLANK())

where

Table1[IsLastRecord]

=COUNTROWS(FILTER(Table1,EARLIER(Table1[Customer])=Table1[Customer]))=

(COUNTROWS(FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] && EARLIER(Table1[DATE])>Table1[DATE]))+1)

and

Table1[IsNthPreviousRecord]

=(COUNTROWS(FILTER(Table1,EARLIER(Table1[Customer])=Table1[Customer]))-

(COUNTROWS(FILTER(Table1, EARLIER(Table1[Customer])=Table1[Customer] &&

EARLIER(Table1[DATE])>Table1[DATE]))+1)=Table1[NTH])

IsLastRecord counts the numbers of records for each customer and compares that number to the number of records previous to the last record plus 1. If the numbers are the same, the expression is true and the last record in row context for each customer will have a date depending on whether it had an Nth previous visit (otherwise a blank).

IsNthPreviousRecord is the 3rd filter in the main DAX formula. This Boolean expression is true if the numbers of records for each customer minus number of records previous to the last record plus 1 is equal to the desired Nth previous visit.

The following picture shows the returned records for the table example when Nth is equal to 2.

image

Note that only customers a & d have 3 or more visits, so those are the only dates returned.

I have not gone into great detail into explaining exactly how this DAX formula works mainly because if you work through this yourself you will gain a greater understanding of how to construct your own formulas. Note from Rob:  Heh heh, now I STILL don’t understand EARLIER().  Nor have I built a pivot table that exploits the power of this formula. I will also leave that as an exercise to the reader.

As you might imagine, there are many ways that this formula methodology for Nth occurrence can be used. Also, the technique for using a dynamic constant is pretty neat. I hope this helps in your future PowerPivot model-building projects!