Becoming one with CALCULATE()

March 20, 2014

Guest Post by Scott Senkeresty

Intro from Rob

Hey, it starts out simple and powerful:  CALCULATE is the SUMIF you always wished you’d had.  It works in pivots.  It’s the “anything IF.”  It’s amazing, really, how many doors it opens.

Of course, CALCULATE is designed to be powerful in ways we can’t even IMAGINE in our first day/week/month of using it.  You can spend years discovering all the things it can do – and that’s a good thing!  But sooner or later you’re going to hit something with CALCULATE that makes you scratch your head – why is it returning THOSE results?

I myself entered this twilight zone with the Precedence Project – a series of posts that I quickly abandoned.  It turns out that, practically speaking, you don’t need to achieve deep theoretical understanding of this stuff in order to achieve great results.

Below, however, Scott does a great job of resolving those mysteries.  And he does so by “channeling” two old friends who live at the base of the Alps.  Take it away, Scott…

Going to “Graduate School”

image

All right, so I’ve read Rob’s book a few times.  (Heck, I am credited as tech editor on it.)  I’ve devoured PowerPivotPro University.  So now what, I ask Rob?

“Go forth and conquer – data is your ocean,” is his answer.  He’s a practical sort of guy.  Me, though?  I’m never satisfied until I’ve completely torn the machine apart.

So, as I hinted in my last post, I went to graduate school and spent a few intense days engrossed in Marco and Alberto’s book.

 

Read the rest of this entry »


The GFITW “Loses” an ALL()

November 7, 2013

 

Yes, We’ve Seen This Image Before and I Am Sure We Will See it Again

Jump in the Wayback Machine…

In the Spring of 2011, I dove into a Power Pivot project that I thought was going to be simple, but even today remains the most complex thing I’ve ever done in DAX.  I think it’s fair to say that the experience, at the time, was traumatizing.  (The client’s business logic itself was/is incredibly complicated.  It’s 100% legitimate, but I think barometric pressure might be factored into their budget/actuals ratios.  Kidding.)

But like many difficult experiences, a lot of good came of it as well:

  1. I learned a ton – it forced me to advance my Power Pivot knowledge significantly
  2. It demonstrated to me that Power Pivot essentially had no “ceiling” – it could handle almost anything
  3. It became a Microsoft case study
  4. It “spawned” the GFITW.

Ah yes, the Greatest Formula in the World.  The solution to all our custom calendar needs, and a pattern I’ve repeated hundreds of times since.  On the blog, in the book, in client workbooks, everywhere.

Well it turns out, the GFITW could afford to go on a diet.

Um, Yeah.  The First ALL Isn’t Necessary (But Doesn’t Cause Problems)

Here’s the “classic” GFITW pattern:

Read the rest of this entry »


The Greatest Formula in the World, Part One

December 6, 2011

 
image

Play Write the greatest song formula in the world, or I’ll eat your souls.”
-From “Tribute,” by Tenacious DAX

A Do-It-Yourself Time Machine

You remember the Great Precedence Project?  Well, I have shelved it for now, for two reasons:

  1. Creating an exhaustive list of all the precedence rules in PowerPivot formulas was going to be…  exhausting.
  2. I didn’t really need to boil the philosophical ocean in search of deep underlying meaning, because on further inspection, I already had the Greatest Formula in the World, and needed no other.

So what is the GFITW about?  It’s about time navigation.  It’s useful when you have a custom calendar and the time intelligence functions are therefore not so helpful.

The GFITW is a do-it-yourself time machine.

Year on Year Sales With a Custom Calendar?  Yes We Can!

In that post linked above, which happened to be the last installment of the precedence project, I was calculating a running total, and doing so without benefit of the fancier functions like DATESYTD etc. (which rely on the “real” calendar, not your business calendar).

OK, so now let’s do a “Year on Year” sales measure, sometimes called a “Year over Year” measure.  And again, our “calendar” table isn’t a true calendar at all:

Semantically this IS a calendar but functions like PREVIOUSYEAR can't use it

A Table of Periods (Not Dates) Means You Cannot Use the Built-In
Time Intelligence Functions to Calculate Year-on-Year Sales

And the Sales table is very simple:

Sales Table linked to a period number, not linked to a date

Sales Table is Linked to the Periods Table by the Period Num Column

And I have this basic pivot already set up:

image

Year and MerchPeriod on Rows

So, how do I write a measure that “fetches” the sales from last year?

Let’s Skip Some Steps

I am going to be honest with you:  it took me a very long time to figure this out.  The formula looks reasonably simple, and it is, but I had to try a million variations before I got it right.  Literally, it took me about a week to refine it.

So rather than walk you through that painful process, let’s skip ahead and show you the GFITW in all its glory.  Then in the next post I can show you what NOT to do, and we can also fine tune it.  OK?  OK.

So here it is, the Greatest Formula.  In the World:

=CALCULATE([Total Sales],
   ALL(Periods),
   FILTER(ALL(Periods),
      Periods[Year]=MAX(Periods[Year])-1),
   VALUES(Periods[MerchPeriod])
)

And its results:

image

No, it’s not perfect yet.  We will need to calculate percentage growth of course, rather than merely fetching last year’s sales.  And that given that 2011 is not yet complete, I’m a little squeamish about Last Year Sales returning ALL of 2010’s sales (the $5.9M number).

But for now, let’s count our blessings.  This formula DOES fetch the sales from last year, and it works (mostly) both at the year level and the MerchPeriod (aka Month) level.

Dissecting the Formula

Here is the formula again:

=CALCULATE([Total Sales],
   ALL(Periods),
   FILTER(ALL(Periods),
      Periods[Year]=MAX(Periods[Year])-1),
   VALUES(Periods[MerchPeriod])
)

But really, the GFITW is not a formula.  It is a pattern that you can re-use and modify.  So let’s look at it that way:

=CALCULATE(Original Measure,

   ALL(Your Custom Calendar Table),
  
   FILTER(ALL(Your Custom Calendar Table),
      Year Column Or Similar =
        
Expression that “moves” the Year Column
  
),
  
   VALUES(
Time Column That is More Granular
          Than Year, Like Month or MerchPeriod

  
)

)

Where everything in blue is “fixed” as part of the pattern, and everything in italics is something you can change.

Like I said, this is running a bit long for one post.  Come back Thursday for the rest Smile


Running Totals Without a Traditional Calendar Table

June 30, 2011

PowerPivot provides a host of great functions like DATESYTD, DATESMTD, DATESBETWEEN, etc. that are useful for calculating many things, including a running total.

But with the exception of DATESBETWEEN, I seldom get to use any of those “time intelligence” functions, for the simple reason that our clients almost never operate on a traditional calendar.

I very often find myself working with a “calendar” that looks like this:

Non Traditional Calendar

Can’t Use Time Intelligence Functions With a Calendar Like This

And sometimes I even find myself with “calendars” that don’t have any date columns in them at all:

Another Non Traditional Calendar

This Table DOES Truly Represent the Business Calendar But Has No Dates in It!

So what do you do when you still need a “Year to Date” Total?

Silly humans.  They don’t care that the data is structured one way or another under the hood.  They just keep insisting on seeing useful things, like Year to Date totals.  They don’t want to hear how the blender is constructed, they just want their daiquiri.

So, what’s a report designer to do?  Give up?  No way.  We make daiquiris anyway.

I’m going to use that second calendar above, the one that has no dates in it.  In fact this is the same data set I have been using for the Precedence Project.  (Really, this is Part Three in disguise).  So let’s return to a familiar pivot:

image

Familiar Starting Point

It has Year and MonthNum on rows, and the simple measure Total Sales.  I want to end up with this:

image

Desired Result

Getting to that desired result, in my experience, is something you either stumble upon quickly or flail around forever and never find.  In fact, finding myself in a situation like this is what triggered me to start the Precedence Project in the first place.

Cutting to the chase:  this formula works

To keep this post short and sweet, I’m just going to share a working formula.  I hesitate to call it “the” formula, because there are multiple variations that work, and some fit certain biz requirements better than others.

So here is one that works.  It has some quirks that I will iron out in the next post.

[YTD Sales]=
[Total Sales](
                FILTER(
                  ALL(Periods),
                  Periods[PeriodNum]<=MAX(Periods[PeriodNum])
                ),
                VALUES(Periods[Year])
             )

In short, this  measure clears all filters on the entire Periods table, then adds back two filters – restores the current context for Year (using VALUES), and limits the PeriodNum to be less than or equal to the current context (I will explain later why I used MAX and not just VALUES).   Note that I am using PeriodNum (which keeps increasing across years and is unique to a given Month/Year combo), not MonthNum (which is on the pivot and “resets” to 1 for January of each year).

Next Post:  Explaining the Formula, and Showing How NOT to do it

Explaining that relatively simple formula, if I do a thorough job, will consume its own post.  And I also want to cover some other approaches – ones that seem like they should work but do not.  Because the things that DON’T work are even more educational than the things that do.


Precedence Part 3: ALLEXCEPT vs. ALL w/ VALUES

June 22, 2011

Snatch - A Damn Fine Movie

 

“However…  you do have ALL() the characteristics of a dog, Gary.  ALLEXCEPT() loyalty…”

 

 

 

This will be a quick one.  I think.

In part two, we left off with the observation that VALUES() can “trump” ALL() even when VALUES() is applied to a column that is not on the pivot:

[ALL then VALUES of field not on pivot] =
[Total Sales](ALL(Periods),VALUES(Periods[Period Num]))

image_thumb5

One thing I have often puzzled over is this:  how does a combination of ALL(Table), VALUES(Table1[ColumnA]) compare to using ALLEXCEPT(Table1, Table1[ColumnA])?

In essence, those should do the same things right?  In one case, you set the whole table to ALL(), but then “restore” ColumnA to its pivot context using VALUES().  In the other case, you set every field in the table except ColumnA to ALL(), leaving ColumnA in the context set by the pivot.

They should result in the same thing, right?  Let’s check.

ALLEXCEPT() using columns that are on the pivot

[ALLEXCEPT fields on Pivot]=
[Total Sales](ALLEXCEPT(Periods,Periods[Year],
Periods[FiscalQuarter]))

There are two fields on the pivot from the Periods table – Year and FiscalQuarter, and both of those are excluded from being “all’d” in the measure.  Results:

image

OK, good news.  Those are indeed the same results as the unfiltered base measure, and that’s what we got from ALL() plus VALUES() as well.  So from this, we can say that ALL() plus VALUES() is the same as ALLEXCEPT().

Not so fast though…

ALLEXCEPT() using columns that are NOT on the pivot

Let’s try ALLEXCEPT() using a field that is NOT on the pivot.  Remember, VALUES() trumped ALL() in this case, too, and the results were the same as the unfiltered base measure.  Will ALLEXCEPT follow suit?

[ALLEXCEPT field NOT on Pivot]=
[Total Sales](ALLEXCEPT(Periods,Periods[Period Num]))

image

OK, that was a bit unexpected.  ALLEXCEPT() returned the same results as pure ALL().  So ALLEXCEPT() does *not* preserve the pivot context of columns NOT on the pivot, whereas VALUES() *does* pick those up.

I’m not sure if that’s intentional on the part of the PowerPivot team, or if it’s just a quirk.

Summary

1) ALLEXCEPT() does behave precisely like ALL(), VALUES() as long as the “exempted” columns are columns on the pivot.

2) ALLEXCEPT() does NOT preserve pivot context, however, on columns that are not on the pivot.

Oh, and one more…

No need to use ALLEXCEPT() instead of ALL() when using VALUES()

In the past, whenever I wanted to do an ALL() with a VALUES(), I typically used an ALLEXCEPT() instead of the ALL().

For instance, I would do something like this:

[Total Sales](ALLEXCEPT(Periods,Periods[ColumnA]),
VALUES(Periods[ColumnA]))

That turns out to have been an overly-careful habit.  As we’ve seen here, and in part two, VALUES() trumps the ALL(), and does not need the “help” of ALLEXCEPT() instead.

Yes, I’m sure many of you will say that’s obvious from everything covered so far, but just in case you find yourself tempted to use ALLEXCEPT() to exempt the column you are preserving with VALUES(), there’s no need.


Precedence Project Part Two: VALUES()

June 16, 2011

OK, picking up from part one

Let’s start with a simplified version of last post’s pivot – remove one of the row fields, and all of the measures but the base Total Sales measure:

image

Now let’s add a measure that sets Year to ALL:

[ALL Year Sales] = [Total Sales](ALL(Periods[Year]))

PowerPivot ALL Year Measure

Note that all of the orange cells have the same value – they are all Q1 cells and differ only on Year, so the all Year measure means each orange cell will display the total sales for Q1 across all years.  So the green cells all add up to the value in each orange cell.

Rule #3:  VALUES() Selectively “Shuts Off” ALL()

OK, we’ve played with VALUES() a few times on the blog, most notably here and here.  But previously, we’ve never used it like we are going to use it this time.  We are going to use it to preserve the original filter context and “fight back” against ALL().  Exciting huh?

Let’s add another of our previous measures back to the pivot – the one that applied ALL() to the entire Periods table:

[ALL Period Sales] = [Total Sales](ALL(Periods))

image

OK, now let’s add VALUES() to the mix in a new measure that has ALL applied to the whole Periods table, and VALUES applied to just one column in Periods:

[ALL with VALUES] =
[Total Sales](ALL(Periods), VALUES(Periods[FiscalQuarter]))

ALL Against Whole Table, VALUES Against One Field

This time, again the Q1 cells are all the sum of every Q1 in the base Sales measure.  So the VALUES function, applied to the FiscalQuarter column in this measure, overrode the ALL which was applied to every column in the Periods table (including FiscalQuarter!)

So this new measure differs from the pure [ALL Period Sales] measure at the Quarter level.  But note the blue cells.  It’s still identical to [ALL Period Sales] at the Year level.

So is ALL() still “winning” at the Year level?  No.  It’s just that, at the Year level, VALUES(Periods[FiscalQuarter]) returns  the list of all four quarter values, because all four Quarter values truly due correspond to that cell’s coordinates in the pivot (the subtotal for the Year does indeed imply all quarters).

Neat huh?

Clarification/Correction

Derek pointed out in a comment that the two values in blue actually are NOT equal.  They are off by $281.  This is because there are blank values for year – some noise in the data – and I have filtered those out using the little Row Labels dropdown:

image

If I clear that filter and allow blanks to show up in the pivot, I get:

image

Note that the two values in blue are still off by $281.  But the mysterious missing $281 is indeed accounted for by the blank year.

So…  the pure ALL(Periods) measure catches that $281.  That’s expected.  What’s unexpected, at least initially, is that the ALL(Periods), VALUES(Periods[Fiscal Quarter]) measure MISSES that $281.

Here’s why:  note that there’s only a blank Quarter under the blank year.  There are no blank quarters associated with the blue cells.  So that $281 is omitted.

Great catch Derek, this was awesome.  I probably should spend a post just explaining where those blanks Years and Quarters came from.

Back to the original post…

We can even confirm that VALUES() beats ALL() in a heads-up fight.  Rather than use ALL() against a whole table and VALUES() against one column, let’s use them both against the same single column:

[All vs VALUES Heads Up] =
[Total Sales](ALL(Periods[Year]),VALUES(Periods[Year]))

VALUES Beats ALL Heads Up

Hey look!  Our new measure returns the EXACT same results as the base sales measure.  VALUES completely shut off ALL() in a head to head fight.

Rule #4:  VALUES() Wins Even if the “Targeted” Field Is Not on the Pivot!

Maybe this one isn’t so much a precedence rule as it is surprisingly cool.  Let’s try using VALUES() against a column from the Periods table that we don’t have on the pivot…

Periods Table

Let’s try that [Period Num] column.  Notice how it uniquely identifies each period, regardless of Year?  It’s the unique key for this table, so if VALUES() works, that *should* beat ALL() even if ALL() is applied to the entire table.

Let’s try it:

[ALL then VALUES of field not on pivot] =
[Total Sales](ALL(Periods),VALUES(Periods[Period Num]))

image

…and the result is IDENTICAL to the base sales measure.  The ALL() is completely overridden.  Cool huh?  VALUES operates against the underlying tables, and merely takes into account the filters from the current pivot cell.

Good thing to know.  I’ve been taking advantage of that for a long time, it is very helpful.

OK, this post has run long so I will save ALLEXCEPT for the next one.


The Great Precedence Project (Filter Rules for DAX)

June 14, 2011

Richard Pryor wrote most of Blazing Saddles, explaining why it's the best movie Mel Brroks ever made

“Wait a minute. There might be legal precedent. Of course. Land snatching. Let’s see, land, la-land…see snatch. snatch…snatch…ah! Haley vs. United states. Haley 7, United States nothing. You see, it can be done!”

 

 

Keeping filters straight as they pile up

Some example questions to ponder:

  1. Within a FILTER or a CALCULATE, does the order of the filter arguments ever make a difference?
  2. Does using an ALL inside of a FILTER yield different results than using an ALL inside of a CALCULATE?
  3. Measures that are used inside of a FILTER or DATESBETWEEN are never impacted by external filter-setting forces are they?  Or are they?

I went through several months without learning anything new about DAX, and that seemed to me like a Good Sign.  A sign that perhaps I now knew everything I needed to know – not everything I could know, but everything I would ever really need to know.

But then a series of new scenarios presented by clients opened up my eyes to some very “squishy” places in my understanding.  I got everything working for the clients, but I took a note to dig in later and see if I could capture everything into a series of new “rules.”

In the meantime I fired off a few emails – specifically to the product team and “The Italians” – and got some initial advice (the Italians were particularly helpful).  But both of those parties understand this stuff almost TOO well – and what Excel people like me need, I think, is to learn from someone who struggled initially.

So I propose to be that person – who struggles with questions like the above, then triumphs, then explains.  I am going to post my findings as I go, but rest assured that it’s not an entirely selfless act – forcing myself to explain things is the way I best learn things myself Smile

It’s gonna take a series of posts to cover this.  And frankly, I’m not done discovering all the rules yet.  Let’s dig in.

Setting the stage:  two simple measures

We need a base to start from.  So, here’s a simple pivot:

Starting Point for PowerPivot Filter Precedence

The measure there is nothing fancy – just a simple SUM.  All of the fields on rows are from the Periods table:

Periods Table

Now let’s add an ALL(Periods) version of the measure:

[ALL Period Sales] = [Total Sales](ALL(Periods))

Starting Point for PowerPivot Filter Precedence with ALL

And if the syntax I am using, [Measure](<filters>), makes you scratch your head, remember that it is just shorthand syntax for CALCULATE([Measure], <filters>) – they are 100% the same, I just prefer the shorthand syntax for readability.  (This is a controversial topic within the DAX community however, and others disagree with me, but I’m right, damnit! Smile)

Rule #1:  “ALL()” and “Table[Col]=Val” Work Together

Let’s add a third measure, one that has the ALL() from above, but also sets Year=2009:

[ALL Period Sales Set to Year 2009] =

[Total Sales](ALL(Periods),Periods[Year]=2009)

And the results:

ALL and Simple Filters Work Together

Neither clause “overruled” the other.  They were both applied.  We can think of this as “all Periods filters were completely removed, but then Year was set to 2009.”

Of course, thinking of it that way makes you wonder…  what if I reverse the clauses and have the ALL second?  Will that then override the simple filter for 2009?  I mean, I hope that doesn’t matter.  Marco and Alberto say it doesn’t matter.  But I need to see it myself…

Rule #2:  Filter order does NOT make a difference within a CALCULATE

OK, let’s reverse the order:

[ALL Period Sales Set to Year 2009] =

[Total Sales](Periods[Year]=2009, ALL(Periods))

And yes, same result:

Order does NOT matter in filter clauses

OK, that’s reassuring.  But it gets more interesting from here.  In the next post, I’m going to show you that when ALL() meets VALUES(), there arguably IS a winner, and that ALLEXCEPT() and VALUES() yield outcomes that I did not expect.