A recent comment/question alerted me to the fact that I’ve never devoted a post just to this very useful (and often misunderstood) function. Time to correct that.

### The #1 Reason to Use FILTER – When CALCULATE Breaks Down

**Does this Mysterious Error Look Familiar?**

**Different Formula, Same Error**

### What’s Wrong With Those Formulas?

The thing both of those formulas have in common is that they are using a measure in the filter argument of the CALCULATE function. In both examples here, I’ve highlighted the offending measure in yellow.

CALCULATE([Sightings per Year], [Avg Sighting Length in Mins]>6)

CALCULATE([Sightings per Year],

Observations[TTL Min]>[Avg Sighting Length in Mins])

In the first formula, I was trying to use a measure on the left side of the comparison, and in the second, I was trying to use a measure on the right side of the comparison. Both are illegal.

**CALCULATE expects its filter arguments to take the form of Column=Fixed Value, or >Fixed Value, <= Fixed Value, etc., where “Fixed Value” is a specific number (like 6), a specific text string (like “Regular”), or a specific date.** So my first formula violates the rule that a column name is required on the left. And my second formula violates the rule where a fixed value (not an expression or a measure) is required on the right.

**CALCULATE refuses to let you use variable expressions like measures in these filter arguments largely because “vanilla” CALCULATE is intended to always be fast**, and once you start including expressions in these comparisons, your formulas might run a LOT slower. So this is a good rule really – it forces you to stop and think before accidentally doing something bad. The error message, of course, could and should be a lot better.

For a bit more explanation on this, **see this brief post**.

### What’s the Solution?

If you look at those two illegal formulas above, they both reflect a perfectly valid intent. The first formula is attempting to ask for “how many sightings per year would I report if we just counted sightings that lasted more than 6 minutes” and the second is asking for “how many sightings per year are above average in length.”

I’m almost regretting my selection of those examples, because they are a bit more complex than necessary to make the fundamental points. But hey, too late now to change them, so I’ll move quickly.

In the first example, the Avg Sighting Length measure is actually based on a column in my Observations table – each UFO sighting has a [TTL Min] column. So I could rewrite that filter in the calculate as Observations[TTL Min] > 6 and everything is fixed.

But let’s say I wanted to filter out entire States where the average sighting length was > 6. Since I don’t have a column in my States table that does that, it’s sensible to use the measure, and that forces me to use FILTER, because **FILTER does allow me to use measures in my comparisons:**

CALCULATE([Sightings per Year],

FILTER(States, [Avg Sighting Length in Mins]>6)

)

See that? The highlighted section took one of the filter arguments to CALCULATE and replaced it with a call to the FILTER function. The syntax of FILTER is pretty simple, but is explained below.

In my second example, where a measure was used on the right side of the comparison, the formula gets rewritten as:

CALCULATE([Sightings per Year],

FILTER(Observations,

Observations[TTL Min]>[Avg Sighting Length in Mins] )

)

So there you go. When you want to use a measure, or an expression like AVERAGE(Observations[TTL Mins]), you have to call in the FILTER function. More details follow, starting with the simplest information and moving to the most subtle of characteristics.

### How does FILTER() Work?

The syntax for the FILTER function is FILTER(*TableToFilter*, *FilterExpression*). Pretty simple.

**For simple purposes, if you understand the gist of the above, and then points 1 and 2 below, you are good to go.** If you want to understand more of the details over time, I recommend revisiting points 3-5.

**FILTER() takes a***TableToFilter*and a*FilterExpression*, and returns all rows from that*TableToFilter*that match the*FilterExpression*.- In the example above,
*TableToFilter*is ALL(Periods) - and
*FilterExpression*is Periods[Year]=MAX(Periods[Year])-1

- In the example above,
**FILTER() steps through the***TableToFilter*one row at a time.- And for each row, it evaluates the
*FilterExpression*. If the expression evaluates to true, the row is “kept.” If not, it is filtered out. - Because FILTER() goes one row at a time, it can be quite slow if you use it against a large table. When I say “large” that is of course subjective. A few thousand rows is fine in my experience. A million is not. Do not use FILTER() against your fact table.

- And for each row, it evaluates the
**The***FilterExpression*typically takes the form of Table[Column] = <expression>- The comparison operator doesn’t have to be “=”. It can also be <, >, <=, >=, <>
- The expression on the right hand side of
*FilterExpression*can be “rich.” This is VERY useful. In a simple CALCULATE, the right side of each filter expression has to be simple, like a literal number (9) or a string (“Standard”). The fact that FILTER() allows for rich expressions here is one of the most common reasons I use FILTER(). - The Table[Column] in the filter expression is a column in the
*TableToFilter*. If you are filtering the Periods table, it makes sense that you are testing some property of each row in Periods. I can’t think of a sensible reason to use a column here that is NOT from*TableToFilter.*(Insert “boot signal” here, maybe the Italians can address this).

**FILTER() ignores everything else going on in your formula and acts completely on its own.**- For example, our overall formula sets ALL(Periods) as the first argument to CALCULATE.
- The FILTER()’s that come after that do NOT pay any attention to other arguments however, including that ALL(Periods).
- In other words, the FILTER() functions are still operating against the original filter context from the pivot! If the pivot is sliced to Year=2009, then the FILTER() function starts with the Periods table already pre-filtered to just 2009.
- This is why each of my FILTER()’s uses ALL(Periods) for
*TableToFilter.*I have to repeat the “expand” step so that my FILTER() is also working from a clean slate.

**Even though each FILTER() operates on its own, their results then “stack up” in the overall formula.**- Even though FILTER() RETURNS a set of rows that matched the
*FilterExpression,*it actually REMOVES rows from the overall filter context. - This sounds tricky but really it isn’t.
- Let’s say our
*TableToFilter*contains 6 rows: A, B, C, D, E, and F. - And our overall formula contains two FILTER() clauses that both operate on the same
*TableToFilter*, just like our overall formula near the beginning of this post. - Let’s also say that the first FILTER() returns rows A, B, C, and D.
- And the second FILTER() returns rows C, D, E, and F.
- The net result is that only rows C and D are left “alive” in the overall filter context of the formula.
- So one way to think of this is that FILTER()s “stack up” on top of each other.
- Another way to think of it is that even though the first filter RETURNED rows A, B, C, and D, its real effect was to
*REMOVE all other rows*(E and F) from consideration.

- Even though FILTER() RETURNS a set of rows that matched the

Nice post!

“Do not use FILTER() against your fact table.”

“…the FILTER() functions are still operating against the original filter context from the pivot! If the pivot is sliced to Year=2009, then the FILTER() function starts with the Periods table already pre-filtered to just 2009.”

The above statements are worth highlighting.

in fact table, if “filter” works slow but still accurate and reliable, then should be fine……….

Hey, if you don’t mind (or don’t notice) the speed difference, I agree.

thanks for the confirmation :p

Wow, thanks so much for this post. It all looks obvious now! Dax takes time to get used to coming from an Excel background, but definitely worth the effort. This is a bit like being able to write dynamic sql queries directly in Excel cells. Thanks for your very clear explanations and (infectious) enthusiasm!

I am using 5 separate slicers to specify unrelated parameters (i.e. x < Age < y, Gender = [Male and/or Female], etc.) this results in a pretty complex filter function, but it works.

My code seems very inefficient though, as I have a collection of measures which all require the same filter function that I have cut and pasted into each definition. Is there a way to resolve the filter just once and use the result across multiple measures?

I am unaware of a bulk way to define the same FILTER and re-use it, at least not in PowerPivot.

Just making sure though: do you need FILTER() in all of those cases? I mean, I only use FILTER() for slicer purposes when I am explicitly doing “disconnected slicers” which is an advanced technique.

Your Gender slicer seems like it might just be doable as a connected table. (But the x < Age < y part does seem to be a textbook disconnected example).

Can Powerpivot filters be used by regular Excel users who do not have PowerPivot?

In excel 2013, yes. In 2010, you need PowerPivot for SharePoint. Actually the SharePoint thing is pretty good for 2013 too.

Hello, I’m trying to filter cording to values in rows, to create a Rank.

This would be an example of my table.

id sale amount place date

1 $ 75,00 Santa Fe 13/10/2014

2 $ 81,00 Cordoba 02/07/2014

3 $ 90,00 Pilar 21/11/2014

4 $ 73,00 Cordoba 02/05/2014

5 $ 22,00 Pilar 19/12/2014

6 $ 97,00 Santa Fe 06/02/2015

7 $ 49,00 Pilar 29/09/2014

8 $ 42,00 Cordoba 29/04/2015

9 $ 44,00 Pilar 02/07/2014

10 $ 55,00 Brasilia 14/01/2015

11 $ 38,00 Santa Fe 29/11/2014

12 $ 92,00 Santa Fe 24/06/2014

13 $ 91,00 Santa Fe 02/08/2014

14 $ 54,00 Cordoba 21/05/2014

15 $ 12,00 Cordoba 02/09/2014

16 $ 56,00 Santa Fe 16/08/2014

17 $ 83,00 Santa Fe 15/05/2014

18 $ 72,00 Santa Fe 27/11/2014

19 $ 79,00 Pilar 30/03/2015

20 $ 91,00 Pilar 19/05/2014

21 $ 43,00 Cordoba 30/03/2015

22 $ 26,00 Pilar 10/12/2014

23 $ 33,00 Cordoba 31/12/2014

24 $ 25,00 Brasilia 05/01/2015

25 $ 33,00 Pilar 18/05/2014

26 $ 29,00 Cordoba 24/09/2014

27 $ 87,00 Pilar 22/03/2015

28 $ 35,00 Pilar 14/09/2014

29 $ 18,00 Cordoba 04/09/2014

30 $ 78,00 Cordoba 05/12/2014

31 $ 85,00 Cordoba 13/07/2014

32 $ 22,00 Brasilia 26/10/2014

33 $ 97,00 Cordoba 05/01/2015

34 $ 67,00 Cordoba 14/12/2014

35 $ 99,00 Santa Fe 24/10/2014

36 $ 61,00 Pilar 17/05/2014

37 $ 19,00 Pilar 31/10/2014

38 $ 91,00 Brasilia 02/12/2014

39 $ 52,00 Brasilia 05/09/2014

40 $ 12,00 Brasilia 02/12/2014

41 $ 14,00 Pilar 03/12/2014

42 $ 97,00 Pilar 22/05/2014

43 $ 34,00 Santa Fe 28/09/2014

44 $ 80,00 Santa Fe 27/06/2014

45 $ 74,00 Pilar 06/04/2015

46 $ 48,00 Brasilia 22/09/2014

47 $ 42,00 Cordoba 07/01/2015

48 $ 37,00 Cordoba 24/03/2015

49 $ 45,00 Cordoba 26/04/2015

50 $ 54,00 Santa Fe 10/07/2014

I want to create a Rank according to sales amount within each place. So I’m using a Rankx function, but it ranks all the rows, I want to use a filter, so I can have a table with only the rows that have the same place.

ex:

10 $ 55,00 Brasilia 14/01/2015

24 $ 25,00 Brasilia 05/01/2015

32 $ 22,00 Brasilia 26/10/2014

38 $ 91,00 Brasilia 02/12/2014

39 $ 52,00 Brasilia 05/09/2014

40 $ 12,00 Brasilia 02/12/2014

46 $ 48,00 Brasilia 22/09/2014

So when I apply th rankx function it Rank within each place and not within all places.

Thank you verymuch for your time!

Hi Rob,

In the section entitled “How does FILTER() Work?” it looks like the functions in the examples there do not match the upper portion of the rest of your post. For example ALL(Periods) is not in the above example as mentioned.

I am not trying to be ticky tacky, but this seems to be one of the intelligent posts on how filter() actually works. Could you look at the post and see if I am talking crazy or am missing something. If not would you mind updating so we can reference the same material. This post is a bit older but I think may be worth fixing if it indeed does have a problem.

Love the work you are doing to promote the best thing that’s happened to Excel!

Kind Regards,

John Bradley

I get the error “The “02/24/2015″ string cannot be converted to the date type”. Why can’t I get a DAX formula to work in this situation

CALCULATE(

SUM(IncidentDetails[Number of Incidents]),

IncidentDetails[Reported Date]>=3/5/2015

)

I’ve tried various filters, if statements but I keep getting the error. I’ve gone back to the data source and Data table and tried to convert but I get the error. Tried using the FORMAT() option. Sounds like I cannot do a formula that incorporates a specific date.

I even get the error when trying to use this formula but get error still:

=CALCULATE(

SUM(IncidentDetails[Number of Incidents]),

FILTER(‘IncidentDetails’,

IncidentDetails[Reported Date]=MAX(IncidentDetails[Reported Date]

)

)

)

Thanks so much Rob, you’ve just illuminated the darkness for me

I have a similar issue as Nathan above. I tried to apply Rob’s solution to a problem I had but something’s not working. This may not be the place where to post the question, in which case, could you be so nice as to let me know where is better?

A simple table with Customer names in first column, transaction dates in second column and qty sold in third column.

I need to extract, for each customer, the last transaction date and qty, as well as the penultimate (the one before the last) transaction date and qty. There are other columns for later calculations, but right now I focus on this problem.

I created a first measure

LastSaleDate:=max(sales[transdate])

Then I did the following

Penultimate1:=CALCULATE(MAX([TRANSDATE]),filter(sales,Sales[TRANSDATE]<max(sales[transdate])))

This works fine. It was the solution I found before discovering this post.

So I try using the principle of Rob's post here (using CALCULATE and FILTER) where FILTER is required in order to pass a "variable" such as the measure LastSaleDate

Penultimate2:=CALCULATE(MAX([TRANSDATE]),FILTER(ALL(Sales),Sales[TRANSDATE]<[LastSaleDate]))

this doesn't work. I get as a result. I tried changing formats, using a fixed value instead of the LastSaleDate measure, tried to work out if it is the expression part of the CALCULATE function that could be wrong ….. And I have no idea why this doesn’t work.

Can someone help to identify why the concept explained by Rob in this post doesn’t apply in this case???

Thank You

Christine