A recent comment/question alerted me to the fact that I’ve never devoted a post just to this very useful (and often misunderstood) function.  Time to correct that.

### The #1 Reason to Use FILTER – When CALCULATE Breaks Down

Does this Mysterious Error Look Familiar?

Different Formula, Same Error

### What’s Wrong With Those Formulas?

The thing both of those formulas have in common is that they are using a measure in the filter argument of the CALCULATE function.  In both examples here, the offending measure is [Avg Sighting Length in Mins].

CALCULATE([Sightings per Year], [Avg Sighting Length in Mins]>6)

CALCULATE([Sightings per Year],
Observations[TTL Min]>[Avg Sighting Length in Mins])

In the first formula, I was trying to use a measure on the left side of the comparison, and in the second, I was trying to use a measure on the right side of the comparison.  Both are illegal.

CALCULATE expects its filter arguments to take the form of Column=Fixed Value, or >Fixed Value, <= Fixed Value, etc., where “Fixed Value” is a specific number (like 6), a specific text string (like “Regular”), or a specific date.  So my first formula violates the rule that a column name is required on the left.  And my second formula violates the rule where a fixed value (not an expression or a measure) is required on the right.

CALCULATE refuses to let you use variable expressions like measures in these filter arguments largely because “vanilla” CALCULATE is intended to always be fast, and once you start including expressions in these comparisons, your formulas might run a LOT slower.  So this is a good rule really – it forces you to stop and think before accidentally doing something bad.  The error message, of course, could and should be a lot better.

For a bit more explanation on this, see this brief post.

### What’s the Solution?

If you look at those two illegal formulas above, they both reflect a perfectly valid intent.  The first formula is attempting to ask for “how many sightings per year would I report if we just counted sightings that lasted more than 6 minutes” and the second is asking for “how many sightings per year are above average in length.”

I’m almost regretting my selection of those examples, because they are a bit more complex than necessary to make the fundamental points.  But hey, too late now to change them, so I’ll move quickly.

In the first example, the Avg Sighting Length measure is actually based on a column in my Observations table – each UFO sighting has a [TTL Min] column.  So I could rewrite that filter argument in the CALCULATE as Observations[TTL Min] > 6 and everything is fixed.

But let’s say I wanted to keep only the States where the average sighting length was > 6, and drop every other State entirely.  Since I don’t have a column in my States table that does that, it’s sensible to use the measure, and that forces me to use FILTER, because FILTER does allow me to use measures in my comparisons:

CALCULATE([Sightings per Year],
FILTER(States, [Avg Sighting Length in Mins]>6)
)

See that?  One of the filter arguments to CALCULATE has been replaced with a call to the FILTER function.  The syntax of FILTER is pretty simple, and is explained below.
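If it helps to think procedurally, here is a rough Python model of what that formula asks for. It is only a sketch of the idea, not a description of how the DAX engine works: the sample rows are made up, the helper name `avg_sighting_length` is hypothetical, and a plain count of sightings stands in for [Sightings per Year].

```python
from collections import defaultdict

# Made-up sample rows standing in for the Observations table: (state, ttl_min)
observations = [
    ("OH", 2), ("OH", 4),      # OH average = 3
    ("TX", 10), ("TX", 8),     # TX average = 9
    ("WA", 7), ("WA", 6),      # WA average = 6.5
]

def avg_sighting_length(rows):
    """Hypothetical stand-in for the [Avg Sighting Length in Mins] measure."""
    return sum(ttl for _, ttl in rows) / len(rows)

# Group rows by state, like visiting the States table one row at a time.
by_state = defaultdict(list)
for state, ttl in observations:
    by_state[state].append((state, ttl))

# FILTER(States, [Avg Sighting Length in Mins] > 6): keep a state only if
# the measure, evaluated for that state alone, passes the test.
kept_states = sorted(
    s for s, rows in by_state.items() if avg_sighting_length(rows) > 6
)

# CALCULATE then evaluates its measure over the surviving states only.
# Assumption: a simple count stands in for [Sightings per Year].
sightings_in_kept_states = sum(
    1 for state, _ in observations if state in kept_states
)

print(kept_states)                # ['TX', 'WA']
print(sightings_in_kept_states)   # 4
```

The point to notice is that the test is applied per State, using a value computed from many rows, which is exactly what a plain Column > 6 filter argument cannot express.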

In my second example, where a measure was used on the right side of the comparison, the formula gets rewritten as:

CALCULATE([Sightings per Year],
FILTER(Observations,
Observations[TTL Min]>[Avg Sighting Length in Mins])
)

So there you go.  When you want to use a measure, or an expression like AVERAGE(Observations[TTL Min]), you have to call in the FILTER function.  More details follow, starting with the simplest information and moving to the most subtle of characteristics.
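The intent behind the second example ("which sightings are above average in length") can also be sketched in a few lines of Python. This models only the stated intent, with made-up numbers; it makes no claim about how DAX evaluates a measure inside FILTER.

```python
# Made-up sighting lengths in minutes, standing in for Observations[TTL Min]
ttl_min = [2, 4, 10, 8, 7, 5]

# Stand-in for the average sighting length over the rows currently in view
avg_length = sum(ttl_min) / len(ttl_min)   # 36 / 6 = 6.0

# Keep only the sightings that ran longer than that average
above_average = [t for t in ttl_min if t > avg_length]

print(avg_length)       # 6.0
print(above_average)    # [10, 8, 7]
```

Here the right side of the comparison is a computed value rather than a fixed number, which is the case that a plain CALCULATE filter argument rejects.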

### How does FILTER() Work?

The syntax for the FILTER function is FILTER(TableToFilter, FilterExpression). Pretty simple.

For simple purposes, if you understand the gist of the above, and then points 1 and 2 below, you are good to go. If you want to understand more of the details over time, I recommend revisiting points 3-5.

1. FILTER() takes a TableToFilter and a FilterExpression, and returns all rows from that TableToFilter that match the FilterExpression.
    1. In the example above, TableToFilter is ALL(Periods)
    2. and FilterExpression is Periods[Year]=MAX(Periods[Year])-1
2. FILTER() steps through the TableToFilter one row at a time.
    1. And for each row, it evaluates the FilterExpression. If the expression evaluates to true, the row is “kept.” If not, it is filtered out.
    2. Because FILTER() goes one row at a time, it can be quite slow if you use it against a large table. When I say “large” that is of course subjective. A few thousand rows is fine in my experience. A million is not. Do not use FILTER() against your fact table.
3. The FilterExpression typically takes the form of Table[Column] = <expression>
    1. The comparison operator doesn’t have to be “=”. It can also be <, >, <=, >=, <>
    2. The expression on the right hand side of FilterExpression can be “rich.” This is VERY useful. In a simple CALCULATE, the right side of each filter expression has to be simple, like a literal number (9) or a string (“Standard”). The fact that FILTER() allows for rich expressions here is one of the most common reasons I use FILTER().
    3. The Table[Column] in the filter expression is a column in the TableToFilter. If you are filtering the Periods table, it makes sense that you are testing some property of each row in Periods. I can’t think of a sensible reason to use a column here that is NOT from TableToFilter. (Insert “boot signal” here, maybe the Italians can address this).
4. FILTER() ignores everything else going on in your formula and acts completely on its own.
    1. For example, our overall formula sets ALL(Periods) as the first argument to CALCULATE.
    2. The FILTER()’s that come after that do NOT pay any attention to other arguments however, including that ALL(Periods).
    3. In other words, the FILTER() functions are still operating against the original filter context from the pivot! If the pivot is sliced to Year=2009, then the FILTER() function starts with the Periods table already pre-filtered to just 2009.
    4. This is why each of my FILTER()’s uses ALL(Periods) for TableToFilter. I have to repeat the “expand” step so that my FILTER() is also working from a clean slate.
5. Even though each FILTER() operates on its own, their results then “stack up” in the overall formula.
    1. Even though FILTER() RETURNS a set of rows that matched the FilterExpression, it actually REMOVES rows from the overall filter context.
    2. This sounds tricky but really it isn’t.
    3. Let’s say our TableToFilter contains 6 rows: A, B, C, D, E, and F.
    4. And our overall formula contains two FILTER() clauses that both operate on the same TableToFilter, just like our overall formula near the beginning of this post.
    5. Let’s also say that the first FILTER() returns rows A, B, C, and D.
    6. And the second FILTER() returns rows C, D, E, and F.
    7. The net result is that only rows C and D are left “alive” in the overall filter context of the formula.
    8. So one way to think of this is that FILTER()s “stack up” on top of each other.
    9. Another way to think of it is that even though the first filter RETURNED rows A, B, C, and D, its real effect was to REMOVE all other rows (E and F) from consideration.
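
Points 2, 4, and 5 can be modeled in a short Python sketch. Treat it as an analogy rather than a description of the DAX engine: the tiny Periods table is made up, and `dax_filter` is a hypothetical helper that just mimics "visit each row, keep it if the test passes."

```python
# Hypothetical Periods table; each dict is one row.
all_periods = [
    {"year": y, "month": m} for y in (2008, 2009, 2010) for m in (1, 2)
]

def dax_filter(table, predicate):
    """Point 2: visit the table one row at a time, keep rows that pass."""
    kept = []
    for row in table:
        if predicate(row):
            kept.append(row)
    return kept

# Point 4: FILTER starts from whatever the pivot has already filtered.
pivot_context = dax_filter(all_periods, lambda r: r["year"] == 2009)
prior_year = max(r["year"] for r in pivot_context) - 1   # 2008

from_context = dax_filter(pivot_context, lambda r: r["year"] == prior_year)
from_all = dax_filter(all_periods, lambda r: r["year"] == prior_year)

print(len(from_context))  # 0: no 2008 rows survive a 2009 slice
print(len(from_all))      # 2: starting from ALL(Periods) finds them

# Point 5: two filters over the same table leave only rows both returned.
rows = ["A", "B", "C", "D", "E", "F"]
first = dax_filter(rows, lambda r: r in "ABCD")
second = dax_filter(rows, lambda r: r in "CDEF")
alive = [r for r in rows if r in first and r in second]

print(alive)  # ['C', 'D']
```

The loop in `dax_filter` is also why the cost grows with the number of rows in TableToFilter: every row gets its own evaluation of the test.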

#### Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology.

## This Post Has 14 Comments

1. ColinBanfield says:

Nice post!

“Do not use FILTER() against your fact table.”

“…the FILTER() functions are still operating against the original filter context from the pivot! If the pivot is sliced to Year=2009, then the FILTER() function starts with the Periods table already pre-filtered to just 2009.”

The above statements are worth highlighting.

1. Shawn says:

in fact table, if “filter” works slow but still accurate and reliable, then should be fine……….

1. powerpivotpro says:

Hey, if you don’t mind (or don’t notice) the speed difference, I agree.

1. Shawn says:

thanks for the confirmation :p

2. Mathilde says:

Wow, thanks so much for this post. It all looks obvious now! Dax takes time to get used to coming from an Excel background, but definitely worth the effort. This is a bit like being able to write dynamic sql queries directly in Excel cells. Thanks for your very clear explanations and (infectious) enthusiasm!

3. Mike H. says:

I am using 5 separate slicers to specify unrelated parameters (i.e. x < Age < y, Gender = [Male and/or Female], etc.) this results in a pretty complex filter function, but it works.

My code seems very inefficient though, as I have a collection of measures which all require the same filter function that I have cut and pasted into each definition. Is there a way to resolve the filter just once and use the result across multiple measures?

1. powerpivotpro says:

I am unaware of a bulk way to define the same FILTER and re-use it, at least not in PowerPivot.

Just making sure though: do you need FILTER() in all of those cases? I mean, I only use FILTER() for slicer purposes when I am explicitly doing “disconnected slicers” which is an advanced technique.

Your Gender slicer seems like it might just be doable as a connected table. (But the x < Age < y part does seem to be a textbook disconnected example).

4. Trisha says:

Can Powerpivot filters be used by regular Excel users who do not have PowerPivot?

1. powerpivotpro says:

In excel 2013, yes. In 2010, you need PowerPivot for SharePoint. Actually the SharePoint thing is pretty good for 2013 too.

5. Ina says:

Hello, I’m trying to filter according to values in rows, to create a rank.

This would be an example of my table.

| id | sale amount | place | date |
|---|---|---|---|
| 1 | \$ 75,00 | Santa Fe | 13/10/2014 |
| 2 | \$ 81,00 | Cordoba | 02/07/2014 |
| 3 | \$ 90,00 | Pilar | 21/11/2014 |
| 4 | \$ 73,00 | Cordoba | 02/05/2014 |
| 5 | \$ 22,00 | Pilar | 19/12/2014 |
| 6 | \$ 97,00 | Santa Fe | 06/02/2015 |
| 7 | \$ 49,00 | Pilar | 29/09/2014 |
| 8 | \$ 42,00 | Cordoba | 29/04/2015 |
| 9 | \$ 44,00 | Pilar | 02/07/2014 |
| 10 | \$ 55,00 | Brasilia | 14/01/2015 |
| 11 | \$ 38,00 | Santa Fe | 29/11/2014 |
| 12 | \$ 92,00 | Santa Fe | 24/06/2014 |
| 13 | \$ 91,00 | Santa Fe | 02/08/2014 |
| 14 | \$ 54,00 | Cordoba | 21/05/2014 |
| 15 | \$ 12,00 | Cordoba | 02/09/2014 |
| 16 | \$ 56,00 | Santa Fe | 16/08/2014 |
| 17 | \$ 83,00 | Santa Fe | 15/05/2014 |
| 18 | \$ 72,00 | Santa Fe | 27/11/2014 |
| 19 | \$ 79,00 | Pilar | 30/03/2015 |
| 20 | \$ 91,00 | Pilar | 19/05/2014 |
| 21 | \$ 43,00 | Cordoba | 30/03/2015 |
| 22 | \$ 26,00 | Pilar | 10/12/2014 |
| 23 | \$ 33,00 | Cordoba | 31/12/2014 |
| 24 | \$ 25,00 | Brasilia | 05/01/2015 |
| 25 | \$ 33,00 | Pilar | 18/05/2014 |
| 26 | \$ 29,00 | Cordoba | 24/09/2014 |
| 27 | \$ 87,00 | Pilar | 22/03/2015 |
| 28 | \$ 35,00 | Pilar | 14/09/2014 |
| 29 | \$ 18,00 | Cordoba | 04/09/2014 |
| 30 | \$ 78,00 | Cordoba | 05/12/2014 |
| 31 | \$ 85,00 | Cordoba | 13/07/2014 |
| 32 | \$ 22,00 | Brasilia | 26/10/2014 |
| 33 | \$ 97,00 | Cordoba | 05/01/2015 |
| 34 | \$ 67,00 | Cordoba | 14/12/2014 |
| 35 | \$ 99,00 | Santa Fe | 24/10/2014 |
| 36 | \$ 61,00 | Pilar | 17/05/2014 |
| 37 | \$ 19,00 | Pilar | 31/10/2014 |
| 38 | \$ 91,00 | Brasilia | 02/12/2014 |
| 39 | \$ 52,00 | Brasilia | 05/09/2014 |
| 40 | \$ 12,00 | Brasilia | 02/12/2014 |
| 41 | \$ 14,00 | Pilar | 03/12/2014 |
| 42 | \$ 97,00 | Pilar | 22/05/2014 |
| 43 | \$ 34,00 | Santa Fe | 28/09/2014 |
| 44 | \$ 80,00 | Santa Fe | 27/06/2014 |
| 45 | \$ 74,00 | Pilar | 06/04/2015 |
| 46 | \$ 48,00 | Brasilia | 22/09/2014 |
| 47 | \$ 42,00 | Cordoba | 07/01/2015 |
| 48 | \$ 37,00 | Cordoba | 24/03/2015 |
| 49 | \$ 45,00 | Cordoba | 26/04/2015 |
| 50 | \$ 54,00 | Santa Fe | 10/07/2014 |

I want to create a rank according to sale amount within each place. I’m using the RANKX function, but it ranks all the rows. I want to use a filter so I can have a table with only the rows that have the same place.
ex:

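| id | sale amount | place | date |
|---|---|---|---|
| 10 | \$ 55,00 | Brasilia | 14/01/2015 |
| 24 | \$ 25,00 | Brasilia | 05/01/2015 |
| 32 | \$ 22,00 | Brasilia | 26/10/2014 |
| 38 | \$ 91,00 | Brasilia | 02/12/2014 |
| 39 | \$ 52,00 | Brasilia | 05/09/2014 |
| 40 | \$ 12,00 | Brasilia | 02/12/2014 |
| 46 | \$ 48,00 | Brasilia | 22/09/2014 |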

So when I apply the RANKX function, it ranks within each place and not across all places.

Thank you very much for your time!

6. Hi Rob,

In the section entitled “How does FILTER() Work?” it looks like the functions in the examples there do not match the upper portion of the rest of your post. For example ALL(Periods) is not in the above example as mentioned.

I am not trying to be ticky tacky, but this seems to be one of the intelligent posts on how filter() actually works. Could you look at the post and see if I am talking crazy or am missing something. If not would you mind updating so we can reference the same material. This post is a bit older but I think may be worth fixing if it indeed does have a problem.

Love the work you are doing to promote the best thing that’s happened to Excel!

Kind Regards,

7. nathan says:

I get the error “The “02/24/2015″ string cannot be converted to the date type”. Why can’t I get a DAX formula to work in this situation

CALCULATE(
SUM(IncidentDetails[Number of Incidents]),
IncidentDetails[Reported Date]>=3/5/2015
)

I’ve tried various filters, if statements but I keep getting the error. I’ve gone back to the data source and Data table and tried to convert but I get the error. Tried using the FORMAT() option. Sounds like I cannot do a formula that incorporates a specific date.

I still get the error when trying to use this formula:

=CALCULATE(
SUM(IncidentDetails[Number of Incidents]),
FILTER(‘IncidentDetails’,
IncidentDetails[Reported Date]=MAX(IncidentDetails[Reported Date]
)
)
)

8. Andre says:

Thanks so much Rob, you’ve just illuminated the darkness for me

9. Christine says:

I have a similar issue as Nathan above. I tried to apply Rob’s solution to a problem I had but something’s not working. This may not be the place where to post the question, in which case, could you be so nice as to let me know where is better?

A simple table with Customer names in first column, transaction dates in second column and qty sold in third column.

I need to extract, for each customer, the last transaction date and qty, as well as the penultimate (the one before the last) transaction date and qty. There are other columns for later calculations, but right now I focus on this problem.

I created a first measure

LastSaleDate:=max(sales[transdate])

Then I did the following

Penultimate1:=CALCULATE(MAX([TRANSDATE]),filter(sales,Sales[TRANSDATE]<max(sales[transdate])))

This works fine. It was the solution I found before discovering this post.

So I try using the principle of Rob's post here (using CALCULATE and FILTER) where FILTER is required in order to pass a "variable" such as the measure LastSaleDate

Penultimate2:=CALCULATE(MAX([TRANSDATE]),FILTER(ALL(Sales),Sales[TRANSDATE]<[LastSaleDate]))

this doesn't work. I get as a result. I tried changing formats, using a fixed value instead of the LastSaleDate measure, tried to work out if it is the expression part of the CALCULATE function that could be wrong ….. And I have no idea why this doesn’t work.

Can someone help to identify why the concept explained by Rob in this post doesn’t apply in this case???

Thank You
Christine