Finding matching or duplicate or "contains matches" across tables in Power Pivot

We Want to Flag Rows in our Companies Table (on left) When They
Contain a Keyword from our MatchList Table (on right)

These are a few of my favorite things…

Perhaps the only thing that makes me happier than a new “X” function (I still badly want a CONCATENATEX) is “inventing” a new one (like we’ve seen with PRODUCTX).

The other day I was looking at a Power Pivot model and thinking “gee, it sure would be nice to have a CONTAINSX.”

Turns out we can “make” our own CONTAINSX using SUMX.

Does this row’s value in list one also appear in list two?

imageReduced to this generic form, this a pretty common overall need.

Sometimes you can do this VERY quickly in Power Pivot by relating the two tables, and then writing a =RELATED calc column in table 1 to see if it has a matching value in table 2.

But there are times when that doesn’t work.  For instance, when you’re not looking for an exact match, but a “contains” match.

Cutting to the chase

I won’t drag this one out.  Let’s give you a formula (I won’t call it THE formula, because there are definitely other ways and probably better ways).

  [Is this company a metals company] =

          ) > 0,
      “Probably Not”

In English

  1. The SUMX part – Step through every row in MatchList.  For each row in MatchList, evaluate the FIND function.  FIND will return a number.  Sum up all the values you get from FIND.  (There will be 5 values to sum up, because there are 5 rows in MatchList.
  2. FIND – for this row of MatchList, see if you find that substring in the current row of Companies.  If you do, return the number of the position where that substring is found, like FIND always does.  If you don’t find a substring match, return 0.
  3. UPPER – I did this to make the FIND case-INsensitive.  If you want it to be case-sensitive, remove the UPPERs.
  4. IF – if you get 0 back from the SUMX, that means no matches were found, so return “Probably Not.”  If anything other than 0 comes back, there was at least one match (maybe more!), so return “YES!”

No relationship!


No Relationships Are Required for this Technique

A million variations

I bet there are lots of practical ways to twist this:  Return the number of matches, rather than Yes/No.  Case-sensitive versus insensitive.  Begins with, ends with, exact match.  Go nuts.

Grab the workbook

Seriously, just grab it already. Smile

Rob Collie

One of the founding engineers behind Power Pivot during his 14-year career at Microsoft, and creator of the world’s first cloud Power Pivot service, Rob is one of the foremost authorities on self-service business intelligence and next-generation spreadsheet technology.

This Post Has 42 Comments

  1. Here is another version of script with the same functionality just using different functions, and the fact that 0=FALSE:

    Companies[Companies], 1, 0)
    “Probably Not”

  2. Hey Rob. Brilliant as always. If you’re looking for a CONCATENATEX, try this:

    New table called Matchlist2 with 1 field called Keyword2 and a single entry of “Works” (just for this example, but you can use as many second words as you like). Want to concatenate Keyword1 with Keyword2 to find “IronWorks” and “SteelWorks”.

    CONCATENATE(MatchList[Keyword], MatchList2[Keyword2])
    SEARCH([NewKeyWord],Companies[Company], 1, 0)
    ) > 0,
    “Probably Not”

    Other good thing with SEARCH is that it will accept wildcards (? for single digit wildcard, * for string wildcards)

    Keep up the great work.

    Sasha – you might be missing a “>0”


      1. Well, isn’t that special, Sasha. I hadn’t thought about it like that. Sorry about that and thanks for pointing it out. Cheers

  3. Instead of simply returning a “YES” or “NO” match, possible to make it return the matching text found? It is not uncommon to be in situation where we would like to analyse data by certain key texts found within a field.

    1. FANTASTIC question! Yes, I have done that sort of analysis before, years ago, when breaking down user behavior trends when I worked on Bing back at Microsoft.

      My SUMX version will NOT handle this, sadly. But if we take Sasha’s approach from above and modify it for use in conjunction with FIRSTNONBLANK, we have a winner:


      Now, if there were multiple matches, this will only return one. The first one alphabetically, I believe. We would need CONCATENATEX to return all of the matches in one string.

      I think this is worth a followup blog post 🙂

      1. this is shorter than a formula i copied off someone for use in Excel, and that also returns the first match found. Even better if there is a version to concatenate them. Off to digest above, real nice!

        1. argh i got that “naked column” error on the “Companies[Company]”.. did i do something wrong?

          Pasted from my workbook:
          Values(‘Key'[Key words]),
          Search(‘Key'[Key words],’Messy'[Messy field],1,0)

          1. Finally figured this was supposed to be used in the calculated column – silly stuffing it in “Measures” lol. My apologies for the unnecessary confusion. *blame job depriving me of time to properly grasp this amazing toy*

      2. Yours and Sasha’s examples all work wonderfully. I modified your “FIRSTNONBLANK” formula with the filter argument in Sasha’s formula producing this:
        =FIRSTNONBLANK(FILTER( MatchList, SEARCH( MatchList[Keyword], Company[Company], 1, 0) ), 1)
        It produced an error. Somehow the VALUES function is needed but I’m scratching by brain to understand why.

        1. FIRSTNONBLANK requires a *column* – it doesn’t know what to do with a table. VALUES() returns a column. That’s the reason.

          1. Thanks! As I was noodling on this over the weekend, I wondered if that was the case. Thanks for clarifying. I was thinking of creating some sort of reference chart/list of what can get plugged into where, sort of like legos. If you know where something like this already exists, that would be wonderful.

      3. So, I have been wondering & working on how to create a concatenatex that will not only show the first value matchced (vs Yes and Probably Not), but it will show all values that matched. Has this been solved yet?

      4. Rob how badly will using a calculated column effect performance of the powerpivot using the above filtere seach combination (ie is it a hit to the CPU versus a memory/ storage hit if you were to load the substring from the original dataset into the datamodel)? Can you provide your real world experiences?

  4. Wow. I’m always encouraged to see examples of things that actually seem easier in Power Pivot than in Excel. This looks like of those. I have run into this keyword search thing before (and would always wonder why it was not a built in function in Excel, e.g. =Contains(string_to_search, lookup_array) or something. In Excel the solutions seem to boil down to: a/ writing a UDF, or b/ using lots of helper columns with the FIND() or SEARCH() functions (conceptually easiest but occupies lots of real estate – one column per keyword-to-find), or c/ an array formula (nifty, concise, efficient, and usually scary looking). Because the ultimate end users of my works are typically not me, and frequently non-power users of Excel who are slightly terrified of UDFs or arrays, I always have to evaluate the level of ‘explainery’ required versus benefits of efficiency and niftiness. I look forward to discovering more examples of Power Pivot offering BOTH efficiency and ‘splainability.

    Also, Tenacious D totally rules, and the pic here could have also been effective on the previous post. “Tableau has rocked for a long, long time….”

  5. Hi
    God post, but what if I wanted to lookup a value from another table?
    I would like to analyse some posts from a bank statement. I would to group them into categories (Electronics, food, travel). I have a model with a CSV and a linked table. The linked Table is a list of search strings and a matching category.
    I can use the search string and return that, when it is found in the CSV text field. But I would like to get the category field. Is this possible?

  6. Hi,
    can you tell me please how to do exact match search? Please. I don’t know how to do this without related tables. I need it, please help me

      1. Thank’s for the answer, lookupvalue works from table B to table A where related works, or i need it from A to B, where related doesn’t works.

        However, i just find a way to tour around

        =calculate(COUNTROWS(B),filter(B,B[Champ 1]=A[Champ 1]))

        after i filter where it’s empty and it works.

  7. For those of you trying to get multiple text values into the value area of PowerPivot using FIRSTNONBLANK who are wishing for ConcatenateX function. try doing the concatenation of the one to many relationship in SQL using for xml path, before it gets to PowerPivot. In this example, one date has many PlantComments.Comment:

    ((SELECT ‘. ‘ + dbo.PlantComments.comment AS [text()]
    FROM dbo.PlantComments
    Where = FOR XML PATH(”)), 1, 1, ”) AS ReportComments
    FROM dbo.PlantComments INNER JOIN
    dbo.PlantComments a ON = a.Date

    1. The PlantComments table:

      ID Date Comment
      1 6/08/2014 is this working?
      2 27/08/2014 test
      3 27/08/2014 test2

      Results of the SQL query:

      Date ReportComments
      2014-08-06 00:00:00.000 is this working?
      2014-08-27 00:00:00.000 test. test2

  8. Hi,
    i have two unrelated powerpivot tables and i’m trying to compare composite string keys from column X in table A and column Y in table B. When matching occurs i want to write all values in the same row from another column in table B (ex column Z) to table A.

    Is it possible with dax expression?

    I’d greatly appreciate it if you kindly give me some help to solve this problem.

  9. Very Great Stuff!!! I was reading this subject in the Alchemy book but am having trouble customizing for my situation. Can anyone help?

    I have a table of Appointments with a field called Resource. Resource is a concatenation of Provider Names used for that appointment. It might contain one or more Providers Names. Examples include:
    Dr. Smith
    Dr. Smith, Nurse Jones
    Nurse Jones, Nurse Ratchet
    Nurse Ratchet, Dr. Kildare, Nurse Jones

    I have a table of Provider Names:
    Dr. Kildare
    Dr. Smith
    Nurse Jones
    Nurse Ratchet

    In a PivotTable with Provider Names as Row items, I want to calculate how many Appointments there were for each Provider. Using the few records listed above, the PivotTable would look like this:

    Dr. Kildare 1
    Dr. Smith 2
    Nurse Jones 3
    Nurse Ratchet 2
    Grand Total 4

  10. CALCULATE(DISTINCTCOUNT([Deal ID]),isnumber(SEARCH(“Brand Awareness – Anniversary”,SalesforceDownload[Purpose],1))=TRUE())

  11. Do you know how to perform a distinct count when a separate column contains a string?
    My data example is:

    1213 sat
    1014 gef
    1456 sat,gef
    1213 sat,gef
    1213 gef

    I want to count how many distinct values are in the first column where the 2nd column contains “gef”. i.e. the answer would be 3.

    I tried this but it didn’t work:

  12. Thank you for this great post ! I have a similar usecase, except I would like to add a slicer on the keywords to be able to select one or more. The measure explained in the video works only if you select one. Any ideas on how to be able to select multiple keywords in the slicer ? Thanks for your help ! (should we filter the table keyword in the calculated column somehow ?)

  13. How do I return a different column? In the formula below, I want to return Companies[Value] instead of Companies[Company]…so like a Vlookup. Thanks


  14. Great information. Greatly appreciated.

    Lets say we have a column called Drive_NME containing thousands of values of database Server Drives such as:
    D:\DBTranlog03 etc…
    D:\TempDB02 etc…
    D:\DBData04 etc

    and we want to report everything that contains DBTranlog as “DBTranLog”, “TempDB as “TempDB”, DBData as “DBData” etc., so that we can group each category of lump data together. In SQL I can do this by:
    when Drive_NME like ‘%DBData%’ then ‘DBData’
    when Drive_NME like ‘%TranLog%’ then ‘TranLog’
    when Drive_NME like ‘%TempDB%’ then ‘TempDB’
    when Drive_NME like ‘%Cube%’ then ‘Cube’
    when Drive_NME like ‘%Backup%’ then ‘Backups’
    else Drive_NME
    end [Drive_TYP],

    But I am not sure how to do this in powerpivot.

  15. Is there any way to make this work if you have more than one column in your matchlist (company name keyword for contains and company country for exact match), so if I have company and country fields in my original table and want to identify matches where a keyword contains + exact match country are found across both tables?

  16. Hi Rob/David,

    I do have a question. I Table 1 These are the values
    States Task IDs Update
    1 A 12
    3 B 15
    2 A 13
    1 C 12

    Table 2 Below are the values.
    States Task No Of Ids to be updated
    1 A 40
    1 B 25
    2 A 45
    2 B 30
    3 A 45
    3 B 45
    1 C 42
    2 C 32
    3 C 32

    I need a percentage Since Task B is not completed in table 1 for state 1. The calcualtion should be (12+13)/(40+45).

    Any suggestions how this could be acheived.


  17. So here’s a challenge for someone out there.

    I have a static TABLEB which has 5 Columns, example below, comma separated which I would like to use to match the 4 specific columns from another table, which would then return the Value in Column 4, e,g, Try again, Good Job, Not Bad.

    A,B,C,D,Try again
    B,C,A,D,Good Job
    E,F,G,Z,Not Bad

    In TABLEA below, I need to match columns6,7,8,9 with columns,1,2,3,4 in TABLEB and return the value in column5 TABLEB into column10 in TABLEA, so I assume Column10 in TABLEA will hold the formula.

    1,2,3,4,5,A,B,C,D,MATCH – Try Again
    2,3,1,5,1,Z,F,A,G,NO MATCH
    4,1,5,4,E,F,G,Z,MATCH Not Bad

    I hope this is not too difficult, using PowerQuery or even DAX if need.

  18. In this example, how do you assign the keyword itself to the column instead of the “YES!” or “Probably Not”?
    Specifically, if a row contains the word “mine”, the calculated column shows “mine” instead of YES!.

    1. Joshua – the answer to your questions are in the comments up above. You have to modify the formula Rob used. See Rob’s post from 1/24/2014. That solution worked for me perfectly.

      1. Hi Jason, Thanks I found it indeed.
        But now I have another problem regarding this formula. I’ve been using the formula to find the keyword but like Rob said himself, it assigns the first value it finds in alphabetical order.
        I want to sort this by ID order instead of the alphabetical sorting of the keyword.
        Is there anyway to add a sort function to this formula ?

        FILTER( VALUES(Table1[Key Word]),
        SEARCH(Table1[Key Word], [Episode],1,0)

        I’d appreciate the feedback.

Leave a Comment or Question