“CONTAINSX” – Finding if a value in table 1 has a matching value in table 2

 
Finding matching or duplicate or "contains matches" across tables in Power Pivot

We Want to Flag Rows in our Companies Table (on left) When They
Contain a Keyword from our MatchList Table (on right)

These are a few of my favorite things…

Perhaps the only thing that makes me happier than a new “X” function (I still badly want a CONCATENATEX) is “inventing” a new one (like we’ve seen with PRODUCTX).

The other day I was looking at a Power Pivot model and thinking “gee, it sure would be nice to have a CONTAINSX.”

Turns out we can “make” our own CONTAINSX using SUMX.

Does this row’s value in list one also appear in list two?


imageReduced to this generic form, this a pretty common overall need.

Sometimes you can do this VERY quickly in Power Pivot by relating the two tables, and then writing a =RELATED calc column in table 1 to see if it has a matching value in table 2.

But there are times when that doesn’t work.  For instance, when you’re not looking for an exact match, but a “contains” match.

Cutting to the chase

I won’t drag this one out.  Let’s give you a formula (I won’t call it THE formula, because there are definitely other ways and probably better ways).

  [Is this company a metals company] =

  =IF(
      SUMX(MatchList,
           FIND(
                UPPER(MatchList[Keyword]),
                UPPER(Companies[Company])
                ,,0
               )
          ) > 0,
      “YES!”,
      “Probably Not”
     )

In English
 

  1. The SUMX part – Step through every row in MatchList.  For each row in MatchList, evaluate the FIND function.  FIND will return a number.  Sum up all the values you get from FIND.  (There will be 5 values to sum up, because there are 5 rows in MatchList.
  2. FIND – for this row of MatchList, see if you find that substring in the current row of Companies.  If you do, return the number of the position where that substring is found, like FIND always does.  If you don’t find a substring match, return 0.
  3. UPPER – I did this to make the FIND case-INsensitive.  If you want it to be case-sensitive, remove the UPPERs.
  4. IF – if you get 0 back from the SUMX, that means no matches were found, so return “Probably Not.”  If anything other than 0 comes back, there was at least one match (maybe more!), so return “YES!”

No relationship!

image

No Relationships Are Required for this Technique

A million variations

I bet there are lots of practical ways to twist this:  Return the number of matches, rather than Yes/No.  Case-sensitive versus insensitive.  Begins with, ends with, exact match.  Go nuts.

Grab the workbook

Seriously, just grab it already. Smile

25 Responses to “CONTAINSX” – Finding if a value in table 1 has a matching value in table 2

  1. Really cool! Next stop –> differences between SEARCH & FIND
    and for anyone trying to find out how to do this as a measure (calculated field), then here’s a video tutorial that you might find useful:

  2. Amanda Harley says:

    What the, what? Ridiculously awesome!

  3. Sasha Juric says:

    Here is another version of script with the same functionality just using different functions, and the fact that 0=FALSE:

    =IF(
    COUNTROWS(
    FILTER(Matchlist,
    SEARCH(Matchlist[Keyword],
    Companies[Companies], 1, 0)
    )
    ),
    “YES!”,
    “Probably Not”
    )

  4. David Churchward says:

    Hey Rob. Brilliant as always. If you’re looking for a CONCATENATEX, try this:

    New table called Matchlist2 with 1 field called Keyword2 and a single entry of “Works” (just for this example, but you can use as many second words as you like). Want to concatenate Keyword1 with Keyword2 to find “IronWorks” and “SteelWorks”.

    IF
    (
    COUNTROWS
    (
    FILTER
    (
    ADDCOLUMNS
    (
    CROSSJOIN
    (
    MatchList,
    MatchList2
    ),
    “NewKeyWord”,
    CONCATENATE(MatchList[Keyword], MatchList2[Keyword2])
    ),
    SEARCH([NewKeyWord],Companies[Company], 1, 0)
    )
    ) > 0,
    “Yes”,
    “Probably Not”
    )

    Other good thing with SEARCH is that it will accept wildcards (? for single digit wildcard, * for string wildcards)

    Keep up the great work.

    Sasha – you might be missing a “>0″

    Cheers
    DC

  5. en says:

    Instead of simply returning a “YES” or “NO” match, possible to make it return the matching text found? It is not uncommon to be in situation where we would like to analyse data by certain key texts found within a field.

    • powerpivotpro says:

      FANTASTIC question! Yes, I have done that sort of analysis before, years ago, when breaking down user behavior trends when I worked on Bing back at Microsoft.

      My SUMX version will NOT handle this, sadly. But if we take Sasha’s approach from above and modify it for use in conjunction with FIRSTNONBLANK, we have a winner:

      =FIRSTNONBLANK(FILTER(VALUES(MatchList[Keyword]),
      SEARCH(Matchlist[Keyword],
      Companies[Company],
      1,
      0
      )
      ),1)

      Now, if there were multiple matches, this will only return one. The first one alphabetically, I believe. We would need CONCATENATEX to return all of the matches in one string.

      I think this is worth a followup blog post :)

      • en says:

        this is shorter than a formula i copied off someone for use in Excel, and that also returns the first match found. Even better if there is a version to concatenate them. Off to digest above, real nice!

      • Keith Barney says:

        Yours and Sasha’s examples all work wonderfully. I modified your “FIRSTNONBLANK” formula with the filter argument in Sasha’s formula producing this:
        =FIRSTNONBLANK(FILTER( MatchList, SEARCH( MatchList[Keyword], Company[Company], 1, 0) ), 1)
        It produced an error. Somehow the VALUES function is needed but I’m scratching by brain to understand why.

        • powerpivotpro says:

          FIRSTNONBLANK requires a *column* – it doesn’t know what to do with a table. VALUES() returns a column. That’s the reason.

          • Keith Barney says:

            Thanks! As I was noodling on this over the weekend, I wondered if that was the case. Thanks for clarifying. I was thinking of creating some sort of reference chart/list of what can get plugged into where, sort of like legos. If you know where something like this already exists, that would be wonderful.

      • d0k5r2e6isb says:

        So, I have been wondering & working on how to create a concatenatex that will not only show the first value matchced (vs Yes and Probably Not), but it will show all values that matched. Has this been solved yet?

  6. Wow. I’m always encouraged to see examples of things that actually seem easier in Power Pivot than in Excel. This looks like of those. I have run into this keyword search thing before (and would always wonder why it was not a built in function in Excel, e.g. =Contains(string_to_search, lookup_array) or something. In Excel the solutions seem to boil down to: a/ writing a UDF, or b/ using lots of helper columns with the FIND() or SEARCH() functions (conceptually easiest but occupies lots of real estate – one column per keyword-to-find), or c/ an array formula (nifty, concise, efficient, and usually scary looking). Because the ultimate end users of my works are typically not me, and frequently non-power users of Excel who are slightly terrified of UDFs or arrays, I always have to evaluate the level of ‘explainery’ required versus benefits of efficiency and niftiness. I look forward to discovering more examples of Power Pivot offering BOTH efficiency and ‘splainability.

    Also, Tenacious D totally rules, and the pic here could have also been effective on the previous post. “Tableau has rocked for a long, long time….”

  7. Morten Andersson says:

    Hi
    God post, but what if I wanted to lookup a value from another table?
    I would like to analyse some posts from a bank statement. I would to group them into categories (Electronics, food, travel). I have a model with a CSV and a linked table. The linked Table is a list of search strings and a matching category.
    I can use the search string and return that, when it is found in the CSV text field. But I would like to get the category field. Is this possible?

  8. john says:

    Hi,
    can you tell me please how to do exact match search? Please. I don’t know how to do this without related tables. I need it, please help me

    • powerpivotpro says:

      Try LOOKUPVALUE?

      • john says:

        Thank’s for the answer, lookupvalue works from table B to table A where related works, or i need it from A to B, where related doesn’t works.

        However, i just find a way to tour around

        =calculate(COUNTROWS(B),filter(B,B[Champ 1]=A[Champ 1]))

        after i filter where it’s empty and it works.

  9. Tim Bush says:

    For those of you trying to get multiple text values into the value area of PowerPivot using FIRSTNONBLANK who are wishing for ConcatenateX function. try doing the concatenation of the one to many relationship in SQL using for xml path, before it gets to PowerPivot. In this example, one date has many PlantComments.Comment:

    SELECT DISTINCT a.date, STUFF
    ((SELECT ‘. ‘ + dbo.PlantComments.comment AS [text()]
    FROM dbo.PlantComments
    Where dbo.Plantcomments.date = a.date FOR XML PATH(”)), 1, 1, ”) AS ReportComments
    FROM dbo.PlantComments INNER JOIN
    dbo.PlantComments a ON dbo.PlantComments.date = a.Date

    • Tim Bush says:

      The PlantComments table:

      ID Date Comment
      1 6/08/2014 is this working?
      2 27/08/2014 test
      3 27/08/2014 test2

      Results of the SQL query:

      Date ReportComments
      2014-08-06 00:00:00.000 is this working?
      2014-08-27 00:00:00.000 test. test2

  10. alen says:

    Hi,
    i have two unrelated powerpivot tables and i’m trying to compare composite string keys from column X in table A and column Y in table B. When matching occurs i want to write all values in the same row from another column in table B (ex column Z) to table A.

    Is it possible with dax expression?

    I’d greatly appreciate it if you kindly give me some help to solve this problem.

Leave a Comment or Question