Introducing CUBESET() and CUBERANKEDMEMBER()

batman04xf9

“Where does he get such WONDERFUL toys???”

-Kasper de Jonge

(To be honest, Kasper probably looks nothing like this.  I’ve never been to the Netherlands, but based on the fashions in Sweden, I am fairly certain that the Dutch don’t dress like that, at least not when they visit Sweden).

 

Poor Kasper, though.  The mystery of how the parameterized, sliced, auto-sorting report was built has simply been too much for him to bear, so the other day I sent him an advanced preview of the workbook.  I’ll share the same workbook (without the underlying data, which is licensed) with all of you, but first, some explanation.

Two posts back, I showed you the functions CUBEMEMBER and CUBEVALUE.  Think of CUBEMEMBER cells as composing the axes of a report, and CUBEVALUE cells as composing the numerical data in the report.

But CUBEMEMBER does not easily lend itself to dynamically-sorted reports.  Each CUBEMEMBER function fetches a fixed value (an NFL player, in my case) from the database.  To get a nifty auto-sorting report, we need to introduce…

CUBESET()

As far as Excel functions go, CUBESET is a strange one.  It’s 100% useless by itself, but forms the foundation for some very interesting things.  Let’s look at its syntax:

CUBESET(connection, set_expression, , [sort_order],
       [sort_by])

  1. Connection – the name of the connection, from Excel’s perspective.  In CTP3 builds, this is always “Sandbox” for PowerPivot connections.  This, BTW, is the first argument to every cube function.
  2. Set_expression – rather than return a single “member” like CUBEMEMBER, CUBESET returns a whole “Set” of values.  There are many different syntaxes for representing a set, but the one you will use most often is <ColumnName>.children, which returns all unique values of that column.
  3. Caption (optional) – there is no way for Excel to display all values of a set in a single cell, so you generally make up your own caption.  “” is a common caption, since it doesn’t clutter your report, but you can return any string you want, like “All NFL Players”.
  4. Sort_order (optional) – an integer from 0 to 7, representing “No Sorting,” “Sort Ascending,” “Sort Descending,” “Alpha Ascending,” “Alpha Descending,” “DB order ascending,” “DB order descending.”
  5. Sort_by (optional) – the MEASURE you want to sort by.  (Must be a measure, not a column!)  This argument is ignored unless you specify 1 or 2 for sort_order (ascending or descending).

Using CUBESET() in the football report

In my football report, I use CUBESET behind the scenes, with the following arguments:

=CUBESET("Sandbox",
         "[CleanPlayers].[FullName].children",
         "All Players”,
         2,
         "[Measures].[Rushing Yards]")

Ignoring the connection and caption arguments, what this formula means is:  “Return all unique values from the [FullName] column in the [CleanPlayers] table, sorted descending by the Rushing Yards measure.”

(Note that in Cube formulas, table names are wrapped in square brackets, whereas in DAX they are wrapped in single quotes.  It would be nice if cube functions could use single quotes for tables to distinguish them from columns as DAX does, but there’s an excellent historical reason for that.)

Fetching values from that set using CUBERANKEDMEMBER()

This is where it gets fun :)

The cell that contains the CUBESET formula only displays a caption, but behind that cell, the entire set is stored, invisibly.  CUBERANKEDMEMBER is how you fetch values from that set.

(Whether you chose to specify a sort order or not, that set IS ordered, even if the database decided to return the values in somewhat random order.  Hence the “RANKED” in its name.)

The syntax for CUBERANKEDMEMBER is quite simple:

=CUBERANKEDMEMBER(
         Connection,
         Set_Expression,
         Rank,
         [Caption]
)

For set expression, just give it the cell address of the CUBESET formula.  For rank, give it any integer: 1 to return the first member of the set, 2 for the second, etc.

In my football report, this is:

The syntax for CUBERANKEDMEMBER is quite simple:

=CUBERANKEDMEMBER(
         “Sandbox”,
         $B$3,
         ROW(A1)
)

B3 is the address of my CUBESET formula, but I used the absolute reference syntax $B$3 because I plan to copy this CUBERANKEDMEMBER formula down a column, and I don’t want Excel adjusting B3 to B4, B5, etc., since that won’t reference the right cell anymore.

OK, what the heck is ROW(A1)???

It’s an Excel trick.  If I just enter 1 for this argument, that will NOT increment to 2, 3, etc. when I copy this formula down a column.

And the whole reason to copy down a column is so that I get the 2nd, 3rd, 4th, etc. members of the set as I go down the column.

If I use ROW(A1), that will return 1 for the first instance of the formula.  And when I copy down the column, Excel increments that to ROW(A2), ROW(A3), etc., which return 2, 3, etc. – just what I want!

Comparison to the Named Set Method

Denny Lee’s series of posts on Top N reports goes a different direction than what I have here.  He defines a Named Set in the Excel UI (using the MDX language), then uses a PivotTable to return that top N set.

In comments on the first post, Colin Banfield asked how to order that set.  Neither Denny nor Colin have an answer at the time of this writing.  And if Denny and Colin are stumped on MDX, well, there’s no WAY that I am going to figure it out :)

But with CUBESET, you can see that the Sort_order and Sort_by params give me a simple method for ordering the set by a variety of criteria.  Then CUBERANKEDMEMBER lets me fetch any number of those sorted items, in whatever precise order I’d like.

So… no MDX required, (except for the .children trick I used in CUBESET, which technically is MDX), AND a simple solution to the ordering problem.

The Named Set feature is a good one, and it definitely has its uses, but for custom Top N reports, I’ll take CUBESET and stay away from that MDX madness.

And what about Top N filters in pivots?

Actually, if Top N is all you want, a Pivot is gonna be your best bet at least 90% of the time.  Remember, the only reason I went down the Cube Formulas path is that I wanted custom formatting in my report, and parameterization.

I’ll post again shortly about how pivots might even be better than you expect at Top N reports.

Don’t worry if I am losing you, this is not required…  nor easy

This whole topic of cube formulas is pure gravy, take-it-or-leave-it kind of stuff.  It took me several attempts to get comfy with it myself, and I was part of the team building the feature!

Think of this as a 200-level topic.  I’m including it for the handful of you who consider yourselves hardcore, to test your mettle and stretch your Excel skills :)  Everyone else, just file this away for later and come back some day when you find yourself needing precisely these tricks.

When it comes to building PowerPivot reports, I’d break things out like this:

100 Level Topics

  1. PivotTables, PivotCharts, and Slicers
  2. Converting a PivotTable to formulas and then custom arranging/formatting the report

200 Level

  1. Customizing CUBEVALUE cells to do custom arithmetic and parameterization
  2. Writing CUBESET and CUBERANKEDMEMBER formulas

300 Level

  1. Dynamic sorting using CUBESET and CUBERANKEDMEMBER
  2. MDX-Driven Named Sets, as Denny demonstrated

…which means that my next post on cube formulas is even more optional/ignorable than this one :)

Wake up Denny!

Denny, of course, well…  Denny HAS to pay attention, because I challenged his geek-hood in my last post.

Some of you might say that wasn’t fair, since I have more Excel background than Denny and he has more SQL background than I.  Sun Tzu once suggested that choosing the field and terms of battle was more important than the fighting itself, and I do love me some Sun Tzu :)

Are you like Kasper?  Want to skip ahead?  Download the workbook!

I have not covered every technique yet, but if you are getting anxious like Kasper and can’t wait any longer, you can download the workbook here and inspect every last formula.

Remember, the source data itself is licensed, so I’ve removed that from the workbook.  You won’t be able to fetch fresh values from the PowerPivot db if you try to refresh, or modify formulas.  I’m sharing this so that you can see all the formulas and understand how the report is built.

And no, Kasper didn’t get the source data, either :)

Next Football Post:  Writing a “Rank” Measure >>

32 Responses to Introducing CUBESET() and CUBERANKEDMEMBER()

  1. Thank god i don’t look like that :) But you’re right i had to know and i can’t stand teasers. Great stuff these Excel tricks, all new for me as a BI developer. So keep on showing these tips and tricks, maybe i’ll create a screencast soon how to create a report like this.

    What i would like to do is create a filter on the PowerPivot data that i can control with webpart connections! Thus far couldn’t get it to work, keep on playing :)

    BTW what will the Cube be named in RTM is not Sandbox?

  2. Hmm Rob I’ve been playing with the these formula’s and inspected your workbook.
    Inventive stuff tying it all together like this.

    If i understand correct you get the 50 players with the most rushing yards (regardless of the slicers). you get the fantasy points from these players (with slicers). And then select the top 20 fantasy points by largest. Lookup the playersname by using the points. And put these two fields at the report.

    A few remarks: the cubeset and cuberankedmember is regardless of the slicers, isn’t it possible to make it work with the slicers? so when you use the slicers the sorting will be applied on the filtered set. The cubevalue is applied with filtering.

    The way you fix this is by getting a set, use filtered values to get the top n values and then lookup the members belonging to these values. I know i shouldn’t think as a developer but this is very unnatural to me. What is the chance a player from a specific weeknumber is not in your top 50 list?

    Using these functions has great potential but you have to have a specific purpose and know what you are doing. I’ll have to play some more with it, currently i see the not being able to use a filtered cubeset and cuberankedmember as a big disadvantage. To be continued :)

    Kasper

  3. geb says:

    What about using the cubeset function with a .children and a limiter? The syntax to brutal for me after trying out a couple times to figure out.

    eg cubeset top 10 rushers, in week X
    or top 10 rushers where reciving yards greater than Y

    something like multiple select boxes, Select the team, and it gives you the list of players in the team by restricting the cubeset2 (player list), using team 1 as an input?

    or is that what the last sentance of kasper’s coment meant,that not being able to use a filtered cubeset is a big disadvantage

  4. […] out for diving into the deep end into his world of Excel and getting the proverbial egg on my face (Joker picture not-withstanding).  He’s called me out, challenged my geek hood, and sliced / diced my postings (pun intended).  […]

  5. […] Introducing CUBESET() and CUBERANKEDMEMBER() […]

  6. Kasper and Geb – excellent points and questions. I see this as an awesome opportunity to bring in a man who I consider to be the world’s foremost expert on cube formulas, the great Dany Hoter.

    Let me see if I can lure him into this conversation with us. He is very busy, but an in-depth cube formula chat like this is damn near irresistible to him, so let’s keep our fingers crossed.

    If I can’t get him to weigh in, I’ll dive in myself.

    -Rob

  7. Dany Hoter says:

    After such an introduction from Rob I really don’t have a choice but become active on this site.
    Both Kasper and Geb questions can be answered by the same technique.
    When creating a set with CUBESET we don’t need to limit the number of members in the set. The important thing is to get the order of members in the set be by descending order of the measure including the slicers. Once we achieve that we can decide how many CUBERANKEDMEMBER functions to use from the top of the set and in this way achieve the effect of a topcount.
    In order to get the right order we need to define something called a tuple in MDX terminology. An example of a tuple is sales of milk in April/2010. A tuple can contain a single member from any number of dimensions. We can create a tuple using CUBEMEMBER with a range of cells as the second argument. Each cell in the range should contain a member.
    Assume that A3 contain the measure, B1 the sliced time period and B3 the sliced product category. The formula CUBEMEMBER(“sandbox”,(A3,B1,B3),”This is a tuple”) can be used as the last argument for the CUBESET function (Of course A1:A4 is a valid range in case the cells are consecutive). The result will be a set ordered by the measure value in the filtered period and the filter category.
    There are still some issues in the solution the most acute one is how to sort if one or more filters use mutiple members.
    I’ve send Rob an example and he’ll adapt his spreadsheet and publish a post with more details.

    • kurt says:

      this is a great reply! ‘cube’ functions are superpowerful for creating boilerplate analysis and swapping parameters, thanks for the post

    • qinyg says:

      Great, thanks! One quick question: do you have a solution to the issue raised in your post?
      how to sort if one or more filters use mutiple members.

  8. geb says:

    I think I found what I was looking for, maybe I phrased my question incorrectly.

    Changing the cubemember tuple in the last argument in the cubeset only changes the top sort, (very useful in its own right), what I was looking for was how to limit the returning members of the cubeset based on a parameter in a different cell.

    If I choose a team in the tuple, the teams player will come out first, but player on other teams will still be in the cubeset.

    This below is what I came up with after playing around. The cubeset now switches dynamically to list the players on team specified in cell e7, in ranked order of their rushing yards.

    CUBEMEMBER(“sandbox”,”[team].[” &E7&”]”,+e7) (e7 is a cell where you pick a team from a validation list/drop down box/slicer)

    =CUBESET(“sandbox”, “([team].[” & $e$7 &”]*[CleanPlayers].[FullName].children)”,$e$7 & ” top rushers”,2,
    “[Measures].[Rushing Yards]”)

    I couldn’t test with Robs actual cube (but he didn’t send Kasper the data either), but was able to get it working quite nicely on one of my own.

    With a few other tricks, we can use this to create dynamically sizing drop down lists using the validation function, and no code. If I can get something cleaned up I can send rob a copy along to illustrate.

  9. Geb/Colin/Dany – interested in posting your mods/workarounds?

    If you don’t want to write a full post, you can just upload the workbook and then we just post a few highlights and a link to the file.

    I’ve started a Samples library here:

    http://powerpivotfaq.com/PowerPivot%20Samples/Forms/AllItems.aspx

    Whatcha think?

  10. JA says:

    Brilliant post and a brillian comment geb – I was having the same problem.

  11. justin ho says:

    Hi dear,
    i need to get a group in a column named region(south,north,west,east),but i don’t want get out the data about south,so i try use cubeset to creat a group.
    =CUBESET(“PowerPivot Data”,”[revenue].[region].[All].[east],[revenue].[region].[All].[west],revenue].[region].[All].[north]”,”group”)
    i got #n/a it’s wrong formula,but i can try out the right formula,so maybe it have to ues another function?
    looking forward to your answer.
    Thans.

  12. Dave says:

    =CUBESET(“TheYesConnection”,”{([Segment].[Content Group].&[PAC],[Reporting Date].[Fiscal Hierarchy].&[FY 2012].&[1].&[1],[Segment].[Site Section FN].[All].children)}”, “Site Section All”,2, “[Measures].[keypresses]”)

  13. OtterMBA says:

    I am new to PowerPivot and I am self taught with everything excel so far. I am trying to use Cubeset and Cuberankedmember to create a TopN report. You’re right above how a pivot is much easier and quicker, however I don’t like how pivots look in my reports. I am attempting to make a dynamic report that is formated to company specifics. Can some one help me with my code? I am attempting to pull in the topN sales for a specific month by salesperson.
    =CUBESET(“PowerPivot Data”,”nonempty([idgdaily8].[whname].[All].children)”,”Wholesaler”,5,”[Measures].[idgdly8 Prem]”)
    =CUBERANKEDMEMBER(“PowerPivot Data”,E$6,ROW($A7)-6) where E$6 is the cell with the cubeset formula. After I drag the CRM formula down I use a cubevalue formula to return the correct sales for that person and for that month, but the results are seldom in order by highest to lowest sales.

    • DickM says:

      Hey OtterMBA …

      Could the issue be in the fact that you are using “5” for the sort parameter in your CUBESET formula ? “5” is for Sort_Natural_Ascending while “2” is for SortDescending which I would think should return them in descending order from highest to lowest (?).

      Just a suggestion.
      Dick

  14. OtterMBA says:

    It does not seem to be an issue of the “Sort_Order” as I have used all the 6 parameter settings (0-5).It may pull my topN results but it does not sort them in any fassion that makes sense to me. For example I am playing with just two columns at the moment and I get results similar to below.
    Salesperson September Sales
    Joe $5m
    Steve $9m
    Joe 2 $4M

    Is this something that ONLY works with pivot tables? Also I am using Denali if that means anything.

    • DickM says:

      Hmmm..

      No this doesnt only work with Pivot Tables and I am using Denali myself.

      Is there any way you could create a sample “sanitized” version of the spreadsheet that I can look at here?

      I am at dick@plogic.ca

      Dick

  15. OtterMBA says:

    Sorry for the delay but I finally figured it out. I had to use sets in both the set_expression and sort_by areas of the function. I also realized that the function is a stickler for the order in which columns are added for set_expression set. IE… If you want a report to show the top n RBs for a team you need to use (team,players.children) to have the proper results.
    Thanks for the help though.

    • OtterMBA says:

      also here was the resulting formula.

      =CUBESET(“PowerPivot Data”,”([salesteam].[reportsto].[All].[“&’Zip Codes’!$D$1&”],[salesteam].[fullname].[All].children)”,’Zip Codes’!$D$1,2,”([Measures].[Production],[sales].[Year].[All].[“&$D$3&”])”)

  16. Marklw says:

    So have a Question, new to DAX, only few days working with it. I like to pull from a slicer and I can do it without issue from what I got from this post:
    =CUBERANKEDMEMBER(“PowerPivot Data”,Slicer_Fiscal_Year,ROW(A1))

    Though I like to change the order to be most current date if multiple selected. How would one go about doing this. (No matter what I have tested I get oldest value as rank 1)

    My end result like to have a single date max(Fiscal Year) if mult or have a ‘ALL’ given in my field.

    This post was very helpful to me BTW!!!

    • powerpivotpro says:

      Well you can define a new measure that is MAX(SlicerTable[SlicerColumn]) (or MIN)

      Then you use CUBEVALUE to fetch that measure (but it still needs to reference the slicer too).

      Or you can use a CUBESET formula, have it reference the slicer, but use the SortBy and SortOrder arguments to CUBESET. Then have your CUBERANKEDMEMBER formula reference the CUBESET cell.

      • Marklw says:

        Thanks got me in the direction I needed:

        Slicers directly did:
        =CUBERANKEDMEMBER(“PowerPivot Data”,CUBESET(“PowerPivot Data”,Slicer_Fiscal_Year,,4),1)

        Though I have found that I have IF test for “All” value first; tried a count check =CUBESETCOUNT(Slicer_Fiscal_Year) you get a 1 for All also would have rather seen a 0 for my count for selected… but prob just don’t know how to get to that :>.

        Also found this comment after reply that was also helpful:

        thanks

  17. Vincent S says:

    Does anyone know if we can do like a SUMIFS or COUNTIFS once you have converted your pivot table into a forumla? I wanted to do a what-if that counts the number of rows that is > a certain number but can’t seem to figure it out.

    Thanks,

  18. Karen says:

    Hi, I am working with survey data. I have created some formulas that return the survey responses among those who gave one response choice (for example, give me the data for those who reported their Ethnicity is Caucasian). For some of the response choices, the number of respondents is quite low and we’d like to group them together into an “all other” (in this data, Caucasian and Hispanic have the most respondents, then all other ethnicities would be grouped together). I have been able to do a CUBESET formula for multiple items, but I’m running into a character limit of 255. So instead of specifying this “all other” as each response that should be included (it is 7 other ethnicities), I’d like a formula that gives me everything EXCEPT those two responses. Is there a way to do this? Thanks!

  19. if you want to put them in a list in Excel, then you can put the cubemember in each cell, then use that range of cells as your cubeset. Or, you could use a slicer.

  20. jbrwlngs says:

    I’m experiencing extremely poor performance, measured in multiple minutes, creating a ranked list using CubeSet and CubeRankedMember in Excel 2013 against a PowerPivot table. It appears that the determining factor is the size of the dimension which in this case is in the tens of thousands. If I use a pivot table the sort on the same dimension is virtually instantaneous but using the CUBE functions is truly frustrating. I am trying to create a pivotless report for my stakeholders but I’m about ready to give up due to the extremely long recalculations. Any advice or workarounds?

    • jbrwlngs says:

      After researching the DAX syntax I came up with the following approach (only slightly better than a pivot table).

      What I need to do is generate an ordered list without displaying the ordering value. I can come close as in example below, but I would really like to just return the list of ThreadIDs without the associated ViewCount. Basically I’m looking for a replacement for CubeSet and CubeRankedMember, after which I can then use CubeValue and CubeMember functions. Any ideas anyone?

      As you know the TOPN function does get the top 30 thread records, but does NOT return them in order! Since the ORDER BY phrase can only be used at the outer level of EVALUATE, one has to include the ordering value in the result set, which I want to avoid. Right now it’s appearing, that for improved performance, my only option is to put these mini-tables on a hidden tab?

      DAX Query:
      EVALUATE
      SUMMARIZE(
      (TOPN(
      30,
      FILTER(
      ForumThreads,
      ForumThreads[SampledLastMonth] = “Yes”
      ),
      ForumThreads[DlyViews]
      )
      ),
      ForumThreads[ThreadID],
      ForumThreads[DlyViews])
      ORDER BY ForumThreads[DlyViews] DESC

  21. Patrick says:

    Hi,
    Thanks for this so helpful blog.
    I looked for a solution of my problem, but I was not able to find it or maybe not well understand your advice :(
    I am using Powerpivot 2013, I would like to use a cell reference in a cubemember() fromula.
    =CUBEMEMBER(“ThisWorkbookDataModel”,{“[02 Orders].[Name].&[Branch A]”,”[02 Orders].[Category].&[Cat 4]”})

    I would like [Branch A] is pointing on an cell reference changed it by
    =CUBEMEMBER(“ThisWorkbookDataModel”,{“[02 Orders].[Name].&[“&A2&”]”,”[02 Orders].[Category].&[Cat 4]”})

    But that’s never worked. I changed all the {}, “” and & positions. But, I have always an error, or excel doesn’t accept the formula…

    What am I doing wrong ?
    Thanks

Leave a Comment or Question