Link to cube formulas spreadsheet fixed

February 2, 2010

Sorry about that link below.  It is corrected now in the original article below, but here it is as well:

http://cid-470312ee93cc790d.skydrive.live.com/self.aspx/.Public/Football%20Cube%20Formulas.xlsx


Introducing CUBESET() and CUBERANKEDMEMBER()

February 2, 2010

batman04xf9

“Where does he get such WONDERFUL toys???”

-Kasper de Jonge

(To be honest, Kasper probably looks nothing like this.  I’ve never been to the Netherlands, but based on the fashions in Sweden, I am fairly certain that the Dutch don’t dress like that, at least not when they visit Sweden).

 

Poor Kasper, though.  The mystery of how the parameterized, sliced, auto-sorting report was built has simply been too much for him to bear, so the other day I sent him an advanced preview of the workbook.  I’ll share the same workbook (without the underlying data, which is licensed) with all of you, but first, some explanation.

Two posts back, I showed you the functions CUBEMEMBER and CUBEVALUE.  Think of CUBEMEMBER cells as composing the axes of a report, and CUBEVALUE cells as composing the numerical data in the report.

But CUBEMEMBER does not easily lend itself to dynamically-sorted reports.  Each CUBEMEMBER function fetches a fixed value (an NFL player, in my case) from the database.  To get a nifty auto-sorting report, we need to introduce…

CUBESET()

As far as Excel functions go, CUBESET is a strange one.  It’s 100% useless by itself, but forms the foundation for some very interesting things.  Let’s look at its syntax:

CUBESET(connection, set_expression, , [sort_order],
       [sort_by])

  1. Connection – the name of the connection, from Excel’s perspective.  In CTP3 builds, this is always “Sandbox” for PowerPivot connections.  This, BTW, is the first argument to every cube function.
  2. Set_expression – rather than return a single “member” like CUBEMEMBER, CUBESET returns a whole “Set” of values.  There are many different syntaxes for representing a set, but the one you will use most often is <ColumnName>.children, which returns all unique values of that column.
  3. Caption (optional) – there is no way for Excel to display all values of a set in a single cell, so you generally make up your own caption.  “” is a common caption, since it doesn’t clutter your report, but you can return any string you want, like “All NFL Players”.
  4. Sort_order (optional) – an integer from 0 to 7, representing “No Sorting,” “Sort Ascending,” “Sort Descending,” “Alpha Ascending,” “Alpha Descending,” “DB order ascending,” “DB order descending.”
  5. Sort_by (optional) – the MEASURE you want to sort by.  (Must be a measure, not a column!)  This argument is ignored unless you specify 1 or 2 for sort_order (ascending or descending).

Using CUBESET() in the football report

In my football report, I use CUBESET behind the scenes, with the following arguments:

=CUBESET("Sandbox",
         "[CleanPlayers].[FullName].children",
         "All Players”,
         2,
         "[Measures].[Rushing Yards]")

Ignoring the connection and caption arguments, what this formula means is:  “Return all unique values from the [FullName] column in the [CleanPlayers] table, sorted descending by the Rushing Yards measure.”

(Note that in Cube formulas, table names are wrapped in square brackets, whereas in DAX they are wrapped in single quotes.  It would be nice if cube functions could use single quotes for tables to distinguish them from columns as DAX does, but there’s an excellent historical reason for that.)

Fetching values from that set using CUBERANKEDMEMBER()

This is where it gets fun :)

The cell that contains the CUBESET formula only displays a caption, but behind that cell, the entire set is stored, invisibly.  CUBERANKEDMEMBER is how you fetch values from that set.

(Whether you chose to specify a sort order or not, that set IS ordered, even if the database decided to return the values in somewhat random order.  Hence the “RANKED” in its name.)

The syntax for CUBERANKEDMEMBER is quite simple:

=CUBERANKEDMEMBER(
         Connection,
         Set_Expression,
         Rank,
         [Caption]
)

For set expression, just give it the cell address of the CUBESET formula.  For rank, give it any integer: 1 to return the first member of the set, 2 for the second, etc.

In my football report, this is:

The syntax for CUBERANKEDMEMBER is quite simple:

=CUBERANKEDMEMBER(
         “Sandbox”,
         $B$3,
         ROW(A1)
)

B3 is the address of my CUBESET formula, but I used the absolute reference syntax $B$3 because I plan to copy this CUBERANKEDMEMBER formula down a column, and I don’t want Excel adjusting B3 to B4, B5, etc., since that won’t reference the right cell anymore.

OK, what the heck is ROW(A1)???

It’s an Excel trick.  If I just enter 1 for this argument, that will NOT increment to 2, 3, etc. when I copy this formula down a column.

And the whole reason to copy down a column is so that I get the 2nd, 3rd, 4th, etc. members of the set as I go down the column.

If I use ROW(A1), that will return 1 for the first instance of the formula.  And when I copy down the column, Excel increments that to ROW(A2), ROW(A3), etc., which return 2, 3, etc. – just what I want!

Comparison to the Named Set Method

Denny Lee’s series of posts on Top N reports goes a different direction than what I have here.  He defines a Named Set in the Excel UI (using the MDX language), then uses a PivotTable to return that top N set.

In comments on the first post, Colin Banfield asked how to order that set.  Neither Denny nor Colin have an answer at the time of this writing.  And if Denny and Colin are stumped on MDX, well, there’s no WAY that I am going to figure it out :)

But with CUBESET, you can see that the Sort_order and Sort_by params give me a simple method for ordering the set by a variety of criteria.  Then CUBERANKEDMEMBER lets me fetch any number of those sorted items, in whatever precise order I’d like.

So… no MDX required, (except for the .children trick I used in CUBESET, which technically is MDX), AND a simple solution to the ordering problem.

The Named Set feature is a good one, and it definitely has its uses, but for custom Top N reports, I’ll take CUBESET and stay away from that MDX madness.

And what about Top N filters in pivots?

Actually, if Top N is all you want, a Pivot is gonna be your best bet at least 90% of the time.  Remember, the only reason I went down the Cube Formulas path is that I wanted custom formatting in my report, and parameterization.

I’ll post again shortly about how pivots might even be better than you expect at Top N reports.

Don’t worry if I am losing you, this is not required…  nor easy

This whole topic of cube formulas is pure gravy, take-it-or-leave-it kind of stuff.  It took me several attempts to get comfy with it myself, and I was part of the team building the feature!

Think of this as a 200-level topic.  I’m including it for the handful of you who consider yourselves hardcore, to test your mettle and stretch your Excel skills :)   Everyone else, just file this away for later and come back some day when you find yourself needing precisely these tricks.

When it comes to building PowerPivot reports, I’d break things out like this:

100 Level Topics

  1. PivotTables, PivotCharts, and Slicers
  2. Converting a PivotTable to formulas and then custom arranging/formatting the report

200 Level

  1. Customizing CUBEVALUE cells to do custom arithmetic and parameterization
  2. Writing CUBESET and CUBERANKEDMEMBER formulas

300 Level

  1. Dynamic sorting using CUBESET and CUBERANKEDMEMBER
  2. MDX-Driven Named Sets, as Denny demonstrated

…which means that my next post on cube formulas is even more optional/ignorable than this one :)

Wake up Denny!

Denny, of course, well…  Denny HAS to pay attention, because I challenged his geek-hood in my last post.

Some of you might say that wasn’t fair, since I have more Excel background than Denny and he has more SQL background than I.  Sun Tzu once suggested that choosing the field and terms of battle was more important than the fighting itself, and I do love me some Sun Tzu :)

Are you like Kasper?  Want to skip ahead?  Download the workbook!

I have not covered every technique yet, but if you are getting anxious like Kasper and can’t wait any longer, you can download the workbook here and inspect every last formula.

Remember, the source data itself is licensed, so I’ve removed that from the workbook.  You won’t be able to fetch fresh values from the PowerPivot db if you try to refresh, or modify formulas.  I’m sharing this so that you can see all the formulas and understand how the report is built.

And no, Kasper didn’t get the source data, either :)

Next Football Post:  Writing a “Rank” Measure >>


Parameterizing the report

February 1, 2010

In the Parameterized Report, how do I get the “Fan Pts” column on the far right of the cube formula report to respond to those parameter cells in the top left?

PowerPivot Report Params

Easy.  The report is just formulas, so I can combine them with the parameter cells using Excel arithmetic.

Naming the Parameter Cells:  a Readability Convenience

The parameters themselves live in cells C3, C4, and C5.  I can reference them that way, but to make things easier to read, I’m going to give them names. 

C3To give a name to cell C3, I select it.  Note the control at the upper left corner of the grid that displays the address C3:

I just click inside that control and start typing to give C3 another name.  In this case, I name it YardsPerRushPt, below:

YardsPerRushPt

That will make my formulas easier to read, and also provides an added benefit later then I publish to SharePoint.

I then repeat this process for the other two parameter cells, C4 and C5, which I name YardsPerRecPt and PtsPerTD, respectively:

 

                  YardsPerRecPtAnd PtsPerTD

The Formula

The formula was always going to be easy to write, but now it will also be easier for us to read and understand.  Here is the formula in the Fan Pts column:

=CUBEVALUE("Sandbox",F3,Slicer_DayNightName,Slicer_WeekNbr,
           "[Measures].[Rush Yards]") /
RushYardsPerPt

+CUBEVALUE("Sandbox",F3,Slicer_DayNightName,Slicer_WeekNbr,
           "[Measures].[Rec Yards]") / RecYdsPerPt

+CUBEVALUE("Sandbox",F3,Slicer_DayNightName,Slicer_WeekNbr,
           "[Measures].[Rush TD]") * PtsPerTD

All three CUBEVALUE terms are identical except for the measure they reference, and the parameter they either divide or multiply by – I have highlighted the differences in blue.

Note how I have used the cell names I defined above, rather than cell references.  This is an under-used and beneficial feature in Excel, and not at all specific to PowerPivot.

Dealing with #N/A – The Excel IFERROR() Function

Sometimes, a player will not have recorded any of a particular statistic.  This is especially true when you start slicing the report and the circumstances get narrower.

When that happens, Excel will often display #N/A in the cell instead of 0.  (In a future post I will delve into what determines 0 vs. #N/A).

Sometimes #N/A is ok to display.  But in the Fan Pts column, it’s a problem.  A player will often record yards but no TD’s, for instance.  But once a single term in the formula above goes to #N/A, the entire formula goes to #N/A, which is incorrect since that player DID have other statistics. 

We just want #N/A terms to be treated as 0 in cases like this.  And the IFERROR() function in Excel is precisely designed for this.  If we wrap each term in IFERROR, we can specify an alternate value to be returned when an error like #N/A is encountered.

For example:

=IFERRROR(
         
CUBEVALUE"Sandbox",F3,Slicer_DayNightName,
                  
Slicer_WeekNbr,"[Measures].[Rush Yards]")
                    /
RushYardsPerPt
,0
)

Get to know IFERRORThe IFERROR wrapper is highlighted in blue.  If no error is encountered, then the “meat” inside the IFERROR function (the CUBEVALUE divided by RushYardsPerPt) is returned as usual.

IFERROR is a very useful function overall, and all PowerPivot Pros should get to know it (a phrase that conjures memories of Jon Lovitz on SNL).  It can also be used to return custom strings in place of errors like “No data recorded.”

                                                                                                                               GET to know me!

Parameters Exposed in SharePoint-Published Reports!

Time for that “extra” benefit of named cells.  In my last video, remember how there was that expandable task pane rendered in Excel Services, that allowed me to change the parameter values interactively?

PowerPivot Web Params

I said that required no coding, and that’s the truth :)   To activate that feature, you have to tunnel down a bit in the Excel UI, because hey, only hardcore folks like readers of this site would ever need such a thing, right? 

First you go to the File tab in Excel and choose the “Share” option:

Share Option 

Then pick the “Publish to Excel Services” suboption, and click the button of the same name that I have highlighted:

Excel Services

The dialog that comes up looks exactly like the Save As dialog.  And that’s what it is.  But there’s a new button there, highlighted below.  Click it!

Save As Dialog

Almost there, don’t give up!  There are two tabs on the resulting dialog.  The default tab, the Show tab, is very useful, but let’s skip that for now and switch to the Parameters tab:

Parameters Tab

Clicking the highlighted “Add” button yields the following, final dialog in the chain:

Add Params

Every named range in the workbook appears here as a checkbox.  (Slicers are treated as named ranges, too, so that they can be referenced in formulas as well).  I select all three of the names representing my parameters, and then tunnel back out, clicking OK as I go.

(I can then save the file locally or save it to SharePoint, or cancel the Save As dialog and do a normal save later – in each case, the parameter settings will now be saved).

When I view this report on SharePoint, using the browser, the parameters pane will now appear automatically.

Parameters as Web Part Interface

The parameters pane is the simplest way to leverage these exposed parameters.  But those three cells are now also exposed as part of the web part framework, enabling other controls in SharePoint to send input values to the report.

That’s an advanced topic and only mentioned here in case you happen to be a SharePoint programmer as well :)

What Next?  Aren’t We Done?

Nope, we are not done.  We haven’t yet covered how to support dynamic sorting!  So that’s coming up.

And keep in mind that this whole foray into cube formulas is a bit of an advanced topic, and it gets trickier as you add more features into the report.  None of this is necessary for building PowerPivot reports – pivots are going to be your answer most of the time anyway.

Denny Lee, Your Time Has Come!

My esteemed colleague, Denny Lee, has found his way into the deep end of the pool lately with a series of posts about Top N PowerPivot reports.

I have a lot of respect for Denny.  After all, the man can write MDX, and the existence of MDX is in some ways the reason why we had to create PowerPivot (…to replace MDX with something less intimidating).  And no one ends a sentence with “eh!” like Denny :)

But apparently Denny did not get the internal executive memo:  high-end Excel techniques are the domain of one Rob Collie, the PowerPivotPro!

Denny, you see, is now on my turf.  As they say in Excel power users circles, it’s, ahem, “on” now.

So my next few posts now have two purposes:  to explain how I did the dynamic sorting, and to give you some (cough cough, superior) alternatives to those MDX-backed techniques outlined by Denny.


PowerPivot for SharePoint Video

January 28, 2010

Royale with Cheese

“You know the funny thing about Europe?  It’s the little differences.  I mean, they got the same shit we got here…  just a LITTLE different.”

-Vincent Vega

Greetings from Stockholm!  This place…  I never want to leave.

Many firsts have been recorded here.  My first-ever presentation to a large audience outside the states.  First time having dinner in a restaurant straight out of a 1930’s Hollywood movie.  First time seeing an American guy, a British guy, a Japanese guy, and a Norwegian guy all get tossed out of a hotel bar for being drunk.  First time getting half-trapped by a blizzard in a foreign country.

…and oh yeah, my first time having a mandatory, cannot-postpone-it Windows Update pop up near the end of a presentation and occupy my demo laptop for a full hour.

You can’t make this stuff up.

I have to hand it to Marcus Gullberg and company from MS Sweden though, as well as Johan Åhlén and the entire Swedish SQL Server Users group.  The former for their quick improvisation – finding an Amir Netz video of the SharePoint stuff I was unable to show, and the latter for their patience and excellence questions while we figured it out.  It ended up being pretty funny, actually.

Their loss yesterday is now your gain – as a make-up, I recorded a video of the stuff I couldn’t show yesterday, and posted it to YouTube.

(This also marks my first-ever recording in another country).

Oh, and you know what they call a Quarter Pounder, in Sweden?

A quarter pounder.


Pulling back the curtain: Intro to Cube Formulas

January 18, 2010

OK, last week I showed you a parameterized PowerPivot report that contained no pivots anywhere.  How did I build it?

Best Tutorial:  Start with a Pivot

The easiest way to show you is to start with a PivotTable, even though once you’ve mastered this technique, you won’t always want to start this way.

Here I’ve got 4 measures from the Plays table broken out by ‘Clean Players’.[FullName], and the report is sorted by the [Rushing Yards] measure:

basic pivot

First, I want to get rid of that blank value at the top – over the years, an awful lot of “nobodies” amassed a lot of stats, and apparently STATS did not care enough to capture their names.  Accordingly, I don’t care enough to want them in my report.

Filter Out Blanks

So, I’ll just filter them out.  Easy enough, as shown here at right.

But I also want to limit my report to the top 20 players.  How do I do that, now that I have already applied a filter to this field?

When I try to add a Top 10 filter to this field, it clears my filter that removed the blank, and I get that blank, unknown player back in the report.

So how do I do this?  Well, I use a feature of PivotTables that Allan Folting argued for, and I argued against, back in 2006.  He was right, I was wrong, and the feature has helped me numerous times since :)

That feature is hidden under the Options button on the PivotTable Options ribbon tab.  It defaults to off.  Once enabled, I can go back to the Pivot and add a Top 10 filter, set to show the top 20 players as sorted by Rushing Yards:

Multiple Filters Per Field

Checking that checkbox allows me to then add my Top 20 filter on top of the original:

Top 20

Then I add some slicers, which yields the basic starting point that I want:

top 20 results

Enough with the pivot, Rob! We want the pivot-LESS report!

Indeed.  Time to blow this pivot into a hundred pieces.  Literally.

On the Pivot Options tab of the ribbon, under a dropdown called OLAP Tools, there is a button named “Convert to Formulas.”  On normal pivots, this whole dropdown is disabled…  because normal pivots are not OLAP pivots.

But PowerPivot-backed pivots are OLAP, because under the hood, the PowerPivot db engine is an OLAP engine.  So this button is enabled:

OLAP Tools Convert to Formulas

Watch what happens when you click that button:

Recap

Briefly, let’s recap what we see in the video:

  1. Every cell in the pivot has been converted to a formula
  2. Each formula returns the same result that the cell previously contained (when it was part of the pivot)
  3. Slicers still work – the formulas return different numbers reflecting slicer selections
  4. The formulas can be treated just like regular spreadsheet cells – moved, rearranged, deleted, etc.
  5. There are two functions involved – CUBEMEMBER() and CUBEVALUE()
  6. Generically, we call this feature “Cube Formulas”

Cube Formulas are NOT DAX!

I know that cube formulas are new to most of you, and seeing how I am introducing them in the context of PowerPivot, it’s natural to suspect that these formulas are part of DAX.

But they are not.  They are built-in to regular Excel and are there before you install PowerPivot.  Remember, DAX can only be used on tables in the PowerPivot window, and in the New Measure dialog.  DAX cannot be used in normal Excel spreadsheet cells.

Cube formulas are just another way for Excel to interact with certain kinds of data sources.  And PowerPivot is one data source that qualifies.  That’s good news for Excel pros – a brand-new way to build reports.  It’s a free benefit of PowerPivot :)

CUBEMEMBER() and CUBEVALUE()

As shown in the video, let’s inspect the formulas in three specific cells to show you what’s going on here.

First, the Edgerrin James label cell:

Edgerrin James CUBEMEMBER

Brushing over syntax for a moment, that formula says “this cell now represents Edgerrin James” – the string “Edgerrin James” was returned from the PowerPivot database in fact.

Then the “Rushing Yards” header cell:

CUBEMEMBER Rushing Yards Measure

Again brushing over syntax, that formula means “this cell now represents the Rushing Yards measure.”

And lastly, the 9842 value, which is Edgerrin James’ total Rushing Yards:

Rushing Yards CUBEMEMBER

I put this one in edit mode to show you that it references the Edgerrin James and Rushing Yards cells.  So this CUBEVALUE formula shows the intersection of Edgerrin James, and Rushing Yards, just like it did in the PivotTable.

Notice how it also references the two Slicers.  That’s how the formula returns different values as the slicers are manipulated.  Again, just like it did in the PivotTable.

More to come

Cube formulas are indeed the core trick behind my “pivot-less” report.  But there are several elements yet to be explained, like the parameterization, and even more importantly, how the sort order of the report can change in response to parameters and slicers.

I also want to show you what happens with this report when published to SharePoint.

I think this is enough for one post though :)

Next Football Post:  CUBESET() and CUBERANKEDMEMBER() >>


Not your average PowerPivot report

January 13, 2010

Thats not a PowerPivot Report

“THAT’s not a PivotTable!  Oh wait…”

-Bill Gates

(OK, so I turned it into a movie quote of sorts by using the picture of Anthony Michael Hall playing Bill Gates in “Pirates of Silicon Valley.”  Was that a good casting decision?  I’m torn between “yeah it was OK I guess” and “well it WAS made for TV.”)

Setting that aside, the real Bill Gates DID in fact say that when we first showed him the new PivotTables in Excel 2007.  Prior to that, the appearance of a finished PivotTable was something that the average user found off-putting and geeky.  So we were pretty happy with that response to the new look.

PivotTables (and PivotCharts) have come a long way.  A few clicks and you’ve got a very professional and customized report.  But there are, of course, still cases where you need an even greater level of control than what pivots can give you.  The good news is that if you find yourself in one of those cases, you still have other options.

Excel Pros – Do NOT tune out!

Excel pros are probably thinking “yeah, Rob, I know all about those other options.  You make a PivotTable, you hide it on another sheet, and then build a report sheet using formulas, etc.”

I have a trick up my sleeve today, though.  I bet most Excel pros have NOT seen what I am about to show you :)

Here it is – Not Your Average PowerPivot Report!

(You’ll need to crank your volume up – I got a new mic and need to adjust its sensitivity).

Summary of what’s in that report

  1. PowerPivot-Backed – this report is indeed a PowerPivot report.
  2. No PivotTables anywhere – that’s right, there isn’t a pivot in the entire workbook!
  3. Custom formatting – I split the report body with horizontal and vertical dividers, and could place them wherever I wanted.
  4. Parameterization – users can play “what if” by changing input cells.  A numeric column in the report responds to those changes in realtime.
  5. Sorting – the report is dynamically sorted by that numeric column in #2.  Change a parameter, and the report re-calcs AND re-sorts.
  6. Respects slicers – all columns in the report respect the slicers on the page as well.  This also impacts the sort order of the report.

How it was done

I’m going to go through it step-by-step in the next few posts.  Yes, I am deliberately teasing you :)

Next Football Post:  Introducing Cube Functions >>