The Friendly Neighborhood Operators && and ||

April 22, 2011

David Hager’s guest post below reminded me of something I’ve been meaning to share for awhile.  Here’s a feature of PowerPivot so tiny, so sensible, that I’ve had a very hard time internalizing that Excel lacks it.  In fact, I think I’ve had to re-check Excel 3-4 times to confirm for myself that Excel doesn’t have it.

The feature is the pair of logical operators && and ||.  They are alternatives to the Excel functions AND() and OR(), respectively.  Check out these calculated column formulas:

IF(Product[Color]=”Blue” && Product[Weight]>6, “Heavy and Blue”, “Everything Else”)

IF(Product[Color]=”Blue” || Product[Color]=”Red” || Product[Color]=”Yellow”, “Primary Color”, “Blend”)

The first example is looking for Products that are both blue AND weigh over 6 pounds.  The second example will tag a product as “Primary Color” if it is Blue, Red, OR Yellow. 

Notice how you can also use more than two clauses – neat huh?

Note that using && and || is often a great alternative to the dreaded “nested IF” formula.

Also, I mentioned above that Excel DOES have the functions AND() and OR() that do the same things.  But these operators make for much easier formulas to read, and I think they are a lot easier to write as well.

Using || in a CALCULATE measure

I don’t think I’ve had occasion to use && in a CALCULATE measure, primarily because adding another filter clause to calculate achieves the same effect as an AND.  But || sure is helpful, here’s an example:

[Return Dollars] = CALCULATE([Sales],
     Sales[TransactionType]=”Return” ||
     Sales[TransactionType]=”Credit”)

That measure recalculates the [Sales] measure and only includes transactions that were either Returns OR Credits.  Neat huh?  It’s a great shortcut alternative to adding a calculated column with a nested IF, or even a calc column that uses || – just do it right there in the measure.

And now for another alternative to nested IF’s:  David’s post on using a lookup table.


Guest Post: Calculating a Sum Based on a List Criteria

April 22, 2011

Guest Post By David Hager

It is a fairly easy task in PowerPivot to calculate a sum based on N criteria (after a learning curve). This can be accomplished by filtering your data before importing it in the PowerPivot window, selections made in the Pivot Table, or through some (relatively) simple DAX calculations.

However, if N gets too big, the task get much bigger. Now, imagine a table with thousands of customers and the task is to calculate a sum based on only 1000 of those customers. Conventional filtering does not provide a way to do this. Then, say that this list of customers changes from day to day. It would be nice to have a method to perform this calculation. There is!

The first step is to create a linked table from your Excel workbook to its PowerPivot window. The table in PowerPivot is named LookupList. Then, a relationship is created between that table and an existing Table1 (as shown below).

image

The list shown above does not have 1000 entries, and is just for demonstration purposes. However, I have tested it with >1500 entries and it works :)

Solution One:  Calc Column as Basis for Measure

One solution is to create a calculated column named AggregateList that returns only the amounts that are associated with the customers in the lookup table. The correct formula to do this is:

=IF(ISBLANK(RELATED(LookupList[Customer])),BLANK(),Table1[AMOUNT])

Now, a pivot table can generate the desired result as shown below.

image

By changing the aggregation of the measure, an average or other value can also be obtained.

Solution Two:  Yes/No Column plus CALCULATE Measure

(From Rob):  Rather than produce the numerical column as David did above, my first instinct was to use the LookupTable to generate a simple yes/no column, and then use that as the filter in a CALCULATE measure.

Calc column:

=IF(RELATED(LookupList[Customer])<>"",1,0)

image

and then the measure:

=CALCULATE(SUM(Table1[AMOUNT]),Table1[IncludeCustomer]=1)

This is mostly a matter of style.  Both require a calc column – this option’s calc column is simpler but has a more complex measure.  (Neither approach is all that complicated however.)  This option might make for a smaller file size, but it’s hard to be certain of that, and even if true, it won’t be much.

Back to David.

Refreshing the customer list

I mentioned at the beginning of this article that the customer list provided by the linked table is dynamic by nature. If the list is changed manually, recalculation will occur if the PowerPivot window is opened. However, if the list is generated from Excel formulas some calculation errors can occur. Instead, the use of other methods to update this list is preferable. The details of these issues will be discussed in a future article.


A/B Campaign Analysis with Start & End Date Slicers

April 13, 2011

 
“Because every good story has a beginning, a middle, and an end.”

Awhile back I posted about promotional campaign analysis factoring in seasonal trends.  Now let’s look at another flavor of campaign analysis:  comparing results when a campaign is active versus when it was not.  Sometimes this is referred to as “A/B Testing.”

Let’s start by showing what the results can look like:  a report that has two date slicers – one where you select the Start Date of a campaign, and another where you select the End date:

image Pick Start and End Date, See How Sales Performed On vs. Off Program

The report then shows Sales results “On Program,” which are the sales that occurred between the start date and end date (inclusive) versus the Sales results “Off Program” (sales on all other dates).

Specifically, it shows Sales per Day for On vs. Off Program (because programs run for short periods of time, Sales per Day is a much more “apples to apples” comparison than Total Sales), and then the % change in Sales per Day when On Program versus Off (labeled “Program Delta” above).

(From a quick glance at Program Delta, it’s obvious that this was one awful program that ran from 5/20/03 to 7/8/03, but hey, that’s what we get when we use AdventureWorks as our data set).

Oh, and guess what?  Writing this blog post consumed a lot more time than building the report :)

How This Was Built, Step One:  Date Slicers and Date Measures

The first things you need are two single-column tables of dates – these are used to populate the Start and End date slicers:

 image   image

These tables are NOT related to ANY other tables in the model.  They stand alone, intentionally.  Also, MAKE SURE THEY ARE OF DATA TYPE DATE!  Otherwise the following steps will give you strange results.  Also, make sure none of the date columns in your model have time components lurking in them.

OK, now you need to define a measure on each of those two tables.  It’s the same formula, but I name one [Start Date] and the other [End Date], and assign each one to a different table:

image
image

By the way, the technique I’m showing here is a variation of a technique covered in prior blog posts:  one by Kasper and one by me.  Read Kasper’s in particular if I’m moving a little too fast for your taste in this post.

For grins (and to test this out), you can now slap both slicers and both measures on a pivot and inspect what each of those measures returns:

image

We’ll never place those measures on an actual report, but it’s good to see that they serve their purpose, which is to capture the dates that the user selects on the two slicers.  Note that I changed the captions on each slicer to reflect what their intended use is clear (originally they both just had “Date” as a caption since that was the column name in each table).

Step Two:  Sales Measures that are filtered by those Date Measures

Assuming we already have our base [Sales] measure defined, let’s start with [Sales on Program]:

[Sales on Program] = [Sales](DATESBETWEEN(Dates[FullDate],
      [Start Date],[End Date]))

We’re using the DATESBETWEEN function as a means of filtering our base [Sales] measure, and the [Start Date] and [End Date] measures that we defined previously are the end dates.

Note the use of the Dates table in the formula.  That is a third Date table, separate from the two slicer date tables, and that table IS related to the Sales table.

So…  we are picking up the user’s date selections from two tables that are NOT related to anything, and using the dates they selected to filter the “real” Dates table.  That filter then gets applied to the Sales table because it is related to the Sales table.

[Sales Off Program] is a bit trickier than “On” Program.  I suppose we COULD just subtract “On Program” sales from total [Sales], but just in case someday we need to build some date sensitivity into the base [Sales] measure, let’s defined [Sales Off Program] to be [Sales] filtered to dates OUTSIDE of the selected date range:

[Sales Off Program] = [Sales](FILTER(Dates,
   Dates[FullDate]<[Start Date] || 
   Dates[FullDate]>[End Date]))

First, notice that we are using the FILTER function this time rather than DATESBETWEEN.  There is no DATESNOTBETWEEN or DATESOUTSIDE function, so we have to express the logic more directly ourselves.  But that’s not a big deal – I’m pretty sure that DATESBETWEEN is really just a “skin” over the FILTER function anyway – I’ve used FILTER and DATESBETWEEN somewhat interchangeably over the past year and I always get the same results (and the same performance).

Also note the use of the “||” operator, also known as OR.  Rows from the Dates table are included if they are before the [Start Date] OR after the [End Date].

One more note:  you always want to use FILTER and DATESBETWEEN against the smallest tables you can.  Use them against your Dates table, for instance, rather than against the Date column in your Sales table, because they are MUCH slower against larger tables.

Here’s what we get with those new measures on our pivot, and with some Product hierarchy on rows:

image

Now you see why “per day” versions of these measures are required – the programs are so short that they are dwarfed by the “off” dates.

Note:  These two date slicers are great candidates for disabling cross-filtering and thereby improving the performance (update time) of this report.  See this blog post for an explanation.

Step Three:  Creating the Sales-Per-Day Measures

OK, first we need a [Day Count] measure that we can use as a denominator:

[Day Count] = COUNTROWS(DISTINCT(Sales[OrderDate]))

(Note that I am explicitly counting the Date column from the Sales table rather than the Dates table to account for product lines that did not exist for the duration of the entire Dates table, and therefore did not sell at all – there are tradeoffs here that I won’t go into, but I think this is the right thing to do in most cases when calculating a per-day measure).

[Sales per Day], then, is straightforward:

[Sales per Day] = [Sales] / [DayCount]

Now I need to create “On Program” and “Off Program” versions of that measure.  These formulas parallel the On/Off Sales measures from above:

[Sales per Day on Program] = [Sales per Day]
   (DATESBETWEEN(Dates[FullDate],
    [Start Date],[End Date]))

[Sales per Day Off Program] = [Sales per Day]
   (FILTER(Dates,
    Dates[FullDate]<[Start Date] || 
    Dates[FullDate]>[End Date]))

And lastly, let’s add the [Program Delta] measure, which is really Pct Change in Sales Per Day:

[Program Delta] = IF([Sales per Day on Program]=0,BLANK(),
   ([Sales per Day On Program] – [Sales per Day Off Program])/
   [Sales per Day off Program])

The IF in there is just to catch the case where a product did not sell during the selected dates.  If I leave that IF out, the measure will return –100% for those cases.  Sometimes that is what you want and sometimes it is not – it’s a case by case judgment.

Add some conditional formatting, and here’s the resulting report, repeated again from above:

image

Alternate Approach:  Using a Promotions Table

Rather than use two date slicers like we have here, you COULD have a single Promotions table that lists each promotion and its Start and End date.  Something like this:

image

Now, I just have to go back to my [Start Date] and [End Date] measures and change their definitions to reference the Start and End columns in this table, and “attach” them to this table instead of those slicer tables:

image

image

I don’t have to make ANY other changes.  Now I can remove the Start and End date slicers, and replace them with a single Promo Name slicer:

  image

Or, move Promo Name to Rows and some of the product stuff to slicers:

image

Other Fun Stuff

There are many, MANY other things you can do here too.  For instance, a measure that calculates sales per day for the month leading up to a promotion starting.  Or the month after it ends.  Or the exact same period 1 year ago.  PowerPivot truly does open doors that you’d never consider in traditional Excel (and probably would never get to in traditional BI).


Survey Results Part 2: Tech Pros vs. Excel Pros

April 5, 2011

 
Take the Survey – on average, it’s taken people about 3 minutes to complete.
Results Part One – SharePoint adoption, and overlap/competition with other self-service BI tools.
Results Part Two – adoption by Excel pros, now and projected.

We had a brief conversation on Twitter today regarding PowerPivot adoption, specifically about the ratio of “Tech pros” to “Excel pros” in the survey results so far.

Recall that we’re seeing about twice as many tech pros as Excel pros show up in the results:

PowerPivot Adoption So Far - About 2x Tech Pros as Excel Pros

Updated as of April 4th

There were two questions raised:

  1. Is the ratio reported in the survey at all reflective of the broader reality “in the wild?”
  2. If it is accurate, is the ratio a good or bad sign?

My answers:

  1. Yes – while neither precise nor strictly scientific, I do think the survey provides a good indication
  2. I think it’s a very good sign that we already have half as many Excel pros engaged as tech pros

I will explain, of course.  I will also provide more results analysis, and some projections of my own.

Tech Pros are Proactive, Excel Pros are Incremental

Think for a moment about the “genetic” makeup of Tech Pros versus Excel Pros:

Tech Pros:  Are measured primarily by how much they can do with their specialized technology.  And that technology is constantly in flux.  If you are a BI or a SharePoint pro for instance, your world has been upended significantly in just the last 12 months – PowerPivot has dramatically modified the BI landscape, and SharePoint 2010 is a shift almost as big.

On net, tech pros have to be very actively “outward looking” in order to be successful.  They read blogs.  They attend conferences.  They even voluntarily attend full-day community events on weekends (like SQL Saturday and SharePoint Saturday).  They are constantly experimenting with new toolsets.

Excel Pros:  Excel pros aren’t identified as “Excel Pros” on their resumes – they are business analysts, marketers, etc. who happen to know how to make Excel sing.

And guess what?  Excel has certainly added a lot of improvements over the years, but fundamentally, an Excel pro could have been frozen 15 years ago, thawed out today, and they wouldn’t miss a beat once they got over the shock of the ribbon.

So…  that is why there are no Excel Saturdays.  There are no conferences devoted to Excel.  There ARE Excel blogs of course, but those blogs’ reader bases, while healthy, still only represent a paltry percentage of the millions of Excel pros.

Excel Pros just don’t go out looking to learn new things.  They learn new things only when they encounter a problem that they don’t yet know how to solve.  (See the comments section for a clarification on this).  So when they learn, they learn from incremental experimentation with features they’ve seen but never used – “Oh, I bet THAT’s what the OFFSET function is for!”  Or from colleagues.  Or from forums.  Or maybe they pull a book off their shelf.

99% of Excel skills are learned in response to solving today’s problem today.

Half as Many Excel Pros as Tech Pros ALREADY?  Excellent!

If you’re a BI pro or a SharePoint pro, you’d have to have been living in a cave for the past year in order to have not heard about PowerPivot by now.  Multiple times, actually.  It’s safe to say that audience is nearing awareness saturation.

But if you’re an Excel pro, you are a much more difficult person to reach.  How can PowerPivot be the answer to your problem today if it only works with Excel 2010, which you probably don’t have yet and neither do your colleagues?  Right there, most of the ways you’d typically be exposed to it, or to even have heard of it, are cut off.

And even if you DO have Excel 2010, well, PowerPivot is a separate download isn’t it?  And your copy of Excel doesn’t include a link, hint, or any other suggestion of PowerPivot’s existence.

So I find it VERY encouraging that we already have half as many Excel pros on board as we do tech pros so quickly.  If I’d run this same survey three months ago, I’m positive it would have been more skewed toward tech pros.

More Survey Results Rob!

Let’s drill down into the results in this area a bit:

How People Discovered PowerPivot:  Tech Pros vs. Excel Pros

How People Discovered PowerPivot:  Tech Pros vs. Excel Pros

What jumps out?

  1. MS websites/publications were the first exposure method for about 40% of each audience.  I am not surprised by 40% for the tech pros.  But I AM surprised by 40% for Excel pros.  I suspect that the official Excel blog accounts for a lot of that.
  2. Conferences were more important for tech pros than Excel pros.  Not surprising, but it IS surprising that it’s so close – 20% to 14%.  What conferences are these?  I need to attend!
  3. Non-MS websites are 3x as important to Excel pros as tech pros.  24% to 8%.  Now that DOES jive with my past experience.  Forums, etc.  And MrExcel.com in particular, I suspect.

Next drilldown:

image

Pretty similar.  We’re still seeing, largely, the “first responders,” regardless of specialty.

And that’s an excellent segue way to my next point.

How I Think These Results Will Look Years From Now

Let me add a column to each of those reports, reflecting my personal rough projection of how this survey will look when we run it years down the road:

image

Summary:

  1. Not everyone can be first.  As time goes on, we will see “I was first” drop sharply.  But even more sharply for Excel pros than tech pros.  There just are too many Excel pros for many of them to be first.
  2. Colleagues will be the primary exposure.  More so for Excel pros than tech, because tech pros are naturally just more self-driven to learn new toolsets.
  3. The rise of “Other” for Excel Pros.  Eventually Microsoft will figure out how to bake PowerPivot into the core Excel product.  There are a lot of technical and political challenges for them to address, so there’s no guarantee it will happen soon.  But I think it will happen eventually.

Revealed: The Secret to Reaching More Excel Pros in the Next Year!

It occurs to me that Microsoft and this blog both have the same goal, and challenges, in reaching Excel pros.  Go do a quick Google search on “PowerPivot.”  The first page of results while I am writing this is composed 100% of MS sites, plus this site and its companion, PowerPivotFAQ.com.

So if you’re an Excel pro and you’re doing your “learn PowerPivot incrementally” thing that you do so well, chances are you’ve discovered this site.  That’s why I feel comfortable saying that the survey results so far offer a decent approximation of real adoption trends.

The trick, of course, is getting you to run that query in the first place.  If you’re reading this, I am glad you got started, however you did Smile

So what’s the number one thing Microsoft can do, in the next year, to reach dramatically more Excel pros?

Nothing.  Just do nothing.  And wait.  That’s my unsolicited advice on what will have the greatest impact.

Excel Pro Word of Mouth in ActionSure, there are some clever things to do, like advertising on Excel “haunts” like the most popular forum sites.  Integrating some sort of “awareness” like a help topic or a link into an Excel 2010 service pack would be a MAJOR coup.

But honestly, the ball is rolling, and it’s like that boulder in Raiders of the Lost Ark.  Half as many Excel pros as tech pros already is PHENOMENAL.  Excel 2010 keeps filtering out.  Awareness is spreading.  And word of mouth in a community of millions will soon dwarf any other efforts.

PostScript:  I’m not even making this up!

Right as I finished typing the words above, someone responded to the survey as follows:

  1. Specialty:  Marketing
  2. More than 20 people with desktop PowerPivot installed today (note:  this is the most yet!)
  3. No, I was not first in my company

Heh heh.  In honor of that, let’s do one more projection:

image


The Great PowerPivot Survey: Results Preview

April 1, 2011

Hi folks, the survey is still open, please take the 3 mins (on average) to fill it out if you are using PowerPivot today.

Quick preview of the results so far:

A good mix of respondent types:

clip_image002

And pretty healthy interest in the SharePoint version:

clip_image004

Interest in SharePoint is actually pretty even between Tech folks and Users, 80% and 75%:

clip_image006

More than 20% of the respondents use other self-service BI tools today:

clip_image010

Filtering out the “No usage of self-service BI before PowerPivot,” that self-service use/eval breaks down like this:

clip_image012

Clarification:  the last line in the fragment below, “Qlikview, Other” with 5.7% means 5.7% of the selected respondents evaluated Qlikview AND other self-service tools and decided not to adopt. 

image

So in order to get the full total for Qlikview, you really need to add all the lines that include Qlikview.  This is a function of having survey questions of the format “select all that apply.”  It kinda makes analysis a bit tedious, but that’s a blog post of its own Smile

More results next week.  In the meantime, please take the survey if you have not already.