Counting “Active” Rows in a Time Period: Guest Post from Chris Campbell

April 9, 2013
  • image

  • “There’s a Fight Club up in Delaware City.”

  •     “Yeah, I heard.”

  • “There’s one in Penns Grove too.”

  •     “Bob even found one up in New Castle.”

  • “Did you start that one?”

  •     “No, I thought you did.”

 

 

It Keeps Spreading Smile

About a week ago I was talking to Chris Campbell and some of the other folks at Blue Granite.  Chris mentioned that he has been teaching some PowerPivot classes at the Microsoft Technology Centers, sometimes even in my neck of the woods, but I didn’t know until he told me.

Which, of course, instantly reminded me of the scene above in Fight Club.  I’m sure everyone else makes the same connection right? Smile with tongue out

Anyway, Chris asked if I would be interested in him writing a guest post and I said heck yeah!  So, without further delay, I give you the PowerPivotPro.com debut of Chris Campbell.

-Rob

The Problem

Recently, a customer sent me a question regarding a DAX problem they were working on. They have a Members table in their model that includes attributes of “Start Date” and “End Date” for each member. The question they needed to answer was “How many active members did we have in [fill in the blank]?” I thought this was a pretty interesting question and it seemed like it ought to be pretty easy to do in DAX.

Read the rest of this entry »


Guest Post: Always Show Yesterday, Today, or Tomorrow’s Data

March 28, 2013

 
Today we have a new guest poster – Miguel Escobar.  I’ve been talking to Miguel in email and Skype for a long time now and feel silly that I haven’t asked him to do a guest post until today.  But now, I have, so I can stop feeling silly.

From his writing style and creative approach to solving problems, I think you’ll see that he fits right in.

Cool trick: Always show Yesterday’s, Today’s or Tomorrow’s Data

clip_image002

Executive: Are these values correct?
Excel-guy: yes, but you need to check the dates slicers to see what dates the report is using
Executive: Ugh… I just want to click on the report and see the latest values

If you ever had this situation before let me tell you that you’re not alone on that one…I’ve been there before and it’s time to give you some cool easy tricks on how to set up a Powerpivot report that shows you the yesterday, todays, tomorrow, next week or any type of timeframe (forecasting or that sort of scenario).

Read the rest of this entry »


Modeling Viral and Marketing Growth, Part 3 of 3

January 24, 2013

Why am I doing this in PowerPivot?  Primarily as a challenge.

This is a question I should have answered before I even started down this road.

To be honest, I did it primarily as a challenge – to stretch my brain a little bit.  If I were faced with this exact same task in my daily work, undoubtedly I would just use normal Excel formulas.  In some ways, this modeling exercise has been a deliberate misuse of PowerPivot.  A handful of parameters with no source data whatsoever – this is NOT what the PowerPivot engine was built for, which explains why the PowerPivot solution is actually significantly more difficult than the Excel solution.

“So you’ve been deliberately wasting our time??”

No, I do think there is real value in this exercise, for two reasons:

  1. Brain-stretching with new techniques always comes in handy later.  For instance, on the first post Sergey commented that he’d been thinking about loan amortization measures and this could be applied to that.
  2. I can see this technique being added, as a supplement, to a broader PowerPivot model.  For instance, a model containing lots of real customer data over time, and then a [Projected Customers] measure that forecasts future customer populations based on various assumptions and/or marketing investments.

So with that in mind, here it is:  the final installment of viral/marketing modeling in PowerPivot.

Read the rest of this entry »


Modeling Viral and Marketing Growth, Part Two

January 22, 2013

 
Picking up from last week’s post, the first thing I want to show is that I kinda cheated last time.  To see what I mean, let’s look at Rahul’s original chart:

Viral Marketing Growth in PowerPivot:  Customers Flatten Out Over Time

In Rahul’s Viral Model, Total Customers “Goes Flat” Quickly

In Rahul’s model, if we start With 5,000 initial customers and a viral factor of 0.2, we end up with 6,250 customers and we never get any more!

But in my model from last week, if I use 5,000 and 0.2, customers keep piling up exponentially:

Exponential Ongoing Viral Growth in PowerPivot

In My Model from Last Week, Customers Never Go Flat –
They Just Keep Growing Exponentially

So why the difference?

Read the rest of this entry »


NETWORKDAYS() Equivalent in PowerPivot?

November 22, 2012

 
There is no NETWORKDAYS() Function in PowerPivot

There is no NETWORKDAYS() Function in PowerPivot

A Post on Thanksgiving?

Normally I would take today off and not have a post.  But I’ve posted so many updates about the book lately that it’s got to feel like this place has turned into an advertisement shop, and I want to keep the “real” content up.

So consider this a Thanksgiving “bonus” post – me giving thanks for everyone ordering the book, and everyone tolerating my desire to post updates about “my baby” every five minutes Smile

A Missing Function

It’s funny, I’ve never used NETWORKDAYS() much (if at all) in regular Excel, so I didn’t realize this until someone posted on the MrExcel forums – how do I do a NETWORKDAYS()-style calculated column?

Something like this:

NETWORKDAYS in PowerPivot

Desired Result

So how do we get to this?

Read the rest of this entry »


Last Mini-Post for the Day: Check out Chandoo’s PIVOT version of the Calendar Chart!

September 13, 2012

 
I thought I had seen everything.  I had not.  Chandoo, we bow before you:

Yes, That’s a PivotTable!  Click the Image to View the Post on Chandoo.org

(Oh, and I really need to figure out how to make these animated GIF’s.  Damn!)


PowerPivot Calendar Chart in Excel: Specific Steps for Adapting it to Work With YOUR Data

August 16, 2012

image

Modifying This to Work With Your Existing Workbook Isn’t Hard

Continuation

Given the continued popularity of the Calendar Chart and the post I did on its anatomy, I thought I’d continue today with a more pragmatic “how do I adapt this to work with my data?” post.

Adding the Calendar Chart to YOUR PowerPivot Workbook

OK, so you like the calendar chart but you don’t want to start from scratch in a new workbook?  You already HAVE a PowerPivot workbook and want to just “port” the calendar chart into THAT workbook?

It’s easy.  Probably a 30 minute task, and that includes the time spent reading this post.

Read the rest of this entry »


Explaining the PowerPivot Calendar Chart, Plus an Updated XLSX Download

August 14, 2012

 
image

Hidden Rows and Columns Visible, Color Coded, and Explained
(Slicers Deliberately Moved Aside for Clarity)
(Click for Larger Version)

A Most Popular Post Indeed!

Well the CalChart post was a hit – the second most popular post of this year in fact.  (Second only to Dan Battagin’s spreadsheet formatting post, and that one had the benefit of being directly linked to from the official Excel blog – Dan is a big cheater).

I particularly enjoy how many Excel Pros are arriving at this blog for the first time as a result of the CalChart – you know who you are!  You’re helpless against the luxuriant charms of the CalChart! :)

And you have to have PowerPivot for it to work, muhaha.  Resistance is futile.  Go download it from Microsoft now.  It’s free.

Modifying it to fit your needs

The workbook I made available for download last week included a bunch of unused “machinery” – formulas and cells that I created while I was experimenting with different techniques, but ended up not using in the final version.

Read the rest of this entry »


Introducing… the Calendar Chart!

August 9, 2012

 
Calendar Chart in Excel - PowerPivot Can Do Some Amazing Things

“CalChart” – The Most Absolutely Awesome Thing I’ve Ever Done in Excel
(Data:  Fake UFO Sightings/Alien Abductions – Data I 100% Made Up)

New Chart Type Added to Excel 2010!

Yes, it’s a new chart type.  And yes, it’s been added to Excel 2010.  But not by my former colleagues at Microsoft.  This was done by me, after being inspired by another Excel pro, and with a heavy dose of formatting and sparkline assistance from another.

And it’s not some new fancy software addin or something like that.

It’s formulas.  In the normal Excel grid.

Read the rest of this entry »


Creating a Cycle Plot PivotChart

July 31, 2012

Guest Post by Colin Banfield [LinkedIn]

A cycle chart is a chart designed to show business sales cycles in a manner that cannot be expressed using other standard charts. For instance, a cycle charts can show how monthly sales vary over several years, or how daily sales vary over several weeks. For a good introduction to cycle charts, see this excellent article by Naomi Robbins.

Figure 1 shows cycle PivotCharts using 3M+ rows of sales data from the Contoso database.

 

image

Figure 1 – Monthly sales over years and daily sales over weeks cycle plots (click figure to see an expanded image)

Read the rest of this entry »


Excel 5-Calendar Date Table

May 17, 2012

Guest Post by Colin Banfield [LinkedIn]

image

For some time, I have been looking around for a fairly complete date table in Excel for use with PowerPivot. If you are working with data derived from a data warehouse, a date table is perhaps the most common dimension table that exists in the warehouse. However, not every scenario involves working with a data warehouse directly, and I simply wanted a “portable” date table. I found very little online, the best perhaps being this Excel table offered by the Kimball group (the table has been expanded since I originally downloaded it). I could have modified the Kimball table for my particular needs, but I decided to create one from scratch.  Late last year, Rob posted an article titled the Ultimate Date Table, which is available from the Azure Marketplace. I considered using this table instead of the one I was building in Excel, but the “Ultimate Table” lacks fiscal periods. Much of the analysis work I do includes fiscal periods.

Read the rest of this entry »


Correlating “Fuzzy Time” Events in One Table with Precise Measurements in Another Table

February 26, 2012

 
Precisely-Timed Measurements vs. Imprecisely-Timed Events in PowerPivot - How Do We Relate Across Ranges of Time?

Precisely-Timed Measurements vs. Imprecisely-Timed Events: 
How Do We Correlate/Relate Them?

The Streak is Broken…  and a new streak begins.

Well it was a full six-month run of posting every Tuesday and Thursday, Cal Ripken-like consistency.  Then last week it ended.  I was just drained.  Oh well, time to start a new streak!  Back to Tuesday and Thursday.  So let’s dive in…

Calculated columns.  I am continually realizing how much there is to know about them – stuff I haven’t really been forced to learn since I rely so heavily on Pivotstream’s database team.  My recent work with scientific and medical data sets, starting with the really simple fake data set and then moving into the rat sniffing data, has really driven this home.

So let’s do a quick treatment of fetching data from one table into another, using calculated columns, and in different situations.

When you have a relationship:  =RELATED()

OK, I’ve covered this one before.  When you have two tables and a relationship between them, you can fetch data from the “lookup” or “reference” table into your data table using the RELATED() function.

But there are really only a few places where you SHOULD do this.  You can already include the “lookup” column in your pivots without fetching it into the data table, so there’s little benefit of fetching it.  And doing so will just make your file bigger AND your pivots slower.

So really, I just do this when I’m debugging or exploring my data set.  When I’m done, I delete the column.

When you don’t have a relationship…

OK, back to the rat data Smile.  This will parallel many other real-world examples, such as marketing campaigns etc.  So bear with me even if you don’t find scientific measurements to be interesting.

I have one table called “Sniffs” which records how often (and how deeply) the rats were inhaling:

image

Sniffs Table:  Coded by RatID, SessionID, and TimeID
(Where Each TimeID Represents 1/100 of a Second)

And then I have an Events table.  “Events,” in this case, are things that the rats are doing.  Like…  sniffing each other.  In various places.

image

Events Data – Entered by Humans Watching Time-Coded Video of the Rats
(Note that Time is Recorded in Seconds, not Hundredths of Seconds As in the Sniffs Table)

Event type 4, for instance, is “Rearing” – the rat raising up on its hind legs and sniffing its surroundings.  This event is recorded by someone in the lab watching time-coded video.

THE GOAL:  We want to see how breathing frequency and intensity (data from the Sniffs table) changes during such events – does a rat sniff more or less when Rearing than normal?  (And in Retail, the parallel would be something like “do we get decent lift from our ad campaigns” or “how do spending habits change around a particular event like the Super Bowl?”)

Well, you can’t create a relationship between these two tables, for a few reasons.

Problem 1:  Events are Coded at the Second Level, Sniffs at 0.01 Second

In a single second of elapsed time, there will be 100 rows of data in the Sniffs table, and at most 1 row in the Events table.  Right there, we have a problem.

Then again, each 0.01 second DOES map to a “full” second, in much the same way as a day maps to a year.  The real problem here is…

Problem 2:  Both Tables Are Non-Unique in Time Column

Even if we lined up each 0.01 second from Sniffs with a full second from Events, well, I can have multiple rows in Events that all have the same value for TimeInS, because each row is only unique when you consider RatID and SessionID.

Hmmm…  so how about I just concatenates RatID, SessionID, and TimeInS to form a unique composite column in Events, and then do something similar in the Sniffs table, and THEN I create my relationship?

There are two problems with that – one minor, and one fatal.

The minor problem:  this will make my file quite a bit larger, by adding a column to my largest table (Sniffs).  And it’s the worst kind of column to boot:  a calculated column with a lot of unique values.  Plus, it will form the basis for a relationship, which means my pivot performance will potentially be slow.

The fatal problem?

Problem 3:  The TimeInS Value in the Events Data is Imprecise!

First of all, the video of the rats’ behavior will never be precisely aligned with the scientific equipment that is measuring the rats’ breathing patterns.  There is some “wiggle” in that synchronization, maybe as much as half a second.

Secondly, this is compounded by human error and variability – the act of a rat rearing up to sniff its environment might cover a full two seconds elapsed, but the human being coding the video must pick a single second to flag the behavior.

Taken together, we can see that there’s really a fuzzy “range” defining when the event occurs.

The Solution Part One:  Adding “Time Window” Columns to Events

Let’s say I define the “Event” as lasting one second – 0.5 second before and after the time recorded in the Events table.

Then I add calc columns to my Events table that reflect that:

image

Where the *100 is to convert to the “centiseconds” granularity that I use in my other table (the Sniff table).  PostTimeID is the same, except +0.5 rather than –0.5

The Solution Part Two:  CALCULATE?  In a Calc Column?

Then I can add a calc column to my Sniffs table that “fetches” a corresponding event ID from the Events table whenever there was an event “in progress” for the current sniff row:

=CALCULATE(MAX(Events[EventType]),
   FILTER(Events,
          Events[RatID]=EARLIER(Sniff[RatID]) &&
          Events[SessionID]=EARLIER(Sniff[SessionID]) &&
          Events[PostTimeID]>=EARLIER(Sniff[TimeID]) &&
          Events[PreTimeID]<=EARLIER(Sniff[TimeID])
         )
)

***UPDATE:  I realized that EARLIER() was NOT needed in this formula.  See this post for an explanation.

Notes on this formula:

  1. I had to choose some aggregation function, and I chose MAX, but I could have chosen MIN just as easily.  SUM and AVERAGE would NOT have worked out for me though.  More on this later.
  2. Everything after that is a FILTER expression – with four clauses all “anded” together with the && operator
  3. The first two filter clauses basically just say “only grab rows from Events where the Session and Rat match the Session and Rat on the current row in Sniff”
  4. The function documentation for EARLIER() has a very abstract definition, but in this case you can think of EARLIER() as just meaning “grab the value from the current row in Sniff.”
  5. The last two filter clauses make sure we only grab rows from Events whose time window includes the current row of Sniff.  In other words, if an Event “contains” the time of the current Sniff, it counts
  6. Going back to the MAX() at the beginning of the formula, the MAX essentially breaks ties in favor of events with larger ID’s.  That is of course completely arbitrary.  But in truth there are very, very few overlapping events so it hardly matters.  If I cared more, I could narrow the window to less than +/- 0.5 seconds, or I could pay more attention to my Event ID’s to make sure that higher numbered ID’s truly were the most significant events.

That gives me a result in my Sniff table that basically looks like I had a relationship between the two tables and used =RELATED()

image

OK folks.  I’m tired and about to drop.  But hopefully this inspires some thinking, or at least some questions Smile