Tracking Performance After an Event or Treatment

February 2, 2012

 
Jackie Brown - A Tarantino Movie I Need to Revisit Soon

“…then they start thinking that, ‘where there’s smoke there’s fire’ logic…”

Continuing the Impromptu Series of Simple Real World Examples

The Mr. Excel PowerPivot Forum has inspired me to change gears a little bit and focus more time on simple techniques that don’t break the DAX Spicy Scale while still being very useful, everyday stuff. I sometimes take this stuff for granted and end up looking for topics that are much more “clever” when in reality we all can use a good dose (or two, or ten) of basic useful examples.

Medical Treatments – Are They Effective?  Counterproductive?

A user over at the forum named “Mirknin” posted the following question:

I have a relatively simple data set where each row represents a week, each week has a number of data columns. Occassionally, however, we have treatments and I set up Excel with a column where the treatment is recorded as 1 – i.e. a treatment occurred that week, otherwise the cells are left blank.

I AddColumn in PowerPivot to generate a unique identifier (through the usual ampersand formula) to represent whether a treatment occurred in a specific week. Using the Slicer, I can PivotTable to see the data from the row where the treatment took place because all this info is related to that week, but seeing the data for following four weeks would let me see whether the treatment had made an effect.

I guess my query is whether the Slicer can be modified in such a way as to report the row it’s associated with plus subsequent rows…

Data Set:  Let’s Use Trees Rather than People Smile

When I was manufacturing sample data for this I got a bit squeamish about it.  Not for the obvious reasons though – I’ll explain at the end of the post.  But for now, just know that we’re going to work with Trees, like in a Botany lab, and some experimental treatment they are receiving.

We have a calendar of treatments – a list of dates and Yes/No for each:

image

Calendar of Treatments

And then a table of health readings taken for various trees in the lab on specific dates:

image

Tree Health Measurements

Normal Slicer Function Across Relationships

Notice that the two tables have the little “relationship” icons on the Date column?  These two tables are related on that column.  So if I write an “average health” measure and put the Yes/No treatment field on a slicer, I get:

image

image

But if I put Date on rows, we’ll see that all I am “getting” in my health scores is the health of the tree on the day the treatment was applied:

image

But I Want the Health Scores Over the Two Days AFTER Treatment!

I want to track a range of dates after a treatment is applied, so I need to do something different here.

What I need is a new column in the TreatmentCalendar table:

image

If I use THAT column as a slicer, or even better, put it on rows of the pivot, I get what I want:

image

Hey, check it out!  Trees are doing a little bit better in the two days after they receive treatment!  (That’s lucky, since this sample data was created using RANDBETWEEN).

But how did I build that “Recently Treated” Calculated Column?

Did I say up front that this was going to be LOW on the DAX Spicy Scale?  Hmm.

First let me repeat a longstanding recommendation:  if you are using a database as your data source for PowerPivot, you should seriously consider having this calculated column generated in the database and NOT in PowerPivot.  There are multiple benefits of that, as long as it’s an option.

If it’s not an option, you still have two choices.  For something relatively quick and dirty, you can do the calc column in Excel and just paste an entirely new TreatmentSchedule table into PowerPivot.

But for larger data sets, or cases where you can’t have that manual intervention step every time you get new data, you’re gonna have to write the calc column in PowerPivot.

And since PowerPivot lacks A1-style reference, and this calc column has to look at rows other than just the current row in order to get its answer, you have to use the dreaded EARLIER function.

Primer:   Simple Use of the EARLIER Function

Before we do something advanced with EARLIER, let’s cover the basic usage first:

Say I have the following VERY simple table like this:

image

And I want to add a third column that is the total for each customer:

image

How do I do that?

The calc column formula for that third column is this:

=CALCULATE(SUM([Amt]),
           FILTER(ALL(‘Table’),
             ‘Table’[Customer]=EARLIER(‘Table’[Customer])
                 )
          )

Explaining EARLIER() in depth may be its own post.  Yeah.  This was probably the LAST function I learned to use.  It’s an ongoing joke between me and the Italians actually.

Here’s the Formula for the RecentTreatment Yes/No Column

With that background in mind, here is the formula for the Yes/No column.  I actually did it in two columns:

image

That 1/0 column is the tricky one.  The Yes/No column is just a “cosmetic” column built using IF.

Here’s the 1/0 formula:

=CALCULATE(COUNTROWS(TreatmentCalendar),
           TreatmentCalendar[IsTreatmentDay]=”Yes”,
           FILTER(ALL(TreatmentCalendar),
                  TreatmentCalendar[Date] <
                       EARLIER(TreatmentCalendar[Date])
                       &&
                  TreatmentCalendar[Date] >=
                       EARLIER(TreatmentCalendar[Date])-2))

Easy right? Smile

I Promise to Come Back and Explain This!

EARLIER isn’t all that bad really and neither is the formula above.  But it definitely feels a lot harder than an Excel pro wants it to be.  Given the length of this post however I think I’ll do that next time.

Calling Marco and Alberto!

There are a couple of obvious questions here that my simplified example just outright skipped:

  1. What if I have more than one kind of treatment?
  2. What if not all “patients” receive the same treatment, and/or on different days?
  3. Can I use a slicer to control how “wide” the range of dates is rather than hardcoding 2 into my formulas?

The fact that I was dodging those problems in this post is why I was squeamish about using people as the example rather than trees.  And all three of those make this problem a lot harder.

And what do we do when we hit a problem that goes beyond the powers of mortal Excel Pros?  We throw up the Boot Signal of course:

image


Comparing scientific and other data across trials

January 31, 2012

Enjoying the new forum!

I’ve been spending some time on the Mr. Excel PowerPivot forum over the past few days and I must say I am enjoying the experience.  The blog is essentially a one-way form of communication, given that far less than 1% of views result in comments.

But a forum!  Aha!  That is different!  I encourage you to head over there and drop a question, or maybe even try your hand at answering a question or two when they pop up.

Comparison Across Trials:  A Simple and Useful Example

Here’s one that I particularly liked because it’s practical and ultimately pretty simple on the DAX Spicy Scale after a few posts of higher complexity (but also extreme utility). 

A user over there on the forum named “TopDown” asked a question that I will paraphrase here.  Imagine you have a dataset like this:

Scientific Trial Data in PowerPivot

Measurements Table:  A Series of Measurements Loaded into PowerPivot

Basically you have a set of sensors (111 through 115 in this data) and you take a set of measurements at each sensor (temperature and pressure in this example).  Then you repeat the measurements over multiple experiments – trials 1, 2, 3, 4, etc.

Topdown had a few questions:

  1. How do I compare the results of each trial back to the first trial?  Absolute difference versus trial #1 and percent difference versus trial 1.
  2. Do I have to create calculated columns for the absolute and percent difference for each measurement like temperature?  (Topdown said he has MANY more than two measurements)
  3. And if the answer to 2 is “yes,” then how is PowerPivot useful?  Once I have the calc columns, I can create this in a normal pivot.

I think intuitively, Topdown sensed that this is a great place to use PowerPivot, and just couldn’t put his finger on the how.  He was right.  And I really liked this problem because it is so different from the sorts of problems I work on most of the time.

Solution #1:  Nothing Fancy

Hey, if you ALWAYS want trial #1 to be the baseline, this is pretty easy.  Let’s focus on Temperature for a moment, with the understanding that Pressure will be solved the same way.

First we create a basic measure, AvgTemp:

[AvgTemp] = AVERAGE(Measurements[Temperature])

Then we create a version of that measure that ALWAYS returns the value for Trial 1:

[Trial1Temp] =
CALCULATE([AvgTemp], Measurements[Trial Number]=1)

Put both of those on a pivot with Trial Number on rows:

image

The AvgTemp measure respects the Trial Number filter on rows, just like normal measures do.  But the Trial1Temp measure, by virtue of CALCULATE(), overrides that filter and sets it to Trial 1.  The only place the two measures “agree” is for Trial 1, as expected.

A couple of quick subtraction and division measures follow:

[Delta Temp] = [AvgTemp]-[Trial1Temp]

[Pct Change Temp] = [DeltaTemp]/[Trial1Temp]

Yielding:

image

Again, that first row (Trial 1) is pretty useless, but the others are doing what we expect.  If that’s distracting, we can filter the pivot to NOT display Trial 1, or we can put Trial on a slicer.

No calc columns required.  You just can’t do this in “normal” Excel.

We can stop here and have a quite useful solution, but let’s get just a bit fancier Smile

Solution #2:  Let the report consumer choose which trials to compare!

Folks, I am addicted, yes, ADDICTED to what I call “disconnected slicer tables.”  What do I mean by that?  I mean, tables that I import into PowerPivot for the sole purpose of parameterization.  Let the report consumer make a selection, I “harvest” that selection via a simple formula, and then I feed that parameter into my other measures.

I call these tables “disconnected” because I do NOT create relationships between them and the other tables – they are islands.

Here’s one I just pasted into the model.  Note that TrialSlicer2 is identical.

image

TrialSlicer1 Disconnected Slicer Table
(TrialSlicer2 Table is IDENTICAL)

Put both of these new single column tables on slicers and they have ZERO impact on my pivot, since there are no relationships:

image

No matter what I select on those slicers, [AvgTemp] keeps returning 50.65, because there is no relationship to the Measurements table.  The field list is warning me about that, but I don’t want to create one.

Instead, I write measures to “harvest” the user’s selections on those slicers, so that I can use them as parameters:

[SelectedTrial1] =
IF(COUNTROWS(VALUES(TrialSlicer1[Compare To Trial]))>1,BLANK(),
    MAX(TrialSlicer1[Compare To Trial]))

[SelectedTrial2] =
IF(COUNTROWS(VALUES(TrialSlicer2[Compare To Trial]))>1,BLANK(),
   MAX(TrialSlicer2[Compare To Trial]))

The measures are identical except for the tables they reference, and which tables I “anchor” them to in the field list.  The results look like:

image

OK, now that I have those two parameter measures, I can modify the [Trial1Temp] measure from above to no longer be “hard-coded” to Trial 1.  Now I can make it variable by parameterizing it using [SelectedTrial1]:

[Trial1Temp] =
CALCULATE([AvgTemp],
   FILTER(Measurements,
          Measurements[Trial Number]=[SelectedTrial1]
   )
)

Note that I only have to use FILTER() inside the CALCULATE simply because the “normal” parameters to calculate will NOT accept a measure name on the right side of the equals sign.

This new measure responds nicely to the selections made on the first slicer, whereas AvgTemp keeps ignoring it as always:

image

Now I can write a new [TrialXTemp] measure that is exactly the same as Trial1Temp except that it respects [SelectedTrial2] instead of [SelectedTrial1].  Update my Delta and Pct Delta measures and now I have a completely interactive pivot:

image

Neat huh?  And still pretty simple. Smile

Download the workbook, more variations, etc.

Download the workbook here

And for another example of disconnected slicer tables, see here.

And naturally, if you wanted to write a measure that automatically returns the delta between trial N versus trial N-1, you can always use the Greatest Formula in the World.


Mini Post #3: Measures Can Return Text!

January 26, 2012

 
image

Yes, That IS a Measure

Third and Final Mini-Post for Today

Got a question from a friend yesterday that made me realize I should share this:  in PowerPivot, your formulas do NOT have to return numbers.  As I’ve shown several times, they can return dates, but they can also return just plain and simple text.

For instance, check out the formula for that Trend measure above:

[Trend] = IF([$ - Sales] > [$ - Sales YAG], “Up”, “Down”)

[$ – Sales] and [$ – Sales YAG] are both measures (“current sales” and “sales year ago.”)  So this IF() is 100% dynamic and responds to whatever fields and filtering/slicing you’ve done on your pivot!  If sales have gone up since last year, it returns “Up,” otherwise it returns “Down.”

One of those cool little tricks that comes in handy from time to time Smile


Prepayments & Deferred Revenue

January 24, 2012

 

Guest Post by David Churchward 

Prepayments & Deferred Revenue Layout Example

You’ll often hear us Accountants referring to things like adjusting for timing differences or prepaying costs or deferring revenue.  This is often interpreted as “massaging the numbers”, but, believe me, there is a very reasonable theory behind it.  These sort of adjustments can be made for a number of reasons, but the main underlying concept is that adjustments have to be made to ensure that the transaction is reflected in the period to which it relates.

In this post, I’ll explain how prepayments and revenue deferrals work together with the DAX measures that drive these calculations.

What are Timing Differences?

It’s worthwhile just taking the time to understand what we’re trying to do here.  Take, for example, a support contract that you bill to your customer in advance for 12 months.  You invoice the customer in advance and the customer duly pays.  However, as per accounting principles, you have to reflect the fact that the invoice relates to 12 months worth of support.  In it’s simplest sense, your Profit and Loss should therefore see 1/12th of that revenue in each reporting month.  Since the invoice is for a full 12 months, we have to create transactions to defer the element that relates to future periods, entering the balance onto the Balance Sheet.

As a quick example of the calculation, at the end of the first month to which the support contract relates, you would defer 11/12ths of the invoice value:

Deferral Value = Invoice Value * (Number of Future Months / Total Months on Invoice)

We would therefore see 1/12th of the invoice value on our Profit & Loss and the remaining 11/12ths would be sat on our balance sheet.

Referring to Prepayments separately from Deferred Revenue (or Deferred Costs) for that matter is simply a matter of terminology.  The underlying concept of the calculation is the same.  I consider everything to be a Prepayment in this sense.  The only difference is where the transactions sit in the Profit & Loss and Balance Sheet statements.

Note – Accountants may defer revenue or costs for other reasons.  This post is simply a calculation to reflect the timing difference adjustment.

Days or Months

In the simplified example above, I’ve used a monthly concept that allows for a simple allocation of the transaction value to the Profit & Loss in equal monthly values.  However, as we know, the number of days in each month can differ.  For this calculation to be more accurate, we should pro-rata the calculation based on the number of days in that month.

The DAX measures that I show in this post will allow for selection of Monthly Deferral / Prepay or Daily Deferral / Prepay.  Which you use depends on your accounting policy.

Data Structure

My data structure is as follows:

Fact Table (Prepayments)

I have a core fact table which contains all transactions that require a prepayment or deferred revenue adjustment.  I have called this table Prepayments.

Prepayments Fact Table

The key elements to this table are as follows:

Date – This is the transaction date of the invoice posting

Nominal_PandL – This is the Profit & Loss Nominal Ledger account that the invoice was posted to

Nominal_Prepay_BS – This is the Balance Sheet Nominal Ledger account where the prepayment or deferral will be posted to.

Start / End – These are the start and end dates of the transaction.  If the invoice relates to a period from 1st Jan 2012 to 31st Dec 2012 then these dates would be 01/01/2012 and 31/12/2012 respectively.

Method – This is either MONTH or DAY depending on the level of granularity that you want your calculation to be calculated at.

PandL_Values – This is the total value of the invoice charge.

Prepay_Start / Prepay_End – I’ll come onto these alternative dates shortly.  In brief, these exist because your prepayment timeframe may differ from the strict dates to which the invoice relates.

Days Table

Prepayments Days Table

This table holds a sequential, unbroken list of dates (Date) together with associated attributes including Month End Date, Previous Month End Date, Month Start Date and Next Month Start Date.  These attributes become relevant in our measures and, whilst these can be calculated at run time, it’s easier to take care of them this way.  The purest would be correct in suggesting that there’s a potential performance impact, but dealing with these aspects in our measure may confuse the issue for now!

Months Table

Prepayments Months Table

Similar to the Days table, this table holds a collection of attributes relevant at the month level including Month End Date, Previous Month End Date, Month Start Date and Next Month Start Date.

Start Dates Table

Prepayments Start Dates Table

This is another sequential list of dates which is joined to the Start field on the Prepayments table.  I’ll come on to the PrepayStartdate field shortly.

End Dates Table

Prepayments End Dates Table

This is another sequential list of dates which is joined to the End field on the Prepayments table.  You’ll notice the absence of a third field which, again, I’ll come on to.

Table Relationships

Prepayments Table Relationships

The only relationships that exist are between the Prepayments table and the Start_Dates and End_Dates tables.

Dates That Span Periods

A charge period can be anything.  In it’s simplest form, it would be for complete months starting on the first day of a month and ending on the last day of a month.  However, things are never that simple.  Unfortunately, you could get a charge that is for 3 months from 14th Jan 2012 to 13th Apr 2012 as an example.  The time between these two dates is indeed 3 months, but the charge covers 4 discrete calendar months.

If you’re using a Days method of apportionment, this isn’t a problem.  However, if you’re apportioning on the basis of months, you have 3 options in this example:

  1. Take the 3 equal months charges as Feb, Mar and Apr
  2. Take the 3 equal months charges as Jan, Feb and Mar
  3. Take equal monthly charges for Feb and Mar with a pro-rata charge for Jan and Apr.

In truth, if you’re going to even attempt option 3, you would probably be better off jumping straight for the Days apportionment and circumvent the problem!  Options 1 and 2 depend on accounting policy and president.  The prudent option would probably be to go for option 2 for costs (taking them as early as possible) and option 1 for Revenue (taking later rather than earlier).  On the basis of consistency, I feel it’s best to prescribe this in your accounting policies and then use the same method for everything.  Having said that, the best option is almost certainly to take a Days approach.

For the purpose of this post, I’m going to use option 1.  This means that I’ll translate any start date that isn’t the first day of the month to the first day of next month.  I create two calculated fields on my Prepayments table as follows:

Prepay_Start

=IF(Prepayments[Method]=”MONTH”,RELATED(Start_Dates[PrepayStartdate]),Prepayments[Start])

 

Prepay_End

=IF(Prepayments[Method]=”MONTH”,RELATED(End_Dates[MonthEndDate]),Prepayments[End])

These measures retrieve an alternative date from the Start_Dates and End_Dates table.  For any mid-month dates, this table holds a PrepayStartdate of the first of the next month.

The Months Approach

Firstly, let’s just remember our simplified equation to give us the deferral value:

Deferral Value = Invoice Value * (Number of Future Months / Total Months on Invoice)

And now we’ll jump straight into the DAX

Month_Prepayment

=IF(COUNTROWS(VALUES(Months[MonthEndDate]))=1,

IF(LASTDATE(VALUES(Prepayments[Prepay_Start]))<LASTDATE(VALUES(Months[MonthEndDate]))

&&LASTDATE(VALUES(Prepayments[Prepay_End]))>LASTDATE(VALUES(Months[MonthEndDate])),

CALCULATE(

SUMX(Prepayments,

Prepayments[PandL_Value]

*(

(CALCULATE(COUNTROWS(Months),

DATESBETWEEN(

Months[MonthEndDate],

LASTDATE(VALUES(Months[MonthEndDate])),

LASTDATE(End_Dates[MonthEndDate])

         )

       )-1

     )/

CALCULATE(COUNTROWS(Months),

DATESBETWEEN(

Months[MonthEndDate],

LASTDATE(Start_Dates[PrepayStartdate]),

LASTDATE(End_Dates[MonthEndDate])

         )

   )

  )

),

Prepayments[Method]=”Month”

    )

)

)

The first two IF statements in this measure calculate when we want the rest of the measure to evaluate.  We want this measure to evaluate when:

  1. The number of month end dates on our report is 1 (because anything else is likely to return an error)
  2. The prepayment start and end dates fit within the range of our report.  This, essentially, tells the report which month end dates to display and the rest of the measure uses these dates to evaluate the correct result to display.

We then use a CALCULATE statement to calculate our basic equation:

Deferral Value = Invoice Value * (Number of Future Months / Total Months on Invoice)

Invoice Value = SUMX(Prepayments,Prepayments[PandL_Value]

Number of Future Months =

*(

(CALCULATE(COUNTROWS(Months),

DATESBETWEEN(

Months[MonthEndDate],

LASTDATE(VALUES(Months[MonthEndDate])),

LASTDATE(End_Dates[MonthEndDate])

         )

       )-1

    )

 

This calculates the number of month values in the Months table using the COUNTROWS function that sit between the MonthEndDate on our report and the MonthEndDate of the end date of our transaction.  I then subtract 1 to reflect that the MonthEndDate of the report is not included as it is the current reporting month.

Total Months on Invoice =

/

CALCULATE(COUNTROWS(Months),

DATESBETWEEN(

Months[MonthEndDate],

LASTDATE(Start_Dates[PrepayStartdate]),

LASTDATE(End_Dates[MonthEndDate])

         )

   )

Once again, I’m counting the number of entries in the Months table that sit between the start and end dates of our prepayment timeframe.

I then apply a filter to the CALCULATE statement to ensure that this measure is only applied to transactions where the Method is set to “MONTH”.

So why do we have to use SUMX for the Invoice Value?

The simple answer is that we need to conduct this equation at the transaction level of granularity and then SUM the results.  The main reason for this is that we’re using a division calculation which is only relevant at that level and becomes confused and meaningless when aggregated.

In order to show this, I’ll explain a simple example of converting USD to GBP with some theoretical exchange rates:

SUMX Example

What are the values for X and Y?  We obviously know that Y needs to be 150, being the sum of the two GBP values.  However, if we use SUM, we wouldn’t get that answer.  The answer that you would get depends on how you condition your measure but you wouldn’t get an answer of 150, it’s more likely to be 150/2=75 (where MAX is used in the equation) or 250/3.5=71.42 (where SUM is used) or 250/1.75=142.86 (where SUM and AVERAGE is used).  The correct value only comes about when you use SUMX which sums the underlying outcomes of the equation and evaluates at the granularity level of the table that you specify (in our case the base level fact table – Prepayments).

This is a very brief outline of a very complex and clever function.  In short, if you want to simply add the outcome of underlying equations, SUMX is the way to go.

It’s worth checking out one of my favourite ever posts to start getting to grips with SUMX – SUMX() – The 5 point palm exploding fxn technique.

The Days Approach

This uses the same methods and concepts as the Month approach.  The only difference is that we’re evaluating to the number of days in the relevant periods of time.

Days_Prepayment

=IF(COUNTROWS(VALUES(Months[MonthEndDate]))=1,

IF(LASTDATE(VALUES(Prepayments[Prepay_Start]))<LASTDATE(VALUES(Months[MonthEndDate]))

&&LASTDATE(VALUES(Prepayments[Prepay_End]))>LASTDATE(VALUES(Months[MonthEndDate])),

CALCULATE(

SUMX(Prepayments,

Prepayments[PandL_Value]

*(

(CALCULATE(COUNTROWS(Days),

DATESBETWEEN(

Days[Date],

LASTDATE(VALUES(Months[MonthEndDate])),

Prepayments[Prepay_End])

                    )-1

  )/

CALCULATE(COUNTROWS(Days),

DATESBETWEEN(

Days[Date],

Prepayments[Prepay_Start],

Prepayments[Prepay_End]

      )

             )

         )

      )

,Prepayments[Method]=”Day”

  )

       )

  )

Prepayments Total

We now have two measures to evaluate our Days Prepayments and our Months Prepayments.  We can bring these together in one measure as follows:

Prepayments_Total

=SUMX(Prepayments,[Month_Prepayment]+[Days_Prepayment])

Once again, I have to use SUMX to ensure that this measure sums the total of the underlying calculations.

Prepayments Without Values Not Prepaying

The “Values Not Prepaying”

So we’ve dealt with the reducing balances associated with Monthly and Daily Prepayments and Deferred Revenues.  However, what about a situation where the invoice charge relates to future months that haven’t yet been reached?  For example, we receive an invoice in January that relates to April and beyond, as in the screen shot above where the last entry has a transaction date of 10th Jan but relates to a period April to October.  In this instance, we have to prepay (or defer) the full amount of the invoice for January, February, March and April.

Value_Not_Prepaying

=CALCULATE

(

SUMX(Prepayments,

Prepayments[PandL_Value]

),

FILTER(Prepayments,

Prepayments[Date]<LASTDATE(VALUES(Months[MonthEndDate]))

            ),

FILTER(Prepayments,

Prepayments[Prepay_Start]>LASTDATE(VALUES(Months[MonthEndDate]))

  )

   )

In this measure, we once again use SUMX to ensure that we SUM the outcome of the underlying evaluations.  However, we apply two filters to limit the selection to situations where certain criteria hold.  This is an alternative to preceding the CALCULATE with an IF statement to conduct this evaluation first.

  1. The first filter evaluates when transaction date is less than the MonthEndDate of the report column
  2. The second filter evaluates when the Prepay_Start date is greater than the MonthEndDate of the report column.

We then add this into our Prepayments_Total measure as:

Prepayments_Total

=SUMX(Prepayments,[Month_Prepayment]+[Days_Prepayment]+[Value_Not_Prepaying])

 

Prepayments Balance Sheet Layout

And there you have it.  Our Balance Sheet layout above shows the total value of each transaction that needs to be prepaid or deferred in each month.

What This Means to Accountants

For those Accountants amongst you, you’ll no doubt have cottoned onto the fact that, if you’re careful, this could be your prepayments and deferred revenue reconciliations conquered.  The process that I’ve always implemented is as follows:

  1. Ensure that all transactions are posted to the Profit and Loss in full, as if the P&L was going to take the whole value in one period.
  2. Capture prepayment start and end dates on each transaction
  3. Extract the data into a PowerPivot application such as this.
  4. Refresh to give you the values that need to be prepaid (as per the above screenshot)
  5. Journal the appropriate values between the Profit and Loss and Balance Sheet on a reversing journal.
  6. Next month the previous transaction will reverse and you do the same again.

By doing this, you’ve got a self reconciling Balance Sheet for Prepayments and Revenue Deferrals (and all transactions for that matter that have a straight line release – for example, you could use this for elements of Fixed Asset Depreciation but that’s a whole other post).

And One More Thing

You may have noticed in the first graphic that there was also a Profit & Loss impact section to the report.  I’ll come onto this in a future post but it uses similar concepts to the measures in this post.  With this in place, you’ve got elements of a forecast P&L, you’re on course for Recurring Revenue Assurance analysis and you’re putting in place the building blocks of a full Finance Business Intelligence Application that’s just dying to be exposed in SharePoint as per Rob’s post In the Browser Aesthetics Yield a Greater Return.


Data of Different Grains: A Followup

January 19, 2012

 
image

Sales Measure Returns the Same Value for Everything:  We Solved the Calendar/Periods
Problem But Now We Have a Products/Category Problem

Quick Recap:  We Created a Separate “Periods” Table

In last week’s post, I explained how PowerPivot can very easily help you solve the “Budget vs. Actuals” problem or any other problem where you have data sets of different granularities that you want to compare in a single report.

The crux of the problem in my example was that my Budget table only budgeted down to the Month level whereas my Sales table went down to the day level.

To solve that, I created a separate table that only contained months (no sales or budget data, just months), and assigned a “PeriodID” to each month.  (I could have just as easily called it MonthId).

image

The Newly-Created Periods Table

I then linked that PeriodID column to the PeriodID column that already existed in my Budget table:

image

And then created a new PeriodID column in my Sales table using a formula:

image

Resulting in a three-table setup that looks like this:

image

Which then allows me to use fields from that Periods table on slicers (or on rows or columns) while using measures from both Sales and Budget in the same pivot:

image

And that, my friends, is a very useful thing indeed.

Could I have just added the PeriodID column to Sales and Stopped?

This is an important question for us to stop and consider.  Once I had a PeriodID column in both Sales and Budget, could I have skipped the creation of the Periods table and just moved on?

No.  I could not.  And there are two reasons.

Reason #1:  I can’t create a relationship between Sales and Budget.

image

Each PeriodID Appears More than Once in Each Table, So You Can’t Relate
Sales Directly to Budget or Vice Versa

That’s just the way relationships work.  The matching column needs to be unique (no value appears more than once) in at least one of the two tables.  No need to really worry about why, but if you think about how VLOOKUP works when you set the last argument to FALSE, it’s pretty similar.

Reason #2:  Without a relationship, PeriodID from one table doesn’t work with measures from the other.

Look what happens to Budget measures if I use PeriodID from the Sales table:

image

PeriodID from Sales Table Results in Broken Budget Measures
(and a relationship warning)

Yep, all busted.  And if I use PeriodID from the Budget table, I’ll get good Budget numbers but the Sales numbers will be hosed instead.

THAT is why the third, new, separate Periods table is required.  When I use fields from THAT table on my pivot, I can use measures from both Sales and Budget at the same time and nothing is broken.

So we use the Periods table as a master filter of sorts, one that can drive filters down into both Sales and Budget.

“Master Filter” Table is a MUCH better description than “Bridge”

I debated using a “bridge” metaphor to describe the role of the Periods table but I think that’s misleading – a bridge helps you travel from point A to point B:  A –> Bridge –> B.  That would make you think that we’re “starting” on the Sales table and “traversing” the Periods table to get to the Budget table.

But that’s not how it works.  Neither Sales nor Budget is “in charge” here – neither is the starting point.  There are two separate paths here:  Periods –> Sales, and Periods –> Budget.  The diagram drives that home:

image

So you should think of the Periods table as being “in charge” here.  Which makes sense, since you have to use it on the pivot, and can’t use PeriodID from either Sales or Budget.

Extending that Rule to Fields Other than PeriodID

This is the crux of what I left out of my previous post, and it’s an important rule to drive home:

When you have separate data tables that cannot be combined into a single table (as is the case with Sales and Budget), you CANNOT use row/column/slicer fields from one table in a pivot with measures from the other table.  You MUST have a separate “master filter” table, and use fields from THAT table in your pivot.

So does that apply to fields other than PeriodID?  You betcha.  Guess what happens if you use a Product Subcategory or similar field from the Budget table in a pivot that contains a Sales measure (or vice versa)?  Yeah, same problem as trying to use PeriodID from Sales as we did above:

image

What’s the solution?  A separate master filter table for Products!

And that may be as simple as a single column:

image

Wash, Rinse, Repeat

Once you have measures from two different data tables, every row/column/slicer field you use requires a separate master filter table.

It may sound tedious but it’s actually quite quick.  To easily create such a table you can even just use  a pivot: 

  1. Put the ProductSubCat field from either Sales or Budget on rows.
  2. Copy/paste that column of unique values into PowerPivot as a new table.
  3. Create the relationships to Sales and Budget.  
  4. Use fields from the newly created master filter table in your pivot.  Done.

(Better to get someone to add a new view or table to a database for you if that’s available, but copy/paste works well otherwise).

Also Useful for Single Table Situations!

One last point:  even when you only have a single table of data, like Sales, it’s often still quite useful to create separate master filter tables.

Why is that?

Because when you create master filter tables, it gives you the opportunity to remove lots of columns from your “big” table.

Imagine a sales table that, in addition to columns like Quantity and Amount, also contains columns like CustomerID, CustomerGender, CustomerAge, CustomerAddress1, CustomerAddress2, etc.

If you leave CustomerID in the sales table but “move” all of the other customer columns into a separate Customer table, and then link the two tables by Customer ID, you can reduce the number of columns in your Sales table by a significant number.

And that can sometimes shrink your files dramatically, AND speed them up.  See this post for an example.


Sales/Budget: Integrating Data of Different “Grains”

January 12, 2012

 
image

Latest Article for CIMA Insight

Some of you know that I’ve been writing a series of articles for the Chartered Institute of Management Accounts, aka CIMA.  Up until this point those articles have either focused on the organizational impact, the “why” of PowerPivot.  Things like how it contrasts with traditional BI and traditional spreadsheets.  How it saves costs and delivers more.

And then I started to introduce basic topics that have already been covered at length here on this blog, just simple how-to-get-started stuff.

But this month is the first time I wrote something for CIMA that I have not really covered here.

The Problem:  60k Sales Rows vs. 2k Budget Rows

PowerPivot Sales Data Very Granular Day Level

Sales Table with 60 Thousand Rows

PowerPivot Budget Table Less Granular Month Level

Budget Table with Two Thousand Rows

The Desired Result:  A Single Unified Report

PowerPivot Report Combining Day Level Actual Sales and Month Level Budget Granularities

The Solution

Diagram of How to Integrate Different Granularities of Data in PowerPivot

Here’s where I have simplified it a bit for the introductory audience.  I likely have a Calendar table in a well-designed model, and that doesn’t appear in the diagram.  I also likely have to solve for more than just Time granularity – the Budget is also likely less granular at the Product level as well.

But to introduce this powerful concept, I think a simple example is best so I kept it clean.

Click here to read the full (short) article and see the details of how I create the linkage between the two tables.

Popular Topic This Month!

Everyone has this on the brain this month it seems Smile

First, in the exact same issue of CIMA, there’s an article about handling Sales vs. Budget using traditional Excel.

And Marco Russo has written one about this same topic in PowerPivot and BISM Tabular.  As usual, his approach is excellent, and quite different from what my more primitive mind conjures.

I didn’t even have to use the Boot Signal:


Switching Subtotals On & Off Dynamically Using DAX in PowerPivot

January 5, 2012

Guest Post by David Churchward

Cash Flow Statement With and Without Subtotals

Subtotals on or off for a field - They’re the options that you get.  That’s probably great for a lot of situations, but I would put money on the majority of readers of this post having had a situation where they’ve asked “where’s the option to not display a subtotal when there’s only one record being shown?”.  Frustrating isn’t it?  You end up with a load of subtotals simply repeating the same value as it’s own one subset member because one of your other report elements has a subset of data that does need subtotalling.  You can, of course, toggle the show and hide for each member, but wouldn’t it be better if it was dynamic?

Fortunately, DAX has the answer.  It’s not the answer that I would have hoped for because this one feels like it should be an option setting somewhere.  However, we can deliver the same functionality by writing it into our measures.

In this post, I’ll explain how to create a dynamic measure that works out whether a subtotal is needed and displays that subtotal accordingly.  I’ll be using the report that was created as part of the Cash Flow Statement post.

The Subtotal Issue

Let’s refer back to the final report that I created for the Cash Flow Statement.

Cash Flow Statement With Totals2

In the column CF3_Name, the only subtotal that I want to see is the one on Working Capital as this is the only one that has any lower level detail.  You’ll notice that all other areas are carrying subtotals  which only serves to repeat the number above it.

COUNTROWS and COUNTX

In order to solve this problem, you obviously need to know how many subset elements there are that would make up your subtotal.  It would be reasonable to jump straight for COUNTROWS.  Let’s examine that quickly with a measure that we’ll call CF4_Headings_No in the hope that it will tell us how many headings are in our subset.

CF4_Headings_No

= COUNTROWS(VALUES(CF4[CF4]))

COUNTROWS Separate Tables

On the face of it, that doesn’t appear to give us what we want.  We have the same results for all headings and CF4_Name repeats all of the same headings.  The reason that this happens is because CF3_Name and CF4_Name are held on separate tables.  Although they hold relationships to the same table, PowerPivot doesn’t pre-determine which combinations of these fields exist, even at run-time.

So let’s look at what happens when these headings are held on the same table with the same COUNTROWS formula (but obviously changing table name etc!).

COUNTROWS Same Table

That looks better right?  We can see all of the correct combinations of CF3_Name and CF4_Name.  On the one hand, this is good because we can distinguish the fact that Working Capital is the only element that has more than 1 subset item and therefore the only element that requires a subtotal.

However, there’s one big downside here.  For all other sections, we can’t distinguish between a subtotal line and a detail line.  There’s a couple more downsides here too:

  1. I have to create a table that holds all of the combinations of these headings.  This isn’t the biggest issue as I have a table called GL_Headings that holds the codes associated to these headings.  I can create a RELATED()measure in my PowerPivot dataset to return the heading name.  However, why bother if we don’t have to?
  2. Don’t underestimate the power of the function COUNTROWS(VALUES(TABLE[FIELD])) returning a value of 1 which happens when headings are held as a separate table.  This opens up a whole host of functionality and it’s probably one of the most important aspects of PowerPivot in my opinion.

So, what’s the final analysis of COUNTROWS

  1. Using COUNTROWS against a separate table with unique values for each heading gives us the means to distinguish between subtotal sections and detail sections of our report.
  2. Using COUNTROWS against a combined table with heading combinations tells us the CF3 headings that carry detail where we do want to see a subtotal.

I’m sure you’ve probably deduced where I’m going with this as the combination of these two elements probably tells us what we need to know to figure out whether a subtotal is required.  Unfortunately, this is not entirely true.  The problem is that the combination of these elements won’t operate in context unless the context is directly applied against the heading table (ie you put a slicer on the report for the cash flow heading elements).  What about a date slicer?  If you apply a date slicer which is a separate table, these measures won’t react.

Bring on COUNTX

Let’s jump straight into the DAX on this one and see what it’s doing by creating a measure called CF3_Total_Reqd.

CF3_Total_Reqd

=COUNTX(CF4,[Cash_Flow_Statement])

COUNTX is analysing our CF4 table in the context of our measure [Cash_Flow_Statement] (as calculated in the Cash Flow Statement post).  The context applied to [Cash_Flow_Statement] is carried across to our COUNTX formula.  We run this against our CF4 table because we want to get the number of detail heading names that carry a value in our [Cash_Flow_Statement] measure.

So what do we get:

COUNTX

Our CF3_Total_Reqd measure using COUNTX is telling us how many headings are used in our dataset for the MonthEndDate shown on our report.  This is exactly what we want.  What’s even more special is that if, in a particular month, our [Cash_Flow_Statement] measure only has one heading in the Working Capital section, the Working Capital Total would return a value of 1, which is a scenario when we want to switch our subtotal off.

But there’s a problem

Our COUNTX measure, as it stands, is returning a count where we have a heading, but it isn’t telling us when we have a valid blank entry at the CF4_Name level.  As a result, any CF3 category that doesn’t carry CF4 heading detail, isn’t telling us anything.  I know we have Operating Profit elements, but we’re not getting any counts returned.

But what if we specify the column that we’re analysing in our CF4 table?  We can do this by adding in VALUES and specifying the column in our CF4 table.

CF3_Total_Reqd

=COUNTX(VALUES(CF4[CF4_Name]),[Cash_Flow_Statement])

COUNTX With VALUES()

We have a result!  And our Working Capital section is behaving itself too!

And now for the Subtotal Switch On / Off Measure

Consider our two measures:

  1. CF3_Total_Reqd this tells us when a total is required ie CF3_Total_Reqd > 1
  2. CF3_Total_Reqd also tells us where detailed elements exist, ie returns a value of 1, but this measure does the same for those associated subtotals.  CF4_Headings_No, at the same time, helps us distinguish, generally and irrespective of context, between detail and subtotal levels on our report ie CF3_Total_Reqd = 1 AND CF4_Headings_No = 1

If we combine these elements, we get a measure that we’ll call Cash_Flow_Heading_Tidy

Cash_Flow_Heading_Tidy

=IF(

[CF3_Total_Reqd]>1||

([CF3_Total_Reqd]=1&&[CF4_Headings_No]=1),

[Cash_Flow_Statement],

BLANK()

)

Cash Flow Statement Major Subtotals

That gives us what we want, right?  Well not quite.  The Cash Flow From Operations section is now right.  However, the other sections in my CF2_Name column have reached the criteria for their totals to be eliminated.  If I don’t want this effect (and I don’t because this is a major total), I have to specify for my Cash_Flow_Heading_Tidy to directly include those totals.

Cash_Flow_Heading_Tidy

=IF([CF3_Total_Reqd]>1||

([CF3_Total_Reqd]=1&&[CF4_Headings_No]=1)||

COUNTROWS(VALUES(CF3[CF3_Name]))>1,

[Cash_Flow_Statement],

BLANK()

)

Cash Flow Statement Complete

There you have it.  All subtotals behaving properly.


Cash Flow Statement in PowerPivot

December 15, 2011

 

Guest Post by David Churchward

Cash Flow Statement

There’s few, if any, reports that I consider to be more important than the Cash Flow Statement.  Every business needs one, probably above any other report.  There’s numerous different versions of Cash Flow reports and, in reality, it can become a very personal thing to a company so I’m not going to propose to provide a one size fits all report.  In addition, I’m not going to proclaim to provide an IAS7 – Cash Flow Statement (that’s International Accounting Standards) textbook report because this isn’t designed as an accountancy lesson (but, if you’re interested, the method I’ll show here relates to the Indirect Method in IAS7).

With that said, if you consider this post alongside Kasper’s excellent post Building a cash flow statement in PowerPivot using Dynamic Measures in DAX, you should have all the tools you need to produce any Cash Flow Statement that you require.

The Accountancy Bit (very quickly!)

Just before we start, a quick accountancy point to consider.  Kasper’s approach considers cash payment and receipt transactions.  My model uses all transactions in the general ledger with the exception of Bank transactions.  On the basis that double entry exists (ie every transaction has an equal and opposite effect), all transactions that aren’t the bank transactions must equate to the same value.  The reason we use the “other” side of the account is because this side of the transaction tells us what the cash transaction was for.  As a result, we should simply be able to aggregate transactions into Cash Flow categories.

This seems quite straight forward.  Unfortunately, there are a few exceptions that need to be handled.  I’m only going to explain one which should give you the means to adjust for others.

Depreciation is purely an accounting adjustment.  This has the effect of crediting the balance sheet and debiting the P&L.  The effect of crediting the balance sheet has the effect of reducing the value that will be shown as Fixed Asset additions.  We therefore net off the depreciation transaction as if it never happened and I’ll show this later in this post.

The Dataset

My dataset (FACT table) contains 3 columns:

  1. GL_Account– this is my general ledger code.  Every transaction that forms the trial balance will carry a general ledger code and this tells us what the transaction was for.
  2. Month_End_Date – this date tells us the financial period to which the transaction relates.  I would advise using the month end date of the financial period in which the transaction will be reported as I’m doing in this example. (or a period number as in Rob’s Greatest Formula In The World posts).  It should be noted that a transaction date may not be in the same calendar period as the reporting date so care should be taken.
  3. Amount– this is the transaction value.  Using normal accounting signage, this will be a positive value for debits and a negative value for credits.

My dataset contains all transactions processed on the system – this is PowerPivot, it can handle it – and this essentially forms the basis of the trial balance.  If you have a lot of old data and don’t want to include it all, you could substitute historic data with opening balance values for each general ledger code.  This simply aggregates a lot of old transactions into one balance (per general ledger code) for an opening balance date, whilst maintaining the overall integrity of a balanced trial balance and the associated values.

You may have a dataset that holds monthly values for each general ledger code as opposed to each individual transaction.  This will work equally well although you won’t have the means to dissect the Cash Flow Statement by subset records such as customer or project.

If your data is held as monthly columns, it’s worth running an UNPIVOT.  Whilst we could condition PowerPivot to work with individual values for each period, it’s more efficient to work with one value column with associated period dates held in another column.

Related Tables

I link my my FACT table to a GL_ACCOUNTS table on the GL_Account code.  The GL_Accounts table provides details relating to the GL_Code such as account name and  report heading codes.  These heading codes, in turn, link through to a further GL_Headings table which holds cash flow report heading codes.  This, in turn, links through to cash flow heading tables.  It’s these headings that we will use on our Cash Flow Statement report.

Cash Flow Statement Dataset Schema

image

You’ll notice that CF1 doesn’t link to any other tables.  I’ll come onto that shortly.

General Ledger Account Mappings

The GL_Headings mapping is the critical element to this dataset.  This provides all of the key relationships between a general ledger code and the Cash Flow Statement report heading that you want the account to be represented under.

GL_Headings Table

image

My GL_Accounts map through to my Heading_Code on this GL_Headings table.  This table holds all of my key report headings and provides a useful grouping of accounts.  For instance, all general ledger codes that relate to Sales will carry a Heading_Code of 1010.

These definitions are important.  This is a one-off setup and you’ve probably done this indirectly in some way or another to drive any reports or Excel analysis that you’re currently doing (probably with VLOOKUP).  This process simply affirms that mapping.  Ideally, you would code these heading codes into your General Ledger account code setup and formulate the code allocation as part of the process to set up new general ledger accounts.

You’ll recall that I mentioned Depreciation earlier in this post.  Whilst this is a P&L account (heading code 1045 in this example), I’m coding this to exactly the same place as Fixed Assets on the Balance Sheet (heading code 2010).  This effectively eliminates the deprecation transactions.  An alternative would be to filter out the accounts in question.  You can do this, but be careful to ensure that those accounts net to zero.

Each of these codes map through to Cash Flow Statement heading tables.  These heading tables can be whatever you need them to be, with headings and ordering assigned however you want your final report to be represented.  The one’s I’m using are shown below.

CF2 – Cash Flow Statement Heading 2

image

This is my first grouping of headings.  These align with the IAS7 headings for CF2 codes 1, 2 and 3 and basically split my Cash Flow Statement between Operations Activities, Investing Activities and Financing Activities.

There’s a fourth section called Bank.  The Bank general ledger codes map through to this code and we eliminate them for the Cash Flow element of this report.  However, we bring them back into action when we display the Bank Balances at the end of the report.

CF3 – Cash Flow Statement Heading 3

image

This is the next level of granularity on my report.  The Bank general ledger codes don’t have any CF3 mapping and yet I’m carrying Opening and Closing Bank Balance codes on this heading set.  I’ll come onto why that is a bit later.  You’ll also notice that I’ve coded these headings with an Open_Bank  and Close_Bank flag.  We’ll use dynamic DAX measures later on to utilise these.

Note – there should NOT be any mapping of accounts to the CF3 codes 10 and 11 in this table (ie Open and Close Bank Balance).  We’ll use these simply to drive a DAX measure later.

CF4 – Cash Flow Statement Heading 4

image

One element of my Cash Flow Statement will carry a further breakdown – Working Capital (although you can have more if you want).  I’ve done this simply to provide an extra level of granularity, but you could decide to replace the Working Capital heading in CF3 with these headings and map accordingly.

I’ve got a few reasons for doing this:

  1. I like to show my Working Capital elements specifically – this targets the key operational balance sheet items that generally need to be tracked
  2. This helps to line up a future post!

CF1 – Cash Flow Statement Heading 1

You might, quite rightly, question why my ordering has gone to pot and I’m explaining CF1 last!  Quite simply, this table sits out on it’s own because it simply drives report sections as opposed to being a mapped heading code.  I’m going to show a Cash Flow section and then, below that, I’m going to show the effect on the bank balance.  No relationships required.

image

And Now for the DAX – Cash Flow

I’ve got to be honest, this measure isn’t going to set the world alight.  However, this goes to show how easy PowerPivot can be, especially if you condition your data properly.  I’m creating a measure aptly called Cash_Flow

Cash_Flow

=CALCULATE(

SUM(FACT[Amount]),

CF2[CF2_Name]<>”BANK”

       )*-1

This is a simple CALCULATE function which sums the [Amount] field on the FACT table with a simple filter set to ignore any GL_Accounts that have a CF2 category mapping of “BANK”.

You’ll notice that I multiply this by –1 to reverse the signage.  In simple terms this is because of the double entry effect.  We’re using the “other” side of the transaction which means that the sign convention is going to be back to front.  This simply reverses that effect.

When I pull this measure onto my report together with the CF1,2,3,4 heading names, I get the following.

Cash Flow in Excel - PowerPivot Turns Complex into Simple

You might say that this is job done (especially if you remove CF1 heading from the report). This does give us a Cash Flow Statement. However, this doesn’t give my bank balances and, if I include my CF1 heading, my report is repeating the same data for both of the CF1_Name headings. This is understandable as CF1 is not linked to my FACT table in anyway.  This is intentional (believe it or not!).

Bank Balances

I want to show opening and closing bank balances for the Bank section of my report and this is where it gets a bit more interesting.  Let’s create a measure that calculates these balances and let’s call this measure Bank…

Bank

=IF(COUNTROWS(VALUES(CF3[CF3]))=1,

IF(VALUES(CF3[Close_Bank])=1,

CALCULATE(

SUM(FACT[Amount]),

ALL(CF3),

CF2[CF2_Name]=”BANK”,

FILTER(

ALL(Month_End_Dates[MonthEndDate]),

Month_End_Dates[MonthEndDate]<=

MAX(Month_End_Dates[MonthEndDate])

            )

          ),

IF(VALUES(CF3[Open_Bank])=1,

CALCULATE(

SUM(FACT[Amount]),

ALL(CF3),

CF2[CF2_Name]=”BANK”,

FILTER(

ALL(Month_End_Dates[MonthEndDate]),

Month_End_Dates[MonthEndDate]<=

EOMONTH(MAX(Month_End_Dates[MonthEndDate]),-1)

  )

),

BLANK()

   )

    )

      )

We should be offering prizes for those of you would can instantly spot the GFITW elements!!  Anyone would think that this had somehow been contrived!  Believe me – it hasn’t.  This is a concept that I use regularly and it’s a joy to see Rob explain it so well so I don’t have to!

So what is this doing?  I’ll explain as clearly as I can, but it’s worth paying very close attention to Rob’s Greatest Formula In The World posts.  I won’t re-iterate too much here.

  1. The first IF statement simply serves to provide an answer as to where I want the measure to be evaluated.  I want an answer when my CF3_Heading is flagged as Open_Bank or Close_Bank.
  2. The second IF statement creates a value when my CF3_Heading [Close_Bank] is flagged.  Where this is the case, the following happens:
    1. Sum the [Amount] Field
    2. Expand my dataset to all CF3 headings in order to release the dependency on the current CF3 heading being evaluated – ALL(CF3)
    3. Apply a filter to include all of my Bank general ledger codes – CF2[CF3_Name]=”BANK”
    4. Apply a further filter (FILTER) that:
      1. Removes the dependency on the MonthEndDate – ALL(Month_End_Dates[MonthEndDate])
      2. Apply a filter that includes all months that pre-date and include the reporting month – Month_End_Dates[MonthEndDate]<=MAX(Month_End_Dates[MonthEndDate]) – this is important because the transaction base for the month won’t give us the balance.  We need to include ALL transactions that pre-date and include the reporting date.
  3. The third IF statement has a similar effect to the second IF statement, but it produces an opening balance and therefore the current reporting month has to be ignored.  As a result, I want to sum all transactions where the date is less than or equal to my prior month date – Month_End_Dates[MonthEndDate]<=EOMONTH(MAX(Month_End_Dates[MonthEndDate]) ,-1)

Cash Flow in Excel with Bank Balances Thanks to PowerPivot

Once again, we have the correct results but we’re getting values repeated for both CF1 headings.

Sorting Those CF1 Headings

We now have our two base measures.  All we now need to do is evaluate when to use each one.  Let’s create another measure called Cash_Flow_Statement.

Cash_Flow_Statement

=IF(COUNTROWS(VALUES(CF1[CF1]))=1,

IF(VALUES(CF1[CF1])=1,

[Cash_Flow],

[Bank]

   )

)

This measure simply evaluates for CF1 headings and determines that if my CF1 Heading Code is 1 (ie Cash Flow) then use my [Cash_Flow] measure.  Otherwise, use my [BANK] measure.

Cash Flow Finished

A bit of formatting and there you have it.  You’re just about done.  Or rather, you’re done if you think that’s enough.  I, however, don’t like some of these totals and duplication of values in certain areas.  There’s a way to deal with those totals but that’s a topic for a future post.

So what else can we do with this?

Just as a taster, here’s something that I like to call a Cash Flow Bridge.  This could be a topic for a future post, but just to whet your appetite……

Cash Flow Statement Bridge

If there’s enough call for this, I’ll get this scheduled in for a future post.  Neat eh!


The Greatest Formula in the World, Part 3

December 13, 2011

 
Our time machine that we developed in part one and part two still needs some explanation, so let’s dive in.

First Things First:  The MAXX Wasn’t Needed

Looking back, I noticed that I was overcomplicating things needlessly.  The final formula I had involved the following FILTER() term, in which I used a MAXX:

FILTER(ALL(Periods),
       Periods[NextYear Period]<=
          MAXX(VALUES(Sales[Period Num]), Sales[Period Num])
)

Turns out that MAX() works just fine, so let’s replace that clause and simplify things a bit.  Here’s the new formula:

CALCULATE([Total Sales],
   ALL(Periods),
   FILTER(ALL(Periods),
      Periods[Year]=MAX(Periods[Year])-1
   ),
   FILTER(ALL(Periods),
          Periods[NextYear Period]<=MAX(Sales[Period Num])
   ),       
   VALUES(Periods[MerchPeriod])
)

OK, with that, we can move on to explanation:  How does this formula work??

I used to call this technique “expand then filter”

Well actually I still do, in my own head.  It’s just that the GFITW is a catchier title.

OK, so the “expand” part is just that first ALL():

CALCULATE([Total Sales],
          ALL(Periods),

The first step in the formula, then, is basically just telling the calc engine to throw away all filters on the Periods table.  In other words, “forget all concept of time, pretend the pivot is not filtered at all with respect to time.”

We do this so that we have a clean slate.  Then, in the subsequent FILTER clauses, we build up a new filter context for time.

One more time for clarity:  the way this formula works is to first throw out all time filters, and then in subsequent steps, we build up new filters to match the time period that we want, which in this case is last year.

Once you understand that, this overall formula starts to get pretty simple.  Each piece of the formula is quite straightforward in its own right.  ALL() is pretty straightforward for sure, and so are the subsequent FILTERS().

OK, we’ve expanded.  Now on to the filters!

The first filter says, “hey, now that we’ve thrown out all time filters, let’s filter time back down to just be last year.”

FILTER(ALL(Periods),
       Periods[Year]=MAX(Periods[Year])-1
)

Let’s talk about the FILTER() function itself for a moment.

How does FILTER() Work?

Honestly this function has deserved its own post for a long time.  I’ll give a brief explanation here.

The syntax for the FILTER function is FILTER(TableToFilter, FilterExpression).   Pretty simple.  Here’s some more detail:

  1. FILTER() takes a TableToFilter and a FilterExpression, and returns all rows from that TableToFilter that match the FilterExpression.
    1. In the example above, TableToFilter is ALL(Periods)
    2. and FilterExpression is Periods[Year]=MAX(Periods[Year])-1
  2. FILTER() steps through the TableToFilter one row at a time. 
    1. And for each row, it evaluates the FilterExpression.  If the expression evaluates to true, the row is “kept.”  If not, it is filtered out.
    2. Because FILTER() goes one row at a time, it can be quite slow if you use it against a large table.  When I say “large” that is of course subjective.  A few thousand rows is fine in my experience.  A million is not.  Do not use FILTER() against your fact table.
  3. The FilterExpression typically takes the form of Table[Column] = <expression>
    1. The comparison operator doesn’t have to be “=”.  It can also be <, >, <=, >=, <>
    2. The expression on the right hand side of FilterExpression can be “rich.”  This is VERY useful.  In a simple CALCULATE, the right side of each filter expression has to be simple, like a literal number (9) or a string (“Standard”).  The fact that FILTER() allows for rich expressions here is one of the most common reasons I use FILTER().
    3. The Table[Column] in the filter expression is a column in the TableToFilter.  If you are filtering the Periods table, it makes sense that you are testing some property of each row in Periods.  I can’t think of a sensible reason to use a column here that is NOT from TableToFilter.  (Insert “boot signal” here, maybe the Italians can address this).
  4. FILTER() ignores everything else going on in your formula and acts completely on its own.
    1. For example, our overall formula sets ALL(Periods) as the first argument to CALCULATE.
    2. The FILTER()’s that come after that do NOT pay any attention to other arguments however, including that ALL(Periods).
    3. In other words, the FILTER() functions are still operating against the original filter context from the pivot!  If the pivot is sliced to Year=2009, then the FILTER() function starts with the Periods table already pre-filtered to just 2009.
    4. This is why each of my FILTER()’s uses ALL(Periods) for TableToFilter.  I have to repeat the “expand” step so that my FILTER() is also working from a clean slate.
  5. Even though each FILTER() operates on its own, their results then “stack up” in the overall formula.
    1. Even though FILTER() RETURNS a set of rows that matched the FilterExpression, it actually REMOVES rows from the overall filter context.
    2. This sounds tricky but really it isn’t.
    3. Let’s say our TableToFilter contains 6 rows:  A, B, C, D, E, and F.
    4. And our overall formula contains two FILTER() clauses that both operate on the same TableToFilter, just like our overall formula near the beginning of this post.
    5. Let’s also say that the first FILTER() returns rows A, B, C, and D.
    6. And the second FILTER() returns rows C, D, E, and F.
    7. The net result is that only rows C and D are left “alive” in the overall filter context of the formula.
    8. So one way to think of this is that FILTER()s “stack up” on top of each other.
    9. Another way to think of it is that even though the first filter RETURNED rows A, B, C, and D, its real effect was to REMOVE all other rows (E and F) from consideration.

OK, back to that first filter!

Here it is again:

FILTER(ALL(Periods),
       Periods[Year]=MAX(Periods[Year])-1
)

Let’s revisit points 1-5 above for this FILTER expression to see how it all works.  And let’s examine just a single cell of the pivot to see how this FILTER operates for that one cell:

image

Focusing on The Orange-Circled LASTYRSALES Cell As An Example
With Its Period Filter Context Highlighted in Green

In that picture above, the orange cell we are looking at has a filter context “coming in” from the pivot.  It has Period[Year] set to 2011 and Period[MerchPeriod] set to 1, as highlighted in green.

Given the detailed description of FILTER() from points 1-5 above, we can see that:

  1. We set ALL(Periods) as our TableToFilter so that we are starting from a clean slate with respect to time.  So our Periods table now has “all rows alive.”
  2. Then our FilterExpression tests against the Periods[Year] column.
  3. MAX(Periods[Year]) – 1 still operates independently!, so it still picks up Periods[Year]=2011 from the pivot.  Therefore it returns 2011 – 1 = 2010!
  4. Since we started with ALL(Periods) as the TableToFilter, and the FilterExpression only “keeps” rows where Year=2010, we are left with all 2010 rows “alive” after evaluating this FILTER().
  5. If we didn’t do ALL(Periods) for TableToFilter, and instead just used Periods without the ALL(), our FILTER would start out with only rows from 2011 (since that is what the pivot is telling us).
    1. And then in the next step when we go back a year to 2010, FILTER() would find no rows.  There are no rows that match Periods[Year] = 2010 and Periods[Year]=2011. 
    2. So our FILTER would return no rows, which means it would have the effect of REMOVING all rows from Periods in the overall formula, and our measure would return blank for all cells in the pivot.

That’s a lot of explanation, I know.  Walk through it a few times.  It’s actually pretty intuitive once you’ve done it a few times.  The tricky part, for me, was discovering all of these details for myself.  And since I’ve done all of that, you don’t have to.

That’s enough for this time.  I think you can probably figure out how the second FILTER() evaluates based on the above, but I will step through it next time. 

I will also explain why we use that last VALUES() in the formula, and probably also share some of the answers I got from the Italians, and from David Churchward, in response to my question “did I need to add that calc column in the Periods table?”


The Greatest Formula in the World, Part Two

December 7, 2011

image

Our Time Machine From Part 1 Needs a Couple of Fixes

OK, if you recall from part one, we had a “year over year” time machine calculation going on, and it was built against a data set that lacked a “real” calendar table.  Very cool.  Very resourceful.  And very necessary.

But the time machine had two problems.  Let’s fix them.

Problem #1:  Meaningless Grand Total

Let’s start with the easy one:

Year over year in PowerPivot - Meaningless Grand Total

Grand Total is Meaningless for Last Year Sales

It is meaningless to have a “grand total” value for a measure that returns “last year’s sales.”  What year would that BE, actually?  It’s nonsense.  So we use an old trick, one of the many flavors of IF(VALUES()):

=IF(COUNTROWS(VALUES(Periods[Year]))>1,BLANK(),
   CALCULATE([Total Sales],
      ALL(Periods),
      FILTER(ALL(Periods), Periods[Year]=MAX(Periods[Year])-1),
      VALUES(Periods[MerchPeriod])
   )
)

Where the new IF “wrapper” is highlighted and the original formula is in normal font.

For more on that “IF VALUES” technique, please see this post.

The results are as desired, the grand total is now blank:

image

Problem solved.  Moving on…

Problem #2:  2011 isn’t complete but we’re getting all of 2010 sales

Remember, our sales data for 2011 only goes through the first 6 months.  So we do NOT want “last year sales” for months 7-12:

image

That can be solved a number of ways.  The simplest is just to add another IF(), and have the Last Year Sales measure return BLANK() whenever there are no Sales:

=IF(COUNTROWS(VALUES(Periods[Year]))>1,BLANK(),
   IF([Total Sales] = BLANK(), BLANK(),
      CALCULATE([Total Sales],
         ALL(Periods),
         FILTER(ALL(Periods),
            Periods[Year]=MAX(Periods[Year])-1
         ),
         VALUES(Periods[MerchPeriod])
      )
  )
)

That trims the results to just the months desired:

image

But note that the subtotal for 2011 is still too high:

image

“Last Year Sales” for 2011 Should Only be Returning
2010 Sales Through the First Six Months

This one is trickier.  I struggled to find a good answer before coming up with the following:

Step 1:  Add a “Next Year Period Num” Column to the Periods Table

Recall that my Sales table has a “Period Num” column that is the basis for the relationship with the Periods table:

image

And that matches up with a similar column in the Periods table:

image

Note that Period Num does NOT reset to 1 with each new year (unlike MerchPeriod).  So that is the absolute unique ID for a given Month/Year combo.

Well, I created a new column in the Periods table that tells me, for a given Period Num, what the equivalent Period Num will be NEXT year:

image

Now, in my FILTER statements, I can choose to match on THAT column instead.

Bringing it Home

OK, now that I have that column, I can add a new FILTER clause to my measure:

=IF(COUNTROWS(VALUES(Periods[Year]))>1,BLANK(),
   CALCULATE([Total Sales],
      ALL(Periods),
      FILTER(ALL(Periods),
         Periods[Year]=MAX(Periods[Year])-1
      ),
      FILTER(ALL(Periods),
         Periods[NextYear Period]<=
            MAXX(
                VALUES(PeriodSales[Period Num]),
                PeriodSales[Period Num]
                )
      ),
       
      VALUES(Periods[MerchPeriod])
   )
)

OK, what does that do?  Simply put, it further filters the Periods table to NOT go beyond any periods for which we currently have records in the Sales table.

And the results:

Last Year Sales Measure in PowerPivot With a Custom Calendar

Last Year Sales Measure Doing the Right Thing

And now if you want YOY Growth, it’s straightforward:

=([Total Sales] – [LY Sales Finished]) / [Total Sales]

Which yields:

Year over Year / Year on Year Growth Percentage Measure in PowerPivot With a Custom Calendar

Year over Year / Year on Year Growth Percentage
Measure in PowerPivot With a Custom Calendar

OK, I cheated and wrapped another IF(COUNTROWS(VALUES)) around the outside of the formula so that the grand total cell is blank again, because again, that’s a meaningless value to report.  Details.

Isn’t that a lot of work?

Depends on how you look at it.  First of all, it’s a lot harder to grasp the first time than it is as you get used to it, trust me.

But even better, this investment pays off forever.  Normal Excel formulas may be easier to write the first time, but then you have to re-write and adjust them forever, every time your data changes.  And every time your desired report shape changes.

This one is a portable formula.  It goes wherever you want it to go, and eats whatever data you feed it.  Forever.  It’s worth a little extra effort.

OK but do I really have to understand all of that?

No, not really.  Not right away.  Think of it this way:  this is what I do all day, every day.  And I can tell you that figuring this out is not something you want to do in your spare time.

But once someone gives you the pattern?  Wash, rinse, repeat.  Make small adjustments as needed.

Honestly, that whole formula above could be wrapped up in a function provided by Microsoft, and all of the nasty stuff hidden.  I told them as much last time I visited Redmond.  And if it were a function rather than a formula, we wouldn’t care at all how it worked.

So… treat this as a pattern.  Copy/paste and modify to fit your needs.  Seriously.

Unfinished Business

I actually WILL retrace my steps here though and explain a number of things though, so that you CAN understand.  I have a couple of posts in the queue aimed at just that.

Were there alternate ways to write this measure?

Yes, there were, especially in that last step where I needed to get the “Last year sales” measure correct for the 2011 subtotal.  I could have used SUMX to make the year subtotal equal to the sum of its underlying months.  That is worthy of a post for sure.

But I was wondering if there was still another way.  A way that doesn’t use SUMX and doesn’t require a new calc column.

And for that, we need the Italians.  So I’m gonna try out my new toy, the Boot Signal:

image

Calling Marco and Alberto

Get it?  *Boot* Signal?