Profit & Loss (Part 3)–Return On Sales and Variances

October 27, 2011

 

P&L Graphic

Guest Post by David Churchward

PART 1 and PART 2 of this series on Profit and Loss posts covered the basic layout of the P&L together with some time intelligence and filtering to display relevant numbers to cover actual, budget and prior year for both a selected period and the equivalent year to date.  This was all based around the core measure referred to as Cascade_Value_All.

In this post, I’ll go on to refine this report further by adding Return On Sales (ROS) percentages and variance calculations as well as tune up the appearance of the report.

Return on Sales (ROS)

Return on Sales (or ROS for short) is the percentage that a number on our report represents as a function of the equivalent sales value.  Most people will be familiar with the term Gross Margin Percentage and this is one type of ROS.  Gross Margin percentage is a function of gross margin value divided by sales value.  This is often dissected by individual product lines as the gross margin values are analysed by the same product categories as sales.  Operating Profit Percentage is another ROS measurement but this is not normally sub analysed (although it can be if your dataset lends itself to that approach).

You’ve probably deduced that the financial calculation is therefore:

ROS %= Analysed Value / Equivalent Sales Value * 100

We’ve already got the value to be analysed in our model.  This is the Cascade_Value set of measures.  Therefore, we simply need to create the Equivalent Sales Value measure.  I’ve called this measure Sales_Compare.

Calculating the Equivalent Sales Value – Sales Compare

In PART 2, we ended up with 6 measures (Cascade_Month_Actual, Cascade_Month_Budget, Cascade_Month_Actual_PY and the equivalents for YTD).  We can use these measures as the basis for for a new set of Sales_Compare measures.  I’ll run this example on the Cascade_Month_Actual measure.  The same methodology applies for the other Sales Compare measures that we’ll use for each of our final report headings.

My Cascade_Month_Actual carefully sums the underlying dataset for each of my report headings (Report Heading 1 contains my main groups and Report Heading 2 contains my sub groups).  I need to maintain my Report Heading 2 categorisation but remove the link from my Report Heading 1 categorisation.  This is done using an ALL() function within a CALCULATE function.

SalesComp_Month_Actual_Interim

=CALCULATE(

[Cascade_Month_Actual],

ALL(DIM_Heading1),

DIM_Heading1[Heading1_Name]=”Sales”

       )

You’ll notice that I use my Cascade_Month_Actual measure, remove any association that it may have with Heading 1 and then direct it to filter the measure based on Heading 1 being equal to “Sales”.  This way, I essentially repeat my sales value in each section of my P&L report.

image

It should be noted that this is an interim calculation.  It doesn’t necessarily make sense as it stands in the report above, but it will prove useful in further calculations.  Having said that, I would prefer to only see the values in the sections where I want to display ROS percentages.  In order to do this, I need to tell my dataset where I want to see ROS percentages displayed.  To do this, I go to my DIM_Heading1 table and create a field where I can mark the headings where I want ROS calculations to be performed.  I’ve called this field Heading1_PCT and mark the headings where I want to drive this calculation with a 1.

image

I can now reference the Heading1_PCT field to determine when the Sales Compare measure should return a result.  We’ll call this measure SalesComp_Month_Actual.

SalesComp_Month_Actual

= IF(

COUNTROWS(VALUES(DIM_Heading1[Heading1_PCT]))=1,

IF(VALUES(DIM_Heading1[Heading1_PCT])=1,

CALCULATE(

[Cascade_Month_Actual],

ALL(DIM_Heading1),

DIM_Heading1[Heading1_Name]=”Sales”

       ),

           BLANK()

         ),

    BLANK()

    )

The calculate function is exactly as per our previous measure but I’m only running the calculate function when Heading1_PCT is equal to 1.  I can only conduct this evaluation when I only have 1 value for Heading1_PCT so I have to use COUNTROWS and VALUES to determine this.

image

Calculating ROS

We now have the two key measures to derive our ROS being Cascade_Month_Actual and SalesComp_Month_Actual.  I could simply divide one by the other surely.  Essentially, that is correct, but as explained at length in Profit & Loss – The Art of Cascading Subtotals, we need the measure to behave slightly differently in different sections of the report.  You’ll notice above that I have SalesComp_Month_Actual detail values for Operating Profit and Gross Margin.  I only want my ROS calculation to evaluate at the total level for Operating Profit but at the detail level for Gross Margin.  I can therefore use the evaluation that I used in my Cascade_Value_All measure and join it up with my simple division.

ROS_Month_Actual

= IF(

MAX(DIM_Heading1[Heading1_Summary]) = 1

&&(MAX(DIM_Heading1[Heading1_Show_Detail]) = 1

||COUNTROWS(VALUES(DIM_Heading2[Heading2_Name])) > 1

          ),

[Cascade_Month_Actual]/[SalesComp_Month_Actual],

    blank()

     )

My logical test is exactly the same as the one we saw in Cascade_Value_All in Profit & Loss – The Art of Cascading Subtotals.  After that, it’s a simply division.

image

This same method is used for each associated column measure that we wish to calculate an ROS for.  If I also remove my interim measures and apply some formatting, I get the layout below for month actual, month budget and month prior year.  I should take this opportunity to apologise for not having been very inventive when I made this dataset up having used a flat set of percentages!

image

Variances

To complete the picture, we need to add some variance calculations.  Because of the way that we’ve built up our measures, this is now very simple.  Our actual versus budget variance is

Variance_Month_AvB

= [Cascade_Month_Actual] – [Cascade_Month_Budget]

I’m not going to insult your intelligence by writing out the equivalent for actual versus prior year.

Tidy Up

My report is going to display data for a selected month and the year to date position for that month.  I prefer to split the two sections with a blank column.  I don’t think we’re going to register on Rob’s spicy scale for this one!

Blank = BLANK()

Headings and CUBEVALUE

To give my measure names meaning and to allow for their careful use, they’ve become somewhat ugly.  You can of course provide a title for the column that is a lot more user friendly.  However, I prefer to write my own because I can make it look tidier and, more importantly, I can reference the period that has been selected on my slicer.

I use an Excel formula in my heading that uses a CUBEVALUE function to call the period number that I’ve selected.  You could also do the same with year selections.

=”Period – “&CUBEVALUE(“PowerPivot Data”,”[Measures].[Maximum of Period]“,Slicer_Period)

This appends the period number onto the text “Period – “ by referencing the MAX value for Period when brought into the context of my slicer called Slicer_Period.  I do the same for YTD, add a few more headings to describe the type of data and then hide my pivot table headings!

When It All Comes Together

When you put all of these elements together (PART1, PART2, Sales Compare, ROS, Variances, Blank and Headings) you get something like this….

image

You can DOWNLOAD THE WORKBOOK HERE


Profit & Loss (Part 2)–Compare and Analyse

October 20, 2011

PandL Full Screenshot

Guest post by David Churchward

In my recent post, Profit & Loss-The Art of the Cascading Subtotals, I went through a basic P&L layout with some relatively complex DAX measures to display and hide row headings as appropriate together with calculating accurate values.  In order to make this report more meaningful, it needs comparatives and further analysis.  In this post, I’ll build on the P&L created in part 1 to create some of the key elements of the layout shown above including Actual values, Budget values and Prior Year values for a selected period and the associated year to date (YTD).  In part 3, I’ll go on to show how the percentage calculations work and maybe some pointers for making it look REALLY good!

Time Intelligence

As our report is considering different timeframes, we need to establish a time dimension and condition slicers to select a required timeframe.  From this, we can determine the required time parameters for use in subsequent measures.

We need to establish 4 tables that we will use for time intelligence.  These are fully explained in my recent post Slicers for Selecting Last “X” Periods.

Dates – this is a list of sequential dates covering the timespan of our dataset.  This is linked to a date field in the dataset.

Year – this is a sequential list of years covering the timespan of the dataset.  This table should NOT be linked to the core dataset (or fact table if you prefer).

Period – this is a sequential list of periods, normally from 1 to 12 where a period is a calendar or fiscal month.  Again, this table should NOT be linked to the core dataset.

Year_Period – this is a list of year and period combinations covering the timeframe of the dataset.  This table also carries other relevant dates and attributes that relate to the date records contained within.

Year_Period is linked to both Year and Period on a many to one basis.

Creating Time Parameters

I create time parameter measures on the Year_Period table.  We need the following time parameters for use in our measures:

Selected Month End Date – this tells us the date relating to the end of the month for the selected period and year combination selected by the user on slicers.  Month end dates don’t change and so I’ve been able to hold this as a field in the Year_Period table which means that I just need to capture the associated value and deal with the fact that multiple selections could be made (or no selection at all).  This is done using:

Selected_Month_End_Date = LASTDATE(Year_Period[Month_End_Date])

It should be noted that I’m always using the LASTDATE function in these measures.  This is to ensure that I always evaluate to one result.

Selected Month Start Date – again, this is available in the Year_Period table as the value doesn’t change for any selected date

Selected_Month_Start_Date = LASTDATE(Year_Period[Month_Start_Date])

Selected Prior Year Month End Date – once again, we could make this available on the Year_Period table as the value won’t change and, for efficiency reasons, I would tend to do that.  However, for the benefit of showing an additional method, I’ve chosen to use the DATEADD function here.

Selected_PY_Month_End_Date = LASTDATE(DATEADD(Year_Period[Month_End_Date],-1,YEAR))

DATEADD requires the syntax DATEADD(dates, number of intervals, interval).  The dates element is a table / column expression which details which column to use in the evaluation.  I use –1 as an interval to go back in time (essentially turning DATEADD into DATEMINUS which of course doesn’t exist as a function!).  The interval is the timeframe type by which you wish to adjust and this can be DAY, MONTH or YEAR as my date field doesn’t contain any time elements.

Selected Prior Year Month Start Date – let’s not let this drag out and get boring!  Measure below:

Selected_PY_Month_Start_Date = LASTDATE(DATEADD(Year_Period[Month_Start_Date],-1,YEAR))

Selected Year Start Date – DATEADD is used again here but we need to know how many months to go back.  As the user is selecting a period and year combination, we can pick up the period that is being evaluated from the slicer and use this to work back to the first date in the year.

Selected_Year_Start_Date = LASTDATE(DATEADD(Year_Period[Next_Month_Start_Date],MAX(Year_Period[Fiscal_Period])*-1,MONTH))

As I’m using Fiscal Periods (in the main I use July as Period 1), I need to pick up the selected Fiscal Period (ensuring that I evaluate to one answer – hence the MAX function) and then multiply by –1 to work backwards through time as opposed to forwards.

Selected Prior Year Start Date – this is getting boring Churchy – move on:

Selected_PY_Year_Start_Date = LASTDATE(DATEADD(DATEADD(Year_Period[Next_Month_Start_Date],MAX(Year_Period[Fiscal_Period])*-1,MONTH),-1,YEAR))

Nothing tricky here.  I just take one year away from Selected_Year_Start_Date.  I’ve used a nested DATEADD to show how it’s done as opposed to using the answer from Selected_Year_Start_Date.  There’s multiple ways of doing these things!

With that done, I think we’re finished with date parameters.  Let’s get on with putting them to good use.

Applying Time Intelligence to my Cascade_Value_All Measure

You may recall in Profit & Loss-The Art of the Cascading Subtotals that the key outcome was a measure called Cascade_Value_All.  We can now start dissecting this measure by overlaying time intelligence and filters to capture the values that we need on our report and this is all done safe in the knowledge that Cascade_Value_All will manfully ensure that report headings behave appropriately.

I’ll construct these measures in sections to make them clear.  You can ultimately combine these into fewer measures if required.  Let’s start by overlaying time intelligence.  This is done using the DATESBETWEEN function.  I tend to use this regularly as I’ve never encountered a situation where I’m working with calendar years!

We need 4 measures – current month, current month last year, current month YTD, current month YTD last year.

Cascade_Month =

CALCULATE(

[Cascade_Value_All],

DATESBETWEEN(

Dates[Date],

Year_Period[Selected_Month_Start_Date],

Year_Period[Selected_Month_End_Date]

         )

  )

In this measure, we’re essentially filtering the outcome of Cascade_Value_All down to the underlying transactions that fit between the start and end dates provided to the measure.  This means that our other 3 required measures that represent current month last year, current month YTD and current month YTD last year are exactly the same although the start_date and end_date elements highlighted in bold are substituted with the relevant date parameters calculated in the Creating Time Parameters section above.  Let’s call these three measures Cascade_Month_PY, Cascade_YTD and Cascade_YTD_PY.  Don’t worry as all measures will be available in the workbook that I’ll make available with Part 3.

Filter Actual and Budget

In our P&L report, we have two types of data being Actual and Budget.  These are all records in the main fact table dataset and each record is denoted with a Data_Type field (value of 1 denoting Actual and a value of 2 denoting Budget).  This field is linked to a DIM_DataType table.

image

To filter the dataset for the current month measure (I won’t go through them all as I’m sure you’ll get the idea), I use the following measure:

Cascade_Month_Actual =

CALCULATE(

[Cascade_Month],

DIM_DataType[Data_Type_Name]=”Actual

      )

This will give us the correct values for the selected month representing the “Actual” dataset.  We can create an equivalent for “Budget” simply by changing the filter value.

I create a “Prior Year” version of Cascade_Month_Actual by substituting [Cascade_Month] with [Cascade_Month_PY], ensuring that our filter is set to “Actual”.

Why Filter “Actual” and “Budget” instead of simply adding them to the Column Headings in the Pivot Table? – Use of Static Columns

I could avoid multiple measures by simply using my [Cascade_Month] measure (together with the other time adjusted measures) and adding Data_Type as a column heading.  There’s a few reasons why I wouldn’t do this in this particular case.  That’s not to say that it isn’t a valid approach in the majority of other examples.  My reasons are:

  1. I want to ensure that my columns remain static.  If, for whatever reason, I evaluate to not having a budget for a particular slice of the data, I don’t want the column to disappear.
  2. OK, so number 1 might be a bit weak!  I’m also aware that the budget for Prior Year probably isn’t relevant so I don’t want to see it.  My prior year comparison is being used to evaluate my current year actual.  I don’t want prior year budget getting in the way so I only want to see actual for my prior year dataset.
  3. I want to add a blank column between my “Month” information and my “YTD” information.  Again, a bit weak, but buy-in from users is partly about how pretty is looks and I think it looks prettier this way.
  4. If I use a field on my column headings, I could get multiple rows dedicated to my column titles.  As a taster to what’s coming up, I don’t want to use the Pivot Table headings because they’re ugly on this report.  I’m going to create my own damn it!  If I use static columns, I can guarantee only one heading row which means I can hide it and create my own prettier version!
  5. This is probably the most compelling reason!  It’s quite normal for Budgets to be superseded by Revised Forecasts.  With static columns, I can ultimately provide a slicer that allows the user to select whether they want to see budget or revised forecast information against actuals.

Note – there’s always a balance to be considered when making an assessment like this.  On the one hand, this may “look pretty” and avoid displaying information I don’t want but if performance suffers than you’re onto a rough deal!  With these measures, I haven’t had a performance issue with quite large datasets so I don’t have to compromise appearance just yet but there’s no doubt that it’s less optimised than constructing one measure and adding a field to column headings.

When We Pull This All Together……

I add the six new measures to the P&L report and attach slicers for Period [Period] and Year [Year], tidy up a bit and produce the following:

PandL Post 2 Screenshot

Look out for Part 3 when we’ll add ROS (Return on Sales) percentages and variance calculations whilst also having a tidy up of the layout.


Slicers For Selecting Last "X" Periods

October 11, 2011

 

Guest post by David Churchward

Using Time Intelligence in PowerPivot can appear scary when you first start using it and I’ve seen some weird and wonderful ways of attacking it, some that look like we’re calling into question the validity of the global phenomenon that is time and others that are quite simply brilliant.

In this post, I hope to provide a simple and understandable approach whilst adding another flexible dimension called “X” where “X” is a variable that can be set by a user in a slicer.  This might be, for example, a P&L report where the user wants to see the last 4 months instead of the normal pre-determined timeframes such as year to date or quarters.  Alternatively, the report may be reviewing trends and, again, year to date or one of the other standard timeframes may not be the most relevant timeframe.

Incidentally – if you’re in the mood for conquering time intelligence there’s a host of brilliant information on this site including posts such as Running Totals Without a Traditional Calendar Table and PowerPivot time intelligent functions: why use ALL() and how to work around it which give some excellent direction on Time Intelligence and some of the complexities.

In this post, I’ll explain how you can put all of the control in the hands of the user by creating an X Periods measure. That is to say that the user can review the last X periods up to a date that they control.

Paving the Way

The crux of dealing with time intelligence in this way is to create a time table that links to your dataset and then a further time selection set of tables from which the user can select their required time parameters.

I’ve created so many of these where the fiscal year isn’t a conventional calendar year that I’ve found that it’s best to deal with the fact that some of the built-in time intelligence PowerPivot functions may not be valid. However, with last X periods, it doesn’t matter anyway.

I need the following tables in my dataset:

FACT_Tran- this is my fact table dataset that I wish to analyse.

clip_image002

Dates- this is an unbroken list of dates representing the timeframe of my dataset. Time Intelligence works better when your fact table is linked to a dates table, chronologically ordered without any breaks.

clip_image002

Year- a list of years in the dataset

clip_image003

Period - a list of periods in the dataset. This would normally be 1 to 12

clip_image004

Year_Period- this is more aligned to a proper time dimension in that it is a combination of the years and periods representing my dataset and carries additional information such as month end dates that we need for our measures (this version has been simplified for the purposes of this post)

clip_image005

X- this is a single column table that allows users to select the number of periods that they wish to see. This could be a numeric sequential number or specific timeframes that are most likely to be selected as in my example here.

clip_image006

Note – I’m using a predetermined set of values for X which contradicts my opening remarks but this is purely for the purposes of showing this example and it really depends on the situation!

Links

clip_image007

You’ll notice that my Year_Period table is linked to my Year and Period tables. My fact table is separately linked to my Dates table. For this method of Time Intelligence to work, you shouldn’t link the two sets of tables together otherwise filtering will occur when we don’t want it to.  You can get around this, but there’s no need to link the two sets of tables anyway.

Determining what has been Selected

Users will be given slicers to select the year and period that they are working on. In terms of X periods, we would consider this to drive the end date of the period to which X relates.  We then work back X periods from that date.

I need to create a series of measures to determine what the user has selected. These are as follows:

Selected_Month_End_Date = LASTDATE(Year_Period[Month_End_Date])

Selected_X = MAX(X[X])

Selected_X_Months_Start_Date = LASTDATE(DATEADD(Year_Period[Next_Month_Start_Date],[Selected_X]*-1,MONTH))

clip_image008

As I’m using Fiscal Periods and my financial year starts in July, period 3 for the year ending 2010 (as selected above) is actually September 2009. I’ve selected 9 as a value for X which means that I’m looking back 9 months from September 2009. My start date is therefore 1st Jan 2009.

Building These Dates into my Financial Measure

Now that we know what the user has selected in terms of X and the date range to which X relates, our financial measure is now very simple.

X_Periods_Value

=CALCULATE

(

SUM(FACT_Tran[Value]),

DATESBETWEEN(Dates[Date],[Selected_X_Months_Start_Date],[Selected_Month_End_Date])

)

We now have a dynamic measure which calculates the start and end dates that we wish to see and applies those dates to our value column.

This can be massaged to calculate year to date, prior year and many other date values as required and it is totally dynamic without any back end calculations.

Taking Things to the Next Level

The method above is what I use regularly to allow for year, period and X selectors to drive the analysis.  This works a treat if your primary reporting timeframe is a month.  However, what if days or weeks are a relevant timeframe?

Let’s adapt what we’ve already done by creating a new dates table.  Let’s call this table Dates2.  This is a sequential date column similar (if not identical) to Dates but NOT linked to our fact table.  Create 3 new measures on this table which are variations of the Selected_X_Months_Start_Date measure that we created previously.

Selected_X_Days_Start_Date

= DATEADD(LASTDATE(DATEADD(Dates2[Date],[Selected_X]*-1,DAY)),1,DAY)

 

Selected_X_Weeks_Start_Date

= DATEADD(LASTDATE(DATEADD(Dates2[Date],[Selected_X]*-7,DAY)),1,DAY)

 

Selected_X_Months_Start_Date

= DATEADD(LASTDATE(DATEADD(Dates2[Date],[Selected_X]*-1,MONTH)),1,DAY)

So why do we need two DATEADD functions in each of these measures? Simply put, we’re essentially using dates here. That’s a simple statement that doesn’t answer anything! What I mean is that we’re essentially not using datetime fields. Each date is therefore a value in it’s own right. Therefore, if we treat 30th September as a day only and therefore a value of 30, when we subtract one day, we would get 29. As a result, we would be reviewing 29th and 30th when in reality we only want to review one day. Since 30th is one day in it’s own right, we always have to add one day to each answer.

Note – take care to ensure that your fact table dates are in the same format.  Time can skew the answer if you’re not careful.

In the previous solution using year and period slicers to select months above, this problem was dealt with by using the Next_Month_Start_Date value which is one day ahead of our month end date selected and held as a value in our Year_Period table.

Our weeks solution is a derivative of the days function where we simply adjust by days * 7 to get to weeks.

I now create another selection table called Time_Type.  This is designed to select which timeframe type the user is selecting.

image

I create a slicer on my pivot table using the Type_Name and create a measure which determines which item has been selected by the user.

Selected_TimeType

= MAX(TimeType[Type_Code])

I then create another measure which uses the selected time type to determine which of the 3 measures to use:

Selected_X_TimeType_Start_Date

= IF([Selected_TimeType]=1,[Selected_X_Days_Start_Date],

IF([Selected_TimeType]=2,[Selected_X_Weeks_Start_Date],

IF([Selected_TimeType]=3,[Selected_X_Months_Start_Date],

   BLANK()

  )

   )

     )

 

image

We need one further measure to determine which date has been selected

Selected_Date2

= LASTDATE(Dates2[Date])

These new measures can now be used in our value measure:

X_Periods_Value

=CALCULATE

(

SUM(FACT_Tran[Value]),

DATESBETWEEN(Dates[Date],[Selected_X_TimeType_Start_Date],[Selected_Date2])

)

What if I Always Want my Report to be X Periods to Today’s Date?

This is simple.  We simply need to substitute Selected_Date2 for a new measure called Todays_Date in our X_Periods_Value measure and substitute Dates2[Date] with Todays_Date in Selected_X_Days_Start_Date and Selected_X_Weeks_Start_Date and Selected_X_Months_Start_Date measures

Todays_Date

= TODAY()

You do however need to take care about what your data content is.  As an example, I use a data warehouse that is built every night.  When I refer to today, I actually mean last night which means that my measure should in fact be

Todays_Date_LastNight

= TODAY() -1

What Does This Mean?

By using this method, we have passed the requirement to determine the timeframe to which a trend or report relates across to the user.  Trend timeframes could change and do we really want to have to recreate reports and dashboards as a result?  I hope that this solution solves that problem.

Having learnt from my previous post, YOU CAN DOWNLOAD THE WORKBOOK HERE.


Profit & Loss–The Art of Cascading Subtotals Workbook

September 15, 2011

For those of you who want to look into “The Art of Cascading Subtotals” further, you’ll no doubt find the workbook useful.  I’ve added a link into the original article, but for those of you who have read through and didn’t find the workbook first time around (because it wasn’t there until the first request came in………..), YOU CAN DOWNLOAD THE WORKBOOK HERE.


Generating a Tag Cloud via PowerPivot

February 21, 2010

blog1 “Navajos say it is good plan to have two arrows for one bow”

– Chief Thundercloud

Thanks to Rob Collie for letting me do another guest post. There will be a bit more Excel than PowerPivot in this post, but that’s what you get when you give MrExcel access to the blog!

Donald Farmer issued an interesting 3rd challenge for the Alpha Geek Challenge contest. He was looking for interesting visualizations created from PowerPivot pivot tables – something beyond the regular charts. 

Please take a look at the contest and vote for your favorites.

Update: Here is a link to an animated GIF showing the tag cloud in action. Watch the slicers along the right as well as the tag cloud and the SmartArt. Also, when you click on the next image, you should see a higher-resolution screen capture.

blog16 My entry uses 40,000 rows of Billboard Hot 100 Music Chart data and throws just about every visualization that I could think of:

  • A Decade slicer. This slicer was dragged outside of the bounding box so that I could make the buttons larger than usual and use two columns.
  • Four paired line charts that compare the decline of the Vocals genre and the rise of Country and Rap. These charts are static and use pivot tables that are not tied to the slicer.
  • A blue data bar that shows the average numbers of weeks on the chart by genre. These data bars are static and are not tied to the slicers.
  • A red data bar that shows the # of Unique Artists by genre for the selected decade(s). This required a DAX Measure with =COUNTROWS(Distinct(Table1[Artist])). This data bar uses one of the new feature in Excel 2010 where the data bar runs right-to-left. This is a cool feature for creating comparative histograms.
  • A set of 10 Win/Loss Sparklines that show year-by-year history for the top 10 artists. (Top 10 is based on song-chart-weeks. If Elvis had 10 songs on the chart for 10 weeks each, that would count for 100 chart-weeks.) More about this sparkline later.
  • The range F3:J18 contains some SmartArt that is dynamically refreshed to show (a) the top 3 genres for the selected decades, and (b) the top 3 artists in those genres for those decades. Now, since SmartArt is always static and never dynamically updated, this one was quite a trick.
  • The range E19:M23 contains a Tag Cloud of the top 10 artists for the selected decade. More about this one later.

A Bit About the Data

The data set was ugly. There were 20 columns of identifying data that provided year, genre, artist, title, beats per minute, followed by 67 columns listing the rank of Week 1 through Week 67. The original data set went back to 1890. Since I eventually hoped to publish a non-PowerPivot example on my public SkyDrive, I trimmed the data down dramatically. I added some calculated columns to count the # of weeks on the chart, the peak, a calculation to look for songs that hit #1 and other songs that peaked at #2 (see the “First Loser” chart on the SkyDrive version). I added a calculation in Excel to calculate the Decade. I deleted the song titles since this would screw up the Vertipaq compression and I did not plan on reporting at that level of granularity. I deleted the 67 columns of weekly data. The final file was 29K rows x 17 columns.

Initial Pivot Tables

My first pivot table had genre in the row area. Chart Weeks was in the value area, sorted descending. From this, I learned that Rock, Vocal, Country, Rap, and R&B were the top genres since 1950. I initially created a single chart showing all five lines, but then I used a trick from my Charts & Graphs book where four smaller charts can communicate more information that one large chart. These four charts became the first visualizations in the top left corner of the dashboard.

blog3

Creating the Sparklines

Sparklines are new in Excel 2010. Professor Edward Tufte imagined Sparklines as tiny, intense, word-sized charts in his book, Beautiful Evidence. As I was working on my book for Excel 2010 charting, I started to realize that (a) the best sparklines are larger than a single cell, (b) benefit from outside labels, and (c) benefit from manual gridlines.

In the image below, you can see that Elvis Presley had a song on the Hot 100 every year from 1958 through 1977, then hit the chart again in 1981, 1982, and 2004. In the second row, Elton John charted every year from 1970 through 2001, except for 1987. Fascinating stuff, right?

To make a sparkline larger than normal, you can increase the height and width of a cell. If that is not possible, then you can merge cells. (As MrExcel, I despise merged cells and I am a bit horrified that I am actually suggesting that you merge cells, but merged cells let you fit a lot of sparkline information into an area). In the screenshot below, cells B25:M25 are merged into a single cell and represent a single sparkline. Cells B26:M26 are merged into a single cell. I had to keep merging each row one at a time.

blog11

The data for the sparklines are off to the right of the screen. There are years from 1958-2009 stretching across the columns. A count of the # of songs by artist occurs in each cell. These Win/Loss sparklines actually need a positive number for the green marker and a negative number for the red number. I ended up using (# of Songs*2)-1. This would force cells with zero to become -1. Cells with 1 would still report 1, and songs with 2 would end up reporting 3, but I didn’t care, as all I wanted was positive or negative values.

The labels for 1970′s, 1980′s, through 2000′s in white are text boxes that I added manually. The dotted white lines that separate out the decades are Excel drawing tools added with the Insert tab. (Tip: Hold down the shift key while drawing the line to make sure it is perfectly straight. Once you get the first line drawn in, use the Drawing Tools Format tab to change the fill color to white and use a dashed line. Once you get the first line drawn in, Ctrl+Drag to make identical copies of that line.

Creating the SmartArt

blog4 SmartArt is new in Excel 2007. Use the SmartArt icon on the Insert tab in Excel. That is a Vertical Chevron List. I created the SmartArt with “Country” as the Level 1 text in each shape and “Bruce Springsteen” as the Level 2 text three times inside each shape. This is all just dummy text to allow Excel to size and shape the SmartArt. I used the SmartArt tools to apply a style and colors.

I created a pivot table that has Genre and Artist as the Row labels and Chart Weeks in the Values area. Both of those fields are set to use a Top 10 Filter to show the top 3 items based on Chart Weeks. This pivot table uses the Decade as a slicer.

Formulas point to that pivot table to build the text that should show up inside the SmartArt. The formula for the first Genre chevron is simple: =AT40. The formula for the three bullet points next to the first chevron is more complicated: =AU40&CHAR(10)&AU41&CHAR(10)&AU42. Those CHAR(10) are line feed characters and translate to bullet points when applied to the SmartArt.

Once you get the SmartArt looking just right, follow these steps:
a) Click inside the SmartArt to select the SmartArt and display the SmartArt Tools Design tab of the ribbon.
b) On the right side of the ribbon, choose Convert to Shapes.
What did this accomplish? Well, SmartArt which is new in Excel 2007 can not do one simple trick of Shapes which have been around since Excel 97. The text in SmartArt is static. In contrast, the text in a Shape can come from a formula in another cell.

You are going to do these steps 6 times:
a) Click on the text box inside the first Chevron. You want the bounding box to be solid, not dotted. If you get a dotted text box, click on the dots to make it solid.
b) In the formula bar, type an equals sign and point to the cell which will contain the first genre.
c) Press Enter.
You now have a dynamic text box on the shape.

Repeat for the other 2 genres and for the three text boxes that contain the top three artists.

Creating the Tag Cloud

Of the whole entry, I like the tag cloud the best.

I will have to try to build an animated GIF showing the SmartArt and the Tag Cloud as you choose various combinations of decades in the slicers.

Here is how to create a dynamic tag cloud using PowerPivot. AGCPost01The underlying pivot table has the Top 10 artists based on Chart Weeks, using More Sort Options to keep the data sorted high to low. The pivot table respects the decade slicer. The pivot table is nothing special, as shown here.

I noticed that many artists were in the database as “Beatles, The”. I wanted to get rid of the word “The”. So, below the pivot table, I used =SUBSTITUTE(AZ40,”, The”,”") to remove the comma and the word “The”.

AGCPost02 This screenshot shows how AZ63:BD63 is merged and uses a formula to point to the original pivot table.

In this new section of the worksheet, I bumped the top artist up to 36 point bold.

It is important to make the Fill Color for those 10 cells to be No Fill. This is the only way to make the area around the words be transparent when you create linked pictures.

Change the color of each value to another color.

Make the #2 artist a bit smaller, and progressively smaller as you continue down the list.

To build the Tag Cloud, you are going to use something that old-time Excellers know as the Camera Tool. The command sequence no longer uses the words “Camera Tool” anymore. Here’s what you do:

Select AZ63 and press Ctrl+C to copy that cell.

blog14 Go to the tag cloud area. Use the Paste dropdown on the Home tab and choose Linked Picture from Other Paste Options. This creates a picture of the top ranked artist. The background of the picture is transparent, so the orange color shows through.  (Technical problem: I changed the decades chosen in the slicer between the R. Kelly item above and the Madonna item at left).

Continue copying each individual cell and pasting as a picture. Arrange the pasted pictures in a somewhat random fashion, trying to keep the words from overlapping. The net result is that as you choose decades from the Slicer, the 10 artists in the Tag Cloud automatically update.

Here is the tag cloud for the 50’s:

blog12

Here is the tag cloud for the 80’s+90’s:

blog13

So – there you have it – a lot of basic Excel blocking and tackling, along with a bit of cool DAX measures to calculate the number of distinct artists in a genre. Again, please take a moment to visit the contest and vote for your favorites.

Will this Work On the Server Version?

Rob and I had a discussion this morning about the running this on PowerPivot Server. Since both the tag cloud and the SmartArt replacement are using shapes, none of those will render in Excel Services. Thus, this is a client-only solution for now.

Technorati Tags: ,,,

-Bill Jelen