SQL UNPIVOT Makes My Data Skinny

 

Guest post by David Churchward

If you’re anything like me, you will have eagerly sucked data into PowerPivot, keen to get cracking with solving all of your reporting needs.  If you’re an Accountant (amongst others) doing this, you will no doubt have brought your trial balance dataset in and found endless columns of values representing monthly values.  “How do I get time intelligence to work with this?” is probably one of the first issues you’ve encountered and then, further down the line, when you’ve conjured up complex measures to resolve this problem, the words “why can’t I get this to run quicker” pass your lips.  On this note, everyone should pay close attention to Less Columns, More Rows = More Speed! where Rob proves that the tall and skinny table is much more effective than a short and wide version.

In this post, I’ll explain how SQL can solve this problem by transforming the data on the data extraction process using a SQL function call UNPIVOT and it’s available to everyone – not just developers!

This is PowerPivot using Pivot tables – why on earth do I want to UNPIVOT.  Just like a kid who ties their shoelaces wrong, Mum has to untie in order to do it properly!  The fact is that data transformation has to happen in many situations to leverage the full power that PowerPivot can provide.

What’s this SQL Stuff – Surely that’s for Developers?

I was reading Rob’s recent posts on Why PowerPivot is Better Fed From a Database and it made me realise that there’s a grey area that represents how far an Excel Pro can go in data transformation.  I’m going to leave the “Developer V Excel Pro” debate to Rob (but believe me, he’s right).  As an Accountant who learnt SQL but who is an Excel Pro through and through, my view is that Excel Pros should be able to transform their data to the extent that a SQL view can.

Some of you will be aware, but others won’t, that there is a SQL query editor in PowerPivot.  If I’m upsetting Developers by revealing this, I’m sorry, but, if you’re worried, please take this as an opportunity to provide your Excel Pros with a secure Dev environment!

Accessing The SQL Query Editor

You can access query editor either when you first import data or by editing a table that you’ve already imported.

Accessing Query Editor on First Import

  1. From the PowerPivot window, select From Database > From SQL Serverand set up your connection as you would do normally.
  2. Click Next and select the option Write a query that will specify the data to import

image

Accessing Query Editor From a Table Already Imported

  1. Select the table that you want to affect in the PowerPivot Window and select Design > Table Properties
  2. Change from Table Preview to Query Editor

image

image

And there you have it.  A script that is running to import the data that you’ve specified.  We can now edit this to transform the data.

Before I go ahead, I just want to set a benchmark regarding the data that I’m using.  When I’ve brought this data in, I’ve got 38,183 records using 14 columns.  We’ll see what the UNPIVOT does to this in the end.  Everyone’s expecting a lot more rows but fewer columns, right?

image

The SQL “UNPIVOT” SCRIPT

So, here it is, the script to use in Query Editor mode:

SELECT

– These are the fields that I want in my final dataset

NCODE,

NNAME,

VALUE_CATEGORY, – NEW FIELD 1 – The field that takes my value column names

VALUE – NEW FIELD 2 – The field that takes my values from those columns

FROM

– I specify where these fields come from

(

SELECT

NCODE,

NNAME,

NTURNOVER_C1, – Value Field

NTURNOVER_C2, – Value Field

NTURNOVER_C3, — Value Field

NTURNOVER_C4, – Value Field

NTURNOVER_C5, – Value Field

NTURNOVER_C6, – Value Field

NTURNOVER_C7, – Value Field

NTURNOVER_C8, – Value Field

NTURNOVER_C9, – Value Field

NTURNOVER_C10, – Value Field

NTURNOVER_C11, — Value Field

NTURNOVER_C12 – Value Field

 

FROM dbo.AZZ_V_POWERPIVOTPRO_TB –Substitute this with the name of your source table or view

) PIVOT_TABLE

– Unpivot the table               

UNPIVOT

(VALUE FOR VALUE_CATEGORY in        – These are the names given to the

       — value field (NEW FIELD 1) and value column name (NEW FIELD 2)

(

– Enter value fields as per the select above

NTURNOVER_C1,

NTURNOVER_C2,

NTURNOVER_C3,

NTURNOVER_C4,

NTURNOVER_C5,

NTURNOVER_C6,

NTURNOVER_C7,

NTURNOVER_C8,

NTURNOVER_C9,

NTURNOVER_C10,

NTURNOVER_C11,

NTURNOVER_C12

)

) AS UNPVT

WHERE VALUE <> 0

ORDER BY NCODE, VALUE_CATEGORY

Note – green text within the script are comments to provide some direction as you use this script.  You can copy these into query editor too – PowerPivot will ignore them.

How to Use This Script

  1. Copy and paste the script into your query editor
  2. In the first SELECT, enter the field names that you require in your final dataset.  These should correlate to fields in the table except where I have noted “NEW FIELD 1” and “NEW FIELD 2”.  These “NEW FIELDS” will carry the value elements that we are unpivoting.
  3. In the second SELECT, detail the fields that you have already detailed in the first select together with the column names carrying the values that you will unpivot.  Also change the table or view name in the FROM clause to reference the source data that you are using.
  4. In the UNPIVOT section, detail all of the value fields again that you are unpivoting.
  5. In the ORDER BY clause, select the fields that you wish to use for dataset ordering.

When you save this, your dataset will have been narrowed and you’ll have a new field denoting the data type (or value column reference).

image

You’ll notice that we’re down from 38,183 records to 7,103.  This won’t always happen.  The reality is that we’re able to strip out the zero value cells by using the WHERE VALUE <> 0 clause.  If you’re carrying a number of cells that are zero valued, we won’t see that item as a record in this new dataset.  As a result, we get fewer records in this case.  In any event, your data is now a lot more streamlined and ready for more efficient measures to be written.  You don’t have to include the WHERE clause if you prefer, for whatever reason, to bring all records in.  If that’s the case, just delete that clause.

One Further Step

To make this truly useful, we need to translate VALUE_CATEGORY into something more meaningful.  I would normally expect this to be a date but most trial balances use column names such as “CURRENT1” so that the system isn’t date dependent.  Another table would be required that the VALUE_CATEGORY column can link through to which translates that value, thereby opening up the whole time intelligence functionality.

A Quick Warning

This UNPIVOT approach has given you the power to transform your data into a more streamlined dataset.  However, by using this approach you’ve created a local version of the script.  You won’t be able to reuse this without copying and pasting it into other workbooks (or at least I haven’t found a way).  However, those Developer friends of ours can give you a centralised version that can be controlled and reused more easily.  If you ask these guys nicely, they might create a view or stored proc to do this for you.  I’ve found with my team that salt based snacks, chocolate and the occasional Spag Bol for lunch does the trick and, if it doesn’t, I go ahead and do it anyway!

8 Responses to SQL UNPIVOT Makes My Data Skinny

  1. sam says:

    I am impressed. For some strange reason this gives me an error when I use it on Access.
    I have removed the dbo. in the from clause and referred the table inside []

  2. Lee Hawthorn says:

    Another top class post. This is a good scenario to show where the worlds of the business analyst and database dev are merging.

    Lee

  3. Erik Olsson Dibbern says:

    This is great! It gives a great flexibility to turn the data around like this.

    I am also (as Sam) experiencing problems when trying to do this from an Excel sheet as the data source. It gives me error in the FROM clause and I am guessing it is the declarations that is hard to nail when not doing this from a SQL data source.

    Any ideas?

    /Erik

  4. Silvia says:

    This is a great solution! did anybody figure out the FROM problem?? I am getting the same error.

    Thanks!

  5. Silver_A says:

    This only works if the data comes from SQL or a datasource that supports that funcion. Unfortunatelly this doesn’t work with ACCESS as a datasource as ACCESS doesn’t support that function. Unfortunately…

  6. Paras Doshi says:

    When the article was published, Data Explorer preview might not have been publicly available. But it’s available in preview for Excel 2010/2013 – it’s VERY easy to unpivot data using data explorer. It’s just few click! Here’s a tutorial that I wrote: http://parasdoshi.com/2013/04/25/unpivoting-data-using-the-data-explorer-preview-for-excel-20102013/

    And Data Explorer can do other amazing things too, Unpivot is just one of the many functionality it offers.

  7. John says:

    This is giving me a “Syntax error in FROM clause”. Any ideas?

  8. For those coming from a statistical background or perhaps those of us who have dabbled in/with R, this concept is what is what Hadley Wickham coined as tidy data http://vita.had.co.nz/papers/tidy-data.pdf (see packages reshape and reshape2) and has several additional advantages to the ones listed in this post.

Leave a Comment or Question