What is a “Thin” Workbook?
At Pivotstream, we came up with the term “thin workbook” to describe an Excel workbook that does not contain a PowerPivot model, but that DOES connect to another workbook that does, which we call a “core” workbook. In other words, thin workbooks use core workbooks as their data source. Others have called this “hub and spoke,” and I may even like that phrasing better, but hey, we’ve been using “thin and core” at Pivotstream for nearly a year and it’s hard to change.
So, a thin workbook contains pivots, charts, cube formulas, etc. – it’s a report-only workbook. And whereas our core workbooks may be 200 MB or more, our thins are often 5 MB or less.
Why create thin workbooks?
Here are a few reasons why you may want to use a thin workbook approach.
1) To avoid duplicating data. If you want to create multiple workbooks that offer different views of the same data, which we do all the time, you probably don’t want to duplicate the PowerPivot data across all of those workbooks. That duplication needlessly burns RAM on the server, takes extra time to run scheduled refreshes, and also increases your maintenance work (ex: if you want to modify a measure, you have to do it multiple places instead of one). I *highly* recommend you never duplicate PowerPivot models like that, and thin workbooks are a good alternative.
2) To lock down data. In many ways this is another flavor of #1. At Pivotstream we often create one workbook that contains a full set of measures, and then another that omits certain measures that are particularly sensitive. Both can use the same core workbook as a data source, but if you properly lock down your permissions on SharePoint, you can give one set of users a different level of data than another set.
3) To avoid the refresh on open performance problem. In the last post about refresh, I pointed out that the PowerPivot refresh service on SharePoint kinda “cheats” a little bit – it refreshes the data in the PowerPivot model, but the data in the pivots (the data in the Excel sheets themselves) is untouched. To work around that, PowerPivot for SharePoint sets the “refresh on open” flag, forcing Excel to refresh all pivots whenever the workbook is opened.
That can be slow. In many cases in our work at Pivotstream, it is VERY slow. Since the PowerPivot refresh service does NOT touch the thin workbooks, “refresh on open” never gets set.
Of course, that creates another problem doesn’t it? The data in the Excel sheets then remains stale, with no automatic way to refresh it. That’s no good.
But we are busy little beavers at Pivotstream and we don’t give up easy :) We started working on this problem in March 2010. More on this later.
How do you create thin workbooks?
The first step is to publish your “Core” workbook to SharePoint. You CANNOT connect a thin to a core sitting on your desktop – the desktop version of the PowerPivot engine does not support cross-workbook connections like that.
Once you have your Core published, there are two primary ways to create a thin workbook: Start from Scratch, or Hack like Mad.
Boring: Start from Scratch
In this method, all you do is go to the PowerPivot Gallery and click the highlighted button:
More Fun: Hack Like Mad!
For those of you who prefer to color outside the lines, here’s another technique that I often find helpful. You’ve created a single Core workbook, and it has lots of report sheets in it. You don’t want to start from scratch and throw all of those report sheets away, right?
So here’s what you do:
- Create a copy of the workbook. Just copy/paste the file in Windows Explorer. Name one file something like Core.xlsx and the other Thin.xlsx
- Delete pivots from the Core workbook. Two reasons for this. One, it reduces confusion – when you open the Core, you know it’s the core because you don’t see any report sheets. But deleting pivots also makes the scheduled refresh process on SharePoint run faster – after PowerPivot refreshes the model on the server, it then needs to refresh the thumbnails in the gallery. To do that, it asks Excel Services to open the workbook, which triggers “refresh on open.” Why pay extra time and CPU penalties on reports no one is going to see?
- Publish the Core workbook to SharePoint.
- Change the connection string in the Thin workbook. Open it up in Excel, and in Connection Properties change the “$Embedded$” string to be the full URL to the core workbook on SharePoint. Example:
When you click OK, Excel performs a full refresh against the new source. Be patient. Once that is done, your thin no longer connects to the PowerPivot data embedded in its own file, it now queries the server copy instead.
Of course, now you probably want to make the file smaller and get rid of all that unused PowerPivot embedded data. Remember where that lives? This post explains in more detail, but here’s the image of the thin workbook renamed from XLSX to ZIP:
Item1.data is the file you want to get rid of. But if you delete it, you will corrupt the file. So I just create a 0-byte Item1.data file on my desktop and copy it over the top of the file inside the ZIP.
Rename back to XLSX and you have a true “Thin” workbook with all of the original report sheets intact and functional!
Summary, and a hint at the next post
Thin workbooks are a VERY useful tool in a PowerPivot system. For Pivotstream, they are absolutely essential.
The only trick, as mentioned above, has been getting those thin workbooks to always be up to date. I’m happy to say we have that problem solved.
I’m pretty sure others are going to need this sort of thing, too, as well as some other internal tools and components we have built. If you have interest in beta testing our PowerPivot Accelerators, as we are calling them, watch for the next post