PowerPivot Refresh: CPU/RAM Spike at Completion

December 6, 2010

 
I mentioned this in a recent post, but I figure a picture illustrates it better.  This is a screenshot of task manager as a PowerPivot refresh completes.  This one ran for about 30 minutes, so this represents just the tail end of the process:

PowerPivot Data Refresh CPU and RAM Spike 

Note the two highlighted points – one of the 4 CPU’s pegged at 100% for awhile – and while the other 3 did go “quiet” at the very end, they WERE active for part of the time that the fourth CPU was at 100.  So – CPU gets precious near the end of refreshes (after hovering around 50% for much of the process).

Even more notably, RAM usage spiked by nearly 2 GB!  The PowerPivot file in question was 1.45 GB on disk when it was complete.  And that workbook was still in RAM after refresh completed, so the 2 GB spike was pure overhead during the final compression process.

(Side effect:  Even if you have enough RAM to load a workbook, that does not mean you have enough RAM to refresh it.)

Allocate your server RAM and CPU around the refresh process, folks.


Hardware/Capacity Planning: From the Trenches

November 30, 2010

So How Many Servers Do I Need in My PowerPivot SharePoint Farm?

As people increasingly move up from just dabbling with the addin, and decide to start leveraging the publish/schedule/share/secure benefits of the PowerPivot for SharePoint infrastructure, I am getting this question more frequently.

How many servers?  How much RAM?  And less often but just as important…  how many CPU’s?

Time for an old joke:  “Ever hear the one about the statistician who drowned in a river that was, on average, only 3 inches deep?”

That’s the whole joke.  And as time goes on, I only find it funnier.  And funnier.  But first, some basics.

The Three Primary Server Roles

For the vast majority of PowerPivot deployments, you will mostly need to concern yourself with three server types:  Excel Services, PowerPivot Engine, and Data Source DB Servers:

PowerPivot Server Roles Summary For RAM and CPU Planning 
(Click for Larger Version)

A few notes on the diagram up front:

  1. I recommend viewing the larger version – there is information on the diagram (in the notes) that I will not repeat below.
  2. Don’t read too much into the 3/4/2 ratio of Excel Services/PowerPivot/SQL – I included different numbers of servers in each tier intentionally, to illustrate that you can scale each tier out at independently.  But, for instance, I do NOT expect you will need 3 Excel Services boxes for every PowerPivot box (you will likely need less).
  3. Although separated above, server roles CAN be combined onto single servers – for instance, many folks can get by with an “all in one” server where everything above lives on a single box.  Also, even in a multi-server farm, one of the MS-recommended configs is to combine Excel Services and PowerPivot onto a single box, and then deploy as many of those combined boxes as needed.
  4. There are some elements missing from the diagram – for instance, there is a Web Front End (WFE) role, and a SharePoint Configuration Database role, but in my experience, PowerPivot does not put unique strain on those elements (except for config db disk space).  Your mileage may vary of course.

Early Planning Efforts

Back in the Spring, I set down to the very serious task of “how much hardware will we need on our SharePoint servers in order to handle user loads?”  And naturally, I made a very serious spreadsheet to model it out:

PowerPivot SharePoint Hardware Planning Spreadsheet v1

Wow, what a spreadsheet.  I mean, I even used the =POISSON.DIST() & =BINOM.DIST() functions – clearly, such spreadsheet horsepower indicates accurate results!

Turns out, that spreadsheet was nothing more than an amusing theoretical exercise with little bearing on the real world.

There were multiple problems with that spreadsheet:

  1. It relied on outright guesses as to how often/how intensely consumers would utilize the published reports.
  2. I was not yet aware of the CPU-gobbling power of slicers, a power that makes real-world queries many times more CPU-intensive.
  3. Most importantly, it assumed peak usage would occur in the morning, when consumers are most in need of fresh information.

Solution for 1) and 2)

These are in some sense the simplest to address.  Quite simply, put up a pilot solution and observe the usage characteristics.  You will learn a lot about usage patterns as well as what that does to CPU and RAM.

But even better:  you may choose to simply ignore these factors for now and focus on problem 3 instead.  Here is why:

Peak Load is Probably Experienced During Scheduled Refresh

If you’ve looked at the larger version of the diagram above, you have seen a hint as to what I’m about to say here: 

At Pivotstream, our nightly refresh process puts far more strain on our servers than our users do, and is what we now plan our hardware around.

I will explain further, so you can evaluate whether your situation will be similar.  We receive new data on a nightly basis (typically late at night).  That data gets ingested into SQL server, and then our PowerPivot refresh process begins.

Once PowerPivot refresh begins, we have about 6 hours, tops, to get all models and reports refreshed, so that when business opens in the morning, everyone has access to fresh insights.  6 hours sounds like a lot…  until you discover that it isn’t.

A lot happens during scheduled refresh!

Keep in mind that the PowerPivot model (the embedded database living inside your PowerPivot workbooks) must first be refreshed – let’s call this phase “Model Refresh.”  This can put a lot of strain on your SQL servers just in terms of supplying the data, since PowerPivot v1 pulls a fresh copy of all tables (no incremental refresh).

As the data flows into the PowerPivot servers, a LOT of CPU power goes into compressing that data into the storage format.  During the refresh process, RAM usage steadily climbs as well.

As model refresh nears completion, RAM usage spikes upward quite a bit, sometimes by as much as 50-100% the size of the resulting workbook.  CPU usage also spikes during this phase.

Once the model is done refreshing, PowerPivot for SharePoint then triggers a refresh (via Excel Services) of the Pivots and Cube Formulas in the workbook, so that when thumbnail screenshots are “snapshotted” for the report gallery, those reflect the latest data (let’s call this phase “Pivot Refresh”). 

Yes, that process initiates via Excel Services, but that refresh immediately results in a meaty set of queries sent back to the PowerPivot servers for processing.  So, CPU usage spikes again.  And the model in question is held in RAM while that happens, so that RAM can’t be recycled into the pool to be used for model refresh.

Remember, this all happens for a single workbook.  And if you hang a bunch of report-only workbooks off of a single “hub” model workbook (as we do all the time), the amount of time it takes to finish Pivot Refresh can actually exceed the time for Model Refresh.

If you have to cram all of this into a narrow nightly window, chances are that you will need more hardware to pull it off than you ever will need for normal daily usage!

Rough Guidelines for Your Own Situation

This post is running a bit long, so I will try to be succinct in closing:

  1. If a significant percentage of your PowerPivot models and reports will be refreshed nightly, refresh is likely going to be a larger peak strain than normal interactive usage
  2. Even if you only refresh weekly (or less often), if the execution window between “data is ready” and “reports must be ready” is basically still a single night, then refresh is still likely going to drive your peak hardware need.
  3. If you think “Peak Load = Refresh Process” is likely the case for your organization, I suggest ignoring interactive usage projections during your hardware planning process, and instead developing some prototype workbooks (with rich data sets and mutliple report sheets with 100% realistic slicer sets, measure complexity, etc.), and then putting those workbooks through the refresh process to get a baseline.

Illustrated guide to Excel Services Chart Rendering

October 4, 2010

 
COMMON QUESTION:
  “Hey, will this chart render unchanged in Excel Services, or will it look different?”

MY STOCK ANSWER:  “Hmm, I forget, let’s try it out”

I hate giving that answer.  So today, I ran a little test.  Simple workbook, 12 charts that are all flavors of fancy.  Not an exhaustive test, but a reasonable sample.  Here are the 12 charts in Excel client (desktop).  Click for a larger version.

Client2 
12 Charts in Desktop Excel

Green Circle = Renders 100% identical in Excel Services
Red X = Renders less than 100% identical (sometimes small differences, sometimes large)

Now, here are the same 12 charts rendered in Excel Services:

ECS
Same 12 Charts, As Displayed on SharePoint (Excel Services)

Again, note that being flagged with an X isn’t the kiss of death – I just didn’t want to bog down in “grading” the differences.  Sometimes the changes are big (top left chart) and sometimes the changes are astoundingly small (bottom middle chart).

Either way, having an illustrated reference is going to be helpful for us, so I thought I’d share it.  Feel free to submit questions/comments/other examples.  I may expand this reference over time.


Customized SharePoint Report Portal

September 23, 2010

 
Believe It

 


“I can’t believe it’s not SharePoint, er, Excel, er, wait… it IS Excel and SharePoint…  so confused…”

-Me

 

 

 
A couple weeks back I posted screenshots of our custom SharePoint master pages.  The functionality was great, but the aesthetics were still a little bland.

We’ve had some plastic surgery done since then.  Here’s the new home page:

Custom PowerPivot SharePoint Report Portal
Redesigned Customer Report Portal

We kept all of the functionality from before (no Ribbon or Site Actions menu unless you are an admin, etc.) but there’s a new component now as well:  the treeview.  There are five reports under each node of that tree, so our customers can quickly find any of the 100+ reports they are looking for.

(No, we don’t use the built-in Report Gallery – it really doesn’t handle more than 2-3 reports in practice).

And here is one of the redesigned report pages:

Custom PowerPivot SharePoint Report Page 
Redesigned Customer Report Page

As I said before, if you need advice or help doing something similar, drop us a note, we’ll see if we can work something out so that you don’t have to start from scratch:  info@pivotstream.com


Cleaner SharePoint pages, and an offer to share

August 30, 2010

 
Worlds Most Renowned Designer of Business Intelligence Weaponry (Nonlethal of course)

 
 
“Dr. A. Heller. Weapons designer.

Innovator. Inventor. World changer.”

“…why does he live in a deserted amusement park?”

 

Building an Arsenal of PowerPivot Components and Tools

In the past six months assembling the core PowerPivot-fueled platform at Pivotstream, we’ve been developing some components and tools to supplement the core PowerPivot v1 product, both in terms of what our customers see as well as “behind the scenes” tools for us.

You might expect this from an ex-Microsoftie whose prior job was designing software – I really can’t stay out of that game completely.  (Doubly so given my background on the PowerPivot and Excel teams).  I’m always starting sentences out like “You know, I bet we could…”

One small example of this that I shared last week:  VBA macros that we’ve been using and modifying to make workbook creation/editing go faster.  That’s just the tip of the iceberg though really, so let’s take a look at the first in a series of larger components.

First Example:  Cleaner SharePoint Pages

First, consider what the top of the default SharePoint page looks like:

Standard SharePoint Non Report Page

Default SharePoint Home Page Etc. 

And what the default Excel Services report page looks like:

Standard SharePoint PowerPivot Report Page 
Default Excel Services Report Page

Not bad for out of the box, but there is definitely room for improvement.  Let’s look at those again with some elements highlighted:

Standard SharePoint Non Report Page Highlighted 
Extraneous Elements Highlighted

Things that many BI sites would rather NOT have:

  1. Site Actions Menu – This is fine for a standard SharePoint experience, where users are often pseudo-admins of the site.  But for a BI site, where you want as little noise as possible, and don’t want users to have anything resembling admin control, this is a liability.
  2. SharePoint Ribbon – Don’t get me wrong, I actually am LOVING the SharePoint Ribbon.  Way more than I expected.  But again, on a BI site, the user’s task is to navigate as quickly as possible to actionable information.  And you will be surprised how often someone will call you and say that they can’t find the option they are looking for on the Page ribbon tab, when the button they want is right there in the report.  Again, it’s just noise in the BI scenario.

Now let’s take a look at the Excel Services page again, with the same level of focus:

Standard SharePoint PowerPivot Report Page Highlighted 
Extraneous Elements Highlighted Again

Things that many BI sites would rather NOT have:

  1. Excel File Tab and Toolbar – Just like above, this is a great element for many scenarios.  If you view Excel Services as “Excel in the browser,” which is precisely its mission in most cases, then yes, you need a File menu for Saving, Downloading, Checking Out, etc.  But in a BI site, again, it’s noise, it’s confusing because it’s read only, etc…  plus it distracts from the well crafted perception that this is NOT Excel but a Web Application.  Most people just feel GOOD about web apps, and they feel kinda icky about Excel.  People are more likely to use something that they feel good about.
  2. File Name – Same thing.  The feel-good vibe of the app is disrupted by the presence of an Excel file name.  “Are you saying that’s a file on my desktop?  Where is it?”  Subtle little things like this are easy for techies like us to dismiss, but ignore them at your own peril.
  3. Unsupported Features Bar – This one requires little explanation, it looks horrible.  And it suggests to the user that something is broken, when really, it’s just those slicer parent controls.  In the VBA post I gave you a macro for removing all of those right before you save, but in practice that isn’t a viable solution.

The Simplified/Modified Versions

None of those problems was going to fly for us at Pivotstream, so we developed our own modified set of page templates.

First, a really simple, sample home page as a customer would see it:

Pivotstream HomePage Theme 1

Note also the little copyright notice at the bottom, that’s built into all of our pages now as a result of the template.

These templates are also themeable, so we can change the color scheme, page content, and images from within SharePoint itself with a few clicks:

Pivotstream HomePage Theme 3

(No one has ever accused me of being an artist, so at time I choose intentionally ugly colors – you can do much better I am sure.  Bet you can’t whip up a PowerPivotYoda like I can though).

Improvements to Note:

  1. No SharePoint ribbon to distract
  2. No Site Action menu (ditto)
  3. No Search box (we can turn that back on once we have it hooked up to work properly)
  4. No QuickLaunch menu telling me about all those doclibs and such on this site that I don’t want the users to concern themselves with.
  5. Lots of real estate for adding content and links.

Customized Report Page Templates

Next, here’s a report page we have in production, with a very minimalist style:

PivotStream PowerPivot Online Service 1

Zooming in on the top, we see:

Pivotstream PowerPivot Report Page Zoomed

Improvements to Note:

  1. Still no SharePoint Ribbon or Site Actions menu
  2. No Excel “File” ribbon
  3. No “Unsupported Features” warning
  4. “Download Snapshot” appears as a link in the header, separate from the File ribbon
  5. No Excel Icon on the browser tab – actually it is usually our Pivotstream logo, but IE forgets those icons all the time and puts the little IE logo in instead.  Sigh.  Firefox is much better about that I’m told, but I’m still running IE.
  6. No Excel filename in the title bar
  7. “Breadcrumbs” nav bar in the header, for nav back to the site home

In other words, everything extraneous removed, and only functionality remains.

Fear not, admins still see the suppressed controls!

Of course, when *I* log in to these sites, I often need to make changes, like add a new page, update a setting, etc.

So, I need a lot of the things that are suppressed in the screens above, like the Site Actions menu, and the SharePoint ribbon.

The screenshots above were all taken when I was logged in under my User account.  Here’s what I see when I sign in as my Admin account:

Pivotstream Admins See the Ribbon Etc

Neat huh?  It detects I am an admin, and gives me back all the stuff like the SharePoint ribbon, Site Actions, the notification I have the page checked out, etc.

Under the hood

OK, so what all is going on here, in order to make this work?  Here’s a rough list:

  1. A new SharePoint Master Page, from which all user-facing pages are derived
  2. Admin vs. User detection logic in that Master Page, and subsequent suppression of certain page components when User
  3. A new Layout for the report page, that includes the Excel Services web part with default settings already applied
  4. A new Control (part server-side and part Javascript) that implements Download Snapshot functionality without having to expose the Excel toolbar (whose options are confusing in a pure BI environment).

That’s about all I am qualified to convey – I supervised the creation of all of this, but I did not code it.  (Similarly, I can tell you about how Excel’s calc chain involves a separate linked list per processor, but I couldn’t tell you how to write something similar now could I?).

That Excel Services Toolbar – Isn’t that part (hiding it) simple?

Yes and no, and then no again :)   On one hand, it IS easy to create a new web part page, slap an Excel Services web part on there, and then customize its settings to your liking.

What I did NOT find easy, was repeating that process every time I wanted to publish a new report.

With this new system, I just instantiate the right template, and change one setting to point the template at the right workbook.  Done.

Also, the Download Snapshot feature turns out to be pretty important.  For us, and most locked-down BI deployments of PowerPivot, the snapshot feature is the ONLY way to print something that looks decent.  And it’s the only way to take a fixed view of the entire workbook offline.  So, we didn’t want the File menu, but we needed that feature to stay.

So we re-implemented it :)

What about the Sharing thing you mentioned?

It cost us well over $20,000 in billable time to build all of this, in addition to my time designing and supervising it.  In my opinion, there is no sense in others having to sink the same cost (and their own time) rebuilding the wheel.

If you are interested in getting your hands on any of the components I described here, or if you are interested in whether they can be customized to your specific needs, drop a note to:

     software@pivotstream.com

Unlike the VBA Macros and other small stuff, I can’t give away modules like this that the company paid for.  But I am free to explore the idea of sharing with the community at a fraction of what it cost us, as a means of recouping some of our development budget.

And if I can make this sort of project pay for itself, that just means I can spin up other such projects :)   This is very much an experiment at this point.

Anyway, drop us a note if you’re interested, and we’ll see if we can figure something out.