My last 2 blog posts have been about data compression – that theme continues today. In December I blogged about the basics of compression and impacts that columns have on file size and did some testing with randomised data. In January I went a bit deeper and talked about restructuring your data to get better compression. In the comments section of this last post in January there was a seemingly innocuous comment from Mati Selg at the bottom of the…
I was relaxing during my vacation thinking some more about column compression in Power Pivot. One of the main things to know about compression is that a high level of cardinality is your main enemy (ie a large number of unique values will mean poor compression). I started to think about how I could reduce the cardinality of one or more columns in a large data table I use for a client. This blog covers the process I went through…
I have read a lot about how good Power Pivot compression is over the last couple of years, and I have a pretty good understanding of how it all works. But there were some doubts in my mind after some recent reading, so I thought I would simply do some tests to firm up my understanding – I am sharing that testing and learning in this blog today. There are a number of compression algorithms that Power Pivot uses. Today…
by Matt Allington Over the last 12 months I have engaged with may people from lots of different companies on the topic of Power Pivot. And I have been quite surprised at the low quality of some of the PC equipment people are given to do their job. I am not talking about people that need to do their emails and the odd Power Point presentation here – I am taking about people that are power users and data analysts…
Post by Rob Collie
Does Your Pivot Look Like This? Does its Slow make you Sad? Time for a Fix!
Tell me if this sounds familiar…
Yes, you know that pivots are meant to show aggregations. Summaries. Pivots were NOT invented to display thousands of rows of detail data.
But still, sometimes you need to do precisely that. The biz needs its list of customers and how much they’ve been buying, for instance, and all that data is in YOUR Power Pivot model.
And hey, pivots are really the only game in town* for table-shaped display of data. So, you build one of the monstrosities like the above.
(*OK yeah, you DO know about this thing called DAX Query Tables, but those are seriously a pain to set up. So, no. You rule those out before even starting. Just like me!
So You Do The Flattened Pivot Dance, Right???
In pictorial form…
The Flattened Pivot Option – Found in the PivotTable Dropdown in the Power Pivot Window
Next, You Pile a Whole Bunch of Fields Onto Rows
Then, on the PivotTable Design Ribbon Tab, You Turn Off All Subtotals
And Voila! It’s Slow as Heck.
Post by Rob Collie Looking for the Fastest Desktops, Laptops, Tablets, and Servers! “What’s a good computer for running Power Pivot?” That’s a question we get asked all the time. And we do have lots of advice – things that make a good CPU, RAM considerations, etc. – but that always translates to a hunting license of sorts. It would be much nicer for us to say, “here are three models in your price range that we recommend.” And to…
An 11,000-Row Calendar Table Spanning from 2000 to 2030:
Most of the Time This is Harmless Overkill
A 60x Speed Improvement From a Most Ordinary Place
I’ve been doing some work lately for a client who really pushes the DAX envelope. One of the top-three models I’ve ever worked on in terms of complexity, no doubt. And really, my role is just to help fine-tune it and add a few bells and whistles. They built this sucker themselves and I am way impressed.
Crazy stuff. Formulas that use outlier dates from one Data table (“fact” table) to then subsequently filter another Data table (via its related Calendar table), but then wrap that up inside a MAXX inside a SUMX… and it all makes perfect business sense. It’s magic.
But speed ain’t its strong suit. We tried all the usual tricks – “de-importing” unneeded columns, replacing calculated columns with imported versions, etc.
And it was still way too slow. Then we tried something even simpler, and things got 60x faster.
A PowerPivot Model with “Single-Step” Relationships is a “Star,” and one with “Daisy-Chained” Relationships is a “Snowflake.” Snowflakes are Often Slow. And Evil. Don’t trust them.
(Images taken from page 194 of the book).
“Hey, Why Did Things Get So Slow?”
Got a great question this week from Dave Boylan:
“I think I read in your book that you should always use dimension (lookup table) data in your slicers and page fields (even if the same data is in your fact/data table) because they behave better and perhaps use less memory. This makes sense as my fact table has 2MM rows and the slicer has three distinct values.”
Yes – if you have a lookup table, you should use fields from that lookup table on your pivot, rather than their equivalent fields from your data table. My recent post on using fields from your date table on the pivot is just one example of this.