New Docs from Redmond!

April 27, 2010

Dax WhitepaperCouple quick things today.

First, Howie Dickerman, one of the SuperHeroes of DAX, has written a new version of the DAX whitepaper.

I had a chance to discuss some of it in person with Howie when I was in Redmond.  It opened my eyes to a number of things I had been missing.  I sadly have not had time to review it in detail yet – there are other things (good things) afoot at PivotStream these days…  like, um, I dunno, our first industrial-strength PowerPivot farm in the sky…  but I plan to get back to DAX shortly.

In the meantime, I have posted the updated whitepaper in the Samples Gallery out on the FAQ Site.

PowerPivot Architecture Poster

 
Also, Denny Lee shared an excellent new poster that displays all the components of the PowerPivot architecture.  I’ve also uploaded that to the Samples Gallery

Note that the poster is HUGE, meant to truly be a poster, so it’s not really something you view in a web page.  Hence the PDF format.  It’s also available in other formats.

 


Six Months With PowerPivot, part one

April 23, 2010

simply_red1

   
“If you don’t know me by now…  you will never never never know me…”

In honor of PowerPivot v1’s public release, I thought I’d offer up my thoughts on my last six months working with the product. 

Remember, I physically left Redmond in August and then officially left Microsoft in February, but have been putting PowerPivot through its paces that whole time, with a wide variety of data sets and business scenarios.

So in some sense, you can’t really get a closer view of PowerPivot than I have, while still maintaining any sense of neutrality about the product.  So here it is, the honest truth…

It passed the Great Football Project Challenge

When I started the Great Football Project back in October, I really did not know what to expect.  Honestly, I wasn’t sure that it was a great idea – was I going to get 5 days in and discover that PowerPivot simply wasn’t suited to that problem?  There was genuine potential for embarrassment.  But I needed something to blog about, and I was anxious to get started, so I just dove in.

After a few weeks, I was still holding my breath a bit.  I was past the basics but hadn’t really pushed the envelope at all.

At some point though I just stopped wondering.  It wasn’t until recently that I looked back and realized that I was building features, like the Rank measure, that were never in the original project at all. 

I can say now with confidence that PowerPivot can absolutely handle the business logic phase of the football project, which the consultant back in 2006 described as one of the most complex cubes he’d ever seen.

I call that a success.  Note how I specified the “business logic phase?”  That brings me to the next topic…

It is NOT a data cleaner/shaper

OK, this is perhaps obvious since the PowerPivot tables cannot be edited, and there are no features/functions that let you reshape rows.  Yes, you can add calc columns and that often does the trick, but there are other cases (like for instance, removing dupe rows) that PowerPivot can’t do at all.

The whole “shape it in normal Excel and then paste it over to PowerPivot window” workaround is ok for one-off work, or for tables that you don’t expect to change very often.  But like many of you, I am striving to gain the benefits of automatic nightly data refresh that the server provides, and the Excel workaround doesn’t translate to that system at all.

So, even more than ever, you need a clean and properly-shaped data source to start with.  So the first half of the work that the consultant did for the football project in 2006, where the text files were imported into SQL and turned into a decent schema, is still very much required.

For a production system, I don’t think this is a bad thing at all.  It forces you to cooperate with IT (or whoever owns your databases) to give you what you need.  And it forces them to listen to you more clearly, as long as they care about “taming the Excel beast” that is always their favorite thing to complain about :)

For a production system, that cooperation is essential for robust results.  And if it’s not a production system, then yeah, the Excel shaping workaround is great.

“No, it can’t do that.  Oh, wait. Nevermind.  Yeah, it CAN do that.”

Having worked on the product from the beginning, I’m more closely familiar with its limitations than most people.  In some ways that’s an advantage of course – I don’t dive into projects only to find out later that it can’t be done.

Oddly though, so far, knowing the limitations has largely just been a hindrance.  Every time my initial answer was “no, not in PowerPivot v1,” I’ve turned out to be wrong.

I have succeeded using PowerPivot to solve every single real-life problem I’ve encountered in my first six months of using it.

Market basket analysis?  Ranking measures?  Standard deviation?  Many to many relationship problems?  Godawful horrible data sources?  Measures that calc according to different formulas at different levels of the pivot?  Iterating over variables that aren’t even in the view?  PowerPivot has defeated them all.  Well, more accurately, I have defeated them all with PowerPivot.  It’s not like I sit back and watch PowerPivot do its thing.  It is not always easy.  Which brings me to the next point:

Challenging and Rewarding

You know those rare occasions where you suddenly find yourself in the fast lane?  When your brain is forced to expand?  When you are truly challenged, in a good way?

I’m talking about a specific kind of challenge, the good kind.  Not the kind like when you take a new job and are overwhelmed by all the new rules and bureaucracy you have to learn.  And not even the kind when you’re learning most new technologies (HTML and XML come to mind).

The best examples of this “positive challenge” vibe from my life are the ones where I’ve been pushed by a mentor.  A couple of teachers come to mind.  Some specific coworkers as well – Zeke Koch, David Gainer, and Amir Netz most prominently.  Being around those guys everyday basically was a wakeup call – “Rob, you’ve been asleep.  Wake up, it’s time to grow, to be excellent.”

PowerPivot, oddly, has felt like that.  My brain has been expanding again, after a period of stagnation.  More specifically, PowerPivot combined with the problems I’ve been tackling has done this.

And it flows over to other areas too.  Example:  years ago when I needed to estimate the incoming query load for the football stats project, I asked around for advice, got none, and then basically just guessed.  It was a very hollow experience.  Non-excellent.

Then recently, I was presented with essentially the same challenge.  But this time, I didn’t guess, I modeled it:  estimated how many reports (and queries per report) each user would exercise at peak, built a spreadsheet, re-taught myself the Poisson and Binomial functions, and voila – a “users per server” estimate I could believe. 

I’m positive that working with PowerPivot is the difference between the “lazy guess” mentality from before and the “it can be done” mentality today.  I love it :)

Carrot, not stick (but sometimes the carrot is too big for one sitting)

I don’t want you to interpret the above as “PowerPivot is a harsh Pai Mei figure, kicking Rob’s ass day in, day out.”

Nope, I worked with PowerPivot for months, and did things I thought were pretty damn cool, without ever stretching my brain the way I have been lately.  If you’re an Excel pro, you will find PowerPivot to be a very welcoming environment.

Excel veterans:  you will never be forced to do anything uncomfortable with PowerPivot.

In fact you will do amazing things in your first week that will actually deliver greater results than standalone Excel, and it will actually seem easier than normal Excel.  Check out the CALCULATE function and you will see what I mean.

But boy, sooner or later, it will TEMPT you to try something bigger.

You’ll be sitting there one day thinking, “Gee, I sure wish I could build an analysis that showed X.”  And then something hits you – “hey wait, I bet THAT’s what that function I looked at the other day does” or “Didn’t I see a blog post last month about something similar?”

Two hours later, you’re still heads-down over your DAX formulas and relationships, feeling like you’re 30 seconds away from a breakthrough that will change your professional life.

You might not even succeed that first day.  You may have to come back tomorrow with a fresh perspective and a clear head.

And you love it.  Every minute of it.

But that’s when you realize that you have left the reservation.  You are not in Kansas anymore.  Time to take off the training wheels.  Pick your analogy.  Make no mistake – the power of DAX in particular can challenge you immensely.  Jon Udell described the examples I’ve covered here on the blog as “magic,” and he’s right – you won’t pick up everything overnight.  You should be prepared for that.

It’s called learning.  And you’ve almost forgotten what that feels like.  It’s what drew you to Excel in the first place, years ago – that fleeting glimpse of capabilities and the results it could deliver, but also frankly because you were hooked, addicted, to mastering it.

That feeling is back :)


PowerPivot FAQ roadshow at SQLPass EU

April 20, 2010

(Another guest post by Kasper – I have been in Redmond for the past week working on a few things with some of the pros back there.  Lots to report).

(Oh, and should we do this roadshow idea at the BI Conf this summer? -Rob)

At the great PowerPivot FAQ PowerPivot experts from across the globe answer all kinds of questions on PowerPivot. So my idea was wouldn’t it be great if we could have a live FAQ session at SQLPass EU! Since 2 moderators are at the SQLPass EU this year i decided to contact the Pass committee, they liked the idea as well .

So without further ado: The great PowerPivot FAQ will be coming to you at the SQLPass EU this spring.

Wesley Backelant (from Microsoft Belgium and also moderator at the FAQ)  and I will be having a PowerPivot FAQ session at the SQLKitchen at SQLPass EU. Since this idea was kinda last minute the date and time aren’t available yet but i’ll keep you posted ! Maybe we can have other experts to answer questions as well. Of course questions that aren’t in the FAQ will be added on the spot !

Hope to see you there!

UPDATE: The great PowerpivotFAQ.com session is at the sqlkitchen on SQLPASS EU, friday 11:45 until 12:45


Quick Tip: Don’t Over-Use FILTER()

April 7, 2010

The other day I was working on an updated version of my Team Playcalling report, which, by the way, look FABULOUS with Slicers:

NFL PlayCalling in PowerPivot

Only problem with it was that it was taking 90 seconds to refresh.  Unacceptable.

Howie Dickerman and Marius Dumitru, two of the Superheroes of DAX at Microsoft, pointed out that my measures were using the FILTER function in places that were not necessary.

For instance:

  [New Measure] = [Original Measure] ( 
                      FILTER(DataTable[PlayType]=”Pass”)
                                              )

Can be rewritten without the FILTER function:

  [New Measure] = [Original Measure] ( 
                      DataTable[PlayType]=”Pass”
                                              )

When I switched over to using those expressions directly, rather than the FILTER function, my 90 second refresh time dropped to about 3 seconds.

That’s a pretty significant boost in DAX measure performance from a very simple change.

Why so much faster?

The explanation from Howie made a lot of sense.  When the PowerPivot engine is evaluating a measure, it already has to take filter context from the pivot itself – row fields, slicer selections, etc.

And since that’s pretty much the #1 use case for the engine, well, applying filter context to a measure is highly optimized and fast.

So it’s not a big deal for the engine to inject another filter like I have specified in the rewritten example – it gets treated much the same as if that filter came from the pivot, as if a slicer had been set to PlayType=”Pass”

But the FILTER function, on the other hand…  well, it creates a brand-new table in memory.  I know my example just filters by one column, but the FILTER function can do some pretty amazing things, dynamically responding to current context.

Because of that power, the FILTER function must either create or update that dynamic table for every cell in the pivot where FILTER is used.  I have 256 cells in this particular pivot (32 teams times 8 measures), and originally all of them used FILTER.  But given that some of my measures are based on other measures that aren’t displayed in the pivot, my actual cell count was even higher.

Now imagine what would happen in a pivot with 5,000 rows :)

Next Football Post:  Grading the Football Project  >>


David Coe Wins Excel Monkey Contest

April 7, 2010

Awhile back we announced the Excel Monkey TShirt Contest.  All was good, but then I got a little lax about announcing the winner :)

Well, the judges (me, Denny, and Chris) huddled and it was unanimous.  The man who started it all, David Coe, takes home the fabulous prizes… once those prizes are in print :)

100% Authentic Quote from the Winner Himself

“Wow.  I am speechless.  This is truly the most magnificent thing to happen to me in days.”

     (OK, he didn’t say that.  I just made that up.  His REAL quote was:)

“Wow, I can’t believe my monkey didn’t get spanked”

     And no, I did NOT make that one up :)

Prizes

As a reminder, the prizes are three autographed books – one by Denny Lee and company, two by Mr. Excel:

Denny Book Bill Book 1 Bill Book 2

Entries Recap

Here was David’s winning entry:

Runner-Up – Entry from Dan English:

Dan English Excel Monkey Evolution

Our fine CEO at PivotStream, Jeff Elderton, submitted this entry that I think was perhaps a little tongue-in-cheek:

   jeff1 Jeff2

And Thiago Zavaschi submitted this concept sketch that he was hoping the design department here at PowerPivotPro could flesh out better.  Sadly, there is no such department.

Monkey

Congratulations David!


DAX.xml file from Colin

April 5, 2010

Rather than do the email dance forever, Colin has provided a copy of his DAX.xml file (for Notepad++ autocomplete support, as described in the previous post) in the PowerPivotFAQ samples library:

http://powerpivotfaq.com/PowerPivot%20Samples/Forms/AllItems.aspx


Notepad++ and DAX – Guest Post from Colin Banfield

April 5, 2010

 
Force 10 From Navarone

“Excuse me, sir, you don’t need 20 men, you need Miller Banfield. 
He’s an expert with explosives Excel, sir. He’s probably the best in England.”

 

 

That’s right folks, after prowling the comment threads for months now as one of the most insightful contributors, Colin Banfield of BIExcel.com is taking the plunge and sharing a full guest post with us.  This is indeed a very positive development for us all :)

Today, Colin is giving us more info on using Notepad++ to write PowerPivot formulas.  Take it away Colin :)

Using Notepad++ as an editor for creating, reading, and troubleshooting DAX formulas

Rob recently wrote an article about the difficulty he experiences editing complex formulas in PowerPivot, even though he helped to create editor used in the Measure Settings dialog box. You can read Rob’s article here http://powerpivotpro.com/2010/03/27/two-observations/. In the article, Rob noted the following as shortcomings of PowerPivot’s built-in editor:

· Can’t indent formulas
· Can’t save formulas
· No find and replace option
· Can’t change the font size

To address these issues, Rob has been using Notepad as his PowerPivot formula editor. In the above-mentioned article, Rob solicited opinions on alternative text editors. Some folks, including me, suggested Notepad++. Notepad++ is a free text editor that might be viewed as Notepad on steroids. However, the point is not that Notepad++ has more features than Notepad – it’s that you can customize the program to work specifically with DAX formulas. The purpose of this article is to explain how the customization is done, and the many benefits you gain from the effort.

You can download the latest version of Notepad++ from here http://notepad-plus.sourceforge.net/uk/site.htm. During the installation, I strongly suggest that you don’t check the option to bypass using the %APPDATA% folder for storing Notepad++ settings (the option is unchecked by default is used mainly for portable installations). If you have already installed the program and checked the option during installation, I recommend that you completely uninstall the program (including everything under the Notepad++ installation folder) and start over.

Figure 1 shows the Notepad++ main window. The file manager window on the left is a plug-in that you can install through the Plugins menu. The plug-in is called Explorer. The built-in file manager is useful when you want to quickly retrieve saved formula files.

clip_image002

Figure 1 – Notepad++ main window

The first step in customizing Notepad++ is to create a custom language for DAX. The purpose of a custom language is to provide special syntax highlighting for various elements of the language. Choose ViewàUser -Defined Dialogue…The User-Dialog box appears as shown in figure 2.

clip_image004

Figure 2 – User-Defined dialog box

Click Save As… and enter “DAX” (without the quotes) in the Name text box. After you click OK, you will notice that extension (Ext) box appears at the top right of the User-Defined dialog box. Type an extension in this box (txt, dax, foobar, whatever), and save your formula files with this extension. This ensures that when you subsequently open the file, Notepad++ applies the custom language for editing the file. You can associate more than one extension with a custom language by separating the extension names with a space in the Ext box.

After you enter an extension, you can explore the various tabs in the dialog box. My personal setup is as follows (it has changed over time):

· Folder and Default tab – As shown in figure 3.

clip_image006

Figure 3 – Folder& Default tab

The folder open and close keywords are used expand/collapse a block of code. This might be useful for delineating parts of a DAX formula that can be used as intermediate measures in a PivotTable (see figure 7).

· Keyword Lists tab – See figure 4.

clip_image008

Figure 4 – Keyword Lists tab

In the “1st Group” list box, I pasted an alphabetized list of DAX functions.

· Comment and Number tab – See figure 5.

clip_image010

Figure 5 – Comment & Number tab

· Operators tab – See figure 6.

clip_image012

Figure 6 – Operators tab

Naturally, each user will personalize these dialogs to suit his or her taste. Notepad++ saves the configuration after you exit the dialog box. To use your custom language, you must choose it from the Language menu. Custom languages appear in a separate section at the bottom of the menu. Currently (version 5.6.8 at the time of this writing), there is no way to make your custom language the default language. You must select the custom language every time you open Notepad++ and every time you open a new file. This is a very annoying and stupid limitation. If you don’t assign an associated file extension for the custom language, the syntax highlighting for the default built-in language will be applied after you open the file. However, there are many user requests to make a custom language the default language, so it’s possible that the feature will appear in a future version of the product (hopefully, very soon).

Figure 7 shows an example of using your new DAX language to edit a DAX formula. Notice the vast improvements in readability over editing the formula in the PowerPivot Measure Settings dialog box, or in Notepad. The bottom left of the windows displays the language in use.

clip_image014

Figure 7 – Example DAX formula with syntax highlighting, bookmarks (blue blobs) & folding blocks

Pretty cool, no? However, we’re just getting started. The next step in customizing the language is adding function auto-complete, function parameter tooltips and function description tooltips. You heard that right. It takes some time to create the text for the auto-complete XML file, which must have the same name as the language (is this case the name of the auto-complete file would be DAX.xml). You must place DAX.xml in the Notepad++pluginsAPIs folder, and the turn on the appropriate Auto-Completion options in the Preferences dialog box. These options are shown in figure 8.

clip_image016

Figure 8 – Auto-Completion options

I’ve created a DAX.xml file, and anyone interested in using it can download it here:
http://powerpivotfaq.com/PowerPivot%20Samples/Forms/AllItems.aspx

The next couple of figures illustrate how function auto-complete and tooltips work.

clip_image018

Figure 9 – Function auto-complete

clip_image020

Figure 10 – Function parameter and description tooltip

The only functionality you lose from PowerPivot’s built-in editor is the very useful table and field names auto-complete. You can use the PowerPivot editor to auto-complete table and field names that you paste into Notepad++. In many cases, like that shown in figure 7, you end up reusing the same fields in different parts of the formula – which is then a simple cut and paste operation.

I encourage you to explore the many other features of Notepad++ not covered in this article. Other features I find useful include:

· FileàLoad Session & File Save Session. Because Notepad++ uses a multiple document interface (each document appears in a separate tab), you can save all the formula files associated with a project as a session.

· Bookmarks. You can toggle a bookmark on and off in several ways, for example: 1) click the right-side of the line number in the margin, 2) select a line and enter Ctrl+F2. You can also bookmark all lines containing specific text. Bookmarks are useful if you want to include folding keywords (as shown in figure 7), or comments inside the DAX formula. After you bookmark the folding keyword lines and comment lines, you can delete these lines by choosing SearchàDelete Bookmarked Lines. After you delete the bookmarked lines, you can paste the formula into PowerPivot.

As a final note, I’d like to say that Notepad++ has helped tremendously in managing my calculated measures. With complex formulas, I tend to create many intermediate formulas so that I can see the calculated results in the PivotTable. I don’t want to expose these intermediate calculations to the end user, so before deploying the PowerPivot model, I have to amalgamate the intermediate formulas into a single formula. It’s very unfortunate that PowerPivot doesn’t allow you to hide calculated measures.