“Introducing” Data Explorer

March 7, 2013


In My Opinion, There Aren’t Many Things More Exciting Than a New Ribbon Tab Full of Goodies
(And One That I Can Add to Excel 2010 or 2013 – I Hear That’s Important for Some Reason)

Maybe There’s a Future in this “Numbers” Thing…

It does seem like Microsoft has figured out that data is a big deal.  Every time I turn around, I am hearing of a new software development team joining Microsoft’s efforts in the Excel/BI/Overall Data Crunching space.

Often, such teams are merely whispers – shadowy rumors on the wind.  Friends disappear from their familiar roles and reappear in places they can’t talk about.

Other times, someone new to me walks up and just hands me a piece of nearly-finished software.

This is one of those latter cases.

Some Highlights

I don’t have time for a full tour today, and honestly I haven’t even explored all of the functionality yet.  So let’s hit some highlights, shall we?

Read the rest of this entry »

Cloud Data Approaching Critical Mass: Connection Cloud, SalesForce, PowerPivot, & Webinar on YouTube

December 13, 2012


Cloud Data Like SalesForce Available to PowerPivot as if it Were in a Local Database:
My Long Wish for a “Data Highway” Gets Closer Every Day
(Click for the Webinar Featuring Yours Truly on YouTube)

Flashback 2001:  The “Data Highway” Concept

Back at Microsoft in 2001 when I was working on what eventually became Excel 2003, I pitched a vision that I called “Data Highway.”  (OK, not an original name considering the Information Superhighway thing coined by Internet inventor Al Gore, but invention is smart and theft is genius, or something like that.)

The idea behind Data Highway was simple:  all relevant data made available to the most popular tools (cough cough Excel), in a convenient and refreshable format.  No manual gruntwork required to “fetch” data in other words – saving your brain for actual thinking.

There were three elements to the pitch:

  1. A common internet protocol for exchanging data. 
  2. “Teaching” Excel, Access, and other tools to consume any data source exposed via that protocol.
  3. A marketplace for data where providers like Dun and Bradstreet could sell data to be piped straight into Excel.

Well the protocol flopped and our VP killed the marketplace idea before it got off the ground.  Having good ideas isn’t enough – you can’t be too early, and you also need to execute better than we did.

Fast Forward to Today

Here we are at the end of 2012, and we have all three elements available in different (but robust and real) forms:

Read the rest of this entry »

Datamarket: Quick Followup

April 19, 2012


“There are people out there whose jobs force them to be the place where two sources of data meet, and they are the ones who integrate and cross-reference that data to form conclusions…

…I think a lot of the world is like that.”

-Bill Gates circa 2002

People are always asking me if I know Tyler Durden

I mean, I’m often asked if I ever met Bill Gates during my time at Microsoft.  I did, once, in 2002, when he wanted to review the XML features we were introducing in Excel 2003.

A few things from that meeting lodged in my head, and the quote above is one of them.  The first sentence is paraphrasing on my part as I don’t precisely remember.  But the last sentence, in italics, I remember word for word because I found it so validating of some of my own personal views and experience.

And I just realized, today, that quote should have been attached to one of the previous posts on DataMarket.

Real Reason for the Post:  People with V2 Don’t See the Same Thing?

Last week I posted a workbook that lets you download weather data from basically anywhere in the world, accompanied by instructions on how to customize it for your needs.

The workbook I provided was produced in PowerPivot V1 (SQL 2008 R2, versions 10.xx).

I received reports from a few people that when you got to the step of editing the connection in the workbook, you saw a different dialog than I saw.


Is This What You See When You Edit Connection In The Workbook I Provided?

What I See

From the previous post, here’s what I see, with the two things you need to change highlighted:


What I See

How To Make Required Changes If You Are Seeing the “Alternate” Dialog


Click Advanced


Fill In Required Info in These Three Places

That Should Do It

Let me know, again, if you have problems.  I am channeling feedback to Microsoft on this stuff so they can address any snags we hit.

Why is Microsoft paying such close attention to us in particular?

Because WE are those people where the data sources come together.

Apparently April 12 is “DataMarket Weather Day”

April 12, 2012

So cool…  Chris Webb ALSO posted today about downloading weather data from DataMarket.

And any post that starts with the words “I don’t always agree with what everything Rob Collie says” gets an immediate boost in credibility – very wise words indeed :)


Click Image for Chris’s Post

Chris takes a different approach and goes through the full online UI rather than sharing out a pre-baked workbook like I did.  My approach was intended to make things simpler for you.  Let me know if I was successful :)

Download 10,000 Days of Free Weather Data for Almost Any Location Worldwide

April 12, 2012


“And I feel, so much depends on the weather…
So is it raining in your bedroom retail?”

Example:  800 Days of Weather in New York City


820 Days of Weather Data from New York City, Pulled From DataMarket
(Temps in F, Precipitation in Inches)

Come on, admit it.  It’s very likely that you would have a use for data like this, whether it was from a single location or for a range of locations, as long as the location(s) were relevant to your work and the data was easy (and cheap) to get.

Good news:  I’m gonna show you how to get this same data for the location(s) you care about, for free, and make it easy for you.  Read on for the weather workbook download link and instructions.

First:  A Practical Application to Whet the Appetite

As I said in the last post, I think there are a lot of important things to be learned if we only cross-referenced our data with other data that’s “out there.”

I happen to have access to two years of retail sales data for NYC, the same location that the weather data is from.  To disguise the sales data, I’m going to filter it down to sales of a single product, and not reveal what that product is.  Just know that it’s real, and I’m going to explore the following question:

If the weather for a particular week this year was significantly better or worse than the same week last year, does that impact sales?

Let’s take a look at what I found:

Impact of Weather on Sales of a Specific Product

RESULTS:  Weather This Year Versus Last Year (Yellow = “Better”, Blue = “Worse”),
Compared to Sales This Year Versus Last (Green = Higher, Red = Lower)

I don’t have a fancy measure yet that directly ties weather and sales together into a single correlative metric.  I’m not sure that’s even all that feasible, so we’re gonna have to eyeball it for now.

And here is what I see:  I see a band of weeks where sales were WORSE this year than last, and the weather those weeks was much BETTER this year than last.

And the strongest impact seems to be “number of snow days” – even more than temperature, a reduction in snow this year seems to correlate strongly with worse sales of this product.

Does that make sense?  I mean, when the weather is good, I would expect a typical retail location to do MORE business, especially in a pedestrian-oriented place like NYC.  And we are seeing the reverse.

Aha, but this is a product that I would expect people to need MORE of when the weather is bad, so we may in fact be onto something.  In fact this is a long-held theory of ours (and of the retailer’s), but we’ve just never been able to test it until now.
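
For anyone who wants to go beyond eyeballing the grid, here is a rough Python sketch of one candidate for that single correlative metric: the Pearson correlation between the year-over-year change in snow days and the year-over-year change in sales, week by week.  The weekly numbers are made-up placeholders, NOT the retailer's data, and the function and variable names are hypothetical.

```python
from math import sqrt


def yoy_change(this_year, last_year):
    """Week-by-week difference: this year minus last year."""
    return [t - l for t, l in zip(this_year, last_year)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


# Placeholder numbers only (NOT real data): snow days and units sold per week.
snow_days_this_year = [0, 1, 0, 0, 2, 0]
snow_days_last_year = [3, 2, 1, 0, 2, 4]
units_this_year = [80, 95, 90, 100, 120, 70]
units_last_year = [130, 110, 100, 100, 118, 140]

snow_delta = yoy_change(snow_days_this_year, snow_days_last_year)
sales_delta = yoy_change(units_this_year, units_last_year)
r = pearson(snow_delta, sales_delta)
print(f"correlation between snow-day change and sales change: {r:.2f}")
```

A value near +1 would mean "less snow than last year goes with lower sales than last year," which is the pattern described above.  With only a handful of weeks, treat any such number as a hint and not as proof.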

All right, let’s move on to how you can get your hands on data for your location(s).

Download the Workbook, Point it at Your Location(s)

Step 1:  Download my workbook from here.

Step 2:  Open it and find a location that interests you from this table on the first worksheet:


Nearly Four Thousand Cities Are Available on the First Sheet
(Pick One and Write Down the Location ID)

Step 3:  Open the PowerPivot Window, Open the connection for editing:


Step 4:  Replace the Location ID and Fill in your Account Key:


ALSO IMPORTANT:  Make sure the “Save My Account Key” Checkbox is Checked!

Don’t Have A DataMarket Account Key?

No problem, it’s easy to get and you only have to do it once, ever.

Step 1:  Go to https://datamarket.azure.com/

Step 2:  Click the Sign In button in the upper right to create your account:


Step 3:  Follow the Wizard, finish signing up.  (Yes it asks for payment info but you won’t be charged anything to grab the data I grabbed.)

Step 4:  Go to “My Account” and copy the account key:


Next:  Subscribe to the Free Version of the Weather Feed

Go to the data feed page for “WT360” and sign up for the free version:


I’ve Gone a Bit Crazy With This Service and Already Exhausted my Free Transactions,
But That Was NOT Easy to Do

Back in the Workbook…

Now that you’ve entered your account key, set the LocationID, subscribed to the feed, and saved the connection, you can just hit the Refresh button:


Send Me Your Mashups!

Or at least screenshots of them.  I’m Rob.  At a place called PowerPivotPro.  DotCom.

And let me know if you are having problems, too.

I’ll have some more details on Tuesday of things you can modify and extend, like how I added a “Snow Days” calculated column.
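
In the meantime, the general idea behind a "Snow Days" column can be sketched in a few lines of Python.  This is only an illustration, not the formula from the workbook: the field name `snowfall_in`, the sample rows, and the rule "any snowfall above the threshold counts as a snow day" are all assumptions.

```python
from collections import Counter
from datetime import date

# Made-up daily rows; "snowfall_in" is a hypothetical field name,
# not a column name taken from the weather feed.
daily_weather = [
    {"date": date(2011, 1, 3), "snowfall_in": 0.0},
    {"date": date(2011, 1, 4), "snowfall_in": 1.5},
    {"date": date(2011, 1, 5), "snowfall_in": 0.2},
    {"date": date(2011, 1, 10), "snowfall_in": 0.0},
    {"date": date(2011, 1, 11), "snowfall_in": 3.0},
]


def is_snow_day(row, threshold=0.0):
    """1 if the day had more snowfall than the threshold, else 0 (assumed rule)."""
    return 1 if row["snowfall_in"] > threshold else 0


def snow_days_per_week(rows):
    """Sum the per-day flag by (ISO year, ISO week)."""
    counts = Counter()
    for row in rows:
        iso_year, iso_week, _ = row["date"].isocalendar()
        counts[(iso_year, iso_week)] += is_snow_day(row)
    return dict(counts)


weekly = snow_days_per_week(daily_weather)
print(weekly)
```

The per-day flag plays the role of the calculated column, and the weekly sum is what a pivot would show when that column is aggregated by week.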

DataMarket Revisited: The Truth is Out There

April 10, 2012


How many discoveries are right under our noses,
if only we cross-referenced the right data sets?

Convergence of Multiple “Thought Streams”

Yeah, I love quoting movies.  And tv shows.  And song lyrics.  But it’s not the quoting that I enjoy – it’s the connection.  Taking something technical, for instance, and spotting an intrinsic similarity in something completely unrelated like a movie – I get a huge kick out of that.

That tendency to make connections kinda flows through my whole life – sometimes, it’s even productive and not just entertaining :)

Anyway, I think I am approaching one of those aha/convergence moments.  It’s actually a convergence moment “squared,” because it’s a convergence moment about…  convergence.  Here are the streams that are coming together in my head:

1) “Expert” thinking is too often Narrow thinking

I’ve read a number of compelling articles and anecdotes about this in my life, most recently this one in the New York Times.  Particularly in science and medicine, you have to develop so many credentials just to get in the door that it tends to breed a rigid and less creative environment.

And the tragedy is this:  a conundrum that stumps a molecular cancer scientist might be solvable, at a glance, by the epidemiologist or the mathematician in the building next door.  Similarly, the molecular scientist might breeze over a crucial clue that would literally leap off the page at a graph theorist like my former professor Jeremy Spinrad.

2) Community cross-referencing of data/problems is a longstanding need

Flowing straight out of problem #1 above is this, need #2.  And it’s been a recognized need for a long time, by many people.

Swivel and ManyEyes Both Were Attempts at this Problem


I remember being captivated, back in 2006-2007, with a website called Swivel.com.  It’s gone now – and I highly recommend reading this “postmortem” interview with its two founders – but the idea was solid:  provide a place for various data sets to “meet,” and to harness the power of community to spot trends and relationships that would never be found otherwise.  (Apparently IBM did something similar with a project called ManyEyes, but it’s gone now, too).

There is, of course, an even more mundane use than “community research mashups” – our normal business data would benefit a lot by being “mashed up” with demographics and weather data (just to point out the two most obvious).

I’ve been wanting something like this forever.  As far back as 2001, when we were working on Office 2003, I was trying to launch a “data market” type of service for Office users.  (An idea that never really got off the drawing board – our VP killed it.  And, at the time, I think that was the right call).

3) Mistake:  Swivel was a BI tool and not just a data marketplace

When I discovered that Swivel was gone, before I read the postmortem, I forced myself to think of reasons why they might have failed.  And my first thought was this:  Swivel forced you to use THEIR analysis tools.  They weren’t just a place where data met.  They were also a BI tool.

And as we know, BI tools take a lot of work.  They are not something that you just casually add to your business model.

In the interview, the founders acknowledge this, but their choice of words is almost completely wrong in my opinion:


Check out the two sections I highlighted.  The interface is not that important.  And people prefer to use what they already have.  That gets me leaning forward in my chair.

YES!  People prefer to use the analysis/mashup toolset they already use.  They didn’t want to learn Swivel’s new tools, or compensate for the features it lacked.  I agree 100%.

But to then utter the words “the interface is not that important” seems completely wrong to me.  The interface, the toolset, is CRITICAL!  What they should have said in this interview, I think, is “we should not have tried to introduce a new interface, because interface is critical and the users already made up their mind.”

4) PowerPivot is SCARY good at mashups

I’m still surprised at how simple and magical it feels to cross-reference one data set against another in PowerPivot.  I never anticipated this when I was working on PowerPivot v1 back at Microsoft.  The features that “power” mashups – relationships and formulas – are pretty…  mundane.  But in practice there’s just something about it.  It’s simple enough that you just DO it.  You WANT to do it.

Remember this?

OK, it’s pretty funny.  But it IS real data.  And it DOES tell us something surprising – I did NOT know, going in, that I would find anything when I mashed up UFO sightings with drug use.  And it was super, super, super easy to do.

When you can test theories easily, you actually test them.  If it was even, say, 50% more work to mash this up than it actually was, I probably never would have done it.  And I think that’s the important point…

PowerPivot’s mashup capability passes the critical human threshold test of “quick enough that I invest the time,” whereas other tools, even if just a little bit harder, do not.  Humans prioritize it off the list if it’s even just slightly too time consuming.

Which, in my experience, is basically the same difference as HAVING a capability versus having NO CAPABILITY whatsoever.  I honestly think PowerPivot might be the only data mashup tool worth talking about.  Yeah, in the entire world.  Not kidding.

5) “Export to Excel” is not to be ignored

Another thing favoring PowerPivot as the world’s only practically-useful mashup tool:  it’s Excel.

I recently posted about how every data application in the world has an Export to Excel button, and why that’s telling.

Let’s go back to that quote from one of the Swivel founders, and examine one more portion that I think reflects a mistake:


Can I get a “WTF” from the congregation???  R and SAS but NO mention of Excel!  Even just taking the Excel Pro, pivot-using subset of the Excel audience (the people who are reading this blog), Excel CRUSHES those two tools, combined, in audience.  Crushes them.

Yeah, the mundane little spreadsheet gets no respect.  But PowerPivot closes that last critical gap, in a way that continues to surprise even me.  Better at business than anything else.  Heck, better at science too.  Ignore it at your peril.

6) But Getting the Data Needs to be Just as Simple!

So here we go.  Even in the UFO example, I had to be handed the data.  Literally.  Our CEO already HAD the datasets, both the UFO sightings and the drug use data.  He gave them to me and said “see if you can do something with this.”

There is no way I EVER would have scoured the web for these data sets, but once they were conveniently available to me, I fired up my convenient mashup tool and found something interesting.

7) DataMarket will “soon” close that last gap

In a post last year I said that Azure DataMarket was falling well short of its potential, and I meant it.  That was, and is, a function of its vast potential much more so than the “falling short” part.  Just a few usability problems that need to be plugged before it really lights things up, essentially.

On one of my recent trips to Redmond, I had the opportunity to meet with some of the folks behind the scenes.

Without giving away any secrets, let me say this:  these folks are very impressive.  I love, love, LOVE the directions in which they are thinking.  I’m not sure how long it’s going to take for us to see the results of their current thinking.

But when we do, yet another “last mile” problem will be solved, and the network effect of combining “simple access to vast arrays of useful data sets” with “simple mashup tool” will be transformative.  (Note that I am not prone to hyperbole except when I am saying negative things, so statements like this are rare from me.)

In the meantime…

While we wait for the DataMarket team’s brainstorms to reach fruition, I am doing a few things.

1) I’ve added a new category to the blog for Real-World Data Mashups.  Just click here.

2) I’m going to share some workbooks that make consumption of DataMarket simple.  Starting Thursday I will be providing some workbooks that are pre-configured to grab interesting data sets from DataMarket.  Stay tuned.

3) I’m likely to run some contests and/or solicit guest posts on DataMarket mashups.

4) I’m toying with the idea of Pivotstream offering some free access to certain DataMarket data sets in our Hosted PowerPivot offering.

See you Thursday :)

The Ultimate Date Table

November 15, 2011


“Looks like it’s time for me to get myself a date.”

-Ace Ventura, PowerPivot Detective

The Importance of a Date/Calendar Table

I get a lot of questions from people who are struggling with the time intelligence functions in DAX.  And nine times out of ten, the answer is that they don’t have a proper date table.

I know it’s tempting.  You’ve got your sales table, and hey, there’s a Date column in there!  So you use it, and pass that column as a parameter to, say, DATESBETWEEN, or DATEADD.

Sometimes that will give you an error.  And other times, it won’t…  but the results will be funky.

You need a separate Dates table, or perhaps you prefer to call it a Calendar table.  A separate table, whose only purpose is to store dates (and the properties of dates, like DayOfWeek, etc.)  And it contains consecutive dates – no “gaps.”  Even if your business is never open on weekends, you need unbroken ranges of dates.

Oh, and then you need to relate it to your Sales table.  (Or whatever fact/measure tables you have).
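
To see why the Date column in a fact table usually isn't good enough, here is a small Python sketch (not DAX, and the sample dates are invented) that lists the calendar days missing from a set of transaction dates.  A proper date table would have none.

```python
from datetime import date, timedelta


def missing_dates(dates):
    """Return every calendar day between min and max that is absent from `dates`."""
    present = set(dates)
    if not present:
        return []
    day = min(present)
    last = max(present)
    gaps = []
    while day <= last:
        if day not in present:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


# Invented sales dates for a business that is closed on weekends.
sales_dates = [
    date(2011, 11, 10),  # Thursday
    date(2011, 11, 11),  # Friday
    date(2011, 11, 14),  # Monday
    date(2011, 11, 15),  # Tuesday
]

gaps = missing_dates(sales_dates)
print(gaps)
```

Here the Saturday and Sunday come back as gaps, which is the kind of broken range a separate, consecutive date table avoids.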

Much More Than a Single Column

A single-column table that contains merely dates is enough to make the time intelligence DAX functions operate smoothly.  But you will almost certainly want other fields too.  Like Year.  MonthName.  DayOfWeek.  The list goes on.

Maybe something like this:


And yes, you can cobble this together on your own in Excel.  Tedious work though.
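
If you would rather script it than hand-build it, a calendar table is also easy to generate.  Here is a minimal Python sketch; the column names (Date, Year, MonthNumber, MonthName, DayOfWeek) are just examples, and you would still need to import the resulting CSV into PowerPivot yourself.

```python
import csv
from datetime import date, timedelta

# Example column names only; pick whatever fields your reports need.
COLUMNS = ["Date", "Year", "MonthNumber", "MonthName", "DayOfWeek"]


def build_date_table(start, end):
    """One row per calendar day from start to end inclusive, with no gaps."""
    if end < start:
        raise ValueError("end must not be before start")
    rows = []
    day = start
    while day <= end:
        rows.append({
            "Date": day.isoformat(),
            "Year": day.year,
            "MonthNumber": day.month,
            # Month and weekday names follow the current locale setting.
            "MonthName": day.strftime("%B"),
            "DayOfWeek": day.strftime("%A"),
        })
        day += timedelta(days=1)
    return rows


def write_csv(rows, path):
    """Write the table to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


table = build_date_table(date(2011, 1, 1), date(2011, 12, 31))
print(len(table), "rows")
```

Calling `write_csv(table, "dates.csv")` (the file name is arbitrary) would save the table for import.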

Would You Like One for Free?  Try DateStream from Boyan Penev!

Imagine just being able to open up PowerPivot and always having three nice date tables awaiting import:


That’s what Boyan Penev has put together for you.  Three great calendar tables that you can download directly into PowerPivot, for free.

He published them to Azure DataMarket, a service from Microsoft where data providers can actually sell you their data sets – things like weather, demographics, etc.

Boyan did this for free though – I suspect half as a service to the community, and half as a project to learn how to provide a service on DataMarket.

It’s pretty damn cool, and really, the story should end there.  If you’ve used DataMarket before, then it DOES end there.  Go get the date tables and try them out.

But if this is your first exposure to DataMarket, it takes a few minutes to get it set up.  It’s not bad as long as you don’t make the mistakes I did.

How To Get It – Short Version

Hey, it’s on Azure DataMarket.  The URL is in the next section below, or you can just go to Azure DataMarket and search on “DateStream.”

DataMarket is going to be a wonderful service someday, but right now it has a few warts, so there is a Long Version too.

How To Get It – Long Version with Occasional Snarky Commentary

Step 1:  Go to the DateStream page on DataMarket.


Step 2:  Get confused.  OK, now is where things get choppy, because frankly, the DataMarket site itself has a terrible user interface.  I sent a full page of feedback to the DataMarket team about a month ago and as far as I can tell, they ignored it.  (Which is pure karma – I used to be one of the people at MS who ignored 90% of the feedback coming in, and now I get to be the one who is ignored).

I don’t want this to be a tutorial on how to navigate their website, or even how NOT to design a website.  So let’s just hit the highlights and try to get to Boyan’s date tables as soon as we can.

Step 3:  Get an account.  OK, this isn’t bad.  Another MS site that requires a Live ID.  Most of us have three of those by now.

Step 4a:  Scan the DateStream page looking for the “Download to PowerPivot” button or link.

Yeah that’s right.  There is no such link – you can get to one by navigating a few levels deeper but I’m going to skip that.  Don’t despair though, good things await you!

Step 4:  Find the URL of the DateStream Service

This is NOT the same as the URL of the DateStream page.  But it IS displayed on the page.  Here’s the URL you need:


OK, copy that.  You will need it.

Step 5:  Launch PowerPivot, Go Into the PowerPivot Window

And click this button:



If you don’t have that button, you need a newer version of PowerPivot.  Go get that from PowerPivot.com, then resume with the next step.

Step 6:  Fill in the Dataset URL From Step 4


Step 7:  Account Key

See that last text box in the picture above?  The one with the long code in it that I’ve partly blurred?  That’s my account key.  I highly recommend clicking that Find button.  It’s actually pretty damn useful.

Be careful – the DataMarket site has TWO long nasty codes like that for you.  One of them is the one you want, and the Find button takes you to that one:


THIS is Your Account Key

Do NOT, under any circumstances, do what I did, and confuse Account Key with Customer ID:



This is NOT Your Account Key.
Do NOT Be Tempted to Use This!

Step 8:  Click Next, and Pick Your Table or Tables


NOW we are on familiar ground.

Last Note:  Parameterization?

One thing I have not yet figured out is how to limit the date range I import.  The table starts in the year 1900, which goes back a bit far for my needs, and makes the dataset take a long time to download.

You’ll notice that when importing from DataMarket, the Preview and Filter UI lacks the filter dropdown buttons:


No Filter Dropdowns, Just Checkboxes

But the DateStream homepage DOES indicate that parameterization is possible:


So if you’ve got that figured out, drop me a note :)
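
One thing that might be worth trying, offered as an untested guess: DataMarket feeds are OData, and OData defines standard query options such as `$filter` and `$top`.  If the DateStream service honors them, appending a filter to the service URL could limit the date range.  The Python sketch below only builds such a URL and makes no request; the base URL, the table name `ExampleCalendar`, and the field `YearKey` are placeholders, not the real DateStream names.

```python
from urllib.parse import quote


def with_odata_options(service_url, table, filter_expr=None, top=None):
    """Append standard OData query options ($filter, $top) to a feed URL."""
    options = []
    if filter_expr:
        # Percent-encode the expression so spaces survive in the query string.
        options.append("$filter=" + quote(filter_expr, safe=""))
    if top is not None:
        options.append("$top=" + str(int(top)))
    url = service_url.rstrip("/") + "/" + table
    return url + ("?" + "&".join(options) if options else "")


# Placeholder URL, table, and field names (assumptions, not the real service).
url = with_odata_options(
    "https://example.invalid/DateStream/",
    "ExampleCalendar",
    filter_expr="YearKey ge 2010",
    top=5000,
)
print(url)
```

Whether PowerPivot's data feed dialog accepts a URL with query options like this is also an assumption that would need testing.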