Creating a Histogram with a User Defined Number of Buckets

March 12, 2015

By Dany Hoter

Intro by Avi: I have often been asked, about ways to provide an “input” to the Power Pivot model from Excel. Disconnected Slicers are a popular way to do this. But with Excel and Power Pivot, there is always more than one way to accomplish a task. Dany shows us how, while making histograms easier to use. He uses a filter dropdown, which even works with Excel Online – inside a browser! Here is the end result, read on to learn how and download file.

Take it away Dany…


Creating a histogram in Excel based on Power Pivot is not as easy as it should be.

The method I use is no different from what others have already blogged and wrote about. There is even a solution that calculates the number of bins in a histogram with a formula that is based on the total number of cases.

My take on the problem was to let the user choose in run time what is the interval between each bin as a percentage and to show the number of bins accordingly.

Business Case

The model contains data about a service that is in its infancy and so the users experience a relatively high number of errors.

The managers responsible for the service posed the following request:

“We want to see a histogram of the sessions showing how many users have experienced no errors in all the sessions they initiated, how many experienced errors in 10% of the sessions, 20% of the sessions … all the way to these poor users who saw nothing but errors in 100% of the sessions (Told you it is in early stage…)


Read the rest of this entry »

How to Compare the Current Row to the Previous Row in Power Pivot

March 5, 2015

Post By Dany Hoter

Intro from Rob:  In this post, Dany demonstrates how we can use Power Query to add a numerical index column to our table, and then use that to address the previous row in a calc column.  Lots of good stuff here.

More generally, this is achievable using the misleadingly-named EARLIER function.  EARLIER does not mean “previous row,” but coincidentally, you can use it to do just that, and many other “cross-row reference” things.  Check out this post on EARLIER for the basics (or read the calc column chapter in Power Pivot Alchemy), and read some of the comments at the end of this post for further examples.

Take it away, Dany…

Business Case

I was approached with a business question from the car fleet manager in our company.

There are many different types of cars and the fleet manager is trying to optimize the fuel cost and reduce emissions.

The data about the fuel consumption comes from a smart device installed in the car.

Every time a company car enters a gas station the device reads the car’s odometer and reports the car’s ID + the odometer + gas quantity purchased to a central database.

From this data we needed to create a report that shows average fuel consumption by make and model of the cars in the fleet. We got the data as two csv files , one with the entire history of gas transactions and one with the car fleet details.

Sounds Simple

The data model and the report are really basic and straight forward as you can see in the attached workbook which is obviously fake data created for this article.

So where is the challenge? The only problem is that each row representing a fuel transaction includes the odometer for this transaction and the question is how to calculate the distance traveled since the last visit to the pump.

The Excel way

In excel this is a very simple problem.

– Make sure the rows are ordered by car plate # and date

– Subtract the odometer value for the previous row from that of the current row checking that both rows are from the same car. Ignore the first transaction for each car.


The example uses a table and notice that the expression is a mix between structured reference (circled black) and regular reference (circled red) when the cell referenced is on a different row in the table.

One problem with this solution is that it is difficult to make it refreshable without using VBA.

Next month when there will be more data the table need to be sorted again and this is possible only from the UI or from VBA.

The Power Pivot way

Actually there is no PP way without some help. In DAX you can’t reference the previous row in any way because there is no order to the rows. There are probably solutions to this specific example using purely DAX but I’m sure that they are pretty complex.

Some Help from Power Query

I used PQ to create a new column which helps PP understand what the previous row is.

If you open the query that reads the fuel csv file, you’ll see that I sort the data and immediately add an index column to the table which persists the current order of the rows which is ascending by car and date.

Back to Power Pivot

I can use this new column in PP to calculate the difference between any row and the row next to it.

I add a calculated column with this expression:






The index column allows me to reference other rows relative to the current row using the LOOKUPVALUE function.


Using Power Query I was able to sort the data and persist the order after sort using an index column.

These two operations are part of the import operation and so will be performed each time new data arrives.

The index column enables creating a calculated column that compares data in the current row with data in the “previous” row, when previous is defined by the index column.

Download Files

Cloud Data Approaching Critical Mass: Connection Cloud, SalesForce, PowerPivot, & Webinar on YouTube

December 13, 2012


Cloud Data Like SalesForce Available to PowerPivot as if it Were in a Local Database:
My Long Wish for a “Data Highway” Gets Closer Every Day
(Click for the Webinar Featuring Yours Truly on YouTube)

Flashback 2001:  The “Data Highway” Concept

Back at Microsoft in 2001 when I was working on what eventually became Excel 2003, I pitched a vision that I called “Data Highway.”  (OK, not an original name considering the Information Superhighway thing coined by Internet inventor Al Gore, but invention is smart and theft is genius, or something like that.)

The idea behind Data Highway was simple:  all relevant data made available to the most popular tools (cough cough Excel), in a convenient and refreshable format.  No manual gruntwork required to “fetch” data in other words – saving your brain for actual thinking.

imageThere were three elements to the pitch:

  1. A common internet protocol for exchanging data. 
  2. “Teaching” Excel, Access, and other tools to consume any data source exposed via that protocol.
  3. A marketplace for data where providers like Dun and Bradstreet could sell data to be piped straight into Excel.

Well the protocol flopped and our VP killed the marketplace idea before it got off the ground.  Having good ideas isn’t enough – you can’t be too early, and you also need to execute better than we did.

Fast Forward to Today

Here we are at the end of 2012, and we have all three elements available in different (but robust and real) forms:

Read the rest of this entry »

“Drill Across” in PowerPivot – Live Demo

May 24, 2012

Hyperlinks in a Pivot

“I’m telling you there are monkey-fighting hyperlinks in this Monday-to-Friday pivot!”

(Seriously this is how they cleaned up his line for TV, with “monkey-fighting” and “Monday to Friday”)

***UPDATE:  I am no longer working at Pivotstream and do not endorse their services.  All links are removed from this article but feel free to look them up if you are interested.





Retailer Competitive Overlap Application – New and Improved Live Demo

PowerPivot Retailer Competitive Overlap Application With Drill Across

Revamped/Simplified “Retailer Competitive Overlap” Application
(Note that the Row Labels Area of the Pivot Contains Hyperlinks!)

Clicking an Item to Get More Detail

The retailer overlap application is one that I’ve covered before, in my post announcing our live PowerPivot demo site, but I’ve recently spent some time improving it based on customer feedback and requests.

Specifically, our retail customers have asked the following:  “It’s great that I can see that Retailer X competes with me for our senior citizen customers much more aggressively than we thought, but can I get a list of the actual stores that overlap, with addresses?”



But WHICH Stores?  I Want to See the Addresses!

Hyperlinks in a Pivot!?

Let’s zoom in on the row area of the pivot pictured above:


Read the rest of this entry »

Datamarket: Quick Followup

April 19, 2012


“There are people out there whose jobs force them to be the place where two sources of data meet, and they are the ones who integrate and cross-reference that data to form conclusions…

…I think a lot of the world is like that.

-Bill Gates circa 2002

People are always asking me if I know Tyler Durden

I mean, I’m often asked if I ever met Bill Gates during my time at Microsoft.  I did, once, in 2002, when he wanted to review the XML features we were introducing in Excel 2003.

A few things from that meeting lodged in my head, and the quote above is one of them.  The first sentence is paraphrasing on my part as I don’t precisely remember.  But the last sentence, in italics, I remember word for word because I found it so validating of some of my own personal views and experience.

And I just realized, today, that quote should have been attached to one of the previous posts on DataMarket.

Real Reason for the Post:  People with V2 Don’t See the Same Thing?

Last week I posted a workbook that lets you download weather data from basically anywhere in the world, accompanied with instructions on how to customize it for your needs.

The workbook I provided was produced in PowerPivot V1 (SQL 2008 R2, versions 10.xx).

I received reports from a few people that when you got to the step of editing the connection in the workbook, you saw a different dialog than I saw.


Is This What You See When You Edit Connection In The Workbook I Provided?

What I See

From the previous post, here’s what I see, with the two things you need to change highlighted:


What I See

How To Make Required Changes If You Are Seeing the “Alternate” Dialog


Click Advanced


Fill In Required Info in These Three Places

That Should Do It

Let me know, again, if you have problems.  I am channeling feedback to Microsoft on this stuff so they can address any snags we hit.

Why is Microsoft paying such close attention to us in particular?

Because WE are those people where the data sources come together.

Apparently April 12 is “DataMarket Weather Day”

April 12, 2012

So cool…  Chris Webb ALSO posted today about downloading weather data from DataMarket.

And any post that starts with the words “I don’t always agree with what everything Rob Collie says” gets an immediate boost in credibility – very wise words indeed Smile


Click Image for Chris’s Post

Chris takes a different approach and goes through the full online UI rather than sharing out a pre-baked workbook like I did.  My approach was intended to make things simpler for you.  Let me know if I was successful Smile

Download 10,000 Days of Free Weather Data for Almost Any Location Worldwide

April 12, 2012


“And I feel, so much depends on the weather…
So is it raining in your bedroom retail?”

Example:  800 Days of Weather in New York City


820 Days of Weather Data from New York City, Pulled From DataMarket
(Temps in F, Precipitation in Inches)

Come on admit it.  It’s very likely that you would have a use for data like this, whether it was from a single location or for a range of locations, as long as the locations(s) were relevant to your work and the data was easy (and cheap) to get.

Good news:  I’m gonna show you how to get this same data for the location(s) you care about, for free, and make it easy for you.  Read on for the weather workbook download link and instructions.

First:  A Practical Application to Whet the Appetite

As I said in the last post, I think there’s a lot of important things to be learned if we only cross-referenced our data with other data that’s “out there.”

I happen to have access to two years of retail sales data for NYC, the same location that the weather data is from.  To disguise the sales data, I’m going to filter it down to sales of a single product, and not reveal what that product is.  Just know that it’s real, and I’m going to explore the following question:

If the weather for a particular week this year was significantly better or worse than the same week last year, does that impact sales?

Let’s take a look at what I found:

Impact of Weather on Sales of a Specific Product

RESULTS:  Weather This Year Versus Last Year (Yellow = “Better”, Blue = “Worse”),
Compared to Sales This Year Versus Last (Green = Higher, Red = Lower)

I don’t have a fancy measure yet that directly ties weather and sales together into a single correlative metric.  I’m not sure that’s even all that feasible, so we’re gonna have to eyeball it for now.

And here is what I see:  I see a band of weeks where sales were WORSE this year than last, and the weather those weeks was much BETTER this year than last.

And the strongest impact seems to be “number of snow days” – even more than temperature, a reduction in snow this year seems to correlate strongly with worse sales of this product.

Does that make sense?  I mean, when the weather is good, I would expect a typical retail location to do MORE business, especially in a pedestrian-oriented place like NYC.  And we are seeing the reverse.

Aha, but this is a product that I would expect people to need MORE of when the weather is bad, so we may in fact be onto something.  In fact this is a long-held theory of ours (and of the retailer’s), but we’ve just never been able to test it until now.

All right, let’s move on to how you can get your hands on data for your location(s).

Download the Workbook, Point it at Your Location(s)

Step 1:  Download my workbook from here.

Step 2:  Open it and find a location that interests you from this table on the first worksheet:


Nearly Four Thousand Cities Are Available on the First Sheet
(Pick One and Write Down the Location ID)

Step 3:  Open the PowerPivot Window, Open the connection for editing:


Step 4:  Replace the Location ID and Fill in your Account Key:


ALSO IMPORTANT:  Make sure the “Save My Account Key” Checkbox is Checked!

Don’t Have A DataMarket Account Key?

No problem, it’s easy to get and you only have to do it once, ever.

Step 1:  Go to

Step 2:  Click the Sign In button in the upper right to create your account:


Step 3:  Follow the Wizard, finish signing up.  (Yes it asks for payment info but you won’t be charged anything to grab the data I grabbed.)

Step 4:  Go to “My Account” and copy the account key:


Next:  Subscribe to the Free Version of the Weather Feed

Go to the data feed page for “WT360” and sign up for the free version:


I’ve Gone a Bit Crazy With This Service and Already Exhausted my Free Transactions,
But That Was NOT Easy to Do

Back in the Workbook…

Now that you’ve entered your account key, set the LocationID, subscribed to the feed, and saved the connection, you can just hit the Refresh button:


Send Me Your Mashups!

Or at least screenshots of them.  I’m Rob.  At a place called PowerPivotPro.  DotCom.

And let me know if you are having problems, too.

I’ll have some more details on Tuesday of things you can modify and extend, like how I added a “Snow Days” calculated column.

DataMarket Revisited: The Truth is Out There

April 10, 2012


How many discoveries are right under our noses,
if only we cross-referenced the right data sets?

Convergence of Multiple “Thought Streams”

Yeah, I love quoting movies.  And tv shows.  And song lyrics.  But it’s not the quoting that I enjoy – it’s the connection.  Taking something technical, for instance, and spotting an intrinsic similarity in something completely unrelated like a movie – I get a huge kick out of that.

That tendency to make connections kinda flows through my whole life – sometimes, it’s even productive and not just entertaining Smile

Anyway, I think I am approaching one of those aha/convergence moments.  It’s actually a convergence moment “squared,” because it’s a convergence moment about…  convergence.  Here are the streams that are coming together in my head:

1) “Expert” thinking is too often Narrow thinking

I’ve read a number of compelling articles and anecdotes about this in my life, most recently this one in the New York Times.  Particularly in science and medicine, you have to develop so many credentials just to get in the door that it tends to breed a rigid and less creative environment.

And the tragedy is this:  a conundrum that stumps a molecular cancer scientist might be solvable, at a glance, by the epidemiologist or the mathematician in the building next door.  Similarly, the molecular scientist might breeze over a crucial clue that would literally leap off the page at a graph theorist like my former professor Jeremy Spinrad.

2) Community cross-referencing of data/problems is a longstanding need

Flowing straight out of problem #1 above is this, need #2.  And it’s been a recognized need for a long time, by many people.

Swivel and ManyEyes Both Were Attempts at this Problem

Swivel and ManyEyes Both Were Attempts at this Problem

I remember being captivated, back in 2006-2007, with a website called  It’s gone now – and I highly recommend reading this “postmortem” interview with its two founders – but the idea was solid:  provide a place for various data sets to “meet,” and to harness the power of community to spot trends and relationships that would never be found otherwise.  (Apparently IBM did something similar with a project called ManyEyes, but it’s gone now, too).

There is, of course, even a more mundane use than “community research mashups” – our normal business data would benefit a lot by being “mashed up” with demographics and weather data (just to point out the two most obvious).

I’ve been wanting something like this forever.  As far back as 2001, when we were working on Office 2003, I was trying to launch a “data market” type of service for Office users.  (An idea that never really got off the drawing board – our VP killed it.  And, at the time, I think that was the right call).

3) Mistake:  Swivel was a BI tool and not just a data marketplace

When I discovered that Swivel was gone, before I read the postmortem, I forced myself to think of reasons why they might have failed.  And my first thought was this:  Swivel forced you to use THEIR analysis tools.  They weren’t just a place where data met.  They were also a BI tool.

And as we know, BI tools take a lot of work.  They are not something that you just casually add to your business model.

In the interview, the founders acknowledge this, but their choice of words is almost completely wrong in my opinion:


Check out the two sections I highlighted.  The interface is not that important.  And people prefer to use what they already have.  That gets me leaning forward in my chair.

YES!  People prefer to use the analysis/mashup toolset they already use.  They didn’t want to learn Swivel’s new tools, or compensate for the features it lacked.  I agree 100%.

But to then utter the words “the interface is not that important” seems completely wrong to me.  The interface, the toolset, is CRITICAL!  What they should have said in this interview, I think, is “we should not have tried to introduce a new interface, because interface is critical and the users already made up their mind.”

4) PowerPivot is SCARY good at mashups

I’m still surprised at how simple and magical it feels to cross-reference one data set against another in PowerPivot.  I never anticipated this when I was working on PowerPivot v1 back at Microsoft.  The features that “power” mashups – relationships and formulas – are pretty…  mundane.  But in practice there’s just something about it.  It’s simple enough that you just DO it.  You WANT to do it.

Remember this?

OK, it’s pretty funny.  But it IS real data.  And it DOES tell us something surprising – I did NOT know, going in, that I would find anything when I mashed up UFO sightings with drug use.  And it was super, super, super easy to do.

When you can test theories easily, you actually test them.  If it was even, say, 50% more work to mash this up than it actually was, I probably never would have done it.  And I think that’s the important point…

PowerPivot’s mashup capability passes the critical human threshold test of “quick enough that I invest the time,” whereas other tools, even if just a little bit harder, do not.  Humans prioritize it off the list if it’s even just slightly too time consuming.

Which, in my experience, is basically the same difference as HAVING a capability versus having NO CAPABILITY whatsoever.  I honestly think PowerPivot might be the only data mashup tool worth talking about.  Yeah, in the entire world.  Not kidding.

5) “Export to Excel” is not to be ignored

Another thing favoring PowerPivot as the world’s only practically-useful mashup tool:  it’s Excel.

I recently posted about how every data application in the world has an Export to Excel button, and why that’s telling.

Let’s go back to that quote from one of the Swivel founders, and examine one more portion that I think reflects a mistake:


Can I get a “WTF” from the congregation???  R and SAS but NO mention of Excel!  Even just taking the Excel Pro, pivot-using subset of the Excel audience (the people who are reading this blog), Excel CRUSHES those two tools, combined, in audience.  Crushes them.

Yeah, the mundane little spreadsheet gets no respect.  But PowerPivot closes that last critical gap, in a way that continues to surprise even me.  Better at business than anything else.  Heck, better at science too.  Ignore it at your peril.

6) But Getting the Data Needs to be Just as Simple!

So here we go.  Even in the UFO example, I had to be handed the data.  Literally.  Our CEO already HAD the datasets, both the UFO sightings and the drug use data.  He gave them to me and said “see if you can do something with this.”

There is no way I EVER would have scoured the web for these data sets, but once they were conveniently available to me, I fired up my convenient mashup tool and found something interesting.

7) DataMarket will “soon” close that last gap

In a post last year I said that Azure DataMarket was falling well short of its potential, and I meant it.  That was, and is, a function of its vast potential much more so than the “falling short” part.  Just a few usability problems that need to be plugged before it really lights things up, essentially.

On one of my recent trips to Redmond, I had the opportunity to meet with some of the folks behind the scenes.

Without giving away any secrets, let me say this:  these folks are very impressive.  I love, love, LOVE the directions in which they are thinking.  I’m not sure how long it’s going to take for us to see the results of their current thinking.

But when we do, yet another “last mile” problem will be solved, and the network effect of combining “simple access to vast arrays of useful data sets” with “simple mashup tool” will be transformative.  (Note that I am not prone to hyperbole except when I am saying negative things, so statements like this are rare from me.)

In the meantime…

While we wait for the DataMarket team’s brainstorms to reach fruition, I am doing a few things.

1) I’ve added a new category to the blog for Real-World Data Mashups.  Just click here.

2) I’m going to do share some workbooks that make consumption of DataMarket simple.  Starting Thursday I will be providing some workbooks that are pre-configured to grab interesting data sets from Data Market.  Stay tuned.

3) I’m likely to run some contests and/or solicit guest posts on DataMarket mashups.

4) I’m toying with the idea of Pivotstream offering some free access to certain DataMarket data sets in our Hosted PowerPivot offering.

See you Thursday Smile

Movie Grosses and Inflation: Methodology

April 5, 2012

In the voting from the last post, the majority of respondents found the third ranking to be most realistic:


I agree – this is the one that I personally found most consistent with reality.  “The Exorcist” surprises me – I did not expect to see it in the top ten – but it was actually higher in the other two rankings.

Mashup of Movie Grosses with Average Historical Ticket Prices

Data Set #1:  Raw Movie Grosses


Raw Movie Grosses
(Source: – Click Image for Source Data)

Data Set #2:  Average Historical Ticket Prices


Average Historical Ticket Prices
(Source:  BoxOfficeMojo – Click Image for Source Data)

Relate both of these to the Years table:


Related the Two Mashup Sets to My Common “Years” Table
(Yes, I Have Many Data Sets Now)

The Measures

Yes, it’s overkill in this case to use measures.  I could just use a calc column.  In fact I probably should.  But I have a thing for measures, they just feel more “portable” over the long haul.

[US Movie Gross] =
   MAX(MovieGrossesUS[Total Box Office])

[Avg Movie Ticket Price] =
   AVERAGE(MovieTicketPrices[Avg  Price])

[Tickets Sold Approx] =
   [US Movie Gross]/[Avg Movie Ticket Price]

Flattened Pivot

One of my favorite techniques, the flattened pivot!




Turn Off Subtotals

To get a truly flat pivot, you then have to turn off subtotals:


And that yields:


Download the Workbook!

Go ahead and grab the whole thing here: 

(This only contains movies and ticket prices though – none of my other goodies like Inflation etc.)

What were the other two rankings?

They were both based on differing measures of inflation.  If I have time tomorrow, I may do a bonus post.

Mashup: Movie Grosses vs. Inflation

April 3, 2012

Avatar vs. Episode IV - Which is REALLY the #1 of All Time?

“I Just Don’t Believe that Avatar Truly Out-Earned Star Wars Episode IV”


(Image Credit:  Channel 24.  Click Image for Original)

You KNOW It’s Been Bugging You!

You’ve probably heard by now that “Avatar” is the #1-grossing movie of all time.  And before that, “Titanic” was #1.

Neither of which, of course, is Star Wars.  Which any child of the 1970’s KNOWS is the TRUE #1.  To suggest otherwise is blasphemy.  Let’s fix this crime.

The “Official” Box Office Ticket Sales Figures Do NOT Count Inflation!

Here’s the official all-time ranking (for the United States only):


US Ticket Sales Top Ten – NOT Inflation Adjusted
(source: – click image for raw data)

You need look no further than Episode I outgrossing Episode IV to know something is amiss Smile

Notice how five of the top ten films were all released in the past eight years?  Have movies gotten that much better?  Nope.

But ticket prices have sure gone up haven’t  they?  And hey, since I now have inflation figures from all pre-loaded into a model on my desktop (see this post about Moore’s Law and Inflation), it’s time for another quick mashup.

Question to Readers:  Which of these seems most correct?

PowerPivot is amazing.  One copy-paste of movie grosses, a quick relationship to the Years table, a few simple measures, and I have JUSTICE!

Here are three alternate rankings of all-time US movies, courtesy of my PowerPivot mashup.  Which one seems most realistic to you?

Three Inflation-Corrected Movie Rankings

They all share the obviously-correct #1 movie – comforting!  But they differ quite a bit on movies 2-10.

In each of the three alternates, I used a different data set to “level set” the price of tickets to the year the movie was released.  But other than data set, it was the same methodology.  It took me a lot longer to find the data sets than it did to produce these three alternate rankings.

So, which one seems most correct to you?  Let me know, and Thursday I will reveal the data sets behind all three.

Moore’s Law and Inflation

March 13, 2012

SQL Rockstar, aka Tom LaRock (Blog | Twitter) sent me a fascinating data set the other day:  a table of different computing devices over the years, their “horsepower” in calculations per second, and how much they cost:


Source:  The Rise of the Machines

The Cost of a Million Calcs per Second, Over History

Tom remarked that some modern devices, like his iPad2, have more computing power per dollar than even some of the recent supercomputers.

To validate this, I added a calc column called “Cost per MCalc”

  =Computers[Cost]/(Computers[Comp per Second]/1000000)

And indeed, his iPad2 is cheaper per million calcs than, say, 2002’s Earth Simulator:


By a lot, too.  Like 40x cheaper per MCalc.

But then he had an astute follow-on question:  how would that change if we took inflation into account?  That’s when he tagged me in.

Enter Shadowstats!

Tom’s idea finally gave me an excuse to subscribe to  About six months ago I even emailed them and asked them whether they had considered setting themselves up on Azure Datamarket.  (Their answer:  not yet.  My answer:  I’ll be back to convince you later.)

Shadowstats provides historical data on things like inflation and employment, and provides it in a convenient format (which is in itself quite valuable – have you ever tried to make sense of the data from .gov sites?  It’s a labyrinthine mess.)


CPI Data From, Pasted
Into PowerPivot as a New Table

Wow, 11% inflation in the US in 1974, my birth year?  Wow.  That’s intense.  And it piles up quickly when you have a few years in a row of high inflation.

Cumulative Impact of Inflation

To measure cumulative impact, I added a new column:


What that shows us is this:  Prices were 65% higher at the end of 1977 than they were at the beginning of 1970.  A 65% increase in the cost of living in just eight years.

Let’s chart it:


Inflation:  In 2011 It Took $6 to Buy What $1 Bought in 1970
(Official US Govt Numbers)

Factoring Inflation Into Price per MCalc:  2011 Adjustment Factor


Tom wanted to convert everything into 2011 dollars, which makes sense.  In order to do that, I created two measures.  Cumulative Inflation Official is just the Average of the same column in the CPI table of my source data, and the Adj Factor is:

=CALCULATE([Cumulative Inflation Official], Years[Year]=2011)
    / [Cumulative Inflation Official]

In other words “take the Cumulative Inflation value for 2011 and divide it by the Cumulative Inflation value for the current year.”

Now I can use that factor to give me a 2011-Adjusted cost per MCalc:


Where that second measure is just:

=[$ per M Calc]*[Adj Factor for 2011 Dollars Official]

Which shows us that the iPad2 is an even better deal (in terms of MCalc’s), compared to the Earth Simulator, than we had originally thought – more like 50x cheaper as opposed to our original 40x cheaper.

What’s with the Playstation3?

It rounds to $0.00 per MCalc even when adjusted to 2011 dollars?  I know game systems are sold at a loss, but really?  Well, the Playstation IS the only gaming/graphics system in the data, and those are of course dedicated renderers of polygons, and NOT general-purpose calcs.  So the number is indeed a lot higher – gaming systems just offer an insane number of (polygon) calcs per second, which is why there is so much interest these days in using GPU’s for business calc purposes – if you can “transform” a biz problem into a polygon problem, you’re off to the races.

Tying it All Together

Here you go, the “final” results:


Cost of a Million Calcs per Second, Adjusted to 2011 Dollars

Why the heck does it go back UP????

Isn’t that interesting?  We see a steady and sharp decline in price per MCalc from the 1970’s all the way into 2006, but then prices start to RISE again?

The data IS skewed a bit by the fact that we only are looking at 26 computers.  So if we happened to have a supercomputer in the data in 2009 but no PC or server, that can throw us off.

Let’s take a look by type of computer, then:


All Three Computer Types Show a Price per MCalc INCREASE in the late 2000’s
(Click for Larger Version)

OK, the shapes are a bit different, and the 2006 “plunge” in price for PC/Gaming/Handheld IS indeed due to 2006 being the year where the PS3 shows up…  but all three charts DO show some form of recent price increase per M Calcs.

So what’s going on?  Our Quest for Power Meets Physics.

I’m no expert but I’ve read enough over time to have a decent idea what’s causing that rise.

For the most part over computing history, our quest hasn’t really been for “cheap.”  It has been a quest for power.  My Dell PC in 1992 was about $3,000 and offered 33 M Calcs.  Five years later I bought a $2,500 machine that offered about 300 M Calcs.  So in five years the price fell a little but the power grew a lot.image

In theory, at that point I could have bought the 1992 level of power (33 MCalcs) for $300 or so.  But that’s not what I did.  I bought a more powerful machine rather than a cheaper machine.  My roommates would have made fun of me.  Conor lorded his fancy 3d graphics card over me every time we played Quake.  I needed 300 MCalcs!

Moore’s Law is based on our ability to continually cram ever-more transistors onto a single chip.  And as we’ve gotten closer and closer to the size of atoms, we’ve hit a bit of a limit in that regard.  Moore’s Law is “stalling.”

When Moore’s Law stalls, do we stop chasing power?  Nope, we just go in a different direction.  We start going multi-core.  Multi-CPU.  And while that DOES deliver more MCalcs, at the moment, it’s a more expensive way of doing it than the old way of “keep shrinking the transistors.”

Next Up…

One of the REALLY cool things about PowerPivot is its “mashup” capability.  I’ve shown it over and over.  But now that I have a good source of inflation statistics, I can dollar-correct ANYTHING.


I Can Now Mash This CPI Model Into ANY Other Data Source
That is Associated with a Year or Date!

I’ll have another wrinkle to share about inflation later this week, and the hint is the difference between “official” and “real” inflation – you may even see some measures in the field list above.

Friday Bonus: UFO Sightings & Hallucinogen Use

October 7, 2011


I didn’t have this ready in time for the contest Microsoft was running, and I’m not sure they would have wanted this to represent PowerPivot in an “official” sense so it likely wouldn’t have won anyway.

But here it is folks – a data mashup of UFO sightings versus Hallucinogenic drug usage.  All sourced from real data, for the United States only, between 1930 and 2000.

To place everything on the same scale, all measures are “indexed” against their maximum.  UFO sightings peaked in 1999 and declined slightly in 2000 for instance, so 1999 is where the green line hits 100%.  And I indexed LSD and Ecstasy versus total Hallucinogen usage (of all types), so neither of those lines ever hits 100%

Before you go concluding that drug usage leads to UFO sightings that are merely hallucinations, remember that correlation is not causality.  It is just as likely that UFO pilots are drawn to drug usage.  They are particularly fond of spying on raves, apparently.