Post By Dany Hoter
Intro from Rob: In this post, Dany demonstrates how we can use Power Query to add a numerical index column to our table, and then use that to address the previous row in a calc column. Lots of good stuff here.
More generally, this is achievable using the misleadingly-named EARLIER function. EARLIER does not mean “previous row,” but coincidentally, you can use it to do just that, and many other “cross-row reference” things. Check out this post on EARLIER for the basics (or read the calc column chapter in Power Pivot Alchemy), and read some of the comments at the end of this post for further examples.
Take it away, Dany…
I was approached with a business question from the car fleet manager in our company.
There are many different types of cars and the fleet manager is trying to optimize the fuel cost and reduce emissions.
The data about the fuel consumption comes from a smart device installed in the car.
Every time a company car enters a gas station the device reads the car’s odometer and reports the car’s ID + the odometer + gas quantity purchased to a central database.
From this data we needed to create a report that shows average fuel consumption by make and model of the cars in the fleet. We got the data as two csv files , one with the entire history of gas transactions and one with the car fleet details.
The data model and the report are really basic and straight forward as you can see in the attached workbook which is obviously fake data created for this article.
So where is the challenge? The only problem is that each row representing a fuel transaction includes the odometer for this transaction and the question is how to calculate the distance traveled since the last visit to the pump.
The Excel way
In excel this is a very simple problem.
- Make sure the rows are ordered by car plate # and date
- Subtract the odometer value for the previous row from that of the current row checking that both rows are from the same car. Ignore the first transaction for each car.
The example uses a table and notice that the expression is a mix between structured reference (circled black) and regular reference (circled red) when the cell referenced is on a different row in the table.
One problem with this solution is that it is difficult to make it refreshable without using VBA.
Next month when there will be more data the table need to be sorted again and this is possible only from the UI or from VBA.
The Power Pivot way
Actually there is no PP way without some help. In DAX you can’t reference the previous row in any way because there is no order to the rows. There are probably solutions to this specific example using purely DAX but I’m sure that they are pretty complex.
Some Help from Power Query
I used PQ to create a new column which helps PP understand what the previous row is.
If you open the query that reads the fuel csv file, you’ll see that I sort the data and immediately add an index column to the table which persists the current order of the rows which is ascending by car and date.
Back to Power Pivot
I can use this new column in PP to calculate the difference between any row and the row next to it.
I add a calculated column with this expression:
The index column allows me to reference other rows relative to the current row using the LOOKUPVALUE function.
Using Power Query I was able to sort the data and persist the order after sort using an index column.
These two operations are part of the import operation and so will be performed each time new data arrives.
The index column enables creating a calculated column that compares data in the current row with data in the “previous” row, when previous is defined by the index column.