PowerPivot Alchemy

PowerPivot Alchemy

by Bill Jelen
     
 

View All Available Formats & Editions

Rob Collie (PowerPivotPro) and Bill Jelen (MrExcel) join forces in this combined sequel to their best-selling Power Pivot books. Alchemy sets aside the fundamentals of past books and provides a series of practical, easy-to-follow “patterns” for dozens of common analysis, reporting, and dashboarding needs. Three “bonus” chapters

Overview

Rob Collie (PowerPivotPro) and Bill Jelen (MrExcel) join forces in this combined sequel to their best-selling Power Pivot books. Alchemy sets aside the fundamentals of past books and provides a series of practical, easy-to-follow “patterns” for dozens of common analysis, reporting, and dashboarding needs. Three “bonus” chapters introduce you to the newer members of the “Power BI” family – Power Query, Power View, and Power Map. Whether you are starting your transformation from “old school” Excel User to Modern Data Professional or a veteran of the Power BI arts, this book will dramatically expand your data superpowers.

Product Details

ISBN-13:
9781615470211
Publisher:
Holy Macro! Books
Publication date:
02/01/2014
Pages:
223
Sales rank:
390,362
Product dimensions:
8.10(w) x 10.70(h) x 0.60(d)

Related Subjects

Read an Excerpt

PowerPivot Alchemy


By Rob Collie, Bill Jelen

Holy Macro! Books

Copyright © 2014 Robert Collie and Tickling Keys, Inc.
All rights reserved.
ISBN: 978-1-61547-021-1



CHAPTER 1

Dashboard Tricks and Visualization Techniques

Broadly speaking, Power Pivot is a numbers-producing machine: Raw data goes in, your formulas and relationships digest it, and magically useful numbers come out. Those numbers are often metrics on a business, and that's a beautiful thing; very often, those metrics are being "seen" for the first time in the history of the business. Quite often, before a business adopts Power Pivot, it is forced to operate without metrics that are, in hindsight, quite clearly critical. This is a lot like a doctor suddenly having access to patient information such as temperature, pulse rate, and blood pressure — after lacking that information for most of a career. Such a shift is transformational. So, Power Pivot produces numbers — incredibly important numbers that quite often have never before existed. It's empowering stuff for sure.

There is a natural tendency among "numbers" people to view the freshly pressed numbers as the final destination. Resist that temptation! In order for numbers to have an impact, they need to be communicated to other people who may not be fascinated with these magical digits. Furthermore, those people then need to translate them into action. So the way you present numbers is often every bit as critical as the numbers themselves.

This chapter provides a collection of tricks in that vein. We have intentionally selected techniques ranging from "bread and butter" (that you might use in nearly every report/dashboard) to "envelope pushing" (that you might not apply in precisely the manner presented but that might inspire related approaches). We start with one of the former.


Adding a "Last Refreshed Date" Readout

You've built some killer models and reports. You've published them to SharePoint. You've scheduled automatic refresh to run, say, once per week. You have this Power Pivot thing All. Dialed. In. But are the report consumers satisfied? Nope, they aren't satisfied!

You can be certain that your report consumers will never be satisfied. And, hey, that's kinda cool. It's a good thing. You weren't given all this new power just so you could sit on your laurels, were you? Nope. True Power Pivot pros aren't even sure they can find their laurels, much less sit on them. You will always be improving — both your skill set and the reports you produce.

One improvement you can make is to automatically inform report consumers of how "fresh" the data is so they don't have to deal with stale reports. Adding such a readout is actually pretty simple: It requires just two steps.


Step 1: Adding a Last Refreshed Measure

Somewhere in the Power Pivot window, you probably have a column whose most recent date is always the date on which the data was last refreshed. For instance, in a retail system, you might be able to use your Sales table for this, as long as you always have at least one transaction per day. Or perhaps the Calendar table you pull from the database is always current (and does not contain future dates). Or perhaps you can get your DB admin to add a single-cell table just for this purpose.

This example uses the TransactionDate column from a Sales table:

[ILLUSTRATION OMITTED]


Did you know that measures can return dates? They sure can, and it's killer useful. Here's how you create a measure that does this:

[LastRefreshed] = LASTDATE(Sales[TransactionDate])

LASTDATE() is kinda like MAX() but for dates. It always returns the most recent transaction date:

[ILLUSTRATION OMITTED]

Neat, huh? It's a date returned as a measure, in a Pivot.


Step 2: Using the Measure in a Cube Formula

The most flexible, least intrusive way to display the LastRefreshed measure in a report is to create a single cube formula for it and then stuff that formula into a single cell. That way, you have complete control over its appearance.

Two things to note here:

• Given that your reports will often have columns that are oddly sized, to make everything look good, it's much better to use a single-cell formula that includes the label than to split it across two cells.

• A cube formula, when it returns a date measure, formats it as an integer rather than as a date. So you need to reformat it as a date in the formula. (You could just use Format Cells, but when you put the label and the date in a single cell, you can't just format the cell as a date.)


To make a long story short, here is the cube formula to use in this case:

="Last Refreshed:" & TEXT(CUBEVALUE("PowerPivotData", "[Measures].[LastRefreshed]"),"mm/dd/yyyy")


You can leave the result as a date serial number and then put the label in the custom format. Use =CUBEVALUE() to return the date serial number. Select the cell, press Ctrl+1, choose Number, Custom. In the Type box, enter "Last Refreshed "m/d/yyyy:

[ILLUSTRATION OMITTED]

Plop this in the desired cell, and you're all set:

[ILLUSTRATION OMITTED]

Pretty slick. The cell will refresh every time the workbook is refreshed. But it will not be reevaluated during an update, as happens with a slicer.


Note

For more on refreshing versus updating and the huge difference in performance between the two, see http://ppvt.pro/REFRESHUPDATE.


Normalizing Your Measures to First/Average/Max Values for Charts

In this chart, the line for Total Sales is quite visible, whereas the other two lines are squashed at the bottom. Even when this chart is in color, you can't tell that the line for Transaction Size is 50 times higher than the line for Active Customers.

The three lines plotted on this chart are "sourced" from the following numbers:

[ILLUSTRATION OMITTED]

These three measures are very different in their relative sizes, varying from two digits to six digits. Data like this results in crappy charts. But with a little formula magic, you can fix the chart above so it looks like this:

[ILLUSTRATION OMITTED]

Ah, formulas make everything better. Yep, this chart uses the same data as the earlier chart, just "normalized."


A Word from the Charting Pit of Derision

Yeah, I (Rob) hear that chittering out there ... the mandibles of the demonspawn chart fiends are clacking out a sound that resembles "Secondary Axis!"

Well, I need three axes this time. Is there a Tertiary Axis feature? I seriously don't even know. There are only two sides to the chart, so it would make sense, I guess, to not offer a tertiary axis. You'd have to start "stacking" scales side by side, and that would probably make Tufte cry.

But I'm intentionally not checking whether there is such a feature. Because honestly I don't even like the Secondary Axis feature that much.

If you are absolutely sure that none of your numbers will ever be 0 or negative, you could select Layout, Axes, Primary Vertical Axes, Show Axis with Log Scale:

[ILLUSTRATION OMITTED]

This will automatically make the smaller numbers easier to see. In a log scale, the distance from 1 to 10 is the same as the distance from 10000 to 100000. However, when, in real life, would you have a data set without a single zero point?


Numbers Are Numbers, Visuals Are Visuals, and Everyone Has a Preference

I (Rob) have a theory that you are either a numbers person or a visuals person. When you first get some new data, is your first instinct "I need to crunch this data with some formulas," or is it "I need to get this on a chart"?

Now, of course, a numbers person uses charts, and a visuals person sometimes needs to write some formulas. But which one is your first instinct determines which type of person you are.

I'm a numbers guy. Charts are very much a last step in the process — if that — in my world. Heck, give me some conditional formatting in a Pivot, and I am usually set. (Tellingly, though, I always need conditional formatting before I am happy. See, even a numbers guy can leverage visuals; it's just that I am more on the numbers side of things.)

So, even in a case with two different measures, I am tempted to correct with formulas rather than track down the Secondary Axis feature. I'm really just not that comfortable with charts. I struggle to make them do what I want. And I want to work with numbers. So this is really all just personal preference on my part.


The Formulas

Let's get back to normalizing the earlier data. Basically, you divide each measure by the maximum value of that measure and put everything on a 0 to 100% scale. Here's one of the formulas you need to do this:

[Sales Indexed to Max Week] = [Total Sales Measure] / MAXX(ALL(Calendar[WeekNumberOfYear]), [Total Sales Measure])


You put Calendar[WeekNumberOfYear] in the Rows drop zone for the Pivot (which "powers" the horizontal axis of the PivotChart). So if you change the field in the Rows drop zone, you need to change that part of the formula.

The other two measures follow exactly the same pattern, but you substitute their respective base measures for Total Sales Measure.

In this Pivot, all three sets of data have been normalized to be between 0 and 100% — which yields a useful chart:

[ILLUSTRATION OMITTED]


Alternative Formulas: Using the Average

Maybe 0 to 100% is too restrictive for you. Maybe you want to divide by the average instead of the max. Here's what it looks like:

[Sales Indexed to Average Week] =
[Total Sales Measure] /
CALCULATE([Total Sales Measure] /
DISTINCTCOUNT(Calendar[WeekNumberOfYear]),
ALL(Calendar[WeekNumberOfYear)
)

Note

Again, if you have something else in the Rows drop zone, you need to replace Calendar[WeekNumberOfYear].

This yields the following, slightly different (better?) chart:

[ILLUSTRATION OMITTED]

The chart is still quite readable, but the fact that Active Customers has wider variation than the other measures is no longer hidden by being squashed into 0 to 100%.

Of course, you could also just use AVERAGEX() instead of those CALCULATE() shenanigans in the denominator. In fact, that's what does happen for Trans Size Indexed to Average Week:

[Trans Size Indexed to Average Week] =
[Transaction Size] /
AVERAGEX(ALL(Calendar[WeekNumberOfYear]),
[Transaction Size])

It doesn't make a huge difference in most cases. AVERAGEX() doesn't care whether certain weeks had higher sales volumes than others: All weeks will be averaged as equals, whereas the fancy CALCULATE() approach above computes the average as a grand total ratio, which is inherently weighted.


Building Charts That Are Dynamically Indexed to the First Value

Try using normal Excel to create a sliceable chart in which every series always begins at 100%. A chart like this is very useful for comparing the relative performance of different things over a given period of time. Setting this up is relatively labor intensive in regular Excel, and then if you want to change the selected time period, you must repeat that manual effort.

Power Pivot allows you to create such a chart with less effort than required in normal Excel, and the resulting chart responds dynamically to changing date range selections, with no modifications required.

In the following example of such a chart, notice how September is selected in the first image and October is selected in the second image. In both cases, all three series are indexed to 100% at the beginning of the time period:

[ILLUSTRATION OMITTED]

Here's the formula for the normalized measure displayed on the chart:

[Normalized Sales] = DIVIDE([Total Sales], [Sales on First Date in Range])

This seems anticlimactic, doesn't it? But there's real magic in the Sales on First Date in Range measure:

[Sales on First Date in Range] =
CALCULATE([Total Sales],
DATESBETWEEN(Calendar[Date],
[First Date in Range],
[First Date in Range]
)
)


Sorry, we keep teasing you. We like to write measures in intermediate steps like this. The real magic is in that First Date in Range measure:

[First Date in Range] =
CALCULATE(FIRSTNONBLANK(Calendar[Date], [Total Sales]),
ALL(Calendar[Date])
)


Note

We struggled, at first, to write this measure. We tried using

ALLSELECTED(Calendar[Date]), but because we were slicing by the Month column (a different column than Date) and letting Month filter the date range rather than filtering by the Date column directly,

ALLSELECTED(Calendar[Date]) was returning January 1 even when we had selected December on the month slicer. VALUES() provided another dead end. We found ourselves desiring an ALLVALUES() function before realizing that FIRSTNONBLANK() is built for this kind of thing. All's well that ends well.


Bubbling Up Exceptions with "Sarah Problem"

Sometimes, when a Pivot has multiple fields in the Rows drop zone or the Columns drop zone, interesting "outlier" values are hidden from view until the user expands (drills down into) the correct branch of the Pivot. Rather than require consumers of your Pivots to expand and scan every node, it is sometimes valuable to "bubble up" certain details from lower levels and display them at a higher level.

For example, in the following Pivot, a flag appears at the top level (Accessories), which tells you there's a problem further down. You then expand Accessories, and then Bottles and Cages, to find that Road Bottle Cage — a product with significantly negative 1-year sales growth — is the culprit:

[ILLUSTRATION OMITTED]

At the top level, the Accessories category looked quite healthy, so you may never have known there was a problem lurking deeper down. You can use "bubbling up" techniques to prevent such lurking problems. We like to call this particular example the "Sarah Problem" technique.


Starting with a "Sarah Problem" Measure

Let's say you have a measure. It doesn't matter what it is or how it's calculated really, except that it reports on whether there's a problem. It returns 1 (or Yes or True) if there's a problem, and it returns 0 (or No or False) if there isn't. Or maybe it returns a "regular" number when it crosses a certain line that your business has decided is bad. The measure helps you determine where there's a problem, and we punnily call it Sarah Problem. (To pronounce it properly, you have to add a question mark, so, technically, it's Sarah Problem?)


Note

I (Rob) can't take credit for the Sarah Problem pun. Back when I lived in Seattle, my wife (girlfriend at the time, and fellow Microsoft engineer) played roller derby. She was a "Rat City Rollergirl," playing for a team called Grave Danger, and her skater name was Natalie Fatality. And she had a teammate whose skater name was "Sarah Problem." I am not making this up.


Back to the Formulas

In this example, the story of Sarah Problem starts with a measure named 1 Year Sales Trend. The formula for 1 Year Sales Trend is completely irrelevant! You just need to know that it measures "year-over-year" sales growth or decline. So it can go as low as — 100% (for example, in the case of products that haven't sold at all this year) and can go as high as, well, there is no upper limit.

Say that you've decided that anything that falls below — 10% is bad. So you want to flag products that have declined by 10% or more.

Here you see the 1 Year Sales Trend measure circled:

[ILLUSTRATION OMITTED]

Notice that Sarah Problem is very meticulous! At the Accessories level, the trend is very positive, but it's still flagged:

[ILLUSTRATION OMITTED]

How do you create that Products in Decline column of the Pivot? Well it starts with another measure:

[Products in Decline] = COUNTROWS(FILTER(Products, [1 Year Sales Trend]<-.1))

This formula counts the number of rows in the Products table (each of which is an individual product) for which 1 Year Sales Trend is below the threshold of — 10%.

Here's how you do the conditional formatting:

[ILLUSTRATION OMITTED]

This conditional formatting rule is pretty simple: It just flags cases where there's at least one "bad" product. (Note that the Show Icon Only check box is checked.) So you get a flag whenever there's at least one "bad" product, in any level of the Pivot.

Of course, maybe you don't want to count bad products. Maybe you want to count bad stores. Or bad customers. To count anything, you just change your second measure to count rows of the proper table. Or maybe even distinct values of a column, using VALUES(column). Knock yourself out. But know that no one knocks people out like Sarah Problem.


Ranks and Exceptions That Bubble Up to Subtotals

Rather than bubble up the existence of problems or outliers, what if you want to display the best (or worst) value that can be found deeper down a particular branch of a Pivot?

Here's an example of bubbling up the best product rank under a branch:

[ILLUSTRATION OMITTED]

Here's the measure formula:

[Product Sales Rank]=
IF(HASONEVALUE(Products[ProductName]),
RANKX(ALL(Products), [Total Sales]),
MINX(VALUES(Products[ProductName]),
RANKX(ALL(Products), [Total Sales])
)
)

The first input to IF() (that is, HASONEVALUE) checks whether the current measure cell is in the context of a single product. If it is, the first branch performs a rank of that product against all other products:

[ILLUSTRATION OMITTED]


(Continues...)

Excerpted from PowerPivot Alchemy by Rob Collie, Bill Jelen. Copyright © 2014 Robert Collie and Tickling Keys, Inc.. Excerpted by permission of Holy Macro! Books.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Meet the Author

A former Microsoft engineering leader, Rob Collie is an entrepreneur, author, and consultant who believes Excel is enjoying a renaissance in power and importance - a trend that provides massive opportunities for the millions of Excel practitioners worldwide. His passion is empowering and the Excel Professional to seize those opportunities. When he's not training, consulting, or writing, Rob can be found in his laboratory, devising new tools and techniques for the Excel community. He also operates the world's leading PowerPivot website (PowerPivotPro.com) and occasionally sleeps.
Bill Jelen is the host of MrExcel.com and the author of 43 books about Microsoft Excel. He is a Microsoft MVP in Excel and a contributor to Strategic Finance Magazine.

Customer Reviews

Average Review:

Write a Review

and post it to your social network

     

Most Helpful Customer Reviews

See all customer reviews >