

eBook
Available on Compatible NOOK devices, the free NOOK App and in My Digital Library.
Related collections and offers
Overview
Product Details
ISBN-13: | 9781615471492 |
---|---|
Publisher: | Holy Macro! Books |
Publication date: | 03/01/2019 |
Series: | Straight to the Point |
Sold by: | Barnes & Noble |
Format: | eBook |
Pages: | 52 |
File size: | 7 MB |
About the Author
Read an Excerpt
CHAPTER 1
FIRST, SOME BACKGROUND FOR POWER QUERY
Before we dive into Power Query examples, there are a few things you need to know.
Excel Versions
How you access Power Query depends on the version of Excel you're running:
In Excel 2016, Power Query is in the Data tab of the ribbon: It's the unintuitively named Get Data option in the Get & Transform Data group. (I won't get into the history of why this ribbon option is not actually called Power Query, but I'll assure you that Microsoft is gradually working the term Power Query back into its products.)
In Excel 2010 and 2013, Power Query is an add-in that must be downloaded from the Microsoft website: https://www.microsoft.com/en-us/download/details.aspx?id=39379.
Currently, Power Query is not available for Mac versions of Excel.
Everything Happens via Queries
It is not likely that your source data is in Power Query. Data is typically imported into Power Query from various sources such as tables, other workbooks, folders, databases, Facebook, and SharePoint.
When you're inside the Power Query Editor, you can build queries. Steps are recorded as you transform your data. In the image below, under Applied Steps, Split Column by Delimiter was added after I used the interface to split a column. Think of Applied Steps as a modern Macro Recorder with two differences from the old Macro Recorder. First, Applied Steps records in a language called M instead of VBA. Second, Applied Steps actually works reliably, whereas the VBA Macro Recorder rarely recorded production-ready code.
Although this grid looks a lot like the Excel grid, there is one major difference: You can't just click in a cell here and type a correction. Corrections to the data must happen at the source. For example, in the image below, say that you realize that in row 1 Lindy should actually be Linda. You cannot type the change here to correct it. You have to fix it in the workbook, worksheet, or table the data came from. If you realize that Lindy should be Linda in this one instance, you should head back to Excel and correct the data there. If, however, you need to globally change every Lindy to Linda, you can use Replace Values in Power Query to make the change to the column.
Data Types
Power Query is very strict about data types. Notice in the image below that there are icons to the right of the column headers. These icons tell you what each data type is:
ABC icon = text
123 icon = whole numbers
Calendar icon = dates
ABC123 icon = a mix of numbers and text
If you look at the Custom column, you can begin to understand the importance of data types. That whole column errored out when I added together the Visits and Birthday columns. On an Excel worksheet, you would get a result if you tried to combine this data. But Power Query says, "No, no, no, I'm not adding a date and a whole number together!" and it kicks out an error.
If you want to add a whole number to a date, Power Query forces you to take a deliberate action and specify whether that whole number represents days, months, or years. (You will learn more about this in the section "Adding Custom Columns with Power Query Functions and Adding Whole Numbers to Dates" on page 58)
Power Query Warnings
Power Query makes a lot of things easy, but you have to be wary. Some things aren't possible inside the Power Query Editor, and some tasks that are easy on an Excel worksheet are complicated inside Power Query. Therefore, it's important to be strategic and think several steps ahead to anticipate what might be best handled in a worksheet versus in the Power Query Editor. Keep in mind these warnings:
You cannot operate at the row or cell level. Whatever you do in your transformations will happen to the whole dataset or to an entire column.
Importing data into Power Query eliminates formatting such as font color, boldface, cell color, and borders. It is important to know this in some situations. For example, in the image below, Excel uses font color to distinguish between the name of a city and the name of a person. When you bring this data to Power Query, as shown at the far right, there is no visual clue that distinguishes a city, such as Hong Kong, from a person's name, such as Romona Lloyd.
A blank cell, null, and 0 are not the same thing. In the example below, Clyde La Croix has a 10 and 12 that should be added in the Total column as 22. Instead, Power Query shows the total as null. Why? Power Query doesn't see null as a 0.
CHAPTER 2GETTING DATA INTO POWER QUERY
How do you get data into Power Query so that you can start taking advantage of its magic? There are various ways of importing data. The main way is from a worksheet in a table or range of data.
Importing Data from a Table or Range
Say that, as shown in the image below, you have a range of addresses and need to send the data into Power Query. Here's what you do:
1. Place your cursor in the data range.
2. On the Data tab of the ribbon, select From Table/Range. The Create Table dialog box appears.
3. Verify that the data you want has been selected. In this case, the green dotted line around the dataset shows that, yes, the complete dataset B1:E14 has been selected.
4. Select the checkbox My Table Has Headers to let Power Query know that the table has headers.
5. Click OK.
Now your data is inside Power Query, as you can see below.
Note: This book primarily uses the method just shown for importing data. For the sake of brevity, whenever a data import is required, we skip the details of importing from a table or range and show the data already inside the Power Query Editor.
Importing Data from a Workbook
Importing from a workbook is really cool for a lot of reasons:
Sometimes you need to bring data together from multiple workbooks, and Power Query allows you to skip any copied/pasted or linked workbooks that contain delicate formulas.
Data may not neatly fit into a table, as in the example below. Here you can see empty rows and columns, as well as headers and symbols. There is too much going on to try to include it all in a table. A better strategy would be to save this data in its own workbook, which you could call, for example, Member Addresses.
To import the data shown above so you can work on it in Power Query, follow these steps:
1. Open a new workbook and then, from the Data tab, select Get Data > From File > From Workbook.
2. Navigate to the Member Addresses file and click Import. The Navigator opens.
3. Tell Power Query whether you want to import the whole workbook, a specific worksheet, a table, or something else. In this case, select Sheet1, and you get to see a preview of the dataset before you commit to the import.
4. Click Edit so that the data goes into Power Query and not the worksheet. (You don't want it there yet.) As shown below, you now see the data in Power Query! The entirety of Sheet1 in the Member Addresses workbook is ready for any transformations you need to make.
Next, you can start cleaning and transforming the data.
FILL UP AND FILL DOWN
One common need when you work with data is to fill holes in a dataset. For example, say that you have a report that contains details about four office locations. You can see below that Alesha works at 19 Corporate Center, her supervisor is Elise, and there are 100 employees at that location.
Say that you'd like to see this data in different ways:
You want to sort the Leadership Team column in alphabetical order and keep each person connected to the correct manager and office.
You want to match each office with the number of employees.
Excel does have Fill Up and Fill Down features, but they don't respect changes in the data. If you were to apply Fill Down to C3:C18, for example, the entire column would become Mt. Hood Park, from top to bottom, completely erasing the other three offices. Instead, to get the results you want, you can use Power Query.
With the dataset in Power Query, you can fill holes in this dataset by following these steps:
1. Highlight the Manager column.
2. Right-click and select Fill > Down.
3. The Manager column is now correctly filled in, as shown below.
4. Do the same thing to fill in the Office column with Fill Down.
5. Highlight the Total Employees column. Right-click and use Fill, Up.
6. Right-click and select Fill > Up.
7. To sort the Leadership Team column in alphabetical order, highlight the column.
8. In the Home tab of the ribbon, click the A-Z icon.
CHAPTER 3PARSING TEXT
Parsing strings of data is one of the most common and most irritating things you have to deal with when cleansing data. Power Query makes the process downright sexy. You could separate first and last names as shown below. You might think that it would then be easy to separate the names with Text-to-Columns or Flash Fill because the comma provides a consistent separation between the names.
Last Name, First Name Angus, CFP, CPA, Lindy Reynolds, Stephanie Tucker, DDS, Juliet
But then, as you can see below, trying to get from the original dataset in column B to the result in columns E and F is a pain in Excel.
In column B, those commas separating the names, professional designations, and suffixes ... Ouch!
Text-to-Columns would make a big mess here, and the traditional alternative is a complex formula using the SUBSTITUTE function. No thanks!
But Power Query gives you multiple ways of easily splitting data, as discussed next.
Splitting Names by Delimiter
To split names by delimiter, follow these steps:
1. On the Home tab, select Split Column > By Delimiter.
2. The Split Column by Delimiter box defaults to splitting by comma, but in this case, click the arrow to open the dropdown list and choose Custom.
3. Type in a comma and a single space as a custom delimiter, as shown below. (You will be able to get rid of both the comma and the space later on, as opposed to splitting by the comma and having a space to clean or trim later.)
4. Under Split At you have three brilliant options that you don't get with Text-to-Columns. In this case, choose Right-Most Delimiter because that's what separates the first name from the last name and professional designations.
5. Click OK, and you see the result shown below.
6. Now you can double-click the headers to change the titles and drag the columns to switch their positions, as shown below. This is just what you wanted!
Splitting into Rows
Sometimes you face the dreaded "everything crammed into one cell" problem, where data has been pasted into Excel from a web page or an email, and it looks like this:
In this case, it would be nice to have each student in his or her own cell in one column and matched with the session attended. Here's how you do this:
[TABLE OMITTED]
1. In Power Query, highlight the Students column.
2. On the Home tab, select Split Column > By Delimiter.
3. In the Split Column by Delimiter dialog, select Custom as the delimiter and type in a comma with a space after it. (As mentioned earlier, if you split with just a comma, you'll need an additional step to clear the spaces. This way, you can handle everything all at once.)
4. For Split At select Each Occurrence of the Delimiter.
5. Click the triangle to expand the Advanced Options section and under Split Into, select Rows.
6. Leave the Quote Character setting as is.
7. Click OK.
Below you can see the glorious result!
Did you notice? The session values were automatically paired with the names. You didn't have to do a Fill Up or Fill Down after splitting the rows!
Now you can sort by names or see if any students were in more than one session. If you have a list of the students' contact information, you can also match up that information.
SORTING IN POWER QUERY
In Power Query, sorting is a bit different than in native Excel. On the File tab of the Power Query ribbon is a Sort button like the one in Excel that you're familiar with. However, the Power Query Sort button just sorts. You don't get a dialog box with an assortment of options. So, the obvious question is: How can you sort multiple levels? Answer: Carefully!
You have to sort columns in the order in which you want them sorted. In the dataset shown below, for example, you have Names, Directors, and Hours columns, and you might want to sort three levels:
By Director, smallest to largest
By Hours, largest to smallest
By Name, smallest to largest
If you were to use the AZ and ZA buttons in Excel, you would have to sort by Name first, then Hours, then Director. But the process in Power Query is reversed.
To do this, you must select each column and then sort it, select and sort the next column, and so on:
1. Highlight the Director column and sort A-Z.
2. Select the Hours column and sort Z-A.
Note: Even though you have done two Sort operations, the only one step appears in Applied Steps. The formula bar is building a multilevel sort because the sort operations were performed in sequence.
3. Finally, select the Name column and sort A-Z.
The result:
[ILLUSTRATION OMITTED]
Note: If you look at the result of three sorts in View, Advanced Editor, the three operations created on step: #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Director", Order.Ascending}, {"Hours", Order.Descending}, {"Name", Order.Ascending}}). Because you carefully performed the sort in the proper sequence, Power Query built a multi-level sort.
CHAPTER 4MERGING DATA: JOINING AND APPENDING
This section is one I suggest studying carefully because much of what you do with data involves merging, comparing, and stacking lists.
Joins and Merging
Here are some ways you might want to use joins and merging:
To determine what's on list A that's not on list B
To match the email addresses from list B with the right people on list A
To create a master list of people who are on two different lists and match up the ones that are present on both lists Power Query gives you six joins to help you with these needs:
Full Outer Join
Left Outer Join
Right Outer Join
Inner Join
Right Anti Join
Left Anti Join
No, the names of the joins aren't intuitive. (A Left Anti Join is anti the left of what?) And when you look at the interface, there is no left or right.
Thanks to a blog post by Excel MVP Ken Puls, I came to understand what all the Inner, Outer, Anti, Left, and Right stuff is about. Here I detail a few of the six joins and leave the rest for you to explore on your own.
To see the join options in action, you need two lists where it's possible for entries to be on either list or both lists. Your lists could be, for example, people who attended an event and people who registered online.
With a quick glance at these two lists, it's easy to see a few details:
1. Al and Thomas attended but didn't sign up online.
2. Millicent registered online but didn't attend.
3. Cassie, Cricket, and Jorge both signed up online and attended.
Now let's view the lists a different way:
[ILLUSTRATION OMITTED]
If you make this view of the lists into a Venn diagram, as shown above, the left, right, and inner distinctions start to make sense.
Before you can start using joins, you need to create queries. In this, case you want to load them as connections. Follow these steps:
1. With the cursor in the Attended list in Power Query, on the Data tab select From Table/Range.
Because the list is already in a table, you are taken straight into Power Query.
Note: you could click in the Name box in Properties and rename Table2 to Attended. This is an alternate to having to rename the query later. For the sake of this example, you don't need to modify the data, so click the down arrow next to Close & Load and select Close & Load To.
3. In the Import Data dialog that appears, select Only Create Connection and click OK. The query is created.
Note: You should typically select Only Create Connection when the result of the query doesn't need to be visible in the workbook. In this example, you are making a query for the purpose of being able to merge it with another query. You don't need to see the intermediate queries, and they would just create clutter in the workbook.
4. To give the query a useful name, in the Queries & Connections pane, right-click the query you just created. Here, it's called Table2. Select Rename and call it Attended.
After renaming, you have one properly named query in the Queries & Connections pane.
5. Repeat steps 1–4 to make the query for the list of people who registered online and name that query Registered.
With both queries created, you can start merging, joining, and digging into the data.
Full Outer Join: Creating a Master List
The Full Outer Join option brings two lists together and matches all entries that have matches. Here's how you use it:
1. From the Data tab, select Get Data > Combine Queries > Merge.
2. Notice that in the Merge pane, shown below, the default is Left Outer join. Click the Join Kind down arrow and select Full Outer.
(Continues…)
Excerpted from "Cleaning Excel Data With Power Query Straight To The Point"
by .
Copyright © 2019 Tickling Keys, Inc..
Excerpted by permission of Holy Macro! Books.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.
Table of Contents
About the Author, 1,
Acknowledgments, 1,
INTRODUCTION, 1,
So, What Is Power Query?, 2,
When I Became a Believer ..., 2,
FIRST, SOME BACKGROUND FOR POWER QUERY, 3,
Excel Versions, 3,
Everything Happens via Queries, 3,
Data Types, 5,
Power Query Warnings, 5,
GETTING DATA INTO POWER QUERY, 7,
Importing Data from a Table or Range, 7,
Importing Data from a Workbook, 8,
FILL UP AND FILL DOWN, 12,
PARSING TEXT, 16,
Splitting Names by Delimiter, 17,
Splitting into Rows, 20,
SORTING IN POWER QUERY, 22,
MERGING DATA: JOINING AND APPENDING, 25,
Joins and Merging, 25,
Full Outer Join: Creating a Master List, 33,
Inner Join: Determining Who Registered and Attended, 38,
Right and Left Anti Joins: Determining What's over Here That's Not over There, 43,
RIGHT ANTI JOIN, 43,
LEFT ANTI JOIN, 43,
Left and Right Outer Joins: Power Query's VLOOKUP, 44,
LEFT OUTER JOIN, 44,
RIGHT OUTER JOIN, 44,
Appending Datasets, 45,
GROUPING DATA, 47,
REFRESHING AND UPDATING DATA, 49,
ADDING COLUMNS, 50,
Adding Conditional Columns, 50,
Adding Multiple Conditions, 52,
Adding Custom Columns, 55,
Adding Custom Columns with Power Query Functions and Adding Whole Numbers to Dates, 58,
Adding a Column Using Column from Examples, 59,
CONCLUSION, 63,