Accessing and Analyzing Data with Microsoft Excel

Accessing and Analyzing Data with Microsoft Excel

by Paul Cornell
     
 

Master the tools that transform the information in spreadsheets, databases, and servers into faster, better business decisions. Packed with expert insights and practical Your Turn exercises, this essential guide shows how to use the data analysis capabilities in Microsoft® Office applications to capture data, analyze trends, identify risks—and seize

Overview

Master the tools that transform the information in spreadsheets, databases, and servers into faster, better business decisions. Packed with expert insights and practical Your Turn exercises, this essential guide shows how to use the data analysis capabilities in Microsoft® Office applications to capture data, analyze trends, identify risks—and seize opportunities. It covers everything from mastering the basics of sorting, filtering, and formatting data to performing sophisticated what-if scenarios, analyzing multidimensional data, and even creating automated data-analysis solutions.

Discover how to turn data into results!

  • Help numbers tell the story with compelling, well-designed Excel charts, graphs, and reports
  • Use PivotTable® and PivotChart® dynamic views to detect patterns and evaluate trends
  • Use Data Analyzer to perform detailed analyses and present information in dynamic pie charts, bar graphs, and other views
  • Publish and analyze data over the Web using Office Web Components
  • Learn techniques to interpret—and exchange—XML data
  • Analyze large databases using the OLAP features in Microsoft Excel
  • Write code to automate data analysis and reporting solutions

  • Implement a database maintenance plan that helps keep data accurate, accessible, and secure

Get tools and sample files on CD, including:

  • Add-ins for Microsoft Access and Excel, including Report Manager, XML Spreadsheet, and OLAP CubeCellValue
  • Dozens of sample PivotTables and PivotCharts to experiment with
  • Office XP Web Component Toolpack
  • Multimedia Data Analyzer tutorial
  • Fully searchable eBook
  • More code, databases, and extras to put your data to work!

A Note Regarding the CD or DVD

The print version of this book ships with a CD or DVD. For those customers purchasing one of the digital formats in which this book is available, we are pleased to offer the CD/DVD content as a free download via O'Reilly Media's Digital Distribution services. To download this content, please visit O'Reilly's web site, search for the title of this book to find its catalog page, and click on the link below the cover image (Examples, Companion Content, or Practice Files). Note that while we provide as much of the media content as we are able via free download, we are sometimes limited by licensing restrictions. Please direct any questions or concerns to booktech@oreilly.com.

Product Details

ISBN-13:
9780735618954
Publisher:
Microsoft Press
Publication date:
02/15/2003
Series:
Bpg-Other Series
Edition description:
Book & CD
Pages:
416
Product dimensions:
7.38(w) x 9.00(h) x 1.23(d)

Read an Excerpt

Chapter 3.

Analyzing Data with Microsoft Excel

  • Sorting and Filtering Data
    • Sorting Data
    • Filtering Data
  • Using Worksheet Functions
  • Applying Conditional Formatting
  • Working with Charts
  • Importing External Data
    • Filtering Imported Data
  • Querying External Data
    • Querying Data on the Web
  • Analyzing Data in Lists
  • Using the Analysis ToolPak
    • The Descriptive Statistics Tool
    • The Histogram Tool
    • The Moving Average Tool
    • The Rank and Percentile Tool
    • The Sampling Tool
  • Using the Solver Add-In
    • Summary


3   Analyzing Data with Microsoft Excel

Microsoft Excel provides several features and tools that you can use to analyze data in various formats. Understanding how these features work can significantly expand the types of data analysis tasks that you can complete.

Many data analysis tasks in Excel consist of sorting, filtering, formatting, and summarizing data. More sophisticated data analysis tasks can be performed with the built-in and add-in functions that Excel supplies. In this chapter, you’ll learn basic sorting, filtering, and querying skills as well as learn about the tools you use for more sophisticated statistical analysis and "what-if" scenarios. You will not only learn how to work with data stored in Excel worksheets, but also how to bring external data into Excel—data from a server-based database or data from the Web, for example—and use Excel’s data analysis features and tools on the imported data.

As the basis for the examples in this chapter, you’ll analyze data in a spreadsheet that records occupancy rates and purchasing habits for a fictional hotel chain’s preferred guests. You should imagine yourself to be the hotel’s marketing specialist, trying to figure out how to increase business from these preferred guests by using the data analysis features of Excel. The spreadsheet you’ll use, named Hotel.xls, is included in the Chap03 folder at the location where you copied the sample files for this book.

Objectives

In this chapter, you will learn how to analyze data in Excel by

  • Using basic features for sorting, filtering, formatting, and importing data.
  • Using worksheet functions.
  • Creating and interpreting charts.
  • Importing and analyzing data from external data sources and the Web.
  • Using the basic features of the Analysis ToolPak and the Solver Add-In.

Sorting and Filtering Data

Sorting and filtering data are among the most basic data analysis tasks that you can perform. But even though these tasks are usually quick and simple, they can still yield meaningful business facts. The following sections show you how to sort and filter data records.

Sorting Data

Sorting data is a great technique when you need to

  • Find the absolute highest or lowest data value in a list of data records.
  • Find a group of values that are the top or bottom values in a list of data records.
  • Rank data values in highest-to-lowest or lowest-to-highest order.
  • Group repeating data values together.

For instance, you might want to quickly know the absolute highest sale order amount or the top three months of sales figures. You might need to figure out how a particular customer purchased items as compared to similar customers or how many times a particular product was ordered. Sorting can help you answer these types of questions. Sorting data is easy; select the cells you want to sort, click Sort on the Data menu, and then complete the information in the Sort dialog box.


TIP:
To select all the data in the active worksheet, press CTRL+A.

Your Turn

In this exercise, you want to discover who is the preferred customer with the highest total room service charges in any single month so that you can reward the customer with a $100.00 gift certificate. This type of data analysis task is best suited for a simple sort.

  1. Start Excel, and open the Hotel.xls file in the Chap03 folder.
  2. Press CTRL+A to select all the records.
  3. On the Data menu, click Sort.
  4. In the My List Has area, click Header Row.
  5. In the Sort By list, select Total Room Service.
  6. Select the Descending sort order option.
  7. Compare your results to Figure 3-1, and then click OK.
  8. Figure 3-1 The Sort dialog box. (Image unavailable)

  9. Compare your results to Figure 3-2. Edgar has the highest total room service charges, $1,835.40 in August.
  10. Figure 3-2 Customers sorted by highest room service charges in descending order. (Image unavailable)


    NOTE:
    Although you can sort cells by selecting the cells you want to sort and clicking the Sort Ascending button or Sort Descending button on the Standard toolbar, you might not get the sorting results that you expect.

Filtering Data

Filtering data is an ideal data analysis technique when you want to display a group of records that match specific conditions and hide records that don’t match. (Data is not deleted from hidden rows.) For instance, you may want to find out how many customers purchased a particular product or service or how many products or services came from a particular supplier.

Filtering data is similar to sorting, but filtering displays only the data that matches the filter conditions. You should use filtering when you want to focus your attention on specific records that match your filter conditions.

Filter conditions (also known as filter criteria) can be simple (values in your worksheet) or advanced (criteria that you enter separate from the data values in your worksheet). To filter records in the active worksheet by using simple filter criteria, click any cell in the list of records. Then, on the Data menu, point to Filter, click AutoFilter, and select one of the choices in the AutoFilter lists. The arrow in a cell turns blue to confirm that data is being filtered by criteria selected in that column. You can apply another filter to the data that remains if you want to select even more specific records. If none of the list’s choices meet your needs, you can select Custom from the AutoFilter list to display the Custom AutoFilter dialog box, in which you can specify your own filter criteria. You can use the ? and * characters to represent single and multiple "wildcard" characters, respectively, in custom filter criteria expressions. You can also click the And or Or options to specify whether matching records must meet two criteria or either of the custom filter criteria you enter.

Your Turn

In this exercise, you want to find records for all of the room service charges in August for your preferred customers. Because this group of records is related by containing an identical data value in a specific field, this task is a good candidate for a filtering operation.

  1. Open Hotel.xls. If the file is open already, close it without saving the file and open it again.
  2. On the Data menu, point to Filter and then click AutoFilter.
  3. In cell B1, the Month column, click the arrow and then click August. Only the records for August are displayed. Your results should look similar to Figure 3-3.
  4. Figure 3-3 The August data records. (Image unavailable)


    NOTE:
    The row numbers in your spreadsheet might not match the row numbers in Figure 3-3. The data should be the same, however.
    TIP:
    To remove filters from the active worksheet, on the Data menu, point to Filter and then click Show All. To remove all filters from the active worksheet and remove the arrows next to each of the row’s cells, point to Filter on the Data menu and then click AutoFilter.

As you experiment with filtering, you might discover some cases in which you aren’t able to use either the AutoFilter list or the Custom AutoFilter dialog box to set up the exact combination of filter criteria that you need. In these cases, you should specify advanced filter criteria. To filter a list of records by using advanced filter criteria, you first need to insert at least three blank rows above the list to which you want to apply the filter.


TIP:
A list to which you apply advanced filter criteria must contain a header row. Also, leave at least one blank row between the group of cells making up the advanced filter criteria and the header row of the list to which you want to apply the filter. Otherwise, Excel might not be able to determine where your filter criteria ends and the list of records begins.

After you have inserted the blank rows, in separate cells in the first blank row, type the name of each column by which you want to filter. Then, in the second and subsequent rows, type the advanced filter criteria. Click any cell in the list to which you want to apply the filter criteria, point to Filter on the Data menu, and then click Advanced Filter. Finally, provide the filter criteria in the Advanced Filter dialog box and click OK to apply the filter.

Your Turn

In this exercise, you want to move some of your Gold customers to Platinum status by finding out which customers booked more than 15 nights and spent more than $1,500.00 on room service in any single month during the first quarter of the year. Because these filter criteria involve ranges of potential data values, this is a typical task for an advanced filter.

  1. Open the Hotel.xls file. If the file is open already, close it (do not save the file) and then open the file again.
  2. Insert five blank rows above the data list. To do so, select cells A1 through A5 and then click Rows on the Insert menu.
  3. Type the information shown in Figure 3-4 into cells A1 through C4.
  4. Figure 3-4 Advanced filter criteria. (Image unavailable)

  5. Click cell A6, and then, on the Data menu, point to Filter and click Advanced Filter.
  6. In the List Range box, leave the default value of $A$6:$D$318.
  7. Click in the Criteria Range box, and then select cells A1 through C4, inclusive. Compare your results to Figure 3-5.
  8. Figure 3-5 The Advanced Filter dialog box. (Image unavailable)

  9. Click OK. The customers that booked more than 15 nights and spent more than $1,500.00 in any single month during the first quarter of the year include Davis, Finch, Ventura, Xavier, and Zimmerman, as shown in Figure 3-6.
  10. Figure 3-6 Results of running the advanced filter. (Image unavailable)

Here are additional advanced filters that you can try:

  • Customers that booked less than 5 nights or more than 15 nights in any given month. (See Figure 3-7.)
  • Figure 3-7 Advanced criteria for customers that booked less than 5 or more than 15 nights in any given month. (Image unavailable)

  • Customers that booked less than 5 nights or more than 15 nights and spent less than $100.00 or more than $1,500.00 on room service in any given month. (See Figure 3-8.)
  • Figure 3-8 Advanced criteria for customers that booked less than 5 or more than 15 nights and spent less than $100.00 or more than $1,500.00 on room service in any given month. (Image unavailable)

Putting It Together

You can use a combination of filtering and sorting to quickly reduce the amount of visual clutter while at the same time organizing the data you want to analyze. In the last three exercises, you practiced sorting and filtering data as separate data analysis tasks. In this exercise, you’ll combine sorting and filtering to find out which customer had the highest total room service charge in February. You want to display only the February room service totals and rank the totals in descending order.

  1. Open the Hotel.xls file. If the file is open already, close it (do not save the file) and open the file again.
  2. On the Data menu, point to Filter and then click AutoFilter.
  3. Click the arrow in cell B1, the Month column, and then click February.
  4. Select all of the records by pressing CTRL+A.
  5. On the Data menu, click Sort.
  6. Click the Header Row option.
  7. In the Sort By list, select Total Room Service and then click OK.
  8. Customer Davis had the highest February total room service charge ($1,590.49).

Using Worksheet Functions

Worksheet functions are built-in Excel formulas that perform calculations. An example of a simple function is the SUM formula which, as you might know, adds together the values in two or more cells. The result of the function =SUM(3, 2) is, of course, 5. In this case, the numbers 3 and 2 are called the function’s arguments, and the number 5 is called the function’s result. In technical terms, a function returns a result. The result a formula returns is displayed in the cell that contains the formula.

Very few meaningful data analysis tasks can be accomplished without calculations. Understanding how spreadsheet functions work is important; functions not only yield important business facts, but they also allow you to increase the range of data analysis tasks that you can perform. For example, by using spreadsheet functions you can

  • Display the average insurance claim amount for a given year.
  • Return the highest or lowest sales order without needing to also sort data values.
  • Display values such as In Stock, Out of Stock, Back Ordered, or Must Reorder to highlight specific product stock levels.

Here’s a list of some popular business data analysis functions, the results they return, and a description of how you will likely use them in your daily data analysis tasks.

FunctionResult
ANDReturns TRUE if all the arguments are true or FALSE if one or more of the arguments is false. This function is helpful for evaluating whether several data values, when taken as a group, meet specific business criteria. For instance, =AND(2+2=4, 1+3=3) returns FALSE.
AVERAGEReturns the average (or mean) value of the arguments. For example, =AVERAGE(3, 2, 5, 7) returns 4.25.
CEILINGReturns a number rounded up to the nearest multiple that you specify. For instance, =CEILING(15.25, 0.50) returns 15.5, while =CEILING(15.25, 1) returns 16. This function, along with the FLOOR function described later, are ideal for rounding to specific financial increments, like eighths or quarters of points or dollars.
COUNTFrom a group of cells, returns the number of cells that contain numbers. This function is helpful for ignoring cells that contain text values such as N/A or None. Contrast this function to the ISBLANK function described later.
COUNTIFFrom a group of cells, returns the number of cells that match specific conditions. This function is helpful when you need a quick sum of matching data records but you don’t want to apply a filter, which could hide some of the data records.
FLOORIn contrast to the CEILING function described earlier, returns a number rounded down to the nearest multiple that you specify.
For instance, =FLOOR(15.25, 0.50) and =FLOOR(15.25, 1) both return 15.
FREQUENCYReturns a vertical list of numbers, grouped into "bins" or "buckets," that describes how often specific data values occur within a given list. This function is useful for grouping data values without sorting the original list of data records, which could disrupt the original list’s sort order.
IFBased on a condition that you specify, returns one value if the
condition is TRUE or returns another value if the condition is FALSE. This function is ideal for displaying word pair results such as In Stock/Out Of Stock, Yes/No, and so on. For example, =IF(1+1=3, "Correct", "Incorrect") returns Incorrect.
INTStrips off the fractional part of a number. For instance, =INT(123.456) returns 123. This function is helpful in financial analyses where you want to remove portions of dollars from the result without rounding.
ISBLANKReturns TRUE if the cell referred to is blank. This function is helpful for accounting for data-entry errors in which specific cells don’t contain data.
LARGEBased on the values in a group of data values, returns the nth
largest value using the position you specify. For example, =LARGE({2100, 3300, 1000, 5000, 4575}, 3) returns 3300, the third largest value in the group of numbers.
MAXReturns the largest number in a group of data values, regardless of the number’s position. For instance, =MAX(5, 4, 3, 2, 1) returns 5.
MEDIANReturns the middle value in a group of data values. For example, =MEDIAN(99, 42, 50, 1, 3) returns 42. Note that the median is not the same as the mean (or average).
MINThe opposite of the MAX function; returns the smallest number in a group of data values, regardless of the number’s position. For instance, =MIN (5, 4, 3, 2, 1) returns 1.
MODEReturns the most frequently occurring number in a group of data values. For example, =MODE(5, 4, 3, 3, 2) returns 3. This function is helpful when you want to determine information such as the most frequently ordered product ID or product number.
ORSimilar to the AND worksheet function, returns TRUE if one or more of the arguments are true, or FALSE if all of the arguments are false. For instance, =OR(2+2=4, 1+3=3) returns TRUE.
PERCENTILEBased on the percentile that you specify, returns the percentile value of a group of data values. For example, =PERCENTILE({1, 2, 3, 4, 5}, 0.9) returns 4.6. This function is helpful in determining which data values are above a certain cutoff point, such as a sales quota.
PERCENTRANKSimilar to the PERCENTILE function; returns where a specific data value ranks percentage-wise in a group of data values. For example, =PERCENTRANK({1, 2, 3, 4, 5}, 4) returns 0.75, or 75%. This function is very helpful for ranking data values next to each other for comparisons.
ROUNDRounds a data value to a specified number of digits. For instance, =ROUND(15.755, 1) returns 15.8. This function is helpful for rounding when you don’t want to drop fractional portions of
numbers but you want some degree of accuracy.
SMALLThe opposite of the LARGE function; based on the values in a group of data values, returns the nth smallest number using the position you specify. For example, =SMALL({3300,1000,2100,5000,4575}, 2) returns 2100, the second smallest number in the group.
SUMIFSimilar to the SUM function, but adds only values that match specific conditions. This function is helpful, for instance, if you need
to sum data values for multiple sales promotion figures occurring in the same list.


TIP:
Although the examples in the preceding table used actual data values, you can substitute cell addresses, and Excel will substitute the value of the cell when the function performs the calculation. For example, if you have the value 2 in cell A1 and the value 3 in cell A2, =SUM(A1, A2) would return 5.
TIP:
For a group of cells that touch each other, you can specify the first cell’s address in the group, followed by a colon (:), followed by the last cell’s address in the group. For instance, A1:B3 refers to the group of cells A1, A2, A3, B1, B2, and B3.

Inserting a function into a worksheet cell is simple. First click the spreadsheet cell in which you want the function’s result to appear. Then, on the Insert menu, click Function. Select the function name and click OK. Fill in any empty boxes with data values, cell addresses, or other arguments, and then click OK to insert the function. Press Enter to run the function.

Functions are organized by category. If you know the function’s category, you can select the category in the Insert Function dialog box and then select the function’s name. (Excel 2000 presents only a list of function categories and function names. It doesn’t give you the ability to search for functions.) If you know the name of the function, you can type it in the Insert Function dialog box’s Search For A Function box, click Go, and then select the function’s name in the Select A Function list.


TIP:
You can also display the Insert Function dialog box by clicking the Insert Function button next to the Formula Bar. If the Formula Bar is not visible, on the View menu, click Formula Bar to display it.

Your Turn

In this exercise, you will run various functions on several groups of cells to yield additional facts about your preferred customers. You can use the information to see whether you notice any trends or anomalies in your data.

  1. Open Hotel.xls. If the file is already open, close it (do not save it) and open the file again.
  2. In cell F2, type =SUM(D2:D313) and then press Enter. This returns the total amount of room service charges for the year ($147,683.80).
  3. In cell F3, type =MAX(D2:D313) and then press Enter. This returns the highest monthly room service charge ($1,835.40).
  4. In cell F4, type =MIN(D2:D313) and then press Enter. This returns the lowest monthly room service charge ($0.00).
  5. In cell F5, type =AVERAGE(D2:D313) and then press Enter. This returns the average monthly room service charge ($473.35).
  6. In cell F6, type =LARGE(D2:D313, 2) and then press Enter. This returns the second highest monthly room service charge ($1,794.55).
  7. In cell F7, type =MEDIAN(D2:D313) and then press Enter. This returns the monthly room service charge midpoint ($384.10).
  8. Add descriptions for each of these results in column E. For example, in cell E2 type Sum of All Room Service Charges. Compare your results to Figure 3-9.
  9. Figure 3-9 Summary of room service charges. (Image unavailable)

What anomalies or trends do you notice based on this exercise? For example:

  • Some preferred customers had no room service charges in some months.
  • The average monthly room service charge was about 23 percent higher than midpoint.

What would you do, if anything, as the hotel chain’s marketing specialist, to influence these findings in the future? To answer this question, of course, you would need to know in a more in-depth manner than this sample scenario how the hotel deals with aspects of room service and how it treats its customers.

Applying Conditional Formatting

Formatting involves changing the size, font, color, bolding, underlining, and other characteristics of text. Conditional formatting involves changing text properties if specific conditions are met.

To understand conditional formatting, imagine an electronic digital kitchen timer. When the time reaches zero, usually an audible beep sounds, a bell rings, the number flashes, or something similar occurs. You can apply this same kind of behavior to the formats of one or more worksheet cells. If a cell value were less than the number 100, for instance, you could display the value in red. Similarly, if a cell value were greater than 1000, you could change the cell’s background color to green.

Applying conditional formatting is most helpful when you want to scan a list of values and note anomalies such as numbers that are above or below certain thresholds that can be harmful to your business. For example, if an insurance policy holder makes a claim for more than $5,000.00, you might want to display the claim amount in red.

In Excel, you can easily change the following cell properties:

  • The font type
  • Whether the font is bold, italic, or both
  • The font’s size
  • Whether the font has a single or double underline
  • Whether the font has strikethrough, superscripted, or subscripted formatting
  • The font color
  • To which edges of a border special line thickness will be applied
  • The border thickness
  • The border color
  • The cell’s background shading color
  • The cell’s background pattern type, such as dashes, dots, or lines

To apply conditional formatting to one or more worksheet cells, select the cells to which you want to apply the formatting. Then, on the Format menu, click Conditional Formatting. Provide the conditional formatting properties in the Condition 1 area. If you want to add more cell formatting options, click the Add button and provide the conditional formatting properties in the Condition 2 area. (You can create up to three conditional formats per cell.) Finally, click the OK button to apply the conditional formatting.

Your Turn

In this exercise, you will color red all the cells in the Nights Booked column whose value is 5 or less, apply yellow to a cell if its value is 6 to 14, and apply green if the value is 15 or more.

  1. Open Hotel.xls. If the file is already open, close it (do not save it) and open it again.
  2. Select cells C2 through C313 in column C, the Nights Booked column.
  3. On the Format menu, click Conditional Formatting.
  4. In the Condition 1 area’s Between list, select Less Than Or Equal To.
  5. In the box to the right, type 5.
  6. Click Format, and then click the Patterns tab.
  7. Click a red-colored box, and then click OK.
  8. Click Add.
  9. In the boxes next to the Condition 2 area’s Between list, type 6 in the first box and 14 in the second box.
  10. Click Format, click a yellow-colored box on the Patterns tab, and then click OK.
  11. Click Add. In the Condition 3 area’s Between list, select Greater Than Or Equal To, and then type 15 in the box to the right. Click Format, and then click a green-colored box on the Patterns tab.
  12. Click OK, and compare your results to Figure 3-10.
  13. Figure 3-10 The Conditional Formatting dialog box set up to highlight values in the number of nights booked. (Image unavailable)

  14. Click OK, and compare your results to Figure 3-11.
  15. Figure 3-11 Nights booked with conditional formatting. (Image unavailable)

Conditional formatting can really help you spot individual data values more quickly.

Working with Charts

Charts are a great way to organize, simplify, categorize, and present data. Most of us can comprehend facts better if they are presented visually or in a graphical way. You might scratch your head when you try to analyze a worksheet full of numbers, but looking at a well-designed chart makes you say "A-ha" because the data is logically grouped or summarized. Although you might have worked with charts of all types, it’s helpful to understand the terminology that Excel uses to identify the items that a chart comprises. Figure 3-12 shows a sample Excel chart. An explanation of the different parts of the chart follows.

Figure 3-12 The Abercrombie Reservations (2001) chart. (Image unavailable)

  • The chart title is Abercrombie Reservations (2001).
  • The category axis contains the months January through December. A chart can have multiple category axes.
  • The value axis contains the numbers 0 through 20. Having multiple value axes in a chart is also possible.
  • The gridlines are the lines touching the numbers 0, 2, 4, and so on, extending across the chart. These are also known as major gridlines because they touch values on the axis. A chart can also have minor gridlines drawn between major gridlines for clarity. It is possible to have category axis gridlines as well as value axis gridlines.
  • The data labels are the numbers 14, 1, 3, 9, and so on directly above each column. To show multiple data labels in a single column, a chart can use legend keys to differentiate which data label matches which portion of the column.
  • The data table consists of the numbers 14, 1, 3, 9, and so on that appear at the very bottom of the chart.
  • The legend is the box containing the words Nights Booked.

The following table provides some suggestions about what types of charts you should use when you work with data that is presented in various formats.

Chart Type Type of Data to Analyze
Column or barIdeal for charting a relatively small number of data fields that you want to display as columns or bars.
LineAlso ideal for charting a relatively small number of data fields, but a line chart is used in cases in which you want to connect the values in an unbroken line.
PieSimilar to a column or bar chart, but the chart displays the data as a pie instead of a column or a bar.
ScatterIdeal for charting two fields’ worth of data values when the values do not necessarily follow a trend that can be connected in a reasonably unbroken line.
AreaSimilar to line charts; use an Area chart when you want to fill in areas of the chart on one or both sides of the line.
DoughnutIdeal for charting a relatively greater number of fields. Values are displayed in concentric rings, and the results look somewhat similar to a pie chart.
RadarIdeal for displaying a relatively small number of fields; corresponding data values are displayed relative to a fixed data value.
SurfaceIdeal for displaying more than two fields in cases in which you want to represent values along a multidimensional surface.
BubbleIdeal for charting three fields of data; a bubble chart is similar to a scatter chart, but the size of the scatter point can vary.
StockIdeal for fields that follow an open-high-low-close-volume pattern or a similar stock valuation pattern.


NOTE:
A pie chart and a doughnut chart can have a chart title, a legend, data labels, and legend keys, but the rest of the chart components do not apply. Pie charts can also contain leader lines that extend from data labels to pie pieces for visual clarity. Radar charts, surface charts, and bubble charts do not have options for data tables. Surface charts do not have options for data labels.

Here are some examples of when you should use specific chart types:

  • If your fields consist of a customer name and a purchase amount, column, bar, line, pie, area, and radar charts are good choices.
  • If your fields consist of a customer name, cost of goods, and purchase amount, column, bar, line, area, doughnut, surface, and bubble charts are reasonable choices.

The more fields you add, the more you should lean toward doughnut and surface charts or use PivotChart reports with page fields. PivotChart reports are covered in detail in Chapter 4 and Chapter 6.

Inserting a chart into a spreadsheet is a relatively straightforward process. However, customizing and working with the various chart types may take a little getting used to. To insert a chart into a spreadsheet, select the cells containing the data values that you want to include in your chart, click the Chart Wizard button on the Standard toolbar, and then provide the chart settings as requested by the Chart Wizard.

Your Turn

In this exercise, you want to see whether your data shows any month-to-month correlation between the number of nights that customer Abercrombie books and how much room service charges are for those nights. A chart is a good choice for visualizing this type of potential correlation.

  1. Open Hotel.xls. If the file is open already, close it (do not save it) and open it again.
  2. Select cells B1 through D13, inclusive.
  3. On the Insert menu, click Chart.
  4. Click the Custom Types tab.
  5. In the Chart Type list, select Line–Column On 2 Axes.
  6. Click the Next button three times.
  7. Click the As New Sheet option, and then click Finish. Compare your results to Figure 3-13.
  8. Figure 3-13 Comparing number of nights booked to room service charges. (Image unavailable)

Is there a strong month-to-month correlation between the number of nights that customer Abercrombie books and how much room service charges are for those nights? Do a simple analysis:

  • Number of months in which the diamond is within one notch of the top of the bar: 7.
  • Number of months in which the diamond is more than one notch from the top of the bar: 5.

On the face of it, there may be a correlation. Is it a very strong one? Answering this question successfully depends on how well you understand your business.

Importing External Data

To analyze data with Excel, you must first provide the data in a format that Excel recognizes. You can do this in two ways: you can open a data source in a format that Excel knows how to work with, or you can import the data into an Excel worksheet.


TIP:
When you import data into Excel, you generally are creating a copy of the data, separate from the original data source. As a rule of thumb, if you want to see updates to the original data, you should open the data in Excel instead of importing the data into Excel.

The types of data that you can open in Excel (other than Excel files) include

  • Web pages
  • XML files (Excel 2002 only)
  • Text files
  • Microsoft Access database files
  • Lotus 1-2-3 files
  • Quattro Pro/DOS files
  • Microsoft Works 2.0 files
  • dBase files
  • Any other ODBC or OLE DB source for which you have a corresponding driver (data source connection software), including Microsoft SQL Server 2000 databases and Microsoft SQL Server 2000 Analysis Server databases.

  • NOTE:
    The following procedure does not work in Excel 2000. You can open a file or import data from a text file in Excel 2000, but you cannot import data from formats other than text files.

To import external data into a spreadsheet:

  1. Click the cell in which you want the first item of the external data to appear.
  2. On the Data menu, point to Import External Data and then click Import Data. The Select Data Source dialog box appears.
  3. If you know that the data file or data source connection file already exists, locate and click the file, click Open, and then follow the directions Excel provides to finish importing the data. For example, if your data source is an Access database, you are asked to select a specific data table or data view.
  4. To create a new data source connection file, click the New Source button. The Data Connection Wizard appears and displays the Welcome To The Data Connection Wizard page.

  5. On the wizard’s first screen, select a data source and then click Next.
  6. Depending on the type of data source you want to connect to, follow the steps to specify connection properties. For example, if you are connecting to a Microsoft SQL Server 2000 database, you are asked to provide a database logon name and password, a database name, and a data table or data view name.
  7. After you have specified the connection properties, click Finish. The Select Data Source dialog box reappears.
  8. Click the data source connection file, click Open, and then follow the on-screen directions to finish importing the data.

Filtering Imported Data

Excel allows some types of data sources to be filtered. To filter imported data after you import the data, point to Import External Data on the Data menu and then click Edit Query. One of the Query Wizard screens appears; which one depends on your specific data source. Figure 3-14 shows an example of the Choose Columns page.

Figure 3-14 The Query Wizard - Choose Columns page. (Image unavailable)

After you import data into Excel, you can begin analyzing this data as you would analyze data on any other Excel worksheet. Note that this data is a copy of the external data, and therefore changes cannot be made to the external data source from the Excel worksheet.

Querying External Data

You can use a program called Microsoft Query, included with Microsoft Excel, to query certain types of external data and bring the query results into an Excel worksheet for further data analysis. If the external data changes, you can see the updated data in the Excel worksheet. In short, importing extracts a copy of the data, while querying links directly to the data.


NOTE:
As with external data you import, when you query external data, you cannot save changes that you make to the data in Excel to the original data source. To make changes to the external data, you need to use the application in which the external data originated.

To query external data and place the query results into a worksheet:

  1. On the Data menu, point to Import External Data and then click New Database Query. The Choose Data Source dialog box appears.
  2. If your data source is visible on the Databases tab, double-click the data source and go to step 9 below. Otherwise, click New Data Source on the Databases tab and then click OK.

  3. NOTE:
    If your data is in an OLAP cube stored on a Microsoft SQL Server 2000 Analysis Server computer, see Chapter 8 for details on how to use Microsoft Query to query data in an OLAP cube.

  4. In the What Name Do You Want To Give Your Data Source box, type a name for the data source that’s descriptive yet easy for you to remember.
  5. In the Select A Driver For The Type Of Database You Want To Access list, select the type of data source from which you want to query data.
  6. Click the Connect button. The ODBC Setup dialog box appears.
  7. Provide the information in the ODBC Setup dialog box to connect to the data source (the information varies depending on the type of database you want to query), and then click OK to return to the Create New Data Source dialog box.
  8. In the Select A Default Table For Your Data Source list, click a table to use as the basis for your database query and then click OK.
  9. Click OK in the Choose Data Source dialog box to start the Query Wizard. The Choose Columns page appears.
  10. In the Available Tables And Columns list, double-click each column that you want to include in your database query to move the column to the Columns In Your Query list.

  11. TIP:
    Click the Preview Now button to see which data values will be included in the query. Click the Options button to show or hide tables and other database objects, show the objects in alphabetical order, and set other options.

  12. Click Next. The Filter Data page appears.
  13. You can filter the data to be displayed by clicking a column in the Column To Filter list and specifying filter conditions in the Only Include Rows Where area.
  14. Click Next. The Sort Order page appears.
  15. Click an entry in the Sort By list to specify a column by which to sort. Click the Ascending or Descending option to specify the sort order.

  16. TIP:
    You can specify additional sorting options by repeating the previous step with the Then By list on the Sort Order page.

  17. Click Next. The Finish page appears.
  18. Click one of the options: to return the data to an Excel worksheet, view or edit the data in Microsoft Query, or create an OLAP cube from the data that the query returns. Click Finish.

  19. NOTE:
    When you click the Create An OLAP Cube From This Query option, the OLAP Cube Wizard dialog box appears. For details about how to use the OLAP Cube Wizard dialog box, see Chapter 8.
    TIP:
    Click the Save Query button on the Finish page to save the selections that you made in the Query Wizard dialog box to a query file. This query file can then be shared with other users.

Your Turn

In this exercise, let’s assume that you’re receiving data from your hotel chain’s corporate headquarters, where data is stored on a mainframe computer. The data has been exported from the mainframe to a comma-separated value (CSV) format. The CSV file is named Hotel.csv and is located in the Chap03 folder. You want to see only those records in which customers spent more than $1,500.00 on room service in any one month.

  1. Start Excel. If Excel is already running, create a new worksheet.
  2. On the Data menu, point to Import External Data and then click New Database Query.
  3. On the Databases tab, click New Data Source and then click OK.
  4. Name your data source Hotel CSV 2.
  5. In the Select A Driver For The Type Of Database You Want To Access list, click Microsoft Text Driver.
  6. Click Connect.
  7. Clear the Use Current Directory box, and then click Select Directory.
  8. Select the Chap03 folder, and then click OK. Note that you cannot select the Hotel.csv file in this step. You will select it in the next step.
  9. Click OK in the ODBC Text Setup dialog box, and then click Hotel.csv in the Select A Default Table For Your Data Source list.
  10. Click OK to return to the Choose Data Source dialog box.
  11. On the Databases tab, click Hotel CSV 2 and then click OK.
  12. Double-click the available columns to move them to the Columns In Your Query list.
  13. Continue clicking Next on the rest of the wizard’s pages, specifying any filtering and sorting options that you want, until you get to the Finish page.
  14. Click the View Data Or Edit Query In Microsoft Query option, and then click Finish.
  15. On the Criteria menu, click Add Criteria.
  16. In the Field list, select Total Room Service.
  17. In the Operator list, select Is Greater Than.
  18. In the Value box, type 1500.
  19. Click Add, and then click Close. Your results should look similar (although perhaps not exactly the same as) Figure 3-15.
  20. Figure 3-15 The Microsoft Query user interface, where filter criteria can be added. (Image unavailable)

  21. Click the Return Data button, and then click OK to place the data into the existing worksheet. Your results should look similar to Figure 3-16.
  22. Figure 3-16 Microsoft Query data returned to Excel. (Image unavailable)

Querying Data on the Web

You can query data imported from a Web site just like you can other types of external data. For example, you may want to query sales data on your organization’s intranet site and do further data analysis on the sales data in Excel. For the best query results, the Web-based data should be displayed in a row-and-column format similar to an Excel spreadsheet.


NOTE:
The appearance of the New Web Query dialog box and the steps to configure the dialog box are different in Excel 2000 and Excel 2002, but the end result is the same.
TIP:
As with other types of external data, when you query data imported from a Web site, you cannot save changes that you make to the Web-based data in Excel to the original data source. To make changes to the original data, talk to the Webmaster or Web site administrator to see which application to use and to obtain any necessary permissions.

To query Web-based data, click the cell in the spreadsheet where you want to insert the first item of data. On the Data menu, point to Import External Data and then click New Web Query. In the Address box, type the address for the data and then click Go. The Web page containing the data appears. Click the arrows next to the data tables that you want to query, click the Import button, and then click OK.

Your Turn

In this exercise, you will import a list of preferred customer information from your hotel chain’s Web site. For simplicity’s sake, you will use the Hotel.htm file, located in the Chap03 folder, as the Web page containing the information.

  1. Start Excel. If Excel is already running, create a new worksheet.
  2. On the Data menu, point to Import External Data and then click New Web Query.
  3. In the Address box, type c:\Microsoft Press\Excel Data Analysis\Sample Files\Chap03\Hotel.htm (or the path for the location where you copied the book’s sample files) and click Go. Compare your results to Figure 3-17.
  4. Figure 3-17 The New Web Query dialog box. (Image unavailable)

  5. Click the arrow next to the Gold Customer Name field. Compare your results to Figure 3-18, and then click Import.
  6. Figure 3-18 Selecting the preferred customer data table. (Image unavailable)

  7. Click OK to display the data in the existing worksheet. You can now begin analyzing this data as you would analyze data on any other Excel worksheet.

Analyzing Data in Lists

As you learned in Chapter 2, data lists are highly structured, and they share common field names and data values. Because of this, many of Excel’s features, although they are not considered true data analysis tools, can still provide quick facts about data lists. This section provides some additional, miscellaneous tasks that you can perform with lists.

To reiterate, for the best results when using these data list features, be sure of the following:

  • The data values in a data list share the same field names.
  • The data values for each field are presented consistently.
  • The data values have some type of data in each field, even if the data value is 0, N/A, or Unknown.
  • Only one list appears on each worksheet.
  • Field names are descriptive but not verbose.
  • Fields are separated into their most basic parts.

One helpful data analysis task for lists of values is to add subtotals to the list. Adding subtotals is much simpler and faster than inserting numerous SUM or SUMIF worksheet functions. To add subtotals, select the data you want to summarize, click Subtotals on the Data menu, and then provide the subtotal settings in the Subtotal dialog box.

Subtotals work best when the data contains a header row. You should also perform any required sorting or filtering of the data before adding subtotals. Use the outline buttons and the plus and minus buttons to show or hide data and subtotals. These buttons are located to the left of the row number indicators, as shown in Figure 3-19.

Figure 3-19 Use the outline buttons and the plus and minus buttons to the left of the row number indicators to work with subtotals. (Image unavailable)


TIP:
To remove subtotals, click Subtotals on the Data menu and then click Remove All in the Subtotal dialog box.

Your Turn

To determine any visible trends, you want to quickly subtotal the number of nights booked and the room service charges for each preferred customer for the entire year.

  1. Open Hotel.xls. If the file is already open, close it (do not save it) and open the file again.
  2. Click cell A1, the column heading for Gold Customer Name.
  3. On the Data menu, click Subtotals.
  4. In the At Each Change In list, select Gold Customer Name.
  5. In the Add Subtotal To list, select the Nights Booked and Total Room Service options.
  6. Click OK.
  7. Click outline button 2 to display subtotals by nights booked and total room service. Compare your results with Figure 3-20.
  8. Figure 3-20 Subtotals displayed by nights booked and total room service. (Image unavailable)

Another helpful data analysis task to perform with lists is to quickly display simple function results in the Excel status bar for selected cell groups. To do so, select a group of cells (the cells should all be in the same column), and then right-click anywhere in the status bar to display a list of common data analysis functions. (See Figure 3-21.)

Figure 3-21 A list of simple status bar functions. (Image unavailable)

If you click a function, such as Sum, the status bar shows the simple function’s result, as you can see in Figure 3-22.


TIP:
To display the status bar if it is not visible, click Options on the Tools menu. On the View tab, select the Status Bar check box and then click OK.

Figure 3-22 The sum of the subtotals for total room service is displayed in the status bar. (Image unavailable)

Your Turn

You want to display simple function results for customer Abercrombie.

  1. Open the Hotel.xls file. If the file is already open, close it (do not save it) and open it again.
  2. Select cells D2 through D13 in column D.
  3. Right-click anywhere in the status bar, and then click Average. The average month’s room service charge was $630.l9.

Experiment with other summary functions such as Count, Max, Min, and Sum.

Using the Analysis ToolPak

Excel includes a set of data analysis tools, called the Analysis ToolPak, that you can use to develop complex statistical or engineering analyses. Although this book is not geared toward statisticians, engineers, mathematicians, or academic researchers, this section will describe some of the functions in the Analysis ToolPak that are very helpful for analyzing business data. Because the Analysis ToolPak is not very widely understood, this section details the components of common Analysis ToolPak dialog boxes and demonstrates how to use these dialog boxes in common data analysis scenarios. To use the Analysis ToolPak, simply click Data Analysis on the Tools menu. (See the following if the command is not on the menu.) Click a tool in the list, and then complete the information in the resulting dialog box to perform the analysis.

Installing the Analysis ToolPak

If the Data Analysis command doesn’t appear on the Tools menu, click Add-Ins on the Tools menu, select the Analysis ToolPak check box, and then click OK. If the Analysis ToolPak check box is not visible, be sure you have your original Microsoft Office or Excel installation media handy, double-click the Add/Remove Programs icon in Control Panel, and do one of the following:

For Microsoft Windows 2000, Windows Millennium Edition, and Windows XP:

  • If you installed Excel as part of Microsoft Office, click Microsoft Office in the Currently Installed Programs box and then click the Change button.
  • If you installed Excel individually, click the Excel program entry in the Currently Installed Programs box and then click the Change button.
  • For Microsoft Windows 98 and Windows NT 4.0:
  • If you installed Excel as part of Microsoft Office, click Microsoft Office on the Install/Uninstall tab and then click the Add/Remove button.
  • If you installed Excel individually, click the Excel program entry on the Install/Uninstall tab and then click the Add/Remove button.

Follow the instructions on the screen. The Analysis ToolPak can be found by expanding the Microsoft Excel for Windows node and then expanding the Add-Ins node.

Some of the useful Analysis ToolPak tools for business analysis include the following:

  • Descriptive Statistics, which provides a quick list of summarizations such as minimum, maximum, median, mode, and other results.
  • Histogram, which provides a categorization of data values into similar groupings, or bins.
  • Moving Average, which provides a series of data values averaged over time, for use in forecasting or trending.
  • Rank and Percentile, which provides a list of how data values compare or rank against each other.
  • Sampling, which selects a random number of representative data values. Sampling is especially helpful when you want to quickly analyze a smaller number of data values that you believe are representative of a much larger group of data values.

The rest of this section describes how to use these tools. In any of the dialog boxes for these individual Analysis ToolPak functions, you’ll need to provide the following information:

  • Input Range   In this box, type the cell address for the group of data values that you want to analyze.
  • Grouped By    (part of the Descriptive Statistics and Rank and Percentile dialog boxes). Select the Columns option if your data is grouped by columns; select the Rows option if your data is grouped by rows.
  • Labels In First Row    (simply the Labels option in the Histogram and Sampling dialog boxes). Select this option if the first row or column of your input range contains field names.
  • Chart Output    (part of the Histogram and Moving Average dialog boxes). Select this option to display a chart along with the resulting data table.
  • Output Range   Select this option to place the results in the active worksheet. In the corresponding text box, enter the cell address of the cell where you want the results to start. Excel will display a message before it tries to place results in any cell that already contains a data value.
  • New Worksheet Ply   Use this option to place the results in a new worksheet in the active workbook. Type the name of the worksheet in the corresponding box.
  • New Workbook   Use this option to place the results in a new workbook.

The Descriptive Statistics Tool

To use the Descriptive Statistics tool, on the Tools menu, click Data Analysis. Click Descriptive Statistics, and then click OK. The Descriptive Statistics dialog box is shown in Figure 3-23.

Figure 3-23 The Descriptive Statistics dialog box. (Image unavailable)

In addition to the input range and other standard options, use the following options in the dialog box depending on the information you’re analyzing:

  • Summary Statistics   Select this option if you want Excel to produce the mean, standard error, median, mode, standard deviation, variance, kurtosis (a measure of how data is distributed), skewness (the degree of data distribution), and other related statistics for the selected group of cells.
  • Confidence Level For Mean   Select this option if you want to measure the confidence level for the mean summarization. For example, if the mean for a cell group of 10 values is 50, the standard deviation is 38.7, and you want a 95 percent confidence level, the result is 24. This means that 95 percent of the time, any single data value in the cell group should fall between the range of data values 26 and 74.
  • Kth Largest   Use this option to include the kth largest value in the cell group in the result. In the corresponding box, type the number to use for k. If you type 3, for example, the result is the third largest value in the cell group.
  • Kth Smallest   Select this option to include the kth smallest value in the cell group. If you type 5 for example, the results will show the fifth smallest value in the group of cells.

Your Turn

In this exercise, you will run the Descriptive Statistics tool on the total nights booked for all customers for all months.

  1. Open Hotel.xls. If it is already open, close it (do not save it) and open it again.
  2. On the Tools menu, click Data Analysis, click Descriptive Statistics, and then click OK.
  3. Click the Input Range box, and then select cells C2 through C313.
  4. In the Grouped By area, click Columns.
  5. Click the New Worksheet Ply option.
  6. Select the Summary Statistics and Confidence Level For Mean check boxes.
  7. Check the Kth Largest option, and then type 5 in the adjacent box.
  8. Check the Kth Smallest option, and then type 7 in the adjacent box. Compare your results to Figure 3-24.
  9. Figure 3-24 Completing the Descriptive Statistics dialog box. (Image unavailable)

  10. Click OK, and then compare your results to Figure 3-25.
  11. Figure 3-25 Results of running the Descriptive Statistics tool. (Image unavailable)

The Histogram Tool

To use the Histogram tool, click Data Analysis on the Tools menu. Click Histogram, and then click OK. The Histogram dialog box is shown in Figure 3-26.

Figure 3-26 The Histogram dialog box. (Image unavailable)

In addition to the input range and other standard options, use the following options in the dialog box depending on the information you’re analyzing:

  • Bin Range   In the Bin Range box, type the cell address for the group of data values (preferably in ascending order) that you want to use for the histogram grouping bins. If you leave this box blank, Excel will create a set of evenly distributed bins using the data’s minimum and maximum values.
  • Pareto (Sorted Histogram)   Select this option to present the histogram grouping bins in descending order of frequency.
  • Cumulative Percentage   Select this option to display a histogram column for cumulative percentages and to include a cumulative percentage line in the histogram.

Your Turn

You want to generate a histogram to display the frequency of nights booked per month for all of the preferred customers.

  1. Open Hotel.xls. If the file is already open, close it (do not save it) and open it a gain.
  2. In cell F1, type the number 1. In cell F2, type 2, and so on through the number 20 in cell F20. The values you enter will be used as the bins in the histogram.
  3. On the Tools menu, click Data Analysis.
  4. Click Histogram, and then click OK.
  5. Click the Input Range box, and then select cells C2 through C313.
  6. Click the Bin Range box, and then select cells F1 through F20.
  7. Click the Output Range option, click the Output Range box, and then click cell H1.
  8. Select the Pareto (Sorted Histogram) check box.
  9. Select the Chart Output check box, and then compare your results to Figure 3-27.
  10. Figure 3-27 Completing the Histogram dialog box. (Image unavailable)

  11. Click OK and compare your results to Figure 3-28.
  12. Figure 3-28 The histogram produced for Hotel.xls. (Image unavailable)

The number of nights stayed per month that recur most often are 1, 11, 10, and 14.

The Moving Average Tool

To use the Moving Average tool, click Data Analysis on the Tools menu. Click Moving Average, and then click OK. The Moving Average dialog box appears, as shown in Figure 3-29.

Figure 3-29 The Moving Average dialog box. (Image unavailable)

For a moving average, the output range must be on the same worksheet as the input range. For this reason, the New Worksheet Ply and New Workbook options are not available. In addition to the input range and other standard options, the Moving Average dialog box includes two items that you set to perform your analysis:

  • Interval   For this option, enter the number of values that you want to include in the moving average. The default interval is 3.
  • Standard Errors   Use this check box to have Excel generate a two-column table, showing standard error values in the right column.

Your Turn

You want to forecast what room service charges might be for the next year by using a three-month moving average of preferred customer Abercrombie’s room service charges during the past year.

  1. Open Hotel.xls. If it is already open, close it (do not save it) and open it again.
  2. On the Tools menu, click Data Analysis.
  3. Click Moving Average, and then click OK.
  4. Click the Input Range box, and then select cells D2 through D13 in the Total Room Service column.
  5. In the Interval box, type 3.
  6. Click the Output Range box, and then click cell F1.
  7. Select the Chart Output check box, compare your results to Figure 3-30, and then click OK. Compare the chart output to Figure 3-31.
  8. Figure 3-30 Completing the Moving Average dialog box. (Image unavailable)

    Figure 3-31 The chart showing actual and forecasted room service charges. (Image unavailable)

The numbers starting in cell F1 next to the chart are a result of running the moving average for each month, using the interval you entered. For example, for March, an average of the room service charges for January, February, and March is calculated. For April, an average of the room service charges for February, March, and April is calculated. You don’t see any moving average calculations for January or February because the data for the previous months (November and December of the preceding year) is not available.

It’s hard to say with certainty what preferred customer Abercrombie’s monthly room charges will be next year. However, overall the trend is downward in the last quarter, which could carry over into the next year. Hopefully, you will take some sort of action with this customer to turn things around.

The Rank and Percentile Tool

To use the Rank and Percentile tool, click Data Analysis on the Tools menu, click Rank And Percentile, and then click OK. Figure 3-32 shows the Rank And Percentile dialog box.

Figure 3-32 The Rank And Percentile dialog box. (Image unavailable)

Your Turn

You want to give all preferred customers who spent monthly room service charges in the 90th percentile or higher a special thank-you gift.

  1. Open Hotel.xls. If the file is already open, close it (do not save it) and open it again.
  2. On the Tools menu, click Data Analysis, click Rank And Percentile, and then click OK.
  3. Click the Input Range box, and then select cells D2 through D313 in the Total Room Service column.
  4. Click the Output Range option, click in the Output Range box, click cell F1, and compare your results to Figure 3-33.
  5. Click OK, and compare the output to Figure 3-34. The 90th percentile cut-off point is $1,153.40 in any one month. (The numbers in the Point column refer to an item’s order in the list.)
  6. Figure 3-33 Completing the Rank And Percentile dialog box. (Image unavailable)

    Figure 3-34 All the values above this point are in the 90th percentile. (Image unavailable)

The Sampling Tool

To use the Sampling tool, click Data Analysis on the Tools menu, click Sampling, and then click OK. The Sampling dialog box appears, as shown in Figure 3-35.

The sampling method options include the following:

  • Periodic   Use this option if you want to sample the kth value in the group of cells you select for input. Enter the value of k in the Period box.
  • Random   Use this option if you want to sample values randomly. Type the number of sample values in the Number Of Samples box.

Figure 3-35 The Sampling dialog box. (Image unavailable)

Your Turn

In this exercise, you’ll determine the average amount of monthly room charges made by each of your preferred customers. You’ll start by averaging about 10 percent of the data values to see whether they are representative of the 310 data values in the list.

  1. Open Hotel.xls. If it is already open, close it (do not save the file) and open the file again.
  2. On the Tools menu, click Data Analysis, click Sampling, and then click OK.
  3. Click the Input Range box, and then select cells D2 through D313 in the Total Room Service column.
  4. Click the Random option, and in the Number Of Samples box, type 31.
  5. Click the Output Range option, and then click cell F1. Compare your results to Figure 3-36.
  6. Click OK and compare your output to Figure 3-37. Because you’ve used random samples, your results will vary.
  7. Figure 3-36 Completing the Sampling dialog box for a random number of samples. (Image unavailable)

    Figure 3-37 Running the Sampling tool for a random number of samples. (Image unavailable)

  8. With the randomly picked values selected, right-click anywhere in the status bar, and click Average.
  9. What is the average monthly room service charge? Compare this amount to the average monthly room service charge of $473.35 for all preferred customers. Were the values picked representative? Now try a fixed number of data values.

  10. On the Tools menu, click Data Analysis, click Sampling, and then click OK.
  11. Click the Periodic option, and in the Period box, type 12.
  12. Click the Output Range box, and click cell G1.
  13. Click OK, and compare the output to Figure 3-38.
  14. Figure 3-38 Running the Sampling tool for a periodic number of samples. (Image unavailable)

  15. With the new sample values selected, right-click in the status bar and click Average.
  16. The average monthly room service charge based on these records is $522.37. Again, compare this to the average monthly room service charge of $473.35 for all of the preferred customers. Perhaps every 12th value that was sampled was not representative either. Experiment with higher values for both the number of random samples and the number of periodic values to see whether you can get closer to $473.35.

Putting It Together

You can use a number of Analysis ToolPak tools together to perform several analyses at once and spot trends. In the following exercise, you will compare the average of monthly room service charges to their rank and percentile.

  1. Open Hotel.xls. If the file is open already, close it (do not save the file) and reopen the file again.
  2. On the Tools menu, click Data Analysis.
  3. Click Descriptive Statistics, and then click OK.
  4. Click the Input Range box, and then select cells D2 through D313 in the Total Room Service column.
  5. Click the Columns and New Worksheet Ply options.
  6. Select the Summary Statistics check box, and then click OK.
  7. Click the worksheet labeled Sheet1.
  8. On the Tools menu, click Data Analysis.
  9. Click Rank And Percentile, and then click OK.
  10. Click the Input Range box, and then select cells D2 through D313 in the Total Room Service column.
  11. Click the Columns and New Worksheet Ply options, and then click OK.
  12. Compare the Descriptive Statistics worksheet’s Mean value (473.3455, or $473.35) to the values in the Rank And Percentile worksheet’s Column1 and Percent columns.
  13. You should notice that the mean (average) of $473.35 falls between the 60.1 and 60.4 percentiles. This is more than 10 percentage points higher than the median (midpoint), which is between $380.36 and $387.84 in the Rank And Percentile worksheet’s Column1 column, or $384.10 in the Descriptive Statistics worksheet’s Median row value.

Is this percentage difference good or bad for business? Although you have some big spenders among your customers, this could actually be bad for business in the long run. Here’s why: if the average was closer to the midpoint, this would most likely mean that most customers were making steady, predictable, evenly distributed room service orders. However, because the average is somewhat higher than the midpoint, this means that many of the preferred customers are making either large or small, less predictable room service orders, which could be difficult to plan for in the long run, especially if those big spenders stop purchasing. How do you correct this? The answer comes in knowing how the hotel chain goes about influencing its preferred customers’ purchasing habits and what it can do to keep the big spenders coming back or making the small spenders purchase more room service.

Using the Solver Add-In

The Solver Add-In is part of a suite of commands sometimes called what-if analysis tools. With Solver, you can find an optimal value for a formula in one cell—called the target cell—on a worksheet. Solver works with a group of cells that are related, either directly or indirectly, to the formula in the target cell. Solver adjusts values in cells you specify—called the adjustable cells—to produce the result you specify from the target cell formula. You can apply constraints to restrict the values Solver can use in the model, and the constraints can refer to other cells that affect the target cell formula.

You can use Solver to determine the maximum or minimum value of one cell by changing other cells—for example, you can change the amount of your projected advertising budget and see the effect on your projected profit.

To use the Solver add-in:

  1. On the Tools menu, click Solver. The Solver Parameters dialog box appears, shown in Figure 3-39.
  2. Figure 3-39 The Solver Parameters dialog box. (Image unavailable)

    If the Solver command does not appear on the Tools menu, click Add-Ins on the Tools menu, select the Solver Add-in check box, and then click OK. If the Solver check box is not visible, you can follow the directions for installing the Analysis TookPak provided earlier in the chapter. Substitute Solver for the Analysis TookPak, of course.

  3. In the Set Target Cell box, type the address for the cell that you want to set to a certain data value or that you want to maximize or minimize. The cell must contain a worksheet formula.
  4. Click the Max option if you want to maximize the value; click the Min option if you want to minimize the value. Click the Value Of option if you want a specific value. If you click the Value Of option, type the value in the adjacent box.
  5. In the By Changing Cells box, type the address for the cells that can be adjusted until the problem’s conditions are reached and the cell specified in the Set Target Cell box reaches its target. The cells referred to in the By Changing Cells box must be related directly or indirectly to the cell referred to in the Set Target Cell box.

  6. TIP:
    You can click the Guess button to have Excel try to figure out the nonformula cells referred to by the cell in the Set Target Cell box and place their cell addresses in the By Changing Cells box.

  7. To add conditions, also known as constraints, click Add for each constraint. To change or delete an existing constraint, click the Change or Delete buttons, respectively.
  8. If you add a constraint, the Add Constraint dialog box appears. (See Figure 3-40.) In the Cell Reference box, type the cell address; in the operator list, select an operator (such as <=); and in the Constraint box, type another cell address. Click Add to add another constraint, or click OK to return to the Solver Parameters dialog box.
  9. Figure 3-40 The Add Constraint dialog box. (Image unavailable)

  10. Click the Options button to display the Solver Options dialog box, which allows you to specify how the Solver add-in solves your problems.
  11. Click the Solve button to solve the problem.

  12. TIP:
    You can click the Reset All button to clear the current problem and reset all of the Solver add-in’s settings to their original values.

Your Turn

You want to find out how much extra revenue your hotel would have generated during the past year if your preferred customers had spent an average of $65.00 on room service per night, without increasing the number of nights they booked during the year. You will use the Solver add-in to help make this analysis.

  1. Open Hotel.xls. If it is already open, close it (do not save it) and open it again.
  2. Calculate the total number of nights booked. To do this, type =SUM(C2:C313) in cell C314 and then press Enter. The total number of nights booked is 3,092.
  3. Calculate the amount that all customers spent on room service. In cell D314, type =SUM(D2:D313) and then press Enter. The amount is $147,683.80.
  4. Calculate the average room service charge per customer per night booked. To do this, type =D314/C314 in cell E314 and then press Enter. The average spent on room service per customer per night is $47.46.
  5. Provide a baseline difference between the current room service charges and the results that the Solver add-in will calculate. To do this, type =D314-147683.80 in cell D315 and then press Enter. The result should be zero for now, as you can see in Figure 3-41.
  6. Figure 3-41 The Hotel.xls workbook set up with the information required to run the Solver add-in calculation (Image unavailable)

Now let’s see what happens when the average amount spent on room service per customer per night is raised to $65.00.

  1. On the Tools menu, click Solver.
  2. Click the Set Target Cell box, and then click cell E314. The Set Target Cell box should display $E$314.
  3. Click Value Of, and then type 65.00 in the adjacent box.
  4. Click the By Changing Cells box, and then click cell D314. The By Changing Cells box should display $D$314.
  5. Because you don’t want the Solver add-in to change the number of nights that were booked, click Add to add a constraint.
  6. Click the Cell Reference box, and then click cell C314. The Cell Reference box should display $C$314.
  7. In the adjacent list, select the equals symbol (=).
  8. Click the Constraint box, type 3092, and then click OK. The Subject To The Constraints box should display $C$314 = 3092. Compare your result to Figure 3-42.
  9. Figure 3-42 The Solver Parameters dialog box with the problem’s cell references and a constraint. (Image unavailable)

  10. Click Solve. Cell D315 changes from zero to $53,296.20. If customers had spent an average of $65.00 per night booked during the last year (instead of the amount they actually spent on average—$47.46), the hotel would have collected an additional $53,296.20 in room service revenue.
  11. In the Solver Results dialog box, click OK.

Summary

In this chapter, you learned how to use Excel to

  • Find the absolute highest or lowest data value in a list by sorting.
  • Filter records to show only the data you’re interested in.
  • Highlight data trends and anomalies through conditional cell formatting.
  • Import external data, including Web-based data, to use Excel’s analysis tools on the data.
  • Extend data analysis tasks with worksheet functions.
  • Create charts to spot data details and trends more quickly.
  • Use the Analysis ToolPak to display a list of descriptive statistics, a histogram, a moving average, relative data value rank and percentile, and representative data sampling.
  • Use the Solver Add-In to run best-case or worst-case scenarios or "what if" scenarios.

In the next chapter, "Analyzing Data with PivotTable and PivotChart Reports," you will learn what PivotTable reports and PivotChart reports are used for, as well as how to analyze data with PivotTable reports and PivotChart reports.

Meet the Author

Paul Cornell has been with Microsoft for five years, serving as an inside sales representative, a sales trainer, and a documentation specialist. For the last two-and-a-half years he has been working as a writer and an editor in Office User Assistance. He also currently works for the MSDN Online Office Developer Center and the Microsoft Office Business Intelligence initiative. His projects have included work on the Microsoft Office XP Visual Basic® for Applications Language Reference, the Microsoft Office XP Web Services Toolkit, the Microsoft Office XP Fabrikam 2.0 Solution Sample, and the Microsoft Data Analyzer Tutorial. He writes the \"Office Talk\" column for MSDN at http://msdn.microsoft.com/columns/office.asp and contributes to the \"Office Power User Corner\" column for the Microsoft Office Assistance Center at http://office.microsoft.com/assistance.

Customer Reviews

Average Review:

Write a Review

and post it to your social network

     

Most Helpful Customer Reviews

See all customer reviews >