Business Statistics For Dummies

by Alan Anderson




Score higher in your business statistics course? Easy.

Business statistics is a common course for business majors and MBA candidates. It examines common data sets and the proper way to use such information when conducting research and producing informational reports such as profit and loss statements, customer satisfaction surveys, and peer comparisons.

Business Statistics For Dummies tracks to a typical business statistics course offered at the undergraduate and graduate levels and provides clear, practical explanations of business statistical ideas, techniques, formulas, and calculations, with lots of examples that show you how these concepts apply to the world of global business and economics.

  • Shows you how to use statistical data to get an informed and unbiased picture of the market
  • Serves as an excellent supplement to classroom learning
  • Helps you score your highest in your Business Statistics course

If you're studying business at the university level or you're a professional looking for a desk reference on this complicated topic, Business Statistics For Dummies has you covered.

Product Details

ISBN-13: 9781118630693
Publisher: Wiley
Publication date: 11/18/2013
Series: For Dummies Series
Pages: 416
Product dimensions: 7.30(w) x 9.20(h) x 0.90(d)

About the Author

Alan Anderson, PhD, is a teacher of finance, economics, statistics, and math at Fordham and Fairfield universities as well as at Manhattanville and Purchase colleges. Outside of the academic environment, he has many years of experience working as an economist, risk manager, and fixed-income analyst. Alan received his PhD in economics from Fordham University and an M.S. in financial engineering from Polytechnic University.

Read an Excerpt

John Wiley & Sons

Copyright © 2014 John Wiley & Sons, Ltd
All rights reserved.
ISBN: 978-1-118-63069-3


The Art and Science of Business Statistics

In This Chapter

* Looking at the key properties of data

* Understanding probability's role in business

* Sampling distributions

* Drawing conclusions based on results

This chapter provides a brief introduction to the concepts that are covered throughout the book. I introduce several important techniques that allow you to measure and analyze the statistical properties of real-world variables, such as stock prices, interest rates, corporate profits, and so on.

Statistical analysis is widely used in all business disciplines. For example, marketing researchers analyze consumer spending patterns in order to properly plan new advertising campaigns. Organizations use management consulting to determine how efficiently resources are being used. Manufacturers use quality control methods to ensure the consistency of the products they are producing. These types of business applications and many others are heavily based on statistical analysis.

Financial institutions use statistics for a wide variety of applications. For example, a pension fund may use statistics to identify the types of securities that it should hold in its investment portfolio. A hedge fund may use statistics to identify profitable trading opportunities. An investment bank may forecast the future state of the economy in order to determine which new assets it should hold in its own portfolio.

Although statistics is a quantitative discipline, the ultimate objective of statistical analysis is to explain real-world events. This means that in addition to the rigorous application of statistical methods, there is always a great deal of room for judgment. As a result, you can think of statistical analysis as both a science and an art; the art comes from choosing the appropriate statistical technique for a given situation and correctly interpreting the results.

Representing the Key Properties of Data

The word data refers to a collection of quantitative (numerical) or qualitative (non-numerical) values. Quantitative data may consist of prices, profits, sales, or any variable that can be measured on a numerical scale. Qualitative data may consist of colors, brand names, geographic locations, and so on. Most of the data encountered in business applications are quantitative.


The word data is actually the plural of datum; datum refers to a single value, while data refers to a collection of values.

You can analyze data with graphical techniques or numerical measures. I explore both options in the following sections.

Analyzing data with graphs

Graphs are a visual representation of a data set, making it easy to see patterns and other details. Deciding which type of graph to use depends on the type of data you're trying to analyze. Here are some of the more common types of graphs used in business statistics:

* Histograms: A histogram shows the distribution of data among different intervals or categories, using a series of vertical bars.

* Line graphs: A line graph shows how a variable changes over time.

* Pie charts: A pie chart shows how data is distributed between different categories, illustrated as a series of slices taken from a pie.

* Scatter plots (scatter diagrams): A scatter plot shows the relationship between two variables as a series of points. The pattern of the points indicates how closely related the two variables are.


Histograms

You can use a histogram with either quantitative or qualitative data. It's designed to show how a variable is distributed among different categories.

For example, suppose that a marketing firm surveys 100 consumers to determine their favorite color. The responses are

Red: 23
Blue: 44
Yellow: 12
Green: 21

The results can be illustrated with a histogram, with each color in a single category. The heights of the bars indicate the number of responses for each color, making it easy to see which colors are the most popular (see Figure 1-1).

Based on the histogram, you can see at a glance that blue is the most popular choice, while yellow is the least popular choice.
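As a rough stand-in for the figure, the survey tallies above can be drawn as a text histogram. This is only a sketch: the helper name `text_histogram` and the bar width are illustrative choices, not part of the book.

```python
# Survey responses from the text: favorite color among 100 consumers.
responses = {"Red": 23, "Blue": 44, "Yellow": 12, "Green": 21}


def text_histogram(counts, width=40):
    """Return one text bar per category, scaled so the largest count spans `width` characters."""
    largest = max(counts.values())
    lines = []
    for label, count in counts.items():
        bar = "#" * round(count / largest * width)
        lines.append(f"{label:<7}{bar} {count}")
    return lines


# The tallest and shortest bars correspond to the most and least popular colors.
most_popular = max(responses, key=responses.get)
least_popular = min(responses, key=responses.get)

for line in text_histogram(responses):
    print(line)
print("Most popular:", most_popular)
print("Least popular:", least_popular)
```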

Line graphs

You can use a line graph with quantitative data. It shows the values of a variable over a given interval of time. For example, Figure 1-2 shows the daily price of gold between April 14, 2013, and June 2, 2013.


With a line graph, it's easy to see trends or patterns in a data set. In this example, the price of gold rose steadily throughout late April into mid-May before falling back in late May and then recovering somewhat at the end of the month. These types of graphs may be used by investors to identify which assets are likely to rise in the future based on their past performance.

Pie charts

Use a pie chart with quantitative or qualitative data to show the distribution of the data among different categories. For example, suppose that a chain of coffee shops wants to analyze its sales by coffee style. The styles that the chain sells are French Roast, Breakfast Blend, Brazilian Rainforest, Jamaica Blue Mountain, and Espresso. Figure 1-3 shows the proportion of sales for each style.

The chart shows that Espresso is the chain's best-selling style, while Jamaica Blue Mountain accounts for the smallest percentage of the chain's sales.

Scatter plots

A scatter plot is designed to show the relationship between two quantitative variables. For example, Figure 1-4 shows the relationship between a corporation's sales and profits over the past 20 years.

Each point on the scatter plot represents profit and sales for a single year. The pattern of the points shows that higher levels of sales tend to be matched by higher levels of profits, and vice versa. This is called a positive trend in the data.

Defining properties and relationships with numerical measures

A numerical measure is a value that describes a key property of a data set. For example, to determine whether the residents of one city tend to be older than the residents in another city, you can compute and compare the average or mean age of the residents of each city.

Some of the most important properties of interest in a data set are the center of the data and the spread among the observations. I describe these properties in the following sections.

Finding the center of the data

To identify the center of a data set, you use measures that are known as measures of central tendency; the most important of these are the mean, median, and mode.

The mean represents the average value in a data set, while the median represents the midpoint. The median is a value that separates the data into two equal halves; half of the elements in the data set are less than the median, and the remaining half are greater than the median. The mode is the most commonly occurring value in the data set.

The mean is the most widely used measure of central tendency, but it can give deceptive results if the data contain any unusually large or small values, known as outliers. In this case, the median provides a more representative measure of the center of the data. For example, median household income is usually reported by government agencies instead of mean household income. This is because mean household income is inflated by the presence of a small number of extremely wealthy households. As a result, median household income is thought to be a better measure of how standards of living are changing over time.

The mode can be used for either quantitative or qualitative data. For example, it could be used to determine the most common number of years of education among the employees of a firm. It could also be used to determine the most popular flavor sold by a soft drink manufacturer.
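The effect of an outlier on the mean can be checked with Python's standard `statistics` module. The income figures and flavor names below are made up for illustration; they are not taken from the book.

```python
import statistics

# Hypothetical household incomes in thousands of dollars (illustrative only).
# The last value is an outlier: one extremely wealthy household.
incomes = [42, 48, 51, 55, 55, 60, 67, 2500]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # midpoint of the sorted data
mode_income = statistics.mode(incomes)      # most commonly occurring value

print("Mean:", mean_income)      # 359.75
print("Median:", median_income)  # 55.0
print("Mode:", mode_income)      # 55

# The mode also works for qualitative data (hypothetical flavor sales).
flavors = ["cola", "lemon", "cola", "orange", "cola", "lemon"]
top_flavor = statistics.mode(flavors)
print("Most popular flavor:", top_flavor)  # cola
```

In this made-up sample the mean is more than six times the median, which is the pattern that makes median household income the preferred figure.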

Measuring the spread of the data

Measures of dispersion identify how spread out a data set is, relative to the center. This provides a way of determining if the members of a data set tend to be very close to each other or if they tend to be widely scattered. Some of the most important measures of dispersion are

* Variance

* Standard deviation

* Percentiles

* Quartiles

* Interquartile range (IQR)

The variance is a measure of the average squared difference between the elements of a data set and the mean. The larger the variance, the more "spread out" the data is. Variance is often used as a measure of risk in business applications; for example, it can be used to show how much uncertainty there is over the returns on a stock.

The standard deviation is the square root of the variance, and is more commonly used than the variance (since the variance is expressed in squared units). For example, the variance of a series of gas prices is measured in squared dollars, which is difficult to interpret. The corresponding standard deviation is measured in dollars, which is much more intuitively clear.

Percentiles divide a data set into 100 equal parts, each consisting of 1 percent of the total. For example, if a student's score on a standardized exam is in the 80th percentile, then the student outscored 80 percent of the other students who took the exam. A quartile is a special type of percentile; it divides a data set into four equal parts, each consisting of 25 percent of the total. The first quartile is the 25th percentile of a data set, the second quartile is the 50th percentile, and the third quartile is the 75th percentile. The interquartile range identifies the middle 50 percent of the observations in a data set; it equals the difference between the third and the first quartiles.
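The measures of dispersion above can be sketched with the standard `statistics` module. The gas prices are hypothetical, and two assumptions are worth flagging: the population forms of variance and standard deviation are used to match the "average squared difference" description, and quartiles are computed with the "inclusive" interpolation method. Other textbooks and software may use slightly different quartile conventions.

```python
import statistics

# Hypothetical gas prices in dollars per gallon (illustrative only).
prices = [3.10, 3.20, 3.30, 3.40, 3.50, 3.60, 3.70]

# Population variance: the average squared difference from the mean (squared dollars).
variance = statistics.pvariance(prices)

# Standard deviation: the square root of the variance (dollars).
std_dev = statistics.pstdev(prices)

# Quartiles are the 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(prices, n=4, method="inclusive")

# Interquartile range: the width of the middle 50 percent of the observations.
iqr = q3 - q1

print(f"Variance: {variance:.4f} squared dollars")
print(f"Standard deviation: {std_dev:.4f} dollars")
print(f"Quartiles: {q1:.2f}, {q2:.2f}, {q3:.2f}")
print(f"IQR: {iqr:.2f}")
```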

Determining the relationship between two variables

For some applications, you need to understand the relationship between two variables. For example, if an investor wants to understand the risk of a portfolio of stocks, it's essential to properly measure how closely the returns on the stocks track each other. You can determine the relationship between two variables with two measures of association: covariance and correlation.

Covariance is used to measure the tendency for two variables to rise above their means or fall below their means at the same time. For example, suppose that a bioengineering company finds that increasing research and development expenditures typically leads to an increase in the development of new patents. In this case, R&D spending and new patents would have a positive covariance. If the same company finds that rising labor costs typically reduce corporate profits, then labor costs and profits would have a negative covariance. If the company finds that profits are not related to the average daily temperature, then these two variables will have a covariance that is very close to zero.

Correlation is a closely related measure. It's defined as a value between -1 and 1, so interpreting the correlation is easier than the covariance. For example, a correlation of 0.9 between two variables would indicate a very strong positive relationship, whereas a correlation of 0.2 would indicate a fairly weak but positive relationship. A correlation of -0.8 would indicate a very strong negative relationship; a correlation of -0.3 would indicate a weak negative relationship. A correlation of 0 would show that two variables have no linear relationship; it does not by itself show that they are independent, because two variables can be related in a nonlinear way and still have a correlation of 0.
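A minimal sketch of both measures of association follows, written out by hand so the arithmetic is visible. All of the data are invented for illustration, and the population form (dividing by n) is an assumption; sample versions divide by n - 1 instead. The last pair is included as a caution: y is fully determined by x, yet the correlation is zero because the relationship is a curve rather than a line.

```python
import math


def covariance(xs, ys):
    """Population covariance: the average product of deviations from the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance rescaled by both standard deviations, so the result lies in [-1, 1]."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical yearly figures (illustrative only).
rd_spending = [1, 2, 3, 4, 5]
new_patents = [2, 4, 5, 4, 6]
labor_costs = [10, 12, 14, 16, 18]
profits = [9, 8, 6, 5, 3]

print("R&D vs. patents:", covariance(rd_spending, new_patents), correlation(rd_spending, new_patents))
print("Labor costs vs. profits:", covariance(labor_costs, profits), correlation(labor_costs, profits))

# A nonlinear relationship: y is the square of x, but the correlation is 0.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
print("x vs. x squared:", correlation(xs, ys))
```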

Probability: The Foundation of All Statistical Analysis

Probability theory provides a mathematical framework for measuring uncertainty. This area is important for business applications since all results from the field of statistics are ultimately based on probability theory. Understanding probability theory provides fundamental insights into all the statistical methods used in this book.

Probability is heavily based on the notion of sets. A set is a collection of objects. These objects may be numbers, colors, flavors, and so on. This chapter focuses on sets of numbers that may represent prices, rates of return, and so forth. Several mathematical operations may be applied to sets — union, intersection, and complement, for example.

The union of two sets is a new set that contains all the elements in the original two sets. The intersection of two sets is a set that contains only the elements contained in both of the two original sets (if any). The complement of a set is the set of all elements under consideration that are not in the original set. For example, within a standard deck of cards, the complement of the set of black cards is the set containing all red cards.
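Python's built-in `set` type supports these operations directly, so the deck-of-cards example can be sketched as follows. The set of face cards is an extra illustration that does not appear in the text.

```python
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "spades", "hearts", "diamonds"]

# The full deck plays the role of the universe of elements under consideration.
deck = set(product(ranks, suits))

black = {card for card in deck if card[1] in ("clubs", "spades")}
face = {card for card in deck if card[0] in ("J", "Q", "K")}

union = black | face          # cards that are black, face cards, or both
intersection = black & face   # cards that are both black and face cards
complement_black = deck - black  # everything in the deck that is not black

print(len(deck), len(black), len(face))                       # 52 26 12
print(len(union), len(intersection), len(complement_black))   # 32 6 26
```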

Probability theory is based on a model of how random outcomes are generated, known as a random experiment. Outcomes are generated in such a way that all possible outcomes are known in advance, but the actual outcome isn't known.

The following rules help you determine the probability of specific outcomes occurring:

* The addition rule

* The multiplication rule

* The complement rule

You use the addition rule to determine the probability of a union of two sets. The multiplication rule is used to determine the probability of an intersection of two sets. The complement rule is used to identify the probability that the outcome of a random experiment will not be an element in a specified set.
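The three rules can be checked on a small random experiment. The sketch below assumes a single roll of a fair six-sided die, which is not an example from the text, and uses the standard forms of the rules: P(A or B) = P(A) + P(B) - P(A and B), P(A and B) = P(A) × P(B given A), and P(not A) = 1 - P(A). The helper `prob` is a hypothetical name.

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair die, all six outcomes equally likely.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # the roll is even
B = {5, 6}     # the roll is greater than 4


def prob(event, given=None):
    """Probability of `event` under equally likely outcomes, optionally conditional on `given`."""
    space = sample_space if given is None else given
    return Fraction(len(event & space), len(space))


# Addition rule: probability of the union of A and B.
p_union = prob(A) + prob(B) - prob(A & B)

# Multiplication rule: probability of the intersection of A and B.
p_intersection = prob(A) * prob(B, given=A)

# Complement rule: probability that the outcome is not in A.
p_not_a = 1 - prob(A)

print(p_union, p_intersection, p_not_a)  # 2/3 1/6 1/2
```

Each result matches what you get by counting outcomes in the corresponding set directly.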

Random variables

A random variable assigns numerical values to the outcomes of a random experiment. For example, when you flip a coin twice, you're performing a random experiment, since:

* All possible outcomes are known in advance

* The actual outcome isn't known in advance

The experiment consists of two trials. On each trial, the outcome must be a "head" or a "tail."

Assume that a random variable X is defined as the number of "heads" that turn up during the course of this experiment. X assigns values to the outcomes of this experiment as follows:


Outcomes    Value of X
{TT}        0
{HT, TH}    1
{HH}        2

T represents a tail on a single flip
H represents a head on a single flip
TT represents two consecutive tails
HT represents a head followed by a tail
TH represents a tail followed by a head
HH represents two consecutive heads

X assigns a value of 0 to the outcome TT because no heads turned up. X assigns a value of 1 to both HT and TH because one head turned up in each case. Similarly, X assigns a value of 2 to HH because two heads turned up.
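In code, a random variable is simply a function from outcomes to numbers. A short sketch of the two-flip experiment:

```python
from itertools import product

# Every possible outcome of flipping a coin twice: HH, HT, TH, TT.
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]


def X(outcome):
    """The random variable X: the number of heads in an outcome."""
    return outcome.count("H")


values = {outcome: X(outcome) for outcome in outcomes}
print(values)  # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
```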

Probability distributions

A probability distribution is a formula or a table used to assign probabilities to each possible value of a random variable X. A probability distribution may be discrete, which means that X can assume only a countable number of values (a finite list, or an infinite sequence such as 0, 1, 2, and so on), or continuous, in which case X can assume any of an uncountably infinite number of different values, such as every value in an interval.

For the coin-flipping experiment from the previous section, the probability distribution of X could be a simple table that shows the probability of each possible value of X, written as P(X):

X    P(X)
0    0.25
1    0.50
2    0.25

The probability that X = 0 (that no heads turn up) equals 0.25 because this experiment has four equally likely outcomes (HH, HT, TH, and TT), and in only one of those cases will there be no heads. You compute the other probabilities in a similar manner.
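The table can be reproduced by counting outcomes, assuming (as the text does) that the four outcomes are equally likely:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of two coin flips, as tuples such as ('H', 'T').
outcomes = list(product("HT", repeat=2))

# Count how many outcomes produce each number of heads.
counts = Counter(flips.count("H") for flips in outcomes)

# P(X = x) is the share of outcomes that give x heads.
distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in distribution.items():
    print(x, float(p))
# 0 0.25
# 1 0.5
# 2 0.25
```

The probabilities sum to 1, as they must for any probability distribution.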

Discrete probability distributions

Several specialized discrete probability distributions are useful for specific applications. For business applications, three frequently used discrete distributions are:

* Binomial

* Geometric

* Poisson

You use the binomial distribution to compute probabilities for a process where only one of two possible outcomes may occur on each trial. The geometric distribution is related to the binomial distribution; you use the geometric distribution to determine the probability that a specified number of trials will take place before the first success occurs. You can use the Poisson distribution to measure the probability that a given number of events will occur during a given time frame.
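The standard probability formulas for these three distributions are short enough to sketch directly. The function names and the parameter values in the examples are illustrative. One convention is assumed for the geometric distribution: k counts the trial on which the first success occurs (k = 1, 2, ...); some references instead count the failures before the first success.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials, each with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (assumed convention: k starts at 1)."""
    return (1 - p) ** (k - 1) * p


def poisson_pmf(k, lam):
    """Probability of exactly k events in a time frame where lam events are expected on average."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Two coin flips, exactly one head: agrees with P(X = 1) for the coin-flipping experiment.
print(binomial_pmf(1, 2, 0.5))   # 0.5

# First head arrives on the third flip.
print(geometric_pmf(3, 0.5))     # 0.125

# Hypothetical: 3 customers arrive per hour on average; chance of exactly 5 in an hour.
print(poisson_pmf(5, 3))
```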


Excerpted from Business Statistics For Dummies by Alan Anderson. Copyright © 2014 John Wiley & Sons, Ltd. Excerpted by permission of John Wiley & Sons.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

Introduction

Part I: Getting Started with Business Statistics

Chapter 1: The Art and Science of Business Statistics

Chapter 2: Pictures Tell the Story: Graphical Representations of Data

Chapter 3: Finding a Happy Medium: Identifying the Center of a Data Set

Chapter 4: Searching High and Low: Measuring Variation in a Data Set

Chapter 5: Measuring How Data Sets Are Related to Each Other

Part II: Probability Theory and Probability Distributions

Chapter 6: Probability Theory: Measuring the Likelihood of Events

Chapter 7: Probability Distributions and Random Variables

Chapter 8: The Binomial, Geometric, and Poisson Distributions

Chapter 9: The Uniform and Normal Distributions: So Many Possibilities!

Chapter 10: Sampling Techniques and Distributions

Part III: Drawing Conclusions from Samples

Chapter 11: Confidence Intervals and the Student’s t-Distribution

Chapter 12: Testing Hypotheses about the Population Mean

Chapter 13: Testing Hypotheses about Multiple Population Means

Chapter 14: Testing Hypotheses about the Population Mean

Part IV: More Advanced Techniques: Regression Analysis and Forecasting

Chapter 15: Simple Regression Analysis

Chapter 16: Multiple Regression Analysis: Two or More Independent Variables

Chapter 17: Forecasting Techniques: Looking into the Future

Part V: The Part of Tens

Chapter 18: Ten Common Errors That Arise in Statistical Analysis

Chapter 19: Ten Key Categories of Formulas for Business Statistics

Index
