Predictive Analytics For Dummies

Paperback (2nd ed.)

$29.99 


Overview

Use Big Data and technology to uncover real-world insights

You don't need a time machine to predict the future. All it takes is a little knowledge and know-how, and Predictive Analytics For Dummies gets you there fast. With the help of this friendly guide, you'll discover the core of predictive analytics and get started putting it to use with readily available tools to collect and analyze data. In no time, you'll learn how to incorporate algorithms through data models, identify similarities and relationships in your data, and predict the future through data classification. Along the way, you'll develop a roadmap by preparing your data, creating goals, processing your data, and building a predictive model that will get you stakeholder buy-in.

Big Data has taken the marketplace by storm, and companies are seeking qualified talent to quickly fill positions to analyze the massive amounts of data being collected each day. If you want to get in on the action and either learn or deepen your understanding of how to use predictive analytics to find real relationships between what you know and what you want to know, everything you need is a page away!

  • Offers common use cases to help you get started
  • Covers details on modeling, k-means clustering, and more
  • Includes information on structuring your data
  • Provides tips on outlining business goals and approaches

The future starts today with the help of Predictive Analytics For Dummies.


Product Details

ISBN-13: 9781119267003
Publisher: Wiley
Publication date: 10/31/2016
Series: For Dummies Books
Edition description: 2nd ed.
Pages: 464
Sales rank: 1,080,714
Product dimensions: 7.30(w) x 9.40(h) x 0.90(d)

About the Author

Anasse Bari, Ph.D., is a data science expert and a university professor who has many years of predictive modeling and data analytics experience.

Mohamed Chaouchi is a veteran software engineer who has conducted extensive research using data mining methods.

Tommy Jung is a software engineer with expertise in enterprise web applications and analytics.

Read an Excerpt

Predictive Analytics For Dummies


By Anasse Bari, Mohamed Chaouchi, Tommy Jung

John Wiley & Sons

Copyright © 2014 John Wiley & Sons, Ltd
All rights reserved.
ISBN: 978-1-118-72896-3


CHAPTER 1

Entering the Arena


In This Chapter

* Explaining the building blocks

* Probing capabilities

* Surveying the market


Predictive analytics is a bright light bulb powered by your data.

You can never have too much insight. The more you see, the better the decisions you make — and you never want to be in the dark. You want to see what lies ahead, preferably before others do. It's like playing the game "Let's Make a Deal" where you have to choose the door with the hidden prize. Which door do you choose? Door 1, Door 2, or Door 3? They all look the same, so it's just your best guess — your choice depends on you and your luck. But what if you had an edge — the ability to see through the keyhole? Predictive analytics can give you that edge.


Exploring Predictive Analytics

What would you do in a world where you know how likely you are to end up marrying your college roommate? Where you can predict what profession will best suit you? Where you can predict the best city and country for you to live in?

In short, imagine a world where you can maximize the potential of every moment of your life. Such a life would be productive, efficient, and powerful. You will (in effect) have superpowers — and a lot more spare time. Well, such a world may seem a little boring to people who like to take uncalculated risks, but not to a profit-generating organization. Organizations spend millions of dollars managing risk. And if there is something out there that helps them manage their risk, optimize their operations, and maximize their profits, you should definitely learn about it. That is the world of predictive analytics.


Mining data

Big data is the new reality. In fact, data is only getting bigger, faster, and richer. It's here to stay and you'd better capitalize on it.

Data is one of your organization's most valuable assets. It's full of hidden value, but you have to dig for it. Data mining is the discovery of hidden patterns of data through machine learning — and sophisticated algorithms are the mining tools. Predictive analytics is the process of refining that data resource, using business knowledge to extract hidden value from those newly discovered patterns.

Data mining + business knowledge = predictive analytics = value


Today's leading organizations are looking at their data, examining it, and processing it to search for ways to better understand their customer base, improve their operations, outperform their competitors, and better position themselves in the marketplace. They are looking into how they can use that information to increase their market share and sharpen their competitive edge. How can they drive better sales and more effectively targeted marketing campaigns? How can they better serve their customers and meet their needs? What can they do to improve the bottom line?

But these tools are useful in realms beyond business. As one major example, government law enforcement agencies are asking questions related to crime detection and prevention. Is this a person of interest? Is this person about to commit a heinous crime? Will this criminal be a repeat offender? Where will the next crime happen?

Other industries, notably those with financial responsibility, could use a trustworthy glimpse into the future. Companies are trying to know ahead of time whether the transaction they're currently processing is fraudulent, whether an insurance claim is legitimate, whether a credit card purchase is valid, whether a credit applicant is worthy of credit ... the list goes on.

Governments, companies, and individuals are (variously) looking to spot trends in social movements, detect emerging healthcare issues and disease outbreaks, uncover new fashion trends, or find that perfect lifetime partner.

These business and research questions, and plenty more, are ones you can answer by mining the available data and building predictive analytics models to guide future decisions.

Data + predictive analytics = light.


Highlighting the model

A model is a mathematical representation of an object or a process. We build models to simulate real-world phenomena as a further investigative step, in hopes of understanding more clearly what's really going on. For example, to model our customers' behavior, we seek to mimic how our customers have been navigating through our websites:

* What products did they look at before they made a purchase?

* What pages did they view before making that purchase?

* Did they look at the products' descriptions?

* Did they read users' reviews?

* How many reviews did they read?

* Did they read both positive and negative reviews?

* Did they purchase something else in addition to the product they came looking for?


We collect all that data from past occurrences. We look at those historical transactions between our company and our customers — and try to make consistent sense of them. We examine that data and see whether it holds answers to our questions. Collecting that data — with particular attention to the breadth and depth of the data, its quality level, and its predictive value — helps to form the boundaries that will define our model and its outputs.

This process is not to be confused with just reporting on the data; it's also different from just visualizing that data. Although those steps are vital, they're just the beginning of exploring the data and gaining a usable understanding of it.

We go a lot deeper when we're talking about developing predictive analytics. In the first place, we need to take a threefold approach:

* Thoroughly understand the business problem we're trying to solve.

* Obtain and prepare the data we want our model to work with.

* Run statistical analysis, data-mining, and machine-learning algorithms on the data.


In the process, we have to look at various attributes — data points we think are relevant to our analysis. We'll run several algorithms, which are sets of mathematical instructions that get machines to do problem-solving.

We keep running through possible combinations of data and investigate what-if scenarios. Eventually we find our answers, build our model, and prepare to deploy it and reap its benefits.

What does a model look like? Well, in programming terms, a predictive analytics model can be as simple as a few if ... then statements that tell the machine, "If this condition exists, then perform this action."

Here are some simple rule-based trading models:

* If it's past 10:00 a.m. ET and the market is up, then buy 100 shares of XYZ stock.

* If my stock is up by 10 percent, then take profits.

* If my portfolio is down by 10 percent, then exit my positions.
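
The three trading rules above can be sketched as plain if ... then checks in Python. This is a minimal illustration, not a trading system: the `MarketSnapshot` class, its field names, and the way "the market is up" is measured are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import time
from typing import List


@dataclass
class MarketSnapshot:
    """Hypothetical inputs; the field names are illustrative, not from the book."""
    clock_et: time               # current time of day, Eastern Time
    market_change_pct: float     # today's market move, in percent
    stock_gain_pct: float        # gain on the stock position, in percent
    portfolio_change_pct: float  # change in the whole portfolio, in percent


def trading_actions(snap: MarketSnapshot) -> List[str]:
    """Apply each if/then rule and collect the actions that fire."""
    actions = []
    # Rule 1: past 10:00 a.m. ET and the market is up.
    if snap.clock_et > time(10, 0) and snap.market_change_pct > 0:
        actions.append("buy 100 shares of XYZ")
    # Rule 2: the stock is up by 10 percent.
    if snap.stock_gain_pct >= 10:
        actions.append("take profits")
    # Rule 3: the portfolio is down by 10 percent.
    if snap.portfolio_change_pct <= -10:
        actions.append("exit positions")
    return actions
```

Calling `trading_actions(MarketSnapshot(time(10, 30), 0.8, 12.0, -2.0))` fires the first two rules and skips the third. Each rule is checked independently, so more than one action can come back for the same snapshot.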


Here's a simple rule-based recommender system (for more about recommender systems, see Chapter 2):

* If a person buys a book by this author, then recommend other books by the same author.

* If a person buys a book on this topic, then recommend other books on the same and related topics.

* If a person buys a book on this topic, then recommend books that other customers have purchased when they bought this book.
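
The recommender rules above could look roughly like this in Python. The catalog, the order history, and every book ID and author name here are made up for illustration. The sketch matches only the same topic; deciding which topics count as "related" would need extra data that isn't assumed here.

```python
from collections import Counter

# Hypothetical catalog and order history, invented for this example.
CATALOG = {
    "B1": {"author": "Lee", "topic": "statistics"},
    "B2": {"author": "Lee", "topic": "machine learning"},
    "B3": {"author": "Kim", "topic": "statistics"},
    "B4": {"author": "Park", "topic": "cooking"},
}
ORDERS = [{"B1", "B4"}, {"B1", "B2"}, {"B3"}]  # one set of book IDs per past order


def recommend(bought):
    """Return recommendations for one purchased book, grouped by rule."""
    book = CATALOG[bought]
    others = {b: info for b, info in CATALOG.items() if b != bought}

    # Rule 1: other books by the same author.
    same_author = [b for b, info in others.items() if info["author"] == book["author"]]
    # Rule 2: other books on the same topic.
    same_topic = [b for b, info in others.items() if info["topic"] == book["topic"]]
    # Rule 3: books that appeared in the same past orders as this book.
    counts = Counter(
        b for order in ORDERS if bought in order for b in order if b != bought
    )
    # Most frequently co-purchased first; ties broken by book ID.
    also_bought = sorted(counts, key=lambda b: (-counts[b], b))

    return {"same_author": same_author, "same_topic": same_topic, "also_bought": also_bought}
```

With this sample data, `recommend("B1")` suggests B2 under the author rule, B3 under the topic rule, and B2 and B4 under the co-purchase rule.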


Adding Business Value

In an increasingly competitive environment, organizations always need ways to become more competitive. Predictive analytics found its way into organizations as one such tool. Using technology in the form of machine-learning algorithms, statistics, and data-mining techniques, organizations can uncover hidden patterns and trends in their data that can aid in operations and strategy and help fulfill critical business needs.

Embedding predictive analytics in operational decisions improves return on investment because organizations spend less time dealing with low-impact, low-risk operational decisions. Employees can focus more of their time on high-impact, high-risk decisions. For instance, most standard insurance claims can be automatically paid out. However, if the predictive model comes across a claim that's unusual (an outlier), or if the claim exhibits the same pattern as a fraudulent claim, the system can flag the claim automatically and send it to the appropriate person to take action.
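
As a rough sketch of that routing idea, the Python below auto-pays ordinary claims and flags unusual amounts for review. The excerpt doesn't say how an outlier is detected, so this example swaps in the modified z-score (based on the median and the median absolute deviation), a common rule of thumb. The function name, the sample claim amounts, and the 3.5 cutoff are assumptions, and matching a claim against known fraud patterns is not covered.

```python
from statistics import median

# Hypothetical amounts of past legitimate claims, in dollars.
HISTORY = [1200, 950, 1100, 1300, 1050, 990, 1150]


def route_claim(amount, history, threshold=3.5):
    """Return "auto-pay" for ordinary claims and "review" for outliers.

    Assumes `history` is a non-empty list of past claim amounts.
    """
    med = median(history)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # All past claims are (nearly) identical; any different amount is unusual.
        return "auto-pay" if amount == med else "review"
    # Modified z-score; 0.6745 rescales the MAD to be comparable to a standard deviation.
    score = 0.6745 * abs(amount - med) / mad
    return "review" if score > threshold else "auto-pay"
```

With the sample history, `route_claim(1250, HISTORY)` is paid automatically, while `route_claim(9000, HISTORY)` is sent to a person for review.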

By using predictive analytics to predict a future event or trend, the company can create a strategy to position itself to take advantage of that insight. If your predictive model is telling you (for example) that the trend in fashion is toward black turtlenecks, you can take appropriate actions to design more black-colored turtlenecks or design more accessories to go with the fashionable item.


Endless opportunities

Organizations around the world are striving to improve, compete, and be lean. They're looking to make their planning process more agile. They're investigating how to manage inventories and optimize the allocations of their human resources to best advantage. They're looking to act on opportunities as they arise in real time.

Predictive analytics can make all those goals more reachable. The domains to which predictive analytics can be applied are unlimited; the arena is wide open and everything is fair game. Let the mining start. Let the analysis begin.

Go to your analytics team and have them mine the data you've accumulated or acquired, with an eye toward finding an advantageous niche market for your product; innovate with data. Ask the team to help you gain confidence in your decision-making and risk management.

A saying often attributed to Albert Einstein goes, "Know where to find information and how to use it; that is the secret of success." If that's the secret to success, then you will succeed by using predictive analytics: The information is in your data and data mining will find it. The rest of the equation relies on your business knowledge of how to interpret that information and, ultimately, use it to create success.

Finding value in data equals success. Therefore we can rewrite our predictive analytics equation as

Data mining + business knowledge = predictive analytics = success


Empowering your organization

Predictive analytics empowers your organization by providing three advantages:

* Vision

* Decision

* Precision


Vision

Predictive analytics will lead you to see what is invisible to others — in particular, useful patterns in your data.

Predictive analytics can provide you with powerful hints to lend direction to the decisions you're about to make in your company's quest to retain customers, attract more customers, and maximize profits. Predictive analytics can go through a lot of past customer data, associate it with other pieces of data, and assemble all the pieces in the right order to solve that puzzle in various ways, including

* Categorizing your customers and speculating about their needs.

* Knowing your customers' wish lists.

* Guessing your customers' next actions.

* Categorizing your customers as loyal, seasonal, or wandering.


Knowing this type of information beforehand shapes your strategic planning and helps optimize resource allocation, increase customer satisfaction, and maximize your profits.


Decision

A well-made predictive analytics model provides analytical results free of emotion and bias. The model uses mathematical functions to derive forward insights from numbers and text that describe past facts and current information. The model provides you with consistent and unbiased insights to support your decisions.

Consider the scenario of a typical application for a credit card: The process takes a few minutes; the bank or agency makes a quick, fact-based decision on whether to extend credit, and is confident in its decision. The speed of that transaction is possible thanks to predictive analytics, which predicted the applicant's creditworthiness.


Precision

Imagine having to read a lot of reports, derive insights from the past facts buried in them, go through rows of Excel spreadsheets to compare results, or extract information from a large array of numbers. You'd need a staff to do these time-consuming tasks. With predictive analytics, you can use automated tools to do the job for you, saving time and resources, reducing human error, and improving precision.

For example, you can focus targeted marketing campaigns by examining the data you have about your customers, their demographics, and their purchases. When you know precisely which customers you should market to, you can zero in on those most likely to buy.


Starting a Predictive Analytic Project

For the moment, let's forget about algorithms and higher math; predictions are used in every aspect of our lives. Consider how many times you have said (or heard people say), "I told you that was going to happen."

If you want to predict a future event with any accuracy, however, you'll need to know the past and understand the current situation. Doing so entails several processes:

* Extract the facts that are currently happening.

* Distinguish present facts from those that just happened.

* Derive possible scenarios that could happen.

* Rank the scenarios according to how likely they are to happen.
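
The last step, ranking scenarios by likelihood, can be illustrated with a very simple estimate: how often each outcome occurred in the past. This is only one possible approach and not necessarily the one a real model would use; the function name and the sample outcomes are assumptions made for this sketch.

```python
from collections import Counter


def rank_scenarios(past_outcomes):
    """Rank outcomes from most to least likely, using past frequency as the estimate."""
    counts = Counter(past_outcomes)
    total = sum(counts.values())
    # most_common() orders outcomes by how often they occurred, highest first.
    return [(outcome, count / total) for outcome, count in counts.most_common()]


# Hypothetical history of what customers did at renewal time.
past = ["renews", "renews", "churns", "renews", "upgrades", "churns"]
ranked = rank_scenarios(past)
```

Here "renews" comes out on top with an estimated likelihood of 0.5, followed by "churns" and then "upgrades".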


Predictive analytics can help you with each of these processes, so that you know as much as you can about what has happened and can make better-informed decisions about the future.

Companies typically create predictive analytics solutions by combining three ingredients:

* Business knowledge

* Data-science team and technology

* The data


Though the proportion of the three ingredients will vary from one business to the next, all are required for a successful predictive analytic solution that yields actionable insights.


Business knowledge

Because any predictive analytics project is started to fulfill a business need, business-specific knowledge and a clear business objective are critical to its success. Ideas for a project can come from anyone within the organization, but it's up to the leadership team to set the business goals and get buy-in from the needed departments across the whole organization.

Be sure the decision-makers in your team are prepared to act. When you present a prototype of your project, it needs an in-house champion — someone who's going to push for its adoption.

The leadership team or domain experts must also set clear metrics: ways to quantify and measure the outcome of the project. Appropriate metrics keep the departments involved clear about what they need to do, how much they need to do, and whether what they're doing is helping the company achieve its business goals.

The business stakeholders are those who are most familiar with the domain of the business. They'll have ideas about which correlations — relationships between features — of data work and which don't, which variables are important to the model, and whether you should create new variables — as in derived features or attributes — to improve the model.
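
To make "correlation" and "derived feature" a little more concrete, here is a small Python sketch. It measures the relationship between two features with the Pearson correlation coefficient (one common measure, chosen for this example) and builds a new variable from two existing ones. The feature names and numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers.

    Assumes at least two values per list and that neither list is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical per-customer features.
visits = [1, 2, 3, 4, 5]
total_spend = [12, 19, 33, 41, 48]

# How strongly do the two features move together? (1.0 is a perfect positive relationship.)
r = pearson(visits, total_spend)

# A derived feature: average spend per visit, built from the two original features.
spend_per_visit = [spend / v for spend, v in zip(total_spend, visits)]
```

A value of `r` close to 1 says that customers who visit more also tend to spend more. Whether that relationship matters to the business, and whether `spend_per_visit` is worth adding to the model, is the kind of judgment the stakeholders bring.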

Business analysts and other domain experts can analyze and interpret the patterns discovered by the machines, making useful meaning out of the data patterns and deriving actionable insights.

This is an iterative process between business and science: building a model, interpreting its findings, and repeating. In the course of building a predictive model, you have to try successive versions of the model to improve how it works (which is what data experts mean when they say iterate the model over its lifecycle). You might go through a lot of revisions and repetitions before you can prove that your model is bringing real value to the business. Even after the predictive models are deployed, the business must monitor the results, validate the accuracy of the models, and improve them as more data is collected.
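
Monitoring a deployed model can start with something as simple as comparing its predictions with what actually happened. The sketch below uses plain accuracy (the share of correct predictions) and a made-up 80 percent floor; both function names and the threshold are assumptions for illustration, and a real project would choose metrics that fit its business goals.

```python
def accuracy(predictions, actuals):
    """Share of predictions that matched what actually happened."""
    if not actuals or len(predictions) != len(actuals):
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)


def needs_another_iteration(predictions, actuals, minimum=0.8):
    """Flag the model for revision when accuracy on new data drops below the floor."""
    return accuracy(predictions, actuals) < minimum
```

For example, `accuracy([1, 0, 1, 1], [1, 0, 0, 1])` is 0.75, so `needs_another_iteration` returns `True` for the same inputs and the model goes back for another round.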


(Continues...)

Excerpted from Predictive Analytics For Dummies by Anasse Bari, Mohamed Chaouchi, Tommy Jung. Copyright © 2014 John Wiley & Sons, Ltd. Excerpted by permission of John Wiley & Sons.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

INTRODUCTION

PART 1: GETTING STARTED WITH PREDICTIVE ANALYTICS

CHAPTER 1: Entering the Arena

Exploring Predictive Analytics

Mining data

Highlighting the model

Adding Business Value

Endless opportunities

Empowering your organization

Starting a Predictive Analytic Project

Business knowledge

Data-science team and technology

The Data

Ongoing Predictive Analytics

Forming Your Predictive Analytics Team

Hiring experienced practitioners

Demonstrating commitment and curiosity

Surveying the Marketplace

Responding to big data

Working with big data

CHAPTER 2: Predictive Analytics in the Wild

Online Marketing and Retail

Recommender systems

Personalized shopping on the Internet

Implementing a Recommender System

Collaborative filtering

Content-based filtering

Hybrid recommender systems

Target Marketing

Targeting using predictive modeling

Uplift modeling

Personalization

Online customer experience

Retargeting

Implementation

Optimizing using personalization

Similarities of Personalization and Recommendations

Content and Text Analytics

CHAPTER 3: Exploring Your Data Types and Associated Techniques

Recognizing Your Data Types

Structured and unstructured data

Static and streamed data

Identifying Data Categories

Attitudinal data

Behavioral data

Demographic data

Generating Predictive Analytics

Data-driven analytics

User-driven analytics

Connecting to Related Disciplines

Statistics

Data mining

Machine learning

CHAPTER 4: Complexities of Data

Finding Value in Your Data

Delving into your data

Data validity

Data variety

Constantly Changing Data

Data velocity

High volume of data

Complexities in Searching Your Data

Keyword-based search

Semantic-based search

Contextual search

Differentiating Business Intelligence from Big-Data Analytics

Exploration of Raw Data

Identifying data attributes

Exploring common data visualizations

Tabular visualizations

Word clouds

Flocking birds as a novel data representation

Graph charts

Common visualizations

PART 2: INCORPORATING ALGORITHMS IN YOUR MODELS

CHAPTER 5: Applying Models

Modeling Data

Models and simulation

Categorizing models

Describing and summarizing data

Making better business decisions

Healthcare Analytics Case Studies

Google Flu Trends

Cancer survivability predictors

Social and Marketing Analytics Case Studies

Target store predicts pregnant women

Twitter-based predictors of earthquakes

Twitter-based predictors of political campaign outcomes

Tweets as predictors for the stock market

Predicting variation of stock prices from news articles

Analyzing New York City’s bicycle usage

Predictions and responses

Data compression

Prognostics and its Relation to Predictive Analytics

The Rise of Open Data

CHAPTER 6: Identifying Similarities in Data

Explaining Data Clustering

Converting Raw Data into a Matrix

Creating a matrix of terms in documents

Term selection

Identifying Groups in Your Data

K-means clustering algorithm

Clustering by nearest neighbors

Density-based algorithms

Finding Associations in Data Items

Applying Biologically Inspired Clustering Techniques

Birds flocking: Flock by Leader algorithm

Ant colonies

CHAPTER 7: Predicting the Future Using Data Classification

Explaining Data Classification

Introducing Data Classification to Your Business

Exploring the Data-Classification Process

Using Data Classification to Predict the Future

Decision trees

Algorithms for Generating Decision Trees

Support vector machine

Ensemble Methods to Boost Prediction Accuracy

Naïve Bayes classification algorithm

The Markov Model

Linear regression

Neural networks

Deep Learning

PART 3: DEVELOPING A ROADMAP

CHAPTER 8: Convincing Your Management to Adopt Predictive Analytics

Making the Business Case

Gathering Support from Stakeholders

Presenting Your Proposal

CHAPTER 9: Preparing Data

Listing the Business Objectives

Processing Your Data

Identifying the data

Cleaning the data

Generating any derived data

Reducing the dimensionality of your data

Applying principal component analysis

Leveraging singular value decomposition

Working with Features

Structuring Your Data

Extracting, transforming and loading your data

Keeping the data up to date

Outlining testing and test data

CHAPTER 10: Building a Predictive Model

Getting Started

Defining your business objectives

Preparing your data

Choosing an algorithm

Developing and Testing the Model

Going Live with the Model

CHAPTER 11: Visualization of Analytical Results

Visualization as a Predictive Tool

Evaluating Your Visualization

Visualizing Your Model’s Analytical Results

Visualizing hidden groupings in your data

Visualizing data classification results

Visualizing outliers in your data

Visualization of Decision Trees

Visualizing predictions

Novel Visualization in Predictive Analytics

Big Data Visualization Tools

Tableau

Google Charts

Plotly

Infogram

PART 4: PROGRAMMING PREDICTIVE ANALYTICS

CHAPTER 12: Creating Basic Prediction Examples

Installing the Software Packages

Installing Python

Installing the machine-learning module

Installing the dependencies

Preparing the Data

Making Predictions Using Classification Algorithms

Creating a supervised learning model with SVM

Creating a supervised learning model with logistic regression

Creating a supervised learning model with random forest

Comparing the classification models

CHAPTER 13: Creating Basic Examples of Unsupervised Predictions

Getting the Sample Dataset

Using Clustering Algorithms to Make Predictions

Comparing clustering models

Creating an unsupervised learning model with K-means

Creating an unsupervised learning model with DBSCAN

Creating an unsupervised learning model with mean shift

CHAPTER 14: Predictive Modeling with R

Programming in R

Installing R

Installing RStudio

Getting familiar with the environment

Learning just a bit of R

Making Predictions Using R

Predicting using regression

Using classification to predict

Classification by random forest

CHAPTER 15: Avoiding Analysis Traps

Data Challenges

Outlining the limitations of the data

Dealing with extreme cases (outliers)

Data smoothing

Curve fitting

Keeping the assumptions to a minimum

Analysis Challenges

PART 5: EXECUTING BIG DATA

CHAPTER 16: Targeting Big Data

Major Technological Trends in Predictive Analytics

Exploring predictive analytics as a service

Aggregating distributed data for analysis

Real-time data-driven analytics

Applying Open-Source Tools to Big Data

Apache Hadoop

Apache Spark

CHAPTER 17: Getting Ready for Enterprise Analytics

Analytics as a Service

Google Analytics

IBM Watson

Microsoft Revolution R Enterprise

Preparing for a Proof-of-Value of Predictive Analytics Prototype

Prototyping for predictive analytics

Testing your predictive analytics model

PART 6: THE PART OF TENS

CHAPTER 18: Ten Reasons to Implement Predictive Analytics

CHAPTER 19: Ten Steps to Build a Predictive Analytic Model

INDEX
