Table of Contents
1 Sports Data Mining: The Field 1
Chapter Overview 1
1 Definition 2
2 History 5
3 Societal Dimensions 8
4 The International Landscape 10
5 Criticisms 12
6 Questions for Discussion 13
2 Sports Data Mining Methodology 15
Chapter Overview 15
1 Scientific Foundation 16
2 Traditional Data Mining Applications 18
3 Deriving Knowledge 20
4 Questions for Discussion 21
3 Data Sources for Sports 23
Chapter Overview 23
1 Introduction 23
2 Professional Societies 24
2.1 The Society for American Baseball Research 24
2.2 Association for Professional Basketball Research 24
2.3 Professional Football Researchers Association 25
3 Sport-Related Associations 25
3.1 The International Association on Computer Science in Sport 25
3.2 The International Association for Sports Information 26
4 Special Interest Sources 26
4.1 Baseball 26
4.2 Basketball 26
4.3 Football 27
4.4 Cricket 27
4.5 Soccer 27
4.6 Multiple Sports 28
5 Conclusions 28
6 Questions for Discussion 28
4 Research in Sports Statistics 29
Chapter Overview 29
1 Introduction 29
2 Sports Statistics 29
2.1 History and Inherent Problems of Statistics in Sports 30
2.2 Bill James 31
2.3 Dean Oliver 32
3 Baseball Research 32
3.1 Building Blocks 33
3.2 Runs Created 33
3.3 Win Shares 35
3.4 Linear Weights and Total Player Rating 35
3.5 Pitching Measures 36
4 Basketball Research 37
4.1 Shot Zones 37
4.2 Player Efficiency Rating 38
4.3 Plus/Minus Rating 38
4.4 Measuring Player Contribution to Winning 39
4.5 Rating Clutch Performances 39
5 Football Research 40
5.1 Defense-Adjusted Value Over Average 40
5.2 Defense-Adjusted Points Above Replacement 41
5.3 Adjusted Line Yards 41
6 Emerging Research in Other Sports 41
6.1 NCAA Bowl Championship Series 42
6.2 NCAA Men's Basketball Tournament 42
6.3 Soccer 43
6.4 Cricket 43
6.5 Olympic Curling 44
7 Conclusions 44
8 Questions for Discussion 44
5 Tools and Systems for Sports Data Analysis 45
Chapter Overview 45
1 Introduction 45
2 Sports Data Mining Tools 46
2.1 Advanced Scout 46
2.2 Synergy Online 47
2.3 SportsVis 47
2.4 Sports Data Hub 48
3 Scouting Tools 49
3.1 Digital Scout 49
3.2 Inside Edge 49
4 Sports Fraud Detection 50
4.1 Las Vegas Sports Consultants 52
4.2 Offshore Gaming 53
5 Conclusions 53
6 Questions for Discussion 53
6 Predictive Modeling for Sports and Gaming 55
Chapter Overview 55
1 Introduction 55
2 Statistical Simulations 56
2.1 Baseball 56
2.2 Basketball's BBall 57
2.3 Other Sporting Simulations 58
3 Machine Learning 58
3.1 Soccer 58
3.2 Greyhound and Thoroughbred Racing 59
3.3 Commercial Products 60
4 Conclusions 63
5 Questions for Discussion 63
7 Multimedia and Video Analysis for Sports 65
Chapter Overview 65
1 Introduction 65
2 Searchable Video 66
2.1 SoccerQ 67
2.2 Blinkx 68
2.3 Clipta 68
2.4 Sports VHL 69
2.5 Truveo 69
2.6 Bluefin Labs 69
3 Motion Analysis 69
4 Conclusions 70
5 Questions for Discussion 70
8 Web Sports Data Extraction and Visualization 71
Chapter Overview 71
1 Introduction 71
2 Web Data Sources 72
2.1 Baseball 72
2.2 Basketball 74
2.3 Cricket 77
2.4 Football 78
2.5 Hockey 81
2.6 Soccer 82
2.7 Other Sport Sources 83
3 Extracting Data 84
3.1 Programs 85
4 Conclusions 87
5 Questions for Discussion 87
9 Open Source Data Mining Tools for Sports 89
Chapter Overview 89
1 Introduction 89
2 WEKA 89
3 RapidMiner 91
4 Conclusions 92
5 Questions for Discussion 92
10 Greyhound Racing Using Neural Networks: A Case Study 93
Chapter Overview 93
1 Introduction 93
2 Setting Up the Experiments 94
3 Testing ID3 96
4 Testing the Backpropagation Neural Network 98
5 The Results 98
6 Conclusions 99
7 Questions for Discussion 100
11 Greyhound Racing Using Support Vector Machines: A Case Study 101
Chapter Overview 101
1 Introduction 101
2 Relevant Literature 102
3 Research Methodology 103
3.1 Data Acquisition 105
3.2 Support Vector Machines Algorithm 105
4 Results 106
5 Conclusions 108
6 Questions for Discussion 108
12 Betting and Gaming 109
Chapter Overview 109
1 Introduction 109
2 The Effects of Gambling on Sports 109
3 Sportsbooks and Offshore Betting 111
4 Arbitrage Methods 112
5 Cautions and Gambling Pitfalls 113
6 Conclusions 113
7 Questions for Discussion 114
13 Conclusions 115
Chapter Overview 115
1 Sports Data Mining Challenges 115
2 Sports Data Mining Audience 116
3 Future Directions 117
References 119
Index 127