Table of Contents
I Introduction 1
1 Introduction 3
1.1 Defining the Area 3
1.2 A Typical Architecture of a Multimedia Data Mining System 7
1.3 The Content and the Organization of This Book 8
1.4 The Audience of This Book 10
1.5 Further Readings 11
II Theory and Techniques 13
2 Feature and Knowledge Representation for Multimedia Data 15
2.1 Introduction 15
2.2 Basic Concepts 16
2.2.1 Digital Sampling 17
2.2.2 Media Types 18
2.3 Feature Representation 22
2.3.1 Statistical Features 23
2.3.2 Geometric Features 29
2.3.3 Meta Features 32
2.4 Knowledge Representation 32
2.4.1 Logic Representation 33
2.4.2 Semantic Networks 34
2.4.3 Frames 36
2.4.4 Constraints 38
2.4.5 Uncertainty Representation 41
2.5 Summary 44
3 Statistical Mining Theory and Techniques 45
3.1 Introduction 45
3.2 Bayesian Learning 47
3.2.1 Bayes Theorem 47
3.2.2 Bayes Optimal Classifier 49
3.2.3 Gibbs Algorithm 50
3.2.4 Naive Bayes Classifier 50
3.2.5 Bayesian Belief Networks 52
3.3 Probabilistic Latent Semantic Analysis 56
3.3.1 Latent Semantic Analysis 57
3.3.2 Probabilistic Extension to Latent Semantic Analysis 58
3.3.3 Model Fitting with the EM Algorithm 60
3.3.4 Latent Probability Space and Probabilistic Latent Semantic Analysis 61
3.3.5 Model Overfitting and Tempered EM 62
3.4 Latent Dirichlet Allocation for Discrete Data Analysis 63
3.4.1 Latent Dirichlet Allocation 64
3.4.2 Relationship to Other Latent Variable Models 66
3.4.3 Inference in LDA 69
3.4.4 Parameter Estimation in LDA 70
3.5 Hierarchical Dirichlet Process 72
3.6 Applications in Multimedia Data Mining 73
3.7 Support Vector Machines 74
3.8 Maximum Margin Learning for Structured Output Space 81
3.9 Boosting 88
3.10 Multiple Instance Learning 91
3.10.1 Establish the Mapping between the Word Space and the Image-VRep Space 93
3.10.2 Word-to-Image Querying 95
3.10.3 Image-to-Image Querying 95
3.10.4 Image-to-Word Querying 96
3.10.5 Multimodal Querying 96
3.10.6 Scalability Analysis 97
3.10.7 Adaptability Analysis 97
3.11 Semi-Supervised Learning 101
3.11.1 Supervised Learning 104
3.11.2 Semi-Supervised Learning 106
3.11.3 Semiparametric Regularized Least Squares 109
3.11.4 Semiparametric Regularized Support Vector Machines 111
3.11.5 Semiparametric Regularization Algorithm 113
3.11.6 Transductive Learning and Semi-Supervised Learning 113
3.11.7 Comparisons with Other Methods 114
3.12 Summary 115
4 Soft Computing Based Theory and Techniques 117
4.1 Introduction 117
4.2 Characteristics of the Paradigms of Soft Computing 118
4.3 Fuzzy Set Theory 119
4.3.1 Basic Concepts and Properties of Fuzzy Sets 119
4.3.2 Fuzzy Logic and Fuzzy Inference Rules 123
4.3.3 Fuzzy Set Application in Multimedia Data Mining 124
4.4 Artificial Neural Networks 125
4.4.1 Basic Architectures of Neural Networks 125
4.4.2 Supervised Learning in Neural Networks 131
4.4.3 Reinforcement Learning in Neural Networks 136
4.5 Genetic Algorithms 140
4.5.1 Genetic Algorithms in a Nutshell 140
4.5.2 Comparison of Conventional and Genetic Algorithms for an Extremum Search 145
4.6 Summary 150
III Multimedia Data Mining Application Examples 153
5 Image Database Modeling - Semantic Repository Training 155
5.1 Introduction 155
5.2 Background 156
5.3 Related Work 157
5.4 Image Features and Visual Dictionaries 159
5.4.1 Image Features 159
5.4.2 Visual Dictionary 160
5.5 α-Semantics Graph and Fuzzy Model for Repositories 163
5.5.1 α-Semantics Graph 163
5.5.2 Fuzzy Model for Repositories 166
5.6 Classification Based Retrieval Algorithm 168
5.7 Experiment Results 170
5.7.1 Classification Performance on a Controlled Database 170
5.7.2 Classification Based Retrieval Results 172
5.8 Summary 180
6 Image Database Modeling - Latent Semantic Concept Discovery 181
6.1 Introduction 181
6.2 Background and Related Work 182
6.3 Region Based Image Representation 185
6.3.1 Image Segmentation 185
6.3.2 Visual Token Catalog 188
6.4 Probabilistic Hidden Semantic Model 191
6.4.1 Probabilistic Database Model 191
6.4.2 Model Fitting with EM 192
6.4.3 Estimating the Number of Concepts 194
6.5 Posterior Probability Based Image Mining and Retrieval 194
6.6 Approach Analysis 196
6.7 Experimental Results 199
6.8 Summary 205
7 A Multimodal Approach to Image Data Mining and Concept Discovery 209
7.1 Introduction 209
7.2 Background 210
7.3 Related Work 211
7.4 Probabilistic Semantic Model 213
7.4.1 Probabilistically Annotated Image Model 213
7.4.2 EM Based Procedure for Model Fitting 215
7.4.3 Estimating the Number of Concepts 216
7.5 Model Based Image Annotation and Multimodal Image Mining and Retrieval 217
7.5.1 Image Annotation and Image-to-Text Querying 217
7.5.2 Text-to-Image Querying 218
7.6 Experiments 219
7.6.1 Dataset and Feature Sets 220
7.6.2 Evaluation Metrics 221
7.6.3 Results of Automatic Image Annotation 221
7.6.4 Results of Single Word Text-to-Image Querying 224
7.6.5 Results of Image-to-Image Querying 224
7.6.6 Results of Performance Comparisons with Pure Text Indexing Methods 226
7.7 Summary 228
8 Concept Discovery and Mining in a Video Database 231
8.1 Introduction 231
8.2 Background 232
8.3 Related Work 233
8.4 Video Categorization 235
8.4.1 Naive Bayes Classifier 237
8.4.2 Maximum Entropy Classifier 238
8.4.3 Support Vector Machine Classifier 240
8.4.4 Combination of Meta Data and Content Based Classifiers 241
8.5 Query Categorization 242
8.6 Experiments 244
8.6.1 Data Sets 244
8.6.2 Video Categorization Results 246
8.6.3 Query Categorization Results 251
8.6.4 Search Relevance Results 253
8.7 Summary 255
9 Concept Discovery and Mining in an Audio Database 257
9.1 Introduction 257
9.2 Background and Related Work 258
9.3 Feature Extraction 260
9.4 Classification Method 263
9.5 Experimental Results 263
9.6 Summary 269
References 271
Index 291