The unique features of this book are the following.
1. This book is the third volume of a three-volume series of cookbooks entitled "Machine Learning in Medicine - Cookbooks One, Two, and Three". No other self-assessment works covering the field of machine learning have been published to date for the medical and health care community.
2. Each chapter of the book can be studied without the need to consult other chapters, and can, for the readership's convenience, be downloaded from the internet. Self-assessment examples are available at extras.springer.com.
3. An adequate command of machine learning methodologies is a requirement for physicians and other health workers, particularly now, because the number of medical computer data files currently doubles every 20 months, and because it will soon be impossible for them to make proper data-based health decisions without the help of machine learning.
4. Given the importance of knowledge of machine learning in the medical and health care community, and the current lack of it, the intended readership consists of all physicians and health workers.
5. The book was written in simple language in order to enhance readability, not only for the advanced but also for novices.
6. The book is multipurpose: it is an introduction for the ignorant, a primer for the inexperienced, and a self-assessment handbook for the advanced.
7. The book was written particularly for jaded physicians and other health care professionals who lack the time to read the entire series of three textbooks.
8. Like the other two cookbooks, it contains technical descriptions and self-assessment examples of 20 important computer methodologies for medical data analysis, and it largely skips the theoretical and mathematical background.
9. Information on the theoretical and mathematical background of the methods described is given in a "notes" section at the end of each chapter.
10. Unlike traditional statistical methods, machine learning methodologies are able to analyze big data, including thousands of cases and hundreds of variables.
11. The medical and health care community is little aware of the multidimensional nature of current medical data files, and experimental clinical studies do not help to raise that awareness, because these studies usually assume that subgroup characteristics are unimportant as long as the study is randomized. This is, of course, untrue, because any subgroup characteristic may be vital to an individual at risk.
12. To date, except for a three-volume introductory series on the subject entitled "Machine Learning in Medicine Part One, Two, and Three, 2013, Springer Heidelberg Germany" by the same authors, and the current cookbook series, no books on machine learning in medicine have been published.
13. Another unique feature of the cookbooks is that they were jointly written by two authors from different disciplines, one a clinician/clinical pharmacologist, the other a mathematician/biostatistician.
14. The authors have also jointly been teaching at universities and institutions throughout Europe and the USA for the past 20 years.
15. The authors have managed to cover the field of medical data analysis in a nonmathematical way for the benefit of medical and health workers.
16. The authors have already successfully published many statistics textbooks and self-assessment books, e.g., the 67-chapter textbook entitled "Statistics Applied to Clinical Studies 5th Edition, 2012, Springer Heidelberg Germany", with downloads of 62,826 copies.
17. The current cookbook makes use, in addition to SPSS statistical software, of various free calculators from the internet, as well as the Konstanz Information Miner (Knime), a widely approved free machine learning package, and the free Weka Data Mining package from New Zealand.
18. The above software packages offer hundreds of nodes, the basic processing units, which cover virtually all of the statistical and data mining methods. They can be used not only for data analysis, but also for appropriate data storage.
19. The current cookbook shows, particularly for those with little affinity for value tables, that data mining in the form of a visualization process is entirely feasible, and often more revealing than traditional statistics.
20. The Knime and Weka data miners use widely available Excel data files.
21. In current clinical research, prospective cohort studies are increasingly replacing costly controlled clinical trials, and modern machine learning methodologies like probit and tobit regression, as well as neural networks, Bayesian networks, and support vector machines, prove to fit their analysis better than traditional statistical methods do.
22. The current cookbook includes concise descriptions not only of standard machine learning methods, but also of more recent methods like the linear machine learning models using ordinal and loglinear regression.
23. Machine learning increasingly uses evolutionary operation methodologies. This subject is also covered.
24. All of the methods described have been applied in the authors' own research prior to this publication.
Table of Contents
I. Cluster Models
1. Hierarchical Clustering and K-means Clustering to Identify Subgroups in Surveys
2. Density-based Clustering to Identify Outlier Groups in Otherwise Homogeneous Data
3. Two Step Clustering to Identify Subgroups and Predict Subgroup Memberships
II. Linear Models
4. Linear, Logistic, and Cox Regression for Outcome Prediction with Unpaired Data
5. Generalized Linear Models for Outcome Prediction with Paired Data
6. Generalized Linear Models for Predicting Event-Rates
7. Factor Analysis and Partial Least Squares (PLS) for Complex-Data Reduction
8. Optimal Scaling of High-sensitivity Analysis of Health Predictors
9. Discriminant Analysis for Making a Diagnosis from Multiple Outcomes
10. Weighted Least Squares for Adjusting Efficacy Data with Inconsistent Spread
11. Partial Correlations for Removing Interaction Effects from Efficacy Data
12. Canonical Regression for Overall Statistics of Multivariate Data
III. Rules Models
13. Neural Networks for Assessing Relationships that are Typically Nonlinear
14. Complex Samples Methodologies for Unbiased Sampling
15. Correspondence Analysis for Identifying the Best of Multiple Treatments in Multiple
16. Decision Trees for Decision Analysis
17. Multidimensional Scaling for Visualizing Experienced Drug Efficacies
18. Stochastic Processes for Long Term Predictions from Short Term Observations
19. Optimal Binning for Finding High Risk Cut-offs
20. Conjoint Analysis for Determining the Most Appreciated Properties of Medicines to be Developed