Fundamentals of Biostatistics / Edition 8 available in Hardcover
FUNDAMENTALS OF BIOSTATISTICS leads you through the methods, techniques, and computations of statistics necessary for success in the medical field. Every new concept is developed systematically through completely worked out examples from current medical research problems.
Edition description: New Edition
Product dimensions: 8.00(w) x 10.00(h) x 1.60(d)
About the Author
Bernard Rosner is Professor in the Department of Medicine, Harvard Medical School, and the Department of Biostatistics at the Harvard School of Public Health. Dr. Rosner's research activities currently include longitudinal data analysis, analysis of clustered continuous, binary and ordinal data, methods for the adjustment of regression models for measurement error, and modeling of cancer incidence data.
Read an Excerpt
Chapter 1: General Overview
Statistics is the science whereby inferences are made about specific random phenomena on the basis of relatively limited sample material. The field of statistics can be subdivided into two main areas: mathematical statistics and applied statistics. Mathematical statistics concerns the development of new methods of statistical inference and requires detailed knowledge of abstract mathematics for its implementation. Applied statistics concerns the application of the methods of mathematical statistics to specific subject areas, such as economics, psychology, and public health. Biostatistics is the branch of applied statistics that concerns the application of statistical methods to medical and biological problems.
A good way to learn about biostatistics and its role in the research process is to follow the flow of a research study from its inception at the planning stage to its completion, which usually occurs when a manuscript reporting the results of the study is published. As an example, I will describe one such study in which I participated.
A friend called one morning and in the course of our conversation mentioned that he had recently used a new, automated blood-pressure device of the type seen in many banks, hotels, and department stores. The machine had read his average diastolic blood pressure on several occasions as 115 mm Hg; the highest reading was 130 mm Hg. I was horrified to hear of his experience, since if these readings were true, my friend might be in imminent danger of having a stroke or developing some other serious cardiovascular disease. I referred him to a clinical colleague of mine who, using a standard blood-pressure cuff, measured my friend's diastolic blood pressure as 90 mm Hg. The contrast in the readings aroused my interest, and I began to jot down the readings on the digital display every time I passed the machine at my local bank. I got the distinct impression that a large percentage of the reported readings were in the hypertensive range. Although one would expect that hypertensives would be more likely to use such a machine, I still believed that blood-pressure readings obtained with the machine might not be comparable with those obtained using standard methods of blood-pressure measurement. I spoke to Dr. B. Frank Polk about my suspicion and succeeded in interesting him in a small-scale evaluation of such machines. We decided to send a human observer who was well trained in blood-pressure measurement techniques to several of these machines. He would offer to pay subjects 50¢ for the cost of using the machine if they would agree to fill out a short questionnaire and have their blood pressure measured by both a human observer and the machine.
At this stage we had to make several important decisions, each of which would prove vital to the success of the study. The decisions were based on the following questions:
(1) How many machines should we test?
(2) How many people should we test at each machine?
(3) In what order should the measurements be taken: should the human observer or the machine be used first? Ideally, we would have preferred to avoid this problem by taking both the human and machine readings simultaneously, but this procedure was logistically impossible.
(4) What other data should we collect on the questionnaire that might influence the comparison between methods?
(5) How should the data be recorded to facilitate their computerization at a later date?
(6) How should the accuracy of the computerized data be checked?
We resolved these problems as follows:
(1) and (2) We decided to test more than one machine (four to be exact), since we were not sure if the machines were comparable in quality. However, we wanted to sample enough subjects from each machine so that we would have an accurate comparison of the standard and automated methods for each machine. We tried to predict how large a discrepancy there might be between the two methods. Using the methods of sample-size determination discussed in this book, we calculated that we would need 100 subjects at each site to have an accurate comparison.
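The excerpt does not show the sample-size calculation itself, but the standard formula for comparing two means, which the book develops later, can be sketched as follows. The numbers used here (a 4 mm Hg detectable difference with a standard deviation of 10 mm Hg) are purely illustrative, not the study's actual inputs:

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_means(delta, sigma, alpha=0.05, power=0.80):
    """Subjects needed per group to detect a true mean difference `delta`
    between two measurement methods, assuming a common standard deviation
    `sigma`, using a two-sided test at significance level `alpha`."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = z(power)            # quantile corresponding to desired power
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return ceil(n)               # round up to a whole subject

# Illustrative only: detect a 4 mm Hg difference, sd 10 mm Hg, 80% power
print(sample_size_two_means(delta=4, sigma=10))
```

Note how the required sample size grows with the square of sigma/delta: halving the detectable difference quadruples the number of subjects per group.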
(3) We then had to decide in what order the measurements should be taken for each person. According to some reports, one problem that occurs with repeated blood-pressure measurements is that people tense up at the initial measurement, yielding higher blood pressure than at subsequent repeated measurements. Thus, we would not always want to use the automated or manual method first, since the effect of the method would be confounded with the order-of-measurement effect. A conventional technique that we used here was to randomize the order in which the measurements were taken, so that for any person it was equally likely that the machine or the human observer would take the first measurement. This random pattern could be implemented by flipping a coin or, more likely, by using a table of random numbers as appears in Table 4 of the Appendix.
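The per-subject coin flip described above is easy to simulate in software; the sketch below (function name and labels are my own, not from the book) plays the role of the coin or the random-number table:

```python
import random

def assign_order(n_subjects, seed=None):
    """For each subject, randomly decide whether the machine or the human
    observer takes the first reading -- a fair coin flip per subject."""
    rng = random.Random(seed)  # seed fixed only for reproducible illustration
    return ["machine first" if rng.random() < 0.5 else "observer first"
            for _ in range(n_subjects)]

# A sequence of randomized orders for five subjects
print(assign_order(5, seed=1))
```

Because each subject's assignment is an independent fair flip, any systematic order-of-measurement effect averages out over the two methods rather than biasing one of them.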
(4) We felt that the major extraneous factor that might influence the results would be body size, since we might have more difficulty getting accurate readings from people with fatter arms than from those with leaner arms. We also wanted to get some idea of the type of people who use these machines; so we asked questions about age, sex, and previous hypertensive history.
(5) To record the data, we developed a coding form that could be filled out on site and from which data could be easily entered on a computer terminal for subsequent analysis. Each person in the study was assigned an identification (ID) number by which the computer could uniquely identify that person. The data on the coding forms were then keyed and verified. That is, the same form was entered twice, and a comparison was made between the two records to make sure they were the same. If the records were not the same, the form was reentered.
(6) After data entry we ran some editing programs to ensure that the data were accurate. Checking each item on each form was impossible because of the large amount of data. Instead, we checked that the values for individual variables were within specified ranges and printed out aberrant values for manual checking. For example, we checked that all blood-pressure readings were at least 50 and no more than 300 and printed out all readings that fell outside this range.
After completing the data-collection, data-entry, and data-editing phases, we were ready to look at the results of the study. The first step in this process is to get a general feel for the data by summarizing the information in the form of several descriptive...
Table of Contents
Preface. 1. General Overview. 2. Descriptive Statistics. Introduction. Measures of Location. Some Properties of the Arithmetic Mean. Measures of Spread. Some Properties of the Variance and Standard Deviation. The Coefficient of Variation. Grouped Data. Graphic Methods. Case Study 1: Effects of Lead Exposure on Neurological and Psychological Function in Children. Case Study 2: Effects of Tobacco Use on Bone-Mineral Density in Middle-Aged Women. Obtaining Descriptive Statistics on the Computer. Summary. Problems. 3. Probability. Introduction. Definition of Probability. Some Useful Probabilistic Notation. The Multiplication Law of Probability. The Addition Law of Probability. Conditional Probability. Bayes' Rule and Screening Tests. Bayesian Inference. ROC Curves. Prevalence and Incidence. Summary. Problems. 4. Discrete Probability Distributions. Introduction. Random Variables. The Probability-Mass Function for a Discrete Random Variable. The Expected Value of a Discrete Random Variable. The Variance of a Discrete Random Variable. The Cumulative-Distribution Function of a Discrete Random Variable. Permutations and Combinations. The Binomial Distribution. Expected Value and Variance of the Binomial Distribution. The Poisson Distribution. Computation of Poisson Probabilities. Expected Value and Variance of the Poisson Distribution. Poisson Approximation to the Binomial Distribution. Summary. Problems. 5. Continuous Probability Distributions. Introduction. General Concepts. The Normal Distribution. Properties of the Standard Normal Distribution. Conversion from an N(μ,σ²) Distribution to an N(0,1) Distribution. Linear Combinations of Random Variables. Normal Approximation to the Binomial Distribution. Normal Approximation to the Poisson Distribution. Summary. Problems. 6. Estimation. Introduction. The Relationship Between Population and Sample. Random-Number Tables. Randomized Clinical Trials. Estimation of the Mean of a Distribution. 
Case Study: Effects of Tobacco Use on Bone-Mineral Density in Middle-Aged Women. Estimation of the Variance of a Distribution. Estimation for the Binomial Distribution. Estimation for the Poisson Distribution. One-Sided CIs. The Bootstrap. Summary. Problems. 7. Hypothesis Testing: One-Sample Inference. Introduction. General Concepts. One-Sample Test for the Mean of a Normal Distribution: One-Sided Alternatives. One-Sample Test for the Mean of a Normal Distribution: Two-Sided Alternatives. The Relationship Between Hypothesis Testing and Confidence Intervals. The Power of a Test. Sample-Size Determination. One-Sample χ2 Test for the Variance of a Normal Distribution. One-Sample Inference for the Binomial Distribution. One-Sample Inference for the Poisson Distribution. Case Study: Effects of Tobacco Use on Bone-Mineral Density in Middle-Aged Women. Derivation of Selected Formulas. Summary. Problems. 8. Hypothesis Testing: Two-Sample Inference. Introduction. The Paired t Test. Interval Estimation for the Comparison of Means from Two Paired Samples. Two-Sample t Test for Independent Samples with Equal Variances. Interval Estimation for the Comparison of Means from Two Independent Samples (Equal Variance Case). Testing for the Equality of Two Variances. Two-Sample t Test for Independent Samples with Unequal Variances. Case Study: Effects of Lead Exposure on Neurologic and Psychological Function in Children. Estimation of Sample Size and Power for Comparing Two Means. The Treatment of Outliers. Derivation of Equation 8.13. Summary. Problems. 9. Nonparametric Methods. Introduction. The Sign Test. The Wilcoxon Signed-Rank Test. The Wilcoxon Rank-Sum Test. Case Study: Effects of Lead Exposure on Neurologic and Psychological Function in Children. Permutation Tests. Summary. Problems. 10. Hypothesis Testing: Categorical Data. Introduction. Two-Sample Test for Binomial Proportions. Fisher's Exact Test. 
Two-Sample Test for Binomial Proportions for Matched-Pair Data (McNemar's Test). Estimation of Sample Size and Power for Comparing Two Binomial Proportions. R x C Contingency Tables. Chi-Square Goodness-of-Fit Test. The Kappa Statistic. Derivation of Selected Formulas. Summary. Problems. 11. Regression and Correlation Methods. Introduction. General Concepts. Fitting Regression Lines - The Method of Least Squares. Inferences About Parameters from Regression Lines. Interval Estimation for Linear Regression. Assessing the Goodness of Fit of Regression Lines. The Correlation Coefficient. Statistical Inference for Correlation Coefficients. Multiple Regression. Case Study: Effects of Lead Exposure on Neurologic and Psychological Function in Children. Partial and Multiple Correlation. Rank Correlation. Interval Estimation for Rank-Correlation Coefficients. Derivation of Selected Formulas. Summary. Problems. 12. Multisample Inference. Introduction to the One-Way Analysis of Variance. One-Way ANOVA - Fixed-Effects Model. Hypothesis Testing in One-Way ANOVA - Fixed-Effects Model. Comparisons of Specific Groups in One-Way ANOVA. Case Study: Effects of Lead Exposure on Neurologic and Psychological Function in Children. Two-Way ANOVA. The Kruskal-Wallis Test. One-Way ANOVA - The Random-Effects Model. The Intraclass Correlation Coefficient. Mixed Models. Derivation of Equation 12.30. Summary. Problems. 13. Design and Analysis Techniques for Epidemiologic Studies. Introduction. Study Design. Measures of Effect for Categorical Data. Attributable Risk. Confounding and Standardization. Methods of Inference for Stratified Categorical Data - The Mantel-Haenszel Test. Multiple Logistic Regression. Extensions to Logistic Regression. Sample Size Estimation for Logistic Regression. Meta-Analysis. Equivalence Studies. The Cross-Over Design. Clustered Binary Data. Longitudinal Data Analysis. Measurement-Error Methods. Missing Data. Derivation of Selected Formulas. Summary. Problems. 14. 
Hypothesis Testing: Person-Time Data. Measure of Effect for Person-Time Data. One-Sample Inference for Incidence-Rate Data. Two-Sample Inference for Incidence-Rate Data. Power and Sample-Size Estimation for Person-Time Data. Inference for Stratified Person-Time Data. Power and Sample-Size Estimation for Stratified Person-Time Data. Testing for Trend: Incidence-Rate Data. Introduction to Survival Analysis. Estimation of Survival Curves: The Kaplan-Meier Estimator. The Log-Rank Test. The Proportional-Hazards Model. Power and Sample-Size Estimation under the Proportional-Hazards Model. Parametric Survival Analysis. Parametric Regression Models for Survival Data. Derivation of Selected Formulas. Summary. Problems. Appendix. Tables. Answers to Selected Problems. FLOWCHART: Methods of Statistical Inference. Index of Data Sets. Index of Statistical Software. Index.