The Subjectivity of Scientists and the Bayesian Approach
By S. James Press and Judith M. Tanur
Dover Publications, Inc.
Copyright © 2001 S. James Press and Judith M. Tanur
All rights reserved.
ISBN: 978-0-486-81045-4
CHAPTER 1
Introduction
This is a book about science, about scientists, and about the methods that scientists use. We will show that the most famous scientists in history have all used their hunches, beliefs, intuition, and deep understanding of the processes they studied, to one extent or another, to arrive at their conclusions. The reader will see that the oft-expressed notion that science is "objective" is only partially true; in fact, science is really a combination of both subjective and objective views.
In this book we tell the stories of 12 of the most famous scientists throughout history, from Aristotle, the philosopher who lived during the era of the ancient Greeks, to Albert Einstein, who lived during the twentieth century. In each case, we discuss how the scientist's preconceived beliefs informally influenced his or her scientific conclusions.
In Chapter 2 we explain that we did not choose these particular 12 scientists to study; they were selected for us. In Chapter 3 we tell the stories of five other very celebrated cases of famous scientists who may have stepped over the lines of acceptable scientific practice. Such overstepping arose either because their convictions about the correctness of their ideas led them to see or accept as accurate only what their theory predicted, or from their zeal to convince the world of the scientific merits of their work.
A major portion of the book is devoted to the stories of our 12 most famous scientists in history (Chapter 4). We examine the lives of these people, their scientific contributions, and the ways in which they used their beliefs together with the results of their scientific experiments to carry out their research. In a final section of Chapter 4 we conjecture about what these scientists might have done had modern methods been available to them for combining their preconceptions about the processes they were studying with their experimental data.
Finally, in Chapter 5, we examine how subjectivity is being used in science in modern times. That discussion focuses on Bayesian statistical science. But first, in this introductory chapter, we provide some definitions and background for what is to come.
1.1 SUBJECTIVITY AND OBJECTIVITY
Because we will be using the words subjectivity and objectivity frequently in this book, we must first explain how we are using these terms. We intend to use them in several ways. But we will always be thinking about observational data, about the distinction between beliefs held by a scientist about a phenomenon prior to collecting the data, and beliefs held by the scientist after the data have been collected and analyzed.
In common usage, certain entities are seen to have only a subjective reality, in that those entities are constructs based on views and beliefs that were formed out of the human mind. On the other hand, the term objective reality is used for entities that exist outside the minds of individuals, in that they exist in the world regardless of whether a person perceives them. For example, the opinions people hold about a political or social issue are personal beliefs that have only a subjective reality, whereas our starting point with respect to external reality is that the moon, the sun, and the planets would all exist regardless of whether or not human beings perceived them. But objective entities need not be corporeal. For example, all the scientific laws that govern the behavior of the physical world would exist regardless of their codification by human beings or humans' belief in them.
The crux of our usage makes these terms specific to the scientific endeavor. When anything is observed or measured by human beings, human perception and sensory mechanisms are involved, and the resulting observations that are collected then depend on subjectively based distortions of objective reality. Further, after entities in the objective world are observed (perceived) by a human being, whether through the senses, with the aid of measuring instruments, or both, the resulting measurements (called data) are interpreted by a human being in a subjective way that reflects the person's own experience, understanding, and preconceived notions and beliefs about the entities, the object, phenomenon, or construct being measured. The interpretation of such data is also affected by the person's state of mind and state of senses at the time of the interpretation. We use the term subjective, or subjectivity, to refer to preexisting views or beliefs about entities that influence both the gathering of data and their interpretation.
Broadening our interpretation of the term, we will also use subjectivity to mean a person's intuition, belief, and understanding about some proposition or hypothesis prior to that person collecting observational data that bear on the proposition, or prior to that person obtaining information about the hypothesis. Because the views about the hypothesis that we are talking about are those that a particular person held prior to that person having collected data, or they are his or her views prior to being told about the values of data that have been collected, and because these views may well differ across individuals, we call those views or beliefs subjective.
To define objectivity in an experimental context, we appeal to a (grossly oversimplified) description of the old-fashioned textbook image of how science proceeds and how scientists behave. In this image, science and scientists are objective in the following sense. A hypothesis is developed (the passive voice here is used to signify that little attention is paid to the origin of the hypothesis) and the scientist designs a study to test this hypothesis. After data gathering, whether by designed experiment or by carefully carried out observation in a nonexperimental setting, the scientist dispassionately evaluates the results and their implications for the hypothesis. If the results support the hypothesis, the scientist writes up the study for publication; if they do not, the scientist, again dispassionately, abandons the hypothesis as being wrong, and either revises the hypothesis in light of the new findings and repeats the cycle, or goes on to other concerns.
The subjectivity that we intend to demonstrate that scientists use routinely is both less and more than the opposite of this kind of mythical objectivity. We surely do not suggest that scientists ignore the objective reality that is "out there," nor that they ignore the guidance offered by the results of their investigations. And surely historians of science have long since made it clear that scientists are considerably more human than our oversimplified portrait of a dispassionate automaton would indicate. Scientists care about their work, care about their results, and even care about the recognition that will come to them with successful discoveries. These factors often drive the way in which they carry out their research and the way they report it to the world. Despite these factors, some observers of scientific methodology still describe it as "objective."
Strongly held personal beliefs and hypotheses about scientific phenomena, the stuff of what we refer to as subjectivity, are sometimes so strongly held that a scientist will, under their influence, announce confirmatory results from experiments not yet carried out. Such a practice is clearly fraudulent, as is introducing major alterations of data from actual experiments to make them conform to a subjectively held theory. But we need to understand that the dividing line between normal practice and fraud is sometimes a fine one. For example, normal statistical analysis of scientific data often requires the analyst to decide to drop some data points because they lie so far away from what is expected that they seem to be aberrations or mistakes rather than meaningful data that should be considered along with the bulk of the other observations (such data points are sometimes called outliers), or because they are poorly measured.
But what we mean by the subjectivity of scientists is deeper than these understandably human traits. Some scientists seem to be particularly opinionated and stubborn, not unlike some nonscientists. These scientists develop hypotheses based on strongly held, preconceived notions of how the world operates, and they sometimes (usually unconsciously) design studies to prove these notions rather than merely to test them. In cases in which the results of their investigation are contrary to what the hypothesis would predict, a scientist is sometimes more likely to doubt that the data are accurate than to conclude that the theory is incorrect. Such a scientist will redesign the study and persist in trying to find data that prove the hypothesis, sometimes for years, sometimes for a lifetime.
1.2 SUBJECTIVE AND OBJECTIVE PROBABILITY
There is another important and related sense in which we use the terms subjective and objective in this book; it relates to their meanings in relation to subjective and objective probability. This use of the terms is discussed in great detail in Chapter 5. For the time being, we merely state that subjective probability refers to an individual scientist's degree of belief about the chance of some event occurring. Objective probability refers to the mathematical or numerical probability or chance of some event occurring.
1.3 RATE-OF-DEFECTIVES EXAMPLE
The definition of objectivity employed by some philosophers requires that a statement be testable by anybody. Subsumed in this position is the notion that everyone must have the same interpretation of observational data. But we believe that such uniformity of interpretation is rarely the case. We believe that different people come to observational data with differing preexisting views that induce them to construe the interpretation of the data somewhat differently, depending on their preconceived biases. It will be illuminating to begin our discussion of subjectivity in scientific methodology with an example that illustrates how different scientific observers, usually unwittingly, bring their own beliefs and biases (their subjectivity) to bear on the interpretation of scientific data. We see in the example below how different observers of the same data can proceed very differently and thus come away with very different interpretations of them. We call this illustration the rate-of-defectives example, and we refer to it later.
Before we present the example, however, we quote Ian Hacking (1965, p. 217), who noted that "[o]ne of the most intriguing aspects of the subjectivist theory, and of Jeffreys' theory of probability, is use of a fact about Bayes' theorem to explain the possibility of learning about a set-up from a sequence of trials on it. The fact seems to explain the possibility of different persons coming to agree on the properties of the set-up, even though each individual starts with different prior betting rates and different initial data." Here Hacking refers to the comforting fact that although different observers of the same data may have differing interpretations of them, eventually, with a sufficiently large number of trials, the differing prior views of different observers about the same data will generally disappear as the data begin to dominate all personal views about the underlying process.
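The convergence that Hacking describes can be sketched numerically. The following is only an illustration under stated assumptions, not a method given in the text: the two observers, their Beta priors, and the fixed 90% observed rate are all hypothetical choices made for the sketch. Each observer's estimate is the posterior mean of a rate p after seeing a number of successes in a number of independent trials, and the gap between the two estimates shrinks as the number of trials grows.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a rate p under a Beta(prior_a, prior_b) prior
    after observing `successes` in `trials` independent trials."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

# Hypothetical observers (illustrative priors, not taken from the text):
# one strongly expects a rate near 0.5, the other has a flat prior.
observers = {"strong_prior": (50.0, 50.0), "flat_prior": (1.0, 1.0)}

def gap_between_observers(trials, observed_rate=0.9):
    """Absolute difference between the two posterior means when the
    observed success rate is held fixed at `observed_rate` (an assumption)."""
    successes = round(observed_rate * trials)
    means = [posterior_mean(a, b, successes, trials)
             for a, b in observers.values()]
    return abs(means[0] - means[1])

trial_counts = (10, 100, 1000, 10000)
gaps = [gap_between_observers(n) for n in trial_counts]

for n, gap in zip(trial_counts, gaps):
    print(f"trials = {n:>5}: gap between posterior means = {gap:.4f}")
```

With few trials the two observers disagree substantially; with many trials the data dominate both priors, which is the sense in which differing prior views "generally disappear."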
Let us suppose that you collect 100 observations from an experiment. We can refer to these observations as data points. You then send these data to five scientists located in five different parts of the world. All five scientists receive the same data set, that is, the same 100 data points. (Note that for purposes of this example, the subjectivity involved in originally deciding what data to collect and in making the original observations themselves is eliminated by sending the same "objective" data to all five scientists.) Should we expect all five of the scientists to draw the same conclusions from these data?
Our answer to this question is a very definite "no." But how can it be that different observers will probably behave very differently when confronted with precisely the same data? Part of the thesis of this book is that the methodology of science requires that inferences from data be a mixture of both subjective judgment (theorizing) and objective observation (empirical verification). Thus, even though the scientists all receive the same observational data, they come to those data with differing beliefs about what to expect and about how to proceed. Consequently, some scientists will tend to weight certain data points more heavily than others and consider differing aspects of the data as more consequential. Different scientists are also likely to weight experimental errors of measurement differently from one another. Moreover, scientists may decide to carry out formal checks and statistical tests about whether the phenomenon of interest in the experiment was actually demonstrated (to ask how strongly the claimed experimental result was supported by the data). Such tests are likely to have different results for different scientists, because different scientists will bring different assumptions to the choice of statistical test. More broadly, scientists often differ on the mathematical and statistical models they choose to analyze a particular data set, and different models usually generate different conclusions. Different assumptions about these models will very often yield different implications for the same data.
These ideas that scientists can differ about the facts are perhaps surprising to some of us. Let us return to our 100 observations and five scientists to give a very simple and elementary example, with the assurance that analogous arguments will hold generally for more realistic and more complicated situations.
Hypothetically, suppose there is a special type of machine that produces a certain component that we can call a "groove joint," a component required for the hard drives of desktop computers. It is common knowledge in the computer industry that because groove joints are very difficult to fabricate, such machines generally produce these components with about a 50% defective rate. That is, about half the groove joints produced by a given machine will have to be discarded because they are defective. The machines that produce groove joints are very expensive.
The hypothetical South Bay Electronics Company suspects that its newly purchased machine may be producing groove joints at a defective rate different from the industry norm, so it decides to test the rate at which the new machine produces defectives. It first fabricates 100 groove joints on the new machine (each groove joint is fabricated independently of every other groove joint) and examines each one to classify it as "good" or "defective." South Bay records the sequence of 100 groove joints produced by the new machine as: G, G, D, ..., with "G" representing "good" and "D" representing "defective." The company finds that there were 90 defective groove joints in this batch, but the quality control staff still doesn't know the long-run rate of defectives for their machine. They decide to send the results representing the 100 tested groove joints to five different scientists to ask them for their own estimates of the long-run rate of defectives for this machine. We shall call this long-run rate of defectives p, bearing in mind that p can be any number between zero and one.
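To get a feel for these numbers, one might ask how likely a batch this lopsided would be if the machine really followed the industry norm. The sketch below is one such calculation and is not prescribed by the text: it assumes the 100 groove joints are independent (as stated) and that p = 0.5 exactly, and the function name prob_at_least is a hypothetical helper introduced for illustration.

```python
from math import comb

trials = 100
defectives = 90

# Observed proportion of defectives in the batch.
sample_rate = defectives / trials

def prob_at_least(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5): the number of length-n G/D
    sequences with at least k D's, divided by all 2**n equally likely sequences."""
    favorable = sum(comb(n, j) for j in range(k, n + 1))
    return favorable / 2**n

tail = prob_at_least(defectives, trials)

print(f"observed defective rate: {sample_rate:.2f}")
print(f"P(at least {defectives} defectives | p = 0.5) = {tail:.3e}")
```

A very small tail probability says only that the batch would be surprising under p = 0.5; how much weight to give that surprise, relative to prior belief in the industry norm, is exactly where observers can differ.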
The sequence of G's and D's is the data set you send to the five scientists (three women and two men) in five different locations around the world to see how they interpret the results. You tell them that you plan to publish their estimates of the long-run rate of defectives and their reasoning behind their estimates in a professional scientific journal. Thus their reputations are likely to be enhanced or to suffer in proportion to their degrees of error in estimating p. As we shall see, it will turn out that they will all have different views about the long-run rate of defectives after having been given the results of the experiment.
Scientist 1 is largely a theorist by reputation. She thinks that p = 0.5 no matter what. Her line of reasoning is that it just happened that 90% of the first 100 groove joints were defective, that there was a "run" of "defectives." Such an outcome doesn't mean that if the experiment were to be repeated for another 100 trials (produce yet another 100 groove joints) the next 100 trials wouldn't produce, say, 95 defectives, or any other proportion of defectives. Scientist 1 has a very strong preconceived belief based upon theory that groove joints are about equally likely to be good or defective (p = 0.5) in the face of real data that militate against that belief. For her, unless told otherwise, all machines produce defective groove joints for roughly half of their output, even if many runs of many defectives or many good groove joints just happen to occur.
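One way to formalize Scientist 1's position, offered here as an interpretation and not as the authors' own model, is a prior that places all of its weight on p = 0.5. The sketch below assumes a small grid of candidate values for p (0.1, 0.5, 0.9, chosen only for illustration) and applies Bayes' theorem to the 90 defectives in 100 trials. It contrasts the all-or-nothing prior with a hypothetical prior that is almost as confident but leaves a little weight on the alternatives.

```python
def update(prior, defectives, trials):
    """Bayes update of a discrete prior {p: weight} given binomial data.
    The binomial coefficient is omitted because it is the same for every
    candidate p and cancels when the weights are normalized."""
    goods = trials - defectives
    unnormalized = {p: w * p**defectives * (1 - p)**goods
                    for p, w in prior.items()}
    total = sum(unnormalized.values())
    return {p: u / total for p, u in unnormalized.items()}

# Assumed candidate rates and prior weights (illustrative only).
dogmatic = {0.1: 0.0, 0.5: 1.0, 0.9: 0.0}
nearly_dogmatic = {0.1: 0.0005, 0.5: 0.999, 0.9: 0.0005}

post_dogmatic = update(dogmatic, defectives=90, trials=100)
post_nearly = update(nearly_dogmatic, defectives=90, trials=100)

print("dogmatic prior, posterior weight on p = 0.5:", post_dogmatic[0.5])
print("nearly dogmatic prior, posterior weight on p = 0.9:", post_nearly[0.9])
```

Under these assumptions, a prior with zero weight on every alternative cannot be moved by any amount of data, whereas even a tiny weight on p = 0.9 is enough for 90 defectives in 100 trials to shift nearly all belief there.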
(Continues...)
Excerpted from The Subjectivity of Scientists and the Bayesian Approach by S. James Press and Judith M. Tanur. Copyright © 2001 S. James Press and Judith M. Tanur. Excerpted by permission of Dover Publications, Inc.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.