Quote of the month:
"A scientific truth does not triumph by convincing its
opponents and making them see the light, but rather because its opponents
eventually die and a new generation grows up that is familiar with it." –Maxwell
Planck
Further reading:
Bayes or Bust?, by John Earman. Earman
(a professor of History and Philosophy of Science at the University of
Pittsburgh) argues that Bayesianism provides the best hope for a comprehensive
and unified account of scientific inference, yet the presently available
versions of Bayesianism fail to do justice to several aspects of the testing and
confirming of scientific theories and hypotheses. By focusing on the need for a
resolution to this impasse, Earman sharpens the issues on which a resolution
turns.
Web links:
A collection of
Bayesian sites to find software, theory, and discussions.
A slide show providing an
introduction to Bayesian statistics.
A Bayesian statistics
reading list.
Massimo's Tales of the Rational: Essays About Nature and Science

How does science work, really? You can read all about it in plenty of texts in
philosophy of science, but if you have ever experienced the making of science on
an everyday basis, chances are you will feel dissatisfied with the airtight
account given by philosophers. Too neat, not enough mess.
To be sure, I am not denying the existence of the scientific method(s), as
radical philosopher Paul Feyerabend is infamously known for having done. But I
know from personal experience that scientists don’t spend their time trying to
falsify hypotheses, as Karl Popper wished they did. By the same token, while
occasionally particular scientific fields do undergo periods of upheaval, Thomas
Kuhn’s distinction between “normal science” and scientific “revolutions” is too
simple. Was the neo-Darwinian synthesis of the 1930s and ’40s in evolutionary
biology a revolution or just a significant adjustment? Was Eldredge and Gould’s
theory of “punctuated equilibria” to explain certain features of the fossil
record a blip on the screen or, at least, a minor revolution?
But, perhaps, the least convincing feature of the scientific method is not
something theorized by philosophers, but something actually practiced by almost
every scientist, especially those involved in heavily statistical disciplines
such as organismal biology and the social sciences. Whenever we run an
experiment, we analyze the data to check whether the so-called “null
hypothesis” has been successfully rejected. If so, we open a bottle of champagne
and proceed to write up the results to place a new small brick in the edifice of
knowledge.
Let me explain. A null hypothesis is what would happen if nothing happened.
Suppose you are testing the effect of a new drug on the remission of breast
cancer. Your null hypothesis is that the drug has no effect: within a properly
controlled experimental population, the subjects receiving the drug do not show
a statistically significant difference in their remission rate when compared to
those who did not receive the drug. If you can reject the null, this is great
news: the drug is working, and you have made a potentially important
contribution toward bettering humanity’s welfare. Or have you?
The problem is that the whole idea of a null hypothesis, introduced in
statistics by none other than Sir Ronald Fisher (the father of much of modern
statistical analysis), constrains our questions to ‘yes’ or ‘no’ answers.
Nature is much too subtle for that. We probably had a pretty good idea, before
we even started the experiment, that the null hypothesis was going to be
rejected. After all, surely we don’t embark on costly (both in terms of material
resources and of human potential) experiments just on the whim of the moment. We
don’t randomly test all possible chemical substances for their role as potential
anticarcinogens. What we really want to know is if the new drug performed
better than other, already known, ones—and by how much. That is, every time we
run an experiment we have two factors that Fisherian (also known as
“frequentist,” see below) statistics does not take into account: first, we have
a priori expectations about the outcome of the experiments, i.e., we don’t enter
the trial as a blank slate (contrary to what is assumed by most statistical
tests); second, we normally compare more than two hypotheses (often several),
and the least interesting of them is the null one.
An increasing number of statisticians and scientists are beginning to realize
this, and are ironically turning to a solution that was devised, and widely
used, well before Fisher. That solution was contained in an obscure paper that
one Reverend Thomas Bayes published back in 1763, and is revolutionizing how
scientists do their work, as well as how philosophers think about science.
Bayesian statistics simply acknowledges that what we are really after is an
estimate of the probability that a certain hypothesis is true, given what we
know before running an experiment, as well as what we learn from the experiment
itself. Indeed, a simple formula known as Bayes’ theorem says that the
probability that a hypothesis (among many) is correct, given the available data,
depends on the probability that the data would be observed if that hypothesis
were true, multiplied by the a priori probability (i.e., based on previous
experience) that the hypothesis is true.
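The recipe just described can be written compactly as Bayes’ theorem: the posterior P(H|D) equals P(D|H) times P(H), divided by the total probability of the data. A minimal Python sketch, with invented numbers for a hypothetical two-hypothesis drug question:

```python
# Bayes' theorem: P(H | D) = P(D | H) * P(H) / P(D),
# where P(D) sums the numerator over all competing hypotheses.
# The priors and likelihoods below are made up for illustration.
priors = {"drug works": 0.3, "drug inert": 0.7}       # a priori beliefs
likelihoods = {"drug works": 0.8, "drug inert": 0.1}  # P(observed data | H)

# Unnormalized posteriors: likelihood times prior for each hypothesis.
unnormalized = {h: likelihoods[h] * priors[h] for h in priors}
evidence = sum(unnormalized.values())                 # P(D)
posteriors = {h: p / evidence for h, p in unnormalized.items()}

print(posteriors)  # posterior belief in each hypothesis, summing to 1
```

Note that the null hypothesis plays no special role here: any number of competing hypotheses can be compared on the same footing.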
In Fisherian terms, the probability of an event is the frequency with which
that event would occur given certain circumstances (hence the term “frequentist”
to identify this classical approach). For example, the probability of rolling a
three with one (unloaded) die is 1/6, because there are six possible,
equiprobable outcomes, and on average (i.e., on long enough runs) you will get a
three one time every six.
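The long-run-frequency reading can be checked with a quick simulation (a sketch; the fixed seed and number of rolls are arbitrary choices):

```python
import random

random.seed(42)  # fixed seed so the run is reproducible
rolls = 60_000
threes = sum(1 for _ in range(rolls) if random.randint(1, 6) == 3)
frequency = threes / rolls
print(frequency)  # close to 1/6 on a long enough run
```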
In Bayesian terms, however, a probability is really an estimate of the degree
of belief (as in confidence, not blind faith) that a researcher can put into a
particular hypothesis, given all she knows about the problem at hand. Your
degree of belief that threes come out once every six rolls of the die comes from
both a priori considerations about fair dice, and the empirical fact that you
have observed this sort of event in the past. However, should you witness the
same outcome come up over and over, your degree of belief in the
hypothesis of a fair die would keep going down until you strongly suspect foul
play. It makes intuitive sense that the degree of confidence in a hypothesis
changes with the available evidence, and one can think of different scientific
hypotheses as competing for the highest degree of Bayesian probability. New
experiments will lower our confidence in some hypotheses, and increase it
in others. Importantly, we might never be able to settle on one final
hypothesis, because the data may be roughly equally compatible with several
alternatives (a frustrating situation very familiar to any scientist and known
in philosophy as the underdetermination of hypotheses by the data).
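The suspicious-die story can be turned into a toy calculation: start with a strong prior that the die is fair, then apply Bayes’ theorem after each observed three. The prior and the loaded-die probability below are invented for illustration:

```python
# Two competing hypotheses about a die that keeps showing threes.
p_fair = 0.95            # prior degree of belief that the die is fair
p_loaded = 0.05          # prior that it is loaded toward three
LIKE_FAIR = 1 / 6        # P(rolling a three | fair die)
LIKE_LOADED = 1 / 2      # assumed P(three | loaded die), for illustration

history = []
for roll in range(10):   # ten threes in a row
    # Bayes' theorem: posterior is proportional to likelihood times prior.
    num_fair = LIKE_FAIR * p_fair
    num_loaded = LIKE_LOADED * p_loaded
    total = num_fair + num_loaded
    p_fair, p_loaded = num_fair / total, num_loaded / total
    history.append(p_fair)

print(history)  # belief in a fair die drops with every suspicious roll
```

Each posterior becomes the prior for the next roll, which is exactly the sense in which Bayesian confidence "changes with the available evidence."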
You can see why a Bayesian description of the scientific enterprise—while not
devoid of problems and critics—is proving to be a tantalizing tool both for
scientists, in their everyday practice, and for philosophers, as a more
realistic way of thinking about science as a process.
Perhaps more importantly, Bayesian analyses are allowing researchers to save
money and human lives during clinical trials because they permit the researcher
to constantly reevaluate the likelihood of different hypotheses during the
experiment. If we don’t have to wait for a long and costly clinical trial to be
over before realizing that, say, two of the six drugs being tested are, in fact,
significantly better than the others, Reverend Bayes might turn out to be a much
more important figure in science than anybody has imagined over the last two
centuries.
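The interim-monitoring idea can be sketched in miniature: model each arm’s remission rate with a Beta posterior, and partway through the trial estimate the probability that one drug beats the other. All trial numbers here are invented:

```python
import random

random.seed(0)

def prob_a_beats_b(a_succ, a_fail, b_succ, b_fail, draws=5000):
    """Monte Carlo estimate of P(rate_A > rate_B) under Beta(1+s, 1+f)
    posteriors (uniform Beta(1, 1) priors on each arm's remission rate)."""
    wins = sum(
        random.betavariate(1 + a_succ, 1 + a_fail)
        > random.betavariate(1 + b_succ, 1 + b_fail)
        for _ in range(draws)
    )
    return wins / draws

# Hypothetical interim data: drug A 18/25 remissions, drug B 9/25.
p = prob_a_beats_b(18, 7, 9, 16)
print(p)
if p > 0.95:
    print("stop early: strong evidence favoring drug A")
```

With a threshold like 95%, a trial monitored this way could stop enrolling patients on the weaker arm as soon as the evidence becomes this lopsided, which is the saving in money and lives described above.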
