E Jaynes (top left)
Edwin Jaynes was a physicist, statistician and probability theorist. He believed these fields to be inextricably linked: all of them were extended forms of logic applied to experimental experience. He was, in essence, a hardcore logical positivist who fully embraced positivism's subjective implications. Obviously, Jaynes was a Bayesian. For him, probability was just the formal thinking-through of beliefs. He was inspired by Gibbs's approach to statistical mechanics, which involves (most importantly) an ensemble defined by the experimental parameters you can measure or control - that is, subjectively. What you can't control is considered to be maximally random given what you can; that is, the entropy is maximized. This is all done before the data are looked at, which is to say a priori. Therefore, the distribution over the ensemble is called the "prior" distribution. Gibbs could be confident in this approach because he had physical and mathematical arguments that gave him seemingly reasonable priors. Beyond these priors, we can use Bayes' rule to change our beliefs when we get data. Therefore, this method is called "Bayesianism".
T Bayes
Jaynes felt that all of science could be done with just those mathematical arguments. He was inspired by Claude Shannon's information theory, which made Jaynes realize that there was a certain universality about Gibbs's arguments beyond their physical basis. There have been objections. I want to emphasize that these are objections to the basics of the theory. Obviously, with enough unmotivated tricks and lowering of standards, any philosophy can be saved (it isn't as clear to me that any theory can be, but philosophers of science seem to think so). Anti-Bayesian Cosma Shalizi notes that the core method always gives exponential-family distributions, which is very worrying. One can do different things to get heavy tails, but it's not obvious to me that these are well motivated in Jaynesian theory. We were promised that the Gibbs Approach To Statistical Mechanics was the path to science because of the universality of Shannon Entropy theorems. How much of that can we give up before we conclude that we were sold a bill of goods?
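The remark about exponential form can be made concrete with the standard textbook maximum-entropy calculation. This is only a sketch for a discrete variable: the symbols $p$, $f_k$, $F_k$, $\lambda_k$ and $Z$ are notation introduced here for illustration, not taken from Jaynes or Shalizi.

```latex
\begin{aligned}
&\text{maximize } H[p] = -\sum_x p(x)\log p(x)
 \quad\text{subject to}\quad \sum_x p(x) = 1,\;\; \sum_x p(x)\, f_k(x) = F_k \\[4pt]
&\mathcal{L} = -\sum_x p(x)\log p(x)
 - \lambda_0\Big(\sum_x p(x) - 1\Big)
 - \sum_k \lambda_k\Big(\sum_x p(x)\, f_k(x) - F_k\Big) \\[4pt]
&\frac{\partial \mathcal{L}}{\partial p(x)}
 = -\log p(x) - 1 - \lambda_0 - \sum_k \lambda_k f_k(x) = 0 \\[4pt]
&\Rightarrow\quad p(x) = \frac{1}{Z(\lambda)}\exp\Big(-\sum_k \lambda_k f_k(x)\Big),
 \qquad Z(\lambda) = \sum_x \exp\Big(-\sum_k \lambda_k f_k(x)\Big)
\end{aligned}
```

Whatever constraint functions $f_k$ are chosen, the solution is an exponential in them. Getting a power-law tail this way requires constraining something like the expected value of $\log x$, which is the sort of move whose motivation is in question above.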
Psychic Phenomena
I would like to open a line of inquiry that I think goes deeper than technical matters. I will do so in a specific context: parapsychology. In many situations, Bayesian solutions do not appear to give us what we want: evaluations of whether a phenomenon is real or not. The reason is not just subjectivism, but personal subjectivism (if that makes sense).
A young person who reads history may be surprised at how many Victorian thinkers were deeply interested in the supernatural, and not just the usual crowd.
Henry Sidgwick, an economist, philosopher and moralist, was a founder of the Society for Psychical Research, and his successors included the psychologist and philosopher William James and the physicist William Crookes. Ian Hacking has shown that the philosopher, mathematician, statistician, physicist and co-founder of experimental psychology Charles Sanders Peirce invented the modern concept of randomization partly to deal with practical questions in psychology (namely, "Is there a threshold below which weights cannot be distinguished?" is answered "No."), and it was applied widely to parapsychological research, where it has been a powerful force for skepticism. The classic randomization setup was described by R A Fisher as a test of whether a lady could tell whether tea or milk was added to a cup first (the answer, incidentally, was "Yes."). This simple test was perceived by Fisher to have deep lessons for experiments in general. Randomization is a part of (note: not the foundation, as Hacking above makes clear) Deborah Mayo's error statistics approach to frequentism, because randomization is a kind of error probe. And in this case, it gives us what we say we want: an answer to the question "Are the alleged phenomena distinguishable from blind chance?"
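To make the randomization logic concrete, here is a minimal sketch of the arithmetic behind the tea test. It assumes Fisher's usual design of eight cups, four of each kind, presented in random order, with the lady asked to pick out the four milk-first cups. The function name and parameters are illustrative, not Fisher's.

```python
from math import comb


def tea_test_p_value(cups=8, milk_first=4, correct_milk_first=4):
    """Chance of identifying at least `correct_milk_first` of the milk-first
    cups by pure guessing, when the guesser knows how many there are.

    Under random cup order, every choice of `milk_first` cups out of `cups`
    is equally likely, so the count of correct picks is hypergeometric.
    """
    total = comb(cups, milk_first)  # number of equally likely selections
    tail = sum(
        comb(milk_first, k) * comb(cups - milk_first, milk_first - k)
        for k in range(correct_milk_first, milk_first + 1)
    )
    return tail / total


if __name__ == "__main__":
    print(tea_test_p_value())                      # all four right: 1/70
    print(tea_test_p_value(correct_milk_first=3))  # three or more right: 17/70
```

A perfect score has probability 1/70 under blind chance, and that number comes from the randomization itself rather than from anyone's prior opinion of the lady.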
My objection to Jaynes concerns Chapter 5 of Probability Theory: The Logic Of Science, "Queer Uses For Probability Theory". In this chapter, Jaynes takes up the subject of psychic phenomena as an example of how probability theory can be used to evaluate subjective degrees of belief. This is an important subject. In truth, Professor X cannot tell whether tea or milk was put first in the cup, but Dr Bristol can. Of course, Professor X claims otherwise. His claim might be honest, as many magicians fool themselves eventually. RA Fisher would say, let's see how they do in the same test, and he'd compare the likelihood ratio, etc. Jaynes's examination is different. Jaynes does and must say that we should start by evaluating our internal degree of belief in the manner of IJ Good:
"Our brains work pretty much the way this [Bayesian] robot works, but we have an intuitive feeling for plausibility only when it's not too far from 0 db. We get fairly definite feelings that something is more than likely to be so or less than likely to be so. So the trick is to imagine an experiment. How much evidence would it take to bring your state of belief up to the place where you felt very perplexed and unsure about it? Not to the place where you believed it - that would overshoot the mark, and again we'd lose our resolving power. How much evidence would it take to bring you just up to the point where you were beginning to consider the possibility seriously?
We take this man who says he has extrasensory perception, and we will write down some numbers from 1 to 10 on a piece of paper and ask him to guess which numbers we've written down. We'll take the usual precautions to make sure against other ways of finding out. If he guesses the first number correctly, of course we will all say 'You're a very lucky person, but I don't believe it.' And if he guesses two numbers correctly, we'll still say 'You're a very lucky person, but I don't believe it.' By the time he's guessed four numbers correctly - well, I still wouldn't believe it. So my state of belief is certainly lower than 40 db."
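The decibel bookkeeping in this passage can be reproduced in a few lines. This is a sketch under two loudly stated assumptions: there are only two hypotheses (blind chance versus ESP), and a genuine psychic would always guess correctly (`hit_rate_if_esp=1.0`). The function names are illustrative. Evidence is measured as 10 times the base-10 logarithm of the odds, so each correct guess at 1-in-10 odds adds 10 db.

```python
import math


def prob_to_db(prob):
    """Evidence in decibels: 10 * log10 of the odds prob / (1 - prob)."""
    return 10 * math.log10(prob / (1 - prob))


def db_to_prob(db):
    """Inverse of prob_to_db."""
    odds = 10 ** (db / 10)
    return odds / (1 + odds)


def db_per_correct_guess(chance=0.1, hit_rate_if_esp=1.0):
    """Evidence added by one correct guess: 10 * log10 of the likelihood ratio.

    hit_rate_if_esp=1.0 is an assumption made for this sketch.
    """
    return 10 * math.log10(hit_rate_if_esp / chance)


def guesses_needed(prior_db, target_db=0.0, chance=0.1):
    """Consecutive correct guesses needed to move from prior_db to target_db."""
    step = db_per_correct_guess(chance)
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round((target_db - prior_db) / step, 9))


if __name__ == "__main__":
    print(db_per_correct_guess())  # evidence per correct guess, in db
    print(guesses_needed(-40.0))   # guesses to reach even odds from -40 db
    print(guesses_needed(-100.0))  # same, from a prior of -100 db
    print(db_to_prob(-100.0))      # the probability that -100 db corresponds to
```

In this two-hypothesis picture, four correct guesses are worth 40 db, which is why still disbelieving after four puts the prior below -40 db. A prior of -100 db would need ten consecutive hits just to reach even odds.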
Jaynes estimates that he has a belief degree of -100 db in psychic phenomena. These so-called psychics often ask us to keep our minds open, to not do what Jaynes is doing. Why should we not oblige them, at least in a formal manner? It seems that it would be better dialectically if they still couldn't convince us even if we gave them the benefit of the doubt. In the long run, experiments will overcome any initial degree of belief, but strong disbelief can make it take a very long time. Jaynes goes on to give himself another out: "In fact, if he guessed 1000 numbers correctly, I still would not believe that he has ESP...". He describes an experiment that supposedly gives very strong evidence of psychic phenomena. But it would not convince Jaynes.
"[The psychic] will then react with anger and dismay when, in spite of what he considers this overwhelming evidence, we persist in not believing in ESP. Why are we [Jaynesians], as [the psychic] sees it, so perversely illogical and unscientific?" Jaynes argues that the data can never prove psychic phenomena to him, because all that a high likelihood against chance (that is, data unlikely to be produced by chance) does is increase the probability assigned to error and deception.
But this is not what we say we want. How do we know that this idea, if pursued, wouldn't end up giving you error statistics, perhaps error statistics subject to Bayesian constraints? I think that it would, though I don't have much argument for it. Further, how does this approach rule in cases of mistaken disbelief? If I had a mathematically intimidating (but physically naive) argument from statistical mechanics that it shouldn't matter whether tea or milk is put first, I could and should use this out to disbelieve Dr Bristol as much as Professor X. The fact of her complete success is meaningless, since I can simply say that she managed to deceive me. Write mathematically intimidating but ill-founded models, and bend the evidence to their will? Perhaps this is what economists do all day (joke)!
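The "out" can be seen in a toy calculation. The sketch below is an illustration of the structure of the argument, not Jaynes's own numbers: it assumes three exhaustive hypotheses (blind chance, a real ability, deception or error), that both a real ability and deception would produce a perfect score, and a hypothetical prior of 1e-4 for deception against 1e-10 (that is, -100 db) for the real ability.

```python
def posteriors(n_correct, prior_real=1e-10, prior_deception=1e-4, chance=0.1):
    """Posterior over three hypotheses after n_correct consecutive hits.

    All priors are hypothetical. Likelihood of the data is chance**n under
    blind chance and 1.0 under both "real" and "deception" (an assumption).
    """
    prior_chance = 1.0 - prior_real - prior_deception
    weights = {
        "chance": prior_chance * chance ** n_correct,
        "real": prior_real * 1.0,
        "deception": prior_deception * 1.0,
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    for n in (0, 4, 10, 1000):
        print(n, posteriors(n))
```

As the score improves, blind chance is eliminated, but the probability flows to deception. The posterior for a real ability can never exceed `prior_real / (prior_real + prior_deception)`, about one in a million with these made-up priors, no matter how long the run of successes. The same ceiling applies to Dr Bristol if I start with the same priors about her.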
What is "wrong" here is that the Bayesian solution is for yourself. If you are a firm that is evaluating its goods and you want to convince yourself they are good, then Bayesianism works well. Good (and Savage and Friedman, etc.) cut their teeth on the statistical problems of WWII, many of which were of this sort. But one advantage of the error statistics approach is dialectical - it is for convincing others. At the end of the experiment, Dr Bristol and Sir Fisher agree. When particle physicists discovered the Higgs boson, they used (among other things) p-value analysis and other frequentist techniques. This is because they are in dialog with one another about what the ultimate laws of physics are, not loners trying to make their priors a little sharper. Minimax estimators are the ultimate expression of the Hegelian philosophy (even worse joke)!
Don't make the mistake of thinking that dogmatic adherence to only the strictest of frequentist ideas (for instance, banishing stopping times) is the only path forward! In fact, the man who designed the lady drinking tea experiment described above had some very strange ideas about probability. Problems formally equivalent to interior monologues happen all the time. Bayesian theorists have made many contributions to science. I mean to say this as a corrective, not as a declaration of purpose. It's an easy mistake to think that just because one side seems wrong, the other must be right. Instead, what I think is right is to let one hundred flowers bloom, but only enjoy them at the appropriate time. For instance, if optional stopping is desirable, there is no reason to report statistics that are unstable in its presence. If broad agreement is necessary, then there is no reason to report "results" that depend on unmotivated priors. This is obviously correct, but sadly totally informal. What is to be done?