> It is not evidence, in the sense that you cannot do 500, 37,100, or a 1 million card guessing attempts in this setup with this level of detail and expect it to shift belief of a rational agent. The ratio between the priors of the usual suspects is going to look the same as your ratio of posteriors between all the hypotheses after the test.
Exactly. When you suspect that the evidence is biased (in other words, generated by a process other than genuine new physics or supernatural activity), more iterations of the process cannot give you much more evidence. What more iterations do is reduce sampling error from random variation; they do nothing about systematic error. The idea that you can run a biased experiment 1000 times and get a much more accurate answer than if you ran it 10 times is an example of what Jaynes calls 'the Emperor of China' fallacy, which he discusses in another chapter (I excerpt it in http://www.gwern.net/DNB%20FAQ#flaws-in-mainstream-science-a... ).
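To make the sampling-vs-systematic distinction concrete, here is a minimal simulation sketch. The numbers are my own assumptions for illustration, not from any real experiment: a 5-symbol deck (20% chance rate) and a hypothetical 2-point bias from something like sensory leakage. The function name `run_experiment` is likewise made up.

```python
import random

CHANCE_RATE = 0.20  # assumed: 5-symbol deck, pure guessing
BIAS = 0.02         # assumed: hypothetical systematic error (e.g. leakage)

def run_experiment(n_trials, rng):
    """Simulate n_trials guesses from a biased process with no ESP at all.

    Returns the observed hit rate and its binomial standard error.
    """
    p = CHANCE_RATE + BIAS
    hits = sum(1 for _ in range(n_trials) if rng.random() < p)
    rate = hits / n_trials
    stderr = (rate * (1 - rate) / n_trials) ** 0.5
    return rate, stderr

rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (10, 1_000, 100_000):
    rate, stderr = run_experiment(n, rng)
    # The standard error shrinks roughly as 1/sqrt(n), but the estimate
    # settles near CHANCE_RATE + BIAS, not near CHANCE_RATE.
    print(f"n={n:>7}  hit rate={rate:.4f}  stderr={stderr:.4f}")
```

The error bar keeps tightening with more trials, but it tightens around the biased value: the experiment becomes a very precise measurement of its own flaw.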
That this is so surprising and novel is an interesting example of a general problem with null-hypothesis testing: when a significance test 'rejects the null', the temptation is to take it as confirming the alternative hypothesis. But this is a fallacy - when you reject the null, you just reject the null. There's an entire universe of other alternative hypotheses which may fit better or worse than the null, of which your favored theory is but one vanishingly small member.
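A small Bayesian sketch of the same point, with invented priors and hit rates (every number here is a hypothetical chosen for illustration): if 'bias' and 'ESP' predict the same hit rate, data that demolish the null leave the ratio between those two exactly where the priors put it.

```python
import math

def log_likelihood(hits, trials, p):
    # Binomial log-likelihood; the binomial coefficient is omitted because
    # it is identical across hypotheses and cancels on normalization.
    return hits * math.log(p) + (trials - hits) * math.log(1 - p)

def posterior(priors, rates, hits, trials):
    """Posterior probability of each hypothesis given the hit count."""
    log_post = {
        h: math.log(priors[h]) + log_likelihood(hits, trials, rates[h])
        for h in priors
    }
    top = max(log_post.values())  # subtract the max to avoid underflow
    weights = {h: math.exp(v - top) for h, v in log_post.items()}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

# Assumed priors and predicted hit rates, purely illustrative.
priors = {"null": 0.90, "bias": 0.0999, "esp": 0.0001}
rates = {"null": 0.20, "bias": 0.22, "esp": 0.22}

post = posterior(priors, rates, hits=2200, trials=10_000)
for h in priors:
    print(f"{h:>4}: prior={priors[h]:.4f}  posterior={post[h]:.6f}")
print("bias:esp prior ratio    ", priors["bias"] / priors["esp"])
print("bias:esp posterior ratio", post["bias"] / post["esp"])
```

The null ends up with almost no posterior mass, yet nearly all of what it loses goes to 'bias', because that is where the prior mass among the non-null hypotheses already was.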
What is necessary to show ESP specifically is to take all the criticisms and alternatives, and run different experiments which will have different results based on whether the alternative or ESP is true. (The real problem comes when it looks like the best experiments showing ESP are at least as rigorous as regular science and it's starting to become difficult to think of what exactly could be driving the positive results besides something like ESP: http://slatestarcodex.com/2014/04/28/the-control-group-is-ou... )