Sunday 14 August 2011

statistics - Is it necessary to conduct a power analysis before beginning an experiment?

Due to my own woeful ignorance on the subject, I have been reading up on statistical methods recently. From what (little) I understand, the real answer to this question is:



Yes, but only if you are doing Neyman-Pearson hypothesis testing



and



Absolutely not, if you are using Fisher p-values



That is, the question isn't formulated correctly, because power analysis is only valid under one statistical framework (Neyman-Pearson). And you are probably not using that framework.



In my experience, most experimental biologists use Fisher's p-value, which gives the probability of the data (or more extreme data) assuming that the null hypothesis is true. Under Fisher's framework, among other drawbacks, there is no quantitative measure of the test's power. However, it has the benefit of allowing scientists to do something close to what we would like to do: draw conclusions from the evidence obtained in an individual experiment.
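To make "the probability of the data (or more extreme data)" concrete, here is a minimal sketch of a one-sided exact binomial p-value. The function name and the data (9 responders out of 10, with a null response rate of 0.5) are made up for illustration and do not come from any real experiment.

```python
from math import comb


def one_sided_binomial_p_value(successes: int, trials: int, null_prob: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, null_prob).

    This is the probability, under the null hypothesis, of seeing the
    observed count or anything more extreme in the same direction.
    """
    return sum(
        comb(trials, k) * null_prob**k * (1 - null_prob) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 9 of 10 animals respond; the null says each responds
# with probability 0.5.
p_value = one_sided_binomial_p_value(9, 10)
print(f"p = {p_value:.4f}")  # p = 0.0107
```

Note that nothing in this calculation refers to an alternative hypothesis, which is why there is no power to compute here.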



The Neyman-Pearson framework does include the idea of a test's power, because you must formulate an alternative hypothesis as well as desired alpha and beta error rates before starting your experiment. However, it mostly denies us the ability to make inferences from individual experiments, and for that reason appears less suited to experimental science. To quote from Goodman (see below), under Neyman-Pearson, "we must abandon our ability to measure evidence, or judge truth, in an individual experiment."
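For what it is worth, this is roughly what a power analysis looks like once alpha, beta and an alternative hypothesis have been fixed in advance. It is only a sketch under assumptions I am adding myself: a two-sided comparison of two group means, equal group sizes, a normal approximation, and the alternative expressed as a standardized effect size (difference in means divided by the standard deviation). The function names are my own.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def sample_size_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the assumed standardized effect size and power = 1 - beta.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


def approximate_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of the same test for a given n per group.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)


# Assumed inputs: alpha = 0.05, beta = 0.20 (power 0.80), effect size 0.5.
n = sample_size_per_group(0.5)
print(n)  # 63
print(round(approximate_power(0.5, n), 3))
```

A t-based calculation would ask for slightly more subjects than this approximation. The point is only that every input (alpha, beta, the effect size under the alternative) has to be chosen before the data are collected.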



Neither frequentist framework is clearly the right one, although what is clear is that you cannot mix Fisher and Neyman-Pearson. Finally, although it doesn't really address your question directly, it seems wrong not to mention Bayesian methods as an alternative to these two frequentist frameworks, though they come with their own baggage.



Further reading from people who understand this much better than I do:



Michael Lew's answer to "Setting the threshold p-value as part of hypothesis generation" at Cross Validated



Michael Lew's answer to "What are common statistical sins" at Cross Validated



Hubbard, Raymond, and M. J. Bayarri. “Confusion Over Measures of Evidence (p’s) Versus Errors (α’s) in Classical Statistical Testing.” The American Statistician 57, no. 3 (August 2003): 171–178. (Working Paper PDF)



Arguments for Bayesian statistics:



Goodman, Steven N. “Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy.” Annals of Internal Medicine 130, no. 12 (June 15, 1999): 995–1004.



Jaynes, E. T. Probability Theory: The Logic of Science (Online version of some parts)
