2020-03-10

## What goes wrong with p-values?

Selective reporting invalidates Type I error control.

Ioannidis, J. P. A. (2005). Why Most Published Research Findings Are False. PLoS Medicine, 2(8):e124.

• Argues that amongst significant results reported, a majority are false.
(Note: this is about Positive Predictive Value, PPV=1-FDR, not Type I error rate.)
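To make the PPV point concrete, here is a minimal sketch of the bias-free form of the PPV calculation, PPV = (1-β)R / ((1-β)R + α), where R is the prior odds that a tested relationship is real. The function name `ppv` and the chosen values of R, α and power are illustrative assumptions, not figures from the paper.

```python
def ppv(prior_odds, alpha=0.05, power=0.8):
    """Positive predictive value of a significant result, ignoring bias.

    prior_odds: assumed ratio of real to null relationships among those tested (R).
    """
    true_positives = power * prior_odds   # real effects that reach significance
    false_positives = alpha               # null effects that reach significance
    return true_positives / (true_positives + false_positives)


# Illustrative prior odds only: from "half of tested effects are real" to "1 in 100".
for odds in (1.0, 0.1, 0.01):
    value = ppv(odds)
    print(f"R={odds}: PPV={value:.2f}, FDR={1 - value:.2f}")
```

With α held at 0.05 throughout, the PPV falls below one half once the prior odds are low enough, which is why the claim concerns FDR rather than the Type I error rate.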

People read more into a p-value than they should:

• p-values are not an effect size.
• “Insignificant” taken as evidence of no effect, but may be lack of power or bad luck.
• “Significant” taken to mean large effect, but may be a powerful experiment or luck.

## Confidence Intervals are better

• Interval given in meaningful units, can judge importance.
• Causes of dichotomization “paradoxes” are clear.
• Rejects more hypotheses — reject everything outside the interval!
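The last bullet rests on the duality between intervals and tests. A minimal sketch, assuming a normally distributed estimate with known standard error (the function names and the numbers are hypothetical):

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level=0.95):
    """Two-sided normal-theory confidence interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - z * std_error, estimate + z * std_error


def p_value(estimate, std_error, null_value=0.0):
    """Two-sided p-value for the hypothesis that the true value is null_value."""
    z = abs(estimate - null_value) / std_error
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical estimate and standard error.
lo, hi = confidence_interval(1.2, 0.5)
for null_value in (0.0, 0.5, 2.0, 3.0):
    outside = null_value < lo or null_value > hi
    rejected = p_value(1.2, 0.5, null_value) < 0.05
    print(null_value, outside, rejected)
```

Under these assumptions, a null value is rejected at the 5% level exactly when it lies outside the 95% interval, so a single interval summarises a whole family of tests.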

## Confidence Intervals are better

Cumming, Geoff. (2012). Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. Taylor & Francis Group, New York and London.

“New Statistics” approach:

• Stop dichotomising.
• Instead think in terms of measurement accuracy.
• Report everything, combine using meta-analysis.

Good prescription for clinical trials or psychology experiments.

Not so great for bioinformatics.

• We often consider thousands of p-values; selective reporting is the whole point.

## False Discovery Rate

Benjamini, Y. and Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological), 57(1):289–300.

To select a set $$S$$ of hypotheses to reject, out of $$n$$ hypotheses with p-values $$p_i$$, at an FDR of $$q$$, choose the largest set satisfying:

$S = \left\{ i : p_i \leq {|S| \over n} q \right\}$

Sets for smaller FDR nest within sets for larger FDR, so the results can be presented in a way that leaves the choice of FDR to the reader (idea attributed to Gordon Smyth):

• Reader can read down a sorted list to desired cutoff value.
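The selection rule and the "read down a sorted list" presentation can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names and the example p-values are made up, and `bh_adjust` reports, for each hypothesis, the smallest $$q$$ at which it would be selected (commonly called a BH-adjusted p-value).

```python
def bh_reject(p_values, q):
    """Indices of the largest set S with p_i <= |S| / n * q for every i in S."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    size = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / n * q:
            size = rank  # keep the largest rank that meets its threshold
    return sorted(order[:size])


def bh_adjust(p_values):
    """Smallest q at which each hypothesis enters the selected set."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for illustration.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.065, 0.074, 0.205, 0.212, 0.216]
print(bh_reject(p, 0.05))
print(bh_reject(p, 0.10))
for i, (raw, adj) in enumerate(zip(p, bh_adjust(p))):
    print(i, raw, round(adj, 4))
```

Selecting every hypothesis whose adjusted value is at most $$q$$ reproduces `bh_reject(p, q)`, so a table sorted by this column lets the reader stop at whatever FDR they prefer.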

## False Coverage-statement Rate

Benjamini, Y. and Yekutieli, D. (2005). False Discovery Rate–Adjusted Multiple Confidence Intervals for Selected Parameters. Journal of the American Statistical Association, 100(469):71–81.

After selecting a subset $$S$$ out of $$n$$ parameters, for an FCR of $$q$$, provide an interval for each selected parameter with coverage probability $$1-{{|S| \over n} q}$$.
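A minimal sketch of the adjusted intervals, assuming each estimate is normally distributed with a known standard error. That assumption, the function name `fcr_intervals`, and the example numbers are illustrative and not taken from the paper.

```python
from statistics import NormalDist


def fcr_intervals(estimates, std_errors, selected, q):
    """Intervals for the selected parameters at level 1 - |S| / n * q."""
    n = len(estimates)
    level = 1 - len(selected) / n * q
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return {
        i: (estimates[i] - z * std_errors[i], estimates[i] + z * std_errors[i])
        for i in selected
    }


# Hypothetical estimates; parameters 0 and 1 were selected out of n = 10.
estimates = [2.1, -1.8, 0.3, 0.1, -0.2, 0.4, 0.0, -0.1, 0.2, -0.3]
std_errors = [0.5] * 10
for i, (lo, hi) in fcr_intervals(estimates, std_errors, [0, 1], q=0.05).items():
    print(i, round(lo, 3), round(hi, 3))
```

Here 2 of 10 parameters are selected, so each interval is built at the 99% level rather than 95%: the fewer parameters selected, the wider the intervals. If every parameter is selected, the intervals reduce to ordinary $$1-q$$ intervals.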