Review 19: Let us never speak of these values again.

Let us never speak of these values again. by Ben Recht

  • This blog post covers the utility of statistical significance measures such as effect size and p-value.
  • How well a claim succeeds is predicated on the ratio of effect size to standard error.
  • Exposes the fact that significance is weighted by the spread of the probability density. This means that if one outcome is slightly favorable with a narrow distribution and another has a more favorable average but a wider distribution, the former may be favored by the p-value.
  • Simple approximations can distort the p-value, especially when averaging across groups.
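
To make the ratio point concrete for myself, here is a minimal sketch. It assumes a two-sided z-test under a normal approximation, which may not be the exact test the post has in mind. The function name `two_sided_p` and all the numbers are hypothetical, chosen only for illustration.

```python
import math

def two_sided_p(effect, std_err):
    """Two-sided p-value for a z-test (normal approximation assumed)."""
    z = effect / std_err  # ratio of effect size to standard error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers, not taken from the blog post.
small_tight = two_sided_p(effect=1.0, std_err=0.25)  # small effect, narrow spread
large_noisy = two_sided_p(effect=3.0, std_err=2.0)   # larger effect, wide spread

print(f"small effect, narrow spread: p = {small_tight:.2g}")
print(f"large effect, wide spread:   p = {large_noisy:.2g}")
```

Under these assumptions the smaller effect gets the smaller p-value, because only the ratio enters the calculation and not the effect size on its own.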

Most practicing scientists would be better off not knowing what a p-value is.

  • This leads to the philosophical problem here: how can we really trust that the effect size is valid? This is made challenging by the different levels of validity involved:
    • study design is valid?
    • hypothesis testing is valid?
    • claims are valid?
  • I think the third bullet is especially hard. How can you best guarantee performance on an unseen input/output?