##### Imagine if there were a simple single statistical measure everybody could use with any set of data and it would reliably separate true from false. Oh, the things we would know! Unrealistic to expect such wizardry though, huh?

Yet, statistical significance is commonly treated as though it is that magic wand. Take a null hypothesis or look for any association between factors in a data set and *abracadabra*! Get a “*p* value” over or under 0.05 and you can be **95% certain** it’s either a fluke or it isn’t. You can eliminate the play of chance! You can separate the signal from the noise!

Except that you can’t. That’s not really what testing for statistical significance does. And therein lies the rub.

Testing for statistical significance estimates the probability of getting a result at least that extreme *if* the hypothesis being tested (usually the null hypothesis of no effect) is assumed to be true. It can’t on its own tell you whether this assumption was right, or whether the results would hold true in different circumstances. It provides a limited picture of probability, because it takes limited information about the data into account.
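To make that concrete, here is a minimal sketch with a made-up example: a coin lands heads 60 times in 100 flips, and the hypothesis under test is that the coin is fair. The function name `two_sided_binomial_p` and the numbers are illustrative assumptions, not anything from a particular study. The calculation only asks how often a fair coin would stray at least this far from 50 heads; it says nothing about whether the coin really is fair.

```python
from math import comb


def two_sided_binomial_p(heads: int, flips: int) -> float:
    """Probability, assuming a fair coin, of a head count at least as far
    from flips/2 as the one observed (an exact two-sided binomial test)."""
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count every outcome that is at least as extreme as the observed one.
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    # Each specific sequence of flips has probability 1 / 2**flips.
    return extreme_outcomes / 2 ** flips


# Hypothetical data: 60 heads in 100 flips.
p = two_sided_binomial_p(heads=60, flips=100)
print(f"p = {p:.4f}")
```

With these numbers the *p* value comes out at roughly 0.057, just on the "not significant" side of 0.05, even though 61 heads would have tipped it over. Nothing about the coin changed between those two outcomes.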

What’s more, the finding of statistical significance itself can be a “fluke,” and that becomes more likely in bigger data and when you run the test on multiple comparisons in the same data. You can read more about that here.
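The multiple comparisons problem is easy to see with a little arithmetic. The sketch below assumes the simplest possible case: every test is independent, there is no real effect anywhere, and each test uses the 0.05 threshold. Real data sets rarely have fully independent comparisons, so treat the figures as a rough illustration; the function name `chance_of_any_false_positive` is just a label chosen for this example.

```python
def chance_of_any_false_positive(tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `tests` independent comparisons comes up
    'significant' at level `alpha` when no real effect exists in any of them."""
    # Each test stays quiet with probability (1 - alpha); all must stay quiet.
    return 1 - (1 - alpha) ** tests


for number_of_tests in (1, 5, 20, 100):
    chance = chance_of_any_false_positive(number_of_tests)
    print(f"{number_of_tests:>3} tests: {chance:.0%} chance of at least one fluke")
```

Under these assumptions, 20 comparisons already give close to a two-in-three chance of at least one "significant" fluke.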

Statistical significance testing can easily sound as though it sorts the wheat from the chaff, but it’s not enough to do that on its own – and it can break down in the face of many challenges. Nor do all tests of statistical significance work the same way on all data sets. And what’s more, “significant” doesn’t mean it’s important either. A sliver of an effect can reach the less-than-5% threshold. We’ll come back to what all this means practically shortly.
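Here is a rough sketch of how a sliver of an effect can cross the threshold. It assumes a simple two-group comparison of averages with a known, equal standard deviation in both groups (a textbook *z* test), and a hypothetical difference of 0.02 standard deviations, which is tiny by almost any practical standard. The function name and the numbers are invented for illustration.

```python
from math import erfc, sqrt


def two_sample_z_p(mean_diff: float, sd: float, n_per_group: int) -> float:
    """Two-sided p value for a difference in means between two equal-sized
    groups, assuming a known, common standard deviation (z test)."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    # erfc(|z| / sqrt(2)) equals the two-sided normal tail probability.
    return erfc(abs(z) / sqrt(2))


# The same tiny effect, measured in a small study and in a very large one.
for n in (100, 100_000):
    p = two_sample_z_p(mean_diff=0.02, sd=1.0, n_per_group=n)
    print(f"n per group = {n:>7}: p = {p:.6f}")
```

The effect is identical in both runs. Only the sample size changes, and with it the verdict: nowhere near significant with 100 people per group, comfortably under 0.05 with 100,000.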

The common approach to statistical significance testing was so simple to grasp, though, and so easy to do even before there were computers, that it took the science world by storm. As Stephen Stigler explains in his piece on Fisher and the 5% level, “it opened the arcane domain of statistical calculation to a world of experimenters and research workers.”

But it also led to something of an avalanche of abuses. The over-simplistic approach to statistical significance has a lot to answer for. As John Ioannidis points out here, it is a serious player in science’s failure to replicate results.
