# Remove, rather than redefine, statistical significance

@article{Amrhein2017RemoveRT, title={Remove, rather than redefine, statistical significance}, author={Valentin Amrhein and Sander Greenland}, journal={Nature Human Behaviour}, year={2017}, volume={2}, pages={4} }

To the Editor — Benjamin et al.1 propose to redefine statistical significance with a trichotomy: what was once ‘highly significant’ (P < 0.005) becomes ‘significant’, what was once significant (P < 0.05) becomes ‘suggestive’, and what was ‘nonsignificant’ (P > 0.05) remains nonsignificant. Trichotomization is better than dichotomization, and we agree that P values around 0.05 convey only limited evidence against the tested hypothesis (which is usually a ‘null’ hypothesis of no effect)2. We also…
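The three-way labelling described in the letter's opening can be written out as a minimal sketch. The function name `classify_p` is hypothetical, and the treatment of a P value of exactly 0.05 is an assumption, since the excerpt only states P < 0.05 and P > 0.05.

```python
def classify_p(p: float) -> str:
    """Label a P value under the three-way scheme summarised above.

    Assumption: a P value of exactly 0.05 is treated as 'nonsignificant';
    the excerpt does not say which side that boundary falls on.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a P value must lie in [0, 1]")
    if p < 0.005:
        return "significant"      # formerly 'highly significant'
    if p < 0.05:
        return "suggestive"       # formerly 'significant'
    return "nonsignificant"       # unchanged


if __name__ == "__main__":
    # Nearly identical P values can land in different categories.
    for p in (0.004, 0.006, 0.049, 0.051):
        print(f"P = {p}: {classify_p(p)}")
```

Running the loop shows that 0.049 and 0.051 receive different labels although they differ only slightly, which is one way to picture the kind of threshold-based labelling that the letter's title proposes to remove rather than redefine.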

#### 86 Citations

Why 'Redefining Statistical Significance' Will Not Improve Reproducibility and Could Make the Replication Crisis Worse

- Psychology, Mathematics
- 2017

A recent proposal to "redefine statistical significance" (Benjamin, et al. Nature Human Behaviour, 2017) claims that false positive rates "would immediately improve" by factors greater than two and…

Abandon Statistical Significance

- Mathematics, Computer Science
- The American Statistician
- 2019

This work recommends dropping the NHST paradigm—and the p-value thresholds intrinsic to it—as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences and argues that it seldom makes sense to calibrate evidence as a function of p-values or other purely statistical measures.

Redefining the Critical Value of Significance Level (0.005 instead of 0.05): The Bayes Trace

- Biology
- Biology Bulletin
- 2019

The precise sense of some concepts, such as p-value, the Bayes factor, and the minimum a posteriori probability of the zero hypothesis are discussed in this review, made mainly with the examples related to the comparison of frequencies.

The p value wars (again)

- Psychology, Medicine
- European Journal of Nuclear Medicine and Molecular Imaging
- 2019

The p value is at the heart of a much wider discussion which started in earnest about a decade ago in Psychology and quickly percolated through the life sciences in general, and where the p value, or rather its interpretation, takes center stage.

Abandon Statistical Significance

- 2019

We discuss problems the null hypothesis significance testing (NHST) paradigm poses for replication and more broadly in the biomedical and social sciences as well as how these problems remain…

Redefining significance and reproducibility for medical research: A plea for higher P‐value thresholds for diagnostic and prognostic models

- Medicine, Mathematics
- European Journal of Clinical Investigation
- 2020

It is concluded that a lower P‐value threshold for declaring statistical significance implies more exaggeration in an estimated effect, which implies that if a low threshold is used, effect size estimation should not be attempted, for example in the context of selecting promising discoveries that need further validation.

The Impact of P-hacking on “Redefine Statistical Significance”

- Psychology
- Basic and Applied Social Psychology
- 2018

In their proposal to “redefine statistical significance,” Benjamin et al. claim that lowering the default cutoff for statistical significance from .05 to .005 would “immediately improve the…

Manipulating the Alpha Level Cannot Cure Significance Testing

- Psychology, Medicine
- Front. Psychol.
- 2018

We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = 0.05 to p = 0.005, is deleterious for the finding of new…

Beyond psychology: prevalence of p value and confidence interval misinterpretation across different fields

- Psychology
- 2020

P values and confidence intervals (CIs) are the most widely used statistical indices in scientific literature. Several surveys have revealed that these two indices are generally misunderstood.…

Three Recommendations for Improving the Use of p-Values

- Mathematics
- The American Statistician
- 2019

Researchers commonly use p-values to answer the question: How strongly does the evidence favor the alternative hypothesis relative to the null hypothesis? p-Values themselves do not directly…

#### References

The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research

- Medicine, Psychology
- PeerJ
- 2017

The widespread use of ‘statistical significance’ as a license for making a claim of a scientific finding leads to considerable distortion of the scientific process, and potential arguments against removing significance thresholds are discussed.

Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations

- Medicine, Psychology
- European Journal of Epidemiology
- 2016

Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of…

Invited Commentary: The Need for Cognitive Science in Methodology

- Computer Science, Medicine
- American Journal of Epidemiology
- 2017

It is concluded that methodological development and training should go beyond coverage of mechanistic biases to cover distortions of conclusions produced by statistical methods and psychosocial forces.

The Long Way From α-Error Control to Validity Proper

- Psychology, Medicine
- Perspectives on Psychological Science: A Journal of the Association for Psychological Science
- 2012

It is argued that, given the current state of affairs in behavioral science, false negatives often constitute a more serious problem and a scientific culture rewarding strong inference is more likely to see progress than a culture preoccupied with tightening its standards for the mere publication of original findings.

The ASA Statement on p-Values: Context, Process, and Purpose

- Psychology
- 2016

Cobb’s concern was a long-worrisome circularity in the sociology of science based on the use of bright lines such as p < 0.05: “We teach it because it’s what we do; we do it because it’s what we…

For and Against Methodologies: Some Perspectives on Recent Causal and Statistical Inference Debates

- Medicine
- European Journal of Epidemiology
- 2017

It is argued that, once these misconceptions are removed, most elements of the opposing views can be reconciled and the chief problem of causal inference becomes one of how to teach sound use of formal methods and how to apply them without generating the overconfidence and misinterpretations that have ruined so many statistical practices.

Power failure: why small sample size undermines the reliability of neuroscience

- Psychology, Medicine
- Nature Reviews Neuroscience
- 2013

It is shown that the average statistical power of studies in the neurosciences is very low, and the consequences include overestimates of effect size and low reproducibility of results.

Statistical Inference: A Commentary for the Social and Behavioural Sciences (Michael Oakes)

- Psychology
- 1986

Contents: Preface. On Significance Tests: The Logic of the Significance Test; A Critique of Significance Tests; Intuitive Statistical Judgements. Schools of Statistical Inference: Theories of Probability; Further…

Competing interests: The authors declare no competing interests. Nature Human Behaviour | VOL 2 | JANUARY 2018 | 4 | www.nature.com/nathumbehav
