TY - JOUR
T1 - Manipulating the alpha level cannot cure significance testing
AU - Trafimow, David
AU - Amrhein, Valentin
AU - Areshenkoff, Corson N.
AU - Barrera-Causil, Carlos J.
AU - Beh, Eric J.
AU - Bilgiç, Yusuf K.
AU - Bono, Roser
AU - Bradley, Michael T.
AU - Briggs, William M.
AU - Cepeda-Freyre, Héctor A.
AU - Chaigneau, Sergio E.
AU - Ciocca, Daniel R.
AU - Correa, Juan C.
AU - Cousineau, Denis
AU - de Boer, Michiel R.
AU - Dhar, Subhra S.
AU - Dolgov, Igor
AU - Gómez-Benito, Juana
AU - Grendar, Marian
AU - Grice, James W.
AU - Guerrero-Gimenez, Martin E.
AU - Gutiérrez, Andrés
AU - Huedo-Medina, Tania B.
AU - Jaffe, Klaus
AU - Janyan, Armina
AU - Karimnezhad, Ali
AU - Korner-Nievergelt, Fränzi
AU - Kosugi, Koji
AU - Lachmair, Martin
AU - Ledesma, Rubén D.
AU - Limongi, Roberto
AU - Liuzza, Marco T.
AU - Lombardo, Rosaria
AU - Marks, Michael J.
AU - Meinlschmidt, Gunther
AU - Nalborczyk, Ladislas
AU - Nguyen, Hung T.
AU - Ospina, Raydonal
AU - Perezgonzalez, Jose D.
AU - Pfister, Roland
AU - Rahona, Juan J.
AU - Rodríguez-Medina, David A.
AU - Romão, Xavier
AU - Ruiz-Fernández, Susana
AU - Suarez, Isabel
AU - Tegethoff, Marion
AU - Tejo, Mauricio
AU - van de Schoot, Rens
AU - Vankov, Ivan I.
AU - Velasco-Forero, Santiago
PY - 2018
Y1 - 2018/5/15
N2 - We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = 0.05 to p = 0.005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools should be taken as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold of 0.05, 0.01, 0.005, or anything else, is not acceptable.
KW - Decision making
KW - Null hypothesis testing
KW - P-value
KW - Significance testing
KW - Statistical significance
UR - http://www.scopus.com/inward/record.url?scp=85047014418&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047014418&partnerID=8YFLogxK
U2 - 10.3389/fpsyg.2018.00699
DO - 10.3389/fpsyg.2018.00699
M3 - Article
AN - SCOPUS:85047014418
SN - 1664-1078
VL - 9
JO - Frontiers in Psychology
JF - Frontiers in Psychology
IS - MAY
M1 - 699
ER -