Photo by Rhoda Baer
A review of the biomedical literature indicates an increase in the use of P values in recent years, but researchers say this technique can provide misleading results.
“It’s usually a suboptimal technique, and then it’s used in a biased way, so it can become very misleading,” said John Ioannidis, MD, of Stanford University in California.
He and his colleagues reviewed the use of P values and reported their findings in JAMA.
The team used automated text-mining analysis to extract data on P values reported in 12,821,790 MEDLINE abstracts and 843,884 abstracts and full-text articles in PubMed Central from 1990 to 2015.
The researchers also assessed the reporting of P values in 151 English-language core clinical journals and specific article types as classified by PubMed.
They manually evaluated a random sample of 1000 MEDLINE abstracts for reporting of P values and other types of statistical information. And for the abstracts reporting empirical data, 100 of the corresponding articles were assessed in their entirety.
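The study's actual text-mining pipeline is not reproduced here, but a minimal sketch of how P values might be pulled from abstract text with a regular expression gives a sense of the approach. The pattern and the sample sentences below are illustrative assumptions, not the authors' code.

```python
import re

# Matches common P value formats such as "P = 0.01", "p<.001", or "P <= 0.05".
P_VALUE_PATTERN = re.compile(
    r"\bp\s*(?:<=|>=|=|<|>)\s*0?\.\d+",
    flags=re.IGNORECASE,
)

# Hypothetical abstract sentences, used only to show the pattern in action.
abstracts = [
    "Treatment reduced mortality (hazard ratio 0.78; P = 0.01).",
    "No significant difference was observed between groups (p > 0.05).",
    "The cohort was followed for a median of 4.2 years.",
]

for text in abstracts:
    matches = P_VALUE_PATTERN.findall(text)
    print(matches if matches else "no P value reported")
```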
The data showed that reporting of P values more than doubled from 1990 to 2014—increasing from 7.3% to 15.6%.
In abstracts from core medical journals, 33% reported P values in 2014. And in the subset of randomized, controlled clinical trials, nearly 55% reported P values in 2014.
Dr Ioannidis noted that some researchers mistakenly think a P value is an estimate of how likely it is that a result is true.
“The P value does not tell you whether something is true,” he explained. “If you get a P value of 0.01, it doesn’t mean you have a 1% chance of something not being true.”
“A P value of 0.01 could mean the result is 20% likely to be true, 80% likely to be true, or 0.1% likely to be true—all with the same P value. The P value alone doesn’t tell you how true your result is.”
For an actual estimate of how likely a result is to be true or false, Dr Ioannidis said, researchers should instead use false-discovery rates or Bayes factor calculations.
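To make that distinction concrete, a simple back-of-the-envelope calculation, in the spirit of Dr Ioannidis's argument rather than taken from the JAMA paper, shows how the same significance threshold can correspond to very different probabilities that a finding is true, depending on how plausible the hypothesis was before the study and how well powered the study is. The power, threshold, and prior values below are hypothetical.

```python
def prob_finding_is_true(prior, power=0.8, alpha=0.01):
    """Probability that a result passing the significance threshold reflects a true effect."""
    true_positives = power * prior           # true effects that reach significance
    false_positives = alpha * (1 - prior)    # null effects that reach significance by chance
    return true_positives / (true_positives + false_positives)

# Same threshold, three hypothetical priors: a coin-flip hypothesis, a long shot,
# and a highly exploratory one. The resulting probabilities differ enormously.
for prior in (0.5, 0.1, 0.001):
    print(f"prior {prior}: P(true effect | significant) = {prob_finding_is_true(prior):.3f}")
```

With these assumed inputs, the probability that a "significant" finding is real ranges from roughly 99% down to about 7%, even though the significance threshold never changes, which is exactly the point of the quote above.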
He and his colleagues assessed the use of false-discovery rates, Bayes factor calculations, effect sizes, and confidence intervals in the 796 papers in their review that contained empirical data.
They found that 111 of these papers reported effect sizes, and 18 reported confidence intervals. None of the papers reported Bayes factors or false-discovery rates.
Fewer than 2% of the abstracts the team reviewed reported both an effect size and a confidence interval.
In a manual review of 99 randomly selected full-text articles, the researchers found that 55 articles reported at least 1 P value. But only 4 articles reported confidence intervals for all effect sizes, none used Bayesian methods, and 1 used false-discovery rates.
In light of these findings, Dr Ioannidis advocates more stringent approaches to analyzing data.
“The way to move forward is that P values need to be used more selectively,” he said. “When used, they need to be complemented by effect sizes and uncertainty [confidence intervals]. And it would often be a good idea to use a Bayesian approach or a false-discovery rate to answer the question, ‘How likely is this result to be true?’”
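As a rough illustration of that advice, and not the study's own analysis, the sketch below reports a simulated effect size with an approximate 95% confidence interval alongside its P value, then applies a Benjamini-Hochberg adjustment to a hypothetical family of P values to control the false-discovery rate. All numbers are simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
treatment = rng.normal(loc=0.4, scale=1.0, size=50)  # simulated outcomes
control = rng.normal(loc=0.0, scale=1.0, size=50)

# Effect size (mean difference) with an approximate 95% CI, reported alongside the P value.
diff = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / len(treatment) + control.var(ddof=1) / len(control))
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se
t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"mean difference {diff:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), P = {p_value:.3f}")

# Benjamini-Hochberg adjustment across a family of P values controls the
# false-discovery rate instead of judging each P value in isolation.
p_values = np.array([0.001, 0.008, 0.039, 0.041, 0.20])
order = np.argsort(p_values)
ranked = p_values[order] * len(p_values) / np.arange(1, len(p_values) + 1)
q_values = np.minimum.accumulate(ranked[::-1])[::-1]  # enforce monotonicity
print("BH-adjusted (q) values:", np.round(q_values[np.argsort(order)], 3))
```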