The power of power
The first trial of beta-blockers in myocardial infarction was entitled “The lack of prophylactic effect of propranolol in myocardial infarction”1—a conclusion inconsistent with our current understanding of beta-blocker therapy. The reason has to do with “statistical power”—a statistic that tells us the chance of finding a significant difference between treatments.2
Type 1 and type 2 errors
We draw conclusions based on the results of clinical trials. No trial is perfect. Every trial is designed with the knowledge that there is some probability its results will lead to a conclusion that does not reflect the truth about the 2 or more therapies being compared.
If we conclude from the results of a trial that 2 therapies differ in effectiveness when in reality they are the same, we have committed what is known as a type 1 error. The probability of making a type 1 error is termed alpha (α). Trials are usually designed with an α of 0.05 (5%).
On the other hand, if we conclude that the 2 therapies are the same when they are actually different, we have committed a type 2 error. The probability of making a type 2 error is known as beta (β).
Perhaps a bit more intuitively, we are often interested in knowing the probability of finding a difference when there really is one. This probability is called power and may be expressed as 1 − β.
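Written as conditional probabilities, the three quantities defined above look like this. The wording of the events is a paraphrase for illustration, not notation taken from the cited sources.

```latex
\begin{align*}
\alpha &= P(\text{conclude the therapies differ} \mid \text{they are the same}) \\
\beta &= P(\text{conclude the therapies are the same} \mid \text{they differ}) \\
\text{power} &= 1 - \beta = P(\text{conclude the therapies differ} \mid \text{they differ})
\end{align*}
```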
Power in study design
The power of a study to detect differences between 2 groups depends on the number of subjects in each group, whether the groups are equal in size, the variability of responses among subjects, the magnitude of the difference one is trying to detect, and the probability of making a type 1 error.3 Researchers can use educated assumptions about these factors to determine how many subjects to include so that a clinically relevant difference between 2 groups, if one exists, is likely to be detected.
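To make the dependence on these factors concrete, here is a minimal sketch. It is not the method of any study cited here. It assumes a two-sided comparison of two group means using a normal approximation, with the difference expressed as a standardized effect size (the true difference divided by the common standard deviation). The function name `approx_power`, its parameters, and the example inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n1, n2, effect_size, alpha=0.05):
    """Approximate power of a two-sided test comparing two group means.

    Illustrative normal approximation only. effect_size is the true
    difference in means divided by the common standard deviation, so
    greater variability among subjects means a smaller effect_size.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    # Expected test statistic; unequal groups shrink it for a fixed total.
    shift = effect_size / sqrt(1 / n1 + 1 / n2)
    # Probability that the statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical inputs: assumed effect size of 0.4 throughout.
    print(f"equal groups (100 vs 100):  {approx_power(100, 100, 0.4):.2f}")
    print(f"unequal groups (150 vs 50): {approx_power(150, 50, 0.4):.2f}")
    print(f"small trial (20 vs 20):     {approx_power(20, 20, 0.4):.2f}")
```

Under these assumptions, equal groups of 100 give a power of roughly 0.8. The same 200 subjects split 150 to 50 give noticeably less, and a trial with 20 per group is more likely to miss the difference than to find it.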
Practicing clinicians should use power to judge how much weight to give a negative study. For example, the propranolol study1 was designed with a power of only 23%, meaning that there was only a 23% chance of detecting a difference if one existed. Drawing conclusions about the lack of effectiveness of propranolol based on this study, therefore, would be a mistake. In clinical trials of an active drug vs a placebo, 100 or more subjects in each group are often needed to detect clinically relevant differences, so beware of negative results with small numbers of patients.
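The rule of thumb of 100 or more subjects per group can be compared with a standard approximate sample-size formula. This is a sketch under stated assumptions: equal group sizes, a two-sided test at α = 0.05, a normal approximation, and a target power of 80%. The 80% target and the effect sizes are assumptions chosen for illustration and do not come from the article. The function name `subjects_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(effect_size, power=0.80, alpha=0.05):
    """Approximate subjects needed per group (equal groups, two-sided test).

    Illustrative normal-approximation formula:
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 / effect_size^2
    where effect_size is the difference in means divided by the
    common standard deviation.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # guards against a type 1 error
    z_power = z.inv_cdf(power)  # guards against a type 2 error
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.4, 0.5, 0.8):  # hypothetical effect sizes
        print(f"effect size {d}: about {subjects_per_group(d)} per group")
```

With an assumed effect size of 0.4, the formula gives about 99 subjects per group, in line with the rule of thumb. Halving the effect size to 0.2 roughly quadruples the requirement.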
Correspondence
Goutham Rao, MD, 3518 Fifth Avenue, Pittsburgh, PA 15261. E-mail: [email protected].
1. Clausen J, Felski M, Jorgensen FS. The lack of prophylactic effect of propranolol in myocardial infarction. Lancet 1966;2:920-924.
2. Freiman JA, Chalmers TC, Smith H. The importance of beta, the Type II error, and sample size in the design and interpretation of the randomized control trial. N Engl J Med 1978;299:690-694.
3. Cohen J. A power primer. Psychological Bulletin 1992;112:155-159.
Are NSAIDs more effective than acetaminophen in patients with osteoarthritis?
BACKGROUND: Two smaller randomized controlled trials failed to show a statistically significant difference between acetaminophen and the nonsteroidal anti-inflammatory drugs (NSAIDs) ibuprofen and naproxen in the treatment of osteoarthritis (OA). However, survey data showing a benefit with NSAIDs and a meta-analysis suggesting their superiority led the researchers to conduct this larger clinical trial.
POPULATION STUDIED: A total of 227 patients were recruited in 12 ambulatory sites either directly or by advertising. The study was not conducted in a primary care setting. Eighty percent of the patients had already seen a rheumatologist before recruitment in the study. Patients were older than 40 years, had Kellgren-Lawrence radiographic scale grade 2 to 4 OA of the hip or knee, and a visual analog pain scale score of 30 mm or greater (range=0-100). Patients with severe comorbidities and hypersensitivity to the medications were excluded.
STUDY DESIGN AND VALIDITY: This was a double-blind crossover trial with all patients receiving both therapies. The study consisted of 2 treatment periods of 6 weeks each, separated by a 3- to 7-day washout period. In period 1, half the patients took diclofenac 75 mg plus 200 μg misoprostol twice daily, and the other group of patients took acetaminophen 1000 mg 4 times daily. Both groups took a placebo of the other medication. In period 2, the therapies for the 2 groups were reversed.
OUTCOMES MEASURED: There were 2 primary outcome measures. The first was the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) directed to the primary involved joint. The second was the Multidimensional Health Assessment Questionnaire (MDHAQ) visual analog pain scale. These pain, stiffness, and functional status scoring systems are the best, although imperfect, measurement tools for OA. Since the scores are derived from patient self-reporting, they are reasonable surrogates for patient-oriented outcomes. Other outcomes measured included gastrointestinal distress, global patient status, a general bodily pain score, and the investigator’s estimate of patient status.
RESULTS: For the first 6-week treatment period, the WOMAC index improved by 12.2 points (on a 100-point scale) for the diclofenac-treated patients and by 6.6 points for the acetaminophen-treated patients. For the second period, the improvement was 12.9 points and 2.1 points, again favoring diclofenac. Likewise, the MDHAQ pain scale improved more with diclofenac plus misoprostol in both treatment periods, 20.8 points (also a 100-point scale) compared with 13.1 for acetaminophen in period 1, and 24.6 points versus 0.4 points for acetaminophen in period 2.
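The between-treatment gaps implied by these figures can be tabulated with simple subtraction. This is only descriptive arithmetic on the reported mean improvements. It is not the trial's own crossover analysis and yields no confidence intervals or P values. The variable and function names are hypothetical.

```python
# Mean improvements reported in the RESULTS paragraph (100-point scales),
# keyed by (outcome measure, treatment period).
improvement = {
    ("WOMAC", 1): {"diclofenac": 12.2, "acetaminophen": 6.6},
    ("WOMAC", 2): {"diclofenac": 12.9, "acetaminophen": 2.1},
    ("MDHAQ pain", 1): {"diclofenac": 20.8, "acetaminophen": 13.1},
    ("MDHAQ pain", 2): {"diclofenac": 24.6, "acetaminophen": 0.4},
}


def differences(data):
    """Return diclofenac minus acetaminophen improvement, in points."""
    return {
        key: round(arms["diclofenac"] - arms["acetaminophen"], 1)
        for key, arms in data.items()
    }


if __name__ == "__main__":
    for (measure, period), gap in differences(improvement).items():
        print(f"{measure}, period {period}: {gap} points in favor of diclofenac")
```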
This well-designed trial found the NSAID diclofenac to be more effective than acetaminophen in patients with moderate to severe arthritis. The 2 drugs provided similar pain relief in patients with mild symptoms. For now, patients with mild OA should still be offered acetaminophen based on its better side effect profile and its therapeutic equivalence. For certain patients with more severe symptoms, NSAIDs will be the better choice. Whether either of these agents should be offered before nonpharmacologic or nonsystemic therapy still has not been adequately studied. Perhaps more important, we can now more enthusiastically recommend NSAIDs for our OA patients who do not have contraindications and who have had an inadequate response to acetaminophen. It should be comforting for them and for us to know that there likely will be an added benefit.