STUDY DESIGN: Two family physicians attempted to answer 20 questions with each of the databases evaluated. The adequacy of the answers was determined by the 2 physician searchers, and an arbitration panel of 3 family physicians was used if there was disagreement.
DATA SOURCE: We identified 38 databases through nominations from national groups of family physicians, medical informaticians, and medical librarians; 14 of these databases met predetermined eligibility criteria.
OUTCOME MEASURED: The primary outcome was the proportion of questions adequately answered by each database and by combinations of databases. We also measured mean and median times to obtain adequate answers for individual databases.
RESULTS: The agreement between family physician searchers regarding the adequacy of answers was excellent (κ=0.94). Five individual databases (STAT!Ref, MDConsult, DynaMed, MAXX, and MDChoice.com) answered at least half of the clinical questions. Some combinations of databases answered 75% or more. The average time to obtain an adequate answer ranged from 2.4 to 6.5 minutes.
CONCLUSIONS: Several current electronic medical databases could answer most of a group of 20 clinical questions derived from family physicians during office practice. However, point-of-care searching is not yet fast enough to address most clinical questions identified during routine clinical practice.
Family physicians and general internists report an average of 6 questions for each half-day of office practice,1-3 and 70% of these questions remain unanswered. The 2 factors that significantly predict whether a physician will attempt to answer a clinical question are the physician’s belief that a definitive answer exists and the urgency of the patient’s problem.4
Gorman and colleagues3 reported that medical librarians found clear answers for 46% of 60 randomly selected questions from family physicians and information that would affect practice for 51%. The medical librarians searched for an average of 43 minutes per question. In a second study,5 medical librarians used MEDLINE and textbooks to answer 86 questions from family physicians. The MEDLINE searches took a mean of 27 minutes, and textbook searches took a mean of 6 minutes. Search results answered 54% of the clinical questions completely or nearly completely. Physicians estimated that the answers would have a “major” or “fairly major” impact on practice for 35% of their questions. MEDLINE searches provided answers to 43% of the questions, while textbook searches provided answers for an additional 11%.
Many physicians do not have the searching skills or access to the range of knowledge resources that librarians use. Even if they did, they do not take the time to conduct such searches during patient care. One study1 found that physicians spent less than 2 minutes on average seeking an answer to a question. Thus, most clinical questions remain unanswered.
Electronic medical databases that provide answers directly (not just reference citations) may make it easier for clinicians to obtain answers at the point of care. We found no systematic evaluation of the capacity of such databases to answer clinical questions. We conducted this study to determine the extent to which current electronic medical databases can answer family physicians’ point-of-care clinical questions.
Methods
Database Selection
We solicited nominations for potentially suitable databases from multiple E-mail lists (including communities of family physicians [Family-L], medical informaticians [FAM-MED], and medical librarians [MEDLIB-L, MCMLA-L]) and through Web searches. A selection team consisting of 3 family physicians (J.S., D.W., B.E.) and a medical librarian (none of whom had financial relationships with any databases) determined whether the nominated databases met our inclusion criteria (Table 1).
Clinical Questions
More than 1200 clinical questions had been previously collected from observations of family physicians during office practice.1,5 These questions had been classified by typology (eg, Is test X indicated in situation Y?) and by topic (eg, dermatology).1 We selected questions from these sources that were categorized among the most common typologies (8 of 68 typologies covering 50% of the questions) and the most common topics (7 of 62 topics covering 43% of the questions). These combinations of typologies and topics accounted for 272 (23%) of the 1204 questions.
If necessary, each question was translated by 2 physicians (B.A. and D.W. working together) to meet the following criteria: (1) clear enough to imagine an applicable clinical scenario, (2) answerable (ie, the question could theoretically be answered using clinical references without further patient data regardless of whether an answer was known to exist), (3) clinically relevant, and (4) true to the original question (ie, containing the information need and the modifying factors of the original question).
Each question was then independently proofread by at least 2 other physicians and translated again if necessary. Thirteen questions (5%) that did not meet these criteria after a second translation were dropped. Forty-seven questions (17%) that referred to information needs that could be adequately answered using the Physicians’ Desk Reference6 were dropped (eg, Are Paxil tablets scored?). The remaining 212 questions represented 8 typologies.1 Two or 3 questions were randomly selected from each typology for a total of 20 questions (Table 2).
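For concreteness, a minimal sketch (in Python) of the stratified random draw described in this subsection: 2 questions per typology as a baseline, with randomly chosen typologies receiving a third question until 20 are reached. The data structure, names, and toy contents are illustrative assumptions, not the study’s actual materials, and the allocation rule is one plausible reading of “2 or 3 questions were randomly selected from each typology.”

```python
import random

# Hypothetical structure for the 212 eligible questions, keyed by typology.
# The study's actual question file is not public; keys and contents here
# are placeholders.
questions_by_typology = {
    "typology_1": ["question A", "question B", "question C", "question D"],
    "typology_2": ["question E", "question F", "question G"],
    # ... the real data set had 8 typologies ...
}

def sample_test_set(questions_by_typology, total=20, seed=None):
    """Draw 2 or 3 questions at random from each typology until `total`
    questions are selected, mirroring the selection step described above."""
    rng = random.Random(seed)
    strata = list(questions_by_typology)
    counts = {t: 2 for t in strata}          # baseline: 2 per typology
    extras = total - 2 * len(strata)         # with 8 typologies: 20 - 16 = 4
    for t in rng.sample(strata, extras):     # these typologies contribute 3
        counts[t] = 3
    selected = []
    for t in strata:
        selected.extend(rng.sample(questions_by_typology[t], counts[t]))
    return selected

# Toy run with the 2-typology placeholder data (total scaled down to fit):
print(sample_test_set(questions_by_typology, total=5, seed=1))
```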
Testing
Two family physicians with experience in computer searching (B.A., D.W.) independently searched for answers using each of the included databases. In the case of DynaMed, for which Dr Alper is the medical director, another family physician was substituted as a searcher, and Dr Alper had no input or control over the testing or arbitration process for answers from DynaMed. Testing took place in April and May 2000.
Searching was performed using computers with Pentium III processors and a 100 megabit-per-second network connection to the Internet and to server-mounted CD-ROMs.
Each searcher used the same 20 questions to evaluate each database. The order in which the databases were evaluated was at the searchers’ discretion, but testing of one database was completed before testing of another began. Before testing each database, searchers familiarized themselves with it using 5 screening questions that were separate from the 20 test questions.
A maximum of 10 minutes was allowed per question. Each answer was rated as adequate or inadequate. An answer was considered adequate if it contained sufficient information to guide clinical practice. For example, for the question “How do I determine the cause of chronic pruritus?”, the answer from the University of Iowa Family Practice Handbook (www.vh.org/Providers/ClinRef/FPHandbook/Chapter13/01-13.html) was considered adequate because it included clinically useful recommendations: “History should include details about (1) any skin lesions preceding the pruritus; (2) history of weight loss, fatigue, fever, malaise; (3) any recent stress emotionally; and (4) recent medications and travel. Physical examination with emphasis on the skin and its appendages — xerosis, excoriation, lichenification, hydration. Laboratory tests as suggested by the PE, which may include CBC, ESR, fasting glucose, renal or liver function tests, hepatitis panel, thyroid tests, stool for parasites, CXR.”
Sources that provided general recommendations without information that could specifically guide clinical practice were considered inadequate. For example: “The cause of generalized pruritus should be sought and corrected. If no skin disease is apparent, a systemic disorder or drug-related cause should be sought.” The searcher recorded the answer and the time taken to obtain it, rounded to the nearest minute (1-10).
Scoring and Arbitration
The 2 physician searchers judged the adequacy of the answers to each question for each database. If the searchers both found adequate answers, the result was accepted as adequate, and the average time required to find and interpret the answer was recorded. If neither searcher found an adequate answer, then the answer was deemed inadequate. If only one searcher found an adequate answer, the second searcher evaluated that answer. If the answer was acceptable to the second searcher, it was considered an adequate answer, and the time for the first searcher was recorded.
When searchers disagreed on the adequacy of identified answers, an arbitration panel consisting of 3 family physicians who were not affiliated with any of the databases met independently from the searchers to determine the adequacy of the answers by consensus.
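Restated as a small decision procedure, the scoring and arbitration rules above look roughly like the sketch below. The record shape and function names are mine, not the authors’; the "arbitrate" verdict stands in for referral to the 3-physician panel.

```python
from statistics import mean
from typing import NamedTuple, Optional, Tuple

class SearchResult(NamedTuple):
    found_adequate: bool       # this searcher found and judged an answer adequate
    minutes: Optional[int]     # search time, rounded to whole minutes (1-10)

def adjudicate(a: SearchResult, b: SearchResult,
               reviewer_accepts: Optional[bool] = None) -> Tuple[str, Optional[float]]:
    """Apply the scoring rules described above. `reviewer_accepts` is the
    second searcher's verdict on the single adequate answer found by the
    other searcher; it is consulted only when exactly one searcher found one."""
    if a.found_adequate and b.found_adequate:
        return "adequate", mean([a.minutes, b.minutes])   # average time recorded
    if not a.found_adequate and not b.found_adequate:
        return "inadequate", None
    finder = a if a.found_adequate else b
    if reviewer_accepts:
        return "adequate", float(finder.minutes)          # finder's time recorded
    return "arbitrate", None   # panel of 3 family physicians decides by consensus

# Example: searcher A answered in 3 minutes; B found nothing but accepts A's answer.
print(adjudicate(SearchResult(True, 3), SearchResult(False, None), True))
# -> ('adequate', 3.0)
```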
Analysis
Our primary outcome was the proportion of questions adequately answered by each database. We calculated 95% confidence limits for the proportions of adequate answers.7 Means and medians were determined for the time to reach adequate answers for each database. We calculated the κ statistic for the independent findings of the 2 searchers and for the results after the searchers reviewed each other’s searches.8 We combined the results of individual databases to determine the proportion of questions answered by all combinations of 2, 3, and 4 databases. A question was considered adequately answered by a combination if any of its individual databases adequately answered the question.
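A minimal sketch of the three computations named in this paragraph: Cohen’s κ = (p_o − p_e)/(1 − p_e) for the searchers’ paired adequacy judgments, a normal-approximation 95% confidence interval for each database’s proportion (the paper cites Pagano and Gauvreau,7 whose method may differ in detail), and the union rule for scoring combinations of databases. Variable names and the toy coverage sets are assumptions for illustration.

```python
from itertools import combinations
from math import sqrt

def cohen_kappa(x, y):
    """Cohen's kappa for two raters' binary (adequate/inadequate) calls,
    given parallel lists of 0/1 judgments."""
    n = len(x)
    p_o = sum(a == b for a, b in zip(x, y)) / n              # observed agreement
    p_yes = (sum(x) / n) * (sum(y) / n)                      # chance both say yes
    p_no = (1 - sum(x) / n) * (1 - sum(y) / n)               # chance both say no
    p_e = p_yes + p_no
    return (p_o - p_e) / (1 - p_e)

def proportion_ci(successes, n, z=1.96):
    """Wald 95% CI for a proportion; the paper's cited method may differ."""
    p = successes / n
    half = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

def best_combinations(answered, k):
    """answered: {database: set of question ids adequately answered}.
    Scores each k-database combination by the size of the union of questions
    answered, following the 'any database answers it' rule above."""
    results = []
    for combo in combinations(answered, k):
        covered = set().union(*(answered[d] for d in combo))
        results.append((combo, len(covered)))
    return sorted(results, key=lambda r: -r[1])

# Toy illustration with made-up coverage sets over 20 questions:
answered = {"DB_A": set(range(0, 12)), "DB_B": set(range(8, 18)),
            "DB_C": set(range(5, 15))}
print(best_combinations(answered, 2)[0])   # best 2-database combination
```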
Results
Thirty-eight databases were nominated, and 24 did not meet our inclusion criteria (Table W1).* Fourteen databases met the inclusion criteria (Table 3) and were evaluated with the set of 20 questions (280 answer assessments) by 2 searchers. The Figure summarizes the process of evaluating the answers. The initial agreement between searchers was good (κ=0.69). Discussion between the searchers resolved 21 (52.5%) of the 40 discrepant answer assessments; in each of these cases, one searcher had failed to find the answer because of inadequate searching or timing out (searching for the full 10 minutes) and agreed with the adequacy of the answer found by the other searcher. The agreement between searchers at this stage was excellent (κ=0.94).
The remaining 19 discrepant assessments (for which the searchers had different opinions regarding the adequacy of the answers identified) were referred to the arbitration panel for determination of the final results. Ten of these were deemed adequate.
Table 3 reports the results for individual databases, ranked by the proportion of questions answered and then by the average time to identify adequate answers. The combination of STAT!Ref and MDConsult could answer 85% of our set of 20 questions. Four combinations of 2 databases (STAT!Ref and either MAXX, MDChoice.com, Primary Care Guidelines, or Medscape) could answer 80% of our questions. Two combinations of 3 databases (STAT!Ref, MDConsult, and either DynaMed or MAXX) could answer 90% of our questions. Combinations of 4 databases answered the most sample questions (95%, 19/20). These combinations consisted of STAT!Ref, DynaMed, MAXX, and either MDConsult or American Family Physician.
We also evaluated combinations of databases that were available at no cost. The combination of the 2 no-cost databases that answered the largest proportion of questions (75%) was DynaMed and American Family Physician. The greatest proportion of clinical questions that could be answered using the freely available sources was 80%, and this required the use of 3 databases (DynaMed, MDChoice.com, and American Family Physician).
Discussion
Our study suggests that individual databases can answer a considerable proportion of family physicians’ clinical questions and that combinations of currently available databases can answer 75% or more. The searches in this study were based on the combined efforts of 2 experienced physician searchers. These results may not be replicable in the practice setting but do provide an objective best-case assessment of the content of these databases.
The time required to obtain answers, while much less than searching for original articles, is still longer than the 2-minute average time spent by family physicians in the study by Ely and colleagues.1 Our time estimates are not precise, as time was not the primary focus of our study. Time was only recorded in 1-minute intervals, so searches that took 10 seconds were recorded as 1 minute. Even so, median times to obtain adequate answers of greater than 2 minutes suggest that these databases may require more time than most physicians will take to pursue answers during patient care.
This is the first study to systematically evaluate how many questions can be answered by electronic medical databases. The strengths of this study include the use of a standard set of common questions asked by family physicians, testing by 2 experienced family physician searchers, and a systematic, replicable approach to the evaluation. The only similar study we identified was one in which Graber and coworkers9 used 10 clinical questions and tested a commercial site, 2 medical meta-lists, 4 general search engines, and 9 medicine-specific search engines to determine the efficiency of answering clinical questions on the Web. Different approaches answered from 0 to 6 of the 10 questions, but that study looked primarily at sites that were not generally designed for use in clinical practice.
Limitations
Our study was limited by the relatively small number of questions, causing wide confidence intervals. Some answers were present in the databases but not found despite the use of 2 searchers. For example, a database manager identified 2 answers that were not found but would have been considered adequate.
We accepted answers as adequate if, in our judgment, they offered a practical course of action. We did not attempt to determine whether the individual asking the question believed that the answer was adequate, nor did we attempt to validate the accuracy or currency of answers against independent standards. Many of the answers were based on sources that were several years old, and few were based on explicit evidence-based criteria. Although we determined the adequacy of answers for clinical practice through formal mechanisms, an in vivo study in which the clinicians asking the questions judged the adequacy of their findings during patient care would provide a more accurate assessment.
Our study presents a static evaluation of a dynamic field. Over time, answers may be lost because of lack of maintenance of resource links or may be gained by addition of new materials. Our use of questions gathered several years ago may not accurately reflect the ability of databases to answer current questions, which may be more likely to reflect new tests and treatments.
Many of the databases were designed for purposes other than meeting clinical information needs at the point of care. Performance in this study does not reflect the capacity of these databases to address their stated purposes. For example, the Translating Research Into Practice (TRIP) database is an excellent resource for searches of a large collection of evidence-based resources. These resources are generally limited to summaries of studies with the highest methodologic quality. The TRIP database did not perform well in our study partly because most of our test questions (consistent with questions in clinical practice) cannot currently be answered using studies of the highest methodologic quality. Another example is Medical Matrix, which provides a search engine and annotated summaries for exploring the entire medical Internet and not just clinical reference information.
We did not study the costs involved in using the databases we evaluated, and these costs may have changed since our study was conducted. Most of the databases we included were free to use at the time of the study and at the time of this report. The 3 collections of textbooks required access fees. STAT!Ref, which scored the highest in our study, did so because we used the complete collection available to us through our institutional library. This collection would cost an individual $2189 annually at the time of our study. A starter library was available for $199 annually and would answer only 40% of the questions.
Context
Family physicians and other primary care providers treat patients who have a wide variety of syndromes and symptoms. Because of the scope and breadth of primary care, it is nearly impossible for a clinician to keep up with rapidly changing medical information.10
Connelly and colleagues11 surveyed 126 family physicians and found that they used the Physicians’ Desk Reference and consultations with colleagues much more often than Index Medicus or computer-based bibliographic retrieval systems. The research literature was used infrequently and was rated among the lowest sources in terms of credibility, availability, searchability, understandability, and applicability. Physicians preferred low-cost sources relevant to specific patient problems over higher-quality sources.
Conclusions
Current databases can answer a considerable proportion of clinical questions but have not reached their potential for efficiency. It is our hope that as electronic medical databases mature, they will be able to bridge this gap and bring the research literature to the point of care in useful and practical ways. This study provides a snapshot of how far we have come and how far we need to go to meet these needs.
Acknowledgments
Funding for our study was provided by a grant from the American Academy of Family Physicians to support the Center for Family Medicine Science and from 2 Bureau of Health Professions Awards (DHHS 1-D14-HP-00029-01, DHHS 5 T32 HP10038) from the Health Resources and Services Administration to the Department of Family and Community Medicine at University of Missouri-Columbia. The authors would like to acknowledge Erik Lindbloom, MD, MSPH, for assisting with the database testing as a substitute searcher for B.A.; E. Diane Johnson, MLS, for assisting with the selection of databases for study inclusion; Robert Phillips, Jr., MD, MSPH, for arbitration of questions and answers for which the searchers did not reach agreement along with B.E. and J.S.; David Cravens, Erik Lindbloom, Kevin Kane, Jim Brillhart, and Mark Ebell for proofreading the questions for clarity, answerability, and clinical relevance; John Ely and Lee Chambliss for providing clinical questions from their observations; Mark Ebell, John Ely, Erik Lindbloom, Jerry Osheroff, Lee Chambliss, David Mehr, Robin Kruse, John Smucny, and many others for constructive criticism in the design of this study; and Steve Zweig for editorial review.
1. Ely JW, Osheroff JA, Ebell MH, et al. Analysis of questions asked by family doctors regarding patient care. BMJ 1999;319:358-61.
2. Covell DG, Uman GC, Manning PR. Information needs in office practice: are they being met? Ann Intern Med 1985;103:596-99.
3. Gorman PN, Ash J, Wykoff L. Can primary care physicians’ questions be answered using the medical journal literature? Bull Med Libr Assoc 1994;82:140-46.
4. Gorman PN, Helfand M. Information seeking in primary care: how physicians choose which clinical questions to pursue and which to leave unanswered. Med Decis Making 1995;15:113-19.
5. Chambliss ML, Conley J. Answering clinical questions. J Fam Pract 1996;43:140-44.
6. Physicians’ desk reference. 54th ed. Oradell, NJ: Medical Economics Company; 2000.
7. Pagano M, Gauvreau K. Inference on proportions. In: Principles of biostatistics. Belmont, Calif: Duxbury Press; 1993:297-98.
8. Sackett DL, Haynes RB, Guyatt GH, Tugwell P. The clinical examination. In: Clinical epidemiology: a basic science for clinical medicine. Boston, Mass: Little, Brown and Company; 1991:29-30.
9. Graber MA, Bergus GR, York C. Using the World Wide Web to answer clinical questions: how efficient are different methods of information retrieval? J Fam Pract 1999;48:520-24.
10. Dickinson WP, Stange KC, Ebell MH, Ewigman BG, Green LA. Involving all family physicians and family medicine faculty members in the use and generation of new knowledge. Fam Med 2000;32:480-90.
11. Connelly DP, Rich EC, Curley SP, Kelly JT. Knowledge resource preferences of family physicians. J Fam Pract 1990;30:353-59.