Author(s)
Frank Yu, PharmD
Takova D. Wallace-Gay, PharmD, BCACP
Reviewed By
C. Whitney White, Pharm.D., BCPS
Emily Prohaska, Pharm.D., BCACP, BCGP
Molino CGRC, Leite-Santos NC, Gabriel FC, et al. Factors Associated With High-Quality Guidelines for the Pharmacologic Management of Chronic Diseases in Primary Care: A Systematic Review. JAMA Intern Med 2019; 179: 553-60.
The Problem
Staying abreast of the rapid production of, and constant changes to, clinical practice guidelines (CPGs) while also determining their quality is difficult. The general consensus in the medical community is that CPGs reduce inappropriate care and improve treatment quality and patient safety. However, concerns have recently been raised about the reliability, quality, and validity of CPGs.1
What’s Known
CPGs have been around for many years, with the first formal document possibly authored by Hippocrates in ancient Greece. The Institute of Medicine (IOM) blazed a trail in 2011 with its report entitled “Clinical Practice Guidelines We Can Trust.”2 The IOM stated that CPGs should be “… informed by a systematic review of evidence and an assessment of the benefits and harms of alternative care options.” In 2014, the National Guideline Clearinghouse (NGC) applied the recommendations from the 2011 IOM report to the CPGs available on its website, resulting in a dramatic reduction in the number of guidelines listed (from 2,619 to 1,440 documents). This gave users a better sense of the reliability of the guideline documents found on the NGC. Unfortunately, in 2018, free access to the NGC was terminated due to a lack of external funding.3 This left users with two choices: 1) pay for access or 2) return to the days of reviewing and vetting guideline documents for themselves.
The 2011 IOM report recommends eight key attributes that should be present in CPGs: validity, reliability/reproducibility, clinical applicability, clinical flexibility, clarity, a multidisciplinary development process, scheduled review of new data, and meticulous documentation of guideline development procedures.2 Guideline developers draw on randomized controlled trials (RCTs), case studies, systematic reviews, expert opinion, and other sources of evidence when constructing CPG recommendations. One way to ensure the reliability and validity of CPGs is through a centralized method for assessing the data that these guideline documents translate into clinically applicable recommendations. Three existing tools are the Grading of Recommendations Assessment, Development, and Evaluation (GRADE), Consolidated Standards of Reporting Trials (CONSORT), and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). These tools support rigorous appraisal of the primary literature used to establish CPGs and should be considered when evaluating CPGs.4-7 In one study that evaluated 45 guideline documents using the GRADE tool, only 22 CPGs were relevant to primary care. These 22 documents contained over 1,500 recommendations, but only approximately 300 of them were pertinent to primary care.8 This example illustrates the need to vet guidelines not only for the quality of the data included but also for the clinical applicability of the proposed recommendations.
The Appraisal of Guidelines Research and Evaluation (AGREE) tool helps developers and reviewers of CPGs assess the quality of, and variability in, how guidelines were developed. The AGREE tool assesses six domains of guideline development: Scope and Purpose, Stakeholder Involvement, Rigor of Development, Clarity of Presentation, Applicability, and Editorial Independence. These six domains align with many of the criteria set forth by the IOM. In 2013, the AGREE tool was refined and the AGREE-II instrument was released, framed around the same six domains but with clearer instructions.9
While tools such as GRADE and AGREE-II are validated and helpful, they are not widely used. Thousands of CPGs are still freely accessible and may include low-quality and incomplete data.
What’s New
A recently published systematic review critically appraised the quality of published CPGs for the treatment of chronic diseases in primary care using the AGREE-II tool.10 To conduct the systematic review, the authors searched multiple databases (MEDLINE, Embase, and the Cochrane Library) as well as guideline-specific websites for CPGs containing pharmacotherapy recommendations used to treat common conditions seen in primary care settings. Common conditions included heart disease, lung disease, diabetes, osteoporosis, depression, osteoarthritis, dementia, gastroesophageal reflux disease, and benign prostatic hyperplasia. CPGs published in English, Portuguese, and Spanish between January 1, 2011 and August 30, 2017 were included. Letters, editorials, commentaries, clinical trials, and observational studies were excluded, as well as conditions involving communicable diseases. Three independent evaluators applied the AGREE-II criteria to assess the quality of the CPGs. A consensus was used to determine the final score in each of the six AGREE-II domains. The evaluators did not perform an overall assessment using AGREE-II, deeming it subjective. A final score of 60% or higher in the rigor of development domain was used to classify the CPG as high-quality. To determine factors associated with high-quality guidelines, a multiple logistic regression was performed.
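The 60% cutoff in the rigor of development domain can be illustrated with a short sketch. It assumes the scaled domain score calculation described in the AGREE-II user's manual, in which items are rated from 1 to 7 and the summed ratings are rescaled between the minimum and maximum possible totals. The function names and the ratings below are hypothetical, and the review itself reports consensus scores, so this is an illustration of the general scoring idea rather than the authors' exact procedure.

```python
from typing import Sequence

ITEM_MIN, ITEM_MAX = 1, 7  # AGREE-II items are rated on a 7-point scale


def scaled_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Return a scaled domain score (0-100).

    ratings[i][j] is appraiser i's rating of item j within a single domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every appraiser must rate the same number of items")
    if any(not ITEM_MIN <= r <= ITEM_MAX for row in ratings for r in row):
        raise ValueError("ratings must be between 1 and 7")

    obtained = sum(sum(row) for row in ratings)
    min_possible = ITEM_MIN * n_items * n_appraisers
    max_possible = ITEM_MAX * n_items * n_appraisers
    return 100 * (obtained - min_possible) / (max_possible - min_possible)


def is_high_quality(rigor_score: float, threshold: float = 60.0) -> bool:
    """Apply the review's cutoff: rigor of development score of 60% or higher."""
    return rigor_score >= threshold


# Hypothetical ratings: three appraisers, eight rigor-of-development items.
example_ratings = [
    [5, 6, 5, 4, 6, 5, 5, 4],
    [6, 6, 5, 5, 6, 5, 4, 5],
    [5, 5, 6, 4, 5, 6, 5, 5],
]
score = scaled_domain_score(example_ratings)
print(f"Rigor of development: {score:.2f}% -> high quality: {is_high_quality(score)}")
```

Because the score is rescaled, a guideline rated 1 on every item scores 0% and one rated 7 on every item scores 100%, regardless of how many appraisers took part.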
Of the 18,653 documents identified and reviewed, 421 were included in the final analysis and only 99 (23.5%) met the criteria for high quality. Performance in each of the six AGREE-II domains can be seen in Table 1. Median scores were less than 50% in all domains except scope and purpose (61%) and clarity of presentation (70%). The range of scores for each domain was broad, with some domains ranging from 0 to 100%.
Table 1. AGREE-II Domain Scores for Primary Care Clinical Practice Guidelines10
| Domain | Median (%) | Min. (%) | Max. (%) |
| --- | --- | --- | --- |
| Scope and Purpose | 61 | 22 | 100 |
| Stakeholder Involvement | 33 | 0 | 98 |
| Rigor of Development | 33 | 3 | 97 |
| Clarity of Presentation | 70 | 11 | 100 |
| Applicability | 22 | 1 | 96 |
| Editorial Independence | 42 | 0 | 100 |
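As a quick check on the summary of Table 1, the sketch below transcribes the table's values and lists the domains whose median score falls below 50%, along with the domain showing the widest spread. The variable names are illustrative only.

```python
# (median, minimum, maximum) percent scores transcribed from Table 1
domain_scores = {
    "Scope and Purpose": (61, 22, 100),
    "Stakeholder Involvement": (33, 0, 98),
    "Rigor of Development": (33, 3, 97),
    "Clarity of Presentation": (70, 11, 100),
    "Applicability": (22, 1, 96),
    "Editorial Independence": (42, 0, 100),
}

# Domains whose median score is under 50%
below_half = [name for name, (median, _, _) in domain_scores.items() if median < 50]

# Domain with the largest gap between its minimum and maximum score
widest = max(domain_scores, key=lambda name: domain_scores[name][2] - domain_scores[name][1])

print("Median below 50%:", ", ".join(below_half))
print("Widest range:", widest)
```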
Of the characteristics examined (see Table 2), a multivariable analysis found that high-quality CPGs were more likely to have more than 20 authors, come from governmental institutions, and report funding.
Table 2. Analysis of High-quality Clinical Practice Guideline Characteristics10
| Characteristic | Category | Multivariable Logistic Regression |
| --- | --- | --- |
| Year of Publication | | 1.13 (0.95-1.35) |
| Number of Authors | ≤5 | 1 [Reference] |
| | 6-10 | 4.30 (1.53-12.15)a |
| | 11-20 | 4.45 (1.64-12.07)a |
| | >20 | 9.08 (3.35-24.62)a |
| Type of Institution | University | 1 [Reference] |
| | Specialty society | 3.74 (1.00-13.91)a |
| | Governmental institution | 10.38 (2.72-39.60)a |
| Region | Other region | 1 [Reference] |
| | Canada or United States | 3.06 (0.53-17.83) |
| | Europe | 1.35 (0.23-8.03) |
| | Asia | 0.84 (0.13-5.55) |
| | Latin America | 2.86 (0.45-18.32) |
| | Transcontinental | 0.69 (0.08-5.86) |
| | Oceania | 0.91 (0.63-1.31) |
| Guideline Version | First version | 1 [Reference] |
| | Revision or updated version | 1.61 (0.86-3.01) |
| Reported Funding | No | 1 [Reference] |
| | Yes | 10.34 (4.77-22.39)a |
| Scope | Broad | 1 [Reference] |
| | Narrow | 0.96 (0.48-1.91) |

a P < 0.05
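One way to read Table 2 is to check whether each interval excludes 1, the value of the reference category. The sketch below assumes that the reported estimates are odds ratios with 95% confidence intervals, which the table does not state explicitly, and uses hypothetical names throughout. Because the table rounds to two decimals, the check can disagree with the footnote at the boundary: the specialty society interval is printed as 1.00-13.91 and is marked P < 0.05, yet its rounded lower bound does not exceed 1.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    label: str
    ratio: float    # assumed to be an odds ratio
    ci_low: float   # assumed lower bound of a 95% confidence interval
    ci_high: float  # assumed upper bound of a 95% confidence interval


# Non-reference rows transcribed from Table 2
TABLE_2 = [
    Estimate("Year of publication", 1.13, 0.95, 1.35),
    Estimate("6-10 authors", 4.30, 1.53, 12.15),
    Estimate("11-20 authors", 4.45, 1.64, 12.07),
    Estimate(">20 authors", 9.08, 3.35, 24.62),
    Estimate("Specialty society", 3.74, 1.00, 13.91),
    Estimate("Governmental institution", 10.38, 2.72, 39.60),
    Estimate("Canada or United States", 3.06, 0.53, 17.83),
    Estimate("Europe", 1.35, 0.23, 8.03),
    Estimate("Asia", 0.84, 0.13, 5.55),
    Estimate("Latin America", 2.86, 0.45, 18.32),
    Estimate("Transcontinental", 0.69, 0.08, 5.86),
    Estimate("Oceania", 0.91, 0.63, 1.31),
    Estimate("Revision or updated version", 1.61, 0.86, 3.01),
    Estimate("Reported funding", 10.34, 4.77, 22.39),
    Estimate("Narrow scope", 0.96, 0.48, 1.91),
]


def ci_excludes_one(e: Estimate) -> bool:
    """True when the whole interval lies strictly above or strictly below 1."""
    return e.ci_low > 1.0 or e.ci_high < 1.0


clearly_associated = [e.label for e in TABLE_2 if ci_excludes_one(e)]
for label in clearly_associated:
    print(label)
```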
Our Critical Appraisal
The strengths of this systematic review are the breadth of the CPGs included, the use of the validated AGREE-II tool, and the inclusion of guidelines in multiple languages and from multiple countries. These strengths all increase the external validity of the factors identified as associated with high-quality guidelines. Like previous studies, this study found that the majority of CPGs are of low quality. Also consistent with previous literature, it found that applicability and rigor of development strongly influence whether a guideline is high-quality.
This paper has a few limitations, including: 1) not using the recommended number of reviewers for the AGREE-II tool, 2) the lack of a holistic appraisal, 3) the subjective judgment inherent in AGREE-II, and 4) a lack of content evaluation (e.g., a critical appraisal of the underlying data used to make guideline recommendations). While the authors suggested a lack of diversity as a limitation, as only guidelines written in English, Spanish, or Portuguese were included, we do not feel that including other languages would have significantly changed the outcome. Given the widespread use of English in scientific and medical communications, many countries were represented in this analysis. When using the AGREE-II tool, the developing consortium recommends four guideline evaluators or appraisers to increase validity.9 This study used three evaluators; incorporating a fourth could have increased validity. Additionally, AGREE-II includes two overall assessments: each evaluator makes a judgment on the overall quality of the guideline and on whether the guideline should be used.9 While the study authors did not include these overall assessments, deeming them too subjective, AGREE-II was validated using these two elements. Including them could have given readers a more complete picture of guideline quality.
The absence of guideline content evaluation is a significant limitation and requires follow-up study. The authors identified the factors associated with high-quality guidelines in terms of the development process, but did not appraise the actual content used to generate the guideline recommendations. While a CPG should follow a high-quality process, it is the underlying content that ensures the recommendations are based on high-quality evidence. Future studies should address these questions: 1) Does a guideline deemed high-quality using AGREE-II contain high-quality recommendations using GRADE? 2) Do the randomized controlled trials used in a guideline adhere to the reporting standards in the CONSORT statement? 3) Do the systematic reviews and meta-analyses used in a guideline adhere to the reporting standards in the PRISMA statement?
The Bottom Line
Guidelines that have more than 20 authors, are developed by government institutions, and report funding appear more likely to be high-quality CPGs. However, assessing the quality of CPGs must also include an evaluation of the content using GRADE, PRISMA, CONSORT, or other tools that assist with appraising the quality of the evidence. Practitioners and institutions should evaluate the quality of guideline development and content before using guidelines to make formulary decisions, develop disease management protocols, or establish clinical performance metrics. Furthermore, we believe that vetting CPGs with these validated tools prior to their release will increase the proportion of high-quality CPGs, thus achieving the IOM vision in Clinical Practice Guidelines We Can Trust.2
The Key Points
- Fewer than one-fourth of CPGs related to the treatment of chronic diseases in primary care settings published between 2011 and 2017 were found to be of high quality.
- Factors associated with high-quality guidelines included: 1) having >20 authors, 2) being developed by government institutions, and 3) reporting funding.
- In addition to incorporating a rigorous development process (assessed using AGREE-II), CPGs should also ensure that recommendations are comprehensive and evidence-based.
- Clinicians and institutions should employ a rigorous CPG assessment process prior to adopting guideline recommendations into practice.
FINAL NOTE: This program will be available for recertification credit through the American Pharmacists Association (APhA) Ambulatory Care Review and Recertification Program. To learn more, visit https://www.pharmacist.com/ambulatory-care-review-and-recertification-activities.
1. Kung J, Miller RR, Mackowiak PA. Failure of clinical practice guidelines to meet Institute of Medicine standards: Two more decades of little, if any, progress. Arch Intern Med. 2012;172(21):1628-1633.
2. IOM (Institute of Medicine). Clinical Practice Guidelines We Can Trust. Washington, DC: The National Academies Press; 2011.
3. Shekelle PG. Clinical Practice Guidelines: What’s Next? JAMA. 2018;320(8):757-758.
4. Guyatt GH, Oxman AD, Vist GE, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ. 2008;336(7650):924-926.
5. Moher D, Jones A, Lepage L; CONSORT Group (Consolidated Standards for Reporting of Trials). Use of the CONSORT statement and quality of reports of randomized trials: a comparative before-and-after evaluation. JAMA. 2001;285(15):1992-1995.
6. Page MJ, Moher D. Evaluations of the uptake and impact of the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) Statement and extensions: a scoping review. Syst Rev. 2017;6(1):263.
7. Cruz JE, Fahim G, Moore K. Practice Guideline Development, Grading, and Assessment. P T. 2015;40(12):854-857.
8. Steel N, Abdelhamid A, Stokes T, et al. A review of clinical practice guidelines found that they were often based on evidence of uncertain relevance to primary care patients. J Clin Epidemiol. 2014;67(11):1251-1257.
9. Brouwers MC, Kho ME, Browman GP, et al. AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ. 2010;182(18):E839-842.
10. Molino CGRC, Leite-Santos NC, Gabriel FC, et al. Factors Associated With High-Quality Guidelines for the Pharmacologic Management of Chronic Diseases in Primary Care: A Systematic Review. JAMA Intern Med. 2019;179:553-560.