Evidence-based practice in surgery
Kathryn A. Rigby and Jonathan A. Michaels
Evidence-based medicine is the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients. The practice of evidence-based medicine means integrating individual clinical expertise with the best external clinical evidence from systematic research.1
The concept of evidence-based medicine (EBM) was introduced in the 19th century but has only flourished in the last few decades. Historically, its application to surgical practice can be traced back to the likes of John Hunter and the American Ernest Amory Codman, who both recognised the need for research into surgical outcomes in an attempt to improve patient care.
In mid-19th-century Paris, Pierre-Charles-Alexandre Louis used statistics to measure the effectiveness of bloodletting, the results of which helped put an end to the practice of leeching. Ernest A. Codman began work as a surgeon in 1895 in Massachusetts. His main area of interest was the shoulder and he became a leading expert in this topic as well as being instrumental in the founding of the American College of Surgeons. He developed his ‘End Result Idea’, the notion that every hospital should follow up every patient it treats ‘long enough to determine whether or not its treatment is successful and if not, why not?’ in order to prevent similar failures in the future.2 Codman also developed the first registry of bone sarcomas.
In the UK, one of the most important advocates of EBM was Archie Cochrane. His experiences in prisoner-of-war camps, where he conducted trials of yeast supplements to treat nutritional oedema, shaped his belief in reliable, scientifically proven medical treatment. In 1972 he published his book Effectiveness and Efficiency. Cochrane advocated the use of the randomised controlled trial (RCT) as the gold standard for research into all medical treatments and, where possible, systematic reviews of these trials. One of the first systematic reviews of RCTs examined the use of corticosteroid therapy to improve lung function in threatened premature birth. Although RCTs had been conducted in this area, no clear message emerged from the individual studies until the review showed overwhelmingly that corticosteroids reduced both neonatal morbidity and mortality. Had a systematic review been conducted earlier, the lives of many babies could have been saved, as the review clearly showed that this inexpensive treatment reduced the chance of these babies dying from complications of immaturity by 30–50%.3 In 1992, as part of the UK National Health Service (NHS) Research and Development (R&D) Programme, the Cochrane Collaboration was founded.
Subsequently, in 1995, the first centre for EBM in the UK was established at the Nuffield Department of Clinical Medicine, University of Oxford. The driving force behind this was the American David Sackett, who had moved to a new Chair in Clinical Epidemiology in 1994 from McMaster University in Canada, where he had pioneered self-directed teaching for medical students.
From these roots, interest in EBM has exploded. The Cochrane Collaboration is rapidly expanding, with review groups in many fields of medicine and surgery. EBM is not limited only to hospital-based medicine but is increasingly seen in nursing, general practice and dentistry, and there are many new evidence-based journals appearing.
While clinical experience is invaluable, the rapidly changing world of medicine means that clinicians must keep abreast of new advances and, where appropriate, integrate research findings into everyday clinical practice. Neither research nor clinical experience alone is enough to ensure high-quality patient care; the two must complement each other. Sackett et al. identified five steps that should become part of day-to-day practice and in which a competent practitioner should be proficient:4
This chapter discusses the steps that are necessary to identify, critically appraise and combine evidence, to incorporate the findings into clinical guidance, and to implement and audit any necessary changes in order to move towards EBM in surgery. Many of the organisations and information sources that are relevant to EBM are specific to a particular setting. Therefore, the emphasis in this chapter is on the health services within the UK, although there are comparable arrangements and bodies in many other countries. Links to a number of these are given in the Internet resources described at the end of the chapter.
The need for evidence-based medicine
In 1991, there was still a widely held belief that only a small proportion of medical interventions were supported by solid scientific evidence.5 Jonathan Ellis and colleagues, on behalf of the Nuffield Department of Clinical Medicine, conducted a review of treatments given to 109 patients on a medical ward.6 The treatments were then examined to assess the degree of evidence supporting their use. The authors concluded that 82% of these treatments were in fact evidence based. However, they did suggest that similar studies should be conducted in other specialities. The importance of evidence-based health care in the NHS was formally acknowledged in two government papers, The new NHS7 and A first class service.8 These led to the development of the National Service Frameworks and the National Institute for Clinical Excellence (NICE).
In surgery there is a limited body of evidence from high-quality RCTs. For an RCT to be ethical there needs to be clinical equipoise: a sufficient level of uncertainty about an intervention before a trial can be considered. For example, an RCT of burr holes for extradural haematomas could never be justified: the observational evidence of effectiveness is so overwhelming that denying a patient a burr hole in order to prove the point would be unethical.
Many surgeons feel unhappy with having to explain to a patient that there is clinical uncertainty about a treatment, as patients have historically put their trust in surgeons' hands. This reluctance to perform RCTs and the belief that they would be difficult to carry out has led to practices that are poorly supported by high-quality evidence. For example, there is widespread use of radical prostatectomy to treat localised prostatic carcinoma in the USA, despite a distinct lack of evidence to support this procedure.9
New technologies in surgery may be driven into widespread use by market forces, patients' expectations and clinicians' desire to improve treatment options. For example, with laparoscopic surgery, many assumed that it must be ‘better’ because it made smaller holes, there was less pain involved and therefore patients left hospital sooner. It was only after many hospitals had instituted its use that concerns were raised about its real benefits and the adequacy of training in the new technology. In 1996, a group of surgeons from Sheffield published a randomised, prospective, single-blind study that compared small-incision open cholecystectomy with laparoscopic cholecystectomy.10 They demonstrated that in their hands the laparoscopic technique offered no real benefit over a mini-cholecystectomy in terms of the postoperative recovery period, hospital stay and time off work, but it took longer to perform and was more expensive.10 There were, however, other factors that may have influenced the results from this study, including surgeon experience, and mini-cholecystectomy has not been widely adopted.
The MRC Laparoscopic Groin Hernia Trial Group undertook a large multicentre randomised comparison between laparoscopic and open repair of groin hernias.11 The results demonstrated that the laparoscopic procedure was associated with an earlier return to activities and less groin pain 1 year after surgery, but also with more serious surgical complications, an increased recurrence rate and a higher cost to the health service. The group suggested that laparoscopic hernia surgery should be confined to specialist surgical centres. NICE has since published guidance recommending laparoscopic surgery as one of the treatment options for the repair of inguinal hernias.
Some would argue that surgery, unlike drug therapy, is operator dependent, that operating experience and skill can affect the outcome of an RCT, and cite this as a reason for not undertaking surgical trials. Although operator factors can introduce bias into a trial, the North American Symptomatic Carotid Endarterectomy Trial has shown that such problems can largely be overcome through appropriate trial design.12 Only surgeons who had been fully trained in the procedure, and who already had a proven low complication rate, were accepted as participants in the trial.
These examples illustrate a clear need for high-quality research to be undertaken into any new technology to assess both its efficacy and its cost-effectiveness before it is introduced into the healthcare system.
However, concerns have been raised about EBM. Sceptics have suggested that it may undermine clinical experience and instinct and replace it with ‘cookbook medicine’ or that it may ignore the elements of basic medical training such as history-taking, physical examination, laboratory investigations and a sound grounding in pathophysiology. Another fear is that purchasers and managers will use it as a means to cut costs and manage budgets.
Nevertheless, EBM can formalise our everyday procedures and highlight problems. It can provide answers by ensuring that the best use is made of existing evidence or it can identify areas in which new research is needed. Although it has a role in assessing the cost-effectiveness of an intervention, it is not a substitute for rationing and often results in practice that, despite being more cost-effective, has greater overall cost.13
The process of evidence-based medicine
EBM requires a structured approach to ensure that clinical interventions are based upon best available evidence. The first stage is always to pose a clinically relevant question for which an answer is required. Such a question should be clear, specific, important and answerable. One way of formulating questions is to think of them as having four key elements (PICO):
Therefore, the question ‘What is the best treatment for cholecystitis?’ needs to be much more clearly formulated if an adequate, evidence-based approach is to be used. A much better question would be ‘For adult patients admitted to hospital with acute cholecystitis (the population), does early open cholecystectomy, laparoscopic cholecystectomy (the interventions) or best medical management (the comparison) produce the lowest mortality, morbidity and total length of stay in hospital (the outcomes)?’ Even this may require more refinement to define further the exact interventions and outcomes of interest.
Once such a question has been clearly defined, a number of further stages of the process can follow:
Sources of evidence
Once a question has been formulated, the next step in undertaking EBM is the identification of all the relevant evidence. The first line for most practitioners is the use of journals. Many clinicians will subscribe to specific journals in their own specialist area and have access to many others through local libraries. However, the vast increase in the number of such publications makes it impossible for an individual to access or read all the relevant papers, even in a highly specialist area.
There has been a huge expansion in the resources that are available for identifying relevant material from other publications, including indexing and abstracting services such as MEDLINE (computerised database compiled by the US National Library of Medicine) and EMBASE. There is also a rapidly expanding set of journals and other services that provide access to selected, appraised and combined results from primary information sources.
As a result, the information sources that provide the evidence to support EBM are vast and include the following:
Some of these are described in more detail below and the Appendix to this chapter provides a list of contact details for further information.
The following are a selection of journals that act as secondary sources, identifying and reviewing other research that is felt to be of key importance to evidence-based practice.
Evidence-based Medicine
This journal was first launched in October 1995 by the British Medical Journal (BMJ) Publishing Group. It systematically searches high-quality international journals and provides summaries of the most clinically relevant research articles. The validity of the research is critically appraised by experts and assessed for its clinical applicability. This consequently allows the reader to keep up with the latest advances in clinical practice. It also publishes articles relating to the study and practice of EBM.
Evidence-based Nursing
This follows similar lines to Evidence-based Medicine, but contains articles more relevant to the nursing field.
Evidence-based Mental Health
This is produced by the BMJ Publishing Group in collaboration with the Royal College of Psychiatrists and the British Psychological Society.
The Internet is becoming an increasingly useful source of medical information and evidence. Details of Internet addresses for many of the sources referred to below are given in the Appendix to this chapter, although this is a rapidly progressing and changing area. There are many journals and databases that are available either free or through subscription, and dedicated search engines such as Google Scholar. This medium also provides a number of advantages over printed material, including ease of searching, hyperlinks to other sources, access to additional supporting materials or raw data and the provision of discussion groups. There are, however, potential problems with the Internet in that there is no quality control and much of the available material is of dubious quality, or published by those with particular commercial or other interests.
NHS Evidence
This is a new service that provides online access to evidence-based information. It is managed by NICE and is free to use. It provides access to NICE pathways, journals and databases, ebooks and the Cochrane library.
BMJ Evidence Centre
This provides information, resources and tools that aid evidence-based practice. It has access to sites that target EBM in relation to patient care, research and patient information, and has updates on current evidence and treatment options.
The Cochrane Collaboration
As described above, Archie Cochrane, the British epidemiologist who inspired this collaboration, realised that in order to make informed decisions about healthcare, reliable evidence must be accessible and kept up to date as new evidence emerges; failure to achieve this risks important developments in healthcare being overlooked. This was to be a key aspect in providing the best possible healthcare for patients. It was also hoped that by making the results of interventions clear, duplication of research effort would be avoided.
The Cochrane library is the electronic publication of the Cochrane Collaboration and it includes six databases:
The Reviewers' Handbook includes information on the science of reviewing research and details of the review groups. It is also available in hard copy.14
The Cochrane library is regularly updated and amended as new evidence is acquired. It is distributed on disk, CD-ROM and the Internet.3 In order to allow the results of the reviews to be widely used, no one contributor has exclusive copyright of the review.
Centre for Evidence-based Medicine
The Centre for Evidence-based Medicine was established in Oxford. Its chief remit is to promote EBM and evidence-based practice (EBP) in healthcare. It runs workshops and courses in both the practice and teaching of EBM, conducts research and development on improving EBP, and provides many free EBM resources and tools through its website.
Review Body for Interventional Procedures (ReBIP)
This is a joint venture between the Health Services Research Units at the Universities of Sheffield and Aberdeen. It works under the auspices of NICE's Interventional Procedures Programme (IPP). When there is doubt about the safety and efficacy of a procedure, ReBIP may be commissioned to provide a systematic review or to gather additional data.
Centre for Reviews and Dissemination (CRD)
The CRD was established in January 1994 at the University of York and is now also part of the National Institute for Health Research (NIHR). It is funded by the NIHR (England), the Department of Health, the Public Health Agency (Northern Ireland) and the National Institute for Social Care and Health Research (Welsh Assembly Government). The CRD concentrates specifically on areas of priority to the NHS. It is designed to raise the standards of reviews within the NHS and to encourage research by working with healthcare professionals. It undertakes and disseminates systematic reviews and maintains three databases:
NIHR Health Technology Assessment Programme
The HTA programme is now part of the National Institute for Health Research (NIHR). It commissions independent research into high-priority areas, including many systematic reviews and primary research in key areas. The programme publishes details of ongoing HTA projects and monographs of completed research.
Critical appraisal
This is the process by which we assess the evidence presented to us in a paper, judging it in terms of both its validity and its clinical applicability.
From reading the literature, it is evident that there may be many trials on the same subject, which may all draw different conclusions. Which one should be believed and allowed to influence clinical practice? We owe a duty to our patients to be able to assess accurately all the available information and judge each paper on its own merits before changing our clinical practice accordingly.
Randomised controlled trials
The RCT is a comparative evaluation in which the interventions being compared are allocated to the units being studied purely by chance. It is the ‘gold standard’ method of comparing the effectiveness of different interventions.15 Randomisation is the only way to allow valid inferences of cause and effect,16 and no other study design offers comparable protection against bias.
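The mechanics of allocation ‘purely by chance’ can be sketched in a few lines of code. The following is a minimal illustration (not a production randomisation system) of permuted-block randomisation, a widely used scheme that keeps the arms balanced as recruitment proceeds; the arm names are hypothetical:

```python
import random

def block_randomise(n_patients, block_size=4, arms=("laparoscopic", "open")):
    """Allocate patients to trial arms in random permuted blocks so that
    group sizes stay balanced throughout recruitment."""
    per_arm = block_size // len(arms)
    allocations = []
    while len(allocations) < n_patients:
        block = [arm for arm in arms for _ in range(per_arm)]
        random.shuffle(block)  # chance alone decides the order within each block
        allocations.extend(block)
    return allocations[:n_patients]

schedule = block_randomise(10)
```

In a real trial the allocation sequence would be generated in advance and concealed from recruiting clinicians, since foreknowledge of the next allocation is itself a source of selection bias.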
Unfortunately, not all clinical trials are done well, and even fewer are well reported. Their results may therefore be confusing and misleading, and it is necessary to consider several elements of a trial's design, conduct and conclusions before accepting the results. The first requirement is that there must be sufficient detail available to make such an assessment.
It became clear that the reporting of clinical trials needed to be standardised, and the CONSORT (Consolidated Standards of Reporting Trials) statement was developed. The most recent version is CONSORT 2010 (Table 1.1).17,18
The CONSORT 2010 statement lists 25 items that should be included in any trial report, along with a flow chart.17,18
CONSORT 2010 checklist of information to include when reporting a randomised trial*
*We strongly recommend reading this statement in conjunction with the CONSORT 2010 Explanation and Elaboration for important clarifications on all the items. If relevant, we also recommend reading CONSORT extensions for cluster randomised trials, non-inferiority and equivalence trials, non-pharmacological treatments, herbal interventions, and pragmatic trials. Additional extensions are forthcoming: for those and for up-to-date references relevant to this checklist, see www.consort-statement.org.
Reproduced from Schulz KF, Altman DG, Moher D et al. Br Med J 2010; 340:c332 and Moher D, Hopewell S, Schulz KF et al. Br Med J 2010;340:c869. With permission from the BMJ Publishing Group Ltd.
Many journals now encourage authors to submit a copy of the CONSORT statement relating to their paper. A similar checklist, the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement, has been proposed for the reporting of observational studies (cohort, case–control and cross-sectional).19 The QUOROM statement is a comparable checklist developed to improve the quality of reporting of systematic reviews of RCTs.20
The STROBE statement provides a recommended checklist for the reporting of observational studies19 and the QUOROM statement provides similar recommendations for systematic reviews of RCTs.20
The Critical Appraisal Skills Programme (CASP) is a UK-based project designed to develop appraisal skills about effectiveness. It provides half-day workshops and has developed appraisal frameworks based on 10 or 11 questions for RCTs, qualitative research and systematic reviews.
Assuming that the relevant information is available, critical appraisal is required to ensure that the methodology of the trial is such that it will minimise effects on outcome other than true treatment effects, i.e. those owing to chance, bias and confounding:
All good study designs will reduce the effects of chance, eliminate bias and take confounding into account. This requires consideration of many aspects of trial design, including methods of randomisation, blinding and masking, analysis methods and sample size. It also requires the reviewer to consider aspects such as sponsorship and vested interests that may introduce sources of bias. Discussion of methodology for the critical appraisal of RCTs and other forms of study is readily available elsewhere.21
Systematic literature reviews
A systematic review is an overview of primary studies carried out to an exhaustive, defined and repeatable protocol.
There has been an explosion in the published medical literature, with over two million articles a year published in 20 000 journals. The task of keeping up with new advances in medical research has become quite overwhelming. We have also seen that the results of trials in the same subject may be contradictory, and that the underlying message can be masked. Systematic reviews are designed to search out meticulously all relevant studies on a subject, evaluate the quality of each study and assimilate the information to produce a balanced and unbiased conclusion.22
One advantage of a systematic review with a meta-analysis over a traditional subjective narrative review is that by synthesising the results of many smaller studies, the lack of statistical power of each individual study may be overcome by cumulative size, so that any treatment effect is more clearly demonstrated. This, in turn, can reduce the delay between research advances and clinical implementation. For example, it has been demonstrated that if the original studies on the use of anticoagulants after myocardial infarction had been reviewed systematically, their benefits would have been apparent much earlier.23,24 It is obviously essential that any benefit or harm caused by an intervention becomes apparent as soon as possible.
Unfortunately, as with reported trials, not all reviews are as rigorously researched and synthesised as one would hope, and they are open to pitfalls similar to those of RCTs. The Cochrane Collaboration has sought to rectify this and has worked on refining the methods used for systematic reviews. It has consequently produced some of the most reliable and useful reviews, and its methods have been widely adopted by other reviewers. The Cochrane Collaboration advises that each review must be based on an explicit protocol, which sets out the objectives and methods so that a second party could reproduce the review at a later date if required.
Because of the increasing importance of systematic reviews as a method of providing the evidence base for a variety of clinical activities, the methods are discussed in some detail below. There are several key elements in producing a systematic review.
1. Develop a protocol for a clearly defined question
Within a protocol:
In the Cochrane Collaboration, each systematic review is preceded by a published protocol that is subjected to a process of peer review. This helps to ensure high quality, avoids duplication of effort and is designed to reduce bias by setting standards for inclusion criteria before the results from identified studies have been assessed.
2. Literature search
All published and unpublished material should be sought. This includes examining studies in non-English-language journals, grey literature, conference reports, company reports (drug companies can hold a great deal of vital information from their own research) and personal contacts who may know of unpublished studies or data. The details of the search methodology and search terms used should be specified in order to make the review reproducible and to allow readers to repeat the search to identify further relevant information published after the review. The most frequently used initial source of information is MEDLINE, but this has limitations. It indexes only about one-third of all medical articles that exist in libraries (over 10 million in total),25 and an average search by a regular user would yield only about one-fifth of the trials that can be identified by more rigorous literature-searching techniques.26 It also has a bias towards articles published in English. Other electronic and indexed databases should also be searched, but often the only way to ensure that the maximum number of relevant trials is found, wherever published and in whatever language, is to hand search the journals. This is one of the tasks of the Cochrane Collaboration, supported by a database maintained at the Baltimore Cochrane Centre.
One must also be aware, however, that there is a potential for ‘publication bias’. Trials that are more likely to get published are those with a positive result rather than a negative or no-effect result,27 and are also more likely to be cited in other articles.28
3. Evaluating the studies
Each trial should be assessed to see if it meets the inclusion criteria set out in the protocol (eligibility). If it meets the required standards, then the trial is subjected to a critical appraisal, ideally by two independent reviewers, to ascertain its validity, relevance and reliability. Any exclusions should be reported and justified; if there is missing information from the published article, it may be necessary to attempt to contact the author of the primary research. Reviewers should also, if possible, be ‘blinded’ to the authors and journals of publication, etc. in order to minimise any personal bias.
The Cochrane reviewers are assisted in all these tasks by instructions in the Cochrane Handbook14 and through workshops at the Cochrane Centres.29
4. Synthesis of the results
Once the studies have been graded according to quality and relevance, their results may be combined in an interpretative or a statistical fashion. It must be decided if it is appropriate to combine some studies and which comparisons to make. Subgroup or sensitivity analyses may also be appropriate. The statistical analysis is called a meta-analysis and is discussed below.
The review should be summarised. The aims, methods and reported results should be discussed and the following issues considered:
As with any study, a review can be done badly, and the reader must critically appraise a review to assess its quality. Systematic errors may be introduced by omitting some relevant studies, by selection bias (such as excluding foreign language journals) or by including inappropriate studies (such as those considering different patient groups or irrelevant outcomes). Despite all precautions, the findings of a systematic review may differ from those of a large-scale, high-quality RCT. This will be discussed below in relation to meta-analysis.
Meta-analysis
A meta-analysis is a specific statistical strategy for assembling the results of several studies into a single estimate, which may be incorporated into a systematic literature review.30
Here we must make the distinction that the term ‘meta-analysis’ refers to the statistical techniques used to combine the results of several studies; it is not, as sometimes assumed, synonymous with ‘systematic review’.
A common problem in clinical trials is that the results are not clear-cut, either because of size or because of the design of the trial. The systematic review is designed to eliminate some of these problems and give appropriate weightings to the best- and worst-quality studies, regardless of size. Meta-analysis is the statistical tool used to combine the results and give ‘power’ to the estimates of effect.
Meta-analyses use a variety of statistical techniques according to the type of data being analysed (dichotomous, continuous or individual patient data).14 There are two main models used to analyse the results: the fixed-effect model (logistic regression, Mantel–Haenszel test and Peto's method) and the random-effect model. The major concern with fixed-effect methods is that they assume no clinical heterogeneity between the individual trials, and this may be unrealistic.31 The random-effect method takes into consideration random variation and clinical heterogeneity between trials. In presenting a meta-analysis, a consistent scale should be chosen for measuring treatment effects: differences in proportions, risk ratios or odds ratios may all be used.
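The distinction between the two models can be made concrete with a small sketch. The trial figures below are invented for illustration; fixed-effect pooling uses simple inverse-variance weights on log odds ratios, while the random-effects weights follow the DerSimonian–Laird approach of adding an estimated between-trial variance (tau²) to each trial's own variance:

```python
import math

# Invented trial results: (events_treated, n_treated, events_control, n_control)
trials = [(12, 100, 20, 100), (8, 80, 15, 80), (30, 250, 25, 250)]

def log_odds_ratio(a, n1, c, n2):
    """Log odds ratio and its variance for one 2x2 table."""
    b, d = n1 - a, n2 - c
    return math.log((a * d) / (b * c)), 1/a + 1/b + 1/c + 1/d

def pooled_log_or(trials, random_effects=False):
    """Inverse-variance pooled log odds ratio. The fixed-effect model
    assumes one true underlying effect; the random-effects
    (DerSimonian-Laird) model adds the between-trial variance tau^2
    to each trial's variance before weighting."""
    stats = [log_odds_ratio(*t) for t in trials]
    w = [1/v for _, v in stats]
    fixed = sum(wi * y for wi, (y, _) in zip(w, stats)) / sum(w)
    if not random_effects:
        return fixed
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (y - fixed)**2 for wi, (y, _) in zip(w, stats))
    tau2 = max(0.0, (q - (len(trials) - 1)) /
               (sum(w) - sum(wi * wi for wi in w) / sum(w)))
    w_re = [1/(v + tau2) for _, v in stats]
    return sum(wi * y for wi, (y, _) in zip(w_re, stats)) / sum(w_re)

fixed_or = math.exp(pooled_log_or(trials))
random_or = math.exp(pooled_log_or(trials, random_effects=True))
```

With these figures the random-effects estimate sits closer to the unweighted average of the trials, because adding tau² to every variance makes the weights more equal; when the trials are homogeneous (tau² estimated as zero) the two models coincide.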
Trials can have many different components21 and therefore a meta-analysis is only valid if the trials that it seeks to summarise are homogeneous: you cannot add apples and oranges.32 If trials are not comparable and any heterogeneity is ignored, the analysis can produce misleading results.
Figure 1.1 shows an example of this from a meta-analysis of 19 RCTs investigating the use of endoscopic sclerotherapy to reduce mortality from oesophageal varices in the primary treatment of cirrhotic patients.33 Each trial is represented by a ‘point estimate’ of the difference between the groups, and a horizontal line showing the 95% confidence interval (CI). If the CI does not cross the line of no effect, the difference between the groups is statistically significant at the 5% level. It can be seen that in this case the trials are not homogeneous, as the lower limits of some CIs lie above the upper limits of the CIs of other trials. Such a lack of homogeneity may have a variety of causes, relating to clinical heterogeneity (differences in patient mix, setting, etc.) or differences in methods. The degree of statistical heterogeneity can be measured to see whether it is greater than is compatible with the play of chance.34 Such statistical tests may lack power; consequently, results that do not show significant heterogeneity do not necessarily mean that the trials are truly homogeneous, and one must look beyond the test result to assess the degree of heterogeneity.
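The measurement referred to here is usually Cochran's Q statistic; the derived I² index (the percentage of between-trial variation attributable to heterogeneity rather than chance) is now also widely reported. A minimal sketch, using invented effect estimates and variances of the kind read off a forest plot:

```python
# Invented (log odds ratio, variance) pairs for four trials
effects = [(-0.9, 0.10), (-0.1, 0.08), (0.6, 0.12), (-1.2, 0.15)]

def heterogeneity(effects):
    """Cochran's Q statistic and the I^2 index for a set of
    (effect, variance) pairs; Q is referred to a chi-squared
    distribution with len(effects) - 1 degrees of freedom."""
    w = [1/v for _, v in effects]
    pooled = sum(wi * y for wi, (y, _) in zip(w, effects)) / sum(w)
    q = sum(wi * (y - pooled)**2 for wi, (y, _) in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

q, i2 = heterogeneity(effects)
```

Here Q is roughly 16 on 3 degrees of freedom and I² is over 80%, so most of the spread between these (invented) trials reflects genuine differences rather than sampling error; combining them into a single fixed-effect estimate would be hard to defend.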
‘Meta-analysis is on the strongest ground when the methods employed in the primary studies are sufficiently similar that any differences in their results are due to the play of chance.’30
FIGURE 1.1 An example of a meta-analysis of 19 randomised controlled trials investigating the use of endoscopic sclerotherapy to reduce mortality from oesophageal varices in the primary treatment of cirrhotic patients. Reproduced from Chalmers I, Altman DG. Systematic reviews. London: BMJ Publishing, 1995; p. 119. With permission from Blackwell Publishing Ltd.
Views on the usefulness of meta-analyses are divided. On the one hand, they may provide conclusions that could not be reached from individual trials because of the small numbers involved. On the other hand, they have limitations and cannot produce a single simple answer to every complex clinical problem. They may give misleading results if applied to a biased body of literature or in the presence of clinical or methodological heterogeneity. Used with caution, however, they are a useful tool for providing information to help in decision-making.
Figure 1.2 shows a funnel plot of a meta-analysis relating to the use of magnesium following myocardial infarction.35 The result of each study in the analysis is represented by a circle plotting the odds ratio (with the vertical line being at 1, the ‘line of no effect’) against the trial size. The diamond represents the overall result of the meta-analysis, pooling the data from all the smaller studies shown. This meta-analysis36 was published in 1993 and concluded that it was beneficial and safe to give intravenous magnesium to patients with acute myocardial infarction. The majority of the studies involved showed a positive effect of the treatment, as did the meta-analysis. However, these results were contradicted in 1995 by ISIS-4, a very large RCT involving 58 050 patients.37 It had three arms, in one of which intravenous magnesium was given to patients with suspected acute myocardial infarction. The results are marked on the funnel plot and show no clear benefit for this treatment, contrary to the results of the earlier meta-analysis.
FIGURE 1.2 A funnel plot of a meta-analysis relating to the use of magnesium following myocardial infarction. Points indicate values from small and medium-sized trials; the diamond is the combined odds ratio with 95% confidence interval from the meta-analysis of these trials and the square is that for a mega trial. Reproduced from Egger M, Smith GD. Misleading meta-analysis. Br Med J 1995; 310:752–4. With permission from the BMJ Publishing Group Ltd.
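The pooled ‘diamond’ at the foot of a forest or funnel plot is commonly obtained by inverse-variance weighting of the individual trial results. The sketch below shows this fixed-effect calculation for a pooled odds ratio with its 95% CI; the trial counts are invented for illustration only (with zero cells a continuity correction would be needed).

```python
import math

# Invented example trials for illustration: (events_trt, n_trt, events_ctl, n_ctl)
trials = [(3, 25, 6, 25), (7, 50, 12, 50), (15, 120, 22, 118)]

def pooled_odds_ratio(trials):
    """Fixed-effect (inverse-variance) pooled odds ratio with a 95% CI --
    the 'diamond' summarising all trials in a forest or funnel plot."""
    num = den = 0.0
    for a, n1, c, n2 in trials:
        b, d = n1 - a, n2 - c
        lor = math.log((a * d) / (b * c))          # log odds ratio for this trial
        w = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)  # weight = 1 / variance of log OR
        num += w * lor
        den += w
    pooled = num / den                             # weighted mean log odds ratio
    se = math.sqrt(1.0 / den)                      # standard error of the pooled estimate
    lo, hi = pooled - 1.96 * se, pooled + 1.96 * se
    return math.exp(pooled), math.exp(lo), math.exp(hi)

or_, lo, hi = pooled_odds_ratio(trials)
print(f"pooled OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Because large trials have small variances, they dominate the pooled estimate; this is also why a single mega-trial such as ISIS-4 can overturn a meta-analysis of many small studies, particularly where publication bias has inflated the apparent effect in the smaller trials.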
Some would say that this is one of the major problems with using statistical synthesis. An alternative viewpoint is that it is an example of the importance of ensuring that the material fed into a meta-analysis from a systematic review is researched and critically appraised to the highest possible standard. Explanations for the contradictory findings in this review have been given as:32,35
Clinical guidelines are systematically developed statements to assist practitioner and patient decisions about appropriate healthcare for specific clinical circumstances.38
EBM is increasingly advocated in healthcare, and evidence-based guidelines are being developed in many areas of primary healthcare such as asthma,39 stable angina40 and vascular disease.41 Over 2000 guidelines or protocols have been developed from audit programmes in the UK alone. An observational study in general practice has also shown that recommendations that are evidence based are more widely adopted than those that are not.42 The UK Department of Health has also endorsed the policy of using evidence-based guidelines.43
Guidelines may have a number of different purposes:
For clinical policies to be evidence based and clinically useful, there must be a balance between the strengths and limitations of relevant research and the practical realities of the healthcare and clinical settings.44
There are, however, commonly expressed concerns about the use of guidelines:
The effectiveness of a guideline depends on three areas, as identified by Grimshaw and Russell:45
In the UK, there are a number of bodies that produce guidelines and summaries of evidence-based advice.
The National Institute For Clinical Excellence (NICE)
NICE is a special health authority formed on 1 April 1999 by the UK government. The board comprises executive and non-executive members. It is designed to work with the NHS in appraising healthcare interventions and offering guidance on the best treatment methods for patients. It assesses all the evidence on the clinical benefit of an intervention, including quality of life, mortality and cost-effectiveness. It then decides, on the basis of this information, whether the intervention should be recommended to the NHS.
It produces guidance in three main areas:
Its role was further expanded in 2010, following the NHS White Paper Equity and Excellence: Liberating the NHS, when it was tasked with developing 150 quality standards in key areas in order to improve patient outcomes. It is now linked with NHS Evidence and manages the online search engine that allows easy access to an extensive evidence base and examples of best practice.
Scottish Intercollegiate Guidelines Network (SIGN)
SIGN was formed in 1993. Its objective is to improve the effectiveness and efficiency of clinical care for patients in Scotland by developing, publishing and disseminating guidelines that identify and promote good clinical practice. SIGN is a network of clinicians from all the medical specialities, nurses, other professionals allied to medicine, managers, social services and researchers. Patients and carers are also represented on the council. Since 2005, SIGN has been part of NHS Quality Improvement Scotland.
Effective Practice And Organisation Of Care (EPOC)
EPOC is a subgroup of the Cochrane Collaboration that reviews and summarises research about the use of guidelines.
Guidelines also need to be critically appraised and a framework has been developed for this46 that uses 37 questions to appraise three different areas of a clinical guideline:
Integrated care pathways (ICPs)
ICPs are known by a number of names, including integrated care plans, collaborative care plans, critical care pathways and clinical algorithms. ICPs are a development of clinical practice guidelines and have emerged over recent years as a strategy for delivering consistent high-quality care for a range of diagnostic groups or procedures. They are usually multidisciplinary, patient-focused pathways of care that provide a framework for the management of a clinical condition or procedure and are based upon best available evidence.
The advantage of ICPs over most conventional guidelines is that they provide a complete package of protocols relating to the likely events for all healthcare personnel involved with the patient during a single episode of care. By covering each possible contingency with advice based upon best evidence, they provide a means of both identifying and implementing optimum practice.
Grading the evidence
There is a traditional hierarchy of evidence, which lists the primary studies in order of perceived scientific merit. This allows one to give an appropriate weight to each type of study and is useful when weighing up the evidence in order to make a clinical decision. One version of the hierarchy is given in Box 1.1.21 It must be remembered, however, that this is only a rough guide and that one needs to assess each study on its own merits. Although a meta-analysis comes above an RCT in the hierarchy, a good-quality RCT is far better than a poorly performed meta-analysis. Similarly, a seriously flawed RCT may not merit the same degree of importance as a well-designed cohort study. Checklists have been published that may assist in assessing the methodological quality of each type of study.21
Hierarchy of evidence
From Greenhalgh T. How to read a paper: the basics of evidence based medicine. London: BMJ Publications, 1997; Vol. xvii, p. 196. With permission from the BMJ Publishing Group Ltd.
Similar checklists are available for systematic reviews.21,47,48 As already discussed, the preparation of a systematic review is a complex process involving a number of steps, each of which is open to bias and inaccuracies that can distort the results. Such lists can be used as a guide when preparing a review as well as in assessing one. One checklist used to assess the validity of a review does so by identifying potential sources of bias in each step (Table 1.2).49
Checklist for assessing sources of bias and methods of protecting against bias
Is the question clearly focused?
Is the search for relevant studies thorough?
Are the inclusion criteria appropriate?
Appraisal of the studies
Is the validity of the studies included adequately assessed?
Is missing information obtained from investigators?
How sensitive are the results to changes in the way the review is done?
Interpretation of results
Do the conclusions flow from the evidence that is reviewed? Are recommendations linked to the strength of the evidence?
Are judgments about preferences (values) explicit?
If there is ‘no evidence of effect’, is care taken not to interpret this as ‘evidence of no effect’?
Are subgroup analyses interpreted cautiously?
From Oxman A. Checklists for review articles. Br Med J 1994; 309:648–51. With permission from the BMJ Publishing Group Ltd.
It is hoped that the results of a systematic review will be precise, valid and statistically powerful in order to provide the highest quality information on which to base clinical decisions or to produce clinical guidelines. The strength of the evidence provided by a study also needs to be assessed before making any clinical recommendations. A grading system is required to specify the levels of evidence, and several have previously been reported (e.g. those of the Antithrombotic Therapy Consensus Conference50 or that shown in Table 1.3).
Agency for Health Care Policy and Research grading system for evidence and recommendations
From Hadorn DC, Baker D, Hodges JS et al. Rating the quality of evidence for clinical practice guidelines. J Clin Epidemiol 1996 49:749–54. With permission from Elsevier.
The grading of evidence and recommendations within textbooks, clinical guidelines or ICPs should allow users easily to identify those elements of evidence that may be subject to interpretation or modification in the light of new published data or local information. It should identify those aspects of recommendations that are less securely based upon evidence and therefore may appropriately be modified in the light of patient preferences or local circumstances. This raises different issues to the grading of evidence for critical appraisal and for systematic reviews.
In 1979, the Canadian Task Force on the Periodic Health Examination was one of the first groups to propose grading the strength of recommendations.51 Since then there have been several published systems for rating the quality of evidence, although most were not designed specifically to be translated into guideline development. The Agency for Health Care Policy and Research has published such a system, although this body considered that its level of classification may be too complex to allow clinical practice guideline development.52 Nevertheless, the Agency advocated evidence-linked guideline development, requiring the explicit linkage of recommendations to the quality of the supporting evidence. The Centre for Evidence-based Medicine has developed a more comprehensive grading system, which incorporates dimensions such as prognosis, diagnosis and economic analysis.53
These systems are complex; for textbooks, care pathways and guidelines, such grading systems need to be clear and easily understood by the relevant audience as well as taking into account all the different forms of evidence that may be appropriate to such documents.
Determining Strength Of Evidence
There are three main factors that need to be taken into account in determining the strength of evidence:
Type and quality of study
Meta-analyses, systematic reviews and RCTs are generally considered to be the highest quality evidence that is available. However, in some situations these may not be appropriate or feasible. Recommendations may depend upon evidence from other kinds of study, such as observational studies of epidemiology or natural history, or synthesised evidence, such as decision analyses and cost-effectiveness modelling.
For each type of evidence, there are sets of criteria as to the methodological quality, and descriptions of techniques for critical appraisal are widely available.21 Inevitably, there is some degree of subjectivity in determining whether particular flaws or a lack of suitable information invalidates an individual study.
Robustness of findings
The strength of evidence from a published study would depend not only upon the type and quality of a particular study but also upon the magnitude of any differences and the homogeneity of results. High-quality research may report findings with wide confidence intervals, conflicting results or contradictory findings for different outcome measures or patient subgroups. Conversely, sensitivity analysis within a cost-effectiveness or decision analysis may indicate that uncertainty regarding the exact value of a particular parameter does not detract from the strength of the conclusion.
Applicability of evidence
Strong evidence in a set of guidelines must be wholly applicable to the situation in which the guidelines are to be used. For example, a finding from high-quality research based upon a hospital population may provide good evidence for guidelines intended for a similar setting but a lower quality of evidence for guidelines intended for primary care.
Grading System For Evidence
The following is a simple pragmatic grading system for the strength of a statement of evidence, which will be used to grade the evidence in this book (and the other volumes in the Companion series). Details of the definitions are given in Table 1.4. For practical purposes, only the following three grades are required, which are analogous to the levels of proof required in a court of law:
Grading of evidence and recommendations
I ‘Beyond reasonable doubt’. Analogous to the burden of proof required in a criminal court case and may be thought of as corresponding to the usual standard of ‘proof’ within the medical literature (i.e. P < 0.05).
II ‘On the balance of probabilities’. In many cases, a high-quality review of literature may fail to reach firm conclusions because of conflicting evidence or inconclusive results, trials of poor methodological quality or the lack of evidence in the population to which the guidelines apply. Where such strong evidence does not exist, it may still be possible to make a statement as to the ‘best’ treatment on the ‘balance of probabilities’. This is analogous to the decision in a civil court where all the available evidence will be weighed up and a verdict will depend upon the ‘balance of probabilities’.
III ‘Unproven’. Where the above levels of proof do not exist.
All evidence-based guidelines require regular review because of the constant stream of new information that becomes available. In some areas, there is more rapid development and the emergence of new evidence; in these instances, relevant reference will be made to ongoing trials or systematic reviews in progress.
Grading Of Recommendations
Although recommendations should be based upon the evidence presented, it is necessary to grade the strength of recommendation separately from the evidence. For example, the lack of evidence regarding an expensive new technology may lead to a strong recommendation that it should only be undertaken as part of an adequately regulated clinical trial. Conversely, strong evidence for the effectiveness of a treatment may not lead to a strong recommendation for use if the magnitude of the benefit is small and the treatment very costly.
The following grades of recommendations are suggested and details of the definition are given in Table 1.4:
A A strong recommendation, which should be followed.
B A recommendation using evidence of effectiveness, but where there may be other factors to take into account in the decision-making process.
C A recommendation where evidence as to the most effective practice is not adequate, but there may be reasons for making the recommendations in order to minimise cost or reduce the chance of error through a locally agreed protocol.
Implementation of evidence-based medicine
Healthcare professionals have always sought evidence on which to base their clinical practice. Unfortunately, the evidence has not always been available, reliable or explicit, and even when available it has not always been implemented promptly. James Lancaster showed in 1601 that lemon juice was effective in the treatment of scurvy, and in 1747 James Lind repeated the experiment. The British Navy did not utilise this information until 1795, and the Merchant Navy not until 1865. When implementation of research findings is delayed, it is ultimately the patients who suffer.
A number of different groups of people may need to be committed to the changes before they can take place with any degree of success. These include:
Each of these groups has a different set of priorities. To ensure that their own requirements are met by the proposal, negotiation is required, which takes time. There are many potential barriers to the implementation of recommendations, and clinicians may become so embroiled in tradition and dogma that they are resistant to change. They may lack knowledge of new developments or the time and resources to keep up to date with the published literature. Lack of training in a new technology, such as laparoscopic surgery or interventional radiology, may thwart its use, even when it has been shown to be effective. Researchers may become detached from the practicalities of clinical practice and the needs of the health service, and may concentrate on inappropriate questions or produce impractical guidelines. Managers are subject to changes in the political climate and can easily be driven by policies and budgets. The resources available to them may be limited and may not allow for the purchase of new technology, and even potentially cost-saving developments may not be introduced because of the difficulties in releasing the savings from elsewhere in the service.
Patients and the general public can also influence the development of the healthcare offered. They are susceptible to the persuasion of the mass media and may demand the implementation of ‘miracle cures’ or fashionable investigations or treatments. Such interventions may not be practical or of any proven benefit. They can also determine the success or failure of a particular treatment. For instance, a treatment may be physically or morally unacceptable, or there may be poor compliance, especially with preventative measures such as diets, smoking cessation or exercise. All these aspects can lead to a delay in the implementation of research findings.
Potential ways of improving this situation include the following:
There is a gap between research and practice, and there is a need for evidence about the effectiveness of different methods of implementing changes in clinical practice. The NHS Central R&D Committee set up an advisory group to look into this problem and identified 20 priorities for evaluation, as shown in Box 1.2.54
Box 1.2 Priority areas for evaluation in the methods of implementation of the findings of research: recommendations of the advisory group to the NHS Central Research and Development Committee
From NHS Central Research and Development Committee. Methods to promote the implementation of research findings in the NHS: priorities for evaluation: report to the NHS Central Research and Development Committee. London: Department of Health, 1995. © Crown copyright 2008. Reproduced under the terms of the Click-Use Licence.
An EPOC review has examined the different methods of implementing evidence-based healthcare and classified them into three broad groups:55
Several groups have looked at implementing evidence-based practice, such as grommet use in glue ear and steroids in preterm delivery:
Successful implementation of research findings in practice appears to depend upon a multipronged approach, and the UK National Association of Health Authorities and Trusts (NAHAT) has produced an action checklist to facilitate this process.59
It must be remembered, however, that EBM is not the sole preserve of experts or clinicians. The research, dissemination and implementation of clinical and economic evaluations have wide-reaching repercussions for the health service. Managers are under increasing pressure to demonstrate both clinical effectiveness and cost-effectiveness, and are accountable at local, regional and national levels. They need to be actively involved and to understand the process. As with all interactions between elements in the health service, there must be collaboration, the ultimate goal being an improvement in patient care.
Audit is the systematic critical analysis of the quality of medical care, including the procedures used for diagnosis and treatment, the use of resources, and the resulting outcome and quality of life for the patient.60
The Department of Health has set out policy documents that outline the development and role of audit in today's healthcare system.60,61 Everyone involved in the healthcare process has a responsibility to conduct audit and to assess the quality of care that they provide. In 1966, Donabedian categorised three important elements in the delivery of healthcare:62
Audit is a dynamic cyclical process (an audit loop) in which standards are defined and data are collected against these standards (Fig. 1.3). The results are then analysed and if there are any variances, proposals for change are developed to address the needs. These changes are then implemented and the quality of care reassessed. This closes the audit loop and the procedure begins again. The key to effective audit is that the loop must begin with the development of evidence-based standards. Any success in changing care to meet proposed standards is unlikely to produce more effective clinical care if such standards are set in an arbitrary way. The Royal College of Surgeons of England has published its own guidelines on clinical audit in surgical practice.63
FIGURE 1.3 The audit loop.
One result of the drive to implement audit in the UK was the development in 1993 of a National Confidential Enquiry into Perioperative Deaths (NCEPOD). This is an ongoing national audit and has produced a series of reports and recommendations based upon a peer review process. The process has a high rate of participation and reports with recommendations have resulted in a number of changes in clinical practice. For example, there has been a dramatic reduction in out-of-hours operating following recommendations suggesting that much of this was unsafe and unnecessary.
Possible Sources Of Further Information, Useful Internet Sites And Contact Addresses
The details below provide references to a number of sources of information, particularly those accessible through the Internet. It must be remembered that there are rapid changes in the material available online and Internet addresses are liable to change. Several of these sources provide extensive links to other sites.
Organisations Specialising In Evidence-Based Practice, Systematic Reviews, Etc
Aggressive Research Intelligence Facility (ARIF)
BMJ Evidence Centre
CASP (Critical Appraisal Skills Program)
Centre for Evidence-based Child Health
Centre for Evidence-based Medicine, established in Oxford.
Centre for Evidence-based Mental Health
Centre for Health Evidence, University of Alberta.
Evidence Network – an initiative of the ESRC UK Centre for Evidence-Based Policy and Practice
McMaster University Health Information Research Unit
NIHR Health Technology Assessment Programme
NHS Centre for Reviews and Dissemination University of York.
National Institute for Clinical Excellence (NICE)
Intute: Health and Life Sciences – closed in July 2011; the website remains accessible for a further 3 years but is no longer updated.
National Institute for Health and Research
NHS Evidence – web-based portal managed by NICE and linked with the National Electronic Library for Health (NeLH). Includes access to My Evidence.
UK Cochrane Centre
Summertown Pavilion, Middle Way, Oxford OX2 7LG
Tel. 01865 516300
home page – http://ukcc.cochrane.org/
The Cochrane Collaboration
Internet access to the Cochrane library and databases:
Cochrane Central Register of Controlled Trials
EPOC – Effective Practice and Organisation of Care
University of Alberta Evidence Based Practice Centre
Sources Of Reviews And Abstracts Relating To Evidence-Based Practice
ACP Journal Club
Bandolier (now an electronic version, independently written by Oxford scientists)
BMJ Clinical Evidence – a compendium of evidence for effective health care
Centre for Evidence Based Purchasing
Cochrane Systematic Reviews (abstracts only)
Effective Health Care Bulletins
Evidence Based Nursing Practice
Evidence Based On-call
PROSPERO – worldwide prospective register of systematic reviews
Journals Available On The Internet
eBMJ (electronic version of the British Medical Journal)
Journal of the American Medical Association (JAMA)
Canadian Medical Association Journal (CMAJ)
Evidence-based Mental Health
Databases, Bibliographies And Catalogues
PUBMED (the free version of MEDLINE)
BestBets – best evidence topics
Trip database – turning research into practice
BMJ Best Health
DUETs – The Database of Uncertainties about the Effects of Treatments publishes uncertainties that cannot currently be answered by referring to reliable up-to-date systematic reviews of existing research evidence
National Research Register Archive – a searchable copy of the archives held by the National Research Register (NRR) Projects Database, up to September 2007
National Institute for Health Research (NIHR) Clinical Network Research Portfolio is a database of clinical research studies that it supports, undertaken in the NHS
Sources Of Guidelines And Integrated Care Pathways
AHRQ (Agency for Healthcare Research and Quality) – provides practical healthcare information, research findings and data to help consumers
Evidence Based Practice Centres – developed in conjunction with the AHRQ
Cedars-Sinai Medical Center, Health Services Research
Home page: http://www.csmc.edu/
National Guideline Clearinghouse
Scottish Intercollegiate Guidelines Network (SIGN)
Scottish Pathways Association
Towards Optimised Practice (TOP) Clinical Guidelines
Cochrane Collaboration Handbook