Practical Transfusion Medicine 4th Ed.

45. Observational and interventional trials in transfusion medicine

Alan Tinmouth1,2, Dean Fergusson2 & Paul C. Hébert2

1General Hematology and Transfusion Medicine, Division of Hematology, Department of Medicine, Ottawa Hospital, Ottawa, Ontario, Canada

2University of Ottawa Centre for Transfusion Research, Clinical Epidemiology Program, The Ottawa Health Research Institute, Ottawa, Ontario, Canada


Randomized controlled clinical trials (RCTs) have evolved to become the ‘gold standard’ clinical research design used to distinguish the risks and benefits of therapeutic interventions. In 1948, for the first time a controlled clinical trial made use of random allocation, a control group and blinding. Additional principles guiding the design of RCTs were first elaborated by Sir Austin Bradford-Hill in the 1960s [1].

Many important questions regarding the use of blood components and alternatives such as blood conservation therapies have not been the subject of well-designed and executed RCTs. Consequently, clinicians frequently base their therapeutic decisions on suboptimal levels of clinical evidence, including observational studies, poorly controlled clinical trials or laboratory studies, and personal experience or observations, which are not evidence based. There are a number of plausible reasons why there are so few large clinical trials in transfusion medicine:

·        transfusion medicine has historically been a laboratory-based specialty with research focused on the product;

·        a relative lack of clinical epidemiologists and clinical trialists interested in transfusion medicine;

·        difficulty in obtaining funding for research of a supportive as opposed to a curative therapy;

·        many of the products have been in standard use for years without good evidence to define the benefits or harms, or specific indications of use; and

·        few industry partners willing to invest in large clinical trials given that products are already in wide use.

It could also be postulated that unique difficulties in the field may have impeded the development of important clinical studies. In this chapter, we outline some of the methodological issues central to the development and conduct of observational and interventional trials, including RCTs, in transfusion medicine.

What is unique about transfusion medicine?

There are a number of unique difficulties in transfusion medicine that require consideration in the development of clinical research. First and foremost, transfusion medicine as a discipline is largely based upon the provision of supportive interventions in the treatment of a wide variety of acute illnesses as opposed to disciplines such as cardiology or oncology where interventions are evaluated in well-defined diseases. Furthermore, studies are conducted by trialists trained within a specific discipline (e.g. oncologists, cardiac surgeons). In comparison, blood components are indicated in response to conditions such as anaemia or coagulopathy induced by medications or disease entities. The supportive nature of transfusions leads to consideration of outcomes that may not be directly clinically relevant to the underlying disease process. In addition, most benefits and risks of care either would not be attributed to supportive interventions or the supportive intervention may not be considered the prime factor influencing the outcome(s).

Conditions that require blood components, such as anaemia and coagulopathies, occur in a broad range of diseases. This raises significant difficulties in designing studies and setting a research agenda. Evaluating a transfusion intervention in many diseases can be problematic as the frequency of the outcome(s) of interest will vary. Larger sample sizes and more robust outcomes would then be required to account for the variation within the patient population. The alternative strategy of smaller trials in targeted populations will limit the generalizability of the study as the conclusions may not be applicable to patient groups outside the studied target population.

A further concern is the complex biological nature of blood components. For instance, red cell concentrates are prepared using a variety of techniques and storage media, and intravenous immunoglobulins are manufactured by different companies using different processes. This leads investigators to consider whether studies evaluating a transfusion intervention should only be done with one product or preparation, or with many different preparations. Indeed, there may be unforeseen or unexpected clinical consequences due to different approaches to the preparation of blood components. In the planning of studies, one must carefully consider whether products are sufficiently similar to consider including in the same study. Regulatory concerns within or between jurisdictions may also impact on the choice of products included in the study.

As discussed in subsequent sections, regulations may not permit the conduct of randomized trials for studies assessing the clinical consequences of different products or testing procedures. Under such constraints, quasi-experimental designs such as before-and-after studies or time-series analyses should be considered.

One of the remaining unique aspects of conducting clinical research in transfusion medicine is that transfusions are often incorporated into complex care paths or within therapeutic algorithms of care. The evaluation of transfusions with many other interventions and diagnostic tests increases the complexity of any clinical evaluation.

Types of studies

To ascertain the effectiveness of an intervention, the RCT remains the preferred study design as it should minimize the most important biases if properly conceived and executed. Despite being the ‘gold standard’, there are often practical, legal, financial and ethical limitations to the use of clinical trials. For instance, exposing subjects to undesirable and dangerous interventions such as cigarettes and toxins would not be permitted in an RCT. While many of these limitations have been well described, one unique obstacle encountered in transfusion medicine is the conduct of an RCT when an intervention is universally implemented, such as a new processing method or testing procedure for the entire blood supply. By implementing an intervention such as universal pre-storage leucocyte reduction or universal polymerase chain reaction (PCR) testing for hepatitis C, an RCT becomes impossible within that population. If an RCT is not possible, other study designs including quasi-experimental and observational designs should be considered.

Observational studies

Two types of observational designs are often considered in clinical research: case–control studies and cohort or prognostic studies (Figure 45.1). In all observational studies, the first step is to define (a) the research hypothesis, (b) the population, (c) the exposure(s), (d) the outcome(s) and (e) the covariates (factors other than the exposure that may influence the occurrence of the outcomes). A case–control study refers to a study where one identifies a group of individuals with an outcome and another group of individuals who would be considered at risk of developing the outcome. Once both groups have been identified, investigators usually seek to identify potential risk factors in both the group with the outcome and the controls. This classic epidemiological design is ideally suited to the investigation of rare diseases and the identification of potential aetiological or risk factors, particularly if there is a long latency period [2]. In transfusion medicine, case–control studies would be ideally suited for the initial study of rare conditions such as transfusion-related acute lung injury (TRALI) and the association between blood transfusion and variant Creutzfeldt–Jakob disease (vCJD). By definition, this study design is always retrospective in nature. Cases, controls and potential risk factors are identified from historical records or past events. By comparing 46 cases of patients with TRALI (cases) and 225 randomly selected transfusion recipients without TRALI (controls), Silliman et al. [3] were able to identify that certain diagnoses (haematological malignancies and cardiac disease) and the age of the platelets were associated with TRALI. In a smaller subset of cases and controls, the implicated units also had greater neutrophil priming activity as compared with controls. While these results demonstrate an association between neutrophil priming activity, which increases in older platelets, and TRALI reactions, the case–control design does not allow causation to be determined.
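The comparison at the heart of a case–control study can be sketched as an odds ratio computed from a 2 × 2 table of exposure by case status. The sketch below is illustrative only: the exposure counts are hypothetical and are not the data of Silliman et al.; only the group sizes (46 cases, 225 controls) echo the example above, and the function name is an assumption.

```python
import math

def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Odds ratio with an approximate (log-based, Woolf) 95% confidence interval."""
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of log(OR) from the four cell counts
    se_log = math.sqrt(1 / exp_cases + 1 / unexp_cases
                       + 1 / exp_controls + 1 / unexp_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: 30 of 46 cases and 90 of 225 controls were exposed
or_estimate, ci_lower, ci_upper = odds_ratio_ci(30, 16, 90, 135)
print(f"OR = {or_estimate:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

An interval that excludes 1 indicates an association between exposure and outcome; it does not by itself establish causation.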

Fig 45.1 Observational study designs: case–control and cohort studies. Adapted from Tay and Tinmouth [26].


Despite some of the potential advantages of this study design, it is difficult to do well and is fraught with potential biases. A systematic review of case–control studies attempting to establish the association between blood transfusion and vCJD demonstrated that blood transfusion had a protective effect [4]. Such a protective effect makes little sense and is probably the result of some biased approach to the sampling of controls.

The second observational design choice is a cohort study. In this type of study, individuals are identified well in advance of developing a disease and followed forward in time. Ideally, information on potential risk factors would be gathered from patients throughout the period of observation until the occurrence of an outcome or the end of the study. If subjects are identified well in advance of the development of a disease, then comparing individuals who have a given risk factor with individuals who do not may provide important clues to the aetiology of a disease or health state. It may also lead to a better understanding of the course of the disease and its incidence. If patients are identified and followed once a disease has developed, then this design may also provide invaluable prognostic information.

Cohort studies follow patients forward in time and evaluate outcomes based on a known exposure, risk factor or treatment. This design is most powerful when all eligible individuals are identified early, followed prospectively and without any losses to follow-up. A number of cohort studies have examined the relationship between anaemia, red cell transfusion and outcomes such as hospital mortality. These studies illustrate both positive and negative attributes of cohort studies. A retrospective study conducted by Carson and colleagues evaluated the relationship between increasing degrees of anaemia, the presence of ischaemic heart disease and mortality rates [5]. Among 1958 Jehovah's Witness patients, the adjusted odds of death increased from 2.3 (95% confidence interval (CI) 1.4–4.0) to 12.3 (95% CI 2.5–62.1), as preoperative haemoglobin concentrations declined from 10.0–10.9 g/dL to 6.0–6.9 g/dL in patients with cardiac disease as compared with patients without cardiac disease. This study shows a clear relationship between increasing anaemia and death.

In comparison, the risks of anaemia or transfusing older red blood cells and benefits of transfusions may be quite complex. This interdependence is often referred to as confounding by indication. Confounding is the mixing or blurring of effects where an outcome is related to an exposure but the effect is due to a third factor [6]. Observational studies in transfusion medicine have attempted to compare clinical outcomes in patients receiving red blood cells with varied storage times. These studies have major limitations including confounding by indication [7]. Koch et al. [8] retrospectively studied the Medicare records of 8366 patients who received a red blood cell transfusion during cardiac surgery at a single institution over an 8-year period. Patients were arbitrarily divided into groups that received only red cells stored 14 days or less (newer) or red cells stored for greater than 14 days (older). Patients who received mixed aged red cells (n = 2364) were excluded from the analysis. Transfusion of newer red cells was associated with reductions in a composite outcome of serious adverse events including MI, stroke, sepsis, organ failure and death (adjusted odds ratio 1.16; 95% confidence interval, 1.01 to 1.33; p = 0.03), in in-hospital mortality (1.7% versus 2.8%, p = 0.004) and in 1-year mortality (7.4% versus 11.0%, p < 0.001). Even though the study was published in a prestigious journal, there were several major limitations, making it difficult to draw useful conclusions:

·        association between increased number of units transfused and older red cell units;

·        confounding by indication;

·        dichotomization by a single age point introduces an artificial cut-off;

·        a primary composite outcome including 17 items;

·        the timing of the transfusions was unknown;

·        inability to adjust for important but unmeasured variables; and

·        a prolonged observation period that may include changes in transfusion and/or medical practice.

One of the positive aspects of the study by Koch and colleagues was their approach to analysis. They did not include patients who received both newer and older red cells, and they used propensity scoring to match patients in the newer and older red cell transfusion groups. However, as is most often seen with observational studies, the potential biases and limitations of the analysis do not allow for any definitive conclusions.
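The way a third factor can blur an association may be illustrated with a small stratified analysis. In the sketch below, all counts are hypothetical and are not taken from Koch et al.: patients are split by the number of units transfused, heavily transfused patients are both sicker and more likely to receive older units, and within each stratum red cell age has no effect on the outcome. The Mantel–Haenszel estimator is used here purely as an illustration of adjustment for a measured confounder; it is not the propensity method used in the study above.

```python
def odds_ratio(a, b, c, d):
    """a, b = events and non-events with older units; c, d = same with newer units."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata of a measured confounder."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical strata: (older events, older non-events, newer events, newer non-events)
strata = [
    (10, 190, 40, 760),   # few units transfused: 5% event rate in both groups
    (160, 640, 40, 160),  # many units transfused: 20% event rate in both groups
]

# Collapsing the strata ignores the confounder
crude_table = tuple(sum(column) for column in zip(*strata))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata)
print(f"crude OR = {crude_or:.2f}, stratum-adjusted OR = {adjusted_or:.2f}")
```

The crude odds ratio suggests harm from older units even though none exists within either stratum. Adjustment of this kind only works for confounders that were measured; unmeasured variables remain a limitation.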

The use of cohort studies can be of particular value in the evaluation of a universally implemented intervention such as pre-storage leucocyte reduction. In such a case, subjects must either be sampled over a period of time prior to and after the implementation of the programme (a ‘before-and-after’ or interrupted time-series study) or sampling must occur among subjects who received leucocyte-reduced blood components and another population that did not receive such products (standardized incidence study).

In a before-and-after study design, the frequency of an outcome in a specified population is measured during a period of time when the exposure is absent, followed by a measurement in the same population during a period of time where exposure is present. Consecutive periods before and after the implementation of a treatment are often compared. When a single measurement in each of the pre- and postintervention periods is compared, there is the risk that changes occurring as a result of other ongoing factors may be attributed to the intervention. To limit this temporal bias, the changes in the experimental group may be compared to a control group not exposed to the intervention (controlled before-and-after study). Alternatively, an interrupted time-series design (a before-and-after study that makes determinations of an outcome at multiple time points before and after the implementation of an intervention) may be used to account for any temporal changes occurring during the period of observation.
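The difference between a simple before-and-after comparison and an interrupted time-series analysis can be sketched numerically. The monthly outcome rates below are hypothetical: they fall steadily over time for reasons unrelated to the intervention and then drop by a further fixed amount when the intervention starts. The sketch fits a straight line to the pre-intervention points and projects it forward, which is a simplification of segmented regression (it estimates a level change only and ignores autocorrelation and changes in slope).

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical monthly rates: background decline of 0.5 per month,
# plus a drop of 2.0 when the intervention begins at month 6
pre_months = list(range(0, 6))
post_months = list(range(6, 12))
pre_rates = [20.0 - 0.5 * t for t in pre_months]
post_rates = [20.0 - 0.5 * t - 2.0 for t in post_months]

# Simple before-and-after comparison: difference in period means
naive_change = sum(post_rates) / len(post_rates) - sum(pre_rates) / len(pre_rates)

# Interrupted time series: compare observed values with the projected pre-trend
slope, intercept = fit_line(pre_months, pre_rates)
projected = [intercept + slope * t for t in post_months]
its_change = sum(o - p for o, p in zip(post_rates, projected)) / len(post_rates)

print(f"before-and-after change = {naive_change:.1f}, time-series change = {its_change:.1f}")
```

In this constructed example the simple comparison attributes the whole decline to the intervention, whereas the time-series estimate recovers only the change beyond the pre-existing trend.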

In a standardized incidence study, a standardized incidence ratio is calculated by comparing (standardizing) the incidence of an outcome in a defined exposed population with that of a nonexposed population. In the standardization procedure, care is taken to adjust for important confounders. Using universal leucocyte reduction as an example, the incidence of nosocomial infection in Canadian patients receiving a transfusion could be compared to a US population of transfused patients receiving non-leucocyte-reduced blood components.
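The standardization step can be sketched as follows. The expected number of events is obtained by applying stratum-specific rates from the nonexposed reference population to the exposed population, and the ratio of observed to expected events gives the standardized incidence ratio. All numbers are hypothetical, the strata (for example age bands) are an assumed choice of confounder, and the function name is illustrative.

```python
def standardized_incidence_ratio(observed_events, exposed_counts, reference_rates):
    """Observed events divided by the events expected under reference rates."""
    expected = sum(n * rate for n, rate in zip(exposed_counts, reference_rates))
    return observed_events / expected, expected

# Hypothetical exposed population in three strata, with reference-population rates
exposed_counts = [500, 300, 200]
reference_rates = [0.02, 0.05, 0.10]

sir, expected = standardized_incidence_ratio(36, exposed_counts, reference_rates)
print(f"expected = {expected:.1f}, SIR = {sir:.2f}")
```

A ratio below 1 indicates fewer events in the exposed population than the reference rates would predict, after accounting for the stratifying variable.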

Well-executed case–control studies may provide clues about the aetiology or risk factors associated with the development of a disease or a complication. A cohort study may provide the best estimate of incidence, prognosis and risks associated with the development of a disease or its complications. Both designs provide weak inferences regarding specific therapeutic interventions because many forms of bias and confounding remain even after complex multivariable analysis. Before-and-after studies and time-series analysis, both quasi-experimental designs, may provide some inferences regarding clinical consequences attributed to the implementation of a universal programme when a randomized trial is not possible [9]. Inherent in both case–control and cohort studies is the inability to determine causality between a risk factor or treatment and a specific outcome.

Randomized controlled trials

Overall design approaches for RCTs

Clinicians, hospital administrators and policymakers should always seek to identify the best evidence for decision making. Researchers should aspire to conduct the highest quality studies. For therapeutic interventions, there is little debate that this should be an RCT. However, there should be an awareness that randomized trials may be complex. The question being addressed, the many choices and compromises made by the investigators pertaining to different study manoeuvres, such as the selection of patients and centres, may affect inferences made from the results of the trial. In this section, a conceptual framework is provided for randomized trials that should assist providers and consumers of clinical research.

The ideal RCT establishes whether therapeutic interventions work and determines the overall benefits and risks of each alternative in predefined patient populations. This is accomplished by minimizing the influence of chance, bias and confounding through appropriate methodology. In addition, the ideal RCT should attempt to fulfil its objectives with the fewest patients possible (often termed ‘statistical efficiency’). Unfortunately, these objectives are often in direct conflict rather than complementary. More importantly, economic considerations often limit our ability to fulfil all these objectives. For instance, by maximizing the efficiency of a study, investigators might sacrifice their ability to draw conclusions in clinically important subgroups because of an inadequate sample size.
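The tension between efficiency and the ability to detect smaller effects can be made concrete with a standard approximate sample size calculation for comparing two proportions. This is a minimal sketch: the event rates, significance level and power are hypothetical planning assumptions, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def n_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided comparison of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return math.ceil((z_alpha + z_beta) ** 2 * variance
                     / (p_control - p_treatment) ** 2)

# Hypothetical mortality of 40% in controls, reduced to 30% or to 35%
n_large_effect = n_per_group(0.40, 0.30)
n_small_effect = n_per_group(0.40, 0.35)
print(n_large_effect, n_small_effect)
```

Halving the anticipated difference roughly quadruples the required sample size, which is why a trial sized for its overall comparison is usually too small for its subgroups.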

The most important consequence of these conflicting objectives is that choices made in the design of RCTs must focus on whether an intervention works or whether it results in more good than harm for patients [10]. Trials that attempt to determine therapeutic efficacy address the question ‘Will the therapy work under ideal conditions?’ Trials attempting to determine therapeutic effectiveness address the question ‘Will the therapy do more good than harm under usual practice conditions in all patients who are offered the intervention?’ Clearly, both questions will yield useful information for health practitioners. Efficacy is often established first and then the intervention may be evaluated for its effectiveness. In pivotal RCTs used in the final phase of obtaining regulatory approval (phase III trials), pharmaceutical companies primarily wish to demonstrate that their product has proven efficacy; rarely are attempts made to demonstrate therapeutic effectiveness.

The design characteristics of efficacy and effectiveness trials tend to differ considerably (Table 45.1). As a consequence of design choices, inferences and threats to the validity of effectiveness and efficacy trials are different. Therefore, one of the first steps in planning an RCT is to determine which of these two design approaches will best reflect the primary study question. Efficacy trials often opt for restricted eligibility, rigorous treatment protocols and disease-specific outcomes responsive to the potential benefits of the experimental intervention. By using this approach, efficacy studies attempt to maximize internal validity, defined as the extent to which the experimental findings represent the true effect in study participants. As a consequence, this design approach will often lack the ability to maximize external validity, defined as the extent to which the experimental findings in the study represent the true effect in the target population. Hence, there is often a trade-off between the two forms of validity in any one study.

Table 45.1 Comparison of study characteristics using either an efficacy or an effectiveness approach when designing a study.

| Study characteristics | Efficacy trial | Effectiveness trial |
| Research question | Will the intervention work under ideal conditions? | Will the intervention result in more good than harm under usual practice conditions? |
|  | Restricted to specialized centres | Open to all institutions |
| Patient selection | Selected, well-defined patients | A wider range of patients identified using broad eligibility criteria |
| Study design | Smaller RCT using stringent rules | Larger multicentre RCT using simpler rules |
| Baseline assessment | Elaborate and detailed | Simple and clinician friendly |
|  | Tightly controlled | Less controlled |
|  | Optimal therapy under optimal study conditions | Therapy administered by investigators using accepted approaches |
| Treatment protocols | Rigorous and detailed | Very general |
|  |  | Noncompliance tolerated |
|  | Related to biologic effect; surrogate endpoints | Patient-related such as all-cause mortality or quality of life |
|  | By treatment received |  |
|  | Noncompliers removed | All patients included |
| Data management |  |  |
| Data collection |  | Minimal and simple |
| Data monitoring* | Detailed and rigorous |  |

*Data monitoring refers to the review of source documents and adjudication/verification of outcomes.

As an example of an efficacy trial, Rivers and colleagues undertook an RCT in which they randomly allocated 263 patients with early sepsis and septic shock to receive either goal-directed therapy using a monitor of continuous central venous saturation in one group versus standard care in the other arm of the trial [11]. Both groups received fluids, vasopressors such as noradrenaline (norepinephrine), inotropic agents such as dobutamine and red cell transfusions according to a strict clinical algorithm. In addition to the clinical algorithm, the goal-directed therapy arm was required to maintain mixed venous saturation greater than 70%. Saturations below 70% suggest an ongoing oxygen debt and shock. In the first 6 hours of care in the emergency department, the experimental group received more fluids (4981 mL versus 3499 mL, p < 0.001), more inotropic agents (13.7% versus 0.8% of patients, p < 0.001) and more red cell transfusions (64.1% versus 18.5% of patients, p < 0.001). As a result of the multiple interventions including red cells, in-hospital mortality was decreased from 46.5 to 30.5% (p < 0.009). In this efficacy study, many of the study manoeuvres were tightly controlled. Specifically, the trial was conducted in a single tertiary centre, by a small number of experts, in a well-defined patient population using elaborate treatment algorithms in both the experimental and standard of care groups. This efficacy approach may be contrasted with large cardiovascular trials such as the International Study of Infarct Survival trials in acute myocardial infarction, which enrolled thousands of patients [12]. 
One of the major shortcomings of effectiveness trials is the limited data collection and the limited control imposed on most aspects of the study design, thereby increasing biological variability, minimizing information on biological mechanisms and curtailing the possibility of understanding negative results, or the influence of cointerventions and confounding on study outcomes. One of the few examples of a large simple trial design in transfusion medicine is the saline-versus-albumin fluid evaluation (SAFE) study, which randomized 7000 critically ill patients with a wide variety of illnesses to 4% albumin or saline as a resuscitation fluid [13]. This trial found that mortality did not differ between the two groups. The lack of benefit makes the use of albumin, which is expensive, hard to justify in this group of patients. However, as a downside of the broad inclusion criteria, questions remain about the benefits of albumin in subgroups such as septic patients, where a biologic rationale supporting its use exists.

Many trials have opted for a hybrid approach between large simple trials and tightly controlled clinical studies. The Transfusion Requirements in Critical Care (TRICC) trial [14] and the Platelet Dose (PLADO) trial [15] provide examples. The former was an 838-patient trial that randomly allocated patients to either a restrictive or a liberal transfusion strategy. The study was conducted in 25 clinical centres, enrolled patients using broad eligibility criteria, followed simple treatment strategies for the administration of red cells and ascertained mortality rates and rates of organ failure [14]. The latter randomized 1272 patients with hypoproliferative thrombocytopenia at 26 sites to receive low, medium or high dose prophylactic platelet transfusions, and assessed the bleeding complications and platelet utilization in recipients [15].

RCT design alternatives

Once investigators have chosen whether an efficacy, effectiveness or a hybrid approach will best answer the research question, there are several design options that may be considered (Table 45.2). A two-group parallel design is the most common RCT design (Figure 45.2a) because it is the simplest to plan, implement, analyse and, most importantly, interpret. In this design, patients are randomly allocated to one of two therapeutic interventions and followed forward in time. Parallel group designs may also be used to independently compare three or more treatments [16].

Table 45.2 Types of randomized clinical trial (RCT) designs.


Fig 45.2 Design approaches for randomized controlled trials. Adapted from Tinmouth and Hébert [27]. (a) Randomized two-group parallel design: subjects randomly assigned to treatment A or B. (b) Factorial design: all subjects randomly assigned to treatment A, treatment B, treatment A + B or no treatment. (c) Randomized crossover design: subjects randomly assigned to treatment A followed by treatment B (after washout period) or treatment B followed by treatment A. (d) Randomized cluster design: all subjects in one group/area (e.g. by physician, by hospital, by ward) are assigned to treatment A or B.


The use of factorial designs may also be considered when a number of therapies are being evaluated in combination. For instance, in a two-by-two (2 × 2) factorial design, two interventions are tested both alone and in combination, and compared with a control group (usually a placebo) (Figure 45.2b) [17]. This means that investigators can efficiently test two interventions with only marginal increases in sample size. In addition, the benefits of treatment combinations can be evaluated in a controlled manner. This design is most useful when interactions are either very strong or nonexistent. Thus, before embarking on a large, more complex factorial study, investigators should expect either strong additive or synergistic effects from combined therapy or none at all [17]. Prospective investigators should realize that detecting interactions is also more difficult and requires a much larger sample size as compared with comparison of either therapy with a placebo. Factorial designs have been used very successfully to evaluate thrombolytic therapy in combination with an antiplatelet agent in acute myocardial infarction and unstable angina [18].

Factorial designs imply concurrent comparisons between at least two therapies. It is also possible to implement a design that compares interventions sequentially. For example, two therapies in the early treatment of a disease could be compared, followed by the evaluation of a second intervention in the late phase of care several days later. The authors are not aware of any clinical studies in transfusion medicine that have made use of a factorial design. An example of a factorial trial would be to randomize patients to an algorithm of care versus standard care in addition to either a conservative or liberal transfusion threshold. Traditionally, the factorial design is used to answer two separate study questions.
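The analysis of a 2 × 2 factorial trial can be sketched in terms of main effects and an interaction. In the sketch below, A might stand for an algorithm of care and B for a restrictive transfusion threshold; the four event rates are hypothetical, effects are expressed as risk differences, and the function name is illustrative.

```python
def factorial_effects(risk_neither, risk_a_only, risk_b_only, risk_both):
    """Main effects (averaged over the other factor) and interaction, as risk differences."""
    effect_a = ((risk_a_only - risk_neither) + (risk_both - risk_b_only)) / 2
    effect_b = ((risk_b_only - risk_neither) + (risk_both - risk_a_only)) / 2
    # Interaction: does the effect of A differ depending on whether B is given?
    interaction = (risk_both - risk_b_only) - (risk_a_only - risk_neither)
    return effect_a, effect_b, interaction

# Hypothetical event rates in the four randomized cells
effect_a, effect_b, interaction = factorial_effects(0.20, 0.15, 0.16, 0.11)
print(f"A: {effect_a:+.2f}, B: {effect_b:+.2f}, interaction: {interaction:+.2f}")
```

With no interaction, every patient contributes to the estimate of both main effects, which is the source of the design's efficiency. Estimating the interaction itself relies on a contrast between contrasts and is correspondingly less precise.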

Both the simple parallel group design and a factorial design are designed using classical or frequentist statistical approaches, where the sample size is fixed according to pre-established assumptions (anticipated outcomes in treatment and control group, power and significance levels) prior to the commencement of enrolment. There are other experimental designs that are more responsive to patient outcomes as the study progresses. Sequential designs use frequentist statistical methods to set boundaries for significance levels that consider the increasing number of comparisons and sample size throughout the study. True sequential studies randomly allocate patients to receive one of two therapies. Pairs of patients are then sequentially compared. The study is terminated as soon as one of the significance boundaries is crossed. One of the major concerns with this design may be its inability to conceal the randomization process and the uncertainty of not knowing the exact sample size in advance. From this methodology, several biostatisticians have developed methods of performing interim analyses in large clinical trials, referred to as group sequential methods [19]. A Bayesian statistical approach offers an alternative methodology. In a Bayesian analysis, previous beliefs about the effectiveness of a therapy are combined with the observed data from the trial to provide a new revised set of likely values for the effectiveness of the therapy. This approach allows for repeated or continuous monitoring of study results as patients accrue. As a result, predetermined sample sizes are not required and enrolment continues until the results meet predetermined significance levels. This can allow for increased trial efficiency as studies will not enrol additional patients unnecessarily or terminate the study prematurely. A Bayesian approach has also been advocated for interim analyses of large clinical trials [20].
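The Bayesian updating described above can be sketched for a binary outcome such as mortality using a beta prior, which combines with observed event counts to give a beta posterior. This is a minimal sketch under stated assumptions: the uniform Beta(1, 1) prior, the interim event counts and the function names are all hypothetical, and the probability that the treatment arm has the lower event rate is approximated by simulation.

```python
import random

def posterior(prior_a, prior_b, events, n):
    """Beta posterior parameters after observing `events` in `n` patients."""
    return prior_a + events, prior_b + (n - events)

def prob_treatment_better(post_treatment, post_control, draws=20000, seed=1):
    """Monte Carlo estimate of P(treatment event rate < control event rate)."""
    rng = random.Random(seed)
    wins = sum(rng.betavariate(*post_treatment) < rng.betavariate(*post_control)
               for _ in range(draws))
    return wins / draws

# Hypothetical interim data: 40/100 deaths in controls, 25/100 with treatment
post_control = posterior(1, 1, events=40, n=100)
post_treatment = posterior(1, 1, events=25, n=100)
p_better = prob_treatment_better(post_treatment, post_control)
print(f"P(treatment lowers mortality) is approximately {p_better:.3f}")
```

The same calculation can be repeated as patients accrue, with enrolment stopping once the posterior probability crosses a threshold fixed in the protocol.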

Another RCT design option particularly amenable to an efficacy evaluation is a two-period crossover study in which patients are used as their own controls. In a two-period crossover trial, patients are randomized to one of two therapies for a fixed period of time and then proceed to receive the other therapy in a second comparable interval (Figure 45.2c). Minimizing ‘between-subject’ variability in this manner makes significant gains in efficiency. Crossover studies are therefore best suited to relatively stable conditions (stability is required during the study), interventions with rapid onset of action and a very short half-life (the biological effect must disappear prior to the second treatment period), and rapidly modifiable endpoints such as haemodynamic and respiratory measures [17]. An example of a crossover trial in transfusion medicine would be the evaluation of a modified red cell product (e.g. bacterially inactivated or pegylated red cells) in patients with transfusion-dependent congenital anaemias. The time to next transfusion or red cell survival could be measured in patients who receive, in a random order, standard red cell transfusions and modified red cell transfusions for fixed periods of time. An appropriate washout time between the two interventions is required to ensure there is no contamination of the modified red cells during the period of standard transfusions.
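The gain in efficiency from using patients as their own controls can be sketched with a paired analysis. The intervals between transfusions below are hypothetical values for five patients under each product. The sketch assumes an adequate washout and ignores period and carry-over effects, which a full crossover analysis would need to address.

```python
import math
from statistics import mean, stdev

# Hypothetical days to next transfusion for the same five patients
standard = [20, 25, 30, 35, 40]
modified = [22, 27, 31, 38, 42]

differences = [m - s for m, s in zip(modified, standard)]
mean_diff = mean(differences)
sd_within = stdev(differences)   # variability of within-patient differences
sd_between = stdev(standard)     # variability between patients

# Paired t statistic: mean difference over its standard error
t_paired = mean_diff / (sd_within / math.sqrt(len(differences)))
print(f"mean difference = {mean_diff}, within SD = {sd_within:.2f}, "
      f"between SD = {sd_between:.2f}, t = {t_paired:.2f}")
```

Because patients differ far more from each other than each patient differs between treatments, a small mean difference is detectable here with very few subjects.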

All designs discussed so far have described the evaluation of interventions for individual patients. However, it is sometimes necessary to evaluate therapies, protocols, guidelines or treatment programmes for groups of individuals. In a cluster randomized design, groups or ‘clusters’ such as ICUs, wards, hospitals and physician practices are randomized to receive an intervention or control (Figure 45.2d). Cluster design may be the most appropriate design for evaluating complex or multidimensional interventions such as the implementation of care paths, educational interventions, transfusion audits or other interventions to change transfusion practice. For these evaluations, the cluster is a more natural method of allocation than the individual [9]. Cluster trials are advantageous when there is a real risk that the intervention will be implemented in all patients rather than only the patients assigned to receive the therapy. When individuals in the control group receive the intervention or elements of the intervention, this contamination biases the results of the study. This may easily occur when one is evaluating guidelines, educational interventions, interventions designed to modify health provider behaviour and administrative changes to systems. However, the allocation of interventions to groups rather than individuals comes at a cost. The sample size is usually larger as a result of the nonindependence within the group and it is often difficult to infer what happened at an individual level [21]. As a result, this design has many detractors. An additional concern in cluster trials is the possibility of large variations between clusters that may make it difficult to detect actual differences between therapies [21].
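The increase in sample size caused by nonindependence within clusters is commonly expressed as a design effect, 1 + (m - 1) × ICC, where m is the average cluster size and ICC is the intracluster correlation coefficient. The sketch below applies this inflation to an individually randomized sample size; the cluster size, ICC and starting sample size are hypothetical, and the function name is illustrative.

```python
import math

def cluster_sample_size(n_individual, cluster_size, icc):
    """Inflate an individually randomized sample size by the design effect."""
    design_effect = 1 + (cluster_size - 1) * icc
    n_total = math.ceil(n_individual * design_effect)
    n_clusters = math.ceil(n_total / cluster_size)
    return design_effect, n_total, n_clusters

# Hypothetical: 400 patients needed if individually randomized,
# wards of 21 patients each, intracluster correlation of 0.05
design_effect, n_total, n_clusters = cluster_sample_size(400, 21, 0.05)
print(design_effect, n_total, n_clusters)
```

Even a modest correlation between patients on the same ward doubles the required sample size in this example, and the effect grows with cluster size.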

In a cluster randomized trial, Murphy et al. [22] randomized wards at different hospitals to receive units of red cells with labels reminding nurses to check the patient and component identification. The randomization by wards was important to ensure that transfusions given without the reminder were not given by nurses who had been previously exposed to the intervention (reminder tags). In this study, the reminder tags did not result in an improvement in the bedside check for transfusion.

Selecting a study population

In transfusion medicine, most blood components are currently used in a wide variety of diseases and conditions. The choice of study population will invariably depend on the study question, the underlying hypothesis and on a number of other factors. The choice of a hypothesis that will address either therapeutic efficacy or effectiveness will have a substantial impact on the selection of the study population [10]. Specifically, in choosing an efficacy approach, investigators usually perform the study in a well-defined patient population where the intervention has the highest probability of demonstrating an effect. This may be done by narrowly defining the patient population through the use of restrictive eligibility criteria and disease definitions, as well as selecting specialized centres with clinical expertise in the field. Choosing a narrowly defined study population will decrease overall variability attributed to patient selection but may have adverse consequences such as hampering patient recruitment and jeopardizing the generalizability of study results. When defining the eligibility criteria for an effectiveness trial, investigators should consider utilizing more liberal criteria in a wide range of clinical settings (e.g. medical or surgical critically ill patients with a broad range of primary diagnoses or underlying conditions from a range of tertiary care centres).

On the spectrum between highly selected patients (efficacy) and a large patient population (effectiveness), investigators should consider a number of factors in making the decision (Table 45.3). The spectrum of biological activity of the intervention is an important consideration. For instance, a narrow spectrum of biological activity should translate into restricted eligibility while a broad spectrum of biological activity should yield more liberal eligibility criteria. Eligibility may also be restricted through the selection of study centres. In efficacy trials, highly specialized units should be sought while studies evaluating the effectiveness of an intervention would require the inclusion of a large number of nonspecialized centres. In practice, investigators should first focus on therapeutic efficacy and study the intervention in high risk and/or well-defined patient populations.

Table 45.3 Considerations in determining which design approach to implement in transfusion trials.

Choice of design

Criteria to consider | Favouring efficacy | Favouring effectiveness
 | Limited evidence | Efficacy well documented
Importance of the question | Rare and less serious | Common and serious problem
 | Not demonstrated | Adequate accrual and confirmed feasibility
 | Unknown or significant consequences | Minimal or acceptable risks given benefits
 | Limited or unknown benefits | Significant benefits anticipated

Table 45.4 Guides to the choice of outcome measure in an RCT.


1. Is the outcome causally related to the consequences of the disease?

2. Is the outcome clinically relevant to the healthcare providers and/or patients?

3. Has the validity of the outcome (for complex outcomes such as scoring systems or composite outcomes) been established?

4. Is the outcome easily and accurately determined?

5. Is the outcome responsive to changes in a patient's condition?

6. Is the outcome measure potentially able to discriminate between patients who benefit from a therapy and patients in the control group?

Selecting outcomes (Table 45.4)

In most clinical trials, the clinical investigative team should consider a number of potential outcomes, both fatal and nonfatal. An outcome is defined as a measurement (e.g. haematocrit) or an event (e.g. death) potentially modified following the implementation of an intervention. If all are given equal consideration, concerns arise about multiple comparisons and interpretation of a study with heterogeneous findings. Thus it is important to choose a primary outcome that will determine an intervention's therapeutic success or failure. Secondary outcomes will provide supportive evidence in secondary analyses and assess potential adverse outcomes. As a corollary, a predefined hierarchy implies that the investigators believe that clinically or statistically important differences in secondary outcomes, in the absence of important changes in the primary outcome, will not be interpreted as strong evidence of therapeutic benefit. The primary outcome is also essential in determining the sample size requirements in a clinical trial. Thus, once a decision has been made to determine either therapeutic efficacy or effectiveness (or possibly a combined approach), the second task facing investigators is determining and ranking outcomes as primary and secondary.
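
The multiple-comparisons concern can be made concrete with a short calculation. The Python sketch below assumes, for simplicity, that the outcomes are statistically independent and each is tested at the same significance level; real trial outcomes are usually correlated, so the numbers are only indicative. The function names are introduced here for illustration.

```python
def familywise_error_rate(alpha, n_outcomes):
    """Chance of at least one false-positive finding across n_outcomes
    independent tests, each performed at significance level alpha."""
    return 1 - (1 - alpha) ** n_outcomes

def bonferroni_threshold(alpha, n_outcomes):
    """Per-outcome significance level that keeps the overall
    false-positive rate at or below alpha (a conservative correction)."""
    return alpha / n_outcomes

one_outcome = familywise_error_rate(0.05, 1)    # about 0.05
five_outcomes = familywise_error_rate(0.05, 5)  # about 0.23
ten_outcomes = familywise_error_rate(0.05, 10)  # about 0.40
```

With ten equally weighted outcomes, the chance of at least one spurious 'significant' result approaches 40% under these assumptions, which illustrates why a single prespecified primary outcome is preferred.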

The choice of study outcome is one of the most important design considerations to be made by investigators, and a number of factors should be weighed before an outcome is selected. The primary outcome should be clinically important and easily ascertained. By fulfilling these two criteria, the investigator will have a much greater chance of influencing clinical practice once a study has been completed and published. Outcomes should also measure what they are supposed to measure (validity) and be precise and reproducible. An outcome must be able to detect a clinically important true positive or negative change in the patient's condition following a therapy.

The sample size in a clinical trial comparing two therapies is based on the baseline event rate, the expected incremental benefit or difference, the level of significance (α) and the power to detect differences (1 – β). Establishing the incremental benefit of a new therapy is vitally important because of the enormous sample size repercussions. A sample size calculation for an RCT requires that the investigators establish the minimum therapeutic effect detectable within the trial. This difference in outcomes between interventions is referred to as the minimally important difference (MID) or minimal clinically important difference (MCID) [10]. In essence, the MID sets the level of discrimination required of the trial, given the baseline event rate and acceptable levels of type I (finding a difference when one does not truly exist) and type II (not finding a difference when one truly exists) errors. Too often, investigators calculate a sample size based on very large and unrealistic expected differences in outcomes. To determine a plausible effect size, investigators should ask themselves the following questions:

·        What difference or incremental benefit can be realistically expected of the experimental therapy? (Anticipated biological effect of therapy.)

·        Is the required number of patients available to participate in the clinical trial? (Feasibility.)

·        How much of a benefit, given the added costs and expected adverse effects of therapy, would be required for clinicians, patients and administrators to adopt a new therapy? (Overall benefit of therapy.)

As a concrete example, assuming that a given study population has an expected mortality rate of 25% in the standard therapy group while the experimental therapy is expected to decrease mortality by an absolute difference of 12.5% (a 50% relative risk reduction), the total number of patients required would be approximately 250. Most therapies used in the ICU would not be expected to decrease mortality so dramatically. More realistic expectations may be in the range of a 5% absolute decrease (a 20% relative risk reduction), which would require a total sample size of approximately 2200 patients if the baseline mortality was 25%. Investigators need to consider whether an absolute incremental benefit in the range of 5–10% is attainable using the experimental therapy. If not, another more discriminating outcome should be sought.
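
The dependence of sample size on the expected difference can be reproduced with the standard normal-approximation formula for comparing two proportions. The Python sketch below assumes a two-sided α of 0.05, 80% power and equal group sizes; the text does not state the settings behind its figures, so these values and the function name are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def total_sample_size(p_control, p_experimental, alpha=0.05, power=0.80):
    """Total patients (two equal groups) needed to compare two proportions.

    Uses the unpooled normal approximation:
    n per group = (z_alpha + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    variance = (p_control * (1 - p_control)
                + p_experimental * (1 - p_experimental))
    delta = p_control - p_experimental
    n_per_group = math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)
    return 2 * n_per_group

# Baseline mortality 25% in both scenarios.
large_effect = total_sample_size(0.25, 0.125)     # 12.5% absolute reduction
realistic_effect = total_sample_size(0.25, 0.20)  # 5% absolute reduction
```

Under these assumed settings the calculation gives roughly 300 patients for the 12.5% absolute difference and roughly 2200 for the 5% difference, the same order of magnitude as the approximate figures quoted above. The exact numbers shift with the chosen α, power and formula, but the pattern holds: the required sample size rises steeply as the expected difference shrinks.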

Frequently, the treatment effect or difference in the desired outcome is small. As a result, a surrogate or composite outcome may be chosen as the primary outcome for a trial to reduce the sample size. A surrogate outcome is defined as a laboratory or physical measure that accurately reflects a clinically meaningful outcome and, therefore, can act as a substitute outcome with the goal of reducing the sample size [23,24]. A composite outcome combines more than one individual outcome. The latter may increase statistical efficiency and can combine multiple endpoints that are equally important [24]. Both approaches must be used judiciously and results interpreted with caution [24,25]. Surrogate endpoints should clearly predict the clinical outcome, which may not be the case (e.g. corrected count increment and bleeding in platelet transfusion trials) [23,25]. Composite outcomes must be related, equally important, biologically plausible and clinically relevant [23,24].


In this chapter on interventional studies in transfusion medicine, several major observational and interventional design characteristics have been discussed. Study design issues of special interest to health professionals interested in transfusion medicine have been outlined. Suggestions when planning an RCT in transfusion medicine are provided in Table 45.5. Observational and quasi-experimental studies may provide invaluable information in transfusion medicine. Although RCTs provide the most unbiased and accurate assessment of the efficacy and effectiveness of therapeutic and preventive interventions, they remain challenging and expensive to conduct. As more research groups form to address unanswered therapeutic questions in transfusion medicine, investigators will invariably better understand the strengths and limitations of different RCT design characteristics.

Table 45.5 Suggestions when planning an RCT in transfusion medicine.

1. Explicitly determine whether you are primarily interested in establishing therapeutic efficacy or effectiveness.

2. Whenever possible, undertake an RCT as part of a broader research programme.

3. If the study intervention is complex (or risky) or if other aspects of study feasibility are questionable, a pilot study should be considered.

4. Whenever possible, investigators should use simple rather than complex designs (two-group parallel design versus factorial design).

5. The study population should be tailored to the intervention.

6. Ideally, the study intervention and treatment protocols should not aim to substantially modify or affect usual clinical practice.

7. Given the complexity of RCTs, data collection should aim to clearly describe the study population, describe cointerventions and all major study outcomes.

8. In choosing primary study endpoints investigators should focus on patient-oriented outcomes rather than surrogate or biological markers.

9. If you are planning a seminal RCT, you may only have one chance to get it right. When making compromises, always opt to answer questions that most clinicians consider most important.

10. In establishing the minimally important difference, select a potentially achievable benefit.

Key points

1. Properly conducted RCTs are the best means to evaluate the risks and benefits of therapeutic interventions.

2. Observational studies can be useful when RCTs are not feasible: case–control studies are particularly useful to evaluate rare outcomes and cohort studies can examine outcomes following known exposures, risk factors or therapies; however, all observational studies are prone to biases and cannot show causation.

3. The design of an RCT depends on whether the investigators wish to evaluate the efficacy or the effectiveness of an intervention.

4. A two-group parallel group design is the simplest RCT to design, execute and evaluate, but alternative designs can be useful in specific circumstances.

5. Selecting the appropriate study population and outcomes is critical to ensure both the feasibility of completing the RCT and the generalizability and clinical relevance of the study results.


The authors wish to thank our students, teachers and colleagues who contributed many of the ideas outlined in this manuscript.


1. Hill AB. The clinical trial. N Engl J Med 1952; 247(4): 113–119.

2. Kelsey RA, Whittemore AS, Evans AS & Thompson WD. Case Control Studies: I. Planning and Execution. Oxford: Oxford University Press; 1996, pp. 188–213.

3. Silliman CC, Boshkov LK, Mehdizadehkashi Z, Elzi DJ, Dickey WO, Podlosky L et al. Transfusion-related acute lung injury: epidemiology and a prospective analysis of etiologic factors. Blood 2003; 101(2): 454–462.

4. Wilson K, Code C & Ricketts MN. Risk of acquiring Creutzfeldt–Jakob disease from blood transfusions: systematic review of case–control studies. Br Med J 2000; 321(7252): 17–19.

5. Carson JL, Duff A, Berlin JA, Lawrence VA, Poses RM, Huber EC et al. Perioperative blood transfusion and postoperative mortality. J Am Med Assoc 1998; 279(3): 199–205.

6. Grimes DA & Schulz KF. Bias and causal associations in observational research. Lancet 2002; 359(9302): 248–252.

7. van de Watering L, for the Biomedical Excellence for Safer Transfusion (BEST) Collaborative. Pitfalls in the current published observational literature on the effects of red blood cell storage. Transfusion 2011; 51(8): 1847–1854.

8. Koch CG, Li L, Sessler DI, Figueroa P, Hoeltge GA, Mihaljevic T et al. Duration of red-cell storage and complications after cardiac surgery. N Engl J Med 2008; 358(12): 1229–1239.

9. Grimshaw J, Campbell M, Eccles M & Steen N. Experimental and quasi-experimental designs for evaluating guideline implementation strategies. Fam Pract 2000; 17 (Suppl. 1): S11–S16.

10. Sackett D. The principles behind the tactic of performing clinical trials. In: RB Haynes, D Sackett, G Guyatt & P Tugwell (eds). Clinical Epidemiology: How to Do Clinical Practice Research. Lippincott Williams and Wilkins; 2009, pp. 173–243.

11. Walker ID, Walker JJ, Colvin BT, Letsky EA, Rivers R & Stevens R. Investigation and management of haemorrhagic disorders in pregnancy. Haemostasis and Thrombosis Task Force. J Clin Pathol 1994; 47(2): 100–108.

12. Randomised trial of intravenous atenolol among 16 027 cases of suspected acute myocardial infarction: ISIS-1. First International Study of Infarct Survival Collaborative Group. Lancet 1986; 2(8498): 57–66.

13. Finfer S, Bellomo R, Boyce N, French J, Myburgh J & Norton R. A comparison of albumin and saline for fluid resuscitation in the intensive care unit. N Engl J Med 2004; 350(22): 2247–2256.

14. Hebert PC, Wells G, Blajchman MA, Marshall J, Martin C, Pagliarello G et al. A multicenter, randomized, controlled clinical trial of transfusion requirements in critical care. Transfusion Requirements in Critical Care Investigators, Canadian Critical Care Trials Group. N Engl J Med 1999; 340(6): 409–417.

15. Slichter SJ, Kaufman RM, Assmann SF, McCullough J, Triulzi DJ, Strauss RG et al. Dose of prophylactic platelet transfusions and prevention of hemorrhage. N Engl J Med 2010; 362(7): 600–613.

16. Fergusson DA, Hebert PC, Mazer CD, Fremes S, MacAdams C, Murkin JM et al. A comparison of aprotinin and lysine analogues in high-risk cardiac surgery. N Engl J Med 2008; 358(22): 2319–2331.

17. Friedman LM, Furberg CD & DeMets DL. Fundamentals of Clinical Trials, 3rd edn. New York: Springer-Verlag; 1998.

18. ISIS-3: a randomised comparison of streptokinase vs tissue plasminogen activator vs anistreplase and of aspirin plus heparin vs aspirin alone among 41,299 cases of suspected acute myocardial infarction. ISIS-3 (Third International Study of Infarct Survival) Collaborative Group. Lancet 1992; 339(8796): 753–770.

19. Spiegelhalter DJ, Myles JP, Jones DR & Abrams KR. Bayesian methods in health technology assessment: a review. Health Technol Assess 2000; 4(38): 1–130.

20. Freedman LS, Spiegelhalter DJ & Parmar MK. The what, why and how of Bayesian clinical trials monitoring. Stat Med 1994; 13(13–14): 1371–1383.

21. Donner A & Klar N. Design and Analysis of Cluster Randomization Trials in Health Research. London: Arnold; 2000.

22. Murphy MF, Casbard AC, Ballard S, Shulman IA, Heddle N, AuBuchon JP et al. Prevention of bedside errors in transfusion medicine (PROBE-TM) study: a cluster-randomized, matched-paired clinical areas trial of a simple intervention to reduce errors in the pretransfusion bedside check. Transfusion 2007; 47(5): 771–780.

23. Heddle NM, Arnold DM & Webert KE. Time to rethink clinically important outcomes in platelet transfusion trials. Transfusion 2011; 51(2): 430–434.

24. Heddle NM & Cook RJ. Composite outcomes in clinical trials: what are they and when should they be used? Transfusion 2011; 51(1): 11–13.

25. Arnold DM & Lim W. The use and abuse of surrogate endpoints in clinical research in transfusion medicine. Transfusion 2008; 48(8): 1547–1549.

26. Tay J & Tinmouth A. Observational studies: what is a cohort study? Transfusion 2007; 47(7): 1115–1117.

27. Tinmouth A & Hébert P. Interventional trials: an overview of design alternatives. Transfusion 2007; 47(4): 565–567.

Further reading

Campbell DT & Stanley JC. Experimental and Quasi-experimental Designs for Research. Chicago, IL: Rand McNally College Publishing Company; 1966.

Carson JL, Duff A, Poses RM et al. Effects of anaemia and cardiovascular disease on surgical mortality and morbidity. Lancet 1996; 348: 1055–1060.

Friedman LM, Furberg CD & DeMets DL. Fundamentals of Clinical Trials, 3rd edn. St Louis, MO: Mosby Year Book; 1996.

Grimes DA & Schulz KF. Bias and causal associations in observational research. Lancet 2002; 359(9302): 248–252.

Guyatt GH, Sackett DL & Cook DJ. Users' guides to the medical literature II. How to use an article about therapy or prevention. Are the results of the study valid? J Am Med Assoc 1993; 270: 2598–2601.

Haynes RB, Sackett DL, Guyatt GH & Tugwell P. Clinical Epidemiology: A Basic Science for Clinical Medicine, 3rd edn. Philadelphia, PA: Lippincott Williams and Wilkins, 2006.

Hébert PC, Wells G, Blajchman MA et al. and the Transfusion Requirements in Critical Care investigators for the Canadian Critical Care Trials Group. Transfusion Requirements in Critical Care: a multicentre randomized controlled clinical trial. N Engl J Med 1999; 340: 409–417.

Sackett DL. Bias in analytic research. J Chron Dis 1979; 32: 51–63.

Sackett DL. The competing objectives of randomized trials. N Engl J Med 1980; 303: 1059–1060.

Sackett DL & Gent M. Controversy in counting and attributing events in clinical trials. N Engl J Med 1979; 301: 1410–1412.

The SAFE Study Investigators. A comparison of albumin and saline for fluid resuscitation. N Engl J Med 2004; 350: 2247–2256.