Maureen G. Phipps
Daniel W. Cramer
• Clinical research includes a range of research disciplines and approaches, including patient-oriented research, clinical trials, epidemiology, and outcomes research.
• Patient-oriented research centers on understanding the mechanisms of human disease and on studies of therapies or interventions for disease.
• Clinical trials use a controlled experimental design to assess the effectiveness of an intervention on an outcome.
• Epidemiology is the study of the distribution and determinants of health and disease in specific populations.
• Outcomes research and health services research include studies that seek to identify the most effective and efficient interventions, treatments, and services for patient care.
• Study designs include experimental studies (clinical trials), observational studies (cohort studies, case-control studies, and cross-sectional studies), and descriptive studies (case reports and case series).
• Scientific validity of a research study is evaluated by understanding the study question, how the study was designed, and whether chance, bias, or confounding could have accounted for the findings.
Study Designs
Medical practice is evolving to include complex options for patient treatment and preventive care, in part because clinical research methods and techniques to guide patient care have advanced. To evaluate whether new treatments and diagnostic approaches should be integrated into clinical practice, or to decide whether observational data reported in the literature are relevant, clinicians should understand the fundamental strengths and limitations of clinical research methods and the level of evidence different types of studies provide.
As outlined by the National Institute of Child Health and Human Development, clinical research includes patient-oriented research involving understanding mechanisms of human disease, studies of therapies or interventions for disease, clinical trials, and technology development.
Epidemiologic methods and behavioral research are used in clinical research to examine the distribution of disease, the factors that affect health, and how people make health-related decisions. Outcomes research and health services research include studies that seek to identify the most effective and efficient interventions, treatments, and services for patient care (1).
The purpose of a research study is to test a hypothesis about, and to measure, an association between an exposure (or treatment) and disease occurrence (or prevention). The type of study design influences the way the study results should be interpreted.
Analytic studies are often subdivided into experimental studies (clinical trials) and observational studies (cohort studies, case-control studies, and cross-sectional studies).
Descriptive studies (case reports and case series) often provide useful information for informing future analytic studies.
The common types of clinical research study methods, considerations for the strength of evidence for the specific study design, and interpretation of the results are presented. Although there is debate about which system should be used for evaluating the strength of evidence from an individual study, a well-designed and executed clinical trial presents the highest level of evidence (2). Other types of studies should be designed to best approach the strengths of a clinical trial.
Clinical Trials
Clinical trials are intervention studies in which the assignment to the treatment or control condition is controlled by the investigator and the outcomes to be measured are clearly defined at the time the trial is designed. Features of randomized clinical trials include randomization (in which participants are randomly assigned to exposures), unbiased assessment of outcome, and analysis of all participants based on the assigned exposure (an "intention to treat" analysis).
There are many different types of clinical trials, including studies designed to evaluate treatments, prevention techniques, community interventions, quality-of-life improvements, and diagnostic or screening approaches (3). Since 2007, investigators conducting randomized clinical trials have been expected to register the trial to comply with mandatory registration and results-reporting requirements (4).
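To make the randomization step concrete, the following minimal Python sketch implements permuted-block randomization, one common scheme for keeping the arms balanced as enrollment proceeds. The function name, block size, and participant count are illustrative assumptions, not taken from this chapter.

```python
import random

def permuted_block_randomization(n_participants, block_size=4, seed=2024):
    """Assign participants to 'treatment' or 'control' in permuted blocks.

    Within each block exactly half the slots go to each arm, so group
    sizes stay balanced throughout enrollment.
    """
    assert block_size % 2 == 0, "block size must be even for 1:1 allocation"
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        block = ["treatment"] * (block_size // 2) + ["control"] * (block_size // 2)
        rng.shuffle(block)  # random order within the block
        assignments.extend(block)
    return assignments[:n_participants]

print(permuted_block_randomization(10))
```

In practice the allocation sequence is generated and concealed by personnel independent of those enrolling participants, which preserves the unbiased assignment described above.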
Clinical Trial Phases
New investigational drugs or treatments are usually evaluated in clinical trials conducted in phases, with progressively larger numbers of participants involved as the purpose of the study broadens (3).
Phase I Trials In these trials, researchers test an experimental drug or treatment for the first time in a small group of people (20–80) to evaluate its safety, determine a safe dosage range, and identify side effects.
Phase II Trials In these, the experimental study drug or treatment is given to a larger group of people (100–300) to see whether it is effective and to further evaluate its safety.
Phase III Trials In phase III trials, the experimental study drug or treatment is given to large groups of people (1,000–3,000) to confirm its effectiveness, monitor side effects, compare it to commonly used treatments, and collect information that will allow the experimental drug or treatment to be used safely.
Phase IV Trials These are postmarketing studies that delineate additional information, including the drug’s risks, benefits, and optimal use.
Randomized Controlled Double-Blinded Clinical Trial
The randomized controlled double-blinded clinical trial is considered the gold standard for evaluating interventions because randomizing treatment assignment and blinding both the participant and the investigator are the cornerstones for minimizing bias. When studies are not randomized or blinded, bias may result from preferential assignment of treatment based on patient characteristics or an unintentional imbalance in baseline characteristics between treatment groups, leading to confounding.
Although not all studies can be designed with blinding, the efforts used in the trial to minimize bias from nonblinding should be explained. Investigators are expected to provide evidence that the factors that might influence outcome, such as age, stage of disease, medical history, and symptoms, are similar in patients assigned to the study protocol compared with patients assigned to placebo or traditional treatment. Published reports from the clinical trial are expected to include a table showing a comparison of the treatment groups with respect to potential confounders and to demonstrate that the groups did not differ in any important ways before the study began.
CONSORT Checklist
Clearly defining the outcome or criteria for successful treatment helps ensure unbiased assessment of the outcome. A well-designed clinical trial has a sufficient number of subjects enrolled to ensure that a "negative" study (one that does not show an effect of the treatment) has enough statistical power to evaluate the predetermined (a priori), expected treatment effect. The Consolidated Standards of Reporting Trials (CONSORT) Statement is an evidence-based, minimum set of recommendations for reporting on randomized controlled trials developed by the CONSORT Group to alleviate the problems arising from inadequate reporting of randomized controlled trials. The 25-item CONSORT checklist (Table 4.1) and flow diagram (Fig. 4.1) offer a standard way for authors to prepare reports of trial findings, facilitating their complete and transparent reporting and aiding their critical appraisal and interpretation (5).
Figure 4.1 CONSORT flow diagram.
Table 4.1 CONSORT checklist.
Clinical Trial Design Considerations
Clinical trials are considered a gold standard because, when done well, they provide information about both relative and absolute risks and minimize concerns about bias and confounding (see the section on Presenting and Understanding the Results of Analytic Studies). Many clinical research questions are not amenable to clinical trials because of cost constraints, the length of time required to complete the study, and the feasibility of recruitment and implementation.
When evaluating the results from a clinical trial, consider how restrictive inclusion and exclusion criteria may narrow the participant population to such a degree that there may be concerns about external validity or generalizing the results. Other concerns include blinding, loss to follow-up, and clearly defining the outcome of interest. When the results of a randomized controlled trial do not show a significant effect of the treatment or intervention, the methods should be evaluated to understand what assumptions (expected power and effect size) were made to determine the necessary sample size for the study.
Intention-to-Treat Analysis
Randomized controlled trials should be evaluated with an intention-to-treat analysis, which means that all of the people randomized at the initiation of the trial should be accounted for in the analysis with the group to which they were assigned. Even if a participant stopped the assigned treatment or "crossed over" to another treatment during the study, they should be analyzed with the group to which they were initially assigned, unless such crossover is part of the overall study design. All of these considerations help to minimize bias in the design, implementation, and interpretation of a clinical trial (6).
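As a minimal sketch of what intention-to-treat means in practice, the example below groups participants by the arm they were assigned to rather than the treatment they actually received. The records and field names are hypothetical.

```python
# Hypothetical trial records: assigned arm, treatment actually received,
# and a binary outcome (1 = event occurred).
records = [
    {"assigned": "treatment", "received": "treatment", "event": 0},
    {"assigned": "treatment", "received": "control",   "event": 1},  # crossover
    {"assigned": "control",   "received": "control",   "event": 1},
    {"assigned": "control",   "received": "control",   "event": 0},
]

def event_rates(records, group_key):
    """Event rate per group, grouping records by the chosen key."""
    totals, events = {}, {}
    for r in records:
        g = r[group_key]
        totals[g] = totals.get(g, 0) + 1
        events[g] = events.get(g, 0) + r["event"]
    return {g: events[g] / totals[g] for g in totals}

# Intention-to-treat: group by ASSIGNED arm, regardless of crossover.
print(event_rates(records, "assigned"))
# An as-treated analysis would group by "received" instead, which can
# reintroduce the selection effects that randomization was meant to remove.
print(event_rates(records, "received"))
```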
Observational Studies
When the exposure and outcome are not amenable to an experimental design, for example because the exposure is known or suspected to have harmful effects, observational studies may be used to assess an association. Observational studies, including cohort, case-control, and cross-sectional studies, are analytic studies that take advantage of "natural experiments" in which exposure is not assigned by the investigator; rather, individuals are assessed by the investigator for a potential exposure of interest (present or absent) and outcomes (present or absent). The timing of the evaluation of the exposure and outcome defines the study type.
Cohort Studies
Cohort studies often are referred to as longitudinal studies. Cohort studies involve identifying a group of exposed individuals and unexposed individuals and following both groups over time to compare the rate of disease (or outcome) in the groups. Cohort studies may be prospective, meaning that the exposure is identified before outcome, or retrospective, in which the exposure and outcome have already occurred when the study is initiated. Even in a retrospective cohort study, the study is defined by the fact that the cohorts were identified based on the exposure (not the outcome), and individuals should be free of disease (outcome) at the beginning time point for the cohort study (Fig. 4.2).
Figure 4.2 Schematic of prospective and retrospective cohort study designs.
In a study that includes a survival analysis, the two cohort groups (exposed and nonexposed) begin with a population that is 100% well (or alive) at the beginning of the study. The groups are followed over time to calculate the percentage of the cohort still well (or alive) at different time points during the study and at the end of the study. Although a survival analysis typically describes mortality after disease (e.g., the proportion of cancer patients who died within 5 years), it can be adapted to other events and outcomes (e.g., the percentage of women who become pregnant while using long-acting contraceptives).
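For illustration, a survival curve of this kind is conventionally computed with the Kaplan-Meier product-limit estimator. The short Python sketch below implements it directly; the follow-up times and censoring flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) survival estimate.

    times:  follow-up time for each participant
    events: 1 if the outcome occurred at that time, 0 if censored
    Returns (time, estimated survival probability) pairs at event times.
    """
    survival, s = [], 1.0
    for t in sorted(set(times)):
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        n = sum(1 for ti in times if ti >= t)  # number still at risk at t
        if d > 0:
            s *= 1 - d / n
            survival.append((t, s))
    return survival

# Hypothetical cohort: follow-up in months; event = 0 marks censoring.
times  = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 1, 0, 1, 0]
print(kaplan_meier(times, events))
```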
Cohort Study Design
Strengths of cohort studies include the ability to obtain both attributable and relative risks because the occurrence of the outcome is being compared in two groups (see the section on Presenting and Understanding the Results of Analytic Studies). However, only associations can be established, not causality. Because randomization is not part of the study design, the investigator must consider that a factor associated with the exposure, rather than the exposure itself, may lead to the outcome. Misclassification of the exposure or the outcome and confounding are potential sources of bias in cohort studies.
Given that truly prospective cohort studies can be expensive and take a long time for completion, there should be compelling evidence for the public health importance of the exposure(s) and association(s) being addressed. Issues related to sample size and participant retention in the study protocol are as important in cohort studies as they are in randomized controlled trials.
Case-Control Studies
A case-control study starts with the identification of individuals with a disease or outcome of interest and a suitable control population without the disease or outcome of interest. The controls should represent a sample of the population from which the cases arose and who were at risk for the disease or outcome but did not develop it. The relationship of a particular attribute or exposure to the disease is studied retrospectively by comparing how the cases and controls differed in that exposure (Fig. 4.3).
Figure 4.3 Schematic of case-control study design.
Odds Ratio
The measure of association for a case-control study is the odds ratio, which is the ratio of exposed cases to unexposed cases, divided by the ratio of exposed controls to unexposed controls (see the section on Presenting and Understanding the Results of Analytic Studies). If an entire population could be characterized by its exposure and disease status, the exposure odds ratio would be identical to the relative risk obtainable from a cohort study of the same population. Although the relative risk (RR) cannot be calculated directly from a case-control study, the odds ratio can be used as an estimate of the relative risk when the samples of cases and controls are representative of all people with or without the disease and when the disease being studied is uncommon. Attributable risk is not directly obtainable in a case-control study.
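To make the calculation concrete, the sketch below computes an odds ratio from a hypothetical 2 × 2 table; the counts are invented for illustration.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """OR = (odds of exposure among cases) / (odds of exposure among controls)."""
    return (exposed_cases / unexposed_cases) / (exposed_controls / unexposed_controls)

# Hypothetical counts: 40 of 100 cases exposed vs. 20 of 100 controls.
print(odds_ratio(40, 60, 20, 80))  # (40/60) / (20/80) ~= 2.67
```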
Case-Control Study Considerations
The advantages of case-control studies are that they are generally lower in cost and easier to conduct than other analytic studies. Case-control studies are most feasible for examining the association between a relatively common exposure and a relatively rare disease. Disadvantages include greater potential for selection bias, recall bias, and misclassification bias.
Case-control studies may be especially prone to selection bias and recall bias. Investigators need to understand sampling issues around which cases and controls were selected for their study and how these may have affected exposure rates. Subtle issues, such as interviewer technique, may affect the likelihood that cases may recall or report exposures more readily than controls.
Cross-Sectional Studies
Cross-sectional studies assess both the exposure and the outcome at the same point in time. Individuals are surveyed to provide a "snapshot" of health events in the population at a particular time. Cross-sectional studies are often called prevalence studies because the disease exists at the time of the study, and the longitudinal follow-up and disease duration are not known. Prevalence is the existing number of cases at a specific point in time.
Cross-sectional studies are often done to evaluate a diagnostic test. The value of the test (predictor) is compared with the outcome (disease). The results of these evaluations are often presented as sensitivity and specificity. The sensitivity and specificity represent the characteristics of a given diagnostic test and do not vary by population characteristics. In contrast, the negative and positive predictive values of a test do vary with the baseline characteristics of a population, such as the prevalence of a disease (Fig. 4.4).
Figure 4.4 Comparison of sensitivity, specificity, and predictive values when the prevalence of the disease varies.
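The sketch below illustrates the point of Figure 4.4 with hypothetical numbers: holding sensitivity and specificity fixed, the predictive values shift markedly as disease prevalence changes.

```python
def predictive_values(sens, spec, prevalence, n=100_000):
    """Derive PPV and NPV for a test with fixed sensitivity/specificity
    applied to a population with the given disease prevalence."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = sens * diseased    # true positives
    fn = diseased - tp      # false negatives
    tn = spec * healthy     # true negatives
    fp = healthy - tn       # false positives
    return tp / (tp + fp), tn / (tn + fn)  # (PPV, NPV)

# Same hypothetical test (90% sensitive, 95% specific) at two prevalences:
for prev in (0.20, 0.01):
    ppv, npv = predictive_values(0.90, 0.95, prev)
    print(f"prevalence {prev:.0%}: PPV {ppv:.2f}, NPV {npv:.3f}")
# At 20% prevalence the PPV is about 0.82; at 1% it falls to about 0.15.
```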
Cross-Sectional Study Considerations
Although cross-sectional studies are primarily descriptive, they may contribute information about suggested risk factors for disease by showing how that disease varies by age, sex, race, or geography. In ecologic studies, disease rates in various populations are correlated with other population characteristics (e.g., endometrial cancer rates worldwide are positively correlated with per capita fat consumption and negatively correlated with cereal and grain consumption) (7).
Caution must be used in interpreting findings from a cross-sectional study because the temporal relationship between the exposure and the outcome cannot be established; therefore, causality cannot be inferred. However, cross-sectional data can be valuable in informing analytic study designs or as supporting data for documenting the consistency of an association.
Descriptive Studies
Descriptive studies (case reports and case series) do not include comparison groups.
Case Reports and Case Series
In a case report or case series, the characteristics of individuals who have a particular disease are described. A case report usually describes an unusual clinical scenario or procedure in a single patient, whereas a case series usually includes a larger group of patients with similar exposures or outcomes. Just because members of a case series share a particular characteristic, it cannot be assumed that there is a cause-and-effect relationship.
Hypotheses about exposures and disease may be developed from descriptive studies that should then be explored in analytic studies. Because a case series has no comparison group, statistical tests of association between the exposure and outcome cannot be performed. A case series usually does not yield any measure of association other than estimates of the frequency of a particular characteristic among members included in the case series.
Presenting and Understanding the Results of Analytic Studies
To present the results of clinical trials or observational studies, a variety of rates and measures may be derived, as summarized below. To judge the scientific validity of the results of clinical studies, an investigator needs to consider whether the finding could have occurred simply by chance (evaluated with appropriate statistical testing) and whether there are other possible explanations for the reported association, including bias or confounding. Besides statistical significance and freedom from bias or confounding, several additional criteria can be applied to judge whether the treatment truly affected the disease outcome or whether an exposure truly caused the disease, as outlined below.
Rates and Measures
The terminology associated with rates and measures includes the following (Fig. 4.5); a worked numeric example follows the list:
Figure 4.5 Calculating rates and measures.
• Incidence (IR)—frequency of newly identified disease or event (outcome).
• Prevalence (PR)—frequency of an existing disease or outcome during a specified period or point in time.
• Odds Ratios (OR)—ratio of the probability of an exposure in one group (cases) compared with the probability of the exposure in another group (controls).
• Relative Risk (RR)—ratio of risk in the exposed group compared with the risk in the unexposed group. If the RR = 1 (or not significantly different from 1) then the risk in the exposed group is equal to the risk in the nonexposed group. RR >1 may suggest a positive association with the exposed group having greater risk than the nonexposed group, whereas a RR <1 implies a negative association with the exposed group having less risk than the nonexposed group.
• Absolute Risk Reduction (ARR)—the difference in risk between the unexposed (control) group and the exposed (treatment) group.
• Relative Risk Reduction (RRR)—the proportional reduction in risk in the exposed (treatment) group compared with the unexposed (control) group (the ARR divided by the risk in the control group).
• Number Needed to Treat (NNT)—represents the number of people who would need treatment (or the intervention) to prevent one additional outcome (to calculate the NNT, take the inverse of the ARR, i.e., 1 ÷ ARR).
• Sensitivity—among the people who have the outcome, this is the proportion who have a positive test.
• Specificity—among the people who do not have the outcome, this is the proportion who have a negative test.
• Negative Predictive Value (NPV)—among the people who have a negative test, this is the proportion who do not have the outcome.
• Positive Predictive Value (PPV)—among the people who have a positive test, this is the proportion who have the outcome.
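As a worked numeric example, the risk measures above can be computed directly from event counts in two groups; the trial counts below are hypothetical.

```python
def risk_measures(events_treated, n_treated, events_control, n_control):
    """Risk measures from a two-group comparison (e.g., a trial)."""
    risk_t = events_treated / n_treated
    risk_c = events_control / n_control
    rr  = risk_t / risk_c    # relative risk
    arr = risk_c - risk_t    # absolute risk reduction
    rrr = arr / risk_c       # relative risk reduction
    nnt = 1 / arr            # number needed to treat
    return rr, arr, rrr, nnt

# Hypothetical trial: 10/200 events on treatment vs. 20/200 on control.
rr, arr, rrr, nnt = risk_measures(10, 200, 20, 200)
print(f"RR {rr:.2f}, ARR {arr:.3f}, RRR {rrr:.0%}, NNT {nnt:.0f}")
# RR 0.50, ARR 0.050, RRR 50%, NNT 20
```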
Statistical Testing
Statistical testing is used in clinical research for hypothesis testing, in which the investigator evaluates the study results against the null hypothesis (that there is no difference between the groups). Results from statistical testing allow the investigator to evaluate how likely it is that the study result is caused by chance rather than an intervention or exposure (the p value). When a study fails to find a significant difference, it is equally important to describe the likelihood that the study conclusion was wrong and that a difference truly exists. Finally, it is important to provide as precise a measure of the treatment effect or association as possible and to convey to the reader the plausible range in which the "true" effect resides (the confidence interval).
P Value and Statistical Significance
The p value is a reflection of the probability of a type I error (alpha). This reflects the probability that a difference between study groups could have arisen by chance alone. In other words, it is the probability of concluding that there is a difference between therapies, interventions, or observed groups when a true difference does not exist.
Historically in the medical literature, a p value of less than or equal to 0.05 was used to determine statistical significance. This reflects a probability of 1 in 20 of rejecting the null hypothesis by chance alone based on the results from the study sample. This p value may be adjusted downward if multiple associations are being tested and the chances of false discovery are high. In genome-wide association studies, in which hundreds of thousands of genetic variants are tested between groups, p values are frequently set at 10⁻⁷ (0.0000001).
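For illustration, a p value for a difference in proportions between two groups is commonly obtained with a chi-square test; the sketch below uses SciPy and hypothetical counts.

```python
from scipy.stats import chi2_contingency

# Hypothetical 2 x 2 table: rows = treatment/control, columns = event/no event.
table = [[10, 190],
         [25, 175]]
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
# With the conventional alpha of 0.05, p < 0.05 would lead us to reject the
# null hypothesis of no difference between the groups.
```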
Beta Error and Power
Type II (or beta) error reflects the probability of failing to reject the null hypothesis when in reality it is incorrect (i.e., there truly was a treatment effect or a difference between the observed groups). In clinical trials it is important for the investigator to address the beta error at the design stage of the study. Study planners should calculate the power (or 1 − beta error) that their study would have to detect an association, given assumptions made about the differences expected between treatments, and design the study size accordingly. Small clinical trials may be cited as evidence for "no effect of therapy," even though statistical power may not have been addressed at all or was well below the 80% level generally set as adequate justification for a selected sample size.
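As a sketch of such a planning calculation, the required sample size per group for comparing two proportions can be estimated with the standard normal approximation; the event rates below are hypothetical.

```python
from math import sqrt
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided comparison of
    two proportions (normal approximation)."""
    z_a = norm.ppf(1 - alpha / 2)  # critical value for the type I error
    z_b = norm.ppf(power)          # corresponds to 1 - beta
    p_bar = (p1 + p2) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return numerator / (p1 - p2) ** 2

# Hypothetical planning target: detect a drop in event rate from 20% to 10%
# with 80% power at alpha = 0.05.
print(round(n_per_group(0.20, 0.10)))  # roughly 200 participants per group
```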
Confidence Intervals
Confidence intervals (CI) provide the investigator with an estimated range in which the true statistical measure (e.g., mean, proportion, or relative risk) is expected to lie. A 95% confidence interval implies that if the study were repeated numerous times in samples from the same population, the confidence interval estimates would contain the true population parameter 95% of the time. In other words, the probability of observing the true value outside of this range is less than 0.05. When evaluating measures of association, such as an odds ratio or relative risk with a 95% CI, intervals that include 1 (no difference) are not considered statistically significant.
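For illustration, a 95% CI for a relative risk is conventionally computed on the log scale and back-transformed; the 2 × 2 counts below are hypothetical.

```python
from math import exp, log, sqrt

def rr_confidence_interval(a, b, c, d, z=1.96):
    """RR and its 95% CI from a 2 x 2 table:
       exposed:   a events out of (a + b)
       unexposed: c events out of (c + d)"""
    rr = (a / (a + b)) / (c / (c + d))
    se = sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))  # SE of log(RR)
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)

rr, lo, hi = rr_confidence_interval(10, 190, 20, 180)
print(f"RR {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
# Here the interval spans 1, so this hypothetical association would not be
# considered statistically significant.
```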
Meta-analysis
One way of improving the precision of the effect measure and narrowing the confidence interval is to perform a meta-analysis, in which treatment effects from several clinical trials are aggregated to provide a summary measure. Meta-analysis is a favorite tool of the Cochrane Library, with which clinicians should be familiar (8). However, there are important considerations in interpreting a meta-analysis, including whether the studies were similar enough in their design to be aggregated. Guides for systematic reviews and meta-analyses that involve randomized controlled trials (i.e., the Preferred Reporting Items for Systematic Reviews and Meta-Analyses [PRISMA] statement) and observational studies (i.e., the Meta-analysis Of Observational Studies in Epidemiology [MOOSE] guidelines) are excellent resources for both the investigator and the reviewer (9,10).
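As a minimal sketch of the underlying arithmetic, a fixed-effect inverse-variance meta-analysis weights each study's log relative risk by the inverse of its squared standard error; the study estimates below are hypothetical.

```python
from math import exp, log, sqrt

def fixed_effect_meta(log_rrs, ses):
    """Inverse-variance fixed-effect pooling of log relative risks.
    Each study is weighted by 1/SE^2, so more precise studies count more."""
    weights = [1 / se**2 for se in ses]
    pooled = sum(w * lr for w, lr in zip(weights, log_rrs)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return (exp(pooled),
            exp(pooled - 1.96 * pooled_se),
            exp(pooled + 1.96 * pooled_se))

# Three hypothetical trials reporting log(RR) and its standard error:
log_rrs = [log(0.80), log(0.70), log(0.95)]
ses = [0.20, 0.25, 0.15]
rr, lo, hi = fixed_effect_meta(log_rrs, ses)
print(f"pooled RR {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

This fixed-effect sketch assumes the studies estimate a common effect; random-effects methods relax that assumption when between-study heterogeneity is present.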
Bias
Bias is a systematic error in the design, conduct, or analysis of a study that can result in invalid conclusions. It is important for an investigator to anticipate the types of bias that might occur in a study and correct them during the design of the study, because it may be difficult or impossible to correct for them in the analysis.
• Information bias occurs when participants are classified incorrectly with respect to exposure or disease. This may occur if records are incomplete or if the criteria for exposure or outcome were poorly defined, leading to misclassification.
• Recall bias is a specific type of information bias that may occur if cases are more likely than controls to remember or to reveal past exposures. In addition to establishing well-defined study criteria and accessing complete records, information bias may be reduced by blinding interviewers to a participant’s study group.
• Selection bias may occur when choosing cases or controls in a case-control study and when choosing exposed or unexposed subjects in a cohort study. A systematic error in selecting participants may influence the outcome by distorting the measure of association between the exposure and the outcome. Including an adequately large study sample and obtaining information about nonparticipants may reduce bias or provide information to evaluate potential selection bias.
Confounding
A confounder is a known risk factor for the disease and is associated with the exposure. The confounder may account for the apparent effect of the exposure on the disease or mask a true association. Confounders have unequal distributions between the study groups.
• Age, race, and socioeconomic status are potential confounders in many studies. Results may be adjusted for these variables by using statistical techniques such as stratification or multivariable analysis (a minimal stratified-adjustment sketch follows this list). Adjusting for confounding variables aids in understanding the association between the exposure and the outcome as if the confounding variable were held constant.
• Multivariable analysis is a statistical technique commonly used in epidemiologic studies that simultaneously controls for a number of confounding variables. The results from an adjusted analysis include the adjusted odds ratio or relative risk, which reflects the association between the exposure and the outcome while accounting for the specific known confounders that were included in the analysis.
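As a minimal sketch of stratified adjustment, the Mantel-Haenszel method pools stratum-specific 2 × 2 tables into a single confounder-adjusted odds ratio; the strata and counts below are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata of a confounder.
    Each stratum is a 2 x 2 table (a, b, c, d):
        a = exposed cases,   b = exposed controls,
        c = unexposed cases, d = unexposed controls
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical data stratified by age group:
strata = [(10, 40, 20, 160),  # younger stratum
          (30, 30, 40, 80)]   # older stratum
print(f"age-adjusted OR = {mantel_haenszel_or(strata):.2f}")  # 2.00
```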
Causality and Generalizability
The criteria for judging whether an association between two factors, particularly an exposure and a disease, is likely to be causal have been defined (11). Although there are nine separate criteria, several of them are most relevant for clinical studies.
• Biologic gradient or dose response refers to a relationship between exposure and outcome such that a change in the duration, amount, or intensity of the exposure is associated with a corresponding increase or decrease in disease risk.
• Plausibility refers to knowledge of the pathologic process of the disease or biologic effects of the exposure that would reasonably support an association. Plausibility overlaps with another concept, coherence, which also refers to compatibility with the known biology of the disease.
• Experiment refers to the evidence that the disease or outcome can be prevented or improved by an experiment that eliminates, reduces, or otherwise counters the exposure.
• Consistency refers to whether the association was repeatedly observed by different investigators, in different locations and circumstances.
• Temporality refers to the concept that cause must precede effect. For example, is it possible in a case-control study that symptoms of preclinical disease could lead to the exposure? Investigators must demonstrate that the exposure was present before the disease developed.
• Strength refers to the strength of association. The further the relative risk or odds ratio deviates from 1, the stronger the association and the easier it is to accept that the study results are real. For example, studies have shown that the possession of a BRCA mutation may increase the lifetime risk for ovarian or breast cancer some 30-fold. Although strength is a very important criterion, large-scale genetic studies suggest that other factors are equally important. For example, multiple studies reported several variants at the 8q24 chromosomal region associated with prostate and other cancers (12). Even though possession of one allele may change risk by only about 15% (i.e., OR = 1.15), the consistency and high statistical significance suggest that the association cannot be attributed to chance, and it is considered a true association.
Summary
Reviewing the medical literature is part of the ongoing education for those who provide clinical care. Incorporating research findings into clinical care is enhanced by understanding different study designs, their strengths and weaknesses, and the measures of association they are able to provide. Evaluating whether there is enough evidence available to support changing a specific medication, procedure, or protocol used to care for patients is a cornerstone of improving clinical practice. In a field that is rapidly progressing, understanding clinical research helps physicians provide optimal care for the women they treat every day.
References
1. National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development. Clinical research and clinical trials: what is clinical research? 2009. Available online at: http://www.nichd.nih.gov/health/clinicalresearch
2. Atkins D, Eccles M, Flottorp S, et al. Systems for grading the quality of evidence and the strength of recommendations I: critical appraisal of existing approaches. The GRADE Working Group. BMC Health Serv Res 2004;4:38.
3. U.S. National Institutes of Health. 2010. http://www.clinicaltrials.gov
4. U.S. Food and Drug Administration. Regulatory information: Food and Drug Administration Amendments Act (FDAAA) of 2007. Available online at: http://www.fda.gov/RegulatoryInformation/Legislation/FederalFoodDrugandCosmeticActFDCAct/SignificantAmendmentstotheFDCAct/FoodandDrugAdministrationAmendmentsActof2007/default.htm
5. Schulz KF, Altman DG, Moher D. CONSORT Group. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomized trials. BMJ 2010;340:c332.
6. Hulley SB, Cummings SR, Browner WS, et al. Designing clinical research. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2007.
7. Armstrong B, Doll R. Environmental factors and cancer incidence and mortality in different countries, with special reference to dietary practices. Int J Cancer 1975;15:617–631.
8. The Cochrane Library. About the Cochrane library. Available online at: http://www.thecochranelibrary.com/view/0/AboutTheCochraneLibrary.html
9. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J Clin Epidemiol 2009;62:e1–34.
10. Stroup DF, Berlin JA, Morton SC, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000;283:2008–2012.
11. Hill AB. The environment and disease: association or causation? Proc R Soc Med 1965;58:295–300.
12. Yeager M, Chatterjee N, Ciampa J, et al. Identification of a new prostate cancer susceptibility locus on chromosome 8q24. Nat Genet 2009;41:1055–1057.