## Gastroenterology and Hepatology Board Review: Pearls of Wisdom, Third Edition

### CHAPTER 57. Biostatistics for the Gastroenterologist

Kendra K. Schmid, PhD and Elizabeth Lyden, MS

The two most common measures of central tendency are the mean and median. Describe how to calculate each and explain the major difference between them.

To calculate the mean (average), add up all of the observations and divide by the number of observations. The median is the middle observation when all of the observations have been ordered. The mean can be influenced by extreme data values while the median is not. For a skewed distribution, which is a better choice as a measure of central tendency, the mean or the median? Why?

The median, since the mean can be influenced by extreme values. Describe how the mean and median would compare for skewed distributions.

For a distribution that is skewed to the right (positively skewed), the mean will be greater than the median. For a distribution that is skewed to the left (negatively skewed), the mean will be less than the median. For symmetric distributions, the mean and the median will be similar. Name and describe three measures of dispersion.
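The comparison of mean and median under skew described above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the sample values are hypothetical and were chosen only to produce one long tail.

```python
from statistics import mean, median

# Hypothetical right-skewed sample (for example, lengths of stay in days);
# the single large value creates the long right tail.
right_skewed = [2, 3, 3, 4, 4, 5, 6, 30]

# Mirror image of the same sample, which is skewed to the left.
left_skewed = [32 - x for x in right_skewed]

for label, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(f"{label}: mean = {mean(data):.3f}, median = {median(data)}")
```

In the right-skewed sample the extreme value pulls the mean above the median, and in the mirrored sample it pulls the mean below the median, while the median barely moves in either case.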

The most common measure of dispersion is the standard deviation, which measures the average distance between observations and the mean. Other measures of dispersion are the range (maximum – minimum), the coefficient of variation (standard deviation/mean × 100%), and the interquartile range (75th percentile – 25th percentile). Which measure of dispersion is most appropriate to present with each measure of central tendency?

The standard deviation should be reported with the mean as it measures the average distance between observations and the mean. The interquartile range can be reported with the median. Range is often reported with either mean or median. It is not appropriate to report standard deviation with the median. An investigator is interested in comparing two numerical distributions that are measured on different scales. Should the researcher use the standard deviation or the coefficient of variation? Why?

The investigator should use the coefficient of variation since the coefficient of variation adjusts for the scales of the variables by dividing the standard deviation by the mean. Explain the difference between standard deviation and standard error.
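The measures of dispersion listed above, and the reason the coefficient of variation suits comparisons across scales, can be illustrated with a short sketch. The data are hypothetical (the same five weights expressed in kilograms and in pounds), and the quartiles use the default method of `statistics.quantiles`; other software may use a slightly different quartile convention.

```python
from statistics import mean, stdev, quantiles

def dispersion_summary(data):
    """Return SD, range, coefficient of variation (%), and IQR."""
    sd = stdev(data)                   # sample standard deviation (n - 1 denominator)
    q1, _, q3 = quantiles(data, n=4)   # 25th, 50th, and 75th percentiles
    return {
        "sd": sd,
        "range": max(data) - min(data),
        "cv_percent": sd / mean(data) * 100,
        "iqr": q3 - q1,
    }

# Hypothetical: the same five patients' weights in two different units.
weights_kg = [60.0, 70.0, 80.0, 90.0, 100.0]
weights_lb = [w * 2.20462 for w in weights_kg]

summary_kg = dispersion_summary(weights_kg)
summary_lb = dispersion_summary(weights_lb)
print(summary_kg)
print(summary_lb)
```

The standard deviation, range, and interquartile range all change with the unit of measurement, whereas the coefficient of variation is the same in both summaries because the unit cancels when the standard deviation is divided by the mean.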

The standard deviation summarizes the variability of a group of observations. The standard error summarizes the expected variability of a sample statistic based on taking repeated samples of the same size. What statistic is most often used to describe the relationship between two numerical variables?
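As a rough illustration of the distinction between standard deviation and standard error drawn above, the sketch below uses the usual estimate of the standard error of a sample mean, SD/√n. That formula is not stated in the text and applies specifically to the mean; the measurements are hypothetical.

```python
import math
from statistics import stdev

def standard_error_of_mean(sample):
    """Estimated standard error of the sample mean: SD / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))

# Hypothetical measurements; the larger sample repeats the same values so that
# the spread of the individual observations stays about the same.
small_sample = [4.1, 5.0, 5.6, 6.2, 7.1]
large_sample = small_sample * 20

for name, sample in [("n = 5", small_sample), ("n = 100", large_sample)]:
    print(f"{name}: SD = {stdev(sample):.2f}, SE = {standard_error_of_mean(sample):.2f}")
```

The standard deviation is nearly unchanged by the larger sample size, while the standard error shrinks, reflecting that the sample mean is estimated more precisely from more observations.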

Correlation coefficient. For two variables X and Y, the correlation coefficient is r = .89. How does this describe the relationship between X and Y?

Since r is positive and close to 1, as the values of X increase, the values of Y also tend to increase. In this case, we say that X and Y are positively correlated and the relationship between them is strong. True/False: A correlation coefficient is r = .05. This means that X and Y are not related.

False. The correlation coefficient r measures the linear relationship between X and Y. Since r is close to 0, we can only say that X and Y are not linearly related. They may, however, be related in some manner which is not linear. True/False: The correlation between two variables, X and Y, is r = .70. The correlation between two other variables, Z and W, is r = -.79. Variables Z and W demonstrate a stronger linear relationship.

True. Z and W have a stronger linear relationship because their correlation is larger in magnitude (|-.79| > |.70|). The sign of the correlation coefficient tells the direction of the relationship while the magnitude gives information on the strength. True/False: The equation of the regression line for two variables X and Y is Y = 1.34 - .7X. X and Y are positively correlated.

False. X and Y are negatively correlated. You can tell by examining the slope of the regression equation to see that as X increases, Y decreases. Describe the coefficient of determination.

The coefficient of determination, or r², is the squared correlation. It describes the proportion of the variation in the dependent variable (Y) that is explained by the independent variable (X). Name a distribution that has a bell shape and a distribution that is skewed.
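The points made above about the correlation coefficient, the sign of the regression slope, and the coefficient of determination can be sketched in a few lines. The helper functions and both data sets below are illustrative assumptions, not taken from the chapter.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

# Hypothetical data with a decreasing linear trend.
x = [1, 2, 3, 4, 5]
y = [0.7, -0.1, -0.7, -1.5, -2.2]
r = pearson_r(x, y)
intercept, slope = least_squares_line(x, y)
r_squared = r ** 2  # proportion of the variation in y explained by x

# Hypothetical data with an exact but nonlinear (quadratic) relationship.
u = [-2, -1, 0, 1, 2]
v = [a ** 2 for a in u]
r_nonlinear = pearson_r(u, v)

print(f"r = {r:.3f}, slope = {slope:.2f}, r squared = {r_squared:.3f}")
print(f"r for the quadratic data = {r_nonlinear:.3f}")
```

The slope and r always share the same sign, so a negative slope implies a negative correlation. The quadratic data show why r near 0 rules out only a linear relationship: v is completely determined by u, yet their correlation is zero.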

The normal distribution and t-distribution are bell shaped. The chi-square distribution and F distribution are skewed. Name a distribution typically used to describe count data and one typically used to describe success-failure data.

The Poisson distribution is often used to model count data and the binomial distribution can be used for success-failure data. According to the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (NCI), the probability that a newly diagnosed patient with colon cancer has Stage I disease (confined to the primary site) is .39 and Stage IV disease (metastasized) is .20. What is the probability that a newly diagnosed patient with colon cancer has either Stage I or Stage IV disease?

.39 + .20 = .59. In contrast, the probability that a newly diagnosed person has neither Stage I nor Stage IV is .41 (1 – .59 = .41). True/False: The events described in the previous question are mutually exclusive.

True. Mutually exclusive events are events that cannot occur simultaneously. These two events are mutually exclusive since a newly diagnosed patient with colon cancer cannot be classified as having both Stage I and Stage IV disease. Describe what it means for two events to be independent.

Two events are independent if the outcome of the first event does not affect the outcome of the second event. In tests used for screening purposes, differentiate between sensitivity and specificity.

Sensitivity indicates how good a screening test is at identifying the disease for which it is testing. It refers to the proportion of patients with the disease who test positive. Sensitivity can be calculated as follows: True positive/(True positive + False negative) × 100%.

Specificity indicates how good a screening test is at identifying the nondiseased group. It refers to the proportion of patients without the disease who test negative. Specificity can be calculated as follows: True negative/(True negative + False positive) × 100%. Define positive predictive value of a screening test and indicate when it is most predictive of the disease in question.

The positive predictive value of a screening test is the probability that the patient has the disease if the patient tested positive for the disease. The higher the prevalence of the disease in the population being tested, the more likely a positive test is a true positive. Calculate the sensitivity, specificity, positive predictive value, negative predictive value, and prevalence from the following 2 × 2 table, which shows the results of screening with IgA antigliadin antibodies (AGA) for diagnosing celiac disease (Gastroenterology. 2001 Apr;120[5,1]:A395).

Define odds.
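The screening-test measures asked for above all follow from the four cells of a 2 × 2 table. The sketch below uses hypothetical counts, not the values from the cited AGA abstract. The negative predictive value is computed by analogy with the positive predictive value, as true negatives divided by all negative tests, and `ppv_at_prevalence` is an illustrative helper that applies Bayes' rule.

```python
def screening_metrics(tp, fp, fn, tn):
    """Summary measures from the four cells of a 2 x 2 screening table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients who test positive
        "specificity": tn / (tn + fp),   # nondiseased patients who test negative
        "ppv": tp / (tp + fp),           # positive tests that are true positives
        "npv": tn / (tn + fn),           # negative tests that are true negatives
        "prevalence": (tp + fn) / total, # diseased patients among all tested
    }

def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by Bayes' rule at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical counts for illustration only.
metrics = screening_metrics(tp=80, fp=30, fn=20, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")

# The same test applied in low- and high-prevalence populations.
ppv_low = ppv_at_prevalence(0.80, 0.85, prevalence=0.01)
ppv_high = ppv_at_prevalence(0.80, 0.85, prevalence=0.30)
print(f"PPV at 1% prevalence: {ppv_low:.1%}; at 30% prevalence: {ppv_high:.1%}")
```

Sensitivity and specificity are properties of the test, whereas the predictive values shift with prevalence, which is why the same test yields a far lower positive predictive value in a low-prevalence population.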

Odds are used to compare one outcome to another outcome. For example, if in a certain population the odds of having a specific disease are 1:3, we would expect one person to have the disease for every three who do not. Define odds ratio.

The odds ratio measures the association between a risk factor and a disease. It is defined as the odds of disease among individuals exposed to a risk factor divided by the odds of disease among individuals not exposed to that risk factor. What does an odds ratio of 1 indicate?

An odds ratio of 1 indicates no association between a potential risk factor and the disease of interest. In other words, the odds of disease among exposed individuals are similar to the odds of disease among unexposed individuals. True/False: The odds of having disease A are twice as high in vegetarians as in nonvegetarians (ie, odds ratio = 2). The corresponding odds ratio for disease B is 0.5. Disease A is more strongly associated with eating habits.

False. The strength of the association is the same, only the direction of the association differs. The odds of disease A are twice as high in vegetarians as in nonvegetarians, while the odds of disease B are twice as high in nonvegetarians as in vegetarians. Distinguish between relative risk reduction (RRR) and absolute risk reduction (ARR).
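The definitions of odds and odds ratio given above, and the point that odds ratios of 2 and 0.5 represent equally strong associations, can be sketched as follows. The counts are hypothetical, and comparing odds ratios on the logarithmic scale is a common convention rather than something stated in the text.

```python
import math

def odds(probability):
    """Convert a probability into odds in favor of the outcome."""
    return probability / (1 - probability)

def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of disease among the exposed divided by odds among the unexposed."""
    return (exposed_cases / exposed_noncases) / (unexposed_cases / unexposed_noncases)

# Odds of 1:3 correspond to a probability of 1 / (1 + 3) = 0.25.
odds_one_to_three = odds(0.25)

# Hypothetical counts: 40 of 100 vegetarians and 25 of 100 nonvegetarians
# have the disease.
or_vegetarian = odds_ratio(40, 60, 25, 75)
# The same table with nonvegetarians treated as the "exposed" group.
or_nonvegetarian = odds_ratio(25, 75, 40, 60)

print(f"odds at p = 0.25: {odds_one_to_three:.3f}")
print(f"OR (vegetarian vs nonvegetarian): {or_vegetarian:.2f}")
print(f"OR (nonvegetarian vs vegetarian): {or_nonvegetarian:.2f}")
print(f"|log OR|: {abs(math.log(or_vegetarian)):.3f} vs {abs(math.log(or_nonvegetarian)):.3f}")
```

Swapping which group is called exposed turns the odds ratio into its reciprocal, and the two values lie the same distance from 1 on the log scale, so they describe the same strength of association in opposite directions.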

RRR is the reduction of adverse outcomes achieved by a treatment expressed as a proportion of the adverse outcomes in the control group. ARR is the difference in the rates of adverse outcomes between the control and treated groups.

ARR = (% of control patients with adverse outcome) - (% of treatment patients with adverse outcome) Define what is meant by the number needed to treat (NNT).

NNT is the number of patients that need to be treated with a new therapy to prevent one additional adverse outcome. It is the reciprocal of the ARR (NNT = 1/ARR, with the ARR expressed as a proportion). True/False: The odds ratio estimates the relative risk when the disease of interest is common in the study population (> 10%).

False. The more prevalent the disease of interest, the more the odds ratio overestimates the relative risk when it is > 1 or underestimates it when it is < 1. The odds ratio provides a good estimate of the relative risk when the disease of interest is rare in the study population (< 10%). In a study designed to demonstrate that giving a proton pump inhibitor (PPI) before Helicobacter pylori (HP) eradication therapy improves patient adherence, and therefore eradication rates, compared with immediate triple therapy (TT), 33.33% of the TT group (controls) and 18.42% of the PPI prior group (treated) failed to achieve HP eradication (Gastroenterology. 2010 May;138[5,1]:S 338). Calculate the ARR, RRR, and NNT (with the TT group considered to be the control).

Differentiate between point estimation and interval estimation. Which type is preferred?
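For the H. pylori eradication question above, the ARR, RRR, and NNT follow directly from the two failure rates by the definitions given earlier. The sketch below applies those definitions; rounding the NNT up to the next whole patient is a common convention and is an assumption here.

```python
import math

def risk_reduction(control_event_rate, treated_event_rate):
    """Return (ARR, RRR, NNT) with event rates given as proportions."""
    arr = control_event_rate - treated_event_rate  # absolute risk reduction
    rrr = arr / control_event_rate                 # relative risk reduction
    nnt = 1 / arr                                  # number needed to treat
    return arr, rrr, nnt

# Failure-to-eradicate rates from the question: 33.33% (TT) and 18.42% (PPI prior).
arr, rrr, nnt = risk_reduction(0.3333, 0.1842)

print(f"ARR = {arr:.2%}")
print(f"RRR = {rrr:.1%}")
print(f"NNT = {nnt:.1f}, rounded up to {math.ceil(nnt)}")
```

The ARR is 33.33% − 18.42% = 14.91%, the RRR is 14.91/33.33, or about 45%, and the NNT is 1/0.1491, or about 6.7, so roughly seven patients would need PPI pretreatment to prevent one additional eradication failure.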

Point estimation involves using a summary statistic obtained from a sample as an estimate of a population parameter (for example, the sample mean is used as an estimate of the population mean, µ). Interval estimation involves creating an interval around the point estimate that contains reasonable values for the population parameter (also known as a confidence interval). Interval estimation is almost always preferred to point estimation as it provides information on the variability of the estimate. What does a 95% confidence interval imply?

It implies that if we repeatedly select random samples from the same population as our data and construct interval estimates for each sample, 95 out of 100 of the intervals would be expected to contain the true parameter. Explain the meaning of the 95% confidence interval: (23 < µ < 35).

If a large number of such confidence intervals are constructed, about 95% of them will contain the true mean; therefore, we can be 95% confident that the true mean is between 23 and 35. It does not mean that the probability that µ is between 23 and 35 is 95%. Differentiate between type I and type II errors in hypothesis testing.
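The repeated-sampling interpretation of a 95% confidence interval described above can be illustrated by simulation. This is a minimal sketch under simplifying assumptions: the data are normally distributed, the population standard deviation is treated as known (so a z-interval is used rather than the t-interval more common in practice), and the true mean, standard deviation, and sample size are arbitrary illustrative values.

```python
import math
import random
from statistics import NormalDist, mean

def z_interval(sample, sigma, level=0.95):
    """Confidence interval for the mean when the population SD is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(len(sample))
    center = mean(sample)
    return center - half_width, center + half_width

def coverage(true_mean, sigma, n, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= true_mean <= high
    return hits / trials

observed_coverage = coverage(true_mean=29.0, sigma=10.0, n=25, trials=2000)
print(f"Proportion of intervals containing the true mean: {observed_coverage:.3f}")
```

Each simulated interval either contains the true mean or does not; the 95% describes how often the procedure succeeds over many samples, which is why the proportion printed should be close to 0.95.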

Type I error (false-positive) is rejecting the null hypothesis when it is true; the probability of a type I error is denoted by α.

Type II error (false-negative) is failing to reject the null hypothesis when it is false; the probability of a type II error is denoted by β. The power of a statistical test is 1 – β. Differentiate between a two-tailed and a one-tailed test.

A two-tailed (or nondirectional) test occurs when researchers do not know a priori the direction of the value they expect to observe in the sample. For example, they want to know if the sample mean differs from the population mean.

A one-tailed (or directional) test occurs when researchers know a priori the direction of any true difference between the value observed in the sample and the population parameter. For example, they want to know if the sample mean is larger (or smaller) than the population mean. True/False: The P-value is the probability of obtaining a result as extreme or more extreme than the result obtained from the sample when the null hypothesis is assumed to be true.

True. A researcher is interested in the following hypothesis test: Is this a one-tailed or a two-tailed test?

This is a one-tailed test. This can be determined by examining the alternative hypothesis. For this test, the null hypothesis is rejected only if the sample mean is sufficiently greater than zero. The hypothesis test H0: µ = 0 versus H1: µ ≠ 0 is an example of a two-tailed test, since the null hypothesis is rejected if the sample mean is sufficiently greater than zero or sufficiently less than zero. A researcher is interested in the following hypothesis test with a .05 level of significance (α = .05): The P-value is found to be .03. What is the conclusion of the test?

The P-value rule is to reject H0 if the P-value is less than α and not to reject H0 if the P-value is greater than or equal to α. Since the P-value is .03 and α = .05, the conclusion is to reject H0 (the null hypothesis). How is the power of this type of test defined?
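The P-value decision rule above, together with the earlier distinction between one-tailed and two-tailed tests, can be sketched as follows. The sketch assumes a test statistic that follows the standard normal distribution under the null hypothesis (a z test); the function names and the example statistic of 1.96 are illustrative.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two-sided"):
    """P-value for a standard normal test statistic."""
    std_normal = NormalDist()
    if tail == "greater":      # one-tailed, H1: parameter > null value
        return 1 - std_normal.cdf(z)
    if tail == "less":         # one-tailed, H1: parameter < null value
        return std_normal.cdf(z)
    if tail == "two-sided":    # two-tailed, H1: parameter != null value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'greater', 'less', or 'two-sided'")

def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 only if the P-value is less than alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

p_two_sided = p_value_from_z(1.96)
p_one_sided = p_value_from_z(1.96, tail="greater")

print(f"two-sided P = {p_two_sided:.3f}, one-sided P = {p_one_sided:.3f}")
print(decide(0.03))  # the P-value from the question above
```

For the same test statistic, the one-tailed P-value in the direction of the observed effect is half the two-tailed P-value, which is why the choice of tails has to be made before looking at the data.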

The power of a test is the probability of rejecting the null hypothesis when in fact the null hypothesis is false. Typically, at least 80% power to detect an effect is desired. In a study designed to look at antibiotic prophylaxis in necrotizing pancreatitis, a total sample size of 200 patients was calculated to demonstrate with a power of 90% that antibiotic prophylaxis reduces the proportion of patients with infected pancreatic necrosis from 40% with placebo (PLA) to 20% with ciprofloxacin/metronidazole (CIP/MET) (Gastroenterology. 2004 Apr;126:997-1004). If the researcher decides that a difference between groups of 10% is more clinically relevant, how will the required number of patients be affected?

To detect a smaller effect (10% versus 20% difference), more patients will be needed. If we want to distinguish groups based on a finer scale, more information about the groups is needed. Larger effects are easier to see, so they require fewer subjects. A researcher is interested in the following hypothesis test with significance level .05 (α = .05): The 95% confidence interval for µ is: (–2.3 ≤ µ ≤ 1.5). What is the conclusion of the hypothesis test?
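The effect of a smaller target difference on sample size, as discussed for the pancreatitis study above, can be sketched with one common normal-approximation formula for comparing two proportions. This is not necessarily the method the study authors used, so the result need not match their figure of 200 exactly. The two-sided α of .05 and the reading of "a difference of 10%" as 40% versus 30% are assumptions.

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.90):
    """Approximate patients per group to detect p1 versus p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

n_large_effect = n_per_group(0.40, 0.20)  # 20-percentage-point difference
n_small_effect = n_per_group(0.40, 0.30)  # 10-percentage-point difference (assumed)

print(f"40% vs 20%: about {n_large_effect} per group")
print(f"40% vs 30%: about {n_small_effect} per group")
```

Under these assumptions the first calculation gives roughly 100 patients per group, broadly in line with the total of 200 reported, while halving the target difference raises the requirement to several hundred per group, because the difference enters the formula squared in the denominator.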

Since zero is included in the above confidence interval, zero is included as one of the reasonable values for µ. In other words, it is possible that in fact µ = 0. Therefore, the conclusion is do not reject H0. What is meant by a nonparametric test?

Nonparametric tests do not specify the distribution of the data. In other words, they are distribution-free tests. Nonparametric tests should be used when the data are not normally distributed or the distribution is unknown. Explain the difference between an observational study and an experimental study.

In an observational study, patients are observed (no intervention is applied) and the characteristics of interest are recorded. In an experimental study, an intervention is applied and the effect of the treatment on the subjects is analyzed. Explain the difference between a case-control study and a cohort study.

A case-control study compares a group of individuals with the disease of interest (cases) with a group of individuals without the disease of interest (controls), and looks back in time to see how the characteristics of the two groups differ. A cohort study identifies a group of individuals exposed to a risk factor and a group of individuals unexposed to that risk factor, and follows the groups over a period of time to see if the risk factor affects disease development. What is a double-blind trial and what is its purpose?

A double-blind trial is one in which neither the experimenter nor the subject knows whether the subject is in the control group or the treatment group. The purpose is to reduce the chance for bias by preventing the researcher from interpreting the results in a manner that supports the researcher’s goals. This is especially important in a trial with a subjective outcome. What does concealment of allocation mean with respect to clinical trials?

Concealment of allocation means that the researcher who enrolls patients into a trial (ie, attempts to get informed consent from the patient) does not know if the next patient to enter the trial will get the experimental treatment or a placebo. Differentiate between an intention-to-treat analysis and per protocol analysis.

Intention-to-treat analysis is when all patients who enrolled in a clinical trial are included in the final data analysis, regardless of whether or not the patient completed the trial.

Per protocol analysis is when only patients who properly complete the clinical trial are included in the final data analysis. A study is designed so that every member of the population has an equal probability of being selected for the study. What sampling method is being used?

Simple random sampling. Differentiate between the life-table method and the Kaplan–Meier method of calculating estimates of the survival distribution.

Life-table method: The time axis showing the total observation period or follow-up time is divided into distinct intervals (not necessarily of equal length) and the numbers of deaths and withdrawals (censored patients) are shown for each interval. This method is useful when the exact times of death or withdrawal are unknown.

Kaplan–Meier method: Exact times of death and withdrawal must be known as calculations are made at each time of death. Similar to the life-table method, the survival curve will resemble a step function; curves constructed by the Kaplan–Meier estimate will step down at each time of death. Define median survival.

Median survival is defined as the time at which 50% of the population under study has “failed” (died, progressed, relapsed, etc). Explain what is meant by censored observation.

Censored observation refers to patients who do not reach a disease endpoint (died, progressed, relapsed, etc) during their period of follow-up. A patient is censored at time t if the patient has been followed up to time t and has not “failed” (died, progressed, relapsed, etc).
