*Aki Morikawa and Andrew J. Vickers*

**Descriptive Statistics**

Summarize data & describe the attributes of a data set

**Continuous data:** Mean & standard deviation (symmetric distribution or for decision making); median & quartiles (skewed distribution or for ready description of a distribution)
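
A minimal sketch of both summaries, using only the Python standard library. The helper name `describe` and the length-of-stay data are hypothetical, chosen to show how a single long stay pulls the mean above the median in a skewed distribution.

```python
import statistics

def describe(values):
    """Return mean, sample SD, median, and quartiles for a list of numbers."""
    # "inclusive" treats the data as the whole population of interest;
    # other quartile conventions give slightly different cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }

# Hypothetical right-skewed data (length of stay in days).
length_of_stay_days = [2, 3, 3, 4, 5, 6, 30]
summary = describe(length_of_stay_days)
```

Here the median & quartiles describe the typical pt better than the mean & SD, which are dominated by the one outlier.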

**Categorical data** (counts or frequencies): Frequency & percentage; prevalence expressed as a proportion (N w/a given dz at a given point in time/population at risk at that point in time); incidence expressed in terms of a unit of time (N of new cases occurring in a given time interval/population at risk at the beginning of that time interval). Adjusted rates are used to compare rates among different populations.
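
The two ratios above can be sketched as plain arithmetic. The function names and all counts are hypothetical and only illustrate which denominator goes w/which numerator.

```python
def prevalence(existing_cases, population_at_risk):
    """Proportion w/the dz at a single point in time."""
    return existing_cases / population_at_risk

def incidence_proportion(new_cases, at_risk_at_start):
    """New cases during an interval / population at risk at its start."""
    return new_cases / at_risk_at_start

# Hypothetical: 50 existing cases among 1,000 people surveyed today.
point_prevalence = prevalence(50, 1000)

# Hypothetical: 12 new cases over 1 y among the 950 dz-free at baseline.
one_year_incidence = incidence_proportion(12, 950)
```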

**Time-to-event data:** Number of events, median follow-up for participants w/o an event, median survival, probability of event at a given follow-up time (eg, 65% probability of death by five y)

**Measurement of Comparisons and Associations**

**Correlation (r):** Strength of the linear relationship between two numerical measures; ranges from −1 to +1. r = +1 or −1 → perfect linear relationship between 2 variables (positive or negative); r = 0 → no linear relationship.
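
A minimal sketch of the Pearson correlation coefficient, written out from its definition. The function name `pearson_r` and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# y doubles w/x: perfect positive linear relationship.
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
# y falls as x rises: perfect negative linear relationship.
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```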

**RR:** Ratio of incidence in “exposed” to incidence in “non-exposed.” RR = incidence in exposed group/incidence in non-exposed group.

**OR:** Often used in case-control studies. Ratio of odds in “exposed”/odds in “non-exposed.” Odds are calculated as number w/event divided by number w/o event.
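
RR & OR can be computed from the same 2 × 2 table. The sketch below assumes a hypothetical cell layout (a, b = exposed w/ & w/o event; c, d = non-exposed w/ & w/o event) and invented counts.

```python
def risk_ratio(a, b, c, d):
    """Incidence in exposed / incidence in non-exposed."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds in exposed / odds in non-exposed (odds = w/event / w/o event)."""
    return (a / b) / (c / d)

# Hypothetical: 20 of 100 exposed & 10 of 100 non-exposed have the event.
rr = risk_ratio(20, 80, 10, 90)       # 0.20 / 0.10
odds = odds_ratio(20, 80, 10, 90)     # (20/80) / (10/90)
```

W/these counts the RR is 2.0 & the OR is 2.25; the two measures diverge further as the event becomes more common.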

**Reliability:** Inter-rater reliability. Kappa (κ) → agreement between two observers on a binary variable, corrected for the agreement expected by chance
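
A minimal sketch of Cohen's kappa for two raters & a yes/no rating, assuming the usual definition (observed agreement minus chance agreement, divided by 1 minus chance agreement). The function name & the counts are hypothetical.

```python
def cohen_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Chance-corrected agreement between rater A & rater B."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n   # proportion rated "yes" by A
    b_yes = (both_yes + a_no_b_yes) / n   # proportion rated "yes" by B
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)

# Hypothetical: 100 films read by two radiologists.
kappa = cohen_kappa(both_yes=40, a_yes_b_no=10, a_no_b_yes=5, both_no=45)
```

Raw agreement here is 85%, but half of that is expected by chance, so κ is 0.7.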

**Other comparisons:** t-test for means or Wilcoxon test as the nonparametric alternative (continuous variable, 2 groups); ANOVA (continuous variable, multiple groups); χ^{2} test for proportions; McNemar test for proportions in paired groups; w/multiple comparisons, probability of a type I error ↑
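
The ↑ in type I error w/multiple comparisons can be illustrated numerically. This sketch assumes the tests are independent, which is rarely exactly true in practice, & shows the Bonferroni correction as one common (conservative) remedy; both function names are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Prob of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the overall rate <= alpha."""
    return alpha / n_tests

# Ten independent tests, each at alpha = 0.05.
overall = familywise_error(0.05, 10)
per_test = bonferroni_alpha(0.05, 10)
```

Under these assumptions the chance of at least one FP across ten tests is about 40%, not 5%.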

**Confidence Intervals**

**Variability of an estimate. Can be calculated for means, proportions, or effect estimates such as RR.** Interpreted as a plausible range of values. Commonly 95% is used (corresponding to a p-value cut-off of 0.05), but other levels can be specified depending on situation & study question.
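
A minimal sketch of a CI for a proportion, assuming the simple normal-approximation (Wald) method; other methods are preferable for small samples or proportions near 0 or 1. The function name & the data are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_ci(events, n, level=0.95):
    """Normal-approximation (Wald) CI for a proportion."""
    p = events / n
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical: 30 events among 100 pts.
low, high = proportion_ci(30, 100)
```

The estimate of 30% comes w/a 95% CI of roughly 21% to 39%; a larger sample would narrow the interval.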

**Hypothesis Testing**

A statistical test of a study-specific hypothesis vs. an alternative

Common study hypothesis: Null hypothesis (H_{0})

1-tailed or 2-tailed test; 2-tailed test is routine except under unusual circumstances; H_{0} is rejected if the observation is statistically significantly different (< or >). Specify the level of significance (α); commonly 0.05, but choose another value depending on study question; α may be referred to as the type I error or FP rate.

Calculate p-value; if p-value < α, H_{0} is rejected. If p-value ≥ α, H_{0} is not rejected.

**P-value** → the probability of a result as extreme as or more extreme than the one observed if H_{0} is true

**Type I error** = α (FP); **type II error** = β (FN). **Power** = the prob of rejecting H_{0} when it is false (ie, correctly); calculated as 1 − β.
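
The decision rule can be sketched w/a two-sided z-test comparing two proportions. The choice of test, the function name, & the counts are assumptions made for illustration; the logic (compute p-value, compare w/α) is the same for any test.

```python
import math
from statistics import NormalDist

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for H0: the two proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)              # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Prob of a result this extreme or more, in either direction, if H0 true.
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
# Hypothetical: 30/100 events in one arm vs. 15/100 in the other.
p_value = two_proportion_p_value(30, 100, 15, 100)
reject_h0 = p_value < alpha
```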

**CI vs. P-value**

• P-value → combines the size of the estimate & its variability (often influenced by sample size) into a single number

• P-value → calculated in relation to testing a specific hypothesis; CI → describes variability separately from the size of the estimate

**Sample Size Calculation**

Info commonly needed: Study type & question/end points (eg, survival? RR?); hypothesis, ie, expected size or difference of effects; α; 1 − β; test statistic; one-sided vs. two-sided test; for survival data → length of accrual & follow-up, median survival times & probabilities for H_{0} & the alternative; specific statistical assumptions need to be satisfied (eg, no significant competing causes). Highly complex calculation involving considerable statistical judgment.
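
To show how the inputs above interact, here is a rough sketch for the simplest case only: comparing two proportions w/equal group sizes & a two-sided test, using a standard normal-approximation formula. The function name & the event rates are hypothetical, & a real calculation would also need to account for dropout, design, & the planned analysis.

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate N per group to detect p1 vs. p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)            # power = 1 - beta
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical: event rate 50% in control vs. 40% expected w/treatment.
n_needed = n_per_group(0.50, 0.40)
```

Halving the expected difference roughly quadruples the required N, which is why the assumed effect size dominates the calculation.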

**Evaluating Diagnostic Procedure**

A test evaluated against a gold standard (a surrogate for true disease status) & observations classified as TP, TN, FP, & FN
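
The four cells give sensitivity (se) & specificity (sp) directly. A minimal sketch w/hypothetical function names & counts:

```python
def sensitivity(tp, fn):
    """Proportion of pts w/dz (by gold standard) who test positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of pts w/o dz (by gold standard) who test negative."""
    return tn / (tn + fp)

# Hypothetical: 100 pts w/dz (90 TP, 10 FN) & 100 w/o dz (80 TN, 20 FP).
se = sensitivity(tp=90, fn=10)
sp = specificity(tn=80, fp=20)
```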

**Figure 2-2**

**ROC curve:** A graphical display of TP rate (se) vs. FP rate (1 − sp) for different cut-off points of a test. Used to evaluate the trade-off between se & sp. Accuracy ↑ → curve moves toward the upper left corner.
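
A minimal sketch of how the points on an ROC curve are obtained: each distinct test value is used in turn as the cut-off, & (1 − sp, se) is recorded. The area under the curve (AUC), computed here by the trapezoid rule, is a common one-number summary added for illustration. Function names, scores, & labels are hypothetical.

```python
def roc_points(scores, labels):
    """(FP rate, TP rate) at every cut-off; label 1 = dz, 0 = no dz."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]                 # cut-off above every score
    for cutoff in sorted(set(scores), reverse=True):
        # Test is called positive when score >= cutoff.
        tp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical marker values for 3 pts w/dz & 3 w/o dz.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
area = auc(curve)
```

An AUC of 0.5 corresponds to the diagonal (no discrimination) & 1.0 to a curve passing through the upper left corner.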

**Decision curve:** A graphical display of clinical net benefit against threshold prob. The test w/the highest net benefit at a given threshold prob leads to the best clinical outcome.
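
A minimal sketch of the net benefit calculation that underlies a decision curve, assuming the usual definition: TP rate per pt minus FP rate per pt weighted by the odds of the threshold prob. The function name & counts are hypothetical.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of treating test-positive pts at a threshold prob."""
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Hypothetical cohort of 1,000 pts w/200 true cases, threshold prob 20%.
nb_test = net_benefit(tp=150, fp=200, n=1000, threshold=0.20)
nb_treat_all = net_benefit(tp=200, fp=800, n=1000, threshold=0.20)
nb_treat_none = 0.0   # no one treated: no TP & no FP
```

Repeating the calculation over a range of threshold probs & plotting each strategy gives the decision curve; here the test (0.10) beats both treat-all & treat-none (0) at a 20% threshold.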

**Analysis of Time-to-Event Data**

**Unique issues:** Not all pts enter the study at the same time, some may withdraw or be lost to follow-up, & not all pts have an event w/in the study period; these are censored observations. General approach: Analyze in terms of event probabilities.

**Methods:** Actuarial or life-table analysis for survival times; Kaplan-Meier product-limit estimates (KM survival curve) → stepwise descending curve for cumulative prob of survival; logrank test → compares KM curves (assumption: Proportional hazards, ie, hazard ratio is constant over time)
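
A minimal sketch of the Kaplan-Meier product-limit estimate. At each event time the running survival prob is multiplied by (1 − events/number at risk); censored pts leave the risk set w/o lowering the curve. The function name & follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival prob)] at each event time.

    events: 1 = event observed, 0 = censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk   # step down at event times
            curve.append((t, survival))
        at_risk -= n_leaving                     # events & censored both leave
    return curve

# Hypothetical follow-up (months) for 6 pts; 0 marks a censored pt.
km = kaplan_meier([2, 3, 3, 5, 8, 10], [1, 1, 0, 1, 0, 1])
```

The curve steps down to 5/6, 2/3, 4/9, & finally 0 at months 2, 3, 5, & 10; the censored pts at months 3 & 8 shrink the risk set but do not produce a step.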

**Competing risk:** Calculate incidence separately for event of interest (eg, cancer recurrence) & competing event (eg, death from other causes)
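
A sketch of one common way to do this, the nonparametric cumulative incidence function: at each event time, the prob of still being event-free just before that time is multiplied by the proportion of at-risk pts who have the event of interest, & the products are summed. The function name, cause codes, & data are hypothetical.

```python
def cumulative_incidence_function(times, causes, cause_of_interest):
    """Return [(time, cumulative incidence)] for one event type.

    causes: 0 = censored; 1, 2, ... = type of event.
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    event_free = 1.0      # prob of no event of any type so far
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_interest = n_any = n_leaving = 0
        while i < len(data) and data[i][0] == t:
            cause = data[i][1]
            n_any += cause != 0
            n_interest += cause == cause_of_interest
            n_leaving += 1
            i += 1
        if n_any:
            cif += event_free * n_interest / at_risk
            event_free *= 1 - n_any / at_risk
            curve.append((t, cif))
        at_risk -= n_leaving
    return curve

# Hypothetical: 1 = cancer recurrence, 2 = death from other causes.
times = [1, 2, 3, 4, 5]
causes = [1, 2, 0, 1, 2]
recurrence = cumulative_incidence_function(times, causes, 1)
other_death = cumulative_incidence_function(times, causes, 2)
```

Unlike treating the competing event as censoring in a KM analysis, the incidences for the separate event types never add up to more than 1.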

**Modeling and Analysis of Multiple Variables**

**Modeling methods** → allow for analysis of multiple independent variables for a given outcome; w/specific assumptions that model needs to satisfy

Methods: Linear regression (continuous outcome), Cox regression (for censored observations, eg, survival), logistic regression (binary outcome), multilevel regression (for variables at various levels, eg, individual vs. society level). Used for prediction modeling & for adjusting for confounders in observational studies.

**Meta-analysis**

Summarizes a set of studies w/a specific study question; gives a summary effect measure w/CI; forest plot → used to summarize the studies; test for heterogeneity among the studies: Cochran’s Q or I^{2} statistic; funnel plot → assesses possible publication bias. Sensitivity analysis → evaluates the effect of including/excluding certain studies.
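
A minimal sketch of the summary effect & heterogeneity statistics, assuming a fixed-effect inverse-variance model (a random-effects model is often preferred when heterogeneity is present). The function name & the three study results, given as log RRs w/standard errors, are hypothetical.

```python
import math
from statistics import NormalDist

def fixed_effect_meta(effects, std_errors, level=0.95):
    """Inverse-variance pooled effect w/CI, Cochran's Q, & I^2."""
    weights = [1 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_w
    pooled_se = math.sqrt(1 / total_w)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (pooled - z * pooled_se, pooled + z * pooled_se)
    # Q: weighted squared deviations of each study from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of variation beyond what chance alone would produce.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, ci, q, i2

# Hypothetical log RRs & standard errors from three studies.
pooled, ci, q, i2 = fixed_effect_meta([0.0, 0.2, 0.4], [0.1, 0.1, 0.1])
pooled_rr = math.exp(pooled)   # back-transform to the RR scale
```

Rerunning the function w/one study left out at a time is a simple form of sensitivity analysis.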