Categorical Data Analysis: Stratified Analyses, Matching, and Agreement Statistics
Biostatistics 510
13-15 March 2007
Carla Talarico

Overview
- Variable stratification
- Cochran-Mantel-Haenszel (CMH) statistics
- Matching and matched data
- Agreement statistics
  - McNemar's Test
  - Cohen's Kappa

Stratification by a Third Variable
- Exposure of interest
- Disease outcome
- Third variable, e.g., confounder

Confounding
- Effect of exposure on disease may be different in the presence of a third variable (“confounder”)
- Reflects the fact that epidemiologic research is conducted among humans with unevenly distributed characteristics
- Results from a lack of comparability between the exposed and unexposed groups in the base population

Controlling for Confounding
- Design phase of studies
  - Randomization in experimental studies
  - Restriction
  - Matching
- Analysis phase
  - Stratified analysis
  - Model fitting

Stratified Analyses: The CMH Option in SAS
- Gives a stratified statistical analysis of the relationship between Exposure (E) and Disease (D), after controlling for a Confounder (C):
    proc freq;
      tables C * E * D / cmh;
    run;
- Can simultaneously stratify by multiple confounders:
    proc freq;
      tables C1 * C2 * E * D / cmh;
    run;

Estimates of Common Relative Risk for 2x2 Tables
- Adjusted odds ratio (OR) and relative risk (RR) for stratified 2x2 tables, with 95% confidence limits (CL)
- Obtain OR and RR estimates for the association between Exposure and Disease, adjusted for the Confounder
- For this course, report the Mantel-Haenszel estimate of the common odds ratio, $OR_{MH}$

Breslow-Day Test for Homogeneity of the Odds Ratios
- For stratified 2x2 tables
- Null hypothesis is that the ORs are equal across all strata
- $\chi^2$ distribution with $q - 1$ df, where $q$ is the number of strata
- Alternative hypothesis is that at least one stratum-specific OR differs from the other stratum-specific ORs

$\chi^2_{BD}$ (cont)
- If you reject $H_0$ for the $\chi^2_{BD}$ test: there is evidence for heterogeneity of ORs across strata, so it is not appropriate to report the adjusted common OR
- Report the stratum-specific ORs when effect modification is present

CMH Statistic 1: Nonzero Correlation
- Tests the null hypothesis of no association vs. the alternative hypothesis that there is a linear association between the row and column variables in at least one stratum
- Both row and column variables have to be ordinal
- Under $H_0$, $\chi^2$ with 1 df

CMH Statistic 2: Row Mean Scores Differ
- Tests the null hypothesis of no association vs. the
  alternative hypothesis that the mean scores of the table rows are unequal for at least one stratum
- Useful only when the column variable is ordinal
- Under $H_0$, $\chi^2$ with $(r - 1)$ df, where $r$ is the number of rows

CMH Statistic 3: General Association
- Tests the null hypothesis of no association vs. the alternative hypothesis that there is
  some kind of association between the row and column variables for at least one stratum
- Does not require the row or column variable to be ordinal
- Under $H_0$, $\chi^2$ with $(r - 1)(c - 1)$ df, where $r$ and $c$ are the numbers of rows and columns

Matching
- Control for confounding more efficiently than if the matching had not been performed
- Design phase of a study
- Gain statistical efficiency in effect estimation

Matching (cont)
- Select comparison participants into a study such that they are the same (or nearly the same) on certain variable(s)
- Matched design requires a matched analysis
- Once you match on a variable, the effect of that variable cannot be estimated in your data set

Matched Data and the AGREE Option in SAS
- AGREE option computes tests and measures of agreement for square tables (where the number of rows equals the number of columns)

AGREE Option in SAS
- AGREE option generates:
  - McNemar's Test
  - Kappa
  - Weighted Kappa

McNemar's Test of Symmetry for Matched Samples
- For 2x2 tables
- Appropriate when you have data from matched pairs of subjects with a dichotomous (yes/no) outcome
- Null hypothesis of marginal homogeneity
- Werner data set of matched pairs, comparing the proportion with high cholesterol among women who take the birth control pill to the proportion with high cholesterol among women who do not take the pill
- $\chi^2$ distribution with 1 df

McNemar's Test for Matched Proportions
- Werner data set with age-matched pairs
- $\chi^2_M = \frac{(21 - 23)^2}{21 + 23} = 0.0909$
- There are 92 pairs. 45.65% of the NoPill group have high chol. 47.83% of the Pill group have high chol.

Simple Kappa Coefficient (Cohen's Kappa)
- Measure of inter-rater agreement, corrected for chance
- Scale from -1 to +1
  - $\kappa = +1$ when there is perfect agreement
  - $\kappa = 0$ when the agreement equals that expected by chance
- Magnitude of Kappa reflects the strength of the agreement, beyond chance

Cohen's Kappa (cont)
- SAS gives 95% CI for Kappa
- Kappa Guidelines (Landis and Koch)

Good Resources for Categorical Data Analysis and SAS
- SAS: Categorical Data Analysis Using The SAS System by Maura E. Stokes, Charles S. Davis, and Gary G. Koch. 2nd Ed., SAS Institute Inc., Cary, NC, 2000. See pages 155-156 of the Biostat 510 course pack.
- Kappa: “The Measurement of Observer Agreement for Categorical Data,” by J. Richard Landis and Gary G. Koch. Biometrics 33(1):159-174, 1977.
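
The McNemar figures quoted for the Werner data can be re-derived by hand, which may help when reading the AGREE output. The following is a minimal Python sketch, not SAS and not the course's own code. The discordant counts 21 and 23 are taken from the slide. The concordant counts 21 and 27 are an assumption: they are not printed on the slide and are back-calculated here from the stated 92 pairs and marginal percentages (42/92 = 45.65%, 44/92 = 47.83%). The function names are illustrative only.

```python
import math

# Werner matched-pairs table (rows: NoPill member, columns: Pill member).
# Discordant counts 21 and 23 come from the slide; the concordant counts
# 21 and 27 are ASSUMED, back-calculated from 92 pairs and the stated margins.
table = [[21, 21],   # NoPill high chol:     Pill high, Pill not high
         [23, 27]]   # NoPill not high chol: Pill high, Pill not high


def mcnemar_chi2(t):
    """McNemar statistic (b - c)^2 / (b + c), without continuity correction."""
    b, c = t[0][1], t[1][0]
    return (b - c) ** 2 / (b + c)


def chi2_1df_pvalue(x):
    """Upper-tail p-value of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def cohen_kappa(t):
    """Simple (unweighted) kappa for a square table of counts."""
    n = sum(sum(row) for row in t)
    k = len(t)
    p_obs = sum(t[i][i] for i in range(k)) / n
    row_tot = [sum(t[i]) for i in range(k)]
    col_tot = [sum(t[i][j] for i in range(k)) for j in range(k)]
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return (p_obs - p_exp) / (1.0 - p_exp)


n_pairs = sum(sum(row) for row in table)
chi2_m = mcnemar_chi2(table)
p_value = chi2_1df_pvalue(chi2_m)
pct_nopill_high = 100 * sum(table[0]) / n_pairs
pct_pill_high = 100 * (table[0][0] + table[1][0]) / n_pairs
kappa = cohen_kappa(table)

print(f"pairs = {n_pairs}")
print(f"McNemar chi-square = {chi2_m:.4f}, p = {p_value:.4f}")
print(f"NoPill high chol = {pct_nopill_high:.2f}%, Pill high chol = {pct_pill_high:.2f}%")
print(f"kappa = {kappa:.4f}")
```

The printed McNemar statistic and the two percentages should match the values on the slide. Only the discordant cells enter the McNemar statistic, so the assumed concordant cells affect the percentages and kappa but not the test itself.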